28 Commits

Author SHA1 Message Date
Sajina PK 15c82d6da8 [rocprofiler-system]: Enable UCX Communication API tracing (#2306)
## Motivation

Enable UCX communication tracing and the collection of communication metadata.

## Technical Details

Implement UCX API wrappers to trace transport-layer communication. This adds communication data tracking and exposes “UCX Comm Send/Recv” timelines, enabling detailed analysis of MPI, OpenSHMEM, and other UCX-based runtime communication patterns.

- Implemented function interception for UCX functions across multiple categories using the gotcha component.
- Extended the comm_data component to track UCX send/recv operations.
- Added ucx_send and ucx_recv labels for Perfetto counter tracks.
- Integrated UCX data tracking with the existing MPI/RCCL tracking infrastructure.
- Added the ROCPROFSYS_USE_UCX configuration option (enabled by default).
- Created a FindUCX.cmake module for UCX header detection; it falls back to the internal UCX headers if system headers are not found.
- Updated all Dockerfiles to include UCX dependencies.
2026-01-20 13:16:43 -05:00
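The interception-and-accounting idea in the commit above can be illustrated with a small sketch. It is written in Python purely for illustration and is not the gotcha API or the actual comm_data component: `CommData`, `traced`, and `fake_ucx_send` are hypothetical names, and only the `ucx_send` label is taken from the commit message.

```python
from collections import defaultdict
from functools import wraps


class CommData:
    """Hypothetical accumulator of bytes moved, keyed by track label."""

    def __init__(self):
        self.bytes_by_label = defaultdict(int)

    def record(self, label, nbytes):
        self.bytes_by_label[label] += nbytes


def traced(comm_data, label):
    """Wrap a function so each call adds its payload size to `label`."""

    def decorator(func):
        @wraps(func)
        def wrapper(buffer, *args, **kwargs):
            # Account for the payload, then forward to the wrapped call.
            comm_data.record(label, len(buffer))
            return func(buffer, *args, **kwargs)

        return wrapper

    return decorator


comm = CommData()


@traced(comm, "ucx_send")
def fake_ucx_send(buffer):
    # Stand-in for a real transport call; an assumption for illustration.
    return len(buffer)


fake_ucx_send(b"hello")
fake_ucx_send(b"world!!")
```

After the two calls, `comm.bytes_by_label["ucx_send"]` holds the running byte total that a counter track could plot over time.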
Jason Bonnell 082e7adb81 Updated VERSION regex for tarball in Dockerfiles (#1321) 2025-10-10 15:37:13 -04:00
David Galiffi c0f8627e7f Update CI Docker files (#1202)
- Add `nlohmann-json-dev` (or equivalent) to CI Docker images for RHEL, SUSE, and Ubuntu.
- Add `gmock-dev` and `gtest-dev` (or equivalent) to CI Docker images for RHEL, SUSE, and Ubuntu.
- Add `--set solver classic` to conda config to resolve an issue setting up the conda environment.
- Fix Perfetto package installation on the Ubuntu Noble image.
- Add a check and log an error if pip installation fails.

---------

Co-authored-by: jbonnell-amd <jason.bonnell@amd.com>
2025-10-02 21:06:01 -04:00
Jason Bonnell 953fd60e9b rocprofiler GHCR Rename (#1112)
- Rename the GHCR packages for rocprofiler Docker images to reduce the number of packages that will be released on the repository
- Change the package name to include only the OS instead of OS+Version; the version moves to the tag.
- Updated Dockerfile.*.ci files to specify target ROCm version from tarball in name.
2025-09-30 15:15:12 -04:00
Jason Bonnell 8b52d71cc7 rocprofiler-systems - add gfx containers to ghcr (#883)
* Initial skeleton code for rocprofiler-systems-continuous-integration.yml

* Add python3-devel to opensuse and rhel ci images

* Update rocprofiler-systems-containers.yml to include TheRock tarballs

* Update pip install command for Dockerfile.ubuntu.ci

* Fix pip install again for Dockerfile.ubuntu.ci

* Remove skeleton workflow for CI

* Add new ci-gfx containers for TheRock installs

* Add set -e and pipefail to ci Dockerfiles to detect errors

* Upgrade pip in Dockerfile.ubuntu.ci

* revert pipefail set -e change

* Replace build-docker-ci.sh script with Docker step for ci-base

* Add support for gfx950, add containers-ci-gfx.yml

* Add working-directory to matrix setup steps

* Try changing containers-ci-gfx.yml

* make more changes to containers-ci-gfx.yml

* Remove build-docker-ci.sh script from gfx step, fix typo in Dockerfile

* Remove gfx110X and gfx120X for now

* Update ci-gfx docker workflow to use ghcr.io

* Temporary change to test one image

* Enable push to test out ghcr package

* Add labels to debug oauth issue

* Add packages permissions to step

* add rocprofiler-systems-ghcr.yml workflow

* Remove cache from Docker push action step

* Add prefix to tag

* Add back gfx94X and gfx950 support, add back no push on PR

* Remove gfx container creation from rocprofiler-systems-containers.yml

* Add a gfx950 image for now

* Revert change
2025-09-22 16:58:55 -04:00
David Galiffi d111e9a297 [rocprofiler-systems] Add Debian 12 workflows (#402)
* Create CI dockers for debian 12

* Create Debian workflow

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Fixing typo

* Apply suggestions from code review

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update default value for script's "VERSIONS" variable

* Fix Docker build warnings

LegacyKeyValueFormat: "ENV key=value" should be used instead of legacy "ENV key value" format

* Refactored the check for `pip install --break-system-packages`

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-20 14:58:49 -04:00
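The `LegacyKeyValueFormat` warning quoted in the commit above asks for `ENV key=value` in place of `ENV key value`. The sketch below shows one way such a rewrite could be automated; it is an illustration and not the change made in the commit. `modernize_env_line` is a hypothetical helper that assumes upper-case `ENV`, one instruction per line, no backslash continuations, and input without a trailing newline.

```python
import re

# Matches "ENV NAME value..." where NAME is followed by whitespace, not "=".
_LEGACY_ENV = re.compile(r"^(\s*ENV\s+)([A-Za-z_][A-Za-z0-9_]*)\s+(.+?)\s*$")


def modernize_env_line(line):
    """Rewrite a legacy 'ENV key value' line as 'ENV key=value'."""
    match = _LEGACY_ENV.match(line)
    if match is None:
        # Not an ENV line, or already in key=value form: leave untouched.
        return line
    prefix, key, value = match.groups()
    already_quoted = value.startswith('"') and value.endswith('"')
    if re.search(r"\s", value) and not already_quoted:
        # The legacy form treats the rest of the line as one value,
        # so quote it to keep it a single value in key=value form.
        value = '"' + value.replace('"', '\\"') + '"'
    return f"{prefix}{key}={value}"
```

For example, `modernize_env_line("ENV PATH /opt/rocm/bin:${PATH}")` yields `ENV PATH=/opt/rocm/bin:${PATH}`, while a line already in `key=value` form is returned unchanged.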
Milan Radosavljevic b793b183a4 Update rocprofiler-systems github workflows (#193)
* Fix rocprofiler-systems CI

* Fix 'Documentation' jobs

* Python Linting fix

* Add python 3.11, 3.12

* Fix python linting

* Re-add ubuntu-noble workflow

* Remove old workflows from project folder

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update rocprofiler-systems workflows

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Signed-off-by: Jason Bonnell <Jason.Bonnell@amd.com>

* Retire ubuntu-focal workflow

* Fix path to validation file in `build-docker.sh`

* Update .github/workflows/rocprofiler-systems-python.yml

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

* Revert dockerfile

* Retire rocprofiler-systems-ubuntu-focal workflow

* Include .github directory in cpack workflow sparse-checkout step

* Revert git from ubuntu ci image

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Signed-off-by: Jason Bonnell <Jason.Bonnell@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-08-12 19:42:01 +02:00
Jason Bonnell 4a677d6506 Add ninja to all Dockerfiles (#304)
[ROCm/rocprofiler-systems commit: 6f8cb05140]
2025-07-30 13:35:40 -04:00
David Galiffi 20434af022 Adding sqlite-dev to our ci docker images (#280)
[ROCm/rocprofiler-systems commit: 86f025e5fd]
2025-07-14 17:37:47 -04:00
David Galiffi 76cde58f56 Removing dyninst builds from CI docker files (#249)
Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 8535da17c8]
2025-06-13 17:16:01 -04:00
Pranjal Swarup 8a2ac0ae71 Change docker scripts to use miniforge instead of miniconda (#242)
[ROCm/rocprofiler-systems commit: c00070ddbf]
2025-06-11 23:43:34 -04:00
ajanicijamd 4c4fc2bebe Fixed NIC performance monitoring test (#189)
---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: b98c3c8b86]
2025-06-04 09:25:52 -04:00
David Galiffi 909602e1f4 Update Dockerfiles (#207)
Add chrpath to dockerfiles

[ROCm/rocprofiler-systems commit: fc08d56fb4]
2025-05-15 21:06:12 -04:00
David Galiffi 80423ff010 Updating release build dockers (#195)
- Bringing in recent changes from rocm-6.4 branch (https://github.com/ROCm/rocprofiler-systems/pull/171)
- Add libdrm-devel to rhel and suse files
- Update ROCm installation method in Ubuntu file
- Add additional output to `test-release.sh` to catch failures due to a Python version not included
- Add Python 3.13 to Dockers

[ROCm/rocprofiler-systems commit: 83a9eb3d7c]
2025-05-02 19:29:15 -04:00
David Galiffi 8a70f4b15d Update workflow runners due to the deprecation of ubuntu-20.04 runners (#102)
* Update runners to `ubuntu-latest`.

The `ubuntu-20.04` runner is deprecated and will be removed.

* Add 'vim' and 'perfetto' to CI docker images

For convenience when using the images locally.

[ROCm/rocprofiler-systems commit: a25554359b]
2025-03-04 19:31:13 -05:00
David Galiffi b29cfac106 Update to use rocprofiler-sdk (#55)
- Rename the CMake option "ROCPROFSYS_USE_HIP" to "ROCPROFSYS_USE_ROCM"
- Remove the "ROCPROFSYS_USE_ROCM_SMI" option; it is now controlled by the "ROCPROFSYS_USE_ROCM" option instead.
   - Runtime configuration can still toggle ROCPROFSYS_USE_ROCM_SMI to disable the sampling.
- Rename ROCPROFSYS_HIP_VERSION macro to ROCPROFSYS_ROCM_VERSION and remove blocks for `ROCPROFSYS_ROCM_VERSION < 60000`
- Remove ROCPROFSYS_USE_ROCTRACER and ROCPROFSYS_USE_ROCPROFILER
- Update test cases
- Update docker files and workflows to install cmake 3.21, which is required for the rocprofiler-sdk findPackage script.
- Removed rocm-6.2 from workflows due to a rocprofiler-sdk API change. 

[ROCm/rocprofiler-systems commit: 88aa2d3cbe]
2024-12-13 18:48:39 -05:00
David Galiffi b73bd13a86 Adding installer for Ubuntu 24.04 (#14)
* Add installers for ubuntu 24.04

* Formatting change to the ubuntu-focal and ubuntu-jammy workflows

* Initial Ubuntu 24.04 workflow - just build test

[ROCm/rocprofiler-systems commit: 398ea62629]
2024-12-11 19:36:04 -05:00
David Galiffi 9310643cf5 Update cmake version installed in dockerfiles (#25)
* Update cmake version installed in dockerfiles
* Standardize the cmake_minimum_required to 3.18.4 across dockerfiles
* Fix link to perl repo in opensuse docker.

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: cef228bfbd]
2024-11-08 18:49:42 -05:00
David Galiffi d8b0f12cd0 Update workflows and docker images (#10)
Update the OS test matrix to match ROCm 6.2.
Update build and CI docker files.
Remove the "docs" workflow, because "read-the-docs" is now being used for ROCm documentation.

[ROCm/rocprofiler-systems commit: b15c9e94fc]
2024-10-21 14:58:30 -04:00
David Galiffi 5e5a9cabc9 This fixes the CPACK workflow failures for Ubuntu and RHEL (#368)
* Add texinfo to Ubuntu and RHEL docker files

* Add the `bison` package to Ubuntu dockerfile

* Update the CI docker files too.

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: d0d97490b4]
2024-08-02 16:03:19 -04:00
Jonathan R. Madsen 41e9dddc74 Installers for ROCm 5.7, Python 3.12 + Remove DEB and RPM installers (#313)
* Dockerfile update

- Python 3.12 support

* Bump version to 1.10.4

* Update docker scripts

- Support Python 3.12
- Set RETRY to 1 if less than 1
- Support ROCm 5.7

* Update scripts/build-release.sh

- Default to python 3.6-3.12 (i.e. add Python 3.12)

* Update cpack workflow

- Packaging for ROCm 5.7
  - Ubuntu 20.04
  - Ubuntu 22.04
  - OpenSUSE 15.4
  - RHEL 8.7
  - RHEL 9.1
- Packaging for older ROCms (by request)
  - RHEL 8.7 + ROCm 5.3
  - OpenSUSE 15.3 + ROCm 5.2
  - OpenSUSE 15.4 + ROCm 5.2
  - OpenSUSE 15.4 + ROCm 5.3
- Remove DEB and RPM installers
  - Only generate STGZ installers

* Update cpack workflow

- disable uploading DEB and RPM artifacts

[ROCm/rocprofiler-systems commit: 2e581e2a10]
2023-10-16 10:39:18 -05:00
Jonathan R. Madsen e70d684c98 Python 3.11 support + update RedHat CPack (#254)
* Fixes for Python 3.11

* Add python 3.11 to scripts

- also tweak the to{upper,lower} bash functions

* Fix PAPI RPM packaging in RedHat

- fix error from #!/usr/bin/python in papi_hl_output_writer.py
  - requires either python2 or python3 instead of python

* cpack updates

- only generate STGZ for RedHat
- support `--generators` arg in build-release.sh
- support 7z, zip, and other zip generators
- fix build-release.sh with `--mpi`
- support setting CONDA_ROOT

* Support rhel/fedora/centos in omnitrace-install.py

* RedHat status badge

* Fix support for Python 3.11 + tweak ubuntu ci

- Remove installing clang and mpich in Ubuntu CI container
- Fallback on conda-forge for Python 3.11
- Enable entrypoint-rhel.sh for RHEL CI
- Pull latest container by default

* Update ElfUtils and PAPI builds

- quieter build output
- disable-nls for ElfUtils
- use -s flag for make

* Development Guide Docs

[ROCm/rocprofiler-systems commit: 83f9ed8696]
2023-03-08 00:19:29 -06:00
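The PAPI RPM packaging fix in the commit above concerns a script whose `#!/usr/bin/python` shebang names an unversioned interpreter. A minimal sketch of rewriting such a shebang follows; how the commit itself applied the fix is not stated, and `fix_python_shebang` is a hypothetical helper that assumes the shebang is exactly one of the two unversioned forms listed.

```python
# Unversioned shebangs mapped to templates that take an interpreter name.
_UNVERSIONED = {
    "#!/usr/bin/python": "#!/usr/bin/{}",
    "#!/usr/bin/env python": "#!/usr/bin/env {}",
}


def fix_python_shebang(source, interpreter="python3"):
    """Return `source` with an unversioned Python shebang made explicit."""
    first, sep, rest = source.partition("\n")
    template = _UNVERSIONED.get(first.rstrip())
    if template is None:
        # No shebang, or one that already names python2/python3.
        return source
    return template.format(interpreter) + sep + rest
```

Passing `interpreter="python2"` covers the other option mentioned in the commit message; scripts that already name a versioned interpreter are returned unchanged.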
Jonathan R. Madsen 446fd36a93 Add RedHat CI and release packaging (#251)
- additional miscellaneous tweaks to workflows and docker scripts, e.g. install perfetto python bindings
- improves the stability of MPI finalization
- reduces some debug messages within timemory when `OMNITRACE_DEBUG=ON`
- fixes issue found in RHEL where libunwind is using mutex and omnitrace was not treating this as an internal mutex call
  - this may have been affecting the causal profiling slightly (tests seem a bit more stable now)
- fix data race in timemory

* Add RedHat CI and release packaging

- additional miscellaneous tweaks to workflows and docker scripts, e.g. install perfetto python bindings

* Fix URL for ROCm packages in redhat workflow

* Fix dnf --enable-repo for ROCm perl packages

* Dockerfile.rhel and redhat.yml updates

- Fix dnf repo for ROCm PERL packages
- Disable python in CI (interpreter segfaults)
- Exclude parallel-overhead-locks tests due to inclusion of internal locks
  - This needs to be remedied in the future

* Exclude _dl_relocate_static_pie from instrumentation

* Testing updates

- OMNITRACE_SAMPLING_KEEP_INTERNAL=OFF for parallel-overhead-locks

* Fix redhat workflow

* redhat.yml update

- remove if condition on config/build/test step

* Update timemory submodule

- tweaks to verbosity messages

* Set thread state before unw_step

- on Redhat, unw_step calls mutex

* Update timemory submodule

- verbosity changes
- gotcha uses spin_lock/spin_mutex

* Remove using gsplit-dwarf unless OMNITRACE_BUILD_NUMBER > 2

* Re-enable parallel-overhead-locks tests in redhat workflow

* Always disable timemory manager metadata auto output

* testing updates

- tweak parallel-overhead-locks-timemory to higher instruction count min
- OMNITRACE_SAMPLING_KEEP_INTERNAL=OFF for parallel-overhead-locks-perfetto

* Update timemory submodule

- quiet realpath queries

* omnitrace exe updates

- detect text files
- improved bin/lib locating

* cmake format

* test-install.sh and redhat workflow updates

- handle testing when ls is script
- re-enable python testing on redhat workflow
- invoke test-install.sh in redhat workflow

* Misc guards for finalization

* omnitrace-exe, testing updates

- test-install.sh: LS_EXEC -> LS_NAME
- handle /usr/bin/ls being script in source/bin/tests
- improve locating the binary

* Fix mpi_gotcha compile error

* omnitrace-exe updates

- improve file locating

* formatting

* Misc fixes

- remove -static-libstdc++ for RHEL packaging (rocky-linux doesn't distribute static lib)

* omnitrace-exe paths

* Replace realpath with absolute

- using absolute path to symlink fixes issues with locating libdyninstAPI_RT at runtime

* omnitrace exe updates

- judicious use of realpath

* Update timemory submodule

- fix update main hash ids/aliases data race in merge

* bin tests update

- change working directory of omnitrace-exe-simulate-lib-basename

* omnitrace exe updates

- Update resolved exe/lib messaging

* bin tests update

- change working directory of omnitrace-exe-simulate-lib-basename

[ROCm/rocprofiler-systems commit: 1688a027d8]
2023-03-07 06:04:19 -06:00
Jonathan R. Madsen c87e69e522 Submitting jobs to cdash (#124)
* Submitting jobs to cdash

* Fail on submit

* submit url env

* submit url env

* try passing submit url as arg

* fix submit url

* Updated default URL

* Add submissions for remaining ubuntu focal workflow jobs

* Replace g++ with gcc in dashboard build name

* Add --ctest-args to run-ci.sh

* Add cdash support for bionic, jammy, and opensuse workflows

* Decrease CTEST_CUSTOM_MAXIMUM_PASSED_TEST_OUTPUT_SIZE

* OMNITRACE_BUILD_CODECOV option

* Support code coverage in CDash script

* CI dyninst built with debug info

* Update ci-containers

- cron schedule moved 4 hours later to UTC+5

* Update implementation of config::configure_signal_handler

- using lambdas failed to compile with codecov flags

* Add codecov job to ubuntu focal workflow

* Fix support for --ctest-args in run-ci script

* Fix ubuntu workflows

* Fix quotation handling in run-ci script

* git safe directory for codecov

* New MPI examples

* Remove --stop-on-failure

* dynamic_library update

- find_library_path checks procfs maps
- invoke find_library_path with no additional args to resolve to mapped file

* RCCLP uses dynamic_library

* check if file exists for memory_map_files metadata

* Testing updates

- include new mpi examples in tests
- fix test labels
- test critical-trace exe

* Update MPI C examples tests (needed arg)

* Remove try/catch block from critical-trace

* Fix sampling max wait when shutting down

* Fix test env for critical-trace

* Fix settings for critical-trace

- disable time output: data is deterministic
- disable PID suffixes: not multiprocess

* Update critical-trace ctest

* Update critical-trace exe

- throw error if input cannot be opened
- throw error if input has no data

* Update lulesh example with more kokkos tools usage

* Fix tasking issue with critical_trace and roctracer

- were not setting pools to active
- also sync before critical_trace::get_entries

* Increase verbosity of critical-trace tests

* Update code coverage tests

- skip code coverage + preload
- code-coverage python example and test

* Remove duplicate omnitrace.initialize function

* Skip python3.6 for ubuntu jammy

* Update MPI examples

- use MPI_Isend and MPI_Irecv
- explicitly use MPI_Bcast

* Update Formatting.cmake

- include C files in examples

* run-ci script does not check return of coverage

* mpi-allreduce link to libm

* Update ctest args in run-ci script

* Update dyninst submodule

- safety improvements in BinaryEdit::openResolvedLibraryName

* capture cmake error for ctest_coverage

[ROCm/rocprofiler-systems commit: 46b6db1a4c]
2022-10-31 15:39:45 -05:00
Jonathan R. Madsen 713d08de6d Support for Ubuntu 22.04 and ROCm 5.3 (#48)
* Testing and CI support for Ubuntu 22.04

* Fixes for ROCm

- Jammy does not have ROCm installers

* Name, timeout, and python updates

- renamed ubuntu-jammy-external.yml to ubuntu-jammy.yml
- increased all 5 minute timeouts to 10 minutes
- include python 3.10 in testing

* Update dyninst to remove interposed definition of _r_debug

* Rebuild Dyninst + test install script

* Revert container change

* git safe directory

* pushd -> cd

* fix MPI include

* Fix testing step

* OMPI_ALLOW_RUN_AS_ROOT

* Test script changes

* Fix mismatched malloc / delete[]

* Jammy workflow tweaks

* CPack tweak for boost deb deps

* pthread_mutex_gotcha config returns when not enabled

* fix echoing config in CI

* USE_CLANG_OMP

- option to disable using LLVM OpenMP when building OpenMP test executables
- Jammy workflow sets USE_CLANG_OMP=OFF

* Dyninst submodule boost download

- updated containers workflow to include jammy
- updated workflow to use ci

* Updates to workflows + replace test-install.sh

- test-install.sh in this branch was replaced with one in main branch

* Expand jammy test-install.sh args

* Fix openmp-cg-sampling-duration test

* update timemory submodule

- use-after-free violation in popen::pclose

* revert some tweaks to sampling-duration test

* Fix env of test-install.sh

* cmake format

* jammy bash

* CPack install for jammy

* formatting workflow action version bump

* Update timemory submodule

- libunwind submodule via timemory sets SOVERSION to 99 to avoid ABI conflicts with v8

* Fix help menu for omnitrace-sample

* Support other boolean forms in test-install.sh

* Update docker files and build-docker.sh

- consolidated cases in build-docker.sh
- support rocm version of 0.0 (no rocm install)
- support rocm v5.3
- updated centos handling

* update opensuse actions/checkout version

* Tweaks to ubuntu-focal testing

- actions/checkout@v3
- use test-install script

* update cpack

- ubuntu 22.04
- rocm 5.3
- rename os matrix field to os-version
- remove CI_ROCM_VERSION (no longer necessary)
- remove default-rocm-version matrix field (no longer necessary)
- CentOS packaging

* fix argparsing and omnitrace-sample tests in install-tests.sh

* focal rocm test install workflow fix

* Fix omnitrace-sample build

* Dockerfile.centos + build-docker.sh updates

* Update actions/upload-artifact version

* Dockerfile.ubuntu: install rocm-device-libs

* Refactor cpack

* fix cpack if quotes

* Dockerfile.ubuntu rocm < 5 installs rocm-dev

* build-release.sh defaults to boost version 1.79.0

[ROCm/rocprofiler-systems commit: ede6007f9b]
2022-10-17 12:54:26 -05:00
Jonathan R. Madsen 358a3a7e36 Docker and build-release script updates [skip ci] (#61)
- Update CPack

[ROCm/rocprofiler-systems commit: 9cba1f80ba]
2022-05-19 16:06:38 -05:00
Jonathan R. Madsen 28ade7fbb9 Update CI to test multiple python versions (#45)
* Update CI to test multiple python versions

* Ensure numpy is installed

* Handle lulesh with cmake < 3.16

* Fix typo

* Bump minimum CMake version to 3.16

- CMake 3.15 has issue with PTL object library

* Tweak CI test output

[ROCm/rocprofiler-systems commit: 22eaa780ec]
2022-04-22 03:05:07 -05:00
Jonathan R. Madsen 127e30a4d7 Documentation + Miscellaneous Fixes (#36)
* Added documentation markdown source

* Replaced AARInternal with AMDResearch in URLs

* Renamed cpack artifact names

* Fix to testing and lulesh submodule checkout

* Docker updates

* CMake and CPack

- force CMAKE_INSTALL_LIBDIR to lib
- CPACK_DEBIAN_PACKAGE_RELEASE uses OMNITRACE_CPACK_SYSTEM_NAME
- CPACK_RPM_PACKAGE_RELEASE uses OMNITRACE_CPACK_SYSTEM_NAME
- Tweak LIBOMP_LIBRARY find in examples/openmp
- Tweak setup-env.sh.in

* Partial update of README

- status badges
- docs link
- removed install info (covered by docs)

* OMNITRACE_SAMPLING_CPUS setting

- enables control over which CPUs are sampled for frequency

* omnitrace exe updates

- exclude transaction clone, virtual thunk, non-virtual thunk
- module_function::start_address
- module_function::instructions
- verbosity > 0 encodes instructions into JSON

* Miscellaneous fixes

- relocate setup-env.sh.in
- add modulefile.in
- Updated README.md and source/docs/about.md
- cmake fix for libomp
- fix license in miscellaneous places
- dl.hpp and dl.cpp

* Update timemory and dyninst submodules

- timemory signals updates
- dyninst Movement-adhoc updates

* cmake format

[ROCm/rocprofiler-systems commit: 945f541965]
2022-04-04 15:27:38 -05:00