28 Tiomáintí

Údar SHA1 Teachtaireacht Dáta
Sajina PK 15c82d6da8 [rocprofiler-system]: Enable UCX Communication API tracing (#2306)
## Motivation

Enable UCX communication tracing and communication metadata 

## Technical Details

Implement UCX API wrappers to trace transport-layer communication. This adds communication data tracking and exposes “UCX Comm Send/Recv” timelines, enabling detailed analysis of MPI, OpenSHMEM, and other UCX-based runtime communication patterns.

- Implements function interception for UCX functions across multiple categories using gotcha component.
- Extended comm_data component to track UCX send/recv operations - Added ucx_send and ucx_recv labels for Perfetto counter tracks. Integrated UCX data tracking with existing MPI/RCCL tracking infrastructure.
- Added ROCPROFSYS_USE_UCX configuration option (enabled by default).
- Created FindUCX.cmake module for UCX header detection. Falls back to internal UCX headers if system headers not found.
- Updated all Dockerfiles  to include UCX dependencies.
2026-01-20 13:16:43 -05:00
Jason Bonnell 463126770a Update build docker container workflow, opensuse dockerfiles (#1883)
## Motivation

<!-- Explain the purpose of this PR and the goals it aims to achieve. -->

- __Reduced Code Duplication__: Version parsing logic moved from individual Dockerfiles to the central build script
- __Improved Edge Case Handling__: Better handling of ROCm versions with and without patch numbers (e.g., `6.2` vs `6.2.0`)
- __Easier Maintenance__: Future version-related changes only need to be made in one place
- __Cleaner Dockerfiles__: Simplified Dockerfiles focus on package installation rather than complex shell logic
- __Updated Platform Support__: Refreshed container matrix to reflect current platform/ROCm version combinations
- __Fix OpenSUSE Docker Generation__: OpenSUSE container generation fails due to a change to the `binutils-gold` package
- __Error Handling__: Fix bug where errors in docker image build were being masked, allowing workflow to pass anyway.


## Technical Details

<!-- Explain the changes along with any relevant GitHub links. -->
- Updated `Dockerfile.opensuse` and `Dockerfile.opensuse.ci` docker files to remove `binutils-gold`
  - Not needed since we build `binutils` with systems anyways
- Updated `rocprofiler-systems-containers.yml` to remove `pushd/popd` commands and just run the shell scripts
  - There was a silent failure observed here, which I verified in this PR before adding the fix for openSUSE
- Refactor ROCm version parsing. Move this logic to the `build-docker.sh` script to reduce duplication.
  - Fix bug that caused ROCm 7.0 to fail installation. The trailing `.0` was being trimmed.
- Fixed inconsistencies in `containers.yml` that lead to invalid ROCm-OS_VERSION combinations.
- Formatting fixes 
  - Removed trailing whitespace
  - Fix docker build warnings. Use an `=` rather than ` ` when assigning an environment variable.
2025-12-04 23:33:15 -05:00
Kian Cossettini 63713f01e0 [rocprofiler-systems] Add Fortran MPI CTests (#1172)
* Add MPI CTests (use gfortran)

* Add proper regex check

* Skip Runtime-Instrument due to incompatibility with MPI

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Sajina Kandy <sputhala@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-27 10:32:09 -05:00
Jason Bonnell 082e7adb81 Updated VERSION regex for tarball in Dockerfiles (#1321) 2025-10-10 15:37:13 -04:00
David Galiffi c0f8627e7f Update CI Docker files (#1202)
- Add `nlohmann-json-dev` (or equivalent) to CI Docker images for RHEL, SUSE, and Ubuntu.
- Add `gmock-dev` and `gtest-dev` (or equivalent) to CI Docker images for RHEL, SUSE, and Ubuntu.
- Add `--set solver classic` to conda config to resolve an issue setting up the conda environment
- Fix Perfetto package installation on ubuntu noble image.
- Add a check and log error if pip installation fail 

---------

Co-authored-by: jbonnell-amd <jason.bonnell@amd.com>
2025-10-02 21:06:01 -04:00
Jason Bonnell 953fd60e9b rocprofiler GHCR Rename (#1112)
- Rename the GHCR packages for rocprofiler Docker images to reduce the number of packages that will be released on the repository
- Changed package name to only include the OS instead of OS+Version - version moved to the tag instead.
- Updated Dockerfile.*.ci files to specify target ROCm version from tarball in name.
2025-09-30 15:15:12 -04:00
Jason Bonnell 8b52d71cc7 rocprofiler-systems - add gfx containers to ghcr (#883)
* Initial skeleton code for rocprofiler-systems-continuous-integration.yml

* Add python3-devel to opensuse and rhel ci images

* Update rocprofiler-systems-containers.yml to include TheRock tarballs

* Update pip install command for Dockerfile.ubuntu.ci

* Fix pip install again for Dockerfile.ubuntu.ci

* Remove skeleton workflow for CI

* Add new ci-gfx containers for TheRock installs

* Add set -e and pipefail to ci Dockerfiles to detect errors

* Upgrade pip in Dockerfile.ubuntu.ci

* revert pipefail set -e change

* Replace build-docker-ci.sh script with Docker step for ci-base

* Add support for gfx950, add containers-ci-gfx.yml

* Add working-directory to matrix setup steps

* Try changing containers-ci-gfx.yml

* make more changes to containers-ci-gfx.yml

* Remove build-docker-ci.sh script from gfx step, fix typo in Dockerfile

* Remove gfx110X and gfx120X for now

* Update ci-gfx docker workflow to use ghcr.io

* Temporary change to test one image

* Enable push to test out ghcr package

* Add labels to debug oauth issue

* add pacakages permissions to step

* add rocprofiler-systems-ghcr.yml workflow

* Remove cache from Docker push action step

* Add prefix to tag

* Add back gfx94X and gfx950 support, add back no push on PR

* Remove gfx container creation from rocprofiler-systems-containers.yml

* Add a gfx950 image for now

* Revert change
2025-09-22 16:58:55 -04:00
Jason Bonnell 4a677d6506 Add ninja to all Dockerfiles (#304)
[ROCm/rocprofiler-systems commit: 6f8cb05140]
2025-07-30 13:35:40 -04:00
David Galiffi 20434af022 Adding sqlite-dev to our ci docker images (#280)
[ROCm/rocprofiler-systems commit: 86f025e5fd]
2025-07-14 17:37:47 -04:00
David Galiffi 76cde58f56 Removing dyninst builds from CI docker files (#249)
Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 8535da17c8]
2025-06-13 17:16:01 -04:00
Pranjal Swarup 8a2ac0ae71 Change docker scripts to use miniforge instead of miniconda (#242)
[ROCm/rocprofiler-systems commit: c00070ddbf]
2025-06-11 23:43:34 -04:00
ajanicijamd 4c4fc2bebe Fixed NIC performance monitoring test (#189)
---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: b98c3c8b86]
2025-06-04 09:25:52 -04:00
David Galiffi 909602e1f4 Update Dockerfiles (#207)
Add chrpath to dockerfiles

[ROCm/rocprofiler-systems commit: fc08d56fb4]
2025-05-15 21:06:12 -04:00
David Galiffi 80423ff010 Updating release build dockers (#195)
- Bringing in recent changes from rocm-6.4 branch (https://github.com/ROCm/rocprofiler-systems/pull/171)
- Add libdrm-devel to rhel and suse files
- Update ROCm installation method in Ubuntu file
- Add additional output to `test-release.sh` to catch failures due to a Python version not included
- Add Python 3.13 to Dockers

[ROCm/rocprofiler-systems commit: 83a9eb3d7c]
2025-05-02 19:29:15 -04:00
David Galiffi 8a70f4b15d Update workflow runners due to the deprecation of ubuntu-20.04 runners (#102)
* Update runners to `ubuntu-latest`.

The `ubuntu-20.04` runner is deprecated and will be removed.

* Add 'vim' and 'perfetto' to CI docker images

For convenience when using the images locally.

[ROCm/rocprofiler-systems commit: a25554359b]
2025-03-04 19:31:13 -05:00
David Galiffi b29cfac106 Update to use rocprofiler-sdk (#55)
- Renames the CMake option "ROCPROFSYS_USE_HIP" to "ROCPROFSYS_USE_ROCM"
- Remove the "ROCPROFSYS_USE_ROCM_SMI option. Controlled with the "ROCPROFSYS_USE_ROCM" option, instead.
   - Runtime configuration can still toggle ROCPROFSYS_USE_ROCM_SMI to disable the sampling.
- Rename ROCPROFSYS_HIP_VERSION macro to ROCPROFSYS_ROCM_VERSION and remove blocks for `ROCPROFSYS_ROCM_VERSION < 60000`
- Remove ROCPROFSYS_USE_ROCTRACER and ROCPROFSYS_USE_ROCPROFILER
- Update test cases
- Update docker files and workflows to install cmake 3.21, which is required for the rocprofiler-sdk findPackage script.
- Removed rocm-6.2 from workflows due to a rocprofiler-sdk API change. 

[ROCm/rocprofiler-systems commit: 88aa2d3cbe]
2024-12-13 18:48:39 -05:00
David Galiffi 9310643cf5 Update cmake version installed in dockerfiles (#25)
* Update cmake version installed in dockerfiles
* Standardize the cmake_minimum_required to 3.18.4 across dockerfiles
* Fix link to perl repo in opensuse docker.

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: cef228bfbd]
2024-11-08 18:49:42 -05:00
David Galiffi d8b0f12cd0 Update workflows and docker images (#10)
Updated OS test matrix to match ROCm 6.2. 
Update build and CI docker files
Remove the "docs" workflow, because "read-the-docs" is now being used for ROCm documentation

[ROCm/rocprofiler-systems commit: b15c9e94fc]
2024-10-21 14:58:30 -04:00
Jonathan R. Madsen 3dad7a8c18 Installers for RHEL 8.8 and RHEL 8.9 (#334)
* Installers for RHEL 8.8 and RHEL 8.9

- RHEL 8.8: Supports ROCm 5.6, 5.7, 6.0
- RHEL 8.9: Supports ROCm 6.0

* Update build-docker.sh

- fix PERL_REPO for OpenSUSE

* Update some workflows to use Node.js 20

* Fix Dockerfile.opensuse*

[ROCm/rocprofiler-systems commit: 1df597e049]
2024-04-01 13:31:10 -05:00
Jonathan R. Madsen 1444bf4d85 ROCm 6.0 packaging (#325)
* Update build-docker.sh

- support rocm 6.0

* Update cpack workflow

- support rocm 6.0

* Update CI testing workflow paths-ignore

- changes to {docs,cpack,containers,formatting}.yml and docker do not require testing

* Update docker for OpenSUSE

- always use --non-interactive with zypper
- tweak to PERL_REPO when OS version >= 15.4

[ROCm/rocprofiler-systems commit: cfaace38a8]
2024-01-10 17:29:47 -06:00
Jonathan R. Madsen 41e9dddc74 Installers for ROCm 5.7, Python 3.12 + Remove DEB and RPM installers (#313)
* Dockerfile update

- Python 3.12 support

* Bump version to 1.10.4

* Update docker scripts

- Support Python 3.12
- Set RETRY to 1 if less than 1
- Support ROCm 5.7

* Update scripts/build-release.sh

- Default to python 3.6-3.12 (i.e. add Python 3.12)

* Update cpack workflow

- Packaging for ROCm 5.7
  - Ubuntu 20.04
  - Ubuntu 22.04
  - OpenSUSE 15.4
  - RHEL 8.7
  - RHEL 9.1
- Packaging for older ROCms (by request)
  - RHEL 8.7 + ROCm 5.3
  - OpenSUSE 15.3 + ROCm 5.2
  - OpenSUSE 15.4 + ROCm 5.2
  - OpenSUSE 15.4 + ROCm 5.3
- Remove DEB and RPM installers
  - Only generate STGZ installers

* Update cpack workflow

- disable uploading DEB and RPM artifacts

[ROCm/rocprofiler-systems commit: 2e581e2a10]
2023-10-16 10:39:18 -05:00
Jonathan R. Madsen e70d684c98 Python 3.11 support + update RedHat CPack (#254)
* Fixes for Python 3.11

* Add python 3.11 to scripts

- also tweak to to{upper,lower} bash functions

* Fix PAPI RPM packaging in RedHat

- fix error from #!/usr/bin/python in papi_hl_output_writer.py
  - requires either python2 or python3 instead of python

* cpack updates

- only generate STGZ for RedHat
- support `--generators` arg in build-release.sh
- support 7z, zip, and other zip generators
- fix build-release.sh with `--mpi`
- support setting CONDA_ROOT

* Support rhel/fedora/centos in omnitrace-install.py

* RedHat status badge

* Fix support for Python 3.11 + tweak ubuntu ci

- Remove installing clang and mpich in Ubuntu CI container
- Fallback on conda-forge for Python 3.11
- Enable entrypoint-rhel.sh for RHEL CI
- Pull latest container by default

* Update ElfUtils and PAPI builds

- quieter build output
- disable-nls for ElfUtils
- use -s flag for make

* Development Guide Docs

[ROCm/rocprofiler-systems commit: 83f9ed8696]
2023-03-08 00:19:29 -06:00
Jonathan R. Madsen 446fd36a93 Add RedHat CI and release packaging (#251)
- additional miscellaneous tweaks to workflows and docker scripts, e.g. install perfetto python bindings
- improves the stability of MPI finalization
- reduces some debug messages within timemory when `OMNITRACE_DEBUG=ON`
- fixes issue found in RHEL where libunwind is using mutex and omnitrace was not treating this as an internal mutex call
  - this may have been affecting the causal profiling slightly (tests seem a bit more stable now)
- fix data race in timemory

* Add RedHat CI and release packaging

- additional miscellaneous tweaks to workflows and docker scripts, e.g. install perfetto python bindings

* Fix URL for ROCm packages in redhat workflow

* Fix dnf --enable-repo for ROCm perl packages

* Dockerfile.rhel and redhat.yml updates

- Fix dnf repo for ROCm PERL packages
- Disable python in CI (interpreter segfaults)
- Exclude parallel-overhead-locks tests due to inclusion of internal locks
  - This needs to be remedied in the future

* Exclude _dl_relocate_static_pie from instrumentation

* Testing updates

- OMNITRACE_SAMPLING_KEEP_INTERNAL=OFF for parallel-overhead-locks

* Fix redhat workflow

* redhat.yml update

- remove if condition on config/build/test step

* Update timemory submodule

- tweaks to verbosity messages

* Set thread state before unw_step

- on Redhat, unw_step calls mutex

* Update timemory submodule

- verbosity changes
- gotcha uses spin_lock/spin_mutex

* Remove using gsplit-dwarf unless OMNITRACE_BUILD_NUMBER > 2

* Re-enable parallel-overhead-locks tests in redhat workflow

* Always disable timemory manager metadata auto output

* testing updates

- tweak parallel-overhead-locks-timemory to higher instruction count min
- OMNITRACE_SAMPLING_KEEP_INTERNAL=OFF for parallel-overhead-locks-perfetto

* Update timemory submodule

- quiet realpath queries

* omnitrace exe updates

- detect text files
- improved bin/lib locating

* cmake format

* test-install.sh and redhat workflow updates

- handle testing when ls is script
- re-enable python testing on redhat workflow
- invoke test-install.sh in redhat workflow

* Misc guards for finalization

* omnitrace-exe, testing updates

- test-install.sh: LS_EXEC -> LS_NAME
- handle /usr/bin/ls being script in source/bin/tests
- improve locating the binary

* Fix mpi_gotcha compile error

* omnitrace-exe updates

- improve file locating

* formatting

* Misc fixes

- remove -static-libstdc++ for RHEL packaging (rocky-linux doesn't distribute static lib)

* omnitrace-exe paths

* Replace realpath with absolute

- using absolute path to symlink fixes issues with locating libdyninstAPI_RT at runtime

* omnitrace exe updates

- judicious use of realpath

* Update timemory submodule

- fix update main hash ids/aliases data race in merge

* bin tests update

- change working directory of omnitrace-exe-simulate-lib-basename

* omnitrace exe updates

- Update resolved exe/lib messaging

* bin tests update

- change working directory of omnitrace-exe-simulate-lib-basename

[ROCm/rocprofiler-systems commit: 1688a027d8]
2023-03-07 06:04:19 -06:00
Jonathan R. Madsen c87e69e522 Submitting jobs to cdash (#124)
* Submitting jobs to cdash

* Fail on submit

* submit url env

* submit url env

* try passing submit url as arg

* fix submit url

* Updated default URL

* Add submissions for remaining ubuntu focal workflow jobs

* Replace g++ with gcc in dashboard build name

* Add --ctest-args to run-ci.sh

* Add cdash support for bionic, jammy, and opensuse workflows

* Decrease CTEST_CUSTOM_MAXIMUM_PASSED_TEST_OUTPUT_SIZE

* OMNITRACE_BUILD_CODECOV option

* Support code coverage in CDash script

* CI dyninst built with debug info

* Update ci-containers

- cron schedule moved 4 hours later to UTC+5

* Update implementation of config::configure_signal_handler

- using lambdas failed to compile with codecov flags

* Add codecov job to ubuntu focal workflow

* Fix support for --ctest-args in run-ci script

* Fix ubuntu workflows

* Fix quotation handling in run-ci script

* git safe directory for codecov

* New MPI examples

* Remove --stop-on-failure

* dynamic_library update

- find_library_path checks procfs maps
- invoke find_library_path with no additional args to resolve to mapped file

* RCCLP uses dynamic_library

* check if file exists for memory_map_files metadata

* Testing updates

- include new mpi examples in tests
- fix test labels
- test critical-trace exe

* Update MPI C examples tests (needed arg)

* Remove try/catch block from critical-trace

* Fix sampling max wait when shutting down

* Fix test env for critical-trace

* Fix settings for critical-trace

- disable time output: data is deterministic
- disable PID suffixes: not multiprocess

* Update critical-trace ctest

* Update critical-trace exe

- throw error if input cannot be opened
- throw error if input has no data

* Update lulesh example with more kokkos tools usage

* Fix tasking issue with critical_trace and roctracer

- were not setting pools to active
- also sync before critical_trace::get_entries

* Increase verbosity of critical-trace tests

* Update code coverage tests

- skip code coverage + preload
- code-coverage python example and test

* Remove duplication omnitrace.initialize function

* Skip python3.6 for ubuntu jammy

* Update MPI examples

- use MPI_Isend and MPI_Irecv
- explicitly use MPI_Bcast

* Update Formatting.cmake

- include C files in examples

* run-ci script does not check return of coverage

* mpi-allreduce link to libm

* Update ctest args in run-ci script

* Update dyninst submodule

- safety improvements in BinaryEdit::openResolvedLibraryName

* capture cmake error for ctest_coverage

[ROCm/rocprofiler-systems commit: 46b6db1a4c]
2022-10-31 15:39:45 -05:00
Jonathan R. Madsen d080fb3a1a CI for OpenSUSE (#12)
* CI for OpenSUSE

* OpenSUSE MPI --allow-run-as-root

- when testing in OpenSUSE docker container, tests using OpenMPI fail due to running as root
- This commit toggles OMNITRACE_CI_MPI_RUN_AS_ROOT to ON and tests whether --allow-run-as-root is supported

* Tweak to OMNITRACE_CI_MPI_RUN_AS_ROOT handling

[ROCm/rocprofiler-systems commit: 20e7b573d4]
2022-06-17 14:32:51 -05:00
Jonathan R. Madsen 76f30b1c6d omnitrace find_package support (#3)
* omnitrace find_package support

- Fix to INSTALL_DESTINATION for configure_package_config_file
- Fixes to ConfigInstall.cmake and omnitrace-config.cmake.in

* Test find_package

[ROCm/rocprofiler-systems commit: d2e635ed3c]
2022-05-24 22:45:26 -05:00
Jonathan R. Madsen 358a3a7e36 Docker and build-release script updates [skip ci] (#61)
- Update CPack

[ROCm/rocprofiler-systems commit: 9cba1f80ba]
2022-05-19 16:06:38 -05:00
Jonathan R. Madsen 127e30a4d7 Documentation + Miscellaneous Fixes (#36)
* Added documentation markdown source

* Replaced AARInternal with AMDResearch in URLs

* Renamed cpack artifact names

* Fix to testing and lulesh submodule checkout

* Docker updates

* CMake and CPack

- force CMAKE_INSTALL_LIBDIR to lib
- CPACK_DEBIAN_PACKAGE_RELEASE uses OMNITRACE_CPACK_SYSTEM_NAME
- CPACK_RPM_PACKAGE_RELEASE uses OMNITRACE_CPACK_SYSTEM_NAME
- Tweak LIBOMP_LIBRARY find in examples/openmp
- Tweak setup-env.sh.in

* Partial update of README

- status badges
- docs link
- removed install info (covered by docs)

* OMNITRACE_SAMPLING_CPUS setting

- enables control over which CPUs are sampled for frequency

* omnitrace exe updates

- exclude transaction clone, virtual thunk, non-virtual thunk
- module_function::start_address
- module_function::instructions
- verbosity > 0 encodes instructions into JSON

* Miscellaneous fixes

- relocate setup-env.sh.in
- add modulefile.in
- Updated README.md and source/docs/about.md
- cmake fix for libomp
- fix license in miscellaneous places
- dl.hpp and dl.cpp

* Update timemory and dyninst submodules

- timemory signals updates
- dyninst Movement-adhoc updates

* cmake format

[ROCm/rocprofiler-systems commit: 945f541965]
2022-04-04 15:27:38 -05:00