Commit graph

115 commits

Author SHA1 Message Date
Kian Cossettini 9f014db6a4 [rocprofiler-systems] Update install path for examples (#2625)
* Update install path for examples to `share/rocprofiler-systems/examples`

----

Co-authored-by: Kian Cossettini <Kian.Cossettini@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2026-01-15 21:51:16 -05:00
David Galiffi 2daec0e4d0 Revert 63713f01e0 (#2585)
## Motivation

Remove Fortran example due to Palamida scan violation.

## Technical Details

Revert 63713f01e0.
New test to be added later.

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2026-01-12 23:44:26 -05:00
David Galiffi cb17e59a57 [rocprofiler-systems] Improve build time by refactoring RCCL test cmake (#1656)
Improve cmake configuration time by making sure the rccl-tests are built during the build phase rather than the configuration phase.
2026-01-07 19:51:54 -05:00
marantic-amd ba1380a75d Put cached perfetto traces as default one (#2138)
* Put cached perfetto traces as default one

* Improve cached data and perfetto traces in order to be more aligned with E2E tests

* Addressing PR comments and findings

* Force early instrumentation bundle instantiation

* Sync-up instrumented containers with thread growth data

* Revert ompvv number of host threads to default 8

* Fixed counter track namings for amd-smi

* AIPROFSYST-34 [rocprof-sys] Update documentation describing newly introduced changes to default tracing mechanism
2025-12-22 12:47:35 +01:00
habajpai-amd 6b45657493 update build rccl-tests infrastructure and add getAlgoProtoChannels support (#2212)
2025-12-11 18:29:06 +05:30
Mario Limonciello d1aaae2539 Run pre-commit's whitespace related hooks on projects/rocprofiler-systems (#2123)
In order for pre-commit to be useful, everything needs to meet a common
baseline.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-12-04 23:39:42 -05:00
Kian Cossettini 63713f01e0 [rocprofiler-systems] Add Fortran MPI CTests (#1172)
* Add MPI CTests (use gfortran)

* Add proper regex check

* Skip Runtime-Instrument due to incompatibility with MPI

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Sajina Kandy <sputhala@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-27 10:32:09 -05:00
anujshuk-amd 85b5c03f36 [rocprof-sys] Fix test build failure on RHEL 10 (#1955)
## Motivation

To solve: SWDEV-566076 
FFmpeg versions >= 58.134 no longer expose read_seek and read_seek2 function pointers in AVInputFormat,
requiring alternative seek detection methods. This pull request updates the `VideoDemuxer` class to improve compatibility with newer versions of FFmpeg. The main change is how the code determines whether the input file is seekable, addressing differences in FFmpeg API versions.


## Technical Details

In `video_demuxer.h`, added a conditional check for `USE_AVCODEC_GREATER_THAN_58_134` to set `is_seekable_` to `true` for newer FFmpeg versions, since `read_seek` and `read_seek2` are no longer exposed in `AVInputFormat`. For older versions, the previous method of checking these fields remains in place. The conditional compilation now assumes seek capability is available for newer FFmpeg versions.
2025-11-25 15:25:05 -05:00
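The version gate described in commit 85b5c03f36 can be sketched as follows. This is a Python illustration of the decision logic only, not the C++ preprocessor code in `video_demuxer.h`: the function name `is_seekable`, the `(major, minor)` tuple, and the dict standing in for the FFmpeg input-format struct are all hypothetical, and the inclusive threshold follows the commit message's ">= 58.134" wording.

```python
# Threshold quoted in the commit message; treating it as inclusive is an assumption.
SEEK_FIELDS_HIDDEN_FROM = (58, 134)


def is_seekable(avcodec_version, input_format):
    """Decide whether an input should be treated as seekable.

    avcodec_version: (major, minor) tuple (hypothetical representation).
    input_format: dict standing in for the input-format struct (hypothetical).
    """
    if avcodec_version >= SEEK_FIELDS_HIDDEN_FROM:
        # Newer FFmpeg: the seek callbacks are not exposed, so assume seekable.
        return True
    # Older FFmpeg: seekable if either seek callback is present.
    return bool(input_format.get("read_seek") or input_format.get("read_seek2"))
```

With an older version and no seek callbacks the sketch returns `False`; at or above the threshold it returns `True` regardless of the dict's contents.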
habajpai-amd 1a3564a51a [rocprof-sys] Fix fork() handling for GPU profiling and AMD SMI (#1930)
- Fix fork() handling for GPU profiling and AMD SMI
- Add hipMallocConcurrency test for CI with GPU
2025-11-24 09:21:27 -05:00
Sajina PK 09b8342e22 [Rocprofiler-systems] : Add XGMI and PCIe metrics to the profiling data (#1628)
* Add XGMI and PCIe metrics to the profiling data

Add support for AMD XGMI (GPU-to-GPU interconnect) and PCIe
metrics:
  * XGMI link width in bits
  * XGMI link speed in GT/s
  * Per-link read bandwidth (KB)
  * Per-link write bandwidth (KB)

- Add new categories for PCIe metrics:
  * PCIe link width
  * PCIe link speed in GT/s
  * Accumulated bandwidth (MB)
  * Instantaneous bandwidth (MB/s)

* Fix VCN/JPEG insert logic

* Modify the gpu_metrics struct to accommodate XCP structure

* Add ctest automation for gpu interconnect metrics

* Refactor to move gpu_metrics struct and serialization to another file

* Possible fix for timeout in CI

Fix redundant skip check in ctest
Add xgmi and pcie option in rocprof-sys-avail.

* Change2: Address review comments

Change ctest sampling to avoid timeout
Change variable name and code structuring

* Add option in ctest to run rocprof-sys-run without rewrite

Run transferbench with rocprof-sys-run without sampling

* Change3: Fix sample insert bug and address review comments

xgmi and pci support check
renaming variables
additional hip_api validation in rocpd

* Reduce the load from the TransferBench sample

The CI builds were timing out when flushing a big temporary file to the
DB: (2720824.23 KB / 2720.82 MB / 2.72 GB)...
2025-11-14 19:42:33 -05:00
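The three figures quoted in the timeout message of commit 09b8342e22 are the same size in units that differ by a factor of 1000. A minimal Python sketch of that conversion, with a hypothetical helper name `format_size_kb` (the profiler's own formatting code is not shown in this log):

```python
def format_size_kb(kilobytes):
    """Render a size given in KB as 'KB / MB / GB' using decimal (x1000) steps."""
    megabytes = kilobytes / 1000
    gigabytes = kilobytes / 1_000_000
    return f"{kilobytes:.2f} KB / {megabytes:.2f} MB / {gigabytes:.2f} GB"
```

Calling `format_size_kb(2720824.23)` reproduces the string from the log.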
ajanicijamd 2f9017f706 Fix build failure with Clang 20. (#1667)
* Modified for Clang

* Updated timemory version so it compiles with Clang 20

* Using TBB version 2018.6 for both GCC and Clang builds
2025-11-08 11:36:12 -05:00
Kian Cossettini f4d0aeb8f3 Adjust host thread count for OpenMP-VV tests (#1742)
Reducing test time
2025-11-06 16:04:47 -05:00
Kian Cossettini 2a080641a1 [rocprofiler-systems] Consolidate CTests to tests/ folder (#1461)
* Consolidate CTests to tests/ folder

* Remove comment

* Separate source code and test code for thread-limit into appropriate folders

* Remove sleeper.cpp and instead use linux sleep cmd

* Merge python-console tests into python-tests
2025-11-03 11:03:35 -05:00
Kian Cossettini db949445c3 [rocprofiler-systems] Overhaul OpenMP-VV Test compilation (#1389)
* Reworked Compilation

* Formatting

* Change compile log name

* Optimize Code

* Remove gfx940 and gfx941
2025-10-23 13:58:11 -04:00
habajpai-amd 74fc268a32 Add libomptarget discovery to prevent OpenMP/HIP segfaults (#1043)
This PR fixes a segmentation fault seen when running rocprof-sys-sample with multi-process OpenMP/HIP applications.
The crash was caused by missing libomptarget.so on the runtime loader path or incorrect LD_PRELOAD settings.

Fixes SWDEV-552804

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-10-01 09:51:26 -04:00
Julian Jose 8157437273 [Palamida scan] SWDEV-553054 Adding missing copyrights information (#900)
* Add missing copyright headers in rocprofiler-systems
* Update python-tests
* Update causal test

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-09-12 14:17:58 -04:00
Kian Cossettini 5d582fcd37 [rocprofiler-systems] Add Fortran OpenMP CTests (#874)
* Added Fortran (amdflang) openmp tests using the openmp-vv project

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-09-12 09:52:16 -04:00
habajpai-amd 1c7293e6d0 Add bounds checks in transpose_a for both load and store so edge tiles don't read/write past MxN (#950)
2025-09-12 17:32:30 +05:30
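The bounds checks named in commit 1c7293e6d0 can be illustrated with a minimal Python sketch. It is not the HIP kernel from the example: `transpose_tiled`, `TILE`, and the list-of-lists layout are hypothetical, and the scratch buffer only stands in for on-chip tile storage. The point is that when M or N is not a multiple of the tile size, edge tiles must skip out-of-range indices on both the load and the store.

```python
TILE = 4  # hypothetical tile width; the real kernel's tile size is not stated here


def transpose_tiled(a, m, n, tile=TILE):
    """Transpose an m x n matrix (list of rows) tile by tile, guarding edge tiles."""
    out = [[0] * m for _ in range(n)]
    for row0 in range(0, m, tile):
        for col0 in range(0, n, tile):
            # Load phase: copy one tile into a scratch buffer.
            scratch = [[0] * tile for _ in range(tile)]
            for i in range(tile):
                for j in range(tile):
                    r, c = row0 + i, col0 + j
                    if r < m and c < n:  # bounds check on load
                        scratch[i][j] = a[r][c]
            # Store phase: write the tile back transposed.
            for i in range(tile):
                for j in range(tile):
                    r, c = col0 + i, row0 + j  # output coordinates
                    if r < n and c < m:  # bounds check on store
                        out[r][c] = scratch[j][i]
    return out
```

Without the two `if` guards, a 2 x 3 input with a tile width of 4 would index past the input on load and past the output on store.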
habajpai-amd fb6fe518e8 fix(transpose): correct host allocation and GB/s calculation (#860)
2025-09-04 16:08:16 -04:00
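Commit fb6fe518e8 mentions a GB/s calculation without showing it. One common convention for a transpose, assumed here, counts each element as read once and written once, so the bytes moved are 2 x M x N x element size. A minimal Python sketch with a hypothetical function name; the example's actual formula may differ:

```python
def transpose_bandwidth_gbs(m, n, elem_size_bytes, elapsed_seconds):
    """Effective bandwidth in GB/s (decimal, 1 GB = 1e9 bytes) for an m x n transpose."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    # Assumption: one read plus one write per element.
    bytes_moved = 2 * m * n * elem_size_bytes
    return bytes_moved / 1e9 / elapsed_seconds
```

For instance, a 1000 x 1000 matrix of 4-byte elements transposed in one millisecond moves 8 MB, which works out to about 8 GB/s under this convention.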
habajpai-amd cd729ab630 Improve library discovery in openmp-target example (#792)
cmake(openmp/target): make libomptarget discovery robust across ROCm layouts
2025-08-28 14:55:55 -04:00
Kian Cossettini 07a7b9b845 Use rocprofiler-SDK for OMPT tracing (#702)
Switch to using SDK for OMPT tracing and remove older OMPT code path
2025-08-26 16:54:01 -04:00
David Galiffi 847580dd9e Update minimum_cmake_required to match version used in CI (#679)
- Update minimum_cmake_required to match version used in CI
  - We should match the minimum version that we test against

- Ensure ".S" files are treated as assembly.
2025-08-21 15:56:47 -04:00
habajpai-amd 15fb4943e2 Fix the openmp-target ctest (#300)
- openmp-target: add runtime rpath for libomptarget and update tests
- Handle events not associated with a HIP Stream
  - Kernels from OpenMP target offload are not associated with a HIP stream. Fix handling when the callback record's stream_id is 0

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: c424dac261]
2025-07-31 10:41:27 -04:00
Kian Cossettini df7605835f fork-runtime-instrument ctest fix (#295)
* Fix fork-runtime-instrument ctest failure by adding -fno-inline flag

[ROCm/rocprofiler-systems commit: d1a2deba1f]
2025-07-31 07:59:41 -04:00
Sajina PK aa9b265302 Manually search for rocdecode and rocjpeg libraries in cmake (#294)
* Manually search for rocdecode and rocjpeg libraries

* Update examples/jpegdecode/CMakeLists.txt

Fix typo.

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: f4e9846e1c]
2025-07-21 14:07:00 -04:00
habajpai-amd 7eb189db84 Add missing <cstring> include for C string functions in RCCL tests (#282)
* Fix: Add missing <string.h> include for C string functions in RCCL tests

* Update examples/rccl/rccl-tests/src/common.h

Yes, confirmed—<cstring> alone works in my environment. Updated the PR

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

* clang-format

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 0ec3072e05]
2025-07-16 11:23:50 -04:00
Sajina PK 1892bdef83 Make CMakeFile fixes to align with rocDecode and rocJpeg changes (#281)
[ROCm/rocprofiler-systems commit: be06384250]
2025-07-14 19:23:19 -04:00
Pranjal Swarup 0497b7934f Update RCCL-tests in examples folder (#261)
- Create a local copy for ROCm/rccl-tests for our examples.
- Update argument parsing to no longer use getopt_long.
- Workaround for Dyninst instrumentation.

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 4e5029221b]
2025-06-27 11:44:13 -04:00
anujshuk-amd 0c91a0d8ed Add ctests to verify roctx api (#260)
---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 3631362903]
2025-06-25 14:01:04 -04:00
David Galiffi 8fcf3a50b0 Use gersemi for CMake formatting (#257)
* Replace `cmake-format` with `gersemi`

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Remove .cmake-format.yaml

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update workflow to use gersemi

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update CONTRIBUTING.md

* Update helper scripts

* Don't include `*/external/*` in workflows

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 122623a929]
2025-06-22 10:44:33 -04:00
David Galiffi 0403aaa97f Use clang-format-18 for source formatting (#256)
* Updating clang-format to v18

- Updates the pre-commit-config
- Formats source files according to the utility

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update format source workflow

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update CONTRIBUTING

* Update comment in .clang-format

* Update CONTRIBUTING.md

* Update helper script

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 1e13b590e7]
2025-06-22 08:48:08 -04:00
Daniel Su f155355b33 Add thread header to videodecode example (#252)
[ROCm/rocprofiler-systems commit: 475d6c0f1f]
2025-06-18 12:40:24 -04:00
David Galiffi db21150ab0 Fix OpenMP-Target ctest (#241)
Test is missing from rocm-7.0 stack because of a HIP version check.
In these builds, hip_version.h is still reporting 6.5.0.
This check was originally put in to skip the test on older versions
of ROCm, which should no longer be required

- For SWDEV-537718

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 28bee27253]
2025-06-11 16:00:48 -04:00
Sajina PK 916aac1e92 Enable MPI tracing for Fortran (#185)
- Move the MPI gotcha functionality from Timemory to the repo.
- Add the PMPI Fortran MPI functions to the existing mpi gotcha handle.

[ROCm/rocprofiler-systems commit: 4fcd8cc78d]
2025-06-04 18:06:18 -04:00
David Galiffi 172fc80443 Fix openmp-target test on gfx950 (#204)
for SWDEV530112

[ROCm/rocprofiler-systems commit: 45b600b34e]
2025-05-12 14:20:24 -04:00
David Galiffi 29a312f7df Fix build failure for the openmp-target example when building in docker. (#197)
- Add cmake formatting rule for `rocprofiler-systems-custom-compilation`
- Resolve build failure observed with the `openmp-target` example when building in some environments
  - Observed in our docker images
  - Ensure `libomptarget-amdgpu-gfx*.bc` files are found


[ROCm/rocprofiler-systems commit: 96b31d92c3]
2025-05-05 21:49:46 -04:00
Sajina PK b5b5bac8b2 Updates to rocDecode and rocJPEG samples (#166)
* Edit CMakeList files and include paths for library headers.


[ROCm/rocprofiler-systems commit: e805c0f3e7]
2025-04-16 17:12:21 -04:00
Sajina PK 04fb7e4fe7 RocJpeg cmake and document fixes (#157)
- Fix for rocjpeg sample cmake due to changes in the rocJPEG project
- Fix for rocprofiler-sdk version check - change the format
- Edits to docs for jpeg and vcn activity support - mention that these values may not be supported on all ASICs.

[ROCm/rocprofiler-systems commit: fad3a0d341]
2025-04-09 16:20:02 -04:00
Sajina PK e004775878 Jpeg sample fix for seg fault (#158)
Update the rocJpeg sample used for testing with the latest one from the rocJPEG repo.

[ROCm/rocprofiler-systems commit: 83d6f73f03]
2025-04-03 13:54:32 -04:00
Sajina PK 3ca3d63d5c Fix for excluding JPEG and VCN activity test. (#135)
JPEG activity recording is currently only supported on the MI300 series.
VCN activity is also supported on MI100, but there is a bug currently being fixed by FW.

- Currently only testing the Activity verification tests for MI300
- Also moves the Jpeg image copying code to after the package is found.

[ROCm/rocprofiler-systems commit: e605e5d33f]
2025-03-11 14:12:28 -04:00
Sajina PK 1db0539c30 Add support for rocJPEG API tracing (#116)
- Add rocJPEG API tracing support using domain `rocjpeg_api` in ROCPROFSYS_ROCM_DOMAINS.
- Modify existing `videodecode` and `jpegdecode` ctests to verify API tracing
- Print Perfetto values for easy debugging in verbose mode
- Convert CMake error to a warning and skip building the "decode" examples if requirements are not found

[ROCm/rocprofiler-systems commit: 3bea1d8eac]
2025-02-25 21:14:14 -05:00
Sajina PK 572f9532ef JPEG Activity tracing in Perfetto (#108)
- Add JPEG activity track in perfetto trace
- Add JPEG decode tests to the examples
- Change existing videodecode test to include JPEG testing
- Rename videodecode test file to decode to include jpeg tests too
- Fix a bug in the test which checks for total activity of 0
- Disable rocDecode and rocJPEG samples from the github image files

[ROCm/rocprofiler-systems commit: 59d3399901]
2025-02-21 10:25:01 -05:00
David Galiffi 0c1cf2c7d7 Update the videodecode ctest (#85)
- Copy sample videos and include in install.
- Only using the H26* videos, so the same tests can be used on MI100, which lacks AV1 decode support
- Update test parameters to videodecode test, based on feedback from the rocDecode team.

[ROCm/rocprofiler-systems commit: 5b439d80a0]
2025-01-29 16:53:16 -05:00
Peter Park 3f9a3861ac Update copyright year to 2025 (#83)
[ROCm/rocprofiler-systems commit: 0a15d355e0]
2025-01-29 16:53:16 -05:00
Sajina PK b502b03be2 Add tests to check VCN activity tracing in perfetto output (#75)
The test will run sampling on the example from the rocDecode repo:
https://github.com/ROCm/rocDecode/tree/develop/samples/videoDecodeBatch

Then, ensure that the Perfetto output captures VCN activity in the trace.

[ROCm/rocprofiler-systems commit: 6dd4938b3e]
2025-01-29 16:53:16 -05:00
darren-amd 5a6f6ed83d Allow disabling of openmp examples (#64)
Add flag to disable openmp examples

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 888b9a43a0]
2024-12-16 14:31:56 -05:00
David Galiffi 6a6fd7f0f9 OMPT Target Offload Support (#17)
- Porting from https://github.com/ROCm/omnitrace/pull/411
- Improve OMPT support
- Add OpenMP target example to testing
- Update Timemory submodule to use ROCm/Timemory rather than NERSC/Timemory
- Update `actions/upload-artifacts` to v4
- Standardize the `cmake_minimum_required` to 3.18.4 across workflows, project, and examples
- Updated Ubuntu 20.04 workflows

[ROCm/rocprofiler-systems commit: 7dce5926a7]
2024-11-07 16:49:32 -05:00
David Galiffi 181a782835 Rename some examples still using the "omni" prefix. (#8)
* Rename some examples still using the "omni" prefix.

* CMake formatting

[ROCm/rocprofiler-systems commit: 656b34b61f]
2024-10-17 14:52:09 -04:00
David Galiffi 489eda995d Rename Omnitrace to ROCm Systems Profiler (#4)
The Omnitrace program is being renamed. 

Full name: "ROCm Systems Profiler"
Package name: "rocprofiler-systems"
Binary / Library names: "rocprof-sys-*"

---------
Co-authored-by: Xuan Chen <xuchen@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: d07bf508a9]
2024-10-15 11:20:40 -04:00
ajanicijamd 218f8bcbea Update Perfetto and fix tests (#378)
Fix for "SWDEV-479652" - Perfetto-based tests are failing.

Updated version of perfetto submodule to v46.0.
Modified Omnitrace code that uses Perfetto, so it can compile.
Modified the testing code, so it can run the version of trace_processor_shell provided (v46.0).

---------

Signed-off-by: Aleksandar Janicijevic <Aleksandar.Janicijevic@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>

[ROCm/rocprofiler-systems commit: 96d7b8f0ab]
2024-09-13 13:43:26 -04:00