## Motivation
<!-- Explain the purpose of this PR and the goals it aims to achieve. -->
Remove Fortran example due to Palamida scan violation.
## Technical Details
<!-- Explain the changes along with any relevant GitHub links. -->
Revert 63713f01e0.
New test to be added later.
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
* Put cached perfetto traces as default one
* Improve cached data and perfetto traces in order to be more aligned with E2E tests
* Addressing PR comments and findings
* Force early instrumentation bundle instantiation
* Sync-up insturumented containers with thread growth data
* Revert ompvv number of host threads to default 8
* Fixed counter track namings for amd-smi
* AIPROFSYST-34 [rocprof-sys] Update documentation describing newly introduced changes to default tracing mechanism
## Motivation
To solve: SWDEV-566076
FFmpeg versions >= 58.134 no longer expose read_seek and read_seek2 function pointers in AVInputFormat,
requiring alternative seek detection methods. This pull request updates the `VideoDemuxer` class to improve compatibility with newer versions of FFmpeg. The main change is how the code determines whether the input file is seekable, addressing differences in FFmpeg API versions.
## Technical Details
In `video_demuxer.h`, added a conditional check for `USE_AVCODEC_GREATER_THAN_58_134` to set `is_seekable_` to `true` for newer FFmpeg versions, since `read_seek` and `read_seek2` are no longer exposed in `AVFormatContext`. For older versions, the previous method of checking these fields remains in place. The conditional compilation
now assumes seek capability is available for newer FFmpeg versions.
* Add XGMI and PCIe metrics to the profiling data
Add support for AMD XGMI (GPU-to-GPU interconnect) and PCIe
metrics:
* XGMI link width in bits
* XGMI link speed in GT/s
* Per-link read bandwidth (KB)
* Per-link write bandwidth (KB)
- Add new categories for PCIe metrics:
* PCIe link width
* PCIe link speed in GT/s
* Accumulated bandwidth (MB)
* Instantaneous bandwidth (MB/s)
* Fix VCN/JPEG insert logic
* Modify the gpu_metrics struct to accomodate XCP structure
* Add ctest automation for gpu interconnect metrics
* Refactor to move gpu_metrics struct and serialization to another file
* Possible fix for timeout in CI
Fix redundant skip check in ctest
Add xgmi and pcie option in rocprof-sys-avail.
* Change2: Address review comments
Change ctest sampling to avoid timeout
Change variable name and code structuring
* Add option in ctest to run rocprof-sys-run without rewrite
Run transferbench with rocprof-sys-run without sampling
* Change3: Fix sample insert bug and address review comments
xgmi and pci support check
renaming variables
additional hip_api validation in rocpd
* Reduce the load from the trnasferBench sample
The CI builds were timing out when flushing a big temporary file to the
DB: (2720824.23 KB / 2720.82 MB / 2.72 GB)...
* Consolidate CTests to tests/ folder
* Remove comment
* Consolidate CTests to tests/ folder
* Remove comment
* Separate source code and test code for thread-limit into appropriate folders
* Remove sleeper.cpp and instead use linux sleep cmd
* Merge python-console tests into python-tests
This PR fixes a segmentation fault seen when running rocprof-sys-sample with multi-process OpenMP/HIP applications.
The crash was caused by missing libomptarget.so on the runtime loader path or incorrect LD_PRELOAD settings.
Fixes SWDEV-552804
---------
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
* Added Fortran (amdflang) openmp tests using the openmp-vv project
---------
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
- Update minimum_cmake_required to match version used in CI
- We should match the minimum version that we test against
- Ensure ".S" files are treated as assembly.
- openmp-target: add runtime rpath for libomptarget and update tests
- Handle events not associated with a HIP Stream
- Kernels from OpenMP target offload are not associated with a HIP stream. Fix handling with the callback record's stream_id is 0
---------
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
[ROCm/rocprofiler-systems commit: c424dac261]
* Fix: Add missing <string.h> include for C string functions in RCCL tests
* Update examples/rccl/rccl-tests/src/common.h
Yes, confirmed—<cstring> alone works in my environment. Updated the PR
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
* clang-format
---------
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
[ROCm/rocprofiler-systems commit: 0ec3072e05]
- Create a local copy for ROCm/rccl-tests for our examples.
- Update argument parsing to no longer use getopt_long.
- Workaround for Dyninst instrumentation.
---------
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
[ROCm/rocprofiler-systems commit: 4e5029221b]
Test is missing from rocm-7.0 stack because of a HIP version check.
In these builds, hip_version.h is still reporting 6.5.0.
This check was originally put in to skip the test on older versions
of ROCm, which should no longer be required
- For SWDEV-537718
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
[ROCm/rocprofiler-systems commit: 28bee27253]
- Move the MPI gotcha functionality from Timemory to the repo.
- Add the PMPI Fortran MPI functions to the existing mpi gotcha handle.
[ROCm/rocprofiler-systems commit: 4fcd8cc78d]
- Add cmake formatting rule for `rocprofiler-systems-custom-compilation`
- Resolve build failure observed with the `openmp-target` example when building in some environments
- Observed in our docker images
- Ensure `libomptarget-amdgpu-gfx*.bc` files are found
[ROCm/rocprofiler-systems commit: 96b31d92c3]
- Fix for rocjpeg sample cmake due to changes in the rocJPEG project
- Fix for rocprofiler-sdk version check - change the format
- Edits to docs for jpeg and vcn activity support - mention that these values may not be supported on all ASICs.
[ROCm/rocprofiler-systems commit: fad3a0d341]
JPEG activity recording is currently only supported on MI300 serries.
VCN activity is supported in MI100 also but there is a bug currently being fixed by FW.
- Currently only testing the Activity verification tests for MI300
- Also moves the Jpeg image copying code to after the package is found.
[ROCm/rocprofiler-systems commit: e605e5d33f]
- Add rocDecode API Tracing support using domain `rocjpeg_api` in ROCPROFSYS_ROCM_DOMAINS.
- Modify existing `videodecode` and `jpegdecode` ctests to verify API tracing
- Print Perfetto values for easy debugging in verbose mode
- Convert CMake error to a warning and skip building the "decode" examples if requirements are not found
[ROCm/rocprofiler-systems commit: 3bea1d8eac]
- Add JPEG activity track in perfetto trace
- Add JPEG decode tests to the examples
- Change existing videodecode test to include JPEG testing
- Rename videodecode test file to decode to include jpeg tests too
- Fix a bug in the test which checks for total activity of 0
- Disable rocDecode and rocJPEG samples from the github image files
[ROCm/rocprofiler-systems commit: 59d3399901]
- Copy sample videos and include in install.
- Only using the H26* videos, so the same tests can be used on MI100, which lacks AV1 decode support
- Update test parameters to videodecode test, based on feedback from the rocDecode team.
[ROCm/rocprofiler-systems commit: 5b439d80a0]
- Porting from https://github.com/ROCm/omnitrace/pull/411
- Improve OMPT support
- Add OpenMP target example to testing
- Update Timemory submodule to use ROCm/Timemory rather than NERSC/Timemory
- Update `actions/upload-artifacts` to v4
- Standardize the `cmake_minimum_required` to 3.18.4 across workflows, project, and examples
- Updated Ubuntu 20.04 workflows
[ROCm/rocprofiler-systems commit: 7dce5926a7]
The Omnitrace program is being renamed.
Full name: "ROCm Systems Profiler"
Package name: "rocprofiler-systems"
Binary / Library names: "rocprof-sys-*"
---------
Co-authored-by: Xuan Chen <xuchen@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
[ROCm/rocprofiler-systems commit: d07bf508a9]
Fix for "SWDEV-479652" - Perfetto-based tests are failing.
Updated version of perfetto submodule to v46.0.
Modified Omnitrace code that uses Perfetto, so it can compile.
Modified the testing code, so it can run the version of trace_processor_shell provided (v46.0).
---------
Signed-off-by: Aleksandar Janicijevic <Aleksandar.Janicijevic@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
[ROCm/rocprofiler-systems commit: 96d7b8f0ab]