## Motivation
The goal is to unify how and where we store our traces. The current implementation uses `trace_cache` for rocpd traces, but Perfetto handling is inlined in each module. This change gives us a single place in the code where we collect data, process it, and store it in the desired format. That lets us declutter the code further and have a single point of responsibility and a single point of failure.
## Technical Details
A new processor (`perfetto_post_processing.cpp`) is added to `trace_cache`; its purpose is to use the cached data to populate Perfetto tracks. The cache manager owns the processor instance and is responsible for its lifetime.
While working on this ticket, I also noticed that the program would segfault when `ROCPROFSYS_ROCM_DOMAINS=roctx` is set, even though the docs say this is supported. Went ahead and fixed that.
I also noticed that the timemory push/pop in `rocprofiler-sdk.cpp` always used `category::rocm_marker_api` instead of `CategoryT`. Fixed that as well.
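To illustrate the `CategoryT` fix, here is a minimal sketch of the bug pattern, not the actual code from `rocprofiler-sdk.cpp`. The tag types, `pushed_categories()`, and both `push_region*` helpers are hypothetical stand-ins; only the names `category::rocm_marker_api` and `CategoryT` come from the change itself.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-ins for the real category tag types.
namespace category
{
struct rocm_marker_api { static const char* name() { return "rocm_marker_api"; } };
struct rocm_hip_api    { static const char* name() { return "rocm_hip_api"; } };
}  // namespace category

// Records which category each push was attributed to (illustration only).
inline std::vector<std::string>& pushed_categories()
{
    static std::vector<std::string> log;
    return log;
}

// Buggy shape: the template parameter is ignored and the marker
// category is hard-coded, so every region lands in the same category.
template <typename CategoryT>
void push_region_hardcoded(const std::string& /*label*/)
{
    pushed_categories().emplace_back(category::rocm_marker_api::name());
}

// Fixed shape: the category is taken from the template parameter.
template <typename CategoryT>
void push_region(const std::string& /*label*/)
{
    pushed_categories().emplace_back(CategoryT::name());
}
```

With the hard-coded variant, a call such as `push_region_hardcoded<category::rocm_hip_api>("hipMalloc")` is still attributed to `rocm_marker_api`; the templated variant attributes it to `rocm_hip_api`.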
* Enable HOST ompvv runtime-instrumentation ctests
* Fix rocprofiler-systems-avail-regex-negation test failure
* Exclude problematic function from instrumentation
* Make skipping the push/pop check an environment option for ctests
* Remove SKIP_PUSH_POP_CHECK from argument parse
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
- Added `binary-rewrite-cleanup` and `runtime-instrument-cleanup` tests that remove instrumented binaries and output directories using `cmake -E rm -rf`
- Implemented CMake test fixtures (`FIXTURES_SETUP` and `FIXTURES_CLEANUP`) to establish proper test ordering:
  - `binary-rewrite` sets up the `binary-rewrite-fixture`
  - `binary-rewrite-run` and validation tests require this fixture
  - `binary-rewrite-cleanup` performs cleanup for this fixture
  - Same pattern applied for `runtime-instrument`
- Extended `ROCPROFILER_SYSTEMS_ADD_PYTHON_TEST` to accept `FIXTURES_REQUIRED` parameter
- Updated validation tests to require appropriate cleanup fixtures based on test name pattern matching
- Added fixture requirements to Python code-coverage tests
Update the format script to use absolute path for clang-format-diff.py
instead of relative path. This ensures the script works correctly
regardless of the current working directory when executed.
- Change from './clang-format-diff.py' to '${root}/projects/rocr-runtime/clang-format-diff.py'
- Improves script reliability and portability
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: change shmem size to 80
Some DGPU props have a lot of information,
so it is necessary to increase the size of shmem.
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: use BO handle instead of pointer in memory registration
Change vhsakmt_map_to_gpu() return type from void* to vhsakmt_bo_handle
to properly handle buffer object information. This allows access to
both the host address and resource ID needed for memory registration.
Signed-off-by: Honglei Huang <honghuan@amd.com>
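The motivation for returning a handle instead of `void*` can be sketched as follows. This is an illustration under assumptions, not the libhsakmt code: `bo_sketch`, its field names, `bo_table()`, `map_to_gpu_sketch()`, and `register_memory_sketch()` are all hypothetical. The point is only that a handle carries both the host address and the resource ID, while a bare pointer carries one of them.

```cpp
#include <cstdint>
#include <unordered_map>

// Hypothetical stand-in for a buffer object; field names are assumptions.
struct bo_sketch
{
    void*         host_addr;  // CPU-visible address of the mapping
    std::uint32_t res_id;     // resource ID needed for memory registration
};

using bo_handle_sketch = bo_sketch*;

// Hypothetical registry playing the role of the driver's BO bookkeeping.
inline std::unordered_map<std::uint64_t, bo_sketch>& bo_table()
{
    static std::unordered_map<std::uint64_t, bo_sketch> table;
    return table;
}

// Returns a handle (nullptr if unknown) rather than a bare host pointer,
// so callers can reach every field of the buffer object.
inline bo_handle_sketch map_to_gpu_sketch(std::uint64_t gpu_va)
{
    auto it = bo_table().find(gpu_va);
    return it == bo_table().end() ? nullptr : &it->second;
}

// Registration needs the resource ID as well as a valid host address.
inline bool register_memory_sketch(bo_handle_sketch bo, std::uint32_t& out_res_id)
{
    if (bo == nullptr || bo->host_addr == nullptr) return false;
    out_res_id = bo->res_id;
    return true;
}
```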
* libhsakmt/virtio: Improve memory mapping logic
- Update vhsakmt_mappable() to check NoAddress flag and require HostAccess
- Remove mappable checks in cpu_map/unmap to allow all BOs to be mapped
- Set BO flags properly in vhsakmt_alloc_memory and scratch memory creation
- Ensure scratch memory is correctly flagged for proper handling
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: add no svm mode for libhsakmt virtio
Add a no-SVM mode to the libhsakmt virtio driver. In no-SVM mode, userptrs
must be managed by the UMD, so add an interval tree to track them.
New Features:
- Add augmented red-black tree based interval tree implementation
  * Implement RB-tree insertion, deletion, and color balancing
  * Provide interval query for fast overlapping range lookup
  * Based on Linux kernel's augmented rbtree implementation
- Improve userptr memory management
  * Use interval tree to efficiently track userptr memory regions
  * Support finding registered memory within given address ranges
  * Optimize memory mapping and unmapping performance
Signed-off-by: Honglei Huang <honghuan@amd.com>
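The overlapping-range lookup used for userptr tracking can be sketched as below. This is not the augmented red-black tree from the commit: it swaps in `std::map` (itself typically a red-black tree) keyed by start address and assumes registered ranges never overlap each other, which keeps the query short. The class name and its methods are hypothetical.

```cpp
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Simplified stand-in for an interval tree. Ranges are half-open
// [start, end) and are assumed not to overlap one another.
class userptr_ranges_sketch
{
public:
    using range = std::pair<std::uint64_t, std::uint64_t>;

    // Registers [start, start + size); rejects empty or overlapping ranges.
    bool insert(std::uint64_t start, std::uint64_t size)
    {
        if (size == 0) return false;
        const std::uint64_t end = start + size;
        if (!find_overlapping(start, end).empty()) return false;
        ranges_[start] = end;
        return true;
    }

    // Removes the range that begins at start, if any.
    bool erase(std::uint64_t start) { return ranges_.erase(start) > 0; }

    // Returns every registered range that intersects [lo, hi).
    std::vector<range> find_overlapping(std::uint64_t lo, std::uint64_t hi) const
    {
        std::vector<range> out;
        auto it = ranges_.upper_bound(lo);  // first range starting after lo
        if (it != ranges_.begin()) --it;    // the previous one may straddle lo
        for (; it != ranges_.end() && it->first < hi; ++it)
            if (it->second > lo) out.emplace_back(it->first, it->second);
        return out;
    }

private:
    std::map<std::uint64_t, std::uint64_t> ranges_;  // start -> end (exclusive)
};
```

A real interval tree additionally stores the maximum end address per subtree, which lets it handle ranges that overlap each other; the sketch avoids that by rejecting overlapping inserts.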
---------
Signed-off-by: Honglei Huang <honghuan@amd.com>
## Motivation
Resolved: SWDEV-566226
The current implementation of agents in rocprof-systems keeps only the minimal set of information required to populate the `info_agent` table in the rocpd database. A significant amount of agent data is left out of the database; this change stores the additional agent information in the `extdata` field of the `info_agent` table.
## Technical Details
This PR introduces an additional field in the `agent` structure that holds a JSON-formatted string with all the additional information we can acquire about a particular agent. This data is collected and processed during the initial fetching of agents and then written to the database.
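As a rough sketch of what building such a JSON string could look like (assuming, for illustration only, that the extra properties are available as string key/value pairs): `json_escape()` and `make_extdata_sketch()` are hypothetical helpers, not the functions used in this PR, and the escaping covers only quotes, backslashes, and newlines.

```cpp
#include <map>
#include <string>

// Minimal escaping for the sketch: quotes, backslashes, and newlines only.
// A production serializer must handle all control characters.
inline std::string json_escape(const std::string& in)
{
    std::string out;
    for (char c : in)
    {
        if (c == '"' || c == '\\')
        {
            out += '\\';
            out += c;
        }
        else if (c == '\n')
            out += "\\n";
        else
            out += c;
    }
    return out;
}

// Serializes agent properties into a flat JSON object string.
// All values are emitted as JSON strings (an assumption of this sketch).
inline std::string make_extdata_sketch(const std::map<std::string, std::string>& props)
{
    std::string out   = "{";
    bool        first = true;
    for (const auto& kv : props)
    {
        if (!first) out += ",";
        first = false;
        out += "\"" + json_escape(kv.first) + "\":\"" + json_escape(kv.second) + "\"";
    }
    out += "}";
    return out;
}
```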
---------
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
## Motivation
To solve: SWDEV-566076
FFmpeg versions >= 58.134 no longer expose the `read_seek` and `read_seek2` function pointers in `AVInputFormat`, so an alternative seek-detection method is required. This pull request updates the `VideoDemuxer` class to improve compatibility with newer versions of FFmpeg. The main change is in how the code determines whether the input file is seekable, which differs between FFmpeg API versions.
## Technical Details
In `video_demuxer.h`, added a conditional check on `USE_AVCODEC_GREATER_THAN_58_134` that sets `is_seekable_` to `true` for newer FFmpeg versions, since `read_seek` and `read_seek2` are no longer exposed in `AVInputFormat`. For older versions, the previous method of checking these fields remains in place. In other words, the conditional compilation now assumes seek capability is available on newer FFmpeg versions.
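The shape of that conditional can be sketched without FFmpeg headers. `legacy_input_format_sketch` is a hypothetical stand-in for the older `AVInputFormat` layout, `detect_seekable_sketch()` is not the actual `VideoDemuxer` code, and the sketch assumes the macro is defined to `1` by the build system when it applies.

```cpp
// Hypothetical stand-in for the legacy input-format struct that still
// exposed the seek callbacks. Signatures are simplified for the sketch.
struct legacy_input_format_sketch
{
    int (*read_seek)(void*, int, long long, int);
    int (*read_seek2)(void*, int, long long, long long, long long, int);
};

// Decides whether the input should be treated as seekable.
inline bool detect_seekable_sketch(const legacy_input_format_sketch* iformat)
{
#if USE_AVCODEC_GREATER_THAN_58_134
    // Newer FFmpeg: the callbacks are not visible, so assume seekable.
    (void) iformat;
    return true;
#else
    // Older FFmpeg: seekable if either seek callback is provided.
    return iformat != nullptr && (iformat->read_seek != nullptr || iformat->read_seek2 != nullptr);
#endif
}
```

One consequence of this design is that on newer FFmpeg a non-seekable input is only discovered when a seek actually fails, so callers should still check the return value of the seek call.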
Added runtime PM detection and DRM ioctl-based device wake
to handle GPUs in BACO state. Modified tests to wake
suspended devices before reading sysfs files.
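The detection step can be sketched as follows, assuming it is based on the kernel's per-device `power/runtime_status` sysfs attribute (for example under `/sys/bus/pci/devices/<bdf>/`). That attribute is a real kernel interface, but whether this change reads it this way is an assumption; `read_runtime_status()` and `needs_wake()` are hypothetical helpers, and the DRM ioctl used for the wake itself is not shown.

```cpp
#include <fstream>
#include <string>

// Reads the first token of a runtime_status file; returns an empty
// string if the file cannot be opened or is empty.
inline std::string read_runtime_status(const std::string& path)
{
    std::ifstream in(path);
    std::string   status;
    if (in) in >> status;
    return status;
}

// A device that is suspended (or on its way there) should be woken
// before other sysfs files are read. The kernel reports states such as
// "active", "suspended", "suspending", and "resuming".
inline bool needs_wake(const std::string& status)
{
    return status == "suspended" || status == "suspending";
}
```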
---------
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
* Added PCIe Atomic Operations enable check.
Tests if atomic operations are enabled for GPU devices.
Displays the Atomic routing capability via Link capability and status.
Signed-off-by: Saravanan Solaiyappan <saravanan.solaiyappan@amd.com>