Victor Zhang
92fcc928b6
SWDEV-526773 - Modify LaunchDelayKernel to set a hard coded WallClock… ( #1911 )
...
* SWDEV-526773 - Modify LaunchDelayKernel to set a hard-coded WallClock value when it's not available
* Change the hard-coded clock rate to units of kHz.
2025-11-25 11:21:03 -05:00
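The commit above describes falling back to a fixed wall-clock rate, expressed in kHz, when the device does not report one. A minimal sketch of that idea follows. The names `effective_rate_khz` and `delay_ticks` are hypothetical, as is any fallback value passed in; the value actually chosen in the commit is not shown in this log. Because a rate in kHz is ticks per millisecond, a millisecond delay converts to ticks with a single multiplication.

```cpp
#include <cstdint>

// Hypothetical helper: use the reported wall-clock rate (kHz) when it is
// non-zero, otherwise fall back to a caller-supplied hard-coded rate.
std::uint64_t effective_rate_khz(std::uint64_t reported_khz,
                                 std::uint64_t fallback_khz) {
    return reported_khz != 0 ? reported_khz : fallback_khz;
}

// A rate in kHz equals ticks per millisecond, so ticks = ms * kHz.
std::uint64_t delay_ticks(std::uint64_t delay_ms, std::uint64_t rate_khz) {
    return delay_ms * rate_khz;
}
```

Keeping the rate in kHz avoids a unit conversion at every call site that takes its delay in milliseconds.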
Ethan Trinh
2042191e23
Suppress deprecated-declaration warnings ( #1817 )
...
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com >
2025-11-25 10:31:30 -05:00
Ethan Trinh
bef946de1c
SWDEV-555551 - Remove hip-test warnings in linux ( #1031 )
...
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com >
2025-11-25 10:31:15 -05:00
Jason Bonnell
e68873c170
Gersemi formatting for rocprofiler-compute ( #1997 )
...
* Run gersemi formatting on cmake files in compute
* Run gersemi again but on updated version
2025-11-25 09:49:16 -05:00
Gerardo Hernandez
8abfee9f26
SWDEV-541351 - fix use of uninitialized memory in Unit___hip_atomic_compare_exchange tests ( #1976 )
2025-11-25 11:02:14 +00:00
solaiys
3466ec5458
Added PCIE Atomic Operations enable check. ( #1746 )
...
* Added PCIE Atomic Operations enable check.
Tests if atomic operations are enabled for GPU devices.
Displays the Atomic routing capability via Link capability and status.
Signed-off-by: Saravanan Solaiyappan <saravanan.solaiyappan@amd.com >
2025-11-25 14:29:30 +05:30
Gerardo Hernandez
c87014a54c
SWDEV-534207 fix order of kernel launch parameters when calling notifiedKernel in some tests: kernel<<<gridDim, blockDim>>> instead of kernel<<<blockDim, gridDim>>>. This was causing out of bounds accesses ( #1860 )
2025-11-25 06:37:47 +00:00
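The launch-order fix above can be illustrated on the host without a GPU. This is a sketch under assumptions: `LaunchDims` and both helpers are hypothetical, and the indexing scheme of the affected tests is not shown in this log. It assumes a buffer with one slot per intended block, indexed by `blockIdx.x`. Swapping the two launch arguments leaves the total thread count unchanged, which is why the mistake is easy to miss, but it changes the range of block indices.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical launch shape; field order mirrors kernel<<<gridDim, blockDim>>>.
struct LaunchDims {
    std::size_t grid;   // number of blocks
    std::size_t block;  // threads per block
};

// Total threads launched; identical for both argument orders.
std::size_t total_threads(LaunchDims d) { return d.grid * d.block; }

// Number of blocks whose blockIdx.x would index past a buffer that holds
// `slots` entries (one per intended block).
std::size_t blocks_out_of_bounds(LaunchDims d, std::size_t slots) {
    assert(d.grid > 0 && d.block > 0);
    return d.grid > slots ? d.grid - slots : 0;
}
```

With an intended shape of 4 blocks of 256 threads, `blocks_out_of_bounds({4, 256}, 4)` is 0, while the swapped shape `{256, 4}` has 252 blocks indexing past the 4-slot buffer.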
Pengda Xie
6c31785eaf
SWDEV-562761 - Cleanup static fatbin on runtime teardown ( #1873 )
2025-11-24 21:57:46 -08:00
darren-amd
16e7ee32e6
[rocm-smi-lib] Add iomanip include to frequencies_read ( #1797 )
2025-11-24 16:38:21 -05:00
Young Hui - AMD
a4f533fa92
[rocpd] Fix rocpd convenience scripts to accept --automerge-limit parameter ( #1926 )
...
* remove double RocpdImportData calls from execute() in each module
* formatting fix
2025-11-24 14:50:27 -05:00
Maisam Arif
1f7fc8d8a7
Fixed wrapper to respect symlink pathing ( #1984 )
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
2025-11-24 13:14:46 -06:00
systems-assistant[bot]
c404fbd851
[SWDEV-560235] Add gpu_board and base_board temperatures to monitor ( #1906 )
...
* Add helpers for gpu_board and base_board temperatures
* Added gpu_board and base_board temperatures arguments for non-default monitor subcommand
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com >
Co-authored-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com >
2025-11-24 13:12:09 -06:00
Marius Brehler
2dc32d645b
Explicitly load versioned libamdhip64.so ( #1872 )
...
* Explicitly load versioned libamdhip64.so
* Fix syntax errors
* Fix when patching happens in Windows workflow
---------
Co-authored-by: Joseph Macaranas <145489236+jayhawk-commits@users.noreply.github.com >
Co-authored-by: ammallya <ameyakeshava.mallya@amd.com >
2025-11-24 10:05:05 -08:00
sluzynsk-amd
2cf9faa93f
SWDEV-563777 - fix warnings related to inconsistent overrides ( #1625 )
...
This patch adds missing override keywords. Fixes this class of warnings.
Signed-off-by: Sebastian Luzynski <Sebastian.Luzynski@amd.com >
2025-11-24 18:50:07 +01:00
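For readers unfamiliar with this class of warning, the sketch below shows what adding a missing `override` looks like. The `Device` and `Gpu` types are hypothetical and do not come from the patched code. Compilers such as Clang can warn when a function overrides a virtual without being marked `override` (for example via `-Winconsistent-missing-override`); which flag fired in this repository is an assumption.

```cpp
#include <string>

struct Device {
    virtual ~Device() = default;
    virtual std::string name() const { return "device"; }
};

struct Gpu : Device {
    // `override` asks the compiler to verify that this function really
    // overrides a base-class virtual; a signature mismatch becomes an error.
    std::string name() const override { return "gpu"; }
};

// Dispatches through the base-class interface.
std::string describe(const Device& d) { return d.name(); }
```

Marking every overriding function consistently also protects against silent breakage if the base-class signature later changes.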
habajpai-amd
1a3564a51a
[rocprof-sys] Fix fork() handling for GPU profiling and AMD SMI ( #1930 )
...
- Fix fork() handling for GPU profiling and AMD SMI
- Add hipMallocConcurrency test for CI with GPU
2025-11-24 09:21:27 -05:00
marantic-amd
ebd55d2ce0
Track process_sampler state for CPU sampling ( #1993 )
2025-11-24 15:03:08 +01:00
Aleksandar Djordjevic
a5d554b85a
[rocprofiler-systems] Implement GTest/GMock integration for unit testing ( #1777 )
...
* googletest project set up
---------
Co-authored-by: Aleksandar Djordjevic <adjordje@amd.com >
Co-authored-by: Milan Radosavljevic <milan.radosavljevic@amd.com >
2025-11-24 11:49:30 +01:00
AidanBeltonS
0580e2053c
SWDEV-533546, SWDEV-540027 - Add e8m0 conversions and testing ( #987 )
...
* SWDEV-533546 - Add conversion functions for e8m0
* SWDEV-533546 - remove whitespace
* Add testing
* Update based on feedback
* Copilot suggestions
---------
Co-authored-by: systems-assistant[bot] <systems-assistant[bot]@users.noreply.github.com>
2025-11-24 09:14:03 +00:00
Ioannis Assiouras
36029ea1a8
SWDEV-559166 - Fix race condition in getDemangledName ( #1868 )
2025-11-23 08:45:45 +00:00
Ioannis Assiouras
7313c3752a
SWDEV-567475 - Fix failures in graph tests due to GraphExec destroy h… ( #1917 )
2025-11-22 23:01:47 +00:00
Ioannis Assiouras
75de915725
hip-tests: fix runpath in hipSquareGenericTargetOnly[Compressed] ( #1965 )
...
Co-authored-by: Rahul Manocha <rmanocha@amd.com >
2025-11-21 16:31:41 -08:00
jamessiddeley-amd
833425577f
fix roofline kernel_names test case ( #1954 )
2025-11-21 15:04:08 -05:00
ammallya
822b38b743
Migrating amdsmi ( #1922 )
2025-11-21 11:12:34 -08:00
jokim-amd
770f30bc4c
hsakmt: bump vgpr count for gfx1151 ( #1807 )
...
GFX1151 has 1.5x VGPR memory compared to the rest of GFX11.
2025-11-21 09:53:32 -08:00
gabrpham
6b1e6187f6
[SWDEV-560681] Allowed GPU enumeration to continue with non-contiguous render nodes ( #1609 )
...
* Fix uninitialized variable in GPU enumeration loop (#1643 )
* Initialize node_to_gpu_id to prevent undefined behavior
---------
Signed-off-by: gabrpham <Gabriel.Pham@amd.com >
Co-authored-by: Allan Xavier <axavier@digitalocean.com >
2025-11-21 10:01:10 -06:00
vedithal-amd
6540155c9d
Bugfixes ( #1971 )
...
* Implement AMDGPU driver info and GPU VRAM attributes in the system info section of the analysis report.
* Backward compatibility for rocprofiler-sdk avail module path migration
* Fix roofline calculation where AI data points are N/A
2025-11-21 10:54:25 -05:00
Jonathan R. Madsen
a2288eb50b
[rocprofiler-sdk] Install unit tests and helper functions for integration tests ( #921 )
...
* [rocprofiler-sdk] Install unit tests and helper functions for integration tests
* Fix rocprofiler-sdk-tests-target export
* Fix handling of cmake policy CMP0174
* Remove -vv from new pytest.ini files
* add unit tests and integration tests.
* add path to ci workflow.
* misc. fixes.
* pc sampling tests.
* bug fixes.
* pc sampling tests fix.
* misc.
* Update CMakeLists.txt
* Update rocprofiler_config_install_tests.cmake, correct license name
* fix units tests install issues.
* fix counters_def file path.
* fix bug, arg shifting.
* vendor pytest-cmake.
* cmake config fix. missing endfunction()
* disable tests, 1.rocprofv3-trace-hip-libs. 2.kernel-tracing. 3.external_correlation 4.rocpd.
* disable buffered-tracing test and remove pytest-cmake from requirements.txt.
* disable hip-graph-tracing test.
* fix building standalone tests to load rocprofiler-sdk cmake package first and then find rocprofiler_sdk_pytest module.
* addressed comments: 1.add local bin path to code cov workflow. 2.add to cmake prefix path local bin. 3.use ROCPROFILER_MEMCHECK_PRELOAD_ENV_VALUE 4.misc. fix
* enabled back tests api_buffered, external_correlation_id, hip-graph, kernel-tracing, rocpd, tracing-hip-in-libraries. and misc fixes(formatting, extra fixtures for agent-index tests.)
* cpack to use llvm bin for .hsaco debug symbols.
* psdb tests fixes.
* EOL.
* misc. fixes and Disable api_buffered_tracing, external_correlation_id, hip-graph-tracing, kernel-tracing, rocpd, summary, tracing-hip-libraries, tracing-plus-counter-collection.
* fix incorrect cmakelists file.
* strip smallkernel.bin
* format.
* revert disabled tests commit.
* misc. fix in counter tests.
* misc.
* search codeobj unit test assets in curr bin and install bin.
* refactor newly added rocpd tests.
* modify tests for newly added hip-host-tracing.
* add LD LIB path to units, psdb is failing due to libs not being found.
---------
Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com >
Co-authored-by: Venkateshwar Reddy Kandula <Venkateshwarreddy.Kandula@amd.com >
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com >
2025-11-21 08:06:56 -06:00
Jason Bonnell
66ea1cdff2
Add workflow to remove old untagged rocprofiler GHCR Docker Images ( #1959 )
...
* Add WIP workflow step to delete untagged images older than 1 week
* Formatting fix for rocprofiler-systems-ghcr.yml
* Move step to new workflow
* Remove needs parameter from cleanup-rocprofiler-images
* Remove expand-packages option
* Expand cleanup for every OS
* Revert spacing change to rocprofiler-systems-ghcr.yml
* Turn off dry-run to do an initial clean
* Switch dry-run to be only on PR
* Added comment about schedule
2025-11-21 08:49:29 -05:00
Sajina PK
d77b245730
[Rocprofiler-systems] : Refactor papi enumeration to fix a hang on Intel systems ( #1672 )
...
* Refactor papi enumeration to fix a hang on Intel systems
- Add an exclude argument to available_events_info() for
perf_event_uncore, which caused a hang-like stall on Intel systems with a
large number of uncore events.
- Enumerate papi available events only when papi events are specified by
users inside early initialization logic
- Move papi available event query for ROCPROFSYS_SAMPLING_OVERFLOW_EVENT
config setting to the avail component, to move the heavy logic outside
initialization.
- Make category option for rocprof-sys-avail -H -c case insensitive
- Provide new option to query available overflow events that can be
specified for ROCPROFSYS_SAMPLING_OVERFLOW_EVENT using new command
option rocprof-sys-avail -H -c overflow
* Update projects/rocprofiler-systems/source/bin/rocprof-sys-avail/common.cpp
Co-authored-by: Milan Radosavljevic <milan.radosavljevic@amd.com >
* Update timemory submodule pointer
Signed-off-by: David Galiffi <David.Galiffi@amd.com >
* Fix errors on compile
* Change 1: Optimization for the category matching lambda
Optimization changes.
* Modify the rocprof-sys-avail -c option for overflow
Overflow should not be displayed as a device in rocprof-sys-avail -H -c CPU
Users can instead do regex on summary where overflow is appended in description
User can do rocprof-sys-avail -H -c CPU -d -r overflow
* Revert change to column width
---------
Signed-off-by: David Galiffi <David.Galiffi@amd.com >
Co-authored-by: Milan Radosavljevic <milan.radosavljevic@amd.com >
Co-authored-by: David Galiffi <David.Galiffi@amd.com >
2025-11-21 00:19:58 -05:00
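One of the changes listed above makes the category option of `rocprof-sys-avail -H -c` case insensitive. A minimal sketch of case-insensitive category matching follows; `to_lower` and `category_matches` are hypothetical names and this is not the implementation from the commit. Both strings are lowercased before comparison, so `cpu`, `CPU` and `Cpu` are treated alike.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Return a lowercased copy; the cast to unsigned char avoids undefined
// behavior in std::tolower for negative char values.
std::string to_lower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return s;
}

// True when the requested category equals the available one, ignoring case.
bool category_matches(const std::string& requested,
                      const std::string& available) {
    return to_lower(requested) == to_lower(available);
}
```

This handles ASCII category names only, which is assumed to be sufficient for command-line options.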
Kunal Malviya
4f4352acd0
Enable ROCPROF_KERNEL_TRACE for PC sampling test by setting value to 1 ( #1952 )
2025-11-20 22:24:15 -05:00
cfreeamd
24c2a84e3f
rocr: GPU core file location support ( #1732 )
...
* rocr: WIP Support dump of GPU core file
* WIP new core dump tests compile
* WIP: anony namespaces, test updates, progress
Added disabled Fault test. Other non-disabled coredump tests don't work.
* WIP: address code review feedback
* WIP: gpu core dump rocrtst works; combined
* WIP: remove rocrtst changes for this commit
2025-11-20 18:50:51 -08:00
amd-hsivasun
adf6a5ec3b
[Ex CI] amdsmi monorepo enablement ( #1943 )
...
* [Ex CI] amdsmi monorepo enablement
* [Ex CI] Add amdsmi pipeline to monorepo
2025-11-20 14:19:02 -05:00
Milan Radosavljevic
4d670099fa
[rocprof-sys] Refactor trace_cache architecture with improved type erasure and processing pipeline ( #1710 )
...
- Redesigned buffer_storage with a flush_worker pattern for better thread management and resource cleanup
- Introduced type-safe abstractions through new components: cacheable.hpp, cache_type_traits.hpp, sample_processor.hpp, and type_registry.hpp
- Optimized type erasure implementation in sample processor to reduce runtime overhead
- Renamed rocpd_post_processing to rocpd_processor and restructured the processing pipeline
- Removed storage_parser.cpp and integrated functionality into header-based template implementation
- Enhanced cache_manager with improved processing workflow and better separation of concerns
2025-11-20 14:18:13 -05:00
Istvan Kiss
2f6fb89c51
Add GPU programming patterns tutorials ( #1918 )
...
Update projects/hip/docs/tutorial/programming-patterns/atomic_operations_histogram.rst
WIP
Co-authored-by: Julia Jiang <56359287+jujiang-del@users.noreply.github.com >
2025-11-20 10:03:22 -08:00
jonatluu
6b8aae3796
Enable Lintian Support rocm-systems ( #1578 )
...
* draft testing fix for no copyright file and no changelog
* test fix no-changelog no-copyright
* changelog copyright fix
* remove utils.cmake
* rocr lintian
* lintian overrides, copyright, changelog install
* fix lintian overrides install
* comp_type static fix and remove debug logs
* syntax error
* update static build check
* update file permissions to 0755 to fix error control-file-has-bad-permissions 0664 != 0755
* fix lintian errors in rdc and remove logs from roctracer
* lintian error fix rocprofiler
* fix lintian error
* move lintian overrides install
* lintian errors fix
* move lintian overrides install
* use changelog already provided by rdc
* fix formatting use existing changelog if provided
* fix formatting use changelog in rocprofiler
* draft testing fix for no copyright file and no changelog
* test fix no-changelog no-copyright
* changelog copyright fix
* lintian overrides, copyright, changelog install
* fix lintian overrides install
* comp_type static fix and remove debug logs
* fix lintian errors in rdc and remove logs from roctracer
* lintian error fix rocprofiler
* fix lintian error
* move lintian overrides install
* lintian errors fix
* move lintian overrides install
* use changelog already provided by rdc
* fix formatting use existing changelog if provided
* fix formatting use changelog in rocprofiler
* remove overrides. Use existing changelog and copyright
* resolve merge conflict
* update license for hsa-rocr. Use NCSA license
* install license
* install license
2025-11-20 11:38:39 -05:00
Sajina PK
124c23e2ff
Fix Sampling freq for xgmi tests ( #1888 )
...
A low sampling frequency collected too few samples, causing test
validation to fail on some systems.
2025-11-20 09:49:09 -05:00
ggottipa-amd
bf521c996b
Correcting peak VALU Roofline profiling and analysis by removing FP8 VALU and BF16 VALU benchmarking. ( #1442 )
...
* Removing FP8 from peak VALU datatypes - PEAK_OPS_DATATYPES.
* Similar change for BF16.
* Roofline binaries from rocm-amdgpu-bench generated 10/22.
https://github.com/ROCm/rocm-amdgpu-bench/commit/2113ef1f5eada8a4a6e44e6d07fd6abac9b0a3f8
Bins include change that removes FP8 and BF16 peak VALU benchmarks.
Built and tested on rhel8, azl3, ubuntu22.04, sles15sp6.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com >
* Re-committing the bins
accidentally copied over bins from the wrong folder earlier, caught by Gowthami during testing.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com >
* Updated changelog
* gersemi fix
* Changelog corrected.
* Changelog fix.
* Adding this to the 7.2.0 section to be picked up in an RC build.
* Moving changelog entry into unreleased section - team reconfirmed cutoff date after I requested this change so I am just quickly correcting my mistake in my ask.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com >
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com >
Co-authored-by: Carrie Fallows <Carrie.Fallows@amd.com >
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com >
2025-11-20 12:46:48 +05:30
habajpai-amd
b09834e784
refactor: duplicated path helpers into common/path.hpp ( #1249 )
...
* refactor: duplicated path helpers into common/path.hpp
* update rocprof-sys-instrument to use shared path utility
* Add path::realpath(std::string[, std::string*]) helper function in common/path.hpp for binaries
* common: centralize remove_env implementation in environment.hpp
* remove unused includes from rocprof-sys binaries and argparse
* changing set to unordered_set wherever sorting is not required and additional cleanup
* review comment incorporated
* Apply suggestion from @Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* copilot review for remove_env incorporated
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
2025-11-20 09:55:00 +05:30
Swati Rawat
2e0e613397
Fix Perfetto link ( #1645 )
...
* Update using-rocpd-output-format.rst
* Fixing build_docs_from_source
* Removing credentials for docker
* reverting credentials
---------
Co-authored-by: bgopesh <gopesh.bhardwaj@amd.com >
2025-11-19 15:58:12 -08:00
German Andryeyev
919642f721
rocr: Expose PM4 emulation in the agent info ( #1869 )
...
The native AQL path on Windows requires extra logic that conflicts with
the implementation of PM4 emulation, so the client needs a way to detect it.
2025-11-19 18:26:23 -05:00
anujshuk-amd
d36b8ec66d
[rocprof-sys] Add documentation for building with multiple Python versions ( #1870 )
...
* Adding support to build ROCm with Multiple Python Environments
* Update install.rst
* Update projects/rocprofiler-systems/docs/install/install.rst
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Update projects/rocprofiler-systems/docs/install/install.rst
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
* Updated documentation to align with instructions in `profiling-python-scripts.rst`
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com >
Co-authored-by: David Galiffi <David.Galiffi@amd.com >
2025-11-19 18:18:53 -05:00
andmar-amd
da6e939c6c
Disable PCSampling on upstream branches ( #1421 )
...
- PC Sampling ioctls/tests are not up-streamed. They should be skipped
for any and all upstream branches.
2025-11-19 14:15:40 -08:00
andmar-amd
70fc774ad0
Disable KFDDBGTest.HitMemoryViolation for navi 10 ( #1423 )
...
- Filter out KFDDBGTest.HitMemoryViolation for navi10, which is
currently failing
2025-11-19 14:15:05 -08:00
andmar-amd
2b4d17078a
Improve test script logic and error handling ( #1424 )
...
- Fix exclude+gtest_filter logic
- Improve error handling when detecting upstream branches
2025-11-19 14:14:40 -08:00
Sajina PK
4ef1e53269
[Rocprof-Systems]: Documentation update for profiling modes and PAPI counter enablement ( #1437 )
...
* Documentation update for profiling modes and papi counter enablement
Update the documentation to add more details regarding profiling modes.
Update the Papi event and hardware counter collection documentation.
* Change1 for review comments
* Formatting changes for Examples
* Apply suggestions from code review
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com >
* Formatting and code block error fixed
* Bold applied
---------
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com >
Co-authored-by: prbasyal <prbasyal@amd.com >
2025-11-19 17:04:35 -05:00
Bing Ma
171a5f5bda
[aqlprofile] Enable SPM support for MI200/MI300 ( #1768 )
...
* [SPM] Enable legacy SPM aqlprofile API
* [SPM] Enable SPM aqlprofile_v2 API
* [NPI][SPM] Fix crash from ctrl test
* Adding decode v1 (#189 )
Co-authored-by: Giovanni baraldi <gbaraldi@amd.com >
* Fix various issues on MI200
1. RLC_SPM_PERFMON_SEGMENT_SIZE_CORE1 support
2. ActiveCU patch for SPM delay table
* [SPM] Fix wrong SPM counter values on MI3xx
* Add mode and query blocks (#196 )
Co-authored-by: Giovanni baraldi <gbaraldi@amd.com >
* [aqlprofile][spm] Use existing SpmBlockId enum info for delay table size
* [aqlprofile][spm] Remove obsolete logic
* Update projects/aqlprofile/src/core/include/aqlprofile-sdk/aql_profile_v2.h
---------
Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com >
Co-authored-by: Giovanni baraldi <gbaraldi@amd.com >
2025-11-19 11:17:01 -08:00
xuchen-amd
9efd330fae
add warning msg for unsupported arch in profile mode. ( #1933 )
2025-11-19 13:42:13 -05:00
JeniferC99
09010eb68a
Revert "rocr: Fix VMM cpu mapping clean up ( #1831 )" ( #1923 )
...
This reverts commit 2327cd35c8 .
2025-11-19 10:33:50 -08:00
Julia Jiang
78a9d9ff70
[clr] SWDEV-566950 - Adding changelog for 7.2 ( #1891 )
...
* [clr]SWDEV-566950 - Adding changelog for 7.2
* Update CHANGELOG.md
* Update CHANGELOG.md
2025-11-19 09:10:14 -08:00
Gopesh Bhardwaj
56a829995e
Perfetto build failures ( #1878 )
2025-11-19 11:01:56 -06:00