Commit graph

1589 commits

Author SHA1 Message Date
andmar-amd 2b4d17078a Improve test script logic and error handling (#1424)
- Fix exclude+gtest_filter logic
- Improve error handling when detecting upstream branches
2025-11-19 14:14:40 -08:00
Sajina PK 4ef1e53269 [Rocprof-Systems]: Documentation update for profiling modes and PAPI counter enablement (#1437)
* Documentation update for profiling modes and papi counter enablement

Update the documentation to add more details regarding profiling modes.
Update the Papi event and hardware counter collection documentation.

* Change1 for review comments

* Formatting changes for Examples

* Apply suggestions from code review

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* Formatting and code block error fixed

* Bold applied

---------

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
Co-authored-by: prbasyal <prbasyal@amd.com>
2025-11-19 17:04:35 -05:00
Bing Ma 171a5f5bda [aqlprofile] Enable SPM support for MI200/MI300 (#1768)
* [SPM] Enable legacy SPM aqlprofile API

* [SPM] Enable SPM aqlprofile_v2 API

* [NPI][SPM] Fix crash from ctrl test

* Adding decode v1 (#189)

Co-authored-by: Giovanni baraldi <gbaraldi@amd.com>

* Fix various issues on MI200
1. RLC_SPM_PERFMON_SEGMENT_SIZE_CORE1 support
2. ActiveCU patch for SPM delay table

* [SPM] Fix wrong SPM counter values on MI3xx

* Add mode and query blocks (#196)

Co-authored-by: Giovanni baraldi <gbaraldi@amd.com>

* [aqlprofile][spm] Use existing SpmBlockId enum info for delay table size

* [aqlprofile][spm] Remove obsolete logic

* Update projects/aqlprofile/src/core/include/aqlprofile-sdk/aql_profile_v2.h

---------

Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com>
Co-authored-by: Giovanni baraldi <gbaraldi@amd.com>
2025-11-19 11:17:01 -08:00
xuchen-amd 9efd330fae add warning msg for unsupported arch in profile mode. (#1933) 2025-11-19 13:42:13 -05:00
JeniferC99 09010eb68a Revert "rocr: Fix VMM cpu mapping clean up (#1831)" (#1923)
This reverts commit 2327cd35c8.
2025-11-19 10:33:50 -08:00
Julia Jiang 78a9d9ff70 [clr] SWDEV-566950 - Adding changelog for 7.2 (#1891)
* [clr]SWDEV-566950 - Adding changelog for 7.2

* Update CHANGELOG.md

* Update CHANGELOG.md
2025-11-19 09:10:14 -08:00
Gopesh Bhardwaj 56a829995e Perfetto build failures (#1878) 2025-11-19 11:01:56 -06:00
xuchen-amd c778acdb70 [rocprof-compute] update yamls for docs (#1887) 2025-11-19 10:46:02 -05:00
raramakr eddd4c3601 SWDEV-505204 - Update libamdocl.so installation path to avoid exposing all ROCm libraries via ldconfig (#1914)
ldconfig is run during rocm-opencl package installation.
Installing libamdocl.so in /opt/rocm-xxx/lib exposes all ROCm libraries when /opt/rocm/lib is added to ldconfig.
To prevent this, libamdocl.so is now installed in /opt/rocm-xxx/lib/opencl.
ldconfig will use the updated path, limiting exposure to only libamdocl.so library.

Co-authored-by: raramakr <raramakr@amd.com>
2025-11-19 21:14:28 +05:30
abchoudh-amd 433af908a6 Iteration multiplexing in rocprof-compute (#1533)
* Profile with multiple input files

* Iteration multiplexing kernel option

* Iteration multiplexing data

* Iteration multiplexing

* Counter profile caching

* Counter dispatch info

* Sanitize CLI args

* Formatting and removed unused header file

* Formatting

* Changed CLI args

* Merge counters for analysis

* Iteration multiplexing log while analysis

* Formatting

* Log

* Guard against incomplete profiling

* Fixed merge counter

* Tests

* Update doc

* Test update

* Fixed formatting

* Test fix

* Merge conflict commit

* Fix tests

* Added comment for counter definition file

* Do not allow dispatch filtering with iteration multiplexing

* Fixed formatting

* Doc indentation update
2025-11-19 21:09:16 +05:30
abchoudh-amd 76ea35787d Split roofline tests, and fix none outputs (#1913)
* Split roofline tests

* Use N/A for missing values

* Test eval_expression for no valid data

* Fixed tests

* Updated Changelog for N/A

* Fixed platform specific test failure
2025-11-19 15:36:08 +05:30
vedithal-amd ae8f72fa79 [rocprofiler-compute] Use native tool for counter collection (#1212)
* Use native tool for counter collection

* Add native counter collection tool which uses rocprofiler-sdk C++
  library public API to get counter collection data
    * This is enabled by default, unless --no-native-tool option is
      provided or ROCPROF=rocprofv3 env. var. is provided
    * This tool is only supported for ROCm version >=7.x.x
    * This tool is not supported for attach/detach scenario
* Build native tool shared object during build time
* If using rocprof-compute without building then runtime compilation of
  the native tool shared object is performed
* rocprofiler-sdk tools is still used for services other than counter
  collection and data collected by native tool is merged into the
  rocpd/csv output of rocprofiler-sdk tool

* Make `rocpd` choice the default choice for `--format-rocprof-output`
  option
    * If `rocpd` public API from rocprofiler-sdk library is not present,
      then fallback to `csv` choice
    * In this case only `pmc_perf.csv` is written in workload folder
      instead of multiple `csv` files for each profiling run
* Remove `json` choice from `--format-rocprof-output` option since it
  functions identical to `csv` option

* Rename option `--rocprofiler-sdk-library-path` to
  `--rocprofiler-sdk-tool-path` since we LD_PRELOAD the
  rocprofiler-sdk tool shared object and not the rocprofiler-sdk library
  shared object

* Fix the meaning of `--dispatch` option in `profile` mode to mention
  dispatch iteration filtering instead of dispatch id filtering
    * --dispatch option in analyze mode does dispatch id filtering

* Move standalone binary creation logic from cmake file to docker file

* fix native counter collection tool during attach/detach

* improve logging

* fix attach detach with native tool

* fix attach detach with native tool

* do not support attach/detach in native tool

* Update changelog

* add standalone binary creation functionality in cmake

* address review comments

* address review comments

* fix formatting

* address review comments

* Adding paths for cmake to search. Also updated min. cmake requirement to 3.21 as this was when hip was supported.

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Update hip compiler ID check, sometimes comes up as Clang, sometimes ROCMClang- depends on setup.
Updated formatting.

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* RHEL8.10 unable to compile due to defaulting to old c++ version, need to force c++17

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Updating changelog per docs team recommendations

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Apply suggestions from code review to changelog

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* Do not require HIP compiler to build native counter collection tool

* fix cmake

* gersemi formatting on latest cmake change

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* ex ci updated dependencies to include rocprofiler-sdk, but cmake was still not capturing the path. There was a commit that added to the cmake_prefix_path entry that specified rocprof-sdk's cmake location, but it was too specific for the search paths in find_package's config mode.
removing the cmake_prefix_path var and adding hints to find_package call instead, and specifying config mode so it knows how to construct the search paths

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* gersemi run for formatting

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Still need prefix path, should not have been removed in last commit but does need to be shortened to just the rocm path to allow for find_package config mode to do the job

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* include cstdint for uint32_t

* Run formatting on helper.cpp

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Remove rocm 7.2 release stuff from version and changelog and handle it in separate pr

* fix version

* fix changelog

* fix changelog

* run ruff formatter

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* fix rocprofiler-sdk attach so path

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
2025-11-18 23:34:38 -05:00
cfallows-amd 8418895b92 Add tencentos to roofline binary detection (#1830)
Force tencentos to use the rhel-based bin, since tencentos is branched off of centos, which is a branch of fedora.

Verified rocprof-compute run correctly selects bin to use, and the roofline benchmark values look similar between runs on rhel vs tencentos4 docker images on same system.

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-11-18 14:51:05 -05:00
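The distro handling described in commit 8418895b92 above can be illustrated with a minimal sketch. This is not the rocprof-compute implementation: the function names, the family labels (`rhel`, `ubuntu`, `unknown`), the set of distro IDs, and the assumption that TencentOS reports `ID="tencentos"` in its os-release data are all hypothetical and chosen only to show the idea of folding a derived distro into an existing binary family.

```python
# Hypothetical sketch: none of these names come from rocprof-compute itself.
# Assumption: TencentOS identifies itself as ID="tencentos" in os-release.
RHEL_LIKE_IDS = {"rhel", "centos", "rocky", "almalinux", "tencentos"}
DEBIAN_LIKE_IDS = {"ubuntu", "debian"}


def parse_os_release(text: str) -> dict[str, str]:
    """Parse os-release style KEY=value lines into a dictionary."""
    fields: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # Skip blanks, comments, and anything that is not KEY=value.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def roofline_binary_family(os_release_text: str) -> str:
    """Return the (hypothetical) binary family label for a distro."""
    fields = parse_os_release(os_release_text)
    distro_id = fields.get("ID", "").lower()
    id_like = fields.get("ID_LIKE", "").lower().split()
    candidates = [distro_id, *id_like]
    if any(name in RHEL_LIKE_IDS for name in candidates):
        return "rhel"
    if any(name in DEBIAN_LIKE_IDS for name in candidates):
        return "ubuntu"
    return "unknown"


if __name__ == "__main__":
    sample = 'NAME="TencentOS Server"\nID="tencentos"\nVERSION_ID="4.2"\n'
    print(roofline_binary_family(sample))
```

Listing the distro ID explicitly, rather than relying only on `ID_LIKE`, keeps the selection working even when a derived distro does not declare its parent.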
Milan Radosavljevic 3ee393047c Add user_api_active flag to enable/disable user-defined regions (#312)
* Add user start/stop bool

* Update documentation for user-api

* Update projects/rocprofiler-systems/source/lib/rocprof-sys-dl/dl.cpp

Co-authored-by: Aleksandar Djordjevic <aleksandar.djordjevic@amd.com>

* Format fix

---------

Co-authored-by: Aleksandar Djordjevic <aleksandar.djordjevic@amd.com>
2025-11-18 13:48:27 -05:00
Gopesh Bhardwaj 9fa00a76d3 Added license information (#1862)
* Added license information

* Remove leading space in include

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Mythreya Kuricheti <mythreya.kuricheti@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-11-18 09:05:11 -08:00
Swati Rawat 4c4b3a3e95 Fix the broken sample GitHub link (#1828) 2025-11-18 08:59:37 -08:00
Mark Meserve 12718139fe [rocprofiler-sdk] rename librocprofv3-attach.so (#1342)
* attach: rename librocprofv3-attach

- Renames library to librocprofiler-sdk-rocattach
- ROCAttach library will be formalized and documented in future commit

* Address review comments

- Rename rocprofv3-attach.py to rocprof-attach.py
- Use common filesystem.hpp in rocattach

* Fix component name typo

* Doc fixup

---------

Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
2025-11-18 08:52:17 -08:00
ywang103-amd 15e5f09038 roll back json file processing logic for pc sampling (#1835)
* roll back json file processing logic for pc sampling

* format cmake files

* Revert "format cmake files"

This reverts commit e64df65a8f30abcb6738e3a0d7ffd4270bd1d302.
2025-11-18 08:32:21 -08:00
Jonathan R. Madsen 18d956eb9c [rocprofiler-sdk] Fix hip compiler table initialization after finalization (#1174)
* [rocprofiler-sdk] Fix hip compiler table initialization after finalization

- Resolves tickets
  - https://ontrack-internal.amd.com/browse/SWDEV-557219
  - https://ontrack-internal.amd.com/browse/SWDEV-505503

* Tweak log message

* Remove unsupported hip limit enums

- hipLimitDevRuntimeSyncDepth
- hipLimitDevRuntimePendingLaunchCount

* Update conftest.py

Co-authored-by: Mark Meserve <mark.meserve@amd.com>

* Update README.md

Co-authored-by: Mark Meserve <mark.meserve@amd.com>

* Update hip_host.cpp

---------

Co-authored-by: Mark Meserve <mark.meserve@amd.com>
2025-11-18 08:28:42 -08:00
vedithal-amd 44a32e23ac [rocprofiler-compute] Bump version and update changelog ahead of ROCm 7.2 release (#1908) 2025-11-18 10:04:28 -05:00
Ameya Keshava Mallya 8eceb6e5eb Merge commit 'a044536b8d690a9ae5962a93e7596d9eec2030b7' into develop 2025-11-18 01:14:31 +00:00
Sajina PK f6183e3563 [Rocprofiler-systems]: Documentation addition for xgmi and pcie metrics feature (#1798)
* Documentation addition for xgmi and pcie metrics feature

Add documentation explaining how to collect XGMI and PCIe interconnect metrics.

* Apply suggestions from code review

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* Update projects/rocprofiler-systems/CHANGELOG.md

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* Update projects/rocprofiler-systems/CHANGELOG.md

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

---------

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-11-17 18:34:28 -05:00
Ameya Keshava Mallya ac9e029c3e Add 'projects/amdsmi/' from commit 'b4b3539631460b986dddc86a2303cef11cd38816'
git-subtree-dir: projects/amdsmi
git-subtree-mainline: 0633d8d8ce
git-subtree-split: b4b3539631
2025-11-17 22:28:37 +00:00
randyh62 92b3629b25 Update environment.yml (#1884)
Update path to requirements.txt
2025-11-17 12:10:56 -08:00
Milan Radosavljevic db111129ab [rocprof-sys] Add test to check perfetto files have been merged (#1863) 2025-11-17 11:50:40 -05:00
David Galiffi 828921c616 [rocprofiler-systems] Update CHANGELOG.md with 7.1.1 notes (#1844)
* Update CHANGELOG.md

* Update projects/rocprofiler-systems/CHANGELOG.md

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

---------

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
2025-11-17 11:47:08 -05:00
jamessiddeley-amd d49e2e35fd [rocprof-compute] Automate ctest coverage and test cases on runners with CDash (#1481)
* Add nightly coverage workflow

* ruff formatting

* temp workflow testing

* restore workflow file

* add workflow condition

* update workflow file

* update workflow file

* fix typo in run-ci.py

* edit run-ci.py

* add python deps install

* add python deps install

* add python deps install

* add python deps install

* check if enable coverage is on when using workflow

* remove github CI breakdown and fix enable coverage

* set cache variables must be set before dashboard starts

* Update run-ci.py

* Update run-ci.py to fix ctest cache

* Update rocprofiler-compute-code-coverage.yml to install tests

* Update rocprofiler-compute-code-coverage.yml

* Restore workflow file

* Update run-ci.py

* Simplify workflow build command

* Update run-ci.py to build tests

* edited run-ci script

* edit ctest configure commands

* edit ctest configure commands to be on one line

* edit ctest configure command to include path to amdclang++

* update clang check in tests/cmakelists.txt

* update rocm

* update rocm

* update rocm version 7.0.2

* update tests/CMakeLists.txt

* use tarball instead for rocm install

* apt install rocm-dev instead for 7.0.0 release

* workflow tweaks

* update to use new 'tools' dir

* install rocm-dev

* add CMAKE_CXX_COMPILER as clang

* update tests/cmakelists.txt

* update cdash site and build names

* remove run automatically on pull requests

* ruff format

* increased timeouts for tests

* add back reruns for workflow testing

* fix typo

* rename workflow "nightly" -> "code"

* added tracks to keep track of gpu (325 vs 355)

* remove test_db_connector.py

* revert build names and tracking

* update workflow pushes

* CMake format

* changed parallel level back to 1
2025-11-17 09:24:24 -05:00
Gopesh Bhardwaj 75ad45d5f1 Added missing license (#1861) 2025-11-17 11:16:09 +05:30
Sajina PK 09b8342e22 [Rocprofiler-systems] : Add XGMI and PCIe metrics to the profiling data (#1628)
* Add XGMI and PCIe metrics to the profiling data

Add support for AMD XGMI (GPU-to-GPU interconnect) and PCIe
metrics:
  * XGMI link width in bits
  * XGMI link speed in GT/s
  * Per-link read bandwidth (KB)
  * Per-link write bandwidth (KB)

- Add new categories for PCIe metrics:
  * PCIe link width
  * PCIe link speed in GT/s
  * Accumulated bandwidth (MB)
  * Instantaneous bandwidth (MB/s)

* Fix VCN/JPEG insert logic

* Modify the gpu_metrics struct to accommodate XCP structure

* Add ctest automation for gpu interconnect metrics

* Refactor to move gpu_metrics struct and serialization to another file

* Possible fix for timeout in CI

Fix redundant skip check in ctest
Add xgmi and pcie option in rocprof-sys-avail.

* Change2: Address review comments

Change ctest sampling to avoid timeout
Change variable name and code structuring

* Add option in ctest to run rocprof-sys-run without rewrite

Run transferbench with rocprof-sys-run without sampling

* Change3: Fix sample insert bug and address review comments

xgmi and pci support check
renaming variables
additional hip_api validation in rocpd

* Reduce the load from the TransferBench sample

The CI builds were timing out when flushing a big temporary file to the
DB: (2720824.23 KB / 2720.82 MB / 2.72 GB)...
2025-11-14 19:42:33 -05:00
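The size quoted in the last bullet of commit 09b8342e22 above (2720824.23 KB / 2720.82 MB / 2.72 GB) is consistent with decimal, 1000-based unit steps. The sketch below reproduces that arithmetic; `describe_size` is a hypothetical helper written for this illustration, not a function from rocprofiler-systems.

```python
def describe_size(kilobytes: float) -> str:
    """Format a size in KB, MB, and GB using decimal (1000-based) steps.

    Hypothetical helper for illustration only.
    """
    megabytes = kilobytes / 1000.0
    gigabytes = megabytes / 1000.0
    return f"{kilobytes:.2f} KB / {megabytes:.2f} MB / {gigabytes:.2f} GB"


if __name__ == "__main__":
    # Value taken from the commit message above.
    print(describe_size(2720824.23))
```

With 1024-based steps the MB and GB figures would not match the ones in the commit message, which is why the sketch assumes decimal units.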
David Yat Sin 9535b7fcbe rocr: Fix exception on AsyncEventControl init (#1852)
* rocr: Fix exception on AsyncEventControl init

Fix exception on init when compiling in release mode.

* rocr: Fix crash when interrupts are disabled

Fix segfault due to assert for signal->EopEvent() being false when
HSA_ENABLE_INTERRUPT=0. Use Signal::WaitMultiple(..) when interrupt is
disabled.

---------

Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
2025-11-14 12:45:34 -08:00
Alysa Liu 2327cd35c8 rocr: Fix VMM cpu mapping clean up (#1831)
Remove CPU mapping before calling RemoveAccess().
2025-11-14 13:52:45 -05:00
German Andryeyev ff4782620e SWDEV-547108 - Fix PAL build with HSA backend (#1850)
When hip is built with HSA backend then the headers from ROCR will be used, but
scratch_backing_memory_byte_size is a part of amd_queue_v2_t structure
2025-11-14 12:28:03 -05:00
marandje 5616a255e2 SWDEV-515530 - Re-enable passing tests (#1013) 2025-11-14 16:36:44 +01:00
Swati Rawat cb257ab9f7 [rdc] Replace readme link rdc -> rocm-systems/projects/rdc (#1758)
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-11-14 13:19:26 +01:00
amilanov-amd 738bf16008 [hip-tests] Tag multigpu tests with Catch2 tags (#1315) 2025-11-14 13:00:30 +01:00
venkatesh-amd f7249e092b SWDEV-533237 : Added test cases for hipOccupancyAvailableDynamicSMemPer… (#716)
* SWDEV-533237 Added test cases for hipOccupancyAvailableDynamicSMemPerBlock API

* SWDEV-533237 : Added test cases for hipOccupancyAvailableDynamicSMemPerBlock

* SWDEV-533237 : Addressed review comments for hipOccupancyAvailableDynamicSMemPerBlock API test cases

---------

Co-authored-by: jainprad <92369414+jainprad@users.noreply.github.com>
2025-11-14 15:41:45 +05:30
Milan Radosavljevic a77be32660 Prevent duplicated sdk events (#1826) 2025-11-13 22:36:36 -05:00
David Galiffi 540eda3865 [rocprof-sys] Forward ctest labels from the execution test to the validation test. (#1697)
* Forward ctest labels from the execution test to the validation test.

* Adjust test validation parameters for amd_smi samples

The actual number of samples will vary depending on the GPU. This test
is just to validate the presence of the samples
2025-11-13 21:49:07 -05:00
Milan Radosavljevic 833c250c27 Add clean up fixture for trace cache temporary files (#1836)
* Add clean up fixture for trace cache tmp files

* Switch to bash instead of cmake running command
2025-11-13 21:01:04 -05:00
Matt Arsenault 4830979f0e SWDEV-548892 - Stop using ocml fma wrappers (#1702)
Directly use elementwise builtin
2025-11-13 16:20:27 -08:00
Matt Arsenault 42e91b8934 SWDEV-548892 - Stop using ocml sqrt wrappers (#1716) 2025-11-13 16:19:44 -08:00
Kian Cossettini 65b607b0bd [rocprofiler-systems] Add rocprof-sys-build to gitignore (#1829)
* Add rocprof-sys-build to gitignore

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-11-13 16:22:19 -05:00
Julia Jiang 5599e8b1de SWDEV-561500 - Update change log and port 7.1.1 to develop branch (#1688)
* SWDEV-561500 - Porting changelog(up to 7.1.1) to develop branch

* Update CHANGELOG.md

* Update CHANGELOG.md

* Update CHANGELOG.md
2025-11-13 12:22:34 -08:00
systems-assistant[bot] f55dda2082 SWDEV-543340 - Added Unit_hipEventIpc_shm_cleanup test (#548)
The test verifies that all shared memory objects for
IPC events used internally by HIP are properly cleaned
up after use and do not leave persistent files in /dev/shm.

Co-authored-by: Ioannis Assiouras <Ioannis.Assiouras@amd.com>
2025-11-13 18:21:12 +00:00
Istvan Kiss a0f53a5fdb Sync HIP documentation leftover (#1597)
* Sync HIP documentation leftover

* Update HIP docs environment.yaml and doxyfile
2025-11-13 09:19:33 -08:00
David Yat Sin 7e4b62290c rocr: Switch back to legacy IPC (#1744)
Switch back to legacy IPC Implementation while we fix some race
conditions.
2025-11-13 09:41:55 -05:00
Giovanni Lenzi Baraldi 5b5269f666 [aqlprofile] Enable nondetail shaderdata (#1805) 2025-11-13 13:47:21 +01:00
Giovanni Lenzi Baraldi cf164dd025 Fix for SQTT perfmon IDs (#1818)
* Fix for SQTT perfmon IDs

* Review comments
2025-11-13 13:46:57 +01:00
systems-assistant[bot] 720a5bcf9a SWDEV-547526 - Add missing free calls (#531)
Co-authored-by: Vladana Stojiljkovic <Vladana.Stojiljkovic@amd.com>
2025-11-13 11:16:41 +01:00
systems-assistant[bot] 7450910e53 SWDEV-548241 - Add missing destroy calls in graph tests (#520)
Co-authored-by: Vladana Stojiljkovic <Vladana.Stojiljkovic@amd.com>
2025-11-13 11:13:40 +01:00