Commit Graph

1059 Commits

Author SHA1 Message Date
Jimbo 3d9d35a1f8 SWDEV-553375 - Allow hipMemAllocationTypeUncached in hipMemGetAllocationGranularity (#847) 2025-09-05 10:31:20 -04:00
Sam Ruscica b2b44de1e7 Removed old perftest directory (#840) 2025-09-04 22:17:31 -04:00
Ajay GunaShekar 2202dcfe80 SWDEV-552613 - Windows: Use Direct Dispatch only on HSA ROCr Backend (#809)
* SWDEV-552613 - Disable Direct Dispatch on Windows

* SWDEV-552613 - Use Direct Dispatch on HSA backend only

---------

Co-authored-by: GunaShekar <agunashe@amd.com>
Co-authored-by: Christophe Paquot <35546540+chrispaquot@users.noreply.github.com>
2025-09-04 14:07:12 -07:00
habajpai-amd fb6fe518e8 fix(transpose): correct host allocation and GB/s calculation (#860) 2025-09-04 16:08:16 -04:00
harkgill-amd 782dc9214b Fix: Error messages printed to stderr to trigger CMake Error Variable (#743)
This PR intends to cover the edge case seen in https://github.com/ROCm/rocm-systems/issues/694. 

`hip-config-amd.cmake` uses rocm_agent_enumerator to determine which GPU architecture to target when no target is specified.
https://github.com/ROCm/rocm-systems/blob/9a02dae75f8df9d8f08923d34d06d76e96ced7b4/projects/clr/hipamd/hip-config-amd.cmake.in#L86-L95

On WSL, both `readFromKFD` and `readFromLSPCI` are skipped. If `readFromTargetLstFile()` isn't in use, `readFromROCMINFO()` is called. If rocminfo times out, it prints the following message to stdout.
```
"Timeout querying rocminfo.  Are you compiling with more than 254 threads?"
```
Because this is ordinary output and not an explicit error message, `execute_process` in the code block linked above captures it in `OUTPUT_VARIABLE` and passes it on as a valid gfx arch, which causes these errors in CMake:
```
clang++: error: invalid target ID 'Timeout'; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., 'gfx908:sramecc+:xnack-')
clang++: error: invalid target ID 'querying'; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., 'gfx908:sramecc+:xnack-')
clang++: error: invalid target ID 'rocminfo.'; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., 'gfx908:sramecc+:xnack-')
clang++: error: invalid target ID 'Are'; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., 'gfx908:sramecc+:xnack-')
clang++: error: invalid target ID 'you'; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., 'gfx908:sramecc+:xnack-')
clang++: error: invalid target ID 'compiling'; format is a processor name followed by an optional colon-delimited list of features followed by an enable/disable sign (e.g., 'gfx908:sramecc+:xnack-')
```
The output can be properly pushed to `ERROR_VARIABLE` if rocm_agent_enumerator pushes the output to stderr instead of stdout. This can be done with the changes to the print statement in this PR or using the `logging` module.
2025-09-04 15:12:41 -04:00
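The stdout/stderr distinction described in the commit above can be illustrated with a minimal Python sketch. This is not the actual `rocm_agent_enumerator` code: `enumerate_targets`, `report_timeout`, `timed_out_query`, and `run_captured` are hypothetical names used only for illustration. The point is that a diagnostic written with `print(..., file=sys.stderr)` never reaches a consumer that reads stdout, so it cannot be mistaken for a gfx target name.

```python
import contextlib
import io
import sys

TIMEOUT_MESSAGE = "Timeout querying rocminfo.  Are you compiling with more than 254 threads?"


def report_timeout():
    """Send the diagnostic to stderr so stdout carries only target names."""
    print(TIMEOUT_MESSAGE, file=sys.stderr)


def enumerate_targets(query):
    """Print one target per line on stdout; return a nonzero code on timeout.

    `query` is a hypothetical callable that returns a list of target names
    or raises TimeoutError.
    """
    try:
        targets = query()
    except TimeoutError:
        report_timeout()
        return 1
    for target in targets:
        print(target)
    return 0


def timed_out_query():
    """Stand-in for a rocminfo query that never answers (illustration only)."""
    raise TimeoutError


def run_captured(query):
    """Capture stdout and stderr separately, as a calling build system would."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        code = enumerate_targets(query)
    return code, out.getvalue(), err.getvalue()
```

With this split, a caller that reads only the captured stdout sees an empty target list on timeout, and the message is available separately for an error variable.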
vedithal-amd 77ed80f457 Fix Performance (GFLOPs) metric (#843) 2025-09-04 14:30:22 -04:00
jamessiddeley-amd 5a85df8f31 [rocprof-compute] update coverage to 81.1% (#826)
* update coverage to 81.1%

* Update coverage to latest develop
2025-09-04 14:21:03 -04:00
SaleelK c4537e8050 SWDEV-553126 - Improve logging (#835)
* Ability to mask COPY api usage in logs
* Show total graph nodes in logs
* Add another log level for detailed debug
2025-09-04 10:08:41 -07:00
marandje 79bda80049 SWDEV-549686 - Resolve memory leaks in texture unit-tests (#711) 2025-09-04 17:59:18 +02:00
vstojilj 12ad8421bb SWDEV-549700 - Add missing destroy calls (#755) 2025-09-04 17:21:32 +02:00
jamessiddeley-amd 04c6cfec8c [rocprof-compute] updated test_utils mock to be compatible with python3.11 (#839)
* updated test_utils mock to work with python3.11

* fixed python formatting
2025-09-04 10:44:04 -04:00
German Andryeyev 7a1a6682e2 SWDEV-552846 - Unpin memory for hip before exiting the copy (#851) 2025-09-04 20:04:01 +05:30
Ioannis Assiouras 7bf7110ae8 SWDEV-550667 - Correct the check for availability of __hip_atomic_fetch_add (#818) 2025-09-04 15:15:34 +01:00
RahulC 9b4c12a357 Revert "[rocprofiler-sdk][SDK] Update to address new API changes for HIP ROCm…" (#850)
This reverts commit 5ac738150a.
2025-09-03 21:52:43 -07:00
systems-assistant[bot] 9f11d73561 SWDEV-541096 - add hipEventWaitDefault and hipEventWaitExternal (#460)
Co-authored-by: Li, Todd tiantuo <Toddtiantuo.Li@amd.com>
2025-09-03 20:19:06 -07:00
estewart08 bc35beafbf rocr: Remove extra LibElf find_package (#767)
This should have been removed when the libelf config search
was added.
2025-09-03 20:04:05 -04:00
Ammar ELWazir fa70fef03d Fixing ROCPD SQL Inserts with strange text (#825)
* Fixing ROCPD SQL Inserts with strange text

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update generateRocpd.cpp

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-09-03 13:08:40 -05:00
Saurabh Verma c6669309a1 Add some missing copyright info (#841) 2025-09-03 12:06:48 -05:00
systems-assistant[bot] 83a10986a4 SWDEV-539130 - Log blit copy duration (#258)
Co-authored-by: Pengda Xie <pengda.xie@amd.com>
2025-09-03 10:01:47 -07:00
Pengda Xie b9fc643a56 SWDEV-538789 - Cleanup unused values in perftests (#789) 2025-09-03 09:13:29 -07:00
Ajay GunaShekar f2ad8d6d5e SWDEV-553099 - remove WITHOUT_HSA_BACKEND usage (#831) 2025-09-03 08:40:25 -07:00
vedithal-amd 181bdf9ca1 [rocprofiler-compute] Fix MI100 tests (#832)
* Fix MI100 tests

* Handle missing roofline in db_analysis.py
2025-09-03 11:09:12 -04:00
systems-assistant[bot] baaca0f956 SWDEV-545485 - Fix memory leaks in memset performance tests (#541)
Co-authored-by: Satyanvesh Dittakavi <Satyanvesh.Dittakavi@amd.com>
Co-authored-by: Satyanvesh Dittakavi <53337087+satyanveshd@users.noreply.github.com>
2025-09-03 20:35:42 +05:30
systems-assistant[bot] 673c93e96e SWDEV-545482 - Refactor the error handling test cases (#544)
Co-authored-by: Satyanvesh Dittakavi <Satyanvesh.Dittakavi@amd.com>
2025-09-03 20:34:38 +05:30
Satyanvesh Dittakavi 86792d8562 SWDEV-549680 - Fix memory leaks in stream tests (#629) 2025-09-03 20:33:49 +05:30
Venkateshwar Reddy Kandula 5ac738150a [rocprofiler-sdk][SDK] Update to address new API changes for HIP ROCm 7.1 (#793)
* Add new HIP 7.1 changes.

* bug fix.

* bug fix.

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com>
Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-09-03 09:36:58 -05:00
systems-assistant[bot] 2cfedef6b6 [CI] Increase rocDecode and rocJPEG Code Coverage (#183)
* Increase rocDecode code coverage and add version check

* Update rocJPEG tests

* Fix rocJPEG tests

* Enable building tests/samples in rocm release compat workflow

* Readded rocJPEG test skips

* formatting

* Adding ROCm libraries for the code-coverage job

* Added return value check for error message and updated compatibility to enable tests

* Disable rocm_release_compatibility samples and tests until openmp issue is resolved

---------

Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Jonathan R. Madsen <Jonathan.Madsen@amd.com>
2025-09-03 19:20:11 +05:30
SaleelK 230a22b395 rocr: Workaround for peak SDMA b/w on gfx94x (#626)
* Ideally SDMA0/1/2 are the engines to use for H2D/D2H due to physical
  PCIE proximity
* Allow using same src/dst agent for SDMA query apis
2025-09-03 09:33:29 -04:00
David Galiffi a57fd50865 Update the rocprof-sys-rt library (#786)
Derived from Dyninst_RT 13.0.0
2025-09-03 09:19:43 -04:00
Sajina PK 2da209da7f SWDEV-536287 - Detect SELinux mode and log error if enabled (#819)
* Detect SELinux mode and fail-fast

* Detect SELinux status by reading /sys/fs/selinux/enforce during initialization.
* Fix the verbose mode for HIP Stream events

* Add more information in the logs
Add information to the user about how to change the setting
2025-09-03 09:16:36 -04:00
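As a rough illustration of the detection step named in the commit above, the following Python sketch reads `/sys/fs/selinux/enforce`, which holds `1` in enforcing mode and `0` in permissive mode. It is not the code from the commit: the function names, the decision to treat an unreadable file as "disabled", and the wording of the hint are all assumptions made for the example.

```python
from pathlib import Path

SELINUX_ENFORCE_PATH = "/sys/fs/selinux/enforce"


def selinux_mode(enforce_path=SELINUX_ENFORCE_PATH):
    """Return 'enforcing', 'permissive', or 'disabled'.

    Assumption: if the file cannot be read (selinuxfs not mounted),
    SELinux is treated as disabled.
    """
    try:
        value = Path(enforce_path).read_text().strip()
    except OSError:
        return "disabled"
    return "enforcing" if value == "1" else "permissive"


def selinux_warning(enforce_path=SELINUX_ENFORCE_PATH):
    """Return a log message when SELinux is enforcing, otherwise None.

    The message text is hypothetical; `setenforce 0` is the standard
    command for switching to permissive mode until the next reboot.
    """
    if selinux_mode(enforce_path) != "enforcing":
        return None
    return ("SELinux is in enforcing mode; "
            "run 'setenforce 0' as root to switch to permissive mode")
```

Taking the path as a parameter keeps the check testable on machines without SELinux.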
Ioannis Assiouras c28acac74d SWDEV-550882 - Fix hang in Unit_hipIpcMemAccess_Semaphores (#816) 2025-09-03 10:01:15 +01:00
Ajay GunaShekar d4435bb74a SWDEV-552573 - restore asm file before clang formatting (#802)
* Restore asm file before clang formatting
* Surround inline assembly in clang-format off
2025-09-03 14:26:49 +05:30
Scott Todd 1eb4bd26a6 Bump rocprofiler-systems/external/papi to papi-7-2-0b2-t. (#785) 2025-09-02 13:21:47 -04:00
systems-assistant[bot] 45e969f0c3 rocprofv3-avail minor script fix (#172)
* rocprofv3-avail script fix

* addressing feedback

* formatting

* rocprofv3 and rocprofv3-avail to display help when no args are provided

---------

Co-authored-by: gobhardw <gopesh.bhardwaj@amd.com>
2025-09-02 09:49:12 -07:00
systems-assistant[bot] 7e3ddf3de8 SWDEV-515512 - Enable memcpy synchronization_behaviour tests (#593)
* SWDEV-515512 - Enable memcpy synchronization_behaviour tests

* SWDEV-515512 - Remove invalid parts of the tests

* SWDEV-515512 - Format the code

---------

Co-authored-by: Marko Arandjelovic <Marko.Arandjelovic@amd.com>
2025-09-02 17:32:08 +02:00
systems-assistant[bot] 12ca3c9043 SWDEV-548482 - Address memory leaks in memory tests (#438)
* SWDEV-548482 - Address memory leaks in memory tests

* SWDEV-547453 - Do not alter the dev_ptr if operation is not successful

* SWDEV-548482 - Minor tweaks

* SWDEV-548482 - Move eventlist release after the command is created

---------

Co-authored-by: Marko Arandjelovic <Marko.Arandjelovic@amd.com>
2025-09-02 17:29:41 +02:00
systems-assistant[bot] 05a9a528f7 SWDEV-548482 - Address memory leaks in memory tests (#526)
* SWDEV-548482 - Address memory leaks in memory tests

* SWDEV-548482 - Added destroy calls

* SWDEV-548482 - Address one more memory leak

* SWDEV-548482 - Minor tweaks

* SWDEV-548482 - Run clang-format

* SWDEV-548482 - Add new lines

* SWDEV-548482 - Run clang-format

* SWDEV-548482 - Minor fix

---------

Co-authored-by: Marko Arandjelovic <Marko.Arandjelovic@amd.com>
2025-09-02 17:29:29 +02:00
jamessiddeley-amd f3a2bb07a4 [rocprofiler-compute] added ctest coverage and cdash submission (#366)
* added cdash automatic CI upload

* added cdash automatic CI upload

* tweaked wording

* changed nightly to continuous

* removed unnecessary dry-run arg

* updated README.md

* edited workflow description

* update coverage

* formatted cmakelists.txt

* ruff formatting and update coverage
2025-09-02 11:21:40 -04:00
systems-assistant[bot] bfbb005c42 SWDEV-491252 - Add stream capture testcases to host allocation APIs (#590)
* SWDEV-491252 - Add stream capture testcases to host allocation APIs

* SWDEV-491252 - Add stream capture behavior testcase for hipFreeHost

* SWDEV-491252 - Refactor capture testcases

* SWDEV-491252 - Run clang-format

---------

Co-authored-by: Marko Arandjelovic <Marko.Arandjelovic@amd.com>
2025-09-02 16:58:02 +02:00
systems-assistant[bot] 0468340d03 SWDEV-524815 - Specify path to hipconfig.exe instead of hipconfig on windows (#421)
Co-authored-by: Ioannis Assiouras <Ioannis.Assiouras@amd.com>
2025-09-02 15:15:26 +01:00
Ioannis Assiouras a1c30318fb SWDEV-546223 - Get image support info from ISA meta (#773) 2025-09-02 15:05:18 +01:00
systems-assistant[bot] ae874b489d SWDEV-515530 - Re-enable passing tests (#592)
* SWDEV-515530 - Re-enable passing tests

* SWDEV-515530 - Revert back windows config file

* SWDEV-515530 - Fix new line

* SWDEV-515530 - Enable a few more tests

* SWDEV-515530 - Enable passing VMM tests

* SWDEV-515530 - Disable failing tests

* SWDEV-515530 - Fix and enable texture tests

* SWDEV-515530 - Minor fixes

* SWDEV-515530 - Disable one more test

---------

Co-authored-by: Marko Arandjelovic <Marko.Arandjelovic@amd.com>
2025-09-02 16:03:07 +02:00
Mythreya Kuricheti 43ac6b2ef5 [rocprofiler-sdk] Add support for new RCCL API (#771)
* [rocprofiler-sdk] Add support for new RCCL API

Add support for `ncclAllReduceWithBias`

* Move func to be in sync with rccl header
2025-09-02 16:17:44 +05:30
systems-assistant[bot] 4174508dcc SWDEV-545953 - Tests for hipStreamGetId (#529)
Co-authored-by: Satyanvesh Dittakavi <Satyanvesh.Dittakavi@amd.com>
Co-authored-by: ammallya <ameyakeshava.mallya@amd.com>
Co-authored-by: swargamrambabu <rambabu.swargam@amd.com>
Co-authored-by: jainprad <92369414+jainprad@users.noreply.github.com>
2025-08-29 16:07:01 +05:30
habajpai-amd cd729ab630 Improve library discovery in openmp-target example (#792)
cmake(openmp/target): make libomptarget discovery robust across ROCm layouts
2025-08-28 14:55:55 -04:00
gabrpham 94e194eba2 [SWDEV-540377] Fixed segfault in --showevent command (#649)
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-08-28 11:49:36 -05:00
Sam Ruscica 9018e0fc7b SWDEV-546639 monorepo fix for nvidia hip runtime api (#746)
* SWDEV-546639 monorepo fix for nvidia hip runtime api

* Added back hipSetValidDevices.
2025-08-28 09:03:37 -07:00
usrihari123 2449bfd483 Update the scratch memory docs with the new allocation_size field (#685)
* Update the scratch memory docs with the new allocation_size field

* Address review comment

---------

Co-authored-by: Srihari <srihariu1@gmail.com>
2025-08-28 17:37:06 +05:30
Ioannis Assiouras 1017532916 SWDEV-546631 - Fix hipLaunchHostFunction in stream capture for windows (#654) 2025-08-28 07:51:50 +01:00
itrowbri 4d98a0169f Handle special cases when stream value is hipStreamLegacy (0x01) or hipStreamPerThread (0x02) (#343)
* Updated stream code to handle special cases when stream value is 0x01 or 0x02

* Removed extra definitions and updated tests to account for special case

* Modified stream.cpp so that each thread assigned a unique stream ID when hipStreamPerThread is used as stream value. Modified tests to check that threads are assigned unique, repeated values when hipStreamPerThread is called

* Updated idx_offset, stream_map, and thread counter to be in one struct.

* Update stream.cpp to only use add_stream() and update tests for separate unit test for hipStreamPerThread

* Remove unnecessary comment

* Removed unnecessary line

* Updated tests and stream.cpp to update stream ID correctly

* Updated test structure
2025-08-27 20:04:13 -05:00
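The per-thread behavior described in the commit above (each thread gets its own stream ID when `hipStreamPerThread` is passed, and the same thread gets the same ID on repeated calls) can be sketched in Python as follows. This is only a model of the idea, not the `stream.cpp` implementation: the `StreamRegistry` class and its internals are hypothetical, and only the handle values `0x01` and `0x02` and the name `add_stream` come from the commit message.

```python
import threading

HIP_STREAM_LEGACY = 0x01      # value as given in the commit title
HIP_STREAM_PER_THREAD = 0x02  # value as given in the commit title


class StreamRegistry:
    """Hypothetical registry mapping stream handles to small integer IDs."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ids = {}      # lookup key -> assigned stream ID
        self._next_id = 0

    def add_stream(self, handle):
        """Return the ID for `handle`, assigning a new one on first use."""
        key = handle
        if handle == HIP_STREAM_PER_THREAD:
            # The per-thread handle is shared by all threads, so the
            # calling thread's identity becomes part of the lookup key.
            key = (handle, threading.get_ident())
        with self._lock:
            if key not in self._ids:
                self._ids[key] = self._next_id
                self._next_id += 1
            return self._ids[key]
```

Keying on `(handle, thread)` only for the per-thread handle leaves `hipStreamLegacy` and ordinary stream handles mapped to a single ID each, regardless of the calling thread.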