Commit Graph

1424 Commits

Author SHA1 Message Date
Gopesh Bhardwaj 1585fe59cd [Documentation] Repo location and limitation update (#1537) 2025-10-27 12:26:05 -04:00
MachineTom eb69a455ed SWDEV-558844 - Cleanup Os header (#1530)
Remove code that isn't used in the Os header.
2025-10-27 11:52:31 -04:00
systems-assistant[bot] c1926d547e SWDEV-515530 - Re-enable passing tests on NV (#605) 2025-10-27 16:32:37 +01:00
Benjamin Welton d496bcef18 Fix dimension mismatch for multi-GPU systems with identical architect… (#1440)
* Fix dimension mismatch for multi-GPU systems with identical architectures

This change addresses an issue where counter dimensions were incorrectly
shared across all GPU agents with the same architecture name, even when
those agents had different hardware configurations (e.g., different CU counts).

Changes:
- Updated getBlockDimensions() to accept agent ID instead of architecture name
- Made dimension cache agent-specific instead of architecture-specific
- Updated set_dimensions() in AST evaluation to use specific agent ID
- Modified all API functions to handle agent-specific dimension lookups
- Updated tests to work with agent-specific dimensions

This fix ensures that dimensions accurately reflect the actual hardware
configuration of each individual GPU agent, preventing dimension mismatches
in multi-GPU systems where GPUs share the same architecture but have
different physical configurations.

Counter ID Representation Changes:
- Modified counter_id encoding to include agent information in bits 37-32
- Agent logical_node_id is encoded as (value + 1) to ensure agent 0 is detectable
- Counter records internally store only 16-bit base metric IDs (bits 15-0)
- Tool reconstructs agent-encoded counter IDs from base metric ID & agent info
- Instance record counter_id field uses bitwise AND mask to extract base metric ID
  (counter_id.handle & 0xFFFF) to fit in 16-bit storage
- Output generators (CSV, JSON, Perfetto) use agent-encoded IDs for consistency
- Updated counter_config.cpp and metrics.cpp to extract base metric ID when needed
- All counter lookups now properly handle agent-encoded vs base metric IDs

This ensures counter IDs are consistent between metadata and output records while
maintaining compact storage in instance records.
2025-10-27 07:58:20 -07:00
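The counter ID layout described in the commit above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the change: the function and constant names are hypothetical, and bits 31-16 are simply left as zero because the message does not say what they hold. Only the bit positions (agent in bits 37-32, base metric ID in bits 15-0), the `+ 1` offset, and the `0xFFFF` mask come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical names; only the bit layout is taken from the commit message.
constexpr std::uint64_t kBaseMetricMask = 0xFFFFULL;  // bits 15-0: base metric ID
constexpr unsigned      kAgentShift     = 32;         // agent field starts at bit 32
constexpr std::uint64_t kAgentFieldMask = 0x3FULL;    // six bits wide: bits 37-32

// Build an agent-encoded counter ID. The agent is stored as (logical_node_id + 1)
// so that a field value of zero can mean "no agent encoded".
// Assumption: logical_node_id is at most 62, so the value fits in six bits.
constexpr std::uint64_t encode_counter_id(std::uint16_t base_metric,
                                          std::uint32_t logical_node_id) {
    return (((static_cast<std::uint64_t>(logical_node_id) + 1) & kAgentFieldMask)
            << kAgentShift) |
           base_metric;
}

// Recover the 16-bit base metric ID, as an instance record would store it.
constexpr std::uint16_t extract_base_metric(std::uint64_t handle) {
    return static_cast<std::uint16_t>(handle & kBaseMetricMask);
}

// Recover the agent's logical node ID, or -1 when no agent is encoded.
constexpr int decode_agent(std::uint64_t handle) {
    return static_cast<int>((handle >> kAgentShift) & kAgentFieldMask) - 1;
}

// Round-trip check: the base metric and agent survive encoding.
inline void check_round_trip() {
    const std::uint64_t id = encode_counter_id(0x0012, 0);
    assert(extract_base_metric(id) == 0x0012);
    assert(decode_agent(id) == 0);
}
```

With this layout, agent 0 produces a non-zero agent field, so a bare base metric ID (agent field zero) stays distinguishable from an ID that was encoded for agent 0.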
systems-assistant[bot] e22856b3ac SWDEV-515562 - Fix and enable hipDeviceReset tests (#594) 2025-10-27 15:07:44 +01:00
systems-assistant[bot] 8cc65f49c4 SWDEV-491296 - Add stream capture testcases to Virtual Memory APIs (#589) 2025-10-27 15:06:51 +01:00
marantic-amd 08d259c24c Fix the issue when sampling JAX with rocpd (#1552) 2025-10-27 09:59:51 -04:00
David Yat Sin f7b180ee7d rocr: SW workaround for gfx90x SDMA poll (#1469)
Workaround for rare issue on gfx90x asics when SDMA_OP_POLLREGMEM
returns before polled memory has value of 0.
Removing previous SW workaround to double-poll as it was not reliable.
2025-10-27 09:33:20 -04:00
David Yat Sin db01d95ebc Users/dayatsin/swdev 519413 hsa amd pointer info return err shutdown (#1509)
* rocr: hsa_amd_pointer_info return err on shutdown

Decrement ref count before starting to unload to make sure API
calls during shutdown return error.

Delete blit objects during agent destructor.

* Add support for HSA_AMD_SYSTEM_SHUTDOWN_EVENT

Add support for new event to indicate shut down within the
hsa_amd_register_system_event_handler API.
2025-10-27 09:32:52 -04:00
systems-assistant[bot] 45d6598724 SWDEV-517867 - Enable Unit_hipStreamCreateWithPriority_MulthreadDefaultflag (#599) 2025-10-27 11:36:40 +01:00
systems-assistant[bot] abaf29d0b6 SWDEV-537855 - Add hipEventDestroy (#554)
Co-authored-by: Vladana Stojiljkovic <Vladana.Stojiljkovic@amd.com>
2025-10-26 21:20:21 +01:00
SaleelK f301053740 clr: Improve logging (#1457) 2025-10-25 15:55:27 -07:00
David Galiffi e22a8e865e Update Timemory submodule (#1539)
- Fixes clang build failure

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: Aleksandar Janicijevic <Aleksandar.Janicijevic@amd.com>
2025-10-25 14:56:43 -04:00
David Galiffi 28c2728b6b Update Dyninst module (#1540)
- Fix nullptr check

------
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: Aleksandar Janicijevic <Aleksandar.Janicijevic@amd.com>
2025-10-25 14:56:29 -04:00
MachineTom 6a49171fa5 SWDEV-562431 - Fix Unit_hipBindTexture_Negative failure (#1523) 2025-10-24 16:25:22 -04:00
Rakesh Roy e9dac39102 SWDEV-560065 - Revert changes to align error code with Cuda when stream capture is tried on Legacy stream (#1337)
* SWDEV-560065 - Revert "SWDEV-555484 - Invalidate capturing stream only for null/legacy stream. (#1032)"

This reverts commit 99613f1009.

* SWDEV-560065 - Revert "SWDEV-542700 - Return an error if stream capture is attempted on the null stream while a stream capture is active. (#450)"

This reverts commit 0647cf1d28.
2025-10-24 21:33:25 +05:30
Milan Radosavljevic 8806be162c Change how cache manager handles child process trace cache for rocpd (#1033)
* Change how cache manager handles child process trace cache

* Sampling and backtrace metrics to cache

* Apply cmake formatting

* Fix parsing of metadata json

* Code clean up

* Fix build nlohmann json from source

* Fix storage parsed finished callback

* Revert sampling for child process

* Change cache file name generating

* Fix thread start stop

* Fix process start end timestamp

* Applied suggestions from code review

* Try with late start of flushing task thread

* Change dockerfiles for ci

* Revert changes on github workflows

* Remove json_fwd.hpp include

* fix dump

* Build nlohmann/json by default

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update location of build artifacts for nlohmann/json

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Revert use_output_suffix

* Remove unused logs

* Fix cache store inside counter due to structure change

* Remove decode tests from debian ci

* Fix issue where all databases have the same UUID (#1499)

Co-authored-by: Aleksandar Djordjevic <adjordje@amd.com>

* Removing the cpack and install steps to save space

* Revert "Remove decode tests from debian ci"

This reverts commit ddabf6dd142dcf438e6b8997b8abe86f2c868468.

* Revert "Removing the cpack and install steps to save space"

This reverts commit 973da3a1ba99d99d529af5269d30e177092f9bfa.

* Add prepare-runner job as dependency to clean up the space

* Fix formatting

* Free up even more space

* Remove verbose for workflows

* remove hw_counters from ext_data

* move space clean up inside container

* try to remove external folder to free up space

* Check space

* Refactor Cleanup to its own step

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: Aleksandar Djordjevic <aleksandar.djordjevic@amd.com>
Co-authored-by: Aleksandar Djordjevic <adjordje@amd.com>
2025-10-24 11:47:15 -04:00
Rahul Manocha 4f075902fc SWDEV-555347 - Remove lock contention in async events loop (#878)
* SWDEV-555347 - Remove lock contention in async events loop

* SWDEV-555347 - Introduce Pool of AsyncEventItems

* create generic mempool for AsyncEventItem

* Use BaseShared allocate and free for async event pool

---------

Co-authored-by: Rahul Manocha <rmanocha@amd.com>
2025-10-24 08:43:00 -07:00
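The commit above mentions a generic memory pool for `AsyncEventItem`. The general idea of recycling items instead of allocating one per event can be sketched as below. Everything here is hypothetical: `ItemPool`, `EventItem`, and the free-list design are illustrative stand-ins, and the actual change (which the message says uses BaseShared allocate and free) may look quite different. This sketch still takes a short lock around each push and pop, so it shows only the pooling, not how the commit removes contention.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical generic pool: keeps released items on a free list so that a
// later acquire() can reuse them instead of allocating again.
template <typename T>
class ItemPool {
public:
    // Hand out a recycled item when one is idle, otherwise allocate a new one.
    std::unique_ptr<T> acquire() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            if (!free_.empty()) {
                std::unique_ptr<T> item = std::move(free_.back());
                free_.pop_back();
                return item;
            }
        }
        return std::make_unique<T>();  // allocation happens outside the lock
    }

    // Return an item to the pool; null pointers are ignored.
    void release(std::unique_ptr<T> item) {
        if (!item) return;
        std::lock_guard<std::mutex> lock(mutex_);
        free_.push_back(std::move(item));
    }

    // Number of items currently idle in the pool.
    std::size_t idle_count() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return free_.size();
    }

private:
    mutable std::mutex mutex_;
    std::vector<std::unique_ptr<T>> free_;
};

// Stand-in payload type for the sketch (not the real AsyncEventItem).
struct EventItem {
    int id = 0;
};

// Demonstrates that a released item is handed back by the next acquire().
inline void check_reuse() {
    ItemPool<EventItem> pool;
    std::unique_ptr<EventItem> first = pool.acquire();
    EventItem* raw = first.get();
    pool.release(std::move(first));
    std::unique_ptr<EventItem> second = pool.acquire();
    assert(second.get() == raw);
}
```

A recycled item keeps whatever state it had when it was released, so a caller would normally reset its fields before reuse.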
marandje 7e20e8ec13 SWDEV-548500 - Resolve memory leaks in memory tests (#1093) 2025-10-24 16:27:48 +02:00
pghoshamd 95f721f8a5 Check emulator mode at runtime (#1432)
* Check emulator mode at runtime

* Reduce emu mode function call to one time and use result

* Move function to main.cc

* Address feedback

* EmuMode check improvement; convert to AoS

* replace g_isEmuMode with func call

* Add mode check func for every sample
2025-10-24 10:11:19 -04:00
systems-assistant[bot] 339877853d SWDEV-487395 - Add capture testcases to memcpy APIs (#587) 2025-10-24 12:43:45 +02:00
systems-assistant[bot] 196086042d SWDEV-523137 - Enable and fix failing tests on NV (#602) 2025-10-24 12:41:54 +02:00
Jatin Chaudhary 48313b8655 SWDEV-1 add missing hiperror entries (#1450) 2025-10-24 09:29:27 +01:00
abchoudh-amd a7bbe0c5d2 Use amd-smi Python API instead of CLI (#1334)
* Use amd-smi Python API instead of CLI

Formatting fix

python path

* Update CHANGELOG

* Create amdsmi interface

* Added amdsmi tests

* Removed run

* Prioritize rocm's amdsmi python API

* address review comments

* update changelog

* fix ruff formatting

---------

Co-authored-by: Vignesh Edithal <Vignesh.Edithal@amd.com>
2025-10-24 11:11:33 +05:30
SaleelK 839fb95717 clr: Do not increase signal pool (#1354)
* Do not increase the signal pool when profiling; instead, allow saving off
  timestamps. This is slow, but it is a tradeoff against the memory
  footprint of the signals.
2025-10-23 22:05:00 -07:00
MachineTom 5f76cb916d SWDEV-555888 - Refactor Numa code (#1191)
1. Create a set of mini NUMA interfaces.
On Linux, the interface is based on system calls rather than libnuma.
On Windows, the interface also works, but the policy class is a dummy.
Unlike Linux, Windows doesn't provide a numactl tool or a NUMA library to set up a NUMA policy, so
the default policy is followed on Windows: the closest host NUMA node is used to allocate
pinned host memory in hipHostMalloc().
To get the closest host NUMA node of a GPU device, query the new attribute
hipDeviceAttributeHostNumaId. You can then create a thread with CPU affinity on that NUMA node.
For an example, see the test in hip-tests/catch/perftests/memory/hipPerfHostNumaAllocWin.cc.

2. Remove pfnSetThreadGroupAffinity and pfnGetNumaNodeProcessorMaskEx as the functions have been exposed since Win7 and Win server 2008.

3. Other minor fixes.
2025-10-23 21:56:15 -04:00
Ioannis Assiouras 602ea0be1e SWDEV-558078 - Fix use-after-free in graph tests due to AsyncEventHandler (#1502) 2025-10-23 22:49:24 +01:00
Julia Jiang 4942f3cae5 SWDEV-555548 - Fix Unit_hipMemPoolMaxAlloc failure on Windows (#1486) 2025-10-23 17:09:46 -04:00
nunnikri 45528ea3fc SWDEV-559329 : Added missing hash value needed for module file (#1431) 2025-10-23 12:05:41 -07:00
Pengda Xie a4bbd73dc6 SWDEV-556684 - Remove HSAIL support (#1183) 2025-10-23 11:21:49 -07:00
Kian Cossettini db949445c3 [rocprofiler-systems] Overhaul OpenMP-VV Test compilation (#1389)
* Reworked Compilation

* Formatting

* Change compile log name

* Optimize Code

* Remove gfx940 and gfx941
2025-10-23 13:58:11 -04:00
Venkateshwar Reddy Kandula 40f9f15ece use rhel 8.10 amdgpu kernel driver for rhel 8.8 (#1490) 2025-10-23 09:00:10 -05:00
Charis Poag Jones 933fdc3c7e [SWDEV-558141] Fix rocm-smi --setsclk [0...n] & other clocks in partitioned configurations (#1493)
Changes:
  - Fix `rocm-smi --setsclk [0 .. n]` for multiple devices to continue on fail when
    in a partitioned configuration (ex. in DPX/QPX/CPX/etc).
  - Partitioned configurations or devices which do not support changing
    sclk/mclk/pcie clks will now continue on failure. A "not supported"
    or other rocm-smi error code is reported for these devices.
  - Updates impact other clock settings such as `--setmclk` and
    `--setpcie`.

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-10-23 08:56:41 -05:00
vedithal-amd 2a37cbf2ca Bump VERSION and add CHANGELOG for ROCm 7.1.1 release (#1447) 2025-10-23 09:34:18 -04:00
ywang103-amd ee805d1014 remove option of json as rocprofv3's intermediate file to avoid test failures of outdated code (#1474) 2025-10-23 09:33:54 -04:00
Gopesh Bhardwaj 30bcf123a8 build fix for linker error (#1376) 2025-10-23 17:35:51 +05:30
Ioannis Assiouras 6d6b136374 SWDEV-559166 - Fix data races in GetSubmissionBatch, CaptureAndSet and SetQueueStatus (#1441) 2025-10-23 12:18:31 +01:00
amd-srinivas1 e99bd0c783 SWDEV-546345-[catch2][dtest]-Added tests for hipMemcpy3DPeer Apis(Memory Management) (#897)
* SWDEV-546345-Added tests for hipMemcpy3DPeer apis

* SWDEV-546345-Removed nested SECTIONS.

* SWDEV-546345-Optimized the code.

* SWDEV-546345-Addressed Review comments

* SWDEV-546345-Added image check support

---------

Co-authored-by: jainprad <92369414+jainprad@users.noreply.github.com>
2025-10-23 14:40:13 +05:30
ywang103-amd 9b562c0e58 pc sampling multi kernel (#1382)
* initial commit

* add csv support extraction for non kernel selection mode

* add --kernel-trace for rocprofiler-sdk mode

* make non kernel selective mode runnable

* make kernel selection work with -k

* remove upper case of arg hint

* update documentation

* display same kernel name at only one place and merge instruction id with same obj id as well as offset

* remove kernel name's display for single kernel selection

* change log added

---------

Co-authored-by: Fei Zheng <44449748+feizheng10@users.noreply.github.com>
2025-10-23 01:26:08 -04:00
Gerardo Hernandez a128884078 SWDEV-541351 - query engine clock frequency via amdsmi to avoid clock tests being flaky (#1186) 2025-10-23 06:09:51 +01:00
Jimbo 37f2be9140 SWDEV-554608 - Add hipHostRegisterIoMemory for hipHostRegister (#962)
* SWDEV-554608 - Add hipHostRegisterIoMemory for hipHostRegister

* SWDEV-554608 - Add hipHostRegisterIoMemory for hipHostRegister

* SWDEV-554174 Added hipHostRegisterIoMemory flag in test cases

* SWDEV-554174 : Did formatting corrections

* SWDEV-554608 - set HSA_AMD_MEMORY_POOL_UNCACHED_FLAG if IoMemory is set

* SWDEV-554608 - set HSA_AMD_MEMORY_POOL_UNCACHED_FLAG if IoMemory is set

* SWDEV-554608 - Add hipHostRegisterIoMemory for hipHostRegister

---------

Co-authored-by: Anavena Venkatesh <Anavena.Venkatesh@amd.com>
Co-authored-by: Rambabu Swargam <rambabu.swargam@amd.com>
2025-10-22 20:25:59 -04:00
Mark Meserve 79076c4ad5 attach: Cleanup docs from initial commit (#1302)
- Remove unimplemented older API functions
- Remove mentions of reattach API
- Remove details on implementing a process attachment library
  - This will return later as a theory of operation
2025-10-22 16:16:49 -05:00
Todd tiantuo Li bc7898c687 SWDEV-556751 - skip Unit_hipEventRecord (#1239) 2025-10-22 13:49:22 -07:00
xuchen-amd 578589d363 [rocprofiler-compute] metrics generator (#1199) 2025-10-22 15:17:43 -04:00
David Galiffi e453705d9b Check if test exists before adding validation (#1478)
* Check if test exists before adding validation

* Adjust validation parameters for rocpd_string

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2025-10-22 12:19:38 -04:00
Swati Rawat 3808f7ea76 rocpd documentation improvements (#1498) 2025-10-22 11:59:22 -04:00
pcritchl-amd 63a991a8b9 SWDEV-543498 - Some compute Ubertrace profiles are missing queue timing data (#1146) 2025-10-22 08:56:33 -07:00
Jatin Chaudhary ee93c9ddab SWDEV-545100 - add two SPIRV targets (#1037) 2025-10-22 11:39:45 -04:00
solaiys eab103d4ed [RDHC] Update rocm-core package scripts to include rdhc script (#1482)
* Add rdhc script into the rocm-core package
  * Create the rdhc symlink within the package itself.
  * rdhc tool support is not enabled for Windows.

  * [RDHC] Check if the required pip pkgs are present and warn.
     rdhc checks whether the required pip packages are present.
     If not, it warns the user and exits gracefully.

Signed-off-by: Saravanan Solaiyappan <saravanan.solaiyappan@amd.com>
2025-10-22 19:54:40 +05:30
marandje aa4dee57b5 SWDEV-555295 - Fix and enable Unit_hipFreeAsync_Negative_Parameters (#991) 2025-10-22 15:57:54 +02:00