Commit Graph

64742 Commits

Author SHA1 Message Date
Ammar ELWazir fee5bd9a4e Fixing ROCProfiler Register CI & ROCProfiler-SDK Docs CI (#1570)
---------

Co-authored-by: bgopesh <gopesh.bhardwaj@amd.com>
2025-11-03 09:24:32 -06:00
systems-assistant[bot] 740b27528f kfdtest: Enable GPU selection via CLI for multi-GPU tests (#245)
* kfdtest: Enable GPU selection via CLI for multi-GPU tests

Replaced environment variable-based GPU selection with
GPU selection via command-line parameter --concurrentnodes (-c)
Modified g_TestGPUsNum to be passed in via command-line
parameter --testnodenum (t)

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>

* kfdtest: Enable GPU selection via CLI for multi-GPU tests
Replaced environment variable-based GPU selection with
GPU selection via command-line parameter --concurrentnodes (-c)
Modified g_TestGPUsNum to be passed in via command-line
parameter --testnodenum (t)

---------

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
Co-authored-by: Alysa Liu <Alysa.Liu@amd.com>
2025-11-03 09:27:38 -05:00
vedithal-amd bb5fd1d4ae [rocprofiler-compute] Update analysis db for visualizer integration (#1548)
* Analysis db changes for visualizer

* Add support for per kernel analysis metrics

* Add support for dispatch timeline visualization

* Show median instead of mean of dispatch duration in kernel view

* Add test case to validate analysis db schema

* Analysis db schema update
    * Add Kernel table and make Metric and Dispatch table its children
    * Kernel table is a child of Workload table
    * Update metric_view to show kernel_name column
    * Add dispatch timestamps to Dispatch table for dispatch timeline
      visualization
    * Update kernel_view to show duration_ns_median instead of mean
      duration

* Add mean duration in kernel view

* update changelog

---------

Co-authored-by: Fei Zheng <44449748+feizheng10@users.noreply.github.com>
2025-11-03 09:25:12 -05:00
vedithal-amd dbb361c606 [rocprofiler-compute] fix parser to prevent missing metrics in analysis mode (#1613)
* fix parser

* fix parser

* fix parser

---------

Co-authored-by: fei.zheng <fei.zheng@amd.com>
Co-authored-by: ywang103-amd <ywang103@amd.com>
2025-11-03 09:23:22 -05:00
Victor Zhang 437ce0b8df fix atomics SystemTest() use after free (#1595) 2025-11-02 21:45:44 -05:00
arvindcheru fb1d32c15c SWDEV-530465 Update share/doc/<pkgnm> License Folder for hsa-rocr (#923)
* SWDEV-530465 Update share/doc/<pkgnm> License Folder for hsa-rocr
* Review Comments Updated - reverted to usage of DOCDIR
2025-10-31 23:21:22 -04:00
lmoriche f5bbb09c0d clr: SWDEV-547890 - Maintain an MQD for the emulated AQL queue (#1316)
* clr: SWDEV-547890 - Maintain an MQD for the emulated AQL queue

To simplify the shader debugger implementation, maintain the relevant
parts of the emulated AQL queue's MQD (amd_queue_t): read_dispatch_id,
write_dispatch_id, compute_tmpring_size.

With this MQD, the shader debugger can handle the emulated AQL queue
the same way it does the real AQL queue, no specialization is required.

* clr: SWDEV-547890 - Conservatively update the MQD's read_dispatch_id

The read_dispatch_id cannot be smaller than the current aql_packet_id
- hsa_queue.size for the debugger to work correctly.

The read_dispatch_id really should be updated when the CmdBuf is marked
as complete. Left a FIXME to address it in a future commit.
2025-10-31 16:07:02 -04:00
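The bound stated in the second part of the commit above (read_dispatch_id must not fall below the current aql_packet_id minus hsa_queue.size) can be illustrated with a small sketch. This is not the clr implementation; the function name and the clamp at zero are assumptions made for illustration only.

```python
def conservative_read_dispatch_id(current_read_id: int,
                                  aql_packet_id: int,
                                  queue_size: int) -> int:
    """Return a read_dispatch_id that respects the bound from the commit message.

    Hypothetical helper: the names mirror the commit text, not real clr symbols.
    """
    # The lower bound is aql_packet_id - queue_size. It is clamped at 0 here
    # because the IDs are assumed to be unsigned counters.
    lower_bound = max(aql_packet_id - queue_size, 0)
    # Never move the read index backwards; only raise it to the bound if needed.
    return max(current_read_id, lower_bound)
```

With a queue of 64 slots and aql_packet_id at 100, a stale read index of 0 would be raised to 36, while an index already at 50 would be left unchanged.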
Satyanvesh Dittakavi f332888366 SWDEV-560304 - Fix segfault with invalid stream (#1360) 2025-11-01 00:04:44 +05:30
David Galiffi 5850d5b973 Updating documentation (#1602)
* Update rocprof-sys-feature-set.rst

* Update configuring-runtime-options.rst
2025-10-31 14:30:25 -04:00
Jaydeep 10763f0e7a SWDEV-559505 - Enable back memset optimization and handle the cases when setParam can change the number of AQL packets for memset graph node. (#1320)
Co-authored-by: jaydeeppatel1111 <jaypatel@amd.com>
2025-10-31 22:49:14 +05:30
Ossian O'Reilly b9de7baaa9 Update README.md (#1611)
* Update README.md

Add missing directory in git sparse-checkout instructions

* Update README.md typo

---------

Co-authored-by: Young Hui - AMD <145490163+yhuiYH@users.noreply.github.com>
2025-10-31 13:16:09 -04:00
Yiannis Papadopoulos 37bbc9062a rocr/aie: Detect AIE architecture and marketing name (#1459)
* rocr/aie: Detect AIE architecture and marketing name

* rocr/aie: Modernize code, update comments
2025-10-31 09:10:18 -05:00
Yiannis Papadopoulos 82d68fc772 rocrtst: Assume that AIE agent memory is system RAM (#1231) 2025-10-31 09:10:00 -05:00
Kian Cossettini 883caf2719 [rocprofiler-systems] Overhaul skip condition of implicit_task and add ROCPD validation test (#1589)
- Add rocpd validation check and fix implicit_task check
- SWDEV-562896
2025-10-31 09:59:23 -04:00
Ioannis Assiouras 1dd0237cb2 SWDEV-563752 - Allow hipMemLocationTypeHost in hipMemSetAccess even if memory was created on the device (#1620)
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
2025-10-31 13:57:36 +00:00
ywang103-amd 24cb8c4deb fix crashes related to metric generator and add copyright (#1608)
* fix crash created by path and arg for pc_sampling and add copyright for mat_mul

* resolve format issue of line too long

* bugfixes

* copy gfx9 config template to analysis config in src

---------

Co-authored-by: Wang <ywang103@ctr2-alola-login-01.amd.com>
Co-authored-by: Vignesh Edithal <Vignesh.Edithal@amd.com>
2025-10-30 16:36:56 -04:00
Dmitrii a2cff3c84d [RDC] Fix GPU_COUNT metric to only count GPUs (#1453)
* [RDC] Fix GPU_COUNT metric to only count GPUs
* [RDC] Clean up float->double casts

---------

Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-10-30 12:50:47 -05:00
Dmitrii e0ec72ccdd [rdc] Bump rocprofiler-sdk requirement to 1.1.0 (#1610)
Fixes RDC builds broken by #1563
2025-10-30 10:06:45 -04:00
marandje cfbb2230ea SWDEV-491296 - Fix Unit_hipMemImportFromShareableHandle_Capture (#1564) 2025-10-30 15:06:26 +01:00
cadolphe-amd 458c25c3a0 SWDEV-556658 - Update Unit_TexObjectCreate_TypePitch2D_IncompleteInit to align with API (#1144) 2025-10-29 11:36:45 -04:00
xuchen-amd b774f28181 [rocprofiler-compute] Remove grafana and mongodb integration (#978)
* Remove grafana and mongodb integration

* Remove grafana documentation assets

* clarify changelog

---------

Co-authored-by: Vignesh Edithal <Vignesh.Edithal@amd.com>
2025-10-29 11:32:06 -04:00
dsicarov-amd 4915496bf9 SWDEV-533237 Add hipOccupancyAvailableDynamicSMemPerBlock API (#899)
* SWDEV-533237 Add initial support for hipOccupancyAvailableDynamicSMemPerBlock API

* SWDEV-533237 Add hipOccupancyAvailableDynamicSMemPerBlock wrapper for nvidia

* SWDEV-533237 Add implementation of hipOccupancyAvailableDynamicSMemPerBlock API

* SWDEV-533237 Add LDSAlignment field in Isa table

---------

Co-authored-by: Rahul Manocha <rmanocha@amd.com>
2025-10-29 10:58:42 +01:00
Istvan Kiss 197f73dac9 Sync HIP documentation 2025-10-20 (#1258)
* Add examples to tools folder
* Correct P2P memory access section
* Sync porting guide
* Add HIP Graph tutorial
* Add hint about using amdgpu-dkms for IPC API
* Add a few more env variables
2025-10-29 07:42:06 +01:00
Geo Min 8e98b80deb [TheRock CI] Fixing patches for rocm-systems (#1460)
* Fixing patches for rocm-systems

* Adding all

* Adding remaining projects

* Submodule bump

* adding compiler

* adding test commit hash

* Adding artifact group

* adding update for artifact group

* Adding new commit hash
2025-10-28 19:47:17 -07:00
Ajay GunaShekar 22213c0ec3 SWDEV-559569 - enable fixed tests (#1363) 2025-10-28 12:17:15 -07:00
David Galiffi 3d7a5eec0e Setup rocprofsys_root environment variable (#1561)
* Setup `rocprofsys_root` environment variable

* Update `CHANGELOGS`

* Fixed formatting

* Add rocpd output and validation to python tests

* Refactoring environment setup
2025-10-28 13:06:07 -04:00
Venkateshwar Reddy Kandula c5bd693478 [rocprofiler-sdk] Disable HIP/CLR build in rocprofiler-sdk CI jobs (#1574)
* disable HIP/CLR build

* misc. fix
2025-10-28 11:42:11 -05:00
Gopesh Bhardwaj 2be2945228 Version bump and CHANGELOG update for 7.1 (#1563) 2025-10-28 11:53:32 -04:00
Swati Rawat f0f008d494 Update using-rocprofv3-process-attachment.rst (#1534) 2025-10-28 11:52:23 -04:00
ywang103-amd 99183ffd92 fix failure of pc sampling and unit tests (#1526) 2025-10-28 11:30:32 -04:00
systems-assistant[bot] 00b2bd3e8c SWDEV-515530 - Re-enable passing test (#598) 2025-10-28 11:23:30 +01:00
Ajay GunaShekar f8e3858659 remove usage of HIP_RETURN in internal function (#1359) 2025-10-27 15:37:46 -07:00
Rahul Manocha f5d901f016 SWDEV-546311 - implement hipKernelGetLibrary & hipLibraryEnumerateKer… (#1143)
* SWDEV-546311 - implement hipKernelGetLibrary & hipLibraryEnumerateKernels API

* Fix for LibraryEnumerateKernel and KernelGetName

* Update Enumerate Kernels to handle 0 numKernels

* Minor fixes to function names

* fix error checking in internal function

* Update changelog for new apis

---------

Co-authored-by: Rahul Manocha <rmanocha@amd.com>
2025-10-27 14:13:17 -07:00
Shadi Dashmiz 3e59eebf17 SWDEV-558510:Correct max mem per multiprocessor value (#1207)
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
2025-10-27 15:45:06 -04:00
David Yat Sin 6497fa0339 rocr: Fix wrong args in memory copy functions (#1520)
Fix incorrect arguments passed into system_region->Lock
2025-10-27 14:12:06 -05:00
Gopesh Bhardwaj 1585fe59cd [Documentation] Repo location and limitation update (#1537) 2025-10-27 12:26:05 -04:00
MachineTom eb69a455ed SWDEV-558844 - Cleanup Os header (#1530)
Remove code that isn't used in Os header.
2025-10-27 11:52:31 -04:00
systems-assistant[bot] c1926d547e SWDEV-515530 - Re-enable passing tests on NV (#605) 2025-10-27 16:32:37 +01:00
Benjamin Welton d496bcef18 Fix dimension mismatch for multi-GPU systems with identical architect… (#1440)
* Fix dimension mismatch for multi-GPU systems with identical architectures

This change addresses an issue where counter dimensions were incorrectly
shared across all GPU agents with the same architecture name, even when
those agents had different hardware configurations (e.g., different CU counts).

Changes:
- Updated getBlockDimensions() to accept agent ID instead of architecture name
- Made dimension cache agent-specific instead of architecture-specific
- Updated set_dimensions() in AST evaluation to use specific agent ID
- Modified all API functions to handle agent-specific dimension lookups
- Updated tests to work with agent-specific dimensions

This fix ensures that dimensions accurately reflect the actual hardware
configuration of each individual GPU agent, preventing dimension mismatches
in multi-GPU systems where GPUs share the same architecture but have
different physical configurations.

Counter ID Representation Changes:
- Modified counter_id encoding to include agent information in bits 37-32
- Agent logical_node_id is encoded as (value + 1) to ensure agent 0 is detectable
- Counter records internally store only 16-bit base metric IDs (bits 15-0)
- Tool reconstructs agent-encoded counter IDs from base metric ID & agent info
- Instance record counter_id field uses bitwise AND mask to extract base metric ID
  (counter_id.handle & 0xFFFF) to fit in 16-bit storage
- Output generators (CSV, JSON, Perfetto) use agent-encoded IDs for consistency
- Updated counter_config.cpp and metrics.cpp to extract base metric ID when needed
- All counter lookups now properly handle agent-encoded vs base metric IDs

This ensures counter IDs are consistent between metadata and output records while
maintaining compact storage in instance records.
2025-10-27 07:58:20 -07:00
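The counter ID layout described in the commit above can be sketched as follows. This is a minimal illustration built only from the commit text (agent field in bits 37-32 stored as logical_node_id + 1, base metric ID in bits 15-0, mask 0xFFFF). The function names are hypothetical, and the sketch assumes the remaining bits are left at zero, which the commit message does not state.

```python
AGENT_SHIFT = 32                 # agent field starts at bit 32
AGENT_MASK = (1 << 6) - 1        # bits 37-32 are six bits wide
BASE_MASK = 0xFFFF               # base metric ID occupies bits 15-0


def encode_counter_id(base_metric_id: int, logical_node_id: int) -> int:
    """Pack a base metric ID and an agent into one handle (hypothetical helper)."""
    if not 0 <= base_metric_id <= BASE_MASK:
        raise ValueError("base metric ID must fit in 16 bits")
    # Stored as value + 1 so that agent 0 is distinguishable from "no agent".
    agent_field = logical_node_id + 1
    if not 1 <= agent_field <= AGENT_MASK:
        raise ValueError("logical_node_id does not fit in the agent field")
    return (agent_field << AGENT_SHIFT) | base_metric_id


def base_metric_id_of(counter_id: int) -> int:
    """Extract the 16-bit base metric ID, as instance records store it."""
    return counter_id & BASE_MASK


def agent_of(counter_id: int):
    """Return the logical_node_id, or None if no agent is encoded."""
    field = (counter_id >> AGENT_SHIFT) & AGENT_MASK
    return None if field == 0 else field - 1
```

Under these assumptions, a handle can be stripped to its base metric ID for compact storage and rebuilt later from the base ID plus the agent, which is the round trip the commit describes for the output generators.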
systems-assistant[bot] e22856b3ac SWDEV-515562 - Fix and enable hipDeviceReset tests (#594) 2025-10-27 15:07:44 +01:00
systems-assistant[bot] 8cc65f49c4 SWDEV-491296 - Add stream capture testcases to Virtual Memory APIs (#589) 2025-10-27 15:06:51 +01:00
marantic-amd 08d259c24c Fix the issue when sampling JAX with rocpd (#1552) 2025-10-27 09:59:51 -04:00
David Yat Sin f7b180ee7d rocr: SW workaround for gfx90x SDMA poll (#1469)
Workaround for rare issue on gfx90x asics when SDMA_OP_POLLREGMEM
returns before polled memory has value of 0.
Removing previous SW workaround to double-poll as it was not reliable.
2025-10-27 09:33:20 -04:00
David Yat Sin db01d95ebc Users/dayatsin/swdev 519413 hsa amd pointer info return err shutdown (#1509)
* rocr: hsa_amd_pointer_info return err on shutdown

Decrement ref count before starting to unload to make sure API
calls during shutdown return error.

Delete blit objects during agent destructor.

* Add support for HSA_AMD_SYSTEM_SHUTDOWN_EVENT

Add support for new event to indicate shut down within the
hsa_amd_register_system_event_handler API.
2025-10-27 09:32:52 -04:00
systems-assistant[bot] 45d6598724 SWDEV-517867 - Enable Unit_hipStreamCreateWithPriority_MulthreadDefaultflag (#599) 2025-10-27 11:36:40 +01:00
systems-assistant[bot] abaf29d0b6 SWDEV-537855 - Add hipEventDestroy (#554)
Co-authored-by: Vladana Stojiljkovic <Vladana.Stojiljkovic@amd.com>
2025-10-26 21:20:21 +01:00
SaleelK f301053740 clr: Improve logging (#1457) 2025-10-25 15:55:27 -07:00
David Galiffi e22a8e865e Update Timemory submodule (#1539)
- Fixes clang build failure

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: Aleksandar Janicijevic <Aleksandar.Janicijevic@amd.com>
2025-10-25 14:56:43 -04:00
David Galiffi 28c2728b6b Update Dyninst module (#1540)
- Fix nullptr check

------
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: Aleksandar Janicijevic <Aleksandar.Janicijevic@amd.com>
2025-10-25 14:56:29 -04:00
MachineTom 6a49171fa5 SWDEV-562431 - Fix Unit_hipBindTexture_Negative failure (#1523) 2025-10-24 16:25:22 -04:00