* SWDEV-561708 Initial shared queue pool apis
* Validate params; some fixes in callback function (but still needs to be checked)
* Dtor cleanup
* minor
* Enable profiling; remove callback since aql_queue takes care of it
* setPriority and setCuMask APIs updated for counted queues
* Increasing step and minor version for rocprofiler
* Tests for CountedQueueManager
* tests
* Code refactored to make pool manager part of GpuAgent only (incomplete); unique handles issue pending
* Refactored code to support CQM inside GpuAgent and unique handles; multithreaded test added
* Changed to ASSERT_SUCCESS macros for all tests
* RIng buffer overflow test added
* tests fixed; cleanup added at hsa_shutdown
* priority conversion table changes
* Compiler warnings fixed
* Rewrite 1 test; add desc and improve SetUp() code
* Improvement
* Unififed getinfo for both counted and non-counted queues
* Address PR feedback
* Addressing feedback: memleak, data type mismatch, documentation
* improve comment
* format
* Missing HSA_API macros for roctracer
* Revert "Addressing feedback: memleak, data type mismatch, documentation"
This reverts commit 5e498a55fb3640e00d06cec63dcec79293fb23de.
* Improving acquire api doc
* release api doc improved
* error codes for release api doc
* SWDEV-555889 - Support mipmap on rocr
Support mipmap in hip-rt on rocr backend.
Enable all mipmap tests in Windows.
Some other minor improvement.
Add some SRD logs that will be removed finally.
* Add sampler.mipFilter to fix sampler issues on mipmap in rocr.
Fix format issues of view of leveled image and mipmap image in blit kernel in rocr.
Enabled disabled mipmap tests.
* Rewrite view logic
* Set word4.f.PITCH = 0 for mipmap SRD on navi31 to fix unstable test issues.
Reset last error in nagative tests.
* Remove SRD dump log from hip-rt
Let Rocr mipmap log be in condition.
* minor format chang
* Exclude mipmap tests for mi200+ which don't support mipmap.
Updated to convert flags correctly
Added ObjectRegistry to track registered and mapped resources and incorporated it into hip_gl.
Added mip level check
Made functions static in-line
Reworked validation to be more clear.
Co-authored-by: Prasannakumar Murugesan <prmuruge@amd.com>
As part of an earlier commit, bfloat16 handling in reduce kernel for FuncMinMax fell into generic/default template when there is no SPECIALIZE_REDUCE for a particular type, this generic template does a bitwise integer comparison and it broke bfloat16 ops.
change the else-if statement to else statement, that way it covers both ROCm version < 6.0 and >= 6.0 (with ROCm > 6.0, device.h already typedefs __hip_bfloat16 to hip_bfloat16, so no special case is needed here).
[ROCm/rccl commit: fa366ac03f]
Co-authored-by: Prasannakumar Murugesan <prmuruge@amd.com>
As part of an earlier commit, bfloat16 handling in reduce kernel for FuncMinMax fell into generic/default template when there is no SPECIALIZE_REDUCE for a particular type, this generic template does a bitwise integer comparison and it broke bfloat16 ops.
change the else-if statement to else statement, that way it covers both ROCm version < 6.0 and >= 6.0 (with ROCm > 6.0, device.h already typedefs __hip_bfloat16 to hip_bfloat16, so no special case is needed here).
## Motivation
Enable UCX communication tracing and communication metadata
## Technical Details
Implement UCX API wrappers to trace transport-layer communication. This adds communication data tracking and exposes “UCX Comm Send/Recv” timelines, enabling detailed analysis of MPI, OpenSHMEM, and other UCX-based runtime communication patterns.
- Implements function interception for UCX functions across multiple categories using gotcha component.
- Extended comm_data component to track UCX send/recv operations - Added ucx_send and ucx_recv labels for Perfetto counter tracks. Integrated UCX data tracking with existing MPI/RCCL tracking infrastructure.
- Added ROCPROFSYS_USE_UCX configuration option (enabled by default).
- Created FindUCX.cmake module for UCX header detection. Falls back to internal UCX headers if system headers not found.
- Updated all Dockerfiles to include UCX dependencies.