76333 Commits

Author SHA1 Message Date
yugang-amd 05a6d017c6 [ROCmInfo] docs: mono-repo changes and style edits (#2584)
* initial edits

* mono repo related updates

* standardize component name

* style edits

* more edits
2026-01-20 18:06:54 -05:00
Yiltan 55aab4d62e [Docs] Clarify ROCSHMEM_HEAP_SIZE (#392)
* clarify ROCSHMEM_HEAP_SIZE

* Apply suggestions from code review

Co-authored-by: Aurelien Bouteiller <aurelien.bouteiller@amd.com>

---------

Co-authored-by: Aurelien Bouteiller <aurelien.bouteiller@amd.com>

[ROCm/rocshmem commit: 0496586829]
2026-01-20 17:22:18 -05:00
Yiltan 0496586829 [Docs] Clarify ROCSHMEM_HEAP_SIZE (#392)
* clarify ROCSHMEM_HEAP_SIZE

* Apply suggestions from code review

Co-authored-by: Aurelien Bouteiller <aurelien.bouteiller@amd.com>

---------

Co-authored-by: Aurelien Bouteiller <aurelien.bouteiller@amd.com>
2026-01-20 17:22:18 -05:00
prasanna-amd 520f309bb1 fix potential segfaults due to use after malloc fails (#2137)
* fix potential segfaults

* replace NULL with nullptr

---------

Co-authored-by: Prasannakumar Murugesan <prmuruge@amd.com>

[ROCm/rccl commit: 4a32ec2501]
2026-01-20 14:11:29 -08:00
prasanna-amd 4a32ec2501 fix potential segfaults due to use after malloc fails (#2137)
* fix potential segfaults

* replace NULL with nullptr

---------

Co-authored-by: Prasannakumar Murugesan <prmuruge@amd.com>
2026-01-20 14:11:29 -08:00
prasanna-amd bb47eee7cc fix bug in reduce kernel bfloat16 for ROCm >= 6.0 (#2139)
Co-authored-by: Prasannakumar Murugesan <prmuruge@amd.com>
As part of an earlier commit, bfloat16 handling in the reduce kernel for FuncMinMax fell into the generic/default template, which is used when there is no SPECIALIZE_REDUCE for a particular type. This generic template does a bitwise integer comparison, which broke bfloat16 ops.
Change the else-if statement to an else statement so that it covers both ROCm versions < 6.0 and >= 6.0 (with ROCm > 6.0, device.h already typedefs __hip_bfloat16 to hip_bfloat16, so no special case is needed here).
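
To illustrate why a bitwise integer comparison gives wrong min/max results for bfloat16, here is a minimal Python sketch. It is not the RCCL kernel code: the helper names are hypothetical, bfloat16 is modelled as the upper 16 bits of a float32, and the integer comparison is assumed to be unsigned. Two negative values are used because their bit patterns sort in the opposite order to their numeric values.

```python
import struct


def to_bf16_bits(x: float) -> int:
    """Return the bfloat16 bit pattern of x (upper 16 bits of its float32 form)."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    return bits >> 16


def from_bf16_bits(b: int) -> float:
    """Expand a bfloat16 bit pattern back to a Python float."""
    (x,) = struct.unpack("<f", struct.pack("<I", b << 16))
    return x


def min_bitwise(a_bits: int, b_bits: int) -> int:
    # Hypothetical stand-in for a generic template: compares raw bit patterns.
    return a_bits if a_bits < b_bits else b_bits


def min_numeric(a_bits: int, b_bits: int) -> int:
    # Compares the decoded floating-point values instead.
    return a_bits if from_bf16_bits(a_bits) < from_bf16_bits(b_bits) else b_bits


a = to_bf16_bits(-1.0)  # 0xBF80
b = to_bf16_bits(-2.0)  # 0xC000

wrong = from_bf16_bits(min_bitwise(a, b))
right = from_bf16_bits(min_numeric(a, b))
print(wrong, right)
```

The bit-pattern comparison picks -1.0 as the minimum, while the numeric comparison picks -2.0.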

[ROCm/rccl commit: fa366ac03f]
2026-01-20 14:07:20 -08:00
prasanna-amd fa366ac03f fix bug in reduce kernel bfloat16 for ROCm >= 6.0 (#2139)
Co-authored-by: Prasannakumar Murugesan <prmuruge@amd.com>
As part of an earlier commit, bfloat16 handling in the reduce kernel for FuncMinMax fell into the generic/default template, which is used when there is no SPECIALIZE_REDUCE for a particular type. This generic template does a bitwise integer comparison, which broke bfloat16 ops.
Change the else-if statement to an else statement so that it covers both ROCm versions < 6.0 and >= 6.0 (with ROCm > 6.0, device.h already typedefs __hip_bfloat16 to hip_bfloat16, so no special case is needed here).
2026-01-20 14:07:20 -08:00
dependabot[bot] 48d1530205 Bump pynacl from 1.5.0 to 1.6.2 in /docs/sphinx (#2127)
Bumps [pynacl](https://github.com/pyca/pynacl) from 1.5.0 to 1.6.2.
- [Changelog](https://github.com/pyca/pynacl/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/pynacl/compare/1.5.0...1.6.2)

---
updated-dependencies:
- dependency-name: pynacl
  dependency-version: 1.6.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Corey Derochie <161367113+corey-derochie-amd@users.noreply.github.com>

[ROCm/rccl commit: f38665ac9a]
2026-01-20 14:30:10 -07:00
dependabot[bot] f38665ac9a Bump pynacl from 1.5.0 to 1.6.2 in /docs/sphinx (#2127)
Bumps [pynacl](https://github.com/pyca/pynacl) from 1.5.0 to 1.6.2.
- [Changelog](https://github.com/pyca/pynacl/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/pynacl/compare/1.5.0...1.6.2)

---
updated-dependencies:
- dependency-name: pynacl
  dependency-version: 1.6.2
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Corey Derochie <161367113+corey-derochie-amd@users.noreply.github.com>
2026-01-20 14:30:10 -07:00
dependabot[bot] c2fd82c02d Bump rocm-docs-core from 1.26.0 to 1.29.0 in /docs/sphinx (#2051)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.26.0 to 1.29.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.26.0...v1.29.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-version: 1.29.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: corey-derochie-amd <161367113+corey-derochie-amd@users.noreply.github.com>

[ROCm/rccl commit: 131900c264]
2026-01-20 14:28:59 -07:00
dependabot[bot] 131900c264 Bump rocm-docs-core from 1.26.0 to 1.29.0 in /docs/sphinx (#2051)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.26.0 to 1.29.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.26.0...v1.29.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-version: 1.29.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: corey-derochie-amd <161367113+corey-derochie-amd@users.noreply.github.com>
2026-01-20 14:28:59 -07:00
dependabot[bot] a1bb4108c1 Bump urllib3 from 2.5.0 to 2.6.3 in /docs/sphinx (#2130)
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.5.0 to 2.6.3.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/2.5.0...2.6.3)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.6.3
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Corey Derochie <161367113+corey-derochie-amd@users.noreply.github.com>

[ROCm/rccl commit: d94ecb7772]
2026-01-20 14:27:31 -07:00
dependabot[bot] d94ecb7772 Bump urllib3 from 2.5.0 to 2.6.3 in /docs/sphinx (#2130)
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.5.0 to 2.6.3.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/2.5.0...2.6.3)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.6.3
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Corey Derochie <161367113+corey-derochie-amd@users.noreply.github.com>
2026-01-20 14:27:31 -07:00
Mythreya Kuricheti 73df3f12b3 use message instead of warning for nccl.h C++ check (#2128)
Co-authored-by: Corey Derochie <161367113+corey-derochie-amd@users.noreply.github.com>

[ROCm/rccl commit: 0dc31b1a4a]
2026-01-20 14:21:38 -07:00
Mythreya Kuricheti 0dc31b1a4a use message instead of warning for nccl.h C++ check (#2128)
Co-authored-by: Corey Derochie <161367113+corey-derochie-amd@users.noreply.github.com>
2026-01-20 14:21:38 -07:00
Kian Cossettini 7c9361190b [rocprofiler-systems] Fix MPI recv_data calculation (#2694)
Fix incorrect `mpi_recv` calculation. It was using `_send_size` instead of `_recv_size` for `mpi_recv`.
2026-01-20 16:17:22 -05:00
Allen Hubbe 3edd56ca23 gda ionic: ccqe cleanup and error check (#389)
Delete unreachable ccqe polling path, ionic_poll_wave_ccqe().
Move cqe error check to ionic_quiet_internal_ccqe().

Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>

[ROCm/rocshmem commit: 6b00964f32]
2026-01-20 15:26:53 -05:00
Allen Hubbe 6b00964f32 gda ionic: ccqe cleanup and error check (#389)
Delete unreachable ccqe polling path, ionic_poll_wave_ccqe().
Move cqe error check to ionic_quiet_internal_ccqe().

Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>
2026-01-20 15:26:53 -05:00
Nusrat Islam 96f6029a1b revert memcpy use for direct AG (#2146)
Co-authored-by: Islam <nusislam@amd.com>

[ROCm/rccl commit: f3c5156bbf]
2026-01-20 13:58:28 -06:00
Nusrat Islam f3c5156bbf revert memcpy use for direct AG (#2146)
Co-authored-by: Islam <nusislam@amd.com>
2026-01-20 13:58:28 -06:00
German Andryeyev db792fac37 SWDEV-558849 - Add support for static linking with ROCR (#2659) 2026-01-20 14:53:01 -05:00
mberenjk 9ee8fb0aa9 Merge pull request #2136 from mberenjk/mberenjk/nccl-sync-2.28.3
Merge remote-tracking branch 'nccl/master' 2.28.3 into develop

[ROCm/rccl commit: 2fdcceaabb]
2026-01-20 11:38:11 -08:00
mberenjk 2fdcceaabb Merge pull request #2136 from mberenjk/mberenjk/nccl-sync-2.28.3
Merge remote-tracking branch 'nccl/master' 2.28.3 into develop
2026-01-20 11:38:11 -08:00
Alysa Liu 9139f5a241 Revert "rocr: Switch back to legacy IPC (#1744)" (#2676)
This reverts commit 7e4b62290c.
2026-01-20 14:34:10 -05:00
dependabot[bot] 33e37797e6 Docs - Bump rocm-docs-core[api_reference] from 1.31.2 to 1.31.3 in /docs/sphinx (#220)
Bumps [rocm-docs-core[api_reference]](https://github.com/ROCm/rocm-docs-core) from 1.31.2 to 1.31.3.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.31.2...v1.31.3)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-version: 1.31.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/rocjpeg commit: a428bae524]
2026-01-20 11:15:12 -08:00
dependabot[bot] a428bae524 Docs - Bump rocm-docs-core[api_reference] from 1.31.2 to 1.31.3 in /docs/sphinx (#220)
Bumps [rocm-docs-core[api_reference]](https://github.com/ROCm/rocm-docs-core) from 1.31.2 to 1.31.3.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.31.2...v1.31.3)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-version: 1.31.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-01-20 11:15:12 -08:00
Marzieh Berenjkoub d7293281f3 Merge remote-tracking branch 'nccl/master' into develop
[ROCm/rccl commit: 858b4e76eb]
2026-01-20 13:04:02 -06:00
Marzieh Berenjkoub 858b4e76eb Merge remote-tracking branch 'nccl/master' into develop 2026-01-20 13:04:02 -06:00
Ioannis Assiouras 59aa56a340 hip-issue-3876 : Take into account thread-local capture mode in checks for valid capture (#2177) 2026-01-20 18:42:27 +00:00
Sajina PK 15c82d6da8 [rocprofiler-system]: Enable UCX Communication API tracing (#2306)
## Motivation

Enable UCX communication tracing and communication metadata 

## Technical Details

Implement UCX API wrappers to trace transport-layer communication. This adds communication data tracking and exposes “UCX Comm Send/Recv” timelines, enabling detailed analysis of MPI, OpenSHMEM, and other UCX-based runtime communication patterns.

- Implements function interception for UCX functions across multiple categories using the gotcha component.
- Extended the comm_data component to track UCX send/recv operations. Added ucx_send and ucx_recv labels for Perfetto counter tracks. Integrated UCX data tracking with the existing MPI/RCCL tracking infrastructure.
- Added the ROCPROFSYS_USE_UCX configuration option (enabled by default).
- Created a FindUCX.cmake module for UCX header detection. Falls back to internal UCX headers if system headers are not found.
- Updated all Dockerfiles to include UCX dependencies.
2026-01-20 13:16:43 -05:00
Bindhiya Kanangot Balakrishnan 72f0a41658 [SWDEV-559965] Update Changelog for power cap type (#2647)
* [SWDEV-559965] Update Changelog for amd-smi set --power-cap

Updated Changelog to mention flexible argument
ordering for power cap type in amdsmi power cap set.
Corrected Changelog documentation on PPT1 reset
power_cap command.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2026-01-20 11:28:09 -06:00
Rakesh Roy 5049efdd75 Reset HIP_VERSION_PATCH to 0 (#2590) 2026-01-20 22:54:20 +05:30
Kian Cossettini 698ac6b8bc [rocprofiler-systems] Add build option for "examples" to specify gfx-arch (#2626)
## Motivation
 - Added `check_rocminfo` function that returns true if the provided regex was found, false otherwise. Can also use `GET_OUTPUT` to get the raw output filtered with or without a regex.
 - Moved `rocprofiler_systems_get_gfx_archs()` to `MacroUtilities.cmake` 
 - Added `rocprofiler_systems_lookup_gfx()`, which detects whether a given `gfx` is from the `instinct`, `radeon` or `apu` family.
 - Added `ROCPROFSYS_GFX_TARGETS` as a build argument. Used to specify the offloading architectures that GPU examples should compile for. If empty, defaults to whatever your system has.
 - GPU examples now check if the given `gfx` targets (from `ROCPROFSYS_GFX_TARGETS`) are supported.
 - OMPVV offload tests now only compile if `amdflang` version is `>= 20`
 - Improve link time by reducing the number of GFX targets that binaries need to support.
   - RCCL is now passed a `GPU_TARGETS` var specifying the architectures to build/link against.
2026-01-20 12:13:21 -05:00
German Andryeyev 3af2bf4952 Merge branch 'develop' into amd/dev/gandryey/SWDEV-558849 2026-01-20 12:04:53 -05:00
vedithal-amd 4a5cbbfba5 [rocprofiler-compute] Fix kernel/dispatch filtering (#2479)
* Fix kernel/dispatch filtering in GUI

* Disallow --kernel and --dispatch filtering in analyze --gui mode since
  GUI frontend offers dropdown menu for kernel and dispatch filtering
    * Update CHANGELOG and documentation

* Gracefully handle N/A values

* Ensure workload path is valid before using it in GUI

* Ignore kernel filters if dispatch filters provided

* Add documentation for dispatch filtering overriding kernel filtering

* Fix typo

* Fix documentation

* remove unnecessary whitespace

* Address review comments

* Allow kernel/dispatch filtering with --gui

* Address review comments

* Address review comments

* Update CHANGELOG

* Fix formatting
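
The precedence rule mentioned above (kernel filters are ignored when dispatch filters are provided) can be sketched as follows. This is an illustrative Python sketch only, not the rocprofiler-compute implementation; the function name and the `dispatch_id` / `kernel_name` keys are assumptions made for the example.

```python
def select_dispatches(dispatches, kernel_filter=None, dispatch_filter=None):
    """Filter a list of dispatch records (dicts).

    Hypothetical record layout: {"dispatch_id": int, "kernel_name": str}.
    If a dispatch filter is given, the kernel filter is ignored.
    """
    if dispatch_filter:
        wanted = set(dispatch_filter)
        return [d for d in dispatches if d["dispatch_id"] in wanted]
    if kernel_filter:
        wanted = set(kernel_filter)
        return [d for d in dispatches if d["kernel_name"] in wanted]
    # No filters: keep everything.
    return list(dispatches)


rows = [
    {"dispatch_id": 0, "kernel_name": "vecadd"},
    {"dispatch_id": 1, "kernel_name": "matmul"},
    {"dispatch_id": 2, "kernel_name": "vecadd"},
]

# Both filters given: only the dispatch filter takes effect.
both = select_dispatches(rows, kernel_filter=["vecadd"], dispatch_filter=[1])
# Kernel filter alone still works.
kernel_only = select_dispatches(rows, kernel_filter=["vecadd"])
```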
2026-01-20 10:02:31 -05:00
vedithal-amd a926660670 [rocprofiler-compute] Use TheRock nightly builds in testing container (#2661)
* Use TheRock nightly builds in testing container

* Add HIP_DEVICE_LIB_PATH env var for hipcc to work

* Add HIP_PLATFORM env var for cmake hip package

* Add tarball placeholder

* Add -f to curl command to fail on HTTP error
2026-01-20 09:54:38 -05:00
Edgar Gabriel 55e2b501d3 replace memset with hipMemset (#390)
[ROCm/rocshmem commit: bc70ce551c]
2026-01-20 08:14:25 -06:00
Edgar Gabriel bc70ce551c replace memset with hipMemset (#390) 2026-01-20 08:14:25 -06:00
marantic-amd 51f49d8835 Add notice for the newly deprecated env variables (#2690) 2026-01-20 13:59:31 +01:00
Milan Radosavljevic b533f56197 Add automatic PyTorch library discovery for Python applications (#2623)
* Add automatic PyTorch library discovery for Python applications (#2623)
2026-01-20 08:42:49 +01:00
David Galiffi c83b3aae07 Fix Python Formatting (#2679)
Updating black to version 26.1.0 changed some formatting rules.

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2026-01-19 21:26:50 -05:00
jamessiddeley-amd 25090e003f [rocprof-compute] Pin ruff version for consistent formatting (#2680)
* pin ruff versions each to current latest

* Update rocprofiler-compute-formatting.yml

* Downgrade .pre-commit-config.yaml to match develop
2026-01-19 19:10:02 -05:00
Karthik Jayaprakash 99c3a06f4e SWDEV-549518 - Enable logging dynamically through HIP APIS. (#1079)
* SWDEV-549518 - Enable logging dynamically through HIP APIS.

* SWDEV-549518 - Adding ROCProfiler related new API changes.

* rocprofiler-sdk changes for hip api additions.

---------

Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com>
Co-authored-by: jainprad <92369414+jainprad@users.noreply.github.com>
2026-01-19 16:16:14 -05:00
marandje 9f37cd6309 SWDEV-1 - Fix hipMemPoolTrimTo failing tests (#2628) 2026-01-19 21:10:15 +00:00
abchoudh-amd dd149d3957 [rocprofiler-compute] Support new attach/detach API (#2642)
* Removed attach tool library path

* Support new attach/detach API

* New attach/detach API was introduced in
  https://github.com/ROCm/rocm-systems/pull/1653

* Provide backward compatibility with old api

* Stabilize attach/detach tests by adding sleep to help workload get
  ready for attachment

* Fix typo in test name

---------

Co-authored-by: Vignesh Edithal <Vignesh.Edithal@amd.com>
Co-authored-by: Fei Zheng <44449748+feizheng10@users.noreply.github.com>
2026-01-19 16:00:14 -05:00
SakaSitharammurthy 1c5aa2d4e7 [SWDEV-567099] Updated 'amdsmi list --cpu all' command (#2519)
Signed-off-by: Saka, Sitharam Murthy <SitharamMurthy.Saka@amd.com>
2026-01-19 14:56:59 -06:00
vedithal-amd 0254181f42 [rocprofiler-compute] Analysis Database Schema Improvements (v1.2.0) (#2526)
* Analysis database v1.2.0

* `pc_sampling` and `roofline_data` tables should relate to `kernel` table instead of `workload` table

* Remove `kernel_name` fields in `pc_sampling` and `roofline_data` table

* Add kernel existence check for roofline data to prevent KeyError (#2536)

* Initial plan

* Add kernel existence check for roofline data to prevent KeyError

Co-authored-by: vedithal-amd <191402304+vedithal-amd@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: vedithal-amd <191402304+vedithal-amd@users.noreply.github.com>

* Optimize analysis performance

* Refactor database schema: separate metric definitions from kernels

Reorganize the database ORM to decouple metric definitions from kernel
objects. This improves the schema design by:

- Rename Metric -> MetricDefinition and Value -> MetricValue for clarity
- Move metric definitions from kernel-level to workload-level, since
  metric definitions are shared across kernels
- Update relationships: MetricDefinition belongs to Workload,
  MetricValue references both MetricDefinition and Kernel
- Refactor metric_view to join through the new schema structure
- Update test fixtures to use renamed table and class names
- Update documentation with new example output using nbody workload
- Regenerate database schema and views diagrams
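
The relationships described in this list can be sketched with plain Python dataclasses. This is an assumption-laden illustration, not the project's ORM code: only the names Workload, Kernel, MetricDefinition, MetricValue and metric_view come from the commit message, while every field name and the join logic are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    id: int
    name: str


@dataclass(frozen=True)
class Kernel:
    id: int
    workload_id: int  # each kernel belongs to one workload
    name: str


@dataclass(frozen=True)
class MetricDefinition:
    id: int
    workload_id: int  # defined once per workload, shared across kernels
    name: str


@dataclass(frozen=True)
class MetricValue:
    id: int
    metric_definition_id: int  # which metric this value measures
    kernel_id: int             # which kernel the value was measured for
    value: float


def metric_view(kernels, definitions, values):
    """Join values to their kernel and definition, like a simple view."""
    kernel_by_id = {k.id: k for k in kernels}
    definition_by_id = {d.id: d for d in definitions}
    return [
        (kernel_by_id[v.kernel_id].name,
         definition_by_id[v.metric_definition_id].name,
         v.value)
        for v in values
    ]


workload = Workload(1, "nbody")
kernels = [Kernel(1, workload.id, "k0"), Kernel(2, workload.id, "k1")]
definitions = [MetricDefinition(1, workload.id, "example_metric")]
values = [MetricValue(1, 1, 1, 10.0), MetricValue(2, 1, 2, 20.0)]
rows = metric_view(kernels, definitions, values)
```

In this sketch a single MetricDefinition row serves both kernels, which is the duplication the refactor is described as avoiding.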

* Add min and max aggregation in kernel_view

* Add primary key id from tables into the view

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: vedithal-amd <191402304+vedithal-amd@users.noreply.github.com>
2026-01-19 15:25:43 -05:00
systems-assistant[bot] 88f07baa92 SWDEV-493792 - add split barriers for grid_group (#508)
* SWDEV-493792 - add split barriers for grid_group

* add tests

* Update change log

* Add Navi4 split barrier

* Update docs

* Use new Catch2 Approx macro

* Update split_barrier.cc to check for coop groups

---------

Co-authored-by: Jatin Chaudhary <jatchaud@amd.com>
Co-authored-by: Jatin Chaudhary <51944368+cjatin@users.noreply.github.com>
2026-01-19 09:17:00 -08:00
lloginov-amd e49b501e9a Add scratch memory support (#2211) 2026-01-19 16:24:30 +01:00
Gopesh Bhardwaj 1ac805cb35 [rocprofiler-sdk][Documentation] Updating CHANGELOG for 7.2 (#2573)
* Updating CHANGELOG for 7.2

* Updated CHANGELOG

* Addressed feedback

* Addressed Feedback

* Updated based on review comments

* Update installation steps and documentation links

Updated installation documentation and links to latest repository.

* Addressed Feedback

* Updated CHANGELOG

* Addressed feedback

* updated CHANGELOG

* Apply suggestion from @prbasyal-amd

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

---------

Co-authored-by: Swati Rawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
2026-01-17 14:55:55 +05:30