76333 Commits

Author SHA1 Message Date
systems-assistant[bot] ed02159bf6 Stop trying to fit too much in one line for default view (#1897)
* Stop trying to fit too much in one line for default view

The default view is really cramped trying to put a lot of version
information into one line, to the point that some strings are
cropped. Instead of cropping the strings, put each on its
own line.

For running without a ROCm release installed hide the ROCm version
line.

Sample output:
```
+------------------------------------------------------------------------------+
| AMD-SMI 26.1.0+2a668c34                                                      |
| amdgpu version: Linuxver                                                     |
| VBIOS version: 023.010.001.022.000001                                        |
| Platform: Linux Baremetal                                                    |
|-------------------------------------+----------------------------------------|
| BDF                        GPU-Name | Mem-Uti   Temp   UEC       Power-Usage |
| GPU  HIP-ID  OAM-ID  Partition-Mode | GFX-Uti    Fan               Mem-Usage |
|=====================================+========================================|
| 0000:c1:00.0 ...adeon 890M Graphics | N/A      59 °C   0                17 W |
|   0       0     N/A             N/A | 25 %       N/A              479/512 MB |
+-------------------------------------+----------------------------------------+
+------------------------------------------------------------------------------+
| Processes:                                                                   |
|  GPU        PID  Process Name          GTT_MEM  VRAM_MEM  MEM_USAGE     CU % |
|==============================================================================|
|  No running processes found                                                  |
+------------------------------------------------------------------------------+
```

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Don't show amdgpu version on mainline kernels

amdgpu version doesn't exist on a mainline kernel.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Truncate amdgpu version string to 80 characters

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Allow longer AMD-SMI version strings

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Adjusted version header format

---------

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Co-authored-by: Mario Limonciello (AMD) <superm1@kernel.org>
Co-authored-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
Co-authored-by: systems-assistant[bot] <systems-assistant[bot]@users.noreply.github.com>
Co-authored-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-12-04 23:23:34 -06:00
Mario Limonciello d1aaae2539 Run pre-commit's whitespace related hooks on projects/rocprofiler-systems (#2123)
In order for pre-commit to be useful, everything needs to meet a common
baseline.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-12-04 23:39:42 -05:00
Jason Bonnell 463126770a Update build docker container workflow, opensuse dockerfiles (#1883)
## Motivation

<!-- Explain the purpose of this PR and the goals it aims to achieve. -->

- __Reduced Code Duplication__: Version parsing logic moved from individual Dockerfiles to the central build script
- __Improved Edge Case Handling__: Better handling of ROCm versions with and without patch numbers (e.g., `6.2` vs `6.2.0`)
- __Easier Maintenance__: Future version-related changes only need to be made in one place
- __Cleaner Dockerfiles__: Simplified Dockerfiles focus on package installation rather than complex shell logic
- __Updated Platform Support__: Refreshed container matrix to reflect current platform/ROCm version combinations
- __Fix OpenSUSE Docker Generation__: OpenSUSE container generation fails due to a change to the `binutils-gold` package
- __Error Handling__: Fix bug where errors in docker image build were being masked, allowing workflow to pass anyway.


## Technical Details

<!-- Explain the changes along with any relevant GitHub links. -->
- Updated `Dockerfile.opensuse` and `Dockerfile.opensuse.ci` docker files to remove `binutils-gold`
  - Not needed since we build `binutils` with systems anyways
- Updated `rocprofiler-systems-containers.yml` to remove `pushd/popd` commands and just run the shell scripts
  - There was a silent failure observed here, which I verified in this PR before adding the fix for openSUSE
- Refactor ROCm version parsing. Move this logic to the `build-docker.sh` script to reduce duplication.
  - Fix bug that caused ROCm 7.0 to fail installation. The trailing `.0` was being trimmed.
- Fixed inconsistencies in `containers.yml` that led to invalid ROCm-OS_VERSION combinations.
- Formatting fixes
  - Removed trailing whitespace
  - Fix docker build warnings. Use an `=` rather than ` ` when assigning an environment variable.
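
The version handling described above can be sketched roughly as follows. This is only an illustration in Python, not the logic in `build-docker.sh`: the function name `normalize_rocm_version` is hypothetical, and padding a missing patch number with zero is an assumption about the intended behavior.

```python
from typing import Tuple


def normalize_rocm_version(version: str) -> Tuple[int, int, int]:
    """Split a ROCm version such as '6.2' or '7.0.1' into (major, minor, patch)."""
    parts = version.strip().split(".")
    if not 2 <= len(parts) <= 3 or not all(p.isdigit() for p in parts):
        raise ValueError(f"unsupported ROCm version: {version!r}")
    # Pad a missing patch number rather than trimming trailing zeros,
    # so '7.0' becomes (7, 0, 0) instead of collapsing to '7'.
    numbers = [int(p) for p in parts] + [0] * (3 - len(parts))
    return (numbers[0], numbers[1], numbers[2])
```

Keeping this in one place means `6.2` and `6.2.0` resolve to the same tuple, and a `.0` minor version is never lost.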
2025-12-04 23:33:15 -05:00
arvindcheru 0f76bb45c7 Enable Lintian configuration/Files for AMDSMI (#2140)
* Enable Lintian configuration/Files for AMDSMI
2025-12-04 22:01:57 -05:00
Kiriti Gowda 09c8afe519 Host decouple - samples and test (#677)
* Host decouple - samples and test

* Host - install utils with dev

* Host - Install host files in core temp

[ROCm/rocdecode commit: 0a4fadb24d]
2025-12-04 16:04:47 -08:00
Kiriti Gowda 0a4fadb24d Host decouple - samples and test (#677)
* Host decouple - samples and test

* Host - install utils with dev

* Host - Install host files in core temp
2025-12-04 16:04:47 -08:00
Atul Kulkarni 86a4dd95f6 Remove static to non-static conversion used in tests (#2084)
* Remove coll_reg tests which are unsupported

* removed static to non-static conversion feature

[ROCm/rccl commit: 7ec8e73e12]
2025-12-04 18:03:14 -06:00
Atul Kulkarni 7ec8e73e12 Remove static to non-static conversion used in tests (#2084)
* Remove coll_reg tests which are unsupported

* removed static to non-static conversion feature
2025-12-04 18:03:14 -06:00
hongkzha-amd 4bd1b90e62 rocr: Add WSL support by conditionally handling DRM operations (#2081)
This patch enhances compatibility for DXG environments by introducing conditional
checks for DRM operations, particularly around buffer object metadata handling
in IPC scenarios. These changes improve robustness in DXG IPC memory management
without impacting existing functionality in standard Linux environments.

Signed-off-by: Horatio Zhang <Hongkun.Zhang@amd.com>
2025-12-05 07:50:48 +08:00
ywang103-amd 092ca13f4f [rocprofiler-compute] add try catch to ensure subprocess killed if test of attach/detach fails (#2139)
* add try catch to ensure subprocess killed if test of attach/detach fails

* remove unnecessary comments

* remove duplicated cleanup
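
A minimal sketch of the cleanup pattern, assuming a test that launches a helper process. The helper `run_with_cleanup` and the sleeping child are hypothetical and are not taken from the rocprofiler-compute test suite; the sketch uses `try`/`finally` so the child is killed whether or not the check raises.

```python
import subprocess
import sys


def run_with_cleanup(check):
    """Start a helper process, run `check` on it, and always terminate it."""
    # Hypothetical long-running workload standing in for the attach target.
    proc = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    try:
        check(proc)
    finally:
        # Runs whether check() returned normally or raised.
        if proc.poll() is None:
            proc.kill()
        proc.wait()
    return proc
```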
2025-12-04 15:49:03 -08:00
Atul Kulkarni a364ada6e7 Add missing header in alloc.h (#2086)
[ROCm/rccl commit: 892d258319]
2025-12-04 11:26:19 -06:00
Atul Kulkarni 892d258319 Add missing header in alloc.h (#2086) 2025-12-04 11:26:19 -06:00
Avinash Kethineedi 1ecc355062 IPC: insert __threadfence_system() after *wg RMA APIs to guarantee global memory visibility (#346)
[ROCm/rocshmem commit: f907ef91e4]
2025-12-04 10:21:25 -06:00
Avinash Kethineedi f907ef91e4 IPC: insert __threadfence_system() after *wg RMA APIs to guarantee global memory visibility (#346) 2025-12-04 10:21:25 -06:00
Atul Kulkarni 0ced7aede8 Fix rccl test suite to use hip_bf16.h instead of hip_bfloat16.h for the __bf16 intrinsic (#2082)
[ROCm/rccl commit: cc6e259a02]
2025-12-04 10:02:06 -06:00
Atul Kulkarni cc6e259a02 Fix rccl test suite to use hip_bf16.h instead of hip_bfloat16.h for the __bf16 intrinsic (#2082) 2025-12-04 10:02:06 -06:00
Ammar ELWazir d79ebea9a7 [ROCProfiler-SDK-CI] Update ROCm and amdgpu package versions in Dockerfile (#2144) 2025-12-04 09:53:55 -06:00
Maisam Arif 2feb0ae998 Fix powercap default to enum for sensor_ind (#2004)
* Fix powercap default to enum for sensor_ind

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

* [SWDEV-559965] Refactor amdsmi set power cap

Modified power cap set to accept args with
optional power_cap type. Added power_cap helper
validate_and_set_power_cap(). Fixed JSON output
format.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Co-authored-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-12-04 09:52:59 -06:00
vedithal-amd d8a8a3ef30 [rocprofiler-compute] Add exception handling for native tool path search (#2159)
* Add exception handling for native tool path search

* Fix formatting in roofline benchmark code

* Fix detection of .so files

* include hip code and native tool code in standalone binary

* add fallback path for ROCM_PATH
2025-12-04 10:29:49 -05:00
Charis Poag Jones 4ff89b6fd1 [SWDEV-570457] Fix Python 3.8/3.7 typing errors (#2164)
Changes:
  - Fixed `amd-smi` showing:
```console
  $ amd-smi
Traceback (most recent call last):
  File "/opt/rocm/bin/amd-smi", line 53, in <module>
    from amdsmi_init import *
  File "/opt/rocm/libexec/amdsmi_cli/amdsmi_init.py", line 38, in <module>
    from amdsmi import amdsmi_interface, amdsmi_exception
  File "/usr/local/lib/python3.8/dist-packages/amdsmi/__init__.py", line 24, in <module>
    from .amdsmi_interface import amdsmi_init
  File "/usr/local/lib/python3.8/dist-packages/amdsmi/amdsmi_interface.py", line 5581, in <module>
    ) -> tuple[int, int]:
TypeError: 'type' object is not subscriptable
```
  This was a python3.8 issue, which is now resolved by using
  `Tuple[int, int]` typing for Python 3.8 compatibility.
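
  For illustration, a small sketch of the compatible annotation style. The function `split_version` is hypothetical and is not part of `amdsmi_interface.py`. Subscripting the built-in `tuple` in an annotation requires Python 3.9 or later (PEP 585), while `typing.Tuple` also works on 3.7 and 3.8.

```python
import sys
from typing import Tuple


def split_version(text: str) -> Tuple[int, int]:
    """Return (major, minor) parsed from a string such as '3.8'."""
    major, minor = text.split(".")[:2]
    return int(major), int(minor)


# Built-in generics such as tuple[int, int] are only subscriptable
# at runtime on Python 3.9 and later.
builtin_generics_ok = sys.version_info >= (3, 9)
```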
2025-12-04 09:29:01 -06:00
vedithal-amd ac640c13d6 [rocprofiler-compute] Allow to specify path for standalone binary extraction (#2162)
* Allow to specify path for standalone binary extraction

* Add cmake option -D STANDALONEBINARY_EXTRACT_DIR=<path> to specify extraction dir. for binary

* fix formatting

---------

Co-authored-by: Fei Zheng <44449748+feizheng10@users.noreply.github.com>
2025-12-04 10:13:18 -05:00
vedithal-amd 7a2df64b59 [rocprofiler-compute] Enable running tests from installation only for TheRock setup (#2067)
* Enable running tests from installation only

* Use cmake option -DTEST_FROM_INSTALL=ON to enable running tests from installation folder only
    * It is not possible to run tests from build folder in this case
    * This option prevents changing working directory to source folder

* Fix SourceFileLoader to import rocprof-compute main module correctly

* Install sample executables in the test folder

* fix num_xcds_cli_output test

* Fix tests

* Skip autogen. config. test and add a TODO task for re-design of this
  test

* Add flexible import of source code in test_gpu_specs.py

* Update cmake to install tests/workloads folder when INSTALL_TESTS=ON

* Fix sys.argv[0] for tests

* fix live attach detach test
2025-12-04 10:12:38 -05:00
habajpai-amd 30161885e2 refactor: centralize update_env across binaries with unit test added … (#2029)
* refactor: centralize update_env across binaries with unit test added for testing

* removed unused includes suggested by clangd and small cleanup

* use centralized update_env in argparse as well

* review comments incorporated

* move update_env tests closer to common library

* fix: missing common:: prefix in rocprof-sys-sample

* cmake formatting
2025-12-04 19:24:27 +05:30
Atul Kulkarni e4aef19511 Added new unit tests for AllReduce with Bias API (#2036)
* Added new unit tests for AllReduce with Bias API

* Address review comments

[ROCm/rccl commit: 7c12b0b76b]
2025-12-03 17:37:34 -06:00
Atul Kulkarni 7c12b0b76b Added new unit tests for AllReduce with Bias API (#2036)
* Added new unit tests for AllReduce with Bias API

* Address review comments
2025-12-03 17:37:34 -06:00
Yazen AL Musaffar 16b9160034 [RDC] [SWDEV-551280] RDC to include Error Counters (#1087)
* rdc error counter

* RDC error counters

* fix

* Updates

* updated field names

Signed-off-by: yalmusaf_amdeng <yalmusaf@amd.com>

---------

Signed-off-by: yalmusaf_amdeng <yalmusaf@amd.com>
Co-authored-by: yalmusaf_amdeng <yalmusaf@amd.com>
2025-12-03 15:22:18 -06:00
Yazen AL Musaffar c9d6a8720c [SWDEV-548312] Fix for rsmitstReadWrite.TestPciReadWrite failure in rsmi-tests on MI200. (#1834)
* Fix for rsmitstReadWrite.TestPciReadWrite failure in rsmi-tests

Signed-off-by: yalmusaf_amdeng <yalmusaf@amd.com>

* Resolved comments

Signed-off-by: yalmusaf_amdeng <yalmusaf@amd.com>

---------

Signed-off-by: yalmusaf_amdeng <yalmusaf@amd.com>
Co-authored-by: yalmusaf_amdeng <yalmusaf@amd.com>
2025-12-03 15:21:36 -06:00
Yazen AL Musaffar c0d773c47b Fix for created rdc groups not listing when running rdci dmon & rdci group -l -u (#1983)
Signed-off-by: yalmusaf_amdeng <Yazen.ALMusaffar@amd.com>
2025-12-03 15:21:17 -06:00
Edgar Gabriel 3d658b558b reenable gfx1100 (#328)
* reenable gfx1100

use the modified version of the flat_store_short assembly instruction as suggested by the compiler team (32bit input value instead of 16bit)

* add fix for gfx1201

add the same fix for gfx1201 that was introduced for gfx1100

[ROCm/rocshmem commit: 224c969bef]
2025-12-03 13:49:38 -06:00
Edgar Gabriel 224c969bef reenable gfx1100 (#328)
* reenable gfx1100

use the modified version of the flat_store_short assembly instruction as suggested by the compiler team (32bit input value instead of 16bit)

* add fix for gfx1201

add the same fix for gfx1201 that was introduced for gfx1100
2025-12-03 13:49:38 -06:00
harkgill-amd 8f622de972 Add gfx1152 support to PAL (#2077) 2025-12-03 10:39:22 -08:00
Jaydeep fa4a75a26c SWDEV-563114 - Use stack array instead, so that delete is not skipped in case of assert failure. (#1553) 2025-12-03 23:03:50 +05:30
Fábio Mestre 47b80c011c [hip-tests] Fix Unit_Assert_Positive_Basic_KernelFail (#1916)
* [hip-tests] Fix Unit_Assert_Positive_Basic_KernelFail

This test was expecting a call to abort() when assertions
were hit on AMD devices. This is no longer true since
aborts from assertions are disabled unless
HIP_SKIP_ABORT_ON_GPU_ERROR is set.

This PR simplifies the test by removing the SIGABRT signal
handling (which was also undefined behaviour). Instead,
if HIP_SKIP_ABORT_ON_GPU_ERROR is set, the test is skipped.
2025-12-03 17:16:10 +00:00
Jatin Chaudhary 8e1aee62d0 make hip-tests compileable with TheRock (#1624)
## Motivation

Resolved: SWDEV-566226

The current implementation of agents inside rocprof-systems keeps only the minimal set of information required to populate the `info_agent` table in the rocpd database. A considerable amount of data is left out of the database, so this change stores the additional agent information as an `extdata` row in the `info_agent` table.

## Technical Details

This PR introduces an additional field in the `agent` structure that holds a JSON-formatted string of all the additional information we can acquire about a particular agent. This data is processed and added during the initial fetching of agents, and afterwards pushed into the database.

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>

* SWDEV-557412 - Incorporate proper chunk offset when remapping virtual memory (#1848)

* SWDEV-557412 - Incorporate proper offset when remapping virtual memory

* Fix condition to check if VMHeap allocation address matches a chunk address

* Move offset calculation outside if/else block

---------

Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>

* SWDEV-567852 - Clean-up hip::init() (#1948)

* SWDEV-559267 - Use CLPrint to DevLogPrintf with Log Level - detail debug. (#1160)

* SWDEV-548892 - Stop using ocml isinf wrapper (#1854)

* SWDEV-562708 - change default maximum SVM size to 256GB (#1731)

* SWDEV-503089 - Fix and enable disabled HIP tests from math group (#1319)

* SWDEV-503089 - Fix and enable disabled HIP tests from math group

* SWDEV-503089 - Move single precision reduced run to a common function

* SWDEV-548892 - Stop using ockl steadyctr function (#1882)

Directly use the builtin

* Implement PTL support (#1957)

* Implement PTL support

Signed-off-by: adapryor <Adam.pryor@amd.com>
(cherry picked from commit 45bc31292e7940a3b8fca044ef7df22047b95733)

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

---------

Signed-off-by: adapryor <Adam.pryor@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>

* SWDEV-558080 - Add recommended granularity (#1176)

* Add recommended granularity

* Improve granularity testing

* Update based on feedback

* Fix and enable VMM tests on cuda (#1855)

* Fix and enable VMM tests on cuda

* Minor syntax fixes

---------

Co-authored-by: Rahul Manocha <rmanocha@amd.com>

* [rocprofiler-systems] Add support for ompt_callback_thread_begin (#1681)

* Add thread_begin callback

* Make OMPT callbacks that are instant have start_ts = end_ts

* SWDEV-567514: Remove default stream wait (#1977)

- when virtual map command is called

- can create deadlock

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>

* Fix flaky test Unit_hipStreamAddCallback_StrmSyncTiming (#2022)

* Review comments

* skip the 3 failing tests to merge hip-tests rocm-systems PR

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: adapryor <Adam.pryor@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Co-authored-by: GunaShekar <agunashe@amd.com>
Co-authored-by: agunashe <ajay.gunashekar@amd.com>
Co-authored-by: Ethan Trinh <Ethan.Trinh@amd.com>
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
Co-authored-by: Victor Zhang <111778801+victzhan@users.noreply.github.com>
Co-authored-by: German Andryeyev <56892148+gandryey@users.noreply.github.com>
Co-authored-by: usrihari123 <srihari.u@amd.com>
Co-authored-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Co-authored-by: anujshuk-amd <anujshuk@amd.com>
Co-authored-by: itrowbri <Ian.Trowbridge@amd.com>
Co-authored-by: marantic-amd <marantic@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: cadolphe-amd <chris.adolphe@amd.com>
Co-authored-by: Karthik Jayaprakash <54370791+kjayapra-amd@users.noreply.github.com>
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
Co-authored-by: Todd tiantuo Li <88386084+lttamd@users.noreply.github.com>
Co-authored-by: amilanov-amd <Aleksandar.Milanov@amd.com>
Co-authored-by: Adam Pryor <61172547+adam360x@users.noreply.github.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: AidanBeltonS <abeltons@amd.com>
Co-authored-by: Rahul Manocha <153310294+manocharahul@users.noreply.github.com>
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
Co-authored-by: Kian Cossettini <Kian.Cossettini@amd.com>
Co-authored-by: Shadi Dashmiz <94885391+shadidashmiz@users.noreply.github.com>
Co-authored-by: Ioannis Assiouras <38722728+iassiour@users.noreply.github.com>
Co-authored-by: Ajay GunaShekar <86270081+agunashe@users.noreply.github.com>
2025-12-03 08:53:17 -08:00
German Andryeyev f3ffd7070c rocr: Change hsaKmtQueueRingDoorbell interface (#2068)
WSL uses the call just for the thread wake-up, however under Windows
KMD needs the actual value (SWDEV-568592). The interface is changed
to avoid programming of a modified write_ptr value, which somewhat
changes the client's logic.
2025-12-03 11:49:40 -05:00
Alysa Liu e79af13068 rocrtst: add VMM memory accounting test (#1666)
Add VMM test for memory accounting.
2025-12-03 11:27:51 -05:00
Sunday Clement 3f3260ffd2 rocr: Fix IPC dmabuf hang with large allocations (#1945)
Changed ipc_sock_server_conns_ map's value type to size_t. Previous
type of int caused allocations of sizes greater than 2GB to overflow,
causing the message len to be stored as a negative value, preventing the
IPC server from exporting dmabuf file descriptors, which led to hangs.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
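
The overflow described in this message can be reproduced in a few lines. This is a sketch using Python's `ctypes` to mimic fixed-width C integers, not the ROCr code itself; the 3 GiB length is an arbitrary example value.

```python
import ctypes

length = 3 * 1024**3  # example message length of 3 GiB, above the 2 GiB limit

# A 32-bit signed int cannot hold the value, so it wraps around to a negative number.
as_int = ctypes.c_int32(length).value

# size_t is unsigned and wide enough here, so the length is preserved.
as_size_t = ctypes.c_size_t(length).value

print(as_int)     # -1073741824
print(as_size_t)  # 3221225472
```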
2025-12-03 11:20:30 -05:00
Ben Richard 2bfa9a4d4c Integrate roofline benchmark into rocprof-compute (#2015)
---------

Co-authored-by: Fei Zheng <44449748+feizheng10@users.noreply.github.com>
2025-12-03 10:51:46 -05:00
Kian Cossettini 43f0a53fb0 Skip transferbench validation if binary is not built (#2136)
Added set(skip_validation TRUE) if transferBench binary not found.
2025-12-03 10:14:16 -05:00
Milan Radosavljevic fddef714a0 [rocprofiler-systems] Add trace_cache unit tests (#2086)
Improve test coverage and reliability of the trace_cache module by adding comprehensive unit tests for all major components.
2025-12-03 09:25:33 -05:00
Anatolii Rozanov 4b04b540bf Add host API for alltoallmem_on_stream collective operation (#333)
* Add host-side rocshmem_alltoallmem_on_stream function

Function signature:
  rocshmem_alltoallmem_on_stream(rocshmem_team_t team, void *dest,
                                 const void *source, size_t size,
                                 hipStream_t stream)

- The function launches rocshmem_alltoallmem_kernel which calls
device-side alltoall<char> workgroup collective through default context.
- Uses dynamic block size determination via occupancy API.
- Implemented for all backends.

* Fix incorrect sync buffer size allocation for alltoall in GDA and IPC backends

When allocating memory for alltoall_pSync_pool in setup_teams() and
teams_init() functions, the code incorrectly used ROCSHMEM_BCAST_SYNC_SIZE
instead of ROCSHMEM_ALLTOALL_SYNC_SIZE.

* Add functional test for team_alltoallmem_on_stream

This commit adds a new functional test to verify the correctness of
the host-side rocshmem_team_alltoallmem_on_stream API.

* Add documentation for rocshmem_alltoallmem_on_stream

This commit adds API documentation for the host-side
rocshmem_alltoallmem_on_stream function in the collective routines
section. The documentation includes:

[ROCm/rocshmem commit: 5577feb70d]
2025-12-03 08:40:24 -05:00
Anatolii Rozanov 5577feb70d Add host API for alltoallmem_on_stream collective operation (#333)
* Add host-side rocshmem_alltoallmem_on_stream function

Function signature:
  rocshmem_alltoallmem_on_stream(rocshmem_team_t team, void *dest,
                                 const void *source, size_t size,
                                 hipStream_t stream)

- The function launches rocshmem_alltoallmem_kernel which calls
device-side alltoall<char> workgroup collective through default context.
- Uses dynamic block size determination via occupancy API.
- Implemented for all backends.

* Fix incorrect sync buffer size allocation for alltoall in GDA and IPC backends

When allocating memory for alltoall_pSync_pool in setup_teams() and
teams_init() functions, the code incorrectly used ROCSHMEM_BCAST_SYNC_SIZE
instead of ROCSHMEM_ALLTOALL_SYNC_SIZE.

* Add functional test for team_alltoallmem_on_stream

This commit adds a new functional test to verify the correctness of
the host-side rocshmem_team_alltoallmem_on_stream API.

* Add documentation for rocshmem_alltoallmem_on_stream

This commit adds API documentation for the host-side
rocshmem_alltoallmem_on_stream function in the collective routines
section. The documentation includes:
2025-12-03 08:40:24 -05:00
Milan Radosavljevic 09a9f9e31d [rocprofiler-systems] Improve rocpd writing speed (#2061) 2025-12-03 13:11:15 +01:00
Jan Stephan 09fbf08b40 Add Cline storage to .gitignore (#2132)
Signed-off-by: Jan Stephan <jan.stephan@amd.com>
2025-12-03 11:23:32 +01:00
Todd tiantuo Li 55bce8108d SWDEV-491246 - add stream capture tests for Memset sync APIs (#1776) 2025-12-02 23:59:50 -08:00
Mark Meserve a3a177b14f rocprofiler-sdk: fix python linting (#2147)
- flake8 rules are not currently included in make format
2025-12-02 19:31:49 -06:00
vedithal-amd 2700ca0287 Update changelog to reflect 7.2 cherry-pick (#2137) 2025-12-02 17:26:17 -05:00
spolifroni-amd a2d3cd97cc Docs - updated the libva requirements for 7.2 (#207)
* updated the libva requirements for 7.2

* updated with feedback from Aryan

* messed up the library reqs; fixed

* added a later

* made the changelog clearer

[ROCm/rocjpeg commit: 1a86352fdd]
2025-12-02 11:46:12 -08:00
spolifroni-amd 1a86352fdd Docs - updated the libva requirements for 7.2 (#207)
* updated the libva requirements for 7.2

* updated with feedback from Aryan

* messed up the library reqs; fixed

* added a later

* made the changelog clearer
2025-12-02 11:46:12 -08:00
Matt Arsenault d75d0bc1c9 SWDEV-548892 - Stop using ocml exp and exp2 functions (#2032) 2025-12-02 13:39:09 -05:00