2042 Commits

Author SHA1 Message Date
gabrpham_amdeng 3527ecf8dd Fix PCIe Levels converting to csv
Change-Id: Id69b8bbc167887673a88e13eb497bdeac6dd0425
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>


[ROCm/amdsmi commit: 46e5f157f0]
2025-11-13 13:08:12 -06:00
gabrpham_amdeng 351b6f96ae Added support for configuring PPT1 power cap
- Updated python integration test to account for PPT1 support changes
  - Updated set/reset power-cap input format
  - Adjusted python API and updated C++ API test

Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
Change-Id: Ia9d02868b6e91c88c10a9772d9e2d9f37c3c352f


[ROCm/amdsmi commit: 18faddf6f3]
2025-11-13 13:08:12 -06:00
darren-amd 360a4316b5 Set amdsmi_parser compute partition --help to list
[ROCm/amdsmi commit: 4dfe74eb72]
2025-11-13 12:49:56 -06:00
Saitarun Movva b8298bf5b6 Fix #132: Python parser not accepting compute partition arguments
The argparse 'choices' parameter was receiving a comma-separated string
instead of a list, causing it to treat individual characters as valid
choices rather than complete tokens like 'SPX', 'DPX', etc.

Fixed by removing the unnecessary join() operation in
get_accelerator_choices_types_indices() to return the list directly.
This matches the pattern used by get_memory_partition_types().

Now 'amd-smi set -C DPX' and other partition commands work correctly.


[ROCm/amdsmi commit: f6b1cb9024]
2025-11-13 12:49:56 -06:00
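The failure mode described in the commit above can be illustrated without amd-smi. This is a minimal sketch, not the project's parser code: `build_parser()` and `PARTITION_TYPES` are hypothetical names, and only the two partition names quoted in the message (`SPX`, `DPX`) are used.

```python
import argparse

# Hypothetical stand-in for the partition type list; only SPX and DPX
# come from the commit message.
PARTITION_TYPES = ["SPX", "DPX"]

# Joining the list into one string loses the token boundaries: iterating
# the string yields single characters, not partition names.
joined = ",".join(PARTITION_TYPES)
chars_seen = list(joined)


def build_parser(choices):
    """Build a tiny parser whose -C option is restricted to `choices`."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("-C", dest="compute_partition", choices=choices)
    return parser


# Passing the list itself makes argparse compare whole tokens.
parser = build_parser(PARTITION_TYPES)
args = parser.parse_args(["-C", "DPX"])
print(args.compute_partition)
```

How argparse treats a plain string passed as `choices` has varied between Python releases, so the sketch only exercises the list form that the fix switches to.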
AL Musaffar, Yazen 93a719b894 Fix for XGMI and SOC policies KeyError (#823)
Fix for amd-smi XGMI and SOC policies errors

Signed-off-by: Yazen AL Musaffar <Yazen.ALMusaffar@amd.com>

[ROCm/amdsmi commit: 699890a3f5]
2025-11-10 12:41:47 -06:00
Poag, Charis ced0642b4b [SWDEV-562295] Fix Dmesg errors when using CLI (#822)
* Changes:
  - Changed the permission check from attempting to open
    each file to checking read access only.

Do not try to open all paths, may cause driver issues.
Read access is sufficient to check permissions.

Reason: on GPUs which support partitioning (memory/compute),
logical devices will not be valid until configured.
See `sudo amd-smi set -h` or applicable APIs
to configure on supported hardware.

Example error dmesg output:
[965358.883112] amdgpu 0000:15:00.0: amdgpu: renderD153 partition 1 not valid!
[965358.883283] amdgpu 0000:15:00.0: amdgpu: renderD154 partition 2 not valid!
[965358.883438] amdgpu 0000:15:00.0: amdgpu: renderD155 partition 3 not valid!
[965358.883594] amdgpu 0000:15:00.0: amdgpu: renderD156 partition 4 not valid!
[965358.883749] amdgpu 0000:15:00.0: amdgpu: renderD157 partition 5 not valid!
[965358.883904] amdgpu 0000:15:00.0: amdgpu: renderD158 partition 6 not valid!
[965358.884060] amdgpu 0000:15:00.0: amdgpu: renderD159 partition 7 not valid!

---------

Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/amdsmi commit: d73726698b]
2025-11-06 10:24:14 -06:00
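The idea in the commit above, checking read permission without opening the device node, can be sketched as follows. The actual change is presumably in the library's native code; this Python version is an illustration only, and `is_readable()` and `readable_nodes()` are hypothetical helper names.

```python
import os


def is_readable(path):
    """Return True if the current user may read `path`, without opening it."""
    # os.access() only performs a permission check on the path; the file
    # is never opened, so nothing is triggered on the device side.
    return os.access(path, os.R_OK)


def readable_nodes(paths):
    """Filter a list of candidate paths down to the readable ones."""
    return [p for p in paths if is_readable(p)]
```

A caller could pass the candidate render node paths to `readable_nodes()` and only touch the ones it returns.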
Galantsev, Dmitrii 181659ea1f Add numbers to .so because wheels don't allow symlinks (#820)
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>

[ROCm/amdsmi commit: 8bdf951d32]
2025-11-06 03:57:31 -06:00
Galantsev, Dmitrii 4e8d89306e Add downloaded gtest as fallback
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: aac09912ec]
2025-11-06 01:26:40 -06:00
Galantsev, Dmitrii 87ace88e72 Fix missing iomanip and cstdio in tests
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 982737a852]
2025-11-05 10:14:19 -06:00
gabrpham_amdeng 4d29ff8a2d Fixed Namespace has no attribute 'pcie' error in set command
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>


[ROCm/amdsmi commit: 9739611239]
2025-10-30 15:17:48 -05:00
adapryor 5c95a1485f Fix evicted_time
[ROCm/amdsmi commit: 4abb69f9d9]
2025-10-30 14:01:44 -05:00
Galantsev, Dmitrii adaf3c9966 Use system gtest instead of building from source
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: a375479386]
2025-10-30 12:38:11 -05:00
Galantsev, Dmitrii 55f999f3ce Find libamd_smi.so and librocm-core.so relative to wrapper.py
Allow amdsmi to find libamd_smi.so and librocm-core.so relative to
amdsmi_wrapper.py location.

The amdsmi_wrapper.py file is located in
_rocm_sdk_core/share/amd_smi/amdsmi and the libraries are in
_rocm_sdk_core/lib/libamd_smi.so.26.
_rocm_sdk_core/lib/librocm-core.so.1.


[ROCm/amdsmi commit: ad20d57162]
2025-10-30 12:35:06 -05:00
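The relative lookup described in the commit above can be sketched with the directory layout quoted in the message. The function name `library_path()` and the `/opt` prefix in the usage line are assumptions for illustration; the real wrapper may resolve paths differently.

```python
from pathlib import PurePosixPath


def library_path(wrapper_file, lib_name):
    """Locate a shared library relative to the wrapper file's location."""
    wrapper = PurePosixPath(wrapper_file)
    # The wrapper lives in <root>/share/amd_smi/amdsmi/, so parents[0..2]
    # are amdsmi/, amd_smi/ and share/, and parents[3] is the package root.
    package_root = wrapper.parents[3]
    return package_root / "lib" / lib_name


example = library_path(
    "/opt/_rocm_sdk_core/share/amd_smi/amdsmi/amdsmi_wrapper.py",
    "libamd_smi.so.26",
)
print(example)
```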
Bindhiya Kanangot Balakrishnan 9973a6b324 [SWDEV-558046] Fix topology weight corruption due to casting
The out-of-bounds write corrupted the next field,
which was weight. Fixed by reading into a temporary and then
assigning safely.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>


[ROCm/amdsmi commit: a2aae5e8a9]
2025-10-30 10:49:38 -05:00
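The class of bug described in the commit above can be reproduced with `ctypes`. This is a sketch under assumptions: the `LinkInfo` layout and the field name `hops` are hypothetical, and the real structure and fix live in the C/C++ sources.

```python
import ctypes


class LinkInfo(ctypes.Structure):
    # Hypothetical layout: a 32-bit field followed directly by `weight`.
    _fields_ = [("hops", ctypes.c_uint32), ("weight", ctypes.c_uint32)]


def unsafe_store(info, value):
    """Write a 64-bit value through a pointer cast to the 32-bit `hops`."""
    wide = ctypes.cast(ctypes.pointer(info), ctypes.POINTER(ctypes.c_uint64))
    wide[0] = value  # 8 bytes are written, spilling over into `weight`


def safe_store(info, value):
    """Read into a correctly sized temporary, then assign only `hops`."""
    tmp = ctypes.c_uint64(value)
    info.hops = tmp.value & 0xFFFFFFFF


corrupted = LinkInfo(hops=0, weight=15)
unsafe_store(corrupted, 1)

intact = LinkInfo(hops=0, weight=15)
safe_store(intact, 1)
```

After `unsafe_store()` the `weight` field no longer holds 15, while `safe_store()` leaves it untouched.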
Charis Poag 4df843f110 [SWDEV-560847] Fix Vram type not showing newer types
* Changes:
  - Allows `amd-smi static --vram` (`amdsmi_get_gpu_vram_info()`)
    to read the following types:
    DDR5, LPDDR4, LPDDR5, and HBM3E.

Change-Id: I1eddf9dcb574e1868541cc5063ae95cb6d6e1c59
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 0a5fdc944f]
2025-10-29 16:13:42 -05:00
Allan Xavier 9b4a9acd27 Allowed GPU enumeration to continue with non-contiguous render nodes
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>


[ROCm/amdsmi commit: 51971426bd]
2025-10-29 15:31:56 -05:00
Bindhiya Kanangot Balakrishnan d5691b7ed9 [SWDEV-563281] Add json and csv output for xgmi status
Added json and csv output format support for newly
added xgmi link_status. Aligned legend.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>


[ROCm/amdsmi commit: 8dd4a4997b]
2025-10-29 15:25:15 -05:00
Pham, Gabriel 87b2fd73b8 Added set --pcie command and added more pcie info to static --bus output (#481)
* Added amd-smi set --pcie command
* Removed current pcie level due to it not being static
* Added pcie information to static --bus

---------

Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 9e3537d778]
2025-10-28 14:55:55 -05:00
Pryor, Adam 354886f4ff [SWDEV-357472] Add evicted_ms metric (#620)
- **Added evicted_time metric for kfd processes**.  
  - Time that queues are evicted on a GPU in milliseconds
  - Added to CLI in `amd-smi monitor -q` and `amd-smi process`
  - Added to C API and Python API:
    - amdsmi_get_gpu_process_list()
    - amdsmi_get_gpu_compute_process_info()
    - amdsmi_get_gpu_compute_process_info_by_pid()

---------

Signed-off-by: Pryor, Adam <Adam.Pryor@amd.com>

[ROCm/amdsmi commit: 2144cfbba4]
2025-10-28 14:49:03 -05:00
dependabot[bot] f36affe4d5 Bump rocm-docs-core[api_reference] from 1.26.0 to 1.27.0 in /docs/sphinx (#790)
Bumps [rocm-docs-core[api_reference]](https://github.com/ROCm/rocm-docs-core) from 1.26.0 to 1.27.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.26.0...v1.27.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-version: 1.27.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>

[ROCm/amdsmi commit: 6f222c11a6]
2025-10-28 09:59:49 -05:00
Park, Peter e8d06cc0d6 Update install instructions (#759)
- `amdgpu-install` is no longer recommended. Link to separate driver
installation docs.
- add verify step
- update readme
- add package info

Signed-off-by: Park, Peter <Peter.Park@amd.com>

[ROCm/amdsmi commit: 12fb58c30b]
2025-10-28 09:59:11 -05:00
Charis Poag fee59b2c58 [SWDEV-562726] Fix clang + ASAN errors
* Updates:
  - [ASAN] GCC does not support the `-shared-libsan` flag, so it was removed
  - [Clang] Fixed references to local binding errors (name collision)
    & other strict scope/structure/lambda binding errors
  - [Clang] Fix rsmi_wrapper error: "error: missing default argument on parameter 'args'"
  - [ASAN] Fixed stack-buffer-overflow found in
    `amdsmi_get_gpu_accelerator_partition_profile()`

Change-Id: I854007efb75d828dbb8088c0d56dbc125081f0f2
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 00a04f5810]
2025-10-28 09:54:23 -05:00
Narlo, Joseph 54317f3fe8 [SWDEV-553416] Fix amdsmi_get_gpu_reg_table_info and amdsmi_get_gpu_pm_metrics_info (#787)
Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com>

[ROCm/amdsmi commit: ced7d12395]
2025-10-27 14:43:31 -05:00
Saeed, Oosman be53750aa3 Sync with latest ras-decode @bc6b43c (#770)
Signed-off-by: Oosman Saeed <oossaeed@amd.com>

[ROCm/amdsmi commit: 90f4b8c43d]
2025-10-27 14:10:00 -05:00
AL Musaffar, Yazen 2050dde73b [SWDEV-557731] Fix for amd-smi error not exiting when timeout command used (#779)
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>

[ROCm/amdsmi commit: 4bcc2ca598]
2025-10-27 13:51:22 -05:00
Billakanti, Koushik a428180cb8 [SWDEV-434556] Updated warning text when argcomplete fails to install (#778)
* Solved error: [Errno 2] No such file or directory: '/usr/local/lib/python3.10/dist-packages/argcomplete/bash_completion.d/python-argcomplete.sh'

[ROCm/amdsmi commit: 013d6cb511]
2025-10-27 13:05:17 -05:00
Kanangot Balakrishnan, Bindhiya 3924171d74 [SWDEV-542718] Correct socket_affinity (#760)
* [SWDEV-542718] Correct socket_affinity

Updated Socket affinity to show bitmask and expanded cpu list.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Update per-device local_cpulist for socket_affinity

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Added amdsmi_get_cpu_affinity_from_local_cpulist API.
Updated the wrapper.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Revert "Added amdsmi_get_cpu_affinity_from_local_cpulist API."

This reverts commit 9a2ef934b1787f8aa09d3e4efe02f897b4295215.

* Moved the changes to C API.
In case of SOCKET_SCOPE, use local_cpulist first.
If it is unavailable or not readable, fallback to
numa.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Addressed review comments

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

[ROCm/amdsmi commit: 09a97f02ed]
2025-10-22 16:20:41 -05:00
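The commit above reports socket affinity as both a bitmask and an expanded CPU list. A minimal sketch of that conversion, assuming the common Linux cpulist text form (comma-separated CPUs and inclusive ranges such as `0-3,8`); `parse_cpulist()` and `cpus_to_bitmask()` are hypothetical helpers, not amdsmi APIs:

```python
def parse_cpulist(text):
    """Expand a cpulist string such as '0-3,8' into a list of CPU indices."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # tolerate empty input or stray commas
        if "-" in part:
            first, last = part.split("-")
            cpus.extend(range(int(first), int(last) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpus_to_bitmask(cpus):
    """Set bit N in the mask for every CPU index N."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


cpus = parse_cpulist("0-3,8\n")
mask = cpus_to_bitmask(cpus)
print(cpus, hex(mask))
```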
Poag, Charis ce19b921b0 [SWDEV-535159] Add support for GPU partition metrics (#490)
[SWDEV-535159] Add support for GPU partition metrics

Changes include:
  - Internal logic to smart-switch between gpu_metrics/xcp_metrics files
  - [WIP] Initial plumbing for new partition metric API

Change-Id: I4340fb1b48bac0117d80d5d486b9e871430d5cd8
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Add amdsmi_get_gpu_partition_metrics_info() + minor cleanup

Change-Id: I5d60604f18baddbd03852dc90e88aa0b8107d50e
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Fix partition metric logic + update logging/tests

Change-Id: I9e89b19ead17694c54e224f8e13ff8ee3eb2e22a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Adjust amd-smi metric/monitor/default to show (some) partition information

Change-Id: I2e8d2745876a19bdaec3c039daa97345c9f701b5
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Add C++ tests

Change-Id: Ib9eb0b57a6d7a280992e05a4c6eba632826952ef
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Remove modification of energy counter, not needed

Change-Id: I5c48eaaae248ee6dc79abba609d837ec35d78022
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[CLI] amd-smi metric: cleaned up N/A'd multi-valued to show just N/A

Changes:
1. amd-smi metric: cleaned up N/A'd multi-valued to show just N/A
ex.
JPEG_ACTIVITY: [N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A]

Now just shows: N/A

2. [Python Unit Test] Changed testname TestAmdSmiPythonBDF(unittest.TestCase) ->
 AmdSmiPythonUnitTest

Test name was confusing.

Change-Id: Ieb3b036f30002fd22362508eb9fc5d443df395ae
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
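
The N/A cleanup described in this change can be sketched as below. `collapse_na()` is a hypothetical helper written for illustration; the CLI's actual formatting code is not shown in the message.

```python
def collapse_na(values, placeholder="N/A"):
    """Return a single placeholder when every entry is the placeholder."""
    if values and all(v == placeholder for v in values):
        return placeholder
    # Mixed or empty lists are returned unchanged.
    return values


collapsed = collapse_na(["N/A"] * 32)
mixed = collapse_na([5, "N/A"])
```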

Log cleanup

Change-Id: I1b1a95f1844d35bec7a7bd8cb996f87e4914c069
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Add amd-smi partition-metrics CLI + general cleanup

Change-Id: Ia91488e6cb3a4d62b4087afbddfe0b3bb9378fdc
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[1.3 metrics] Remove forwards compatibility for partition metrics

Change-Id: Iab928983e6f6f1587bc9307f6f3fa2b2696ca6f7
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Fixed violation output not showing % + general cleanup

Change-Id: Icac1b0a55b18c7628b07109ae0c377d17e0825f1
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Clean up amdsmi_get_gpu_partition_metrics_info & amd-smi partition-metric outputs

Change-Id: I6427028b980874641e9ffb3b5d88ad493dbf9cf4
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Fix metrics not found + extra logging/formatting

Change-Id: I841a27bb2c305e97ec7579a13ac915e5be497c3a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Update license to current default

Change-Id: I0de9b8a2d5dbbeab4491097f0354ba17b0d30866
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Cleanup for review

Change-Id: I96ed25c3f2b8968eea1af24c5e5860c2b4e74e6e
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Modernize updated/new internal APIs.

Change-Id: I3c48a250eeb703709b14cb5ffa68268d8321626c
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Remove extra logging in dynamic metrics

Change-Id: Idb97547bcbe143d6fa1cb5cb278ffe4da615ce14
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Remove amd-smi partition-metric command

Change-Id: Ib83c17e5cd7e0da3798198943bddd46c296b411c
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Move new CLI updates to another PR + minor fixes

Change-Id: I3b1163eec12f9b5f7d95ee33de08e168cec1b1fe
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Allow dynamic metrics to work for gpu/xcp metrics 1.9+/1.1+

Updated some logging as well.

Change-Id: I2ed9f5a5ef8afb1520508820ca6153525f0644b4
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Allow dyn gpu/xcp metric v1.9+/v1.1+

Added tests for quick check

Change-Id: I576d6f6582a55afb08e5ac57791ce95e2fa184a2
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Update tests for larger subset of version checks

Change-Id: I3cdf4f8bb4fc6161f4c76566939f90545d0f362a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Fix XCP metrics in gpu/partition metric pre-v1.9/v1.1 (dynamic)

Change-Id: I4dabc1ed6bef6b86c8e7f92bf9cb5992f3966fe2
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

---------

Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/amdsmi commit: 01b4fe6614]
2025-10-20 14:43:40 -05:00
Pryor, Adam 428bded17a Add cache for user/group checking (#780)
* Add cache for user/group checking
* Fix self

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: b2b0815b37]
2025-10-17 15:46:19 -05:00
Saeed, Oosman 7d39749a08 [SWDEV-551318] Update readme doc: amdsmi_get_afids_from_cper() input arguments (#766)
* Update readme doc: amdsmi_get_afids_from_cper() input argument is only bytes, not a list of dicts each with keys “bytes” (List[int]) and “size” (int)

---------

Signed-off-by: Oosman Saeed <oossaeed@amd.com>

[ROCm/amdsmi commit: f7c9fe3011]
2025-10-17 15:42:17 -05:00
Narlo, Joseph 5ec7b213e4 [SWDEV-555807] TestCudaMallocAsync test power draw failing (#755)
* Clarified comments regarding power limit retrieval and its support on virtualized systems.
* Change unsupported comment to UINT32_MAX

---------

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 460cfcba1f]
2025-10-17 08:57:57 -05:00
Bindhiya Kanangot Balakrishnan baa8fa3042 Move build_xcp_dict def to helpers.
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>


[ROCm/amdsmi commit: 4dd1c1042a]
2025-10-16 15:25:10 -05:00
Pryor, Adam 1c6147ead5 [SWDEV-558993] Fix list() groups printout (#772)
* Updated groups printing
* added parameters to check_required_groups
	* two device groups since kfd and render require the same group

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: ee1445e2cc]
2025-10-16 11:23:49 -05:00
adapryor cda730140f [SWDEV-560778] Update gpu metrics factory to return a new pointer every time
[ROCm/amdsmi commit: a64e9b4ac4]
2025-10-15 11:00:44 -05:00
Pryor, Adam 5127c923b9 [SWDEV-559082] Add asic info cache (#756)
Signed-off-by: adapryor <Adam.pryor@amd.com>

[ROCm/amdsmi commit: cba4c871d3]
2025-10-08 21:48:08 -05:00
Arif, Maisam 0aae5d381d Spellcheck
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: b02f7b2793]
2025-10-08 12:03:17 -05:00
Maisam Arif dcb8ba2215 Clean up and add comments
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Id30c0ccb68918e109533593df7c360837bdfa002


[ROCm/amdsmi commit: 4e8ed1f3e3]
2025-10-08 12:00:21 -05:00
Maisam Arif 9c5609410e Changelog Update
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Idf2faae9fce903468d6ddccb1dce8161b1ac904c


[ROCm/amdsmi commit: c5c8e98def]
2025-10-08 01:55:16 -05:00
Pryor, Adam 6758e447b6 [SWDEV-556149] Fix group checking (#746)
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Signed-off-by: Pryor, Adam <Adam.Pryor@amd.com>


[ROCm/amdsmi commit: 4518e7ee91]
2025-10-07 22:24:56 -05:00
Oosman Saeed 2214445327 [SWDEV-553168] Add support for decoding out of band boot time CPER files.
Change-Id: Ic4278698f9c5b5ae56bd56fd43150c0653c1ef05


[ROCm/amdsmi commit: c6698c9100]
2025-10-07 22:23:33 -05:00
yalmusaf_amdeng 25a6ac3585 [SWDEV-558349] Fix for cper record count mismatch with --file-limit
Change-Id: I4fdcc0fb1153e47c195062e7bdf71c0362723ef6


[ROCm/amdsmi commit: c4cad504be]
2025-10-07 21:36:53 -05:00
Pryor, Adam a93b9d473d [SWDEV-558895] Fix rsmi monitor fds (#748)
Signed-off-by: adapryor <Adam.pryor@amd.com>

[ROCm/amdsmi commit: 346e1516af]
2025-10-07 21:31:23 -05:00
Maisam Arif 1269ff4c0c [SWDEV-558993] Fix amd-smi list to not check for groups for bdf
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I1ff9c0e00a9188435b0ee60e57c2678121dd8e72


[ROCm/amdsmi commit: 0a45a12e7a]
2025-10-07 09:45:37 -05:00
Maisam Arif 0b16d22254 [SWDEV-558993] Fix bdf sourcing
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I0c50f490334f6de12a4c01abf1c2ed9e50d87295


[ROCm/amdsmi commit: a0d59397b4]
2025-10-07 01:32:26 -05:00
Kanangot Balakrishnan, Bindhiya 693055ee50 [SWDEV-554046] xgmi cli redesign (#574)
Added `GPU LINK PORT STATUS` table to `amd-smi xgmi` command 
The `amd-smi xgmi -s` or `amd-smi xgmi --source-status` will show `GPU LINK PORT STATUS` table.  

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

[ROCm/amdsmi commit: 7ddd91653e]
2025-10-07 01:07:27 -05:00
Pryor, Adam d1679c7ade [SWDEV-558895] Fix rsmi_event_notification_get segfaulting (#738)
Signed-off-by: adapryor <Adam.pryor@amd.com>

[ROCm/amdsmi commit: ce016f0dcb]
2025-10-06 15:10:56 -05:00
Narlo, Joseph 6975b29c15 [SWDEV-539078] Add missing API definitions to python interface (#525)
Added the following APIs to amdsmi_interface.py.
	amdsmi_get_cpu_handle()
	amdsmi_get_esmi_err_msg()
	amdsmi_get_gpu_event_notification()
	amdsmi_get_processor_count_from_handles()
	amdsmi_get_processor_handles_by_type()
	amdsmi_gpu_validate_ras_eeprom()
	amdsmi_init_gpu_event_notification()
	amdsmi_set_gpu_event_notification_mask()
	amdsmi_stop_gpu_event_notification()
	amdsmi_get_gpu_busy_percent()

Added additional return value to API amdsmi_get_xgmi_plpd().
	The entry policies is added to the end of the dictionary to match API definition.
	The entry plpds is marked for deprecation as it has the same information as policies.

---------

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 7decbc67a1]
2025-10-06 14:50:00 -05:00
dependabot[bot] 8104a48f8e Bump rocm-docs-core[api_reference] from 1.25.0 to 1.26.0 in /docs/sphinx
Bumps [rocm-docs-core[api_reference]](https://github.com/ROCm/rocm-docs-core) from 1.25.0 to 1.26.0.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.26.0/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.25.0...v1.26.0)

---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-version: 1.26.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>

[ROCm/amdsmi commit: faf0024135]
2025-10-06 09:17:33 -05:00
Pryor, Adam 8bc7216a65 [SWDEV-525336] Use KFD to determine process start/stop (#723)
* Used KFD to determine the links between GPUs and PIDs rather than depending on the single-GPU BDF info that fdinfo reports per PID.

Signed-off-by: adapryor <Adam.pryor@amd.com>

---------

Signed-off-by: adapryor <Adam.pryor@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: c967aead58]
2025-10-02 10:57:08 -05:00
Narlo, Joseph 098aa488aa Add ASIC and Board information (#721)
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>

[ROCm/amdsmi commit: b1eeff9992]
2025-10-01 17:39:26 -05:00