Galantsev, Dmitrii
06b8484bbc
CLI - Fix partition json output
...
Change-Id: I2b9e575cb960db7c136776bfe5c040b27feba727
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com >
[ROCm/amdsmi commit: 4262802588 ]
2025-06-19 10:34:57 -05:00
josnarlo
ed9086505d
[SWDEV-538604] Sync Unified Header and AMDSMI Comments
...
Signed-off-by: josnarlo <Joseph.Narlo@amd.com >
[ROCm/amdsmi commit: 5ed9fba9be ]
2025-06-18 09:13:01 -05:00
Deepak Mewar
63784f77f7
Updated display format of cpu & socket affinities
...
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com >
[ROCm/amdsmi commit: 7571eb014f ]
2025-06-13 17:37:00 -05:00
Bindhiya Kanangot Balakrishnan
cd709e93d1
[SWDEV-512393] Print keys of lists in custom_dump
...
The custom_dump function did not print the keys of lists,
so the static NUMA output was missing the list keys for
CPU affinity and Socket affinity. Updated custom_dump
to print the keys.
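The behaviour described above can be sketched as follows. This is a minimal illustration, not the actual custom_dump implementation: the function name `dump_with_list_keys`, the indentation width, and the sample key names are all assumptions made for the example.

```python
def dump_with_list_keys(data, indent=0):
    """Render nested dicts/lists as indented lines, keeping the key of every list.

    Hypothetical helper for illustration; not the amd-smi custom_dump code.
    """
    pad = " " * indent
    lines = []
    for key, value in data.items():
        if isinstance(value, dict):
            lines.append(f"{pad}{key}:")
            lines.extend(dump_with_list_keys(value, indent + 4))
        elif isinstance(value, list):
            # Emit the list's own key before its items; the commit describes
            # this key line as the part that was previously missing.
            lines.append(f"{pad}{key}:")
            for item in value:
                if isinstance(item, dict):
                    lines.extend(dump_with_list_keys(item, indent + 4))
                else:
                    lines.append(f"{pad}    {item}")
        else:
            lines.append(f"{pad}{key}: {value}")
    return lines


# Sample data with made-up key names and values.
sample = {"NUMA": {"NODE": 0, "CPU_AFFINITY": ["0-15", "32-47"], "SOCKET_AFFINITY": [0]}}
print("\n".join(dump_with_list_keys(sample)))
```

Returning a list of lines rather than printing directly keeps the helper easy to check in a unit test.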
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com >
[ROCm/amdsmi commit: 6fbda16098 ]
2025-06-13 17:37:00 -05:00
josnarlo
48ed5787a6
[SWDEV-537983] Fix comments about temperature units for amdsmi_get_temp_metric
...
Signed-off-by: josnarlo <Joseph.Narlo@amd.com >
[ROCm/amdsmi commit: d4a946717b ]
2025-06-13 16:51:59 -05:00
josnarlo
986a2dd0b5
[SWDEV-537983] Fix comments about temperature units for amdsmi_get_temp_metric
...
Signed-off-by: josnarlo <Joseph.Narlo@amd.com >
[ROCm/amdsmi commit: 4aee30f49b ]
2025-06-13 16:51:59 -05:00
Pham, Gabriel
dfaf8386fa
Added GTT Memory to default output process table ( #480 )
...
* Added GTT Memory to default command and adjusted table format
---------
Signed-off-by: gabrpham <Gabriel.Pham@amd.com >
[ROCm/amdsmi commit: 940ece6813 ]
2025-06-13 16:43:56 -05:00
dependabot[bot]
b1753ad3b3
Bump rocm-docs-core[api_reference] from 1.17.0 to 1.20.1 in /docs/sphinx
...
Bumps [rocm-docs-core[api_reference]](https://github.com/ROCm/rocm-docs-core) from 1.17.0 to 1.20.1.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/v1.20.1/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.17.0...v1.20.1)
---
updated-dependencies:
- dependency-name: rocm-docs-core[api_reference]
  dependency-version: 1.20.1
  dependency-type: direct:production
  update-type: version-update:semver-minor
...
Signed-off-by: dependabot[bot] <support@github.com >
[ROCm/amdsmi commit: 152184dd49 ]
2025-06-13 16:35:08 -05:00
Maisam Arif
34041504f9
Update workflows and Contrib docs
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Change-Id: I2ae31144ee1ab29c8bbba83f0c7eb0bb9dc079ba
[ROCm/amdsmi commit: 049c59c5bb ]
2025-06-13 16:19:10 -05:00
Maisam Arif
6688ae237f
Updated 6.4.2 Changelog
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Change-Id: I975f5db0bde9ebccec3756415cb1e7dc47e78988
[ROCm/amdsmi commit: 772b572913 ]
2025-06-12 17:17:13 -05:00
Maisam Arif
6e37490e87
[SWDEV-529665] PLDM Bundle naming
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Change-Id: Id7f652ddc4e790027869683a4aaa3226ffc05c83
[ROCm/amdsmi commit: 6da33b8ded ]
2025-06-12 02:19:37 -05:00
Maisam Arif
7be2218717
[SWDEV-537491] Updated Copyright to aca-decode files
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Change-Id: I9621e4c54f3b490c6eb4cfc3e9bdfb4d489f0052
[ROCm/amdsmi commit: 5763412f7d ]
2025-06-11 20:51:51 -05:00
Arif, Maisam
2658f0fe20
Fixed type hinting & Added copyrights ( #462 )
...
* Added copyrights
* Fixed type hinting for processor_handle in python_interface
* Fixed incorrect type hints to match the actual return types
---------
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com >
Change-Id: Ie2a09acf628ed0c43eacc8ec78c159d125acbcdb
[ROCm/amdsmi commit: 23b9da656c ]
2025-06-11 17:19:02 -05:00
Justin Williams
0c2228852a
CI - Added Build Warnings
...
Signed-off-by: Justin Williams <Justin.Williams@amd.com >
[ROCm/amdsmi commit: 6d03ca79ff ]
2025-06-11 13:13:38 -05:00
Maisam Arif
b8caa120a8
[SWDEV-537062] Fixed CU Occupancy reporting UINT MAX
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Change-Id: I975579997a9e455eb930f6c0b8fc5f3dc3cbfae4
[ROCm/amdsmi commit: b579d89ae2 ]
2025-06-11 10:42:00 -05:00
dependabot[bot]
aa35398722
Bump requests from 2.32.3 to 2.32.4 in /docs/sphinx ( #471 )
...
Bumps [requests](https://github.com/psf/requests) from 2.32.3 to 2.32.4.
- [Release notes](https://github.com/psf/requests/releases)
- [Changelog](https://github.com/psf/requests/blob/main/HISTORY.md)
- [Commits](https://github.com/psf/requests/compare/v2.32.3...v2.32.4)
---
updated-dependencies:
- dependency-name: requests
  dependency-version: 2.32.4
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com >
[ROCm/amdsmi commit: 7e956ce4f3 ]
2025-06-11 08:23:27 -05:00
Maisam Arif
2cbf0accea
[SWDEV-529665] Fix PLDM version format
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Change-Id: I7df4c2068e32a5c81c83adc69dc82a9f5d725533
[ROCm/amdsmi commit: 93404a6bff ]
2025-06-11 07:35:25 -05:00
Galantsev, Dmitrii
6892907072
CMAKE - Remove example build from src/CMakeLists.txt ( #469 )
...
* CMAKE - Remove example build from src/CMakeLists.txt
For some reason it was building examples every time even when not
necessary...
* CMAKE - Format
* Fix drm_example broken PRIu32
* CMAKE - Do NOT create lib64 when building examples
* CMAKE - Examples should only install C and CMake files
---------
Change-Id: I6274b72a085a41b5bd5ae698af798f60a8a092a0
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com >
[ROCm/amdsmi commit: f9b8066c26 ]
2025-06-11 07:12:44 -05:00
Maisam Arif
75fac0a105
Fixed Parser Folder Checking
...
* Adjusted help text
* Adjusted --afid to run only with --cper-file
* Fixed interface return error
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Change-Id: I2b96f4515c85f3b9dd84ba5c2d819729a997141b
[ROCm/amdsmi commit: ac63f410c2 ]
2025-06-10 15:58:06 -05:00
Maisam Arif
7eea09e4d8
[SWDEV-536417] CPER Display fixes
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Change-Id: Ic2f3901d0f4c95bd9ed4beda8aa5fd3d596df8d2
[ROCm/amdsmi commit: fb592e003a ]
2025-06-10 15:58:06 -05:00
Williams, Justin
20e374663d
CI v5.0 ( #459 )
...
Signed-off-by: Justin Williams <Justin.Williams@amd.com >
[ROCm/amdsmi commit: ae4f56d14b ]
2025-06-06 16:29:20 -05:00
Saeed, Oosman
cc2b4b4067
[SWDEV-536417] AFID & addc decode fixes ( #449 )
...
* fix endian problem
* use hw_revision and flags_mask from cper section instead of hardcoded values
---------
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
[ROCm/amdsmi commit: 815e0252b1 ]
2025-06-06 13:41:16 -05:00
Maisam Arif
8c60c4ed94
[SWDEV-536417] CPER & AFID CLI Fixes
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Change-Id: I20aafb1cd2bf8386c30e6d0a0fff8df9c8587554
[ROCm/amdsmi commit: 8bc37a19d2 ]
2025-06-06 12:26:13 -05:00
Charis Poag
df6de25624
[SWDEV-529030/SWDEV-531217] Fix tests & output for partitioned configurations (CPX, DPX, QPX, etc.)
...
Changes:
- Updated AMD SMI firmware to display "N/A" for unavailable firmware in partitioned environments, improving clarity.
  Example (in DPX):
    $ amd-smi firmware
    GPU: 0
        FW_LIST:
            ...
            FW 12:
                FW_ID: PM
                FW_VERSION: 00.86.39.00
    GPU: 1
        FW_LIST: N/A
- Fixed amd-smi partition not showing current partition information on
ASICs that cannot set memory or accelerator partitions.
    $ amd-smi partition -c -m
    CURRENT_PARTITION:
    GPU_ID  MEMORY  ACCELERATOR_TYPE  ACCELERATOR_PROFILE_INDEX  PARTITION_ID
    0       NPS1    CPX               2                          0
    1       N/A     N/A               N/A                        1
    2       N/A     N/A               N/A                        2
    3       N/A     N/A               N/A                        3
    4       N/A     N/A               N/A                        4
    5       N/A     N/A               N/A                        5
    6       NPS1    SPX               0                          0
    7       NPS1    SPX               0                          0
    8       NPS1    SPX               0                          0
    MEMORY_PARTITION:
    GPU_ID  MEMORY_PARTITION_CAPS  CURRENT_MEMORY_PARTITION
    0       N/A                    NPS1
    1       N/A                    N/A
    2       N/A                    N/A
    3       N/A                    N/A
    4       N/A                    N/A
    5       N/A                    N/A
    6       N/A                    NPS1
    7       N/A                    NPS1
    8       N/A                    NPS1
- Refactored amd_smi_drm_example.cc:
  - Groups partition changes and restores the original partition settings.
  - Now handles partitioned environments, allowing the example to continue even if some APIs are not supported in partitioned configurations.
- Modified amdsmi_asic_info_t (see amdsmi_get_gpu_asic_info()) to report the OAM ID as N/A if it is 0xFFFFFFFF (was 0xFFFF).
  This allows better handling of OAM IDs in partitioned environments (the OAM ID does not exist for non-primary nodes,
  since it is a physical identifier). It is also easier to handle in tests and example code (i.e., now consistent with the maximum size of the structure's value).
- Introduced amdsmi_RAII_open_FD() (internal API) to manage file descriptors using RAII, ensuring proper closure and preventing resource leaks.
Updated the following APIs to use this function:
- amdsmi_get_gpu_asic_info(), amdsmi_get_gpu_vram_usage(),
amdsmi_get_gpu_vram_info(), amdsmi_get_gpu_vbios_info(),
amdsmi_get_gpu_driver_info(), amdsmi_get_gpu_virtualization_mode()
- Updated AMD SMI test_base.cc/.h:
- Improved output and handling for partitioned environments.
- Added detailed ASIC information logging to align with structure changes.
- Enhanced error messages for better context before ASSERT checks.
- Resolved test failures in partitioned environments by updating
logic and handling for partition-specific configurations.
Fixed tests include:
- computepartition_read_write.cc, frequencies_read_write.cc,
gpu_metrics_read.cc, mem_util_read.cc, memorypartition_read_write.cc,
perf_level_read.cc, perf_level_read_write.cc, power_cap_read_write.cc,
power_read.cc, sys_info_read.cc, gpu_busy_read.cc
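The file-descriptor handling described for amdsmi_RAII_open_FD() follows the general pattern of tying a descriptor's lifetime to a scope. The library code is C++ and is not shown in this log; the sketch below only illustrates the same idea with a Python context manager, which is the closest analogue in that language. The name `scoped_fd` is hypothetical.

```python
import os
from contextlib import contextmanager


@contextmanager
def scoped_fd(path, flags=os.O_RDONLY):
    """Open `path` and guarantee the descriptor is closed when the block exits.

    Hypothetical helper; an analogue of the RAII idea, not the amdsmi API.
    """
    fd = os.open(path, flags)
    try:
        yield fd
    finally:
        # Runs on normal exit and on exceptions, so the descriptor cannot leak.
        os.close(fd)


# Usage: the descriptor is valid inside the block and closed afterwards.
with scoped_fd(os.devnull) as fd:
    print("opened descriptor", fd)
```

Because cleanup lives in one place, callers with several early-return or error paths do not need to repeat the close call on each of them.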
Change-Id: I36e903f8fddd714c74c719459c71aba8bbb77e6f
Signed-off-by: Charis Poag <Charis.Poag@amd.com >
Resetting head + adding fixes for tests run in partitions
Change-Id: I0c1e9ac07488b50c95f3bc6d8a724e67d2c715dc
Signed-off-by: Charis Poag <Charis.Poag@amd.com >
[ROCm/amdsmi commit: 391451752b ]
2025-06-05 19:24:49 -05:00
Pham, Gabriel
f12b070e14
[SWDEV-536184] Removed extra debug print statement ( #447 )
...
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com >
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com >
[ROCm/amdsmi commit: f0233eb664 ]
2025-06-05 17:50:56 -05:00
gabrpham_amdeng
f30205b296
[SWDEV-536184] Modified KFD fallback condition for getting VRAM to include sysfs read failures
...
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com >
[ROCm/amdsmi commit: 7130de3058 ]
2025-06-05 01:49:16 -05:00
Bindhiya Kanangot Balakrishnan
60a86179b9
[SWDEV-534746] Generate valid json output for partition command
...
The amd-smi partition --json output was not valid JSON.
The output is now generated in valid JSON format.
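One way to check for this kind of regression is to run captured output through a JSON parser. The sketch below assumes nothing about what exactly was malformed in the partition output, which the commit does not state; the sample strings are made up, and `is_valid_json` is a hypothetical helper name.

```python
import json


def is_valid_json(text):
    """Return True if `text` parses as a single JSON document."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


# Made-up samples: two objects printed back to back are not one JSON document,
# while the same objects wrapped in an array are.
concatenated = '{"gpu": 0}\n{"gpu": 1}'
wrapped = '[{"gpu": 0}, {"gpu": 1}]'
print(is_valid_json(concatenated), is_valid_json(wrapped))
```

In a test suite, the `text` argument would typically be the captured standard output of the command under test.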
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com >
[ROCm/amdsmi commit: 872c58b7a3 ]
2025-06-05 01:40:52 -05:00
Saeed, Oosman
99df131155
[SWDEV-530385] Update aca-decode with parsing fixes ( #435 )
...
* Update aca-decode to #4cd539d, which fixes some errors in parsing CPER files for AFID extraction
* Without this fix, we get garbage values for some CPER input files relating to GFX_poison_cpers
Signed-off-by: Oosman Saeed <oossaeed@amd.com >
[ROCm/amdsmi commit: 2c3fa591b5 ]
2025-06-04 18:49:05 -05:00
Arif, Maisam
e38de3932f
Add Directory Not Found Status code to map to ENOTDIR ( #238 )
...
* Corrected ecc count error return
* Added directory not found error code
* Added ENOTDIR mapping to RSMI_STATUS_DIRECTORY_NOT_FOUND in ErrnoToRsmiStatus
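The errno-to-status mapping mentioned above can be illustrated with a small lookup table. ErrnoToRsmiStatus itself is C++ and is not shown here; in this sketch only the ENOTDIR row is taken from the commit text, while the other rows, the fallback value, and the helper names are illustrative assumptions rather than the library's actual table.

```python
import errno
import os

# Only the ENOTDIR entry comes from the commit text; the other rows are
# illustrative placeholders, not the real mapping.
ERRNO_TO_STATUS = {
    0: "RSMI_STATUS_SUCCESS",
    errno.EACCES: "RSMI_STATUS_PERMISSION",
    errno.ENOTDIR: "RSMI_STATUS_DIRECTORY_NOT_FOUND",
}


def errno_to_status(err):
    """Translate an errno value to a status name, with an assumed fallback."""
    return ERRNO_TO_STATUS.get(err, "RSMI_STATUS_UNKNOWN_ERROR")


def list_dir_status(path):
    """Try to list `path` and report the outcome as a status name."""
    try:
        os.listdir(path)
    except OSError as exc:
        return errno_to_status(exc.errno)
    return errno_to_status(0)
```

A dedicated status for ENOTDIR lets callers tell "a path component is not a directory" apart from a generic file error.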
---------
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Co-authored-by: gabrpham_amdeng <Gabriel.Pham@amd.com >
[ROCm/amdsmi commit: e2692ab533 ]
2025-06-03 17:53:33 -05:00
Narlo, Joseph
ba8d2f0d84
[SWDEV-532069] Doxygen Not Picking Non-Documented Values ( #362 )
...
Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com >
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com >
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com >
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com >
Co-authored-by: Deepak Mewar <deepak.mewar@amd.com >
[ROCm/amdsmi commit: c0c4e021ea ]
2025-06-03 17:24:44 -05:00
Narlo, Joseph
4eb6d34df0
[SWDEV-532769] amd-smi APIs mismatch with documentation ( #428 )
...
* Populated socket_power to get power info
---------
Signed-off-by: josnarlo <Joseph.Narlo@amd.com >
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com >
[ROCm/amdsmi commit: ce7d6dfe61 ]
2025-06-03 17:12:13 -05:00
Bindhiya Kanangot Balakrishnan
851d0d015d
[SWDEV-534745] Generate valid json output for xgmi command
...
The amd-smi xgmi --json output was not valid JSON.
The output is now generated in valid JSON format.
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com >
[ROCm/amdsmi commit: 8f943b03e1 ]
2025-06-03 12:48:02 -05:00
Saeed, Oosman
877c7b1bda
[SWDEV-530385] show afids on each line of printout ( #422 )
...
* show afids on each line of printout
* clean up afids and cper code
---------
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com >
[ROCm/amdsmi commit: fab13c5b60 ]
2025-06-02 17:22:10 -05:00
Pham, Gabriel
3d75b7881a
[SWDEV-446039] Added Flat Process table to default output ( #425 )
...
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com >
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com >
[ROCm/amdsmi commit: 91021da055 ]
2025-06-02 17:15:15 -05:00
Kanangot Balakrishnan, Bindhiya
a3521ea6ed
[SWDEV-519061] xgmi command output shows zero for all xgmi acc read/write data in the first column ( #392 )
...
The index of the accumulated xgmi read and write data in the gpu metric
is based on the sysfs xgmi_port_num file. Mapped the two so that
read and write data are displayed by src_gpu vs. dst_gpu.
---------
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com >
[ROCm/amdsmi commit: 8ed52616ad ]
2025-06-02 14:01:06 -05:00
Justin Williams
d8b32bf2ee
[SWDEV-533596] CI - Fixed Docs
...
Signed-off-by: Justin Williams <Justin.Williams@amd.com >
[ROCm/amdsmi commit: bf0448ff96 ]
2025-06-02 13:48:01 -05:00
Joseph Narlo
3d0f98c16d
[SWDEV-522996] Syncing Unified Header and AMDSMI
...
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com >
[ROCm/amdsmi commit: ee43ec71e8 ]
2025-06-02 13:44:33 -05:00
Maisam Arif
cd11d4f051
Updated Changelog
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Change-Id: I10efa8ed10288d3445a330ad27081d1f03113b38
[ROCm/amdsmi commit: 996917e9bc ]
2025-05-30 20:48:29 -05:00
Maisam Arif
00ad72baf9
Deprecated PASID
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Change-Id: Ib008f80f3d736172079358c0ceb3ebca87340d28
[ROCm/amdsmi commit: c89b5db09d ]
2025-05-30 20:48:29 -05:00
Maisam Arif
16d60f3411
[SWDEV-488303] Fixed process list information source
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Change-Id: Iec3416cb5ca1bdd806c3225b514bbf3dbf8c0d2e
[ROCm/amdsmi commit: cebb0799cb ]
2025-05-30 20:48:29 -05:00
Maisam Arif
5324134708
Version Bump 26.0.0
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Change-Id: I29ea6fa781dfc338a60b390ff498c46b4a1efe52
[ROCm/amdsmi commit: cc4dfd834f ]
2025-05-30 20:48:29 -05:00
gabrpham_amdeng
42238ef83c
Updated CLI Tool Help
...
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com >
[ROCm/amdsmi commit: c8f33c96c3 ]
2025-05-30 20:10:32 -05:00
dependabot[bot]
2f803473e1
Bump tornado from 6.4.2 to 6.5.1 in /docs/sphinx ( #418 )
...
Bumps [tornado](https://github.com/tornadoweb/tornado) from 6.4.2 to 6.5.1.
- [Changelog](https://github.com/tornadoweb/tornado/blob/master/docs/releases.rst)
- [Commits](https://github.com/tornadoweb/tornado/compare/v6.4.2...v6.5.1)
---
updated-dependencies:
- dependency-name: tornado
  dependency-version: 6.5.1
  dependency-type: indirect
...
[ROCm/amdsmi commit: dd81cfd688 ]
2025-05-30 19:53:58 -05:00
gabrpham_amdeng
c4f8ba1178
Suppressed help text of default command
...
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com >
[ROCm/amdsmi commit: 1fa4cdacf3 ]
2025-05-30 19:53:14 -05:00
Pham, Gabriel
d229f86108
[SWDEV-511822] Added group check to default command ( #415 )
...
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com >
[ROCm/amdsmi commit: daf74d1cd6 ]
2025-05-30 18:40:18 -05:00
Kanangot Balakrishnan, Bindhiya
f12c72a4e2
[SWDEV-530633] Use gpu_metric speed and BW for xgmi ( #366 )
...
The xgmi command was showing the PCIe bit rate and bandwidth instead of the xgmi values. Corrected the API to get xgmi data from the gpu metric.
Added a Python API for amdsmi_get_link_metrics. Modified the amdsmi_link_metrics struct.
Added a check for non-zero partitions to the xgmi command.
---------
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com >
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
[ROCm/amdsmi commit: 2eff0b3764 ]
2025-05-30 16:51:11 -05:00
Castillo, Juan
c830bb4d74
[SWDEV-534728] Fixed deep_sleep status does not work with --json flag ( #413 )
...
- In json output mode, .rstrip() does not work because clk_value is a dict rather than a string.
- clk_value is now checked for being a dict instance before the value is extracted.
- If clk_value is a dict, .get() is used to extract the value.
- Otherwise it is a string, and .split() is used to extract the value.
- If clk_value is < min_clk_value, deep_sleep is set to ENABLED.
- Initialize clk_value and min_clk_value to 0 for each loop iteration.
- Restructure the if/else for better readability.
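The steps listed above can be sketched as follows. This is an illustration under stated assumptions, not the CLI's code: the dict shape `{"value": ..., "unit": ...}`, the string shape `"<number> MHz"`, the DISABLED label, and the helper names `clock_mhz` and `deep_sleep_status` are all assumed for the example, and non-numeric values such as "N/A" are not handled.

```python
def clock_mhz(clk_value):
    """Extract a numeric clock from either a JSON-style dict or a plain string."""
    if isinstance(clk_value, dict):
        # JSON mode (assumed shape): {"value": 132, "unit": "MHz"}
        return int(clk_value.get("value", 0))
    # Human-readable mode (assumed shape): "132 MHz"
    return int(clk_value.split()[0])


def deep_sleep_status(clk_value, min_clk_value):
    """Report ENABLED when the current clock is below the minimum clock."""
    clk = clock_mhz(clk_value)
    min_clk = clock_mhz(min_clk_value)
    return "ENABLED" if clk < min_clk else "DISABLED"


# Both representations give the same answer for the same made-up readings.
print(deep_sleep_status("29 MHz", "500 MHz"))
print(deep_sleep_status({"value": 29, "unit": "MHz"}, {"value": 500, "unit": "MHz"}))
```

Normalising both representations in one helper keeps the comparison itself independent of the output mode.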
---------
Signed-off-by: Juan Castillo <juan.castillo@amd.com >
[ROCm/amdsmi commit: 2e8aaf02c9 ]
2025-05-30 16:45:32 -05:00
Arif, Maisam
da430dec05
[SWDEV-488303] Adjusted process vram_mem data source ( #411 )
...
* [SWDEV-488303] Adjusted process vram_mem data source
* Standardized sscanf format strings
---------
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Co-authored-by: gabrpham_amdeng <Gabriel.Pham@amd.com >
[ROCm/amdsmi commit: 42441c78ea ]
2025-05-29 23:26:12 -05:00
Maisam Arif
b2b6779593
[SWDEV-523247] Corrected amdsmi_get_gpu_vram_usage total
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
Change-Id: I0f8bb067bf34f64d1b8d41e2a89d3a79a6745990
[ROCm/amdsmi commit: 876f3976e0 ]
2025-05-29 21:30:00 -05:00
Arif, Maisam
465f2e6a41
[SWDEV-488303] Updated CU occupancy for per-process retrieval ( #243 )
...
Change-Id: I2990597c6dd4b2e8cf3e11ce60f72049ebdd9a8c
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com >
[ROCm/amdsmi commit: 0fdaebdbaa ]
2025-05-29 20:35:27 -05:00