Commit Graph

368 Commits

Author SHA1 Message Date
SakaSitharammurthy 1c5aa2d4e7 [SWDEV-567099] Updated 'amdsmi list --cpu all' command (#2519)
Signed-off-by: Saka, Sitharam Murthy <SitharamMurthy.Saka@amd.com>
2026-01-19 14:56:59 -06:00
Mario Limonciello 838b3dccf1 Adjust amdgpu version output for amd-smi (#2563)
* Fix the amdgpu version string comparison

The intention was to avoid showing the string when it carries no
information.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>

* Display the kernel version in amd-smi output

This is an interesting debugging point, especially in the case of
not having a DKMS package installed.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Moving os_kernel_version to static --driver

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

---------

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2026-01-15 11:11:58 -08:00
systems-assistant[bot] 53c56fca5f [SWDEV-558534] AMD-SMI bad pages add flag to convert to hex (#1900)
* Simplify hex flag check for bad page info
* moved the hex help text up with the other help text

---------

Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Authored-by: Koushik Billakanti <Koushik.Billakanti@amd.com>
Co-authored-by: Koushik Billakanti <Koushik.Billakanti@amd.com>
2026-01-08 10:21:10 -06:00
systems-assistant[bot] c6b7448227 Add support for get and set APIs for CPUISOFreqPolicy and DFCState Co… (#1901)
* Add support for get and set APIs for CPUISOFreqPolicy and DFCState Control

  - Add support for get and set APIs for CPUISOFreqPolicy and DFCState Control
    in AMD SMI and also in the CLI tool

* CHANGELOG.md file updated

* SWDEV-562837: Update amdsmi-py-api.md as per the new APIs

Updated amdsmi-py-api.md as per the new APIs added.

---------

Signed-off-by: Soumya <sranjanr@amd.com>
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Co-authored-by: Saka Sitharammurthy <SitharamMurthy.Saka@amd.com>
2026-01-06 10:37:07 -06:00
Mario Limonciello 08949cb884 Run pre-commit's whitespace related hooks on projects/amdsmi (#2119)
* Run pre-commit's whitespace related hooks on projects/amdsmi

In order for pre-commit to be useful, everything needs to meet a common
baseline.

* Add whitespace back to Changelog for formatting

---------

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-12-15 13:20:47 -06:00
koushikbillakanti-amd 9e06ea8f79 [SWDEV-564696] Structure size mismatch in SOC pstate/XGMI PLPD (#2207)
* Address PR feedback: consolidate switch cases, move CSV formatting, use direct API calls for error messages
* csv output flattening changes

---------

Signed-off-by: Billakanti, Koushik <Koushik.Billakanti@amd.com>
2025-12-10 23:37:36 -06:00
Mario Limonciello 73778bf83c Adjust policy for memory display on APUs (#1967)
* Read the ids_flags when fetching GPU info

The ids_flags contains the flags that can help identify if a GPU
is a dGPU or an APU.

* Show correct memory pool for APUs

The kernel policy for APUs will be to choose the bigger pool of
memory (GTT or VRAM) for KFD work.  Adjust the policy for the monitor
and default commands to show the right memory pool when using an APU.
2025-12-09 21:49:06 -06:00
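The memory-pool policy described in commit 73778bf83c can be sketched roughly as follows. This is a minimal illustration, not the amd-smi implementation: the function name `select_memory_pool`, its parameters, the assumption that a dGPU always reports VRAM, and the tie-breaking rule (VRAM wins when both pools are the same size) are all hypothetical.

```python
def select_memory_pool(is_apu: bool, vram_total: int, gtt_total: int) -> str:
    """Return the name of the memory pool to display: "VRAM" or "GTT".

    Hypothetical helper; sizes are in bytes (any consistent unit works).
    """
    if not is_apu:
        # Assumption: discrete GPUs keep reporting dedicated VRAM.
        return "VRAM"
    # APU: pick the bigger pool, as the commit message describes.
    # Assumption: on a tie, VRAM is preferred.
    return "GTT" if gtt_total > vram_total else "VRAM"


if __name__ == "__main__":
    # An APU with a small VRAM carve-out and a large GTT pool.
    print(select_memory_pool(True, 512 * 2**20, 16 * 2**30))
```

In the real tool the dGPU/APU distinction would come from the `ids_flags` mentioned in the commit; here it is reduced to a boolean to keep the sketch self-contained.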
Mario Limonciello a08170bc75 Apu prerequisites (#1946)
* Don't require powercap support

APUs don't necessarily support setting a power cap from sysfs.
Ignore failures of the file missing.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Show edge temperature in default output if hotspot is missing

APUs don't have a hotspot temperature, they have an edge though.
Use that.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Format all "power" keys as watts

There will be more power keys when APU support is added, so format
them properly.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Don't show power limit in output if it's invalid

APUs can't set power limit using power_cap1 interface.  The limit
will be 0 and thus the UX looks weird in default output.
Only add the `/power_limit` if it's valid.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Unify sizes of `amdsmi_power_info_t`

Sizes are used inconsistently.  This causes tools to not show
N/A when they should.  Make them unified.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

---------

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-12-08 21:36:45 -06:00
Dmitrii a6183e3ca7 [amdsmi] Don't crash on node handle error (#2206)
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-12-08 11:59:20 -06:00
Maisam Arif 2feb0ae998 Fix powercap default to enum for sensor_ind (#2004)
* Fix powercap default to enum for sensor_ind

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

* [SWDEV-559965] Refactor amdsmi set power cap

Modified power cap set to accept args with
optional power_cap type. Added power_cap helper
validate_and_set_power_cap(). Fixed JSON output
format.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Co-authored-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-12-04 09:52:59 -06:00
Bindhiya Kanangot Balakrishnan a627c12501 [SWDEV-566465] Fix json output for amdsmi reset (#2043)
Fixed json output for reset command.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-12-01 13:30:32 -06:00
Adam Pryor 422253f871 Implement PTL support (#1957)
* Implement PTL support

Signed-off-by: adapryor <Adam.pryor@amd.com>
(cherry picked from commit 45bc31292e7940a3b8fca044ef7df22047b95733)

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

---------

Signed-off-by: adapryor <Adam.pryor@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-11-26 08:33:27 -06:00
systems-assistant[bot] c404fbd851 [SWDEV-560235] Add gpu_board and base_board temperatures to monitor (#1906)
* Add helpers for gpu_board and base_board temperatures
* Added gpu_board and base_board temperatures arguments for non-default monitor subcommand

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Co-authored-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-11-24 13:12:09 -06:00
Ramalingam, Muthusamy 3659db6f21 [SWDEV-560044]: [AMDSMI][CPU] Update AMDSMI as per latest ESMI Driver (#763)
[AMDSMI][CPU] Update AMDSMI as per latest ESMI Driver,
1) hsmp_acpi
2) amd_hsmp
3) hsmp_common

Signed-off-by: Muthusamy Ramalingam <muthusamy.ramalingam@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Co-authored-by: ssaka_amdeng <SitharamMurthy.Saka@amd.com>

[ROCm/amdsmi commit: b4b3539631]
2025-11-17 13:45:43 -06:00
Bindhiya Kanangot Balakrishnan 5e56281eb0 [SWDEV-566930] Fix numa CSV output
Handled numa data - including cpu and socket list, bitmask,
and affinity for csv format.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>


[ROCm/amdsmi commit: 1b027d15bd]
2025-11-13 21:52:49 -06:00
Kanangot Balakrishnan, Bindhiya 072daa28d5 [SWDEV-538483] Add NPM API's and CLI (#817)
* Added Python & C API's for new node devices. Currently these are functional for node 0 only.
 - amdsmi_get_node_handle
 - amdsmi_get_npm_info
* Added `amd-smi node` CLI for Node Power Management

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: f8e4771363]
2025-11-13 21:51:31 -06:00
Charis Poag e53645e871 Fix PPT - reset calls, unit format, and get_power_cap()
Changes:
  - Simplified reset calls
  - Updated static limit N/A values to all possible data
  (helps csv format be consistent)
  - Unit format was broken on static
  - get_power_cap() had min/max values swapped, and the return
    was missing two fields
  - Updated changelog to reflect all changes

Change-Id: I23713471b984f52085372486c6e6ff852e2f42f8
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 00a893d299]
2025-11-13 15:40:13 -06:00
gabrpham_amdeng 3527ecf8dd Fix PCIe Levels converting to csv
Change-Id: Id69b8bbc167887673a88e13eb497bdeac6dd0425
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>


[ROCm/amdsmi commit: 46e5f157f0]
2025-11-13 13:08:12 -06:00
gabrpham_amdeng 351b6f96ae Added support for configuring PPT1 power cap
- Updated python integration test to account for PPT1 support changes
  - Updated set/reset power-cap input format
  - Adjusted python API and updated C++ API test

Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
Change-Id: Ia9d02868b6e91c88c10a9772d9e2d9f37c3c352f


[ROCm/amdsmi commit: 18faddf6f3]
2025-11-13 13:08:12 -06:00
gabrpham_amdeng 4d29ff8a2d Fixed Namespace has no attribute 'pcie' error in set command
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>


[ROCm/amdsmi commit: 9739611239]
2025-10-30 15:17:48 -05:00
adapryor 5c95a1485f Fix evicted_time
[ROCm/amdsmi commit: 4abb69f9d9]
2025-10-30 14:01:44 -05:00
Bindhiya Kanangot Balakrishnan d5691b7ed9 [SWDEV-563281] Add json and csv output for xgmi status
Added json and csv output format support for newly
added xgmi link_status. Aligned legend.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>


[ROCm/amdsmi commit: 8dd4a4997b]
2025-10-29 15:25:15 -05:00
Pham, Gabriel 87b2fd73b8 Added set --pcie command and added more pcie info to static --bus output (#481)
* Added amd-smi set --pcie command
* Removed current pcie level due to it not being static
* Added pcie information to static --bus

---------

Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 9e3537d778]
2025-10-28 14:55:55 -05:00
Pryor, Adam 354886f4ff [SWDEV-357472] Add evicted_ms metric (#620)
- **Added evicted_time metric for kfd processes**.  
  - Time that queues are evicted on a GPU in milliseconds
  - Added to CLI in `amd-smi monitor -q` and `amd-smi process`
  - Added to C API and Python API:
    - amdsmi_get_gpu_process_list()
    - amdsmi_get_gpu_compute_process_info()
    - amdsmi_get_gpu_compute_process_info_by_pid()

---------

Signed-off-by: Pryor, Adam <Adam.Pryor@amd.com>

[ROCm/amdsmi commit: 2144cfbba4]
2025-10-28 14:49:03 -05:00
AL Musaffar, Yazen 2050dde73b [SWDEV-557731] Fix for amd-smi error not exiting when timeout command used (#779)
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>

[ROCm/amdsmi commit: 4bcc2ca598]
2025-10-27 13:51:22 -05:00
Kanangot Balakrishnan, Bindhiya 3924171d74 [SWDEV-542718] Correct socket_affinity (#760)
* [SWDEV-542718] Correct socket_affinity

Updated Socket affinity to show bitmask and expanded cpu list.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Update per-device local_cpulist for socket_affinity

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Added amdsmi_get_cpu_affinity_from_local_cpulist API.
Updated the wrapper.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Revert "Added amdsmi_get_cpu_affinity_from_local_cpulist API."

This reverts commit 9a2ef934b1787f8aa09d3e4efe02f897b4295215.

* Moved the changes to C API.
In case of SOCKET_SCOPE, use local_cpulist first.
If it is unavailable or not readable, fallback to
numa.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Addressed review comments

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

[ROCm/amdsmi commit: 09a97f02ed]
2025-10-22 16:20:41 -05:00
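The socket-affinity behavior in commit 3924171d74 (expand a CPU list, derive a bitmask, and fall back to NUMA data when `local_cpulist` is unavailable) can be sketched as below. This assumes the usual Linux sysfs cpulist text format such as `0-3,8-11`; the function names and the use of `None` or an empty string to model an unreadable `local_cpulist` are hypothetical and do not reflect the actual C API.

```python
from typing import List, Optional


def expand_cpulist(cpulist: str) -> List[int]:
    """Expand a cpulist string such as "0-3,8" into a sorted list of CPU ids."""
    cpus = set()
    for part in cpulist.strip().split(","):
        if not part:
            continue
        if "-" in part:
            start, end = part.split("-", 1)
            cpus.update(range(int(start), int(end) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def cpulist_to_bitmask(cpus: List[int]) -> int:
    """Build an integer bitmask with one bit set per CPU id."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return mask


def socket_affinity(local_cpulist: Optional[str], numa_cpulist: str) -> List[int]:
    """Prefer local_cpulist; fall back to the NUMA cpulist if it is missing."""
    if local_cpulist and local_cpulist.strip():
        return expand_cpulist(local_cpulist)
    return expand_cpulist(numa_cpulist)


if __name__ == "__main__":
    cpus = socket_affinity("0-3,8-9\n", "0-15")
    print(cpus, hex(cpulist_to_bitmask(cpus)))
```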
Poag, Charis ce19b921b0 [SWDEV-535159] Add support for GPU partition metrics (#490)
[SWDEV-535159] Add support for GPU partition metrics

Changes include:
  - Internal logic to smart-switch between gpu_metrics/xcp_metrics files
  - [WIP] Initial plumbing for new partition metric API

Change-Id: I4340fb1b48bac0117d80d5d486b9e871430d5cd8
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Add amdsmi_get_gpu_partition_metrics_info() + minor cleanup

Change-Id: I5d60604f18baddbd03852dc90e88aa0b8107d50e
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Fix partition metric logic + update logging/tests

Change-Id: I9e89b19ead17694c54e224f8e13ff8ee3eb2e22a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Adjust amd-smi metric/monitor/default to show (some) partition information

Change-Id: I2e8d2745876a19bdaec3c039daa97345c9f701b5
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Add C++ tests

Change-Id: Ib9eb0b57a6d7a280992e05a4c6eba632826952ef
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Remove modification of energy counter, not needed

Change-Id: I5c48eaaae248ee6dc79abba609d837ec35d78022
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[CLI] amd-smi metric: cleaned up N/A'd multi-valued to show just N/A

Changes:
1. amd-smi metric: cleaned up N/A'd multi-valued to show just N/A
ex.
JPEG_ACTIVITY: [N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A, N/A]

Now just shows: N/A

2. [Python Unit Test] Changed testname TestAmdSmiPythonBDF(unittest.TestCase) ->
 AmdSmiPythonUnitTest

Test name was confusing.

Change-Id: Ieb3b036f30002fd22362508eb9fc5d443df395ae
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Log cleanup

Change-Id: I1b1a95f1844d35bec7a7bd8cb996f87e4914c069
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Add amd-smi partition-metrics CLI + general cleanup

Change-Id: Ia91488e6cb3a4d62b4087afbddfe0b3bb9378fdc
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[1.3 metrics] Remove forwards compatibility for partition metrics

Change-Id: Iab928983e6f6f1587bc9307f6f3fa2b2696ca6f7
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Fixed violation output not showing % + general cleanup

Change-Id: Icac1b0a55b18c7628b07109ae0c377d17e0825f1
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

Clean up amdsmi_get_gpu_partition_metrics_info & amd-smi partition-metric outputs

Change-Id: I6427028b980874641e9ffb3b5d88ad493dbf9cf4
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Fix metrics not found + extra logging/formatting

Change-Id: I841a27bb2c305e97ec7579a13ac915e5be497c3a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Update license to current default

Change-Id: I0de9b8a2d5dbbeab4491097f0354ba17b0d30866
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Cleanup for review

Change-Id: I96ed25c3f2b8968eea1af24c5e5860c2b4e74e6e
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Modernize updated/new internal APIs.

Change-Id: I3c48a250eeb703709b14cb5ffa68268d8321626c
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Remove extra logging in dynamic metrics

Change-Id: Idb97547bcbe143d6fa1cb5cb278ffe4da615ce14
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Remove amd-smi partition-metric command

Change-Id: Ib83c17e5cd7e0da3798198943bddd46c296b411c
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Move new CLI updates to another PR + minor fixes

Change-Id: I3b1163eec12f9b5f7d95ee33de08e168cec1b1fe
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Allow dynamic metrics to work for gpu/xcp metrics 1.9+/1.1+

Updated some logging as well.

Change-Id: I2ed9f5a5ef8afb1520508820ca6153525f0644b4
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Allow dyn gpu/xcp metric v1.9+/v1.1+

Added tests for quick check

Change-Id: I576d6f6582a55afb08e5ac57791ce95e2fa184a2
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Update tests for larger subset of version checks

Change-Id: I3cdf4f8bb4fc6161f4c76566939f90545d0f362a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

* Fix XCP metrics in gpu/partition metric pre-v1.9/v1.1 (dynamic)

Change-Id: I4dabc1ed6bef6b86c8e7f92bf9cb5992f3966fe2
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

---------

Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/amdsmi commit: 01b4fe6614]
2025-10-20 14:43:40 -05:00
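The `amd-smi metric` cleanup described in commit ce19b921b0, where a multi-valued field made up entirely of N/A entries is shown as a single N/A, can be illustrated with a short sketch. The helper name `collapse_na` is hypothetical, and the choice to leave empty or partially populated lists untouched is an assumption.

```python
def collapse_na(value):
    """Collapse a list consisting only of "N/A" entries into one "N/A".

    Anything else (scalars, empty lists, mixed lists) is returned unchanged.
    """
    if isinstance(value, list) and value and all(v == "N/A" for v in value):
        return "N/A"
    return value


if __name__ == "__main__":
    print(collapse_na(["N/A"] * 32))      # all entries unavailable
    print(collapse_na(["N/A", 12, 7]))    # mixed list is kept as-is
```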
Bindhiya Kanangot Balakrishnan baa8fa3042 Move build_xcp_dict def to helpers.
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>


[ROCm/amdsmi commit: 4dd1c1042a]
2025-10-16 15:25:10 -05:00
Pryor, Adam 1c6147ead5 [SWDEV-558993] Fix list() groups printout (#772)
* Updated groups printing
* added parameters to check_required_groups
	* two device groups since kfd and render require the same group

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: ee1445e2cc]
2025-10-16 11:23:49 -05:00
Pryor, Adam 6758e447b6 [SWDEV-556149] Fix group checking (#746)
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Signed-off-by: Pryor, Adam <Adam.Pryor@amd.com>


[ROCm/amdsmi commit: 4518e7ee91]
2025-10-07 22:24:56 -05:00
Maisam Arif 1269ff4c0c [SWDEV-558993] Fix amd-smi list to not check for groups for bdf
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I1ff9c0e00a9188435b0ee60e57c2678121dd8e72


[ROCm/amdsmi commit: 0a45a12e7a]
2025-10-07 09:45:37 -05:00
Kanangot Balakrishnan, Bindhiya 693055ee50 [SWDEV-554046] xgmi cli redesign (#574)
Added `GPU LINK PORT STATUS` table to `amd-smi xgmi` command 
The `amd-smi xgmi -s` or `amd-smi xgmi --source-status` will show `GPU LINK PORT STATUS` table.  

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

[ROCm/amdsmi commit: 7ddd91653e]
2025-10-07 01:07:27 -05:00
Narlo, Joseph 6975b29c15 [SWDEV-539078] Add missing API definitions to python interface (#525)
Added the following API's to amdsmi_interface.py.
	amdsmi_get_cpu_handle()
	amdsmi_get_esmi_err_msg()
	amdsmi_get_gpu_event_notification()
	amdsmi_get_processor_count_from_handles()
	amdsmi_get_processor_handles_by_type()
	amdsmi_gpu_validate_ras_eeprom()
	amdsmi_init_gpu_event_notification()
	amdsmi_set_gpu_event_notification_mask()
	amdsmi_stop_gpu_event_notification()
	amdsmi_get_gpu_busy_percent()

Added additional return value to API amdsmi_get_xgmi_plpd().
	The entry policies is added to the end of the dictionary to match API definition.
	The entry plpds is marked for deprecation as it has the same information as policies.

---------

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 7decbc67a1]
2025-10-06 14:50:00 -05:00
Pryor, Adam 94e6ba68b4 [SWDEV-547088] Dynamic GPU Metrics Implementation (#692)
* Added ability to format gpu_metrics v1_9
* New gpu_metrics format from the driver should allow amd-smi to parse with future compatibility guaranteed

---------

Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
Signed-off-by: adapryor <Adam.pryor@amd.com>
Co-authored-by: Oliveira, Daniel <daniel.oliveira@amd.com>

[ROCm/amdsmi commit: 5ef0b3c34d]
2025-10-01 15:46:10 -05:00
Maisam Arif 405f34e4d1 [SWDEV-554587] Added IFWI Version and boot_firmware API
- Changed amd-smi static --vbios to accept ifwi
- Change population logic for vbios version API
- Added IFWI boot_firmware to the CLI, C++, Rust, and Python API

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I4ea504d40a43cfb011ab38fc9a664ecf12d39c8a


[ROCm/amdsmi commit: cd21b5edcc]
2025-09-23 16:05:10 -05:00
Charis Poag fb6b706559 [SWDEV-554860] Fix amd-smi monitor -qt --gpu 0 --csv
For process -
Dual CSV is required in order to print 4 separate rows.
1. Metric header + data
2. Process header + data

Change-Id: Ibb7bfb13fa95a7c43b2e3f9061ada3a6be4aa8cb
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 4fd8b88aa5]
2025-09-23 14:16:08 -05:00
Saeed, Oosman 10bfc7c056 [SWDEV-554697] CPER not properly displaying warnings for non-zero partition id's (#687)
* Get primary gpu_id for non-primary partitions.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* corrected partitions warning print logic

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I08be6c78ddd46e5316dc9d538de4908b65b21d43

* Updated patch with latest changes and modified
xgmi partition_id check.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Typo correction

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* adjusted logging

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I6d425102d8583aabbcd4d7f55c9c733428524d59

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Oosman Saeed <oossaeed@amd.com>
Co-authored-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 5398eaa6b3]
2025-09-12 16:39:56 -05:00
Pham, Gabriel e9ee0bccf2 [SWDEV-551309] Adjusted amdsmitst and reset command (#654)
* Adjusted amdsmitst and reset command to account for separation of power profile and perf level behavior
* Updated test to reset power profile to previous user setting
* Removed performance level from reset_profile_results in reset --profile command
* Updated Changelog with change to reset profile behavior

---------

Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>

[ROCm/amdsmi commit: 954d4860c1]
2025-09-09 16:11:07 -05:00
Kanangot Balakrishnan, Bindhiya e5ba10d4c2 [SWDEV-553557] Add bad_page_threshold_exceeded to RAS (#677)
Added bad_page_threshold_exceeded field to ras, which
compares retired pages count against bad page threshold.
This field displays True if retired pages exceed the
threshold, False if within threshold, or N/A if
threshold data is unavailable.

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: edaae978a2]
2025-09-09 09:15:37 -05:00
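The three-state `bad_page_threshold_exceeded` field from commit e5ba10d4c2 can be sketched as follows. The function signature is hypothetical, and representing unavailable threshold data as `None` is an assumption made for the example.

```python
from typing import Optional, Union


def bad_page_threshold_exceeded(
    retired_pages: int, threshold: Optional[int]
) -> Union[bool, str]:
    """Compare the retired page count against the bad page threshold.

    Returns "N/A" when no threshold data is available (modelled as None),
    True when the retired count exceeds the threshold, and False otherwise.
    """
    if threshold is None:
        return "N/A"
    return retired_pages > threshold


if __name__ == "__main__":
    print(bad_page_threshold_exceeded(12, 10))
    print(bad_page_threshold_exceeded(3, None))
```

A count equal to the threshold is treated as within the threshold, matching the "exceed" wording in the commit message.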
Maisam Arif 8d5335a8de [SWDEV-544299] Fix CLI prefix for amd-smi metric -G
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ic184ec824213421388356417e713d9ed5adeddeb


[ROCm/amdsmi commit: 978fad01d2]
2025-08-27 18:08:06 -05:00
Pham, Gabriel 3ef5bfef94 Added gpuboard and baseboard temperatures to amd-smi metric (#617)
* Added gpu-board and base-board temperatures to amd-smi metric
* Updated Changelog and adjusted the metric base-board/gpu-board output
* Adjusted output of metric to hide base/gpu-board when not relevant

---------

Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>

[ROCm/amdsmi commit: b13fc16d60]
2025-08-26 12:49:56 -05:00
Maisam Arif a68cd9612a [SWDEV-540665] Power cap on 1VF cli parsing fix
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I5aac8f820fd8ae1c6c1dbae3b5b9e69018c69452


[ROCm/amdsmi commit: e030f71229]
2025-08-22 15:22:44 -05:00
gabrpham_amdeng f55c41202e [SWDEV-549373] Added vbios and pldm information to version header and adjusted platform info display
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>


[ROCm/amdsmi commit: 71c8b92076]
2025-08-21 18:16:47 -05:00
Maisam Arif f732ee4e98 Fix spelling and incorrect error references
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I23e947a0cfd4f68067f9fca703574f44680163d4


[ROCm/amdsmi commit: 074c4b7a3f]
2025-08-21 12:36:43 -05:00
Pham, Gabriel 729b7beddf [SWDEV-446394] Updated error message for setting clock limit (#633)
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>

[ROCm/amdsmi commit: c0ea186d47]
2025-08-19 18:51:49 -05:00
Poag, Charis 35b4b0df38 [SWDEV-550355] Fix process + violation output when in partitions (#623)
Changes:
  - Fixes amd-smi monitor such as:
    amd-smi monitor -Vqt, amd-smi monitor -g 0 -Vqt -w 1
    amd-smi monitor -Vqt --file /tmp/test1, ...
  - Required moving around when process is called, since xcp
    information is gathered in right format expected by monitor
  - Requires process to be appended first with the gpu data -> xcp
    info to be gathered + added after 1st device

Change-Id: I76356a4610944f633a9530970fac66556d65bf11
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/amdsmi commit: 1b2edd70bd]
2025-08-19 18:50:51 -05:00
Charis Poag b239e5be60 [SWDEV-550679] Fix amd-smi monitor AttributeError
Impacts only Guest systems

Fixes following error:
$ amd-smi monitor
AttributeError: 'Namespace' object has no attribute 'violation'

Change-Id: If501819be3f8e2d2dfd75775dc776873a92465a3
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 5fe58a8e38]
2025-08-19 17:58:44 -05:00
Bindhiya Kanangot Balakrishnan 8e645a6da7 [SWDEV-547160] Fix VRAM percentage calculation
The vram_percent calculation was missing
multiplication by 100.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>


[ROCm/amdsmi commit: 41488f0c18]
2025-08-18 17:28:30 -05:00
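The fix in commit 8e645a6da7 amounts to scaling the used/total ratio by 100. A minimal sketch, with a hypothetical function name and an assumed "N/A" result for a zero or negative total (the commit message does not say how that case is handled):

```python
def vram_percent(vram_used: int, vram_total: int):
    """Return VRAM usage as a percentage, or "N/A" if the total is not positive."""
    if vram_total <= 0:
        # Assumption: avoid dividing by zero and report the value as unavailable.
        return "N/A"
    # Without the factor of 100 this would be a 0..1 fraction, not a percentage.
    return vram_used / vram_total * 100


if __name__ == "__main__":
    print(vram_percent(2, 8))
```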
Arif, Maisam 4e568b2eea [SWDEV-540665] Add power_cap set to Linux Guest (#626)
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I3c8d707681c141390b40521231e0d638c81cdeaf

[ROCm/amdsmi commit: 2d5accd000]
2025-08-18 14:59:14 -05:00
Charis Poag 7ab967ec69 Revert Major ABI break for amdsmi_get_violation_status()
Changes:
- This aligns back to original struct naming for ROCm 7.0. This removes
any Major ABI breakages for updates for 7.0 release.
- Minor ABI breakage is required since there were additions to the
header. Refer to changelog for these updates.

Change-Id: If35af74eac6beac8c267d05ce789b7761ed24bff
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: d3b73fac82]
2025-08-18 11:36:57 -05:00