* Fix the amdgpu version string comparison
The intention behind it was to avoid showing the string if it's not
got information.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
* Display the kernel version in amd-smi output
This is an interesting debugging point, especially in the case of
not having a DKMS package installed.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
* Moving os_kernel_version to static --driver
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
---------
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
* Add support for get and set APIs for CPUISOFreqPolicy and DFCState Control
- Add support for get and set APIs for CPUISOFreqPolicy and DFCState Control
in AMD SMI and also in the CLI tool
* CHANGELOG.md file updated
* SWDEV-562837: Update amdsmi-py-api.md as per the new APIs
Updated amdsmi-py-api.md as per the new APIs added.
---------
Signed-off-by: Soumya <sranjanr@amd.com>
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Co-authored-by: Saka Sitharammurthy <SitharamMurthy.Saka@amd.com>
Currently if the input file name already exists, the tool
appends output to existing file. Added overwrite, append,
or no(discard) options to choose from.
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Co-authored-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Changes:
- Simplified reset calls
- Updated static limit N/A values to all possible data
(helps csv format be consistent)
- Unit format was broken on static
- get_power_cap() had min/max values swapped, and the return
was missing two fields
- Updated changelog to reflect all changes
Change-Id: I23713471b984f52085372486c6e6ff852e2f42f8
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/amdsmi commit: 00a893d299]
- Updated python integration test to account for PPT1 support changes
- Updated set/reset power-cap input format
- Adjusted python API and updated C++ API test
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
Change-Id: Ia9d02868b6e91c88c10a9772d9e2d9f37c3c352f
[ROCm/amdsmi commit: 18faddf6f3]
* Added amd-smi set --pcie command
* Removed current pcie level due to it not being static
* Added pcie information to static --bus
---------
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
[ROCm/amdsmi commit: 9e3537d778]
- **Added evicted_time metric for kfd processes**.
- Time that queues are evicted on a GPU in milliseconds
- Added to CLI in `amd-smi monitor -q` and `amd-smi process`
- Added to C API and Python API:
- amdsmi_get_gpu_process_list()
- amdsmi_get_gpu_compute_process_info()
- amdsmi_get_gpu_compute_process_info_by_pid()
---------
Signed-off-by: Pryor, Adam <Adam.Pryor@amd.com>
[ROCm/amdsmi commit: 2144cfbba4]
Added `GPU LINK PORT STATUS` table to `amd-smi xgmi` command
The `amd-smi xgmi -s` or `amd-smi xgmi --source-status` will show `GPU LINK PORT STATUS` table.
---------
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
[ROCm/amdsmi commit: 7ddd91653e]
Added the following API's to amdsmi_interface.py.
amdsmi_get_cpu_handle()
amdsmi_get_esmi_err_msg()
amdsmi_get_gpu_event_notification()
amdsmi_get_processor_count_from_handles()
amdsmi_get_processor_handles_by_type()
amdsmi_gpu_validate_ras_eeprom()
amdsmi_init_gpu_event_notification()
amdsmi_set_gpu_event_notification_mask()
amdsmi_stop_gpu_event_notification()
amdsmi_get_gpu_busy_percent()
Added additional return value to API amdsmi_get_xgmi_plpd().
The entry policies is added to the end of the dictionary to match API definition.
The entry plpds is marked for deprecation as it has the same information as policies.
---------
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
[ROCm/amdsmi commit: 7decbc67a1]
* Used KFD to determine linking between GPUs and PIDs rather than depend on fdinfo's per pid single gpu bdf info that we were getting.
Signed-off-by: adapryor <Adam.pryor@amd.com>
---------
Signed-off-by: adapryor <Adam.pryor@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
[ROCm/amdsmi commit: c967aead58]
Changes:
- Moved `amd-smi monitor` guest fixes to 7.0.1
- [7.0.0] Provided details on updated violation output
- [7.0.0] Provided details on new set/reset error outputs
- [7.0.0] Added details on a resolved non-json format output
for `amd-smi partiton --json`
- [7.0.0] Moved known issue for `amd-smi monitor`
accidentally placed in wrong release
Change-Id: Iea745255a69d8ff88b470ca533d83ff3eef09fef
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/amdsmi commit: 06324c0dde]
- Changed amd-smi static --vbios to accept ifwi
- Change population logic for vbios version API
- Added IFWI boot_firmware to the CLI, C++, Rust, and Python API
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I4ea504d40a43cfb011ab38fc9a664ecf12d39c8a
[ROCm/amdsmi commit: cd21b5edcc]
* Changes:
- Moved `amd-smi monitor` guest fixes to 7.0.2
- [7.0.0] Provided details on updated violation output
- [7.0.0] Provided details on new set/reset error outputs
- [7.0.0] Added details on a resolved non-json format output
for `amd-smi partiton --json`
- [7.0.0] Moved known issue for `amd-smi monitor`
accidentally placed in wrong release
- Moved `amd-smi monitor` guest fixes to 7.0.2
- [7.1.0] Added power caps guest set info
- [7.1.0] Other various fixes noted
Change-Id: I374b98f32e947520fcb8a6e33e6f6fcd290b00d6
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/amdsmi commit: 94d87ea222]
* fix links to python apis
* add links to repo for example code
* fix `WARNING: Pygments lexer name is not known`
Signed-off-by: Peter Park <Peter.Park@amd.com>
[ROCm/amdsmi commit: 5d0a39fa9d]
* Adjusted amdsmitst and reset command to account for separation of power profile and perf level behavior
* Updated test to reset power profile to previous user setting
* Removed performance level from reset_profile_results in reset --profile command
* Updated Changelog with change to reset profile behavior
---------
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
[ROCm/amdsmi commit: 954d4860c1]
- **Fixed gpuboard and baseboard temperatures enums in amdsmi Python Library**.
- AmdSmiTemperatureType had issues with referencing the right attribute, so we removed the following duplicate enums:
- `AmdSmiTemperatureType.GPUBOARD_NODE_FIRST`
- `AmdSmiTemperatureType.GPUBOARD_VR_FIRST`
- `AmdSmiTemperatureType.BASEBOARD_FIRST`
Change-Id: Ia61446b593bd9182d597c4b4c2ac3c5ffdae7493
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
[ROCm/amdsmi commit: 286c421a49]
* Added gpu-board and base-board temperatures to amd-smi metric
* Updated Changelog and adjusted the metric base-board/gpu-board output
* Adjusted output of metric to hide base/gpu-board when not relevant
---------
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
[ROCm/amdsmi commit: b13fc16d60]
Changes:
- Update violation status logic and metric naming for XCP/XCC metrics (thrm/thm consistency)
- Added XCP identifier in monitor to allow partition metrics to be shown with applicable APIs
(Violation Status is the first example of this in monitor)
- Improve CLI monitor output:
support multiple GPU lines per GPU, add new columns, and better formatting
- Refactor helpers and logger for flexible unit formatting and table rendering
- Add examples for amdsmi_get_gpu_pm_metrics_info()/amdsmi_get_gpu_reg_table_info()
new metrics APIs in C++ example
- Sync Python/C++ interface and structures for new metrics fields and naming
- Remove deprecated/unused RSMI activity APIs, documentation not needed since
the APIs no longer exist in ROCm SMI either.
- Cleanup metric violations + fix handle watch arguments
- Provide better handling/doc for average_flattened_ints()
- Group xcp metrics with brackets in human readable + adjust output size
Signed-off-by: Poag, Charis <Charis.Poag@amd.com>
[ROCm/amdsmi commit: e2e4fc65c1]
Description:
- Added a new API `amdsmi_gpu_driver_reload()` to reload the AMD GPU driver independently.
- Updated CLI (`sudo amd-smi reset -r`) and Python bindings to support driver reload functionality.
- Removed automatic driver reload from `amdsmi_set_gpu_memory_partition()` and `amdsmi_set_gpu_memory_partition_mode()`.
- Enhanced CLI and test cases to allow users to control when the driver reload occurs.
- Updated documentation and changelog to reflect the new driver reload process.
- Improved error handling and logging for driver reload operations.
- Added progress bar and user confirmation prompts for driver reload commands.
* Update build/test strategy to only allow one test execution at a time
* Modify API verbage + modify systemctl error output
- Systemctl is typically not enabled on docker.
- And is an edge case for gpu being active process/etc for display devices.
* Remove AMDSMI_STATUS_AMDGPU_RESTART_ERR from the return values
* Move driver reload to after we save original compute partitions
---------
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/amdsmi commit: d24dc7ef89]
* Disabled violation argument for monitor on guests as it is supported on BM only.
* Added `-v` and `--violation` args to metric along with `throttle` due to legacy behavior.
* Supressed metric throttle arg and do not show in help text
---------
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
[ROCm/amdsmi commit: f6b854b4ed]
Updates:
- Separate extra APIs calls from amd-smi CLI to target specific CLI commands that need them.
- Remove extra current_compute_partition SYSFS calls from amd-smi static.
- Remove the partition information from the default `amd-smi static` CLI command.
- Users must now use the `-p` argument to view partition information with `amd-smi static`.
- The help text for the `partition` argument has been updated to reflect this change.
- The partition information can still be accessed using the `amd-smi partition -c -m` or `sudo amd-smi partition -a` commands.
---------
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/amdsmi commit: 88473b7fd0]
* Add the API and CLI to show the board voltage.
---------
Change-Id: Icb25bd653bb1d004704b5a21b378ca31b2b242c7
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
[ROCm/amdsmi commit: 970560fc7c]