Added the following APIs to amdsmi_interface.py:
amdsmi_get_cpu_handle()
amdsmi_get_esmi_err_msg()
amdsmi_get_gpu_event_notification()
amdsmi_get_processor_count_from_handles()
amdsmi_get_processor_handles_by_type()
amdsmi_gpu_validate_ras_eeprom()
amdsmi_init_gpu_event_notification()
amdsmi_set_gpu_event_notification_mask()
amdsmi_stop_gpu_event_notification()
amdsmi_get_gpu_busy_percent()
Added an additional return value to the amdsmi_get_xgmi_plpd() API.
The policies entry is added to the end of the dictionary to match the API definition.
The plpds entry is marked for deprecation because it holds the same information as policies.
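A minimal sketch of how a caller might read the new entry while staying compatible with older builds. It works on a plain dictionary and does not call the library; only the key names policies and plpds come from the note above, and the helper name and the shape of the list items are assumptions made for illustration.

```python
def read_xgmi_policies(plpd_info):
    """Return the policy list from an amdsmi_get_xgmi_plpd()-style dictionary.

    Prefers the new 'policies' entry and falls back to the deprecated
    'plpds' entry. The helper name is hypothetical, not part of the library.
    """
    if "policies" in plpd_info:
        return plpd_info["policies"]
    # Older builds only expose the deprecated entry.
    return plpd_info.get("plpds", [])


if __name__ == "__main__":
    # Illustrative data only; the real item layout is defined by the API.
    sample = {"plpds": [{"policy_id": 0}], "policies": [{"policy_id": 0}]}
    print(read_xgmi_policies(sample))
```

Reading policies first means the caller keeps working once plpds is eventually removed.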
---------
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
[ROCm/amdsmi commit: 7decbc67a1]
* Used KFD to determine the links between GPUs and PIDs rather than depending on fdinfo, which only gave us a single GPU BDF per PID.
Signed-off-by: adapryor <Adam.pryor@amd.com>
---------
Signed-off-by: adapryor <Adam.pryor@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
[ROCm/amdsmi commit: c967aead58]
* Added ability to format gpu_metrics v1_9
* New gpu_metrics format from the driver should allow amd-smi to parse with future compatibility guaranteed
---------
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
Signed-off-by: adapryor <Adam.pryor@amd.com>
Co-authored-by: Oliveira, Daniel <daniel.oliveira@amd.com>
[ROCm/amdsmi commit: 5ef0b3c34d]
* conf: update RTD config to ub24.04 (doxygen 1.9.8) and py3.12
* update generate-docs workflow
* Update "modules" to "topics" due to Doxygen 1.9.8
* bump rocm-docs-core to 1.25.0 and pip-compile requirements.txt
* doxygen: fill in version string in Doxyfile from conf.py
* remove unneeded rocm-smi-lib tutorials
* remove wikipedia references in doxyfile to satisfy ci check
Signed-off-by: Park, Peter <Peter.Park@amd.com>
[ROCm/amdsmi commit: 311eade5b1]
This unbreaks having sources on one mount point and builds at another.
Signed-off-by: Marius Brehler <marius.brehler@amd.com>
Change-Id: I68363112382a95baaa867cad91e09bdec2b30d90
[ROCm/amdsmi commit: bd3579a1ac]
Having the SOVERSION derived from the git tags doesn't scale well
for distributions that don't have the git history while building
(such as a tarball).
As part of 8b96ee5 the strings are parsed from a header. Re-use
those.
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
[ROCm/amdsmi commit: ccfdb65b6f]
Added back the temp-type map initialization to
RSMI_TEMP_TYPE_INVALID before probing hwmon files. This
prevents std::out_of_range for unsupported or absent
temperature sensor types.
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
[ROCm/amdsmi commit: 3e7e4ab1ac]
Changes:
- Moved `amd-smi monitor` guest fixes to 7.0.1
- [7.0.0] Provided details on updated violation output
- [7.0.0] Provided details on new set/reset error outputs
- [7.0.0] Added details on a resolved non-JSON format output
for `amd-smi partition --json`
- [7.0.0] Moved known issue for `amd-smi monitor` that was
accidentally placed in the wrong release
Change-Id: Iea745255a69d8ff88b470ca533d83ff3eef09fef
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/amdsmi commit: 06324c0dde]
- Changed amd-smi static --vbios to accept ifwi
- Changed population logic for vbios version API
- Added IFWI boot_firmware to the CLI and to the C++, Rust, and Python APIs
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I4ea504d40a43cfb011ab38fc9a664ecf12d39c8a
[ROCm/amdsmi commit: cd21b5edcc]
For process -
Dual CSV is required in order to print 4 separate rows.
1. Metric header + data
2. Process header + data
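The four-row layout described above can be sketched as follows. This is an illustration using the standard csv module, not the CLI's actual formatter; the function name and the sample field names are assumptions.

```python
import csv
import io


def format_dual_csv(metrics, process):
    """Emit two CSV blocks back to back: a metric header and data row,
    then a process header and data row (4 rows in total)."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(metrics.keys())    # row 1: metric header
    writer.writerow(metrics.values())  # row 2: metric data
    writer.writerow(process.keys())    # row 3: process header
    writer.writerow(process.values())  # row 4: process data
    return buf.getvalue()


if __name__ == "__main__":
    # Hypothetical field names, chosen only to show the layout.
    print(format_dual_csv({"gpu": 0, "power": 100}, {"pid": 1234, "name": "app"}))
```

Keeping the two blocks separate avoids forcing metric and process columns into one shared header.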
Change-Id: Ibb7bfb13fa95a7c43b2e3f9061ada3a6be4aa8cb
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/amdsmi commit: 4fd8b88aa5]
* Changes:
- Moved `amd-smi monitor` guest fixes to 7.0.2
- [7.0.0] Provided details on updated violation output
- [7.0.0] Provided details on new set/reset error outputs
- [7.0.0] Added details on a resolved non-JSON format output
for `amd-smi partition --json`
- [7.0.0] Moved known issue for `amd-smi monitor` that was
accidentally placed in the wrong release
- [7.1.0] Added power caps guest set info
- [7.1.0] Other various fixes noted
Change-Id: I374b98f32e947520fcb8a6e33e6f6fcd290b00d6
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/amdsmi commit: 94d87ea222]
* fix links to python apis
* add links to repo for example code
* fix `WARNING: Pygments lexer name is not known`
Signed-off-by: Peter Park <Peter.Park@amd.com>
[ROCm/amdsmi commit: 5d0a39fa9d]
Increased AMDSMI_MAX_DEVICES to 64 to accommodate all
devices in CPX mode. The link type in amd-smi has been
modified to match the rocm-smi types, and the drm tests
were updated accordingly.
---------
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
[ROCm/amdsmi commit: 6715c5aa92]
* [SWDEV-531904] Unit and Integ Test Updates
Updated: unit_tests.py
- Removed redundant self.setUp() and self.tearDown() calls.
- Removed test_free_name_value_pairs() since it is internal only.
Updated: integration_test.py
- Added logic to set AMDSMI_CLI_PATH from environment or default.
- Raise FileNotFoundError if path does not exist.
- Append CLI path to sys.path and handle ImportError with a clear message.
- Removed redundant @handle_exceptions function decorator.
- Removed redundant self.setUp() and self.tearDown() calls.
Updated: amdsmi_interface.py
- Removed POINTER conversion in amdsmi_get_gpu_pm_metrics_info() and amdsmi_get_gpu_reg_table_info()
All tests pass/skip
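The AMDSMI_CLI_PATH handling listed for integration_test.py can be sketched roughly as below. This is not the test file's code: the default path, the helper names, and the error messages are assumptions; only the environment variable name and the two exception types come from the notes above.

```python
import importlib
import os
import sys

# Assumed default location; the real default is defined in integration_test.py.
DEFAULT_CLI_PATH = "/opt/rocm/libexec/amdsmi_cli"


def resolve_cli_path(env=None, default=DEFAULT_CLI_PATH):
    """Pick the CLI path from the environment or the default, verify that it
    exists, and append it to sys.path so CLI modules can be imported."""
    env = os.environ if env is None else env
    path = env.get("AMDSMI_CLI_PATH", default)
    if not os.path.exists(path):
        raise FileNotFoundError(f"AMDSMI_CLI_PATH does not exist: {path}")
    if path not in sys.path:
        sys.path.append(path)
    return path


def import_cli_module(name):
    """Import a module by name, re-raising ImportError with a clearer message."""
    try:
        return importlib.import_module(name)
    except ImportError as exc:
        raise ImportError(
            f"Could not import {name!r}; check that AMDSMI_CLI_PATH is correct"
        ) from exc
```

Failing early with FileNotFoundError separates a bad path from a genuinely missing module, which would otherwise both surface as an import failure.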
Signed-off-by: Juan Castillo <juan.castillo@amd.com>
* Update tests/python_unittest/integration_test.py
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Castillo, Juan <Juan.Castillo@amd.com>
* Review Update 1
Modified: integration_test.py
- Added logic to properly loop through firmware list and display each name and version
Signed-off-by: Juan Castillo <juan.castillo@amd.com>
* Skip xgmi_err tests + improve running output
Changes:
1. Now check for elevated permissions
2. Skip xgmi_error related SYSFS tests, refer to xgmi_read_write.cc
(both are skipped)
3. Added a list of tests and a summary of the additional
output provided
Change-Id: Iefc85c270faad89c625e2bd7af397d24faed2437
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
---------
Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Signed-off-by: Castillo, Juan <Juan.Castillo@amd.com>
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Co-authored-by: Juan Castillo <juan.castillo@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>
[ROCm/amdsmi commit: 67eb541c15]
* Adjusted amdsmitst and reset command to account for separation of power profile and perf level behavior
* Updated test to reset power profile to previous user setting
* Removed performance level from reset_profile_results in reset --profile command
* Updated Changelog with change to reset profile behavior
---------
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
[ROCm/amdsmi commit: 954d4860c1]
Previously, the function iterated through all enum
values (0-250). This fix reduces the number of hwmon operations
by calling add_temp_sensor_entry only for temperature types
that fall within the defined enum ranges.
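The idea can be illustrated with a short sketch. It is written in Python for brevity, whereas the actual change is in the C++ library; the range values are placeholders, and the callback merely stands in for add_temp_sensor_entry.

```python
# Placeholder (first, last) ranges; the real ones come from the enum definition.
DEFINED_TEMP_TYPE_RANGES = [(0, 7), (100, 103)]


def defined_temp_types(ranges=DEFINED_TEMP_TYPE_RANGES):
    """Yield only the enum values that fall inside a defined range."""
    for first, last in ranges:
        yield from range(first, last + 1)


def probe_sensors(add_temp_sensor_entry, ranges=DEFINED_TEMP_TYPE_RANGES):
    """Call the probe once per defined temperature type and return the number
    of calls, instead of walking every value from 0 to 250."""
    calls = 0
    for temp_type in defined_temp_types(ranges):
        add_temp_sensor_entry(temp_type)
        calls += 1
    return calls
```

With the placeholder ranges above, the probe runs 12 times rather than once for every value in 0-250.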
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
[ROCm/amdsmi commit: 17ffe5a1bd]
Added bad_page_threshold_exceeded field to ras, which
compares retired pages count against bad page threshold.
This field displays True if retired pages exceed the
threshold, False if within threshold, or N/A if
threshold data is unavailable.
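A minimal sketch of the three-way result described above. The function signature is an assumption, and treating a missing retired-page count the same as a missing threshold is also an assumption; the source only mentions unavailable threshold data.

```python
def bad_page_threshold_exceeded(retired_pages, threshold):
    """Return True if retired pages exceed the threshold, False if they are
    within it, or "N/A" when the data needed for the comparison is missing."""
    if retired_pages is None or threshold is None:
        # Assumption: any missing input is reported as N/A.
        return "N/A"
    return retired_pages > threshold


if __name__ == "__main__":
    print(bad_page_threshold_exceeded(5, 3))
    print(bad_page_threshold_exceeded(1, None))
```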
---------
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>
[ROCm/amdsmi commit: edaae978a2]
amdsmi/tests/amd_smi_test/functional/memorypartition_read_write.cc:453:32: warning: the address of ‘orig_memory_partition’ will never be NULL [-Waddress]
453 | if ((orig_memory_partition == nullptr) ||
| ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
[ROCm/amdsmi commit: 66eb189396]
warning: the address of ‘amdsmi_asic_info_t::vendor_name’ will never be NULL [-Waddress]
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
[ROCm/amdsmi commit: 4a863b27ab]