Commit graph

1769 commits

Author SHA1 Message Date
Park, Peter 83bbbd55a3 docs: Update Doxygen, Sphinx, and readthedocs configs (#719)
* conf: update RTD config to ub24.04 (doxygen 1.9.8) and py3.12
* update generate-docs workflow
* Update "modules" to "topics" due to Doxygen 1.9.8
* bump rocm-docs-core to 1.25.0 and pip-compile requirements.txt
* doxygen: fill in version string in Doxyfile from conf.py
* remove unneeded rocm-smi-lib tutorials
* remove wikipedia references in doxyfile to satisfy ci check

Signed-off-by: Park, Peter <Peter.Park@amd.com>

[ROCm/amdsmi commit: 311eade5b1]
2025-09-26 17:30:48 -05:00
Maisam Arif 0d00e3c5ab [SWDEV-538483] Fix amdsmi.h doc tag for amdsmi_set_power_cap
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I919eba1745990fd05ca3ff1077317e7b1244fded


[ROCm/amdsmi commit: e61eac1368]
2025-09-26 13:28:43 -05:00
Maisam Arif 28fbf0d74f Create symbolic links instead of hard links
This unbreaks having sources on one mount point and builds at another.

Signed-off-by: Marius Brehler <marius.brehler@amd.com>
Change-Id: I68363112382a95baaa867cad91e09bdec2b30d90


[ROCm/amdsmi commit: bd3579a1ac]
2025-09-26 12:17:06 -05:00
Maisam Arif 4fcf92ee14 Removed unused version config files
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I3b00a8c302615026422f6d5d602959989ee0418e


[ROCm/amdsmi commit: 843dfaeed2]
2025-09-25 18:19:14 -05:00
Mario Limonciello ef8882b4bf Set the SOVERSION in CMake from MAJOR/MINOR/RELEASE variables
Having the SOVERSION derived from the git tags doesn't scale well
for distributions that don't have the git history while building
(such as a tarball).

As part of 8b96ee5 the strings are parsed from a header.  Re-use
those.

Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>


[ROCm/amdsmi commit: ccfdb65b6f]
2025-09-25 18:19:14 -05:00
Maisam Arif f20ecc8512 Changed amd_smi_drm.cc to depend on dynamic libdrm
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ie8a794578da1a3ad8893d436e54bbfb67857a7ae


[ROCm/amdsmi commit: df87246b40]
2025-09-25 17:40:05 -05:00
Bindhiya Kanangot Balakrishnan f4b921c5f5 [SWDEV-556005 & SWDEV-556853] Initialize temp-type map
Added back the temp-type map initialization to
RSMI_TEMP_TYPE_INVALID before probing hwmon files. This
prevents std::out_of_range for unsupported or absent
temperature sensor types.
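
The initialize-before-probe pattern described above can be sketched in Python. This is only an analogue: the real change is in C++, where `std::map::at` throws `std::out_of_range` for a missing key, and the Python counterpart is `KeyError`. The `TempType` members, `TEMP_TYPE_INVALID`, and `build_temp_index_map` are hypothetical names for illustration, not taken from the library.

```python
from enum import IntEnum


class TempType(IntEnum):
    """Hypothetical sensor types; the real enum lives in the C++ headers."""
    EDGE = 0
    JUNCTION = 1
    MEMORY = 2


# Stand-in for RSMI_TEMP_TYPE_INVALID (the actual value is an assumption).
TEMP_TYPE_INVALID = -1


def build_temp_index_map(probe_results):
    """Return a map with an entry for every sensor type.

    Every key is first set to the invalid sentinel, so a later lookup of an
    unsupported or absent sensor type yields the sentinel instead of raising.
    """
    index_map = {temp_type: TEMP_TYPE_INVALID for temp_type in TempType}
    # Overwrite only the entries that probing actually found.
    for temp_type, hwmon_index in probe_results.items():
        index_map[temp_type] = hwmon_index
    return index_map


# Only the EDGE sensor was found while probing in this example.
index_map = build_temp_index_map({TempType.EDGE: 1})
```

With this map, looking up `TempType.MEMORY` returns the sentinel, which the caller can test for explicitly.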

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>


[ROCm/amdsmi commit: 3e7e4ab1ac]
2025-09-25 12:03:35 -05:00
AL Musaffar, Yazen a2b299d2a0 [AMD-SMI] [SWDEV-551318] amdsmi_get_afids_from_cper python api Docs Updated (#709)
* Fix formatting and examples for amdsmi_get_afids_from_cper CPER records in documentation

---------
Change-Id: Ib5e268dc818adc633541652a0eb982641385bf7d
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>

[ROCm/amdsmi commit: 6550c51b35]
2025-09-24 21:06:38 -05:00
Maisam Arif 0caec33dc5 Change libdrm.so.2 references to dynamic libdrm naming
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ie02c91a3a210ab7612fec670b2aad66d476d2cf3


[ROCm/amdsmi commit: 9f22e59c52]
2025-09-24 20:44:03 -05:00
Stella Laurenzo 7412d14fed Fix delay loading of drm by soname.
[ROCm/amdsmi commit: 4d5d24d1c6]
2025-09-24 20:44:03 -05:00
Stella Laurenzo e16c125f20 Add rt dep back
[ROCm/amdsmi commit: 62e4329559]
2025-09-24 20:44:03 -05:00
Stella Laurenzo 060293c7fc [cmake] Fix dependencies.
* Use CMAKE_DL_LIBS instead of hard-coded `dl`.
* Use Threads::Threads instead of `pthread`.
* Drop `rt` dep.
* Find libdrm via pkgconfig (consistent with how other ROCm projects do it, as documented here: https://github.com/ROCm/TheRock/blob/main/docs/development/dependencies.md#libdrm)


[ROCm/amdsmi commit: 4e6731a817]
2025-09-24 20:44:03 -05:00
Narlo, Joseph 4d76a0088f [SWDEV-554880] Sync Unified and Linux Header (#686)
Sync Unified and Linux Header

---------

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>

[ROCm/amdsmi commit: 3c8fd1bf54]
2025-09-23 16:56:32 -05:00
Charis Poag 2a74a4519e Changelog updates for ROCm 7.0 and 7.0.1
Changes:
- Moved `amd-smi monitor` guest fixes to 7.0.1
- [7.0.0] Provided details on updated violation output
- [7.0.0] Provided details on new set/reset error outputs
- [7.0.0] Added details on a resolved non-json format output
  for `amd-smi partition --json`
- [7.0.0] Moved known issue for `amd-smi monitor`
  accidentally placed in wrong release

Change-Id: Iea745255a69d8ff88b470ca533d83ff3eef09fef
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 06324c0dde]
2025-09-23 16:05:10 -05:00
Maisam Arif 5fa2108491 Rust fixes
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Iff93cf4c53513df5aea95c970400598320c0f6c9


[ROCm/amdsmi commit: 51216187e2]
2025-09-23 16:05:10 -05:00
Maisam Arif 405f34e4d1 [SWDEV-554587] Added IFWI Version and boot_firmware API
- Changed amd-smi static --vbios to accept ifwi
- Change population logic for vbios version API
- Added IFWI boot_firmware to the CLI, C++, Rust, and Python API

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I4ea504d40a43cfb011ab38fc9a664ecf12d39c8a


[ROCm/amdsmi commit: cd21b5edcc]
2025-09-23 16:05:10 -05:00
Maisam Arif 6705bc8a77 Version Bump 26.1.0
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I2ca5acf58741fa4c64476615371b400b080e17e8


[ROCm/amdsmi commit: c708a7e11f]
2025-09-23 16:05:10 -05:00
Charis Poag fb6b706559 [SWDEV-554860] Fix amd-smi monitor -qt --gpu 0 --csv
For process output, dual CSV is required in order to print
4 separate rows:
1. Metric header + data
2. Process header + data
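
The four-row layout listed above could look roughly like this with Python's standard `csv` module. It is a sketch under assumptions: the column names and the `write_dual_csv` helper are hypothetical and do not reflect the CLI's actual fields or implementation.

```python
import csv
import io


def write_dual_csv(metric_row, process_row):
    """Emit two CSV tables back to back: metric header + data, then process header + data."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    # Table 1: metric header and its single data row.
    writer.writerow(metric_row.keys())
    writer.writerow(metric_row.values())
    # Table 2: process header and its single data row.
    writer.writerow(process_row.keys())
    writer.writerow(process_row.values())
    return buffer.getvalue()


# Hypothetical column names and values, for illustration only.
output = write_dual_csv(
    {"gpu": 0, "power_usage": 120},
    {"pid": 4321, "name": "example"},
)
```

Because the two tables have different columns, each needs its own header row, which is why a single CSV table is not enough here.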

Change-Id: Ibb7bfb13fa95a7c43b2e3f9061ada3a6be4aa8cb
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 4fd8b88aa5]
2025-09-23 14:16:08 -05:00
yalmusaf_amdeng dd518c61d2 --cper-file case sensitivity fix
[ROCm/amdsmi commit: d6f8c0437e]
2025-09-19 16:28:00 -05:00
Justin Williams 66f6455881 Updated Container Test Cases
Signed-off-by: Justin Williams <Justin.Williams@amd.com>


[ROCm/amdsmi commit: 74b35ddfd9]
2025-09-19 11:07:18 -05:00
Charis Poag 4639f53df5 Changelog updates for ROCm 7.0 / 7.0.2 / 7.1.0
* Changes:
  - Moved `amd-smi monitor` guest fixes to 7.0.2
  - [7.0.0] Provided details on updated violation output
  - [7.0.0] Provided details on new set/reset error outputs
  - [7.0.0] Added details on a resolved non-json format output
    for `amd-smi partition --json`
  - [7.0.0] Moved known issue for `amd-smi monitor`
    accidentally placed in wrong release
  - [7.1.0] Added power caps guest set info
  - [7.1.0] Other various fixes noted

Change-Id: I374b98f32e947520fcb8a6e33e6f6fcd290b00d6
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 94d87ea222]
2025-09-19 11:06:31 -05:00
Park, Peter 26841f86c0 docs: Fix links to API usage examples (#701)
* fix links to python apis
* add links to repo for example code
* fix `WARNING: Pygments lexer name is not known`

Signed-off-by: Peter Park <Peter.Park@amd.com>

[ROCm/amdsmi commit: 5d0a39fa9d]
2025-09-19 10:07:38 -05:00
Kanangot Balakrishnan, Bindhiya e0995ce7a0 [SWDEV-534605] Increase max devices supported and drm test link type (#625)
Increased AMDSMI_MAX_DEVICES to 64 to accommodate all
devices in CPX mode. The link type in amd-smi has been
modified to match the rocm-smi types, and the drm tests
were updated accordingly.

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

[ROCm/amdsmi commit: 6715c5aa92]
2025-09-17 16:30:04 -05:00
Mario Limonciello e9fdf65aa2 Fix compilation with gcc 15
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>


[ROCm/amdsmi commit: 902667db3c]
2025-09-17 16:29:38 -05:00
Williams, Justin 587b844c2f CI - Added New ABI Labeling Logic (#695)
* CI - Added New ABI Labeling Logic

Signed-off-by: Justin Williams <Justin.Williams@amd.com>

[ROCm/amdsmi commit: 2a1f9a6e4a]
2025-09-17 16:29:22 -05:00
Saeed, Oosman d3fbbb4d36 Python_Cli_Examples (#696)
* Adjusted Python CLI examples

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: ea225b459b]
2025-09-16 18:53:07 -05:00
Saeed, Oosman 10bfc7c056 [SWDEV-554697] CPER not properly displaying warnings for non-zero partition IDs (#687)
* Get primary gpu_id for non-primary partitions.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* corrected partitions warning print logic

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I08be6c78ddd46e5316dc9d538de4908b65b21d43

* Updated patch with latest changes and modified
xgmi partition_id check.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* Typo correction

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

* adjusted logging

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I6d425102d8583aabbcd4d7f55c9c733428524d59

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Oosman Saeed <oossaeed@amd.com>
Co-authored-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 5398eaa6b3]
2025-09-12 16:39:56 -05:00
3049ac537468bd90fe07f2cbb3d7a83e_amdeng 85bcf06edd [SWDEV-531904] Unit and Integ Test Updates (#563)
* [SWDEV-531904] Unit and Integ Test Updates
Updated: unit_tests.py
- Removed redundant self.setUp() and self.tearDown() calls.
- Removed test_free_name_value_pairs() since it is internal only.
Updated: integration_test.py
- Added logic to set AMDSMI_CLI_PATH from environment or default.
- Raise FileNotFoundError if path does not exist.
- Append CLI path to sys.path and handle ImportError with a clear message.
- Removed redundant @handle_exceptions function decorator.
- Removed redundant self.setUp() and self.tearDown() calls.
Updated: amdsmi_interface.py
- Removed POINTER conversion in amdsmi_get_gpu_pm_metrics_info() and amdsmi_get_gpu_reg_table_info()
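
The integration_test.py path handling listed above might be sketched as follows. Only the `AMDSMI_CLI_PATH` variable name comes from the commit message. The default path, the function names, and the error messages are assumptions for illustration.

```python
import importlib
import os
import sys

# Hypothetical default; the real default used by the test is not given in the log.
DEFAULT_CLI_PATH = "/opt/rocm/libexec/amdsmi_cli"


def resolve_cli_path(environ=os.environ, default=DEFAULT_CLI_PATH):
    """Pick the CLI path from the environment or fall back to a default."""
    cli_path = environ.get("AMDSMI_CLI_PATH", default)
    if not os.path.exists(cli_path):
        raise FileNotFoundError(f"AMD SMI CLI path does not exist: {cli_path}")
    # Make modules in the CLI directory importable.
    if cli_path not in sys.path:
        sys.path.append(cli_path)
    return cli_path


def import_cli_module(module_name):
    """Import a module from the CLI directory, with a clearer error on failure."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"Could not import {module_name!r}; check that AMDSMI_CLI_PATH is correct"
        ) from exc
```

Failing early with `FileNotFoundError` separates a bad path from a genuinely missing module, which would otherwise both surface as an `ImportError`.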

All tests pass/skip

Signed-off-by: Juan Castillo <juan.castillo@amd.com>

* Update tests/python_unittest/integration_test.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Castillo, Juan <Juan.Castillo@amd.com>

* Review Update 1
Modified: integration_test.py
- Added logic to properly loop through firmware list and display each name and version

Signed-off-by: Juan Castillo <juan.castillo@amd.com>

* Skip xgmi_err tests + improve running output

Changes:
1. Now check for elevated permissions
2. Skip xgmi_error related SYSFS tests, refer to xgmi_read_write.cc
   (both are skipped)
3. Added a list of tests and a summary of the additional
   output provided

Change-Id: Iefc85c270faad89c625e2bd7af397d24faed2437
Signed-off-by: Charis Poag <Charis.Poag@amd.com>

---------

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Signed-off-by: Castillo, Juan <Juan.Castillo@amd.com>
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Co-authored-by: Juan Castillo <juan.castillo@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/amdsmi commit: 67eb541c15]
2025-09-11 16:39:31 -05:00
Pryor, Adam 0a2231deb7 Fix groups failing inside container (#684)
* Fix groups failing inside container

---------

Signed-off-by: adapryor <Adam.pryor@amd.com>

[ROCm/amdsmi commit: 5ebd7b8022]
2025-09-10 15:36:26 -05:00
Pham, Gabriel e9ee0bccf2 [SWDEV-551309] Adjusted amdsmitst and reset command (#654)
* Adjusted amdsmitst and reset command to account for separation of power profile and perf level behavior
* Updated test to reset power profile to previous user setting
* Removed performance level from reset_profile_results in reset --profile command
* Updated Changelog with change to reset profile behavior

---------

Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>

[ROCm/amdsmi commit: 954d4860c1]
2025-09-09 16:11:07 -05:00
Arif, Maisam 1a36f2ad0b [SWDEV-550075] Updated README to link to amd-smi virtualization repo (#664)
Co-authored-by: Peter Park <peter.park@amd.com>

[ROCm/amdsmi commit: fd5eb4e963]
2025-09-09 16:05:01 -05:00
Bindhiya Kanangot Balakrishnan 9d0ce8ba42 [SWDEV-414304] Reduce excessive hwmon operations
Previously, the function was iterating through all enum
values (0-250). This fix reduces the number of hwmon operations
by calling add_temp_sensor_entry only for temperature types
that fall within the defined enum ranges.
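
The saving can be illustrated with a small Python sketch that counts probe calls. It is an analogue of the C++ change, not the actual code: the `TempType` members and both helper functions are hypothetical, and the `probe` callback stands in for add_temp_sensor_entry.

```python
from enum import IntEnum


class TempType(IntEnum):
    """Hypothetical subset of temperature types."""
    EDGE = 0
    JUNCTION = 1
    MEMORY = 2
    HBM_0 = 3


def probe_all_values(probe, upper=250):
    """Old approach: call probe for every integer from 0 to upper inclusive."""
    for value in range(upper + 1):
        probe(value)


def probe_defined_types(probe):
    """New approach: call probe only for values the enum actually defines."""
    for temp_type in TempType:
        probe(int(temp_type))


# Record which values each approach probes.
calls_old = []
calls_new = []
probe_all_values(calls_old.append)
probe_defined_types(calls_new.append)
```

Here the old loop makes 251 calls while the new one makes 4, one per defined type.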

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>


[ROCm/amdsmi commit: 17ffe5a1bd]
2025-09-09 10:30:51 -05:00
Park, Peter 0f75c19e4d [SWDEV-551318] Add doc about RAS / CPER (#636)
* add doc about ras/cper
* add sample code examples for CPER and AFID
---------

Signed-off-by: Park, Peter <Peter.Park@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Co-authored-by: Oosman Saeed <oossaeed@amd.com>

[ROCm/amdsmi commit: 5e92adc5b3]
2025-09-09 10:27:15 -05:00
Kanangot Balakrishnan, Bindhiya e5ba10d4c2 [SWDEV-553557] Add bad_page_threshold_exceeded to RAS (#677)
Added bad_page_threshold_exceeded field to ras, which
compares retired pages count against bad page threshold.
This field displays True if retired pages exceed the
threshold, False if within threshold, or N/A if
threshold data is unavailable.
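
The three-state result described above can be sketched as a small Python function. The function name and signature are hypothetical. Treating a missing retired-page count the same as a missing threshold is also an assumption, since the message only mentions unavailable threshold data.

```python
from typing import Optional, Union


def bad_page_threshold_exceeded(
    retired_pages: Optional[int], threshold: Optional[int]
) -> Union[bool, str]:
    """Compare the retired page count against the bad page threshold.

    Returns True if the count exceeds the threshold, False if it is within
    the threshold, and "N/A" if either value is unavailable.
    """
    if retired_pages is None or threshold is None:
        return "N/A"
    return retired_pages > threshold
```

A count equal to the threshold is treated as within the threshold, matching the "exceed" wording above.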

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: edaae978a2]
2025-09-09 09:15:37 -05:00
AL Musaffar, Yazen 851354429f [SWDEV-545894] Folder name defaulting to lower case fix (#611)
* Folder name defaulting to lower case

* Update amdsmi_cli/amdsmi_cli.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>

* Fixed Based On Comments

* Remove unused variable 'skip_next'

Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>

---------

Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
Co-authored-by: yalmusaf_amdeng <yalmusaf@amd.com>
Co-authored-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

[ROCm/amdsmi commit: 4a8ee27225]
2025-09-07 20:38:29 -05:00
Galantsev, Dmitrii d0b5e20440 Create run-clang-tidy.sh
Change-Id: I4faa950a59434ba4706da581af51dd8a7e071dcb
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 0cd05bf307]
2025-09-05 17:44:17 -05:00
Galantsev, Dmitrii 7bbfc98588 Add extra element to array for bounds checking
Decrement padding to keep struct size the same
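
The size-preserving trade between an array and its padding can be checked with Python's `ctypes`, which lays out structures the way a C compiler would. The struct names, field names, and element counts below are hypothetical. Only the idea of growing the array by one element and shrinking the padding by one comes from the message.

```python
import ctypes


class InfoBefore(ctypes.Structure):
    """Hypothetical layout before the change: 8 values, 4 reserved slots."""
    _fields_ = [
        ("values", ctypes.c_uint32 * 8),
        ("reserved", ctypes.c_uint32 * 4),
    ]


class InfoAfter(ctypes.Structure):
    """Hypothetical layout after the change: one more value, one less reserved slot."""
    _fields_ = [
        ("values", ctypes.c_uint32 * 9),
        ("reserved", ctypes.c_uint32 * 3),
    ]


# The total size is unchanged, so code compiled against the old layout
# still sees a struct of the same size.
size_before = ctypes.sizeof(InfoBefore)
size_after = ctypes.sizeof(InfoAfter)
```

This works because the array and the padding share the same element type, so one element moves from the padding to the array without changing the total size.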

Change-Id: I4bea5d4b4d5c908423c7cc55a7e8c404b4a6b5e8
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 85e37bb6ce]
2025-09-05 17:44:17 -05:00
Galantsev, Dmitrii 6797de3ed5 Ignore more warnings in clang and clang-tidy
Change-Id: I6f7c7e478f0f176da550d5bccf833dae1a4f1878
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 20bc3aeeef]
2025-09-05 17:44:17 -05:00
Galantsev, Dmitrii 74efdc57a7 Clean up clang-tidy warnings and unused variables
Change-Id: I1365edf8926908b3a49652fb87f079f8fbf1f56b


[ROCm/amdsmi commit: aba1c792b4]
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 3a7b4a283a Remove an impossible check
amdsmi/tests/amd_smi_test/functional/memorypartition_read_write.cc:453:32: warning: the address of ‘orig_memory_partition’ will never be NULL [-Waddress]
  453 |     if ((orig_memory_partition == nullptr) ||
      |          ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>


[ROCm/amdsmi commit: 66eb189396]
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 0c7c849c42 Use nested namespace for amd::smi
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>


[ROCm/amdsmi commit: eacec681dd]
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 47b2e80ab8 Drop an unnecessary NULL comparison
warning: the address of ‘amdsmi_asic_info_t::vendor_name’ will never be NULL [-Waddress]

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>


[ROCm/amdsmi commit: 4a863b27ab]
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 02b357526b Fix a comparison between signed and unsigned integer
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>


[ROCm/amdsmi commit: a15bad1c9e]
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 08eec3c675 Drop unused variables
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>


[ROCm/amdsmi commit: a99e827d97]
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) a8a89db945 Remove unnecessary typedef declarations
amd_smi_cper.h:32:1: warning: ‘typedef’ was ignored in this declaration

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>


[ROCm/amdsmi commit: 3d0ea25af3]
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) c9eddf75e7 Remove unnecessary includes
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>


[ROCm/amdsmi commit: 924a06d1e1]
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 5fe413710b Fix a typo
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>


[ROCm/amdsmi commit: 05f79879c3]
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 15e335ac3f Use nested namespace for amd::smi
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>


[ROCm/amdsmi commit: faca0222f0]
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 3b411b6759 Fix a crash when running amd-smi version --cpu
When running on a system that doesn't support HSMP (such as an APU),
the following is observed:
```
/usr/include/c++/15.1.1/bits/stl_vector.h:1263: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = void*; _Alloc = std::allocator<void*>; reference = void*&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
```

This is because no "CPUs" are detected on the SoC, which really means
no CPUs that support HSMP.  Catch this case so that a clean return
can be passed up.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>


[ROCm/amdsmi commit: e5d9e1361e]
2025-09-03 00:49:48 -05:00
Maisam Arif d8c125f2b0 [SWDEV-553016] Added Copyright to scoped_fd.cc
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I2ea872e7c5c61a6e4b5c7e7114d016b8a1069b28


[ROCm/amdsmi commit: c876180875]
2025-09-02 15:02:47 -05:00