Commit graph

1959 commits

Author SHA1 Message Date
Arif, Maisam fd5eb4e963 [SWDEV-550075] Updated README to link to amd-smi virtualization repo (#664)
Co-authored-by: Peter Park <peter.park@amd.com>
2025-09-09 16:05:01 -05:00
Bindhiya Kanangot Balakrishnan 17ffe5a1bd [SWDEV-414304] Reduce excessive hwmon operations
Previously, the function was iterating through all enum
values(0-250). This fix reduces the number of hwmon operations
by calling add_temp_sensor_entry only for temperature types
that fall within the defined enum ranges.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-09-09 10:30:51 -05:00
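
The idea behind the hwmon change above can be sketched as follows. This is a minimal Python illustration, not the library's C++ code: the range values in `TEMP_TYPE_RANGES` and the stubbed `add_temp_sensor_entry` are hypothetical stand-ins, and only the 0-250 span comes from the commit message.

```python
# Hypothetical (first, last) enum ranges; the real ones are defined in the C++ headers.
TEMP_TYPE_RANGES = [(0, 7), (100, 104), (200, 203)]


def add_temp_sensor_entry(temp_type, calls):
    """Stand-in for the real function, which triggers hwmon operations."""
    calls.append(temp_type)


def register_all(calls):
    """Previous behavior: visit every enum value from 0 through 250."""
    for temp_type in range(0, 251):
        add_temp_sensor_entry(temp_type, calls)


def register_defined(calls):
    """New behavior: visit only values inside the defined enum ranges."""
    for first, last in TEMP_TYPE_RANGES:
        for temp_type in range(first, last + 1):
            add_temp_sensor_entry(temp_type, calls)
```

With these assumed ranges, `register_defined` makes far fewer calls than `register_all`, which is the saving the commit describes.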
Park, Peter 5e92adc5b3 [SWDEV-551318] Add doc about RAS / CPER (#636)
* add doc about ras/cper
* add sample code examples for CPER and AFID
---------

Signed-off-by: Park, Peter <Peter.Park@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Co-authored-by: Oosman Saeed <oossaeed@amd.com>
2025-09-09 10:27:15 -05:00
Kanangot Balakrishnan, Bindhiya edaae978a2 [SWDEV-553557] Add bad_page_threshold_exceeded to RAS (#677)
Added bad_page_threshold_exceeded field to ras, which
compares retired pages count against bad page threshold.
This field displays True if retired pages exceed the
threshold, False if within threshold, or N/A if
threshold data is unavailable.

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-09-09 09:15:37 -05:00
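
The three-state field described in the commit above can be sketched like this. The function name and signature are hypothetical, and returning N/A when the retired-page count is also missing is an assumption; the commit message only mentions missing threshold data.

```python
from typing import Optional, Union


def bad_page_threshold_exceeded(
    retired_pages: Optional[int], threshold: Optional[int]
) -> Union[bool, str]:
    """Return True/False for the comparison, or "N/A" when data is missing."""
    # Assumption: a missing retired-page count is treated like a missing threshold.
    if retired_pages is None or threshold is None:
        return "N/A"
    # "Exceeded" means strictly greater; a count equal to the threshold is within it.
    return retired_pages > threshold
```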
AL Musaffar, Yazen 4a8ee27225 [SWDEV-545894] Folder name defaulting to lower case fix (#611)
* Folder name defaulting to lower case

* Update amdsmi_cli/amdsmi_cli.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>

* Fixed Based On Comments

* Remove unused variable 'skip_next'

Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>

---------

Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
Co-authored-by: yalmusaf_amdeng <yalmusaf@amd.com>
Co-authored-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-09-07 20:38:29 -05:00
Galantsev, Dmitrii 0cd05bf307 Create run-clang-tidy.sh
Change-Id: I4faa950a59434ba4706da581af51dd8a7e071dcb
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-09-05 17:44:17 -05:00
Galantsev, Dmitrii 85e37bb6ce Add extra element to array for bounds checking
Decrement padding to keep struct size the same

Change-Id: I4bea5d4b4d5c908423c7cc55a7e8c404b4a6b5e8
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-09-05 17:44:17 -05:00
Galantsev, Dmitrii 20bc3aeeef Ignore more warnings in clang and clang-tidy
Change-Id: I6f7c7e478f0f176da550d5bccf833dae1a4f1878
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-09-05 17:44:17 -05:00
Galantsev, Dmitrii aba1c792b4 Clean up clang-tidy warnings and unused variables
Change-Id: I1365edf8926908b3a49652fb87f079f8fbf1f56b
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 66eb189396 Remove an impossible check
amdsmi/tests/amd_smi_test/functional/memorypartition_read_write.cc:453:32: warning: the address of ‘orig_memory_partition’ will never be NULL [-Waddress]
  453 |     if ((orig_memory_partition == nullptr) ||
      |          ~~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) eacec681dd Use nested namespace for amd::smi
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 4a863b27ab Drop an unnecessary NULL comparison
warning: the address of ‘amdsmi_asic_info_t::vendor_name’ will never be NULL [-Waddress]

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) a15bad1c9e Fix a comparison between signed and unsigned integer
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) a99e827d97 Drop unused variables
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 3d0ea25af3 Remove unnecessary typedef declarations
amd_smi_cper.h:32:1: warning: ‘typedef’ was ignored in this declaration

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 924a06d1e1 Remove unnecessary includes
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) 05f79879c3 Fix a typo
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) faca0222f0 Use nested namespace for amd::smi
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-09-05 17:44:17 -05:00
Mario Limonciello (AMD) e5d9e1361e Fix a crash when running amd-smi version --cpu
When running on a system that doesn't support HSMP (such as an APU)
then the following is observed:
```
/usr/include/c++/15.1.1/bits/stl_vector.h:1263: std::vector<_Tp, _Alloc>::reference std::vector<_Tp, _Alloc>::operator[](size_type) [with _Tp = void*; _Alloc = std::allocator<void*>; reference = void*&; size_type = long unsigned int]: Assertion '__n < this->size()' failed.
```

This is because no "CPU" are detected on the SOC, which really means
no CPUs that support HSMP.  Catch this case so that a clean return
can be passed up.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-09-03 00:49:48 -05:00
Maisam Arif c876180875 [SWDEV-553016] Added Copyright to scoped_fd.cc
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I2ea872e7c5c61a6e4b5c7e7114d016b8a1069b28
2025-09-02 15:02:47 -05:00
Maisam Arif 2c9f3af026 [SWDEV-540665] Change parser to not accept 0 as a power set input
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I510fa5523b8dd7ea33f49e21cc199d4a2cfcf9bb
2025-08-29 04:18:36 -05:00
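
One way a parser can reject 0 for a power value is a custom `argparse` type function. The sketch below is an assumption about the approach, not the code from this commit: the `--power-cap` flag, the `example-set` program name, and the helper names are illustrative, and rejecting negative values as well goes beyond what the commit title states.

```python
import argparse


def positive_power_cap(text: str) -> int:
    """argparse type function: accept only integers greater than 0."""
    try:
        value = int(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"invalid power cap: {text!r}") from None
    if value <= 0:
        # Assumption: negatives are rejected too; the commit only mentions 0.
        raise argparse.ArgumentTypeError("power cap must be greater than 0")
    return value


def build_parser() -> argparse.ArgumentParser:
    """Build a tiny parser with one hypothetical option."""
    parser = argparse.ArgumentParser(prog="example-set")
    parser.add_argument("--power-cap", type=positive_power_cap)
    return parser
```

When the type function raises `ArgumentTypeError`, `argparse` prints a usage error and exits, so an input of 0 never reaches the code that applies the setting.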
gabrpham_amdeng 39b26104d4 reverted help formatting column width to 80
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-08-28 11:30:24 -05:00
Tim Huang 51a44bc0c4 Regenerate Rust bindings against latest amdsmi.h header
- Regenerate Rust wrapper against latest amdsmi.h header
- Add libc dependency for proper C memory management
- Fix compilation errors caused by types removed from amdsmi.h
- Add FFI bindings regeneration documentation in README

This update ensures the Rust bindings are synchronized with the latest
C API and provides guidance for developers on regenerating
the bindings.

Signed-off-by: Tim Huang <tim.huang@amd.com>
2025-08-28 09:34:57 -05:00
Maisam Arif 4ffa468613 [SWDEV-540665] Remove amdsmi_set_power_cap API Guest Restriction
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I682506b48c10eefbd04f9b494ad57fb8ae8842b0
2025-08-27 20:18:43 -05:00
Arif, Maisam ed2300516f Revert "[SWDEV-536176] libdrm_amdgpu depdency change (#448)"
This reverts commit 652761de54.
2025-08-27 20:11:17 -05:00
Oosman Saeed 594d5ce8ee [SWDEV-546239] Match amdsmi output with host output
2025-08-27 18:41:59 -05:00
Maisam Arif 978fad01d2 [SWDEV-544299] Fix CLI prefix for amd-smi metric -G
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ic184ec824213421388356417e713d9ed5adeddeb
2025-08-27 18:08:06 -05:00
Arif, Maisam 286c421a49 [SWDEV-552378] Removed First enums in amdsmi_interface.py (#659)
- **Fixed gpuboard and baseboard temperatures enums in amdsmi Python Library**.  
  - AmdSmiTemperatureType had issues with referencing the right attribute, so we removed the following duplicate enums:
    - `AmdSmiTemperatureType.GPUBOARD_NODE_FIRST`
    - `AmdSmiTemperatureType.GPUBOARD_VR_FIRST`
    - `AmdSmiTemperatureType.BASEBOARD_FIRST`

Change-Id: Ia61446b593bd9182d597c4b4c2ac3c5ffdae7493
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-08-27 18:07:17 -05:00
Arif, Maisam 652761de54 [SWDEV-536176] libdrm_amdgpu depdency change (#448)
* Cmake fix updates
* Next fix will be addressing libdrm further

---------

Signed-off-by: adapryor <Adam.pryor@amd.com>
Signed-off-by: Justin Williams <juwillia@amd.com>
2025-08-27 09:32:51 -05:00
Pham, Gabriel b13fc16d60 Added gpuboard and baseboard temperatures to amd-smi metric (#617)
* Added gpu-board and base-board temperatures to amd-smi metric
* Updated Changelog and adjusted the metric base-board/gpu-board output
* Adjusted output of metric to hide base/gpu-board when not relevant

---------

Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-08-26 12:49:56 -05:00
adapryor e8fa06d223 [SWDEV-546543] Fix segfault in gpu_metrics
Signed-off-by: adapryor <Adam.pryor@amd.com>
2025-08-22 15:23:57 -05:00
adapryor d25c01e802 [SWDEV-546543] Fix segfault in gpu_metrics
Signed-off-by: adapryor <Adam.pryor@amd.com>
2025-08-22 15:23:57 -05:00
Maisam Arif e030f71229 [SWDEV-540665] Power cap on 1VF cli parsing fix
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I5aac8f820fd8ae1c6c1dbae3b5b9e69018c69452
2025-08-22 15:22:44 -05:00
Oosman Saeed dee18e9fb4 continue to process all entries
2025-08-21 23:37:24 -05:00
gabrpham_amdeng 71c8b92076 [SWDEV-549373] Added vbios and pldm information to version header and adjusted platform info display
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-08-21 18:16:47 -05:00
Pryor, Adam f8afba0a5f [SWDEV-540665] Move Virtualization checks in APIs into amd-smi APIs (#643)
* Remove vm checks in rocm-smi
* Move virtualization checks up the stack into amd-smi

---------

Signed-off-by: adapryor <Adam.pryor@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-08-21 18:11:50 -05:00
gabrpham_amdeng 5aae1a31fa Added Version Header to all Help Sections
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-08-21 17:17:16 -05:00
Pryor, Adam 4ac1c7e453 [SWDEV-540665] Fix power_caps in help text (#642)
Signed-off-by: adapryor <Adam.pryor@amd.com>
2025-08-21 16:45:37 -05:00
Maisam Arif 074c4b7a3f Fix spelling and incorrect error references
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I23e947a0cfd4f68067f9fca703574f44680163d4
2025-08-21 12:36:43 -05:00
Pryor, Adam ad29de4238 [SWDEV-525336] Filter out amd-smi process itself from detection (#638)
* Filter out amd-smi from process detection
* Fixed N/A stripping N/ incorrectly from running elevated processes

Signed-off-by: adapryor <Adam.pryor@amd.com>
2025-08-21 11:41:03 -05:00
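
Filtering a tool's own process out of a process list can be sketched as below. This assumes the filter is keyed on the PID, which the commit message does not confirm; the `filter_out_self` helper and the dictionary layout of each process entry are hypothetical.

```python
import os


def filter_out_self(processes, own_pid=None):
    """Return the process entries whose PID differs from the caller's PID."""
    # Default to the current process; the parameter exists to make testing easy.
    own_pid = os.getpid() if own_pid is None else own_pid
    return [proc for proc in processes if proc.get("pid") != own_pid]
```

For example, `filter_out_self([{"pid": 10}, {"pid": 20}], own_pid=20)` keeps only the entry with PID 10.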
Oosman Saeed ffca095246 [SWDEV-547223] RAS HBM CRC Read CE failed due to AFID missing 24
cherry-pick aca-decode repo changeset: aca-decode repo: f9e5ad5 (HEAD -> main, origin/main, origin/HEAD) Fix bug in Corrected HBM Error being decoded as AFID 34 (#5)
2025-08-21 11:00:30 -05:00
Saeed, Oosman fd5e37a07e [SWDEV-546239] amd-smi ras cper - no data created (#614)
* Update amd-smi doc with examples of CPER and AFID API usage.

---------

Signed-off-by: Oosman Saeed <oossaeed@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-08-20 11:27:41 -05:00
Pham, Gabriel e6af86f44a Updated Changelog for updated temperature metrics API (#616)
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-08-19 19:02:50 -05:00
AL Musaffar, Yazen e84e364b35 [SWDEV-549789] Removed incorrect CPER AFID references (#619)
* Fix for afid help
* Update amdsmi_parser.py

Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
2025-08-19 18:55:33 -05:00
Pryor, Adam b62900c372 [SWDEV-544620] Add kfd fallback for GPU Processes (#631)
Signed-off-by: adapryor <Adam.pryor@amd.com>
2025-08-19 18:53:16 -05:00
Pham, Gabriel c0ea186d47 [SWDEV-446394] Updated error message for setting clock limit (#633)
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-08-19 18:51:49 -05:00
Poag, Charis 1b2edd70bd [SWDEV-550355] Fix process + violation output when in partitions (#623)
Changes:
  - Fixes amd-smi monitor such as:
    amd-smi monitor -Vqt, amd-smi monitor -g 0 -Vqt -w 1
    amd-smi monitor -Vqt --file /tmp/test1, ...
  - Required moving around when process is called, since xcp
    information is gathered in right format expected by monitor
  - Requires process to be appended first with the gpu data -> xcp
    info to be gathered + added after 1st device

Change-Id: I76356a4610944f633a9530970fac66556d65bf11
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-08-19 18:50:51 -05:00
Charis Poag 5fe58a8e38 [SWDEV-550679] Fix amd-smi monitor AttributeError
Impacts only Guest systems

Fixes following error:
$ amd-smi monitor
AttributeError: 'Namespace' object has no attribute 'violation'

Change-Id: If501819be3f8e2d2dfd75775dc776873a92465a3
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-08-19 17:58:44 -05:00
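
An `AttributeError` like the one quoted above occurs when code reads an option that was never registered on the `argparse.Namespace`, for instance because a subcommand omits it on some platforms. A defensive read with `getattr` is one common guard. This is a sketch under that assumption, not the fix made in this commit; `wants_violation` is a hypothetical helper.

```python
import argparse


def wants_violation(args: argparse.Namespace) -> bool:
    """Read the 'violation' option, treating a missing attribute as False."""
    return bool(getattr(args, "violation", False))
```

Reading `args.violation` directly on a namespace without that attribute raises the error, while the `getattr` form falls back to the default.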
Maisam Arif 6de6290dc1 Removed kfd_ioctl.h from rocm include install
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I7948eb050f79a8a0f71e0b8a8e4e08187ac0bb84
2025-08-19 17:18:14 -05:00
Galantsev, Dmitrii cd33b75540 [SWDEV-545751] CMAKE - Enable fPIC (#629)
Change-Id: Iaade10e70b3a39d6bca23ae98f9f501339ffd76d
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-08-19 11:39:39 -05:00