Commit Graph

1687 Commits

Author SHA1 Message Date
AL Musaffar, Yazen adb5060ecb Fix binary dump
Change-Id: I3d91a7d33fc6860eea27fb396937139fe229daeb
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>


[ROCm/amdsmi commit: 9b66bc5690]
2025-04-14 19:26:34 -05:00
Maisam Arif 3e419ee84b CPER Tests fix
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I5c1b85c37df07b912ad82b50a3658a8a7edaccb1


[ROCm/amdsmi commit: 81c53e179d]
2025-04-14 19:26:34 -05:00
Mewar, Deepak 94a54d24a5 [SWDEV-499995] amdsmi updated for esmi library changes (#266)
CMakeLists.txt updated to the latest esmi tag, esmi_pkg_ver-4.2, which
fixes esmi warnings during the amdsmi build.

amdsmi_get_cpu_current_xgmi_bw updated to match the change in the
corresponding esmi library API.

Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>

[ROCm/amdsmi commit: 49aa2af045]
2025-04-14 19:21:51 -05:00
Galantsev, Dmitrii b1ec78b54b Add amdsmi_get_gpu_busy_percent
This is required for GPU busy percent in RDC

Change-Id: Idf2ab72993ecc8227958e6eb47f36fc68c93759f
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 955ceac78a]
2025-04-14 10:40:13 -05:00
Galiffi, David 7653d44090 [SWDEV-526012] Enable RPM autoprov (#246)
Update help_package.cmake

Signed-off-by: Galiffi, David <David.Galiffi@amd.com>

[ROCm/amdsmi commit: 7592ffa8f5]
2025-04-14 04:20:55 -05:00
Kanangot Balakrishnan, Bindhiya 58b46c5c9d [SWDEV-516592] Add python interface API for Bad Page Threshold (#141)
- Added python interface APIs for amdsmi_get_gpu_bad_page_threshold()
- Updated the docs and changelog.

---------

Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 9d7964dff5]
2025-04-14 04:19:45 -05:00
Charis Poag 1413ae1431 [SWDEV-518325/SWDEV-518320/SWDEV-443309] Changelog addition
Change-Id: I29567229f0e27d307ac3df935b5a5fab8ca43409
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 19a4775d32]
2025-04-14 03:40:35 -05:00
Charis Poag 8d4a4d7b14 [SWDEV-518325/SWDEV-518320/SWDEV-443309] Fix Partition Enumeration
* Changes:
  - Updates to DRM renderD* / card* pathing for partition
  - Now use KFD to discover AMD devices and populate accordingly
    Device MUST have an accessible KFD node (via cgroups)
  - Updated several AMD SMI CLI outputs to handle SYSFS files
    which are not accessible on partition nodes
  - Tests are updated to handle not supported features
  - Added new method to help get card/drm info
    (rsmi_dev_device_identifiers_get) from ROCm SMI
  - Renamed device->get_card_id() & device->get_drm_render_minor()
    These can now be used on internal AMD SMI calls.
  - Removed warnings shown in build

Change-Id: Ice882fd9b97fb625a5bd4ef327f3ceaf247dc570
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 4782528770]
2025-04-12 14:41:38 -05:00
Justin Williams 484614fe9b [SWDEV-521116] Added 'more_itertools' error workaround
Signed-off-by: Justin Williams <Justin.Williams@amd.com>


[ROCm/amdsmi commit: af943ac05c]
2025-04-12 13:42:43 -05:00
Arif, Maisam 7ea98e06dd [SWDEV-511234] Added amdsmi_get_gpu_cper_entries & CLI implementation
Added amdsmi_get_gpu_cper_entries() in the python and C APIs

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
Co-authored-by: Saeed, Oosman <Oosman.Saeed@amd.com>
Co-authored-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>

[ROCm/amdsmi commit: d81871ef16]
2025-04-12 01:54:57 -05:00
Justin Williams 574144c9a0 Created ABI Compliance Checker
Signed-off-by: Justin Williams <Justin.Williams@amd.com>


[ROCm/amdsmi commit: 3f75cd906f]
2025-04-09 12:40:02 -05:00
Galantsev, Dmitrii 055f603ab9 Revert "CMAKE - Force INSTALL_LIBDIR to be lib"
This reverts commit 7bbe33c94d.


[ROCm/amdsmi commit: 2e429ed890]
2025-04-09 01:38:42 -05:00
Galantsev, Dmitrii 7bbe33c94d CMAKE - Force INSTALL_LIBDIR to be lib
On some systems it defaults to lib64, on others to lib.

Change-Id: I973b488253d106ded518ee590a0edb370927f9a4
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 62c10bfe3c]
2025-04-08 16:02:43 +00:00
Williams, Justin 52195d6505 [SWDEV-500518] Updated CI Structure (#244)
Signed-off-by: Justin Williams <Justin.Williams@amd.com>

[ROCm/amdsmi commit: e3ab8cf71b]
2025-04-07 15:02:57 -05:00
Pham, Gabriel b485d4ba70 [SWDEV-524288] Fixed duplication of GPU id in events. (#233)
* Fixed duplication of GPU id in events.

---------

Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>

[ROCm/amdsmi commit: e2c371ece4]
2025-04-04 18:31:08 -05:00
Arif, Maisam cb3c979dfe Removed unnecessary rocm-smi files (#217)
* Removed unnecessary rocm-smi files
* Moved the update wrapper script into the tools folder

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 50d7d5287f]
2025-04-04 18:26:43 -05:00
Maisam Arif dab3e0657b Updated imports on amdsmi_quick_start
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ideb0f2addf61fb6bdb728e549a8b0f133682d7d6


[ROCm/amdsmi commit: 0da6613b99]
2025-04-04 18:25:17 -05:00
Galantsev, Dmitrii 2c88f8ebe6 CI - Disable example builds after breakage
Change-Id: I8a070dd65ed752b2485c17e0eeb5bc1dc875931e
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: a0e6c1c1bd]
2025-04-04 18:19:20 -05:00
Galantsev, Dmitrii ea1fcea378 CMAKE - Fix examples and clean up unused variables
Change-Id: Ie072476a525b49bb7c9c0fb9e49393a482a7d0b0
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 396afadd43]
2025-04-04 18:19:20 -05:00
Justin Williams a99db46c4c [SWDEV-521116] Added more_itertools workaround to 6.4.0 known issues
Signed-off-by: Justin Williams <Justin.Williams@amd.com>


[ROCm/amdsmi commit: 7d5cb0d287]
2025-04-03 13:34:42 -05:00
Galantsev, Dmitrii 0e1bc25280 .clangd - Add -Wno-c++20-designator
Change-Id: I344f12e2f99e795e011de8d4426e76c282190918


[ROCm/amdsmi commit: 1517436b83]
2025-04-03 12:53:05 -05:00
Justin Williams e8db1d64f6 [SWDEV-500518] Added RHEL10
Signed-off-by: Justin Williams <Justin.Williams@amd.com>


[ROCm/amdsmi commit: 846f0f5688]
2025-04-03 09:04:23 -05:00
Kanangot Balakrishnan, Bindhiya 17ed406553 [SWDEV-524528] Modify the amd-smi monitor to use drm VRAM API (#228)
Updated the amd-smi monitor to use VRAM drm API.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

[ROCm/amdsmi commit: af9afacfbd]
2025-04-01 17:05:14 -05:00
Arif, Maisam 237334ef65 [SWDEV-521408] Fixed call to amdsmi_get_gpu_virtualization_mode (#230)
Change-Id: I29c86f8982b53cc139004ebc06b26a5d8f430091

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 35fbe2cbf1]
2025-04-01 16:57:23 -05:00
Mallya, Ameya Keshava e4d8950a0e Fix syntax to mainline
Signed-off-by: Mallya, Ameya Keshava <AmeyaKeshava.Mallya@amd.com>

[ROCm/amdsmi commit: 19a5f25829]
2025-04-01 09:47:57 -07:00
Arif, Maisam 2ea49b6b33 [SWDEV-520754] Fixed UnboundLocalError for Multi-VF (#225)
Change-Id: Ib1c0826342a5882fde6ddd4f06f058462226b82d

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 307de69149]
2025-04-01 11:21:56 -05:00
Mallya, Ameya Keshava 0114eca2b1 Adding !verify features
Signed-off-by: Mallya, Ameya Keshava <AmeyaKeshava.Mallya@amd.com>

[ROCm/amdsmi commit: 6a8de725b7]
2025-03-31 13:32:41 -07:00
Arif, Maisam 56fa8ec779 Update quick start tool (#219)
Added CLI libs to amdsmi_quick_start.py

Change-Id: I72428d083dbff6224e57a97b954f602c72d323e8

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: d7416c98d7]
2025-03-29 12:06:02 -05:00
Galantsev, Dmitrii e798e5336f Bump Version 25.4.0
Change-Id: Ief60ff2270e7e73d4e14b5181fa6fb18e32bcc1e
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: b0129c390c]
2025-03-28 21:50:38 -05:00
Arif, Maisam c5a819b6b9 Revert "[SWDEV-493519] Fix Getting Version Information (#201)"
This reverts commit ebdfe2ea21.


[ROCm/amdsmi commit: 7ff8041afa]
2025-03-28 21:37:14 -05:00
Yuan, Perry b92ffd2bcf [SWDEV-482949] Add CPU model name querying support (#33)
- Add support to check CPU vendor info, which will be called by RDC to
discover CPU information
- Move esmi headers declaration to impl/amd_smi_common.h
- remove duplicated amdsmi_cpu_util_t

---------

Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Deepak Mewar <deepak.mewar@amd.com>

[ROCm/amdsmi commit: 68e44c7f66]
2025-03-28 21:21:39 -05:00
Arif, Maisam 952bff9126 [SWDEV-524528] Nullptr check correction for TestErrCntRead (#211)
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I1331a69544a6f5b7b61ea4655b635b42bbb56444

[ROCm/amdsmi commit: 13c222a103]
2025-03-28 11:18:22 -05:00
Narlo, Joseph ebdfe2ea21 [SWDEV-493519] Fix Getting Version Information (#201)
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>

[ROCm/amdsmi commit: df8ee3db85]
2025-03-28 11:12:21 -05:00
Maisam Arif 27ba6fcb86 [SWDEV-524528] Nullptr check correction for TestErrCntRead
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I1331a69544a6f5b7b61ea4655b635b42bbb56444


[ROCm/amdsmi commit: 3aac3801d1]
2025-03-28 11:11:58 -05:00
Mallya, Ameya Keshava eb2fef9767 Added KWS check for amd-mainline
Signed-off-by: Mallya, Ameya Keshava <AmeyaKeshava.Mallya@amd.com>

[ROCm/amdsmi commit: edf70ea81a]
2025-03-28 08:07:02 -07:00
Arif, Maisam 2ce2d46609 [SWDEV-500518] Updated AMDSMI sanity checks (#209)
[ROCm/amdsmi commit: a5707dfced]
2025-03-27 21:58:09 -05:00
Justin Williams 2914543174 [SWDEV-500518] Updated AMDSMI sanity checks
Signed-off-by: Justin Williams <Justin.Williams@amd.com>


[ROCm/amdsmi commit: 4e3a197dcc]
2025-03-27 18:35:13 -05:00
Liu, Shuzhou (Bill) c3a0ec4f9a [SWDEV-524147] Patch for handling new ras filenames (#205)
The code now handles both the original and the ACA-based ECC counters
for backward compatibility.

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 9b6e0432b2]
2025-03-27 15:36:43 -05:00
Kanangot Balakrishnan, Bindhiya a5f5da8b90 [SWDEV-513855] Add power cap to power monitor (#193)
Added power cap to display on amd-smi monitor -p.
Updated help and Changelog as well.

Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>

[ROCm/amdsmi commit: 7d109001ac]
2025-03-26 17:45:08 -05:00
Kanangot Balakrishnan, Bindhiya b442530b74 [SWDEV-513958] Add help flag to valid commands (#204)
Added '-h' flag to valid first input command list

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

[ROCm/amdsmi commit: 9b64dcb61a]
2025-03-26 17:43:28 -05:00
Kanangot Balakrishnan, Bindhiya f13bc29e0d [SWDEV-520148] Modify VRAM details in monitor output (#199)
Previously, amd-smi monitor showed VRAM usage as used and total.
It now displays free VRAM and VRAM percentage. Updated the
changelog.

Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>

[ROCm/amdsmi commit: 3ddfbcc0a3]
2025-03-26 13:12:41 -05:00
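
The VRAM change in f13bc29e0d comes down to simple arithmetic on the used and total values. A minimal sketch follows, assuming values in MB and assuming the percentage refers to used VRAM (the commit message does not say which). The function name vram_summary is hypothetical and is not part of amd-smi.

```python
def vram_summary(used_mb, total_mb):
    """Return (free_mb, used_percent) derived from used and total VRAM.

    Hypothetical helper for illustration; units (MB) and the meaning of
    "percentage" (used, not free) are assumptions.
    """
    if total_mb <= 0:
        raise ValueError("total_mb must be positive")
    free_mb = total_mb - used_mb
    used_percent = 100.0 * used_mb / total_mb
    return free_mb, used_percent


if __name__ == "__main__":
    free_mb, pct = vram_summary(4096, 16384)
    print(f"VRAM free: {free_mb} MB, used: {pct:.1f}%")
```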
Kanangot Balakrishnan, Bindhiya ec9be97b9f [SWDEV-513958] Fix error message due to argparse behavior (#108)
When argparse parses multiple invalid arguments, the error
message displays only the last argument, which is confusing.
To avoid this, added a valid-command check that runs before
argparse, and a new exception that is raised when the first
command is invalid.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

[ROCm/amdsmi commit: 3681f900ee]
2025-03-26 13:11:17 -05:00
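
The approach described in ec9be97b9f (validate the first command before argparse runs) can be sketched roughly as below. This is an illustration only, not the amd-smi implementation: the command set, the InvalidCommandError name, and the "demo" parser are all hypothetical.

```python
import argparse

# Hypothetical command list for illustration; not the real amd-smi command set.
VALID_COMMANDS = {"list", "monitor", "-h", "--help"}


class InvalidCommandError(Exception):
    """Raised when the first CLI token is not a known command (hypothetical name)."""


def check_first_command(argv):
    """Validate the first token before argparse sees any arguments."""
    if argv and argv[0] not in VALID_COMMANDS:
        raise InvalidCommandError(f"invalid command: {argv[0]!r}")


def parse(argv):
    """Run the pre-check, then hand the arguments to argparse."""
    check_first_command(argv)
    parser = argparse.ArgumentParser(prog="demo")
    sub = parser.add_subparsers(dest="command")
    sub.add_parser("list")
    sub.add_parser("monitor")
    return parser.parse_args(argv)
```

With this pre-check, parse(["bogus", "worse"]) reports the first bad token rather than leaving the error message to argparse.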
Su, Daniel a735c401af External CI: enable trigger for amd-mainline (#189)
Signed-off-by: Su, Daniel <Daniel.Su@amd.com>

[ROCm/amdsmi commit: dc7a5bb925]
2025-03-26 12:31:30 -05:00
Galantsev, Dmitrii 08f0ce8a21 CHANGELOG - Remove power api breakage info (#202)
Change-Id: Ida7b02f0731915b4ca659ca54f952618527e5cdf

Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>

[ROCm/amdsmi commit: 27f1466416]
2025-03-26 10:32:45 -05:00
Williams, Justin 73ee5233f4 [SWDEV-500518] Fixed Uninstall Checks (#187)
Signed-off-by: Justin Williams <Justin.Williams@amd.com>

[ROCm/amdsmi commit: 50a1c2905a]
2025-03-26 08:57:05 -05:00
Pham, Gabriel a98c8bca1c [SWDEV-520754] Fixed str int concatenation issue (#186)
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>

[ROCm/amdsmi commit: b72cd22225]
2025-03-25 17:43:59 -05:00
Poag, Charis e014342896 [SWDEV-513807] Fix amd-smi partition --accelerator not returning AMDSMI_STATUS_NO_PERM (#192)
* [SWDEV-513807] Fix amd-smi partition --accelerator not returning AMDSMI_STATUS_NO_PERM

Changes:
- Fixed amdsmi_get_gpu_accelerator_partition_profile_config() so that it
  returns AMDSMI_STATUS_NO_PERM
- Changed amd-smi partition --accelerator to warn the user when it is
  run without sudo or root permissions.
- Updated changelog for fixes planned for 6.4.1 release

Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/amdsmi commit: 0402bb4d75]
2025-03-20 17:23:01 -05:00
Galantsev, Dmitrii 633d2a8890 Make amdsmi_get_power_info backwards compatible
Change-Id: Ie5b4c35265827e78934caa94c142d31efce597e4
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 4a3c70136f]
2025-03-19 23:23:48 -05:00
Castillo, Juan fff2d21baf SWDEV-518209: GPU Metrics 1.8 (#177)
- Updates:
    - Adding the following metrics to allow new calculations for violation status:
        - Per XCP metrics gfx_below_host_limit_ppt_acc
        - Per XCP metrics gfx_below_host_limit_thm_acc
        - Per XCP metrics gfx_low_utilization_acc
        - Per XCP metrics gfx_below_host_limit_total_acc
    - Increasing available JPEG engines to 40. Current ASICs may not support all 40. These will be indicated as UINT16_MAX or N/A in CLI.

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/amdsmi commit: 7c882b2f69]
2025-03-19 10:24:02 -05:00
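
The JPEG engine note in fff2d21baf says unsupported engines are reported as UINT16_MAX or N/A in the CLI. A minimal sketch of that mapping is shown below. The function name format_jpeg_activity and the padding of short inputs to 40 slots are assumptions made for illustration, not the library's actual behavior.

```python
UINT16_MAX = 0xFFFF      # sentinel for "not supported", per the commit message
NUM_JPEG_ENGINES = 40    # engine count stated in the commit message


def format_jpeg_activity(values):
    """Map raw per-engine readings to display strings.

    Hypothetical helper: readings equal to UINT16_MAX become "N/A".
    Inputs shorter than 40 entries are padded with the sentinel
    (an assumption made for this sketch).
    """
    padded = list(values) + [UINT16_MAX] * (NUM_JPEG_ENGINES - len(values))
    return ["N/A" if v == UINT16_MAX else str(v) for v in padded[:NUM_JPEG_ENGINES]]
```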
Mallya, Ameya Keshava 2f5792e208 Added release trigger for further releases
Signed-off-by: Mallya, Ameya Keshava <AmeyaKeshava.Mallya@amd.com>

[ROCm/amdsmi commit: fba8b3f6e1]
2025-03-14 13:48:55 -07:00