Commit graph

1485 commits

Author SHA1 Message Date
Poag, Charis 58a16a1998 [SWDEV-528097] Fix HIP_UUID output from updates (#311)
Changes:
- Changed HIP_UUID to reference rsmi_dev_unique_id_get()
- Added better logging for amdsmi_get_gpu_device_uuid() references

Change-Id: Ie233044de8c6e85b807faf22121b450233db861b

Signed-off-by: Charis Poag <Charis.Poag@amd.com>

[ROCm/amdsmi commit: 817c077067]
2025-04-24 10:21:04 -05:00
Kanangot Balakrishnan, Bindhiya 8ae4c30ae9 [SWDEV-520371] Generate valid json format output (#273)
Earlier, the amd-smi metric and static json output
was not in valid json format. Changes are done to
get the output in valid json format.
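The commit message does not say what made the earlier output invalid. One common cause in CLI tools is printing several JSON objects back to back instead of a single document. The sketch below illustrates that general idea only; `render_metrics` and the sample field names are hypothetical and are not taken from the amd-smi source.

```python
import json


def render_metrics(devices):
    """Serialize a list of per-GPU dicts as one valid JSON document."""
    # A single top-level array keeps the whole output parseable in one pass,
    # unlike several objects printed one after another.
    return json.dumps(devices, indent=4)


# Hypothetical sample data; the field names are illustrative only.
sample = [
    {"gpu": 0, "usage": {"gfx_activity": 12}},
    {"gpu": 1, "usage": {"gfx_activity": "N/A"}},
]

output = render_metrics(sample)
parsed = json.loads(output)  # raises json.JSONDecodeError if the text is invalid
```

Round-tripping the output through `json.loads` is a cheap way to check validity in a test.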

---------
Change-Id: I5576333269509f63b3c800f225c3d73127ce80cf

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 8e5f6b1a8d]
2025-04-23 00:08:43 -05:00
Arif, Maisam c5cf1ba92f Improved import structure (#275)
* Improved import structure

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Id265cbb7dba5ba805b7cf7c353af870fef6cbb4a

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: b2f1df85a3]
2025-04-22 23:52:06 -05:00
Arif, Maisam a5c2ac9f87 CLI Help text and parser formatting updates (#218)
* Small Fixes
* CLI Help text and parser formatting updates
* Changed metavar for set partition

---------
Change-Id: Ia8809665f6fac670452cd4db4e5e8f9c7270faba
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Co-authored-by: Pham, Gabriel <Gabriel.Pham@amd.com>

[ROCm/amdsmi commit: 53dbb7bf58]
2025-04-22 23:32:42 -05:00
Arif, Maisam 05c80e7ace Reduce Load times for Partition CLI (#290)
* Reduced Load times for CLI in partition mode
* Change rsmi_dev_id_get() to use KFD, if KGD interface does not exist
* Make gpu_device_uuid fallback to rsmi_wrapper
* Moved Enumeration info calls in list for more speed
* Excluded the group check from recursion

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>
Co-authored-by: gabrpham_amdeng <Gabriel.Pham@amd.com>

[ROCm/amdsmi commit: 63b13ecb05]
2025-04-22 22:54:43 -05:00
AL Musaffar, Yazen 5c59f20f22 [SWDEV-528364] CPER CLI --follow fix (#298)
Signed-off-by: Yazen ALMusaffar <yalmusaf@amd.com>

[ROCm/amdsmi commit: f1f782312d]
2025-04-22 22:52:03 -05:00
Poag, Charis 36a8775ddd [SWDEV-528097] Unique ID fix for missing ID in KGD -> use KFD's (#292)
Changes:
   - Unique Id tries reading from KGD
     -> falls back to use KFD if not found

Change-Id: I05456dd79715e04d83f118b5bb4f1d3612822173
---------

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: b58625cafa]
2025-04-22 16:27:33 -05:00
Hila, Nino 808ee84222 Add palamida.yml
[ROCm/amdsmi commit: bd20b1b150]
2025-04-22 16:07:51 -05:00
Justin Williams f997769c8f [SWDEV-527430] Add API breakage checks for amd-smi
Signed-off-by: Justin Williams <Justin.Williams@amd.com>


[ROCm/amdsmi commit: 7aa45b2f55]
2025-04-22 13:18:39 -05:00
AL Musaffar, Yazen 0551b2aa67 Removed CPER tests and adjust the implementation (#269)
- Moved helper functions into amdsmi_utils.cc
- Removed tests since they are not working.

---------

Co-authored-by: Saeed, Oosman <Oosman.Saeed@amd.com>

[ROCm/amdsmi commit: d6954bcc62]
2025-04-21 14:54:47 -05:00
Maisam Arif f635fab179 Reduced Monitor Calls to gpu_metrics_info()
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Iffb1ba9e91452016feb16da039ce96d63a5ce3e2


[ROCm/amdsmi commit: 925fc6d194]
2025-04-17 19:47:17 -05:00
Justin Williams 94d451bc03 CI - Add mainline labels automatically
Signed-off-by: Justin Williams <Justin.Williams@amd.com>


[ROCm/amdsmi commit: 369afa85eb]
2025-04-17 17:53:09 -05:00
Castillo, Juan 21e32ffe4a [SWDEV-523794] Update to fix MIN_CLK and MAX_CLK incorrect values (#280)

- Fixed potential issue with min/max values when only one frequency is available
- Improve error handling in GPU frequency range detection
- Refactor clock frequency range detection for better readability
- Added special handling for current frequency indicator (*) in DPM output
- Added comments explaining special case handling for current frequency
- Cleaned up incorrect definitions in hsmp metric table definition
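The single-frequency and current-frequency cases listed above can be illustrated with a small parser. This is a sketch, assuming the DPM text uses the `index: <freq>Mhz` layout with a trailing `*` on the active level, as in the amdgpu `pp_dpm_sclk` sysfs file; `clock_range` is a hypothetical helper, not the function changed by this commit.

```python
import re
from typing import Optional, Tuple

# Matches lines such as "1: 800Mhz *"; the asterisk marks the current level.
_LEVEL = re.compile(r"^\s*\d+:\s*(\d+)\s*MHz\s*(\*)?\s*$", re.IGNORECASE)


def clock_range(dpm_text: str) -> Tuple[Optional[int], Optional[int], Optional[int]]:
    """Return (min_mhz, max_mhz, current_mhz) parsed from DPM level text."""
    freqs = []
    current = None
    for line in dpm_text.splitlines():
        match = _LEVEL.match(line)
        if match is None:
            continue  # skip blank or unrecognized lines
        mhz = int(match.group(1))
        freqs.append(mhz)
        if match.group(2):
            current = mhz
    if not freqs:
        return None, None, None
    # With a single level, min and max are the same value.
    return min(freqs), max(freqs), current
```

Taking `min` and `max` over all parsed levels, rather than assuming the first and last lines, means a file with only one level yields identical minimum and maximum values instead of a missing or stale one.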

---------

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 4d92dea079]
2025-04-17 17:46:04 -05:00
dependabot[bot] a0ce11135d Bump jinja2 from 3.1.5 to 3.1.6 in /docs/sphinx
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.5...3.1.6)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-version: 3.1.6
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>

[ROCm/amdsmi commit: 581ad75729]
2025-04-16 15:08:41 -05:00
Liu, Shuzhou (Bill) 7a1d836338 [SWDEV-526610] Palamida scan remediation copyright (#279)
Add missing copyrights
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>

[ROCm/amdsmi commit: d73452b3bf]
2025-04-16 14:54:45 -05:00
Williams, Justin d4dd78c8b5 [SWDEV-500518] Redesigned CI
[ROCm/amdsmi commit: 4e7c8dc5c9]
2025-04-15 22:09:49 -05:00
Galantsev, Dmitrii 3e82aba71f CI - Add cherrypick labels automatically
Change-Id: Ibea66f45e082a7c425f1b927775629c2f05d7c32
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 690a5279eb]
2025-04-15 18:57:16 -05:00
Maisam Arif 23b4110f72 [SWDEV-527630] Fix invalid json comparison
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I248c9089fcc8feafe402e960b7a05b641596403f


[ROCm/amdsmi commit: f906bf828d]
2025-04-15 10:45:14 -05:00
AL Musaffar, Yazen adb5060ecb Fix binary dump
Change-Id: I3d91a7d33fc6860eea27fb396937139fe229daeb
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>


[ROCm/amdsmi commit: 9b66bc5690]
2025-04-14 19:26:34 -05:00
Maisam Arif 3e419ee84b CPER Tests fix
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I5c1b85c37df07b912ad82b50a3658a8a7edaccb1


[ROCm/amdsmi commit: 81c53e179d]
2025-04-14 19:26:34 -05:00
Mewar, Deepak 94a54d24a5 [SWDEV-499995] amdsmi updated for esmi library changes (#266)
CMakelist updated to the latest esmi tag, esmi_pkg_ver-4.2, which
has fixes for esmi warnings during the amdsmi build.

amdsmi_get_cpu_current_xgmi_bw updated as per the change in
the corresponding esmi library API.

Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>

[ROCm/amdsmi commit: 49aa2af045]
2025-04-14 19:21:51 -05:00
Galantsev, Dmitrii b1ec78b54b Add amdsmi_get_gpu_busy_percent
This is required for GPU busy percent in RDC

Change-Id: Idf2ab72993ecc8227958e6eb47f36fc68c93759f
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 955ceac78a]
2025-04-14 10:40:13 -05:00
Galiffi, David 7653d44090 [SWDEV-526012] Enable RPM autoprov (#246)
Update help_package.cmake

Signed-off-by: Galiffi, David <David.Galiffi@amd.com>

[ROCm/amdsmi commit: 7592ffa8f5]
2025-04-14 04:20:55 -05:00
Kanangot Balakrishnan, Bindhiya 58b46c5c9d [SWDEV-516592] Add python interface API for Bad Page Threshold (#141)
- Added python interface APIs for amdsmi_get_gpu_bad_page_threshold()
- Updated the docs and changelog.

---------

Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 9d7964dff5]
2025-04-14 04:19:45 -05:00
Charis Poag 1413ae1431 [SWDEV-518325/SWDEV-518320/SWDEV-443309] Changelog addition
Change-Id: I29567229f0e27d307ac3df935b5a5fab8ca43409
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 19a4775d32]
2025-04-14 03:40:35 -05:00
Charis Poag 8d4a4d7b14 [SWDEV-518325/SWDEV-518320/SWDEV-443309] Fix Partition Enumeration
* Changes:
  - Updates to DRM renderD* / card* pathing for partition
  - Now use KFD to discover AMD devices and populate accordingly
    Device MUST have an accessible KFD node (via cgroups)
  - Updated several AMD SMI CLI outputs to handle SYSFS files
    which are not accessible on partition nodes
  - Tests are updated to handle not supported features
  - Added new method to help get card/drm info
    (rsmi_dev_device_identifiers_get) from ROCm SMI
  - Renamed device->get_card_id() & device->get_drm_render_minor()
    These can now be used on internal AMD SMI calls.
  - Removed warnings shown in build

Change-Id: Ice882fd9b97fb625a5bd4ef327f3ceaf247dc570
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 4782528770]
2025-04-12 14:41:38 -05:00
Justin Williams 484614fe9b [SWDEV-521116] Added 'more_itertools' error workaround
Signed-off-by: Justin Williams <Justin.Williams@amd.com>


[ROCm/amdsmi commit: af943ac05c]
2025-04-12 13:42:43 -05:00
Arif, Maisam 7ea98e06dd [SWDEV-511234] Added amdsmi_get_gpu_cper_entries & CLI implementation
Added amdsmi_get_gpu_cper_entries() in the python and C APIs

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
Co-authored-by: Saeed, Oosman <Oosman.Saeed@amd.com>
Co-authored-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>

[ROCm/amdsmi commit: d81871ef16]
2025-04-12 01:54:57 -05:00
Justin Williams 574144c9a0 Created ABI Compliance Checker
Signed-off-by: Justin Williams <Justin.Williams@amd.com>


[ROCm/amdsmi commit: 3f75cd906f]
2025-04-09 12:40:02 -05:00
Galantsev, Dmitrii 055f603ab9 Revert "CMAKE - Force INSTALL_LIBDIR to be lib"
This reverts commit 7bbe33c94d.


[ROCm/amdsmi commit: 2e429ed890]
2025-04-09 01:38:42 -05:00
Galantsev, Dmitrii 7bbe33c94d CMAKE - Force INSTALL_LIBDIR to be lib
On some systems it defaults to lib64, on others to lib.

Change-Id: I973b488253d106ded518ee590a0edb370927f9a4
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 62c10bfe3c]
2025-04-08 16:02:43 +00:00
Williams, Justin 52195d6505 [SWDEV-500518] Updated CI Structure (#244)
Signed-off-by: Justin Williams <Justin.Williams@amd.com>

[ROCm/amdsmi commit: e3ab8cf71b]
2025-04-07 15:02:57 -05:00
Pham, Gabriel b485d4ba70 [SWDEV-524288] Fixed duplication of GPU id in events. (#233)
* Fixed duplication of GPU id in events.

---------

Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>

[ROCm/amdsmi commit: e2c371ece4]
2025-04-04 18:31:08 -05:00
Arif, Maisam cb3c979dfe Removed unnecessary rocm-smi files (#217)
* Removed unnecessary rocm-smi files
* Moved the update wrapper script into the tools folder

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 50d7d5287f]
2025-04-04 18:26:43 -05:00
Maisam Arif dab3e0657b Updated imports on amdsmi_quick_start
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ideb0f2addf61fb6bdb728e549a8b0f133682d7d6


[ROCm/amdsmi commit: 0da6613b99]
2025-04-04 18:25:17 -05:00
Galantsev, Dmitrii 2c88f8ebe6 CI - Disable example builds after breakage
Change-Id: I8a070dd65ed752b2485c17e0eeb5bc1dc875931e
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: a0e6c1c1bd]
2025-04-04 18:19:20 -05:00
Galantsev, Dmitrii ea1fcea378 CMAKE - Fix examples and clean up unused variables
Change-Id: Ie072476a525b49bb7c9c0fb9e49393a482a7d0b0
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 396afadd43]
2025-04-04 18:19:20 -05:00
Justin Williams a99db46c4c [SWDEV-521116] Added more_itertools workaround to 6.4.0 known issues
Signed-off-by: Justin Williams <Justin.Williams@amd.com>


[ROCm/amdsmi commit: 7d5cb0d287]
2025-04-03 13:34:42 -05:00
Galantsev, Dmitrii 0e1bc25280 .clangd - Add -Wno-c++20-designator
Change-Id: I344f12e2f99e795e011de8d4426e76c282190918


[ROCm/amdsmi commit: 1517436b83]
2025-04-03 12:53:05 -05:00
Justin Williams e8db1d64f6 [SWDEV-500518] Added RHEL10
Signed-off-by: Justin Williams <Justin.Williams@amd.com>


[ROCm/amdsmi commit: 846f0f5688]
2025-04-03 09:04:23 -05:00
Kanangot Balakrishnan, Bindhiya 17ed406553 [SWDEV-524528] Modify the amd-smi monitor to use drm VRAM API (#228)
Updated the amd-smi monitor to use VRAM drm API.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>

[ROCm/amdsmi commit: af9afacfbd]
2025-04-01 17:05:14 -05:00
Arif, Maisam 237334ef65 [SWDEV-521408] Fixed call to amdsmi_get_gpu_virtualization_mode (#230)
Change-Id: I29c86f8982b53cc139004ebc06b26a5d8f430091

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 35fbe2cbf1]
2025-04-01 16:57:23 -05:00
Mallya, Ameya Keshava e4d8950a0e Fix syntax to mainline
Signed-off-by: Mallya, Ameya Keshava <AmeyaKeshava.Mallya@amd.com>

[ROCm/amdsmi commit: 19a5f25829]
2025-04-01 09:47:57 -07:00
Arif, Maisam 2ea49b6b33 [SWDEV-520754] Fixed UnboundLocalError for Multi-VF (#225)
Change-Id: Ib1c0826342a5882fde6ddd4f06f058462226b82d

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: 307de69149]
2025-04-01 11:21:56 -05:00
Mallya, Ameya Keshava 0114eca2b1 Adding !verify features
Signed-off-by: Mallya, Ameya Keshava <AmeyaKeshava.Mallya@amd.com>

[ROCm/amdsmi commit: 6a8de725b7]
2025-03-31 13:32:41 -07:00
Arif, Maisam 56fa8ec779 Update quick start tool (#219)
Added CLI libs to amdsmi_quick_start.py

Change-Id: I72428d083dbff6224e57a97b954f602c72d323e8

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

[ROCm/amdsmi commit: d7416c98d7]
2025-03-29 12:06:02 -05:00
Galantsev, Dmitrii e798e5336f Bump Version 25.4.0
Change-Id: Ief60ff2270e7e73d4e14b5181fa6fb18e32bcc1e
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: b0129c390c]
2025-03-28 21:50:38 -05:00
Arif, Maisam c5a819b6b9 Revert "[SWDEV-493519] Fix Getting Version Information (#201)"
This reverts commit ebdfe2ea21.


[ROCm/amdsmi commit: 7ff8041afa]
2025-03-28 21:37:14 -05:00
Yuan, Perry b92ffd2bcf [SWDEV-482949] Add CPU model name querying support (#33)
- Add support to check CPU vendor info, which will be called by RDC to
discover CPU information
- Move esmi headers declaration to impl/amd_smi_common.h
- Remove duplicated amdsmi_cpu_util_t

---------

Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Deepak Mewar <deepak.mewar@amd.com>

[ROCm/amdsmi commit: 68e44c7f66]
2025-03-28 21:21:39 -05:00
Arif, Maisam 952bff9126 [SWDEV-524528] Nullptr check correction for TestErrCntRead (#211)
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I1331a69544a6f5b7b61ea4655b635b42bbb56444

[ROCm/amdsmi commit: 13c222a103]
2025-03-28 11:18:22 -05:00