Poag, Charis
b58625cafa
[SWDEV-528097] Unique ID fix for missing ID in KGD -> use KFD's (#292)
...
Changes:
- Unique ID is read from KGD first
  -> falls back to KFD if not found
Change-Id: I05456dd79715e04d83f118b5bb4f1d3612822173
---------
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-04-22 16:27:33 -05:00
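The fallback order described in this commit (try KGD, use KFD when no ID is found) can be sketched as below. This is only an illustration of the pattern, not the code from the commit: `read_unique_id`, `kgd_reader`, and `kfd_reader` are hypothetical names, and the two readers are assumed to be plain callables that return a string, return nothing, or raise `OSError` when the underlying file is unavailable.

```python
def read_unique_id(kgd_reader, kfd_reader):
    """Return the ID from the KGD source, or from the KFD source if KGD has none.

    Both arguments are hypothetical zero-argument callables; they stand in
    for whatever actually reads the ID on a real system.
    """
    try:
        value = kgd_reader()
    except OSError:
        # Treat an unreadable KGD source the same as a missing ID.
        value = None
    if value:
        return value
    # KGD gave nothing usable, so fall back to the KFD source.
    return kfd_reader()
```

An empty string from the first reader is treated as "not found" here, which seems a reasonable reading of "missing ID" but is an assumption.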
Hila, Nino
bd20b1b150
Add palamida.yml
2025-04-22 16:07:51 -05:00
Justin Williams
7aa45b2f55
[SWDEV-527430] Add API breakage checks for amd-smi
...
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-04-22 13:18:39 -05:00
AL Musaffar, Yazen
d6954bcc62
Removed CPER tests and adjusted the implementation (#269)
...
- Moved helper functions into amdsmi_utils.cc
- Removed tests since they are not working.
---------
Co-authored-by: Saeed, Oosman <Oosman.Saeed@amd.com>
2025-04-21 14:54:47 -05:00
Maisam Arif
925fc6d194
Reduced Monitor Calls to gpu_metrics_info()
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Iffb1ba9e91452016feb16da039ce96d63a5ce3e2
2025-04-17 19:47:17 -05:00
Justin Williams
369afa85eb
CI - Add mainline labels automatically
...
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-04-17 17:53:09 -05:00
Castillo, Juan
4d92dea079
[SWDEV-523794] Update to fix MIN_CLK and MAX_CLK incorrect values (#280)
...
- Fixed potential issue with min/max values when only one frequency is available
- Improved error handling in GPU frequency range detection
- Refactored clock frequency range detection for better readability
- Added special handling for current frequency indicator (*) in DPM output
- Added comments explaining special case handling for current frequency
- Cleaned up incorrect definitions in the hsmp metric table
---------
Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-04-17 17:46:04 -05:00
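The two cases named in this commit (a single available frequency, and the `*` marker on the current level) can be illustrated with a small parser. This is a hedged sketch, not the amd-smi implementation: `parse_dpm_levels` is a hypothetical helper, and the input is assumed to follow the `index: <value>Mhz [*]` layout that amdgpu exposes in files such as `pp_dpm_sclk`.

```python
import re

# Matches lines such as "1: 800Mhz *": level index, frequency in MHz,
# and an optional "*" marking the currently active level.
_DPM_LINE = re.compile(r"^\s*(\d+):\s*(\d+)\s*[Mm][Hh][Zz]\s*(\*)?\s*$")


def parse_dpm_levels(text):
    """Return (min_mhz, max_mhz, current_mhz) from DPM-style text.

    current_mhz is None when no line carries the "*" marker.
    """
    freqs = []
    current = None
    for line in text.splitlines():
        match = _DPM_LINE.match(line)
        if match is None:
            continue  # skip blank or unrecognized lines
        mhz = int(match.group(2))
        freqs.append(mhz)
        if match.group(3):
            current = mhz
    if not freqs:
        raise ValueError("no DPM levels found")
    # With a single level, min and max are the same value.
    return min(freqs), max(freqs), current
```

Taking the min and max over all parsed levels, rather than assuming the first and last lines, keeps the single-level case from needing its own branch.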
dependabot[bot]
581ad75729
Bump jinja2 from 3.1.5 to 3.1.6 in /docs/sphinx
...
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.5...3.1.6)
---
updated-dependencies:
- dependency-name: jinja2
  dependency-version: 3.1.6
  dependency-type: indirect
...
Signed-off-by: dependabot[bot] <support@github.com>
2025-04-16 15:08:41 -05:00
Liu, Shuzhou (Bill)
d73452b3bf
[SWDEV-526610] Palamida scan remediation copyright (#279)
...
Add missing copyrights
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
2025-04-16 14:54:45 -05:00
Williams, Justin
4e7c8dc5c9
[SWDEV-500518] Redesigned CI
2025-04-15 22:09:49 -05:00
Galantsev, Dmitrii
690a5279eb
CI - Add cherrypick labels automatically
...
Change-Id: Ibea66f45e082a7c425f1b927775629c2f05d7c32
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-04-15 18:57:16 -05:00
Maisam Arif
f906bf828d
[SWDEV-527630] Fix invalid json comparison
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I248c9089fcc8feafe402e960b7a05b641596403f
2025-04-15 10:45:14 -05:00
AL Musaffar, Yazen
9b66bc5690
Fix binary dump
...
Change-Id: I3d91a7d33fc6860eea27fb396937139fe229daeb
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
2025-04-14 19:26:34 -05:00
Maisam Arif
81c53e179d
CPER Tests fix
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I5c1b85c37df07b912ad82b50a3658a8a7edaccb1
2025-04-14 19:26:34 -05:00
Mewar, Deepak
49aa2af045
[SWDEV-499995] amdsmi updated for esmi library changes (#266)
...
CMakeLists updated to the latest esmi tag, esmi_pkg_ver-4.2, which
fixes esmi warnings during the amdsmi build.
amdsmi_get_cpu_current_xgmi_bw updated to match the change in the
corresponding esmi library API.
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
2025-04-14 19:21:51 -05:00
Galantsev, Dmitrii
955ceac78a
Add amdsmi_get_gpu_busy_percent
...
This is required for GPU busy percent in RDC
Change-Id: Idf2ab72993ecc8227958e6eb47f36fc68c93759f
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-04-14 10:40:13 -05:00
Galiffi, David
7592ffa8f5
[SWDEV-526012] Enable RPM autoprov (#246)
...
Update help_package.cmake
Signed-off-by: Galiffi, David <David.Galiffi@amd.com>
2025-04-14 04:20:55 -05:00
Kanangot Balakrishnan, Bindhiya
9d7964dff5
[SWDEV-516592] Add Python interface API for Bad Page Threshold (#141)
...
- Added Python interface APIs for amdsmi_get_gpu_bad_page_threshold()
- Updated the docs and changelog.
---------
Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-04-14 04:19:45 -05:00
Charis Poag
19a4775d32
[SWDEV-518325/SWDEV-518320/SWDEV-443309] Changelog addition
...
Change-Id: I29567229f0e27d307ac3df935b5a5fab8ca43409
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-04-14 03:40:35 -05:00
Charis Poag
4782528770
[SWDEV-518325/SWDEV-518320/SWDEV-443309] Fix Partition Enumeration
...
* Changes:
- Updates to DRM renderD* / card* pathing for partition
- Now use KFD to discover AMD devices and populate accordingly
  Device MUST have an accessible KFD node (via cgroups)
- Updated several AMD SMI CLI outputs to handle SYSFS files
  which are not accessible on partition nodes
- Tests are updated to handle unsupported features
- Added new method to help get card/drm info
  (rsmi_dev_device_identifiers_get) from ROCm SMI
- Renamed device->get_card_id() & device->get_drm_render_minor()
  These can now be used on internal AMD SMI calls.
- Removed warnings shown in build
Change-Id: Ice882fd9b97fb625a5bd4ef327f3ceaf247dc570
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-04-12 14:41:38 -05:00
Justin Williams
af943ac05c
[SWDEV-521116] Added 'more_itertools' error workaround
...
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-04-12 13:42:43 -05:00
Arif, Maisam
d81871ef16
[SWDEV-511234] Added amdsmi_get_gpu_cper_entries & CLI implementation
...
Added amdsmi_get_gpu_cper_entries() in the Python and C APIs
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
Co-authored-by: Saeed, Oosman <Oosman.Saeed@amd.com>
Co-authored-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
2025-04-12 01:54:57 -05:00
Justin Williams
3f75cd906f
Created ABI Compliance Checker
...
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-04-09 12:40:02 -05:00
Galantsev, Dmitrii
2e429ed890
Revert "CMAKE - Force INSTALL_LIBDIR to be lib"
...
This reverts commit 62c10bfe3c.
2025-04-09 01:38:42 -05:00
Galantsev, Dmitrii
62c10bfe3c
CMAKE - Force INSTALL_LIBDIR to be lib
...
On some systems it defaults to lib64, on others to lib.
Change-Id: I973b488253d106ded518ee590a0edb370927f9a4
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-04-08 16:02:43 +00:00
Williams, Justin
e3ab8cf71b
[SWDEV-500518] Updated CI Structure (#244)
...
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-04-07 15:02:57 -05:00
Pham, Gabriel
e2c371ece4
[SWDEV-524288] Fixed duplication of GPU id in events. (#233)
...
* Fixed duplication of GPU id in events.
---------
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-04-04 18:31:08 -05:00
Arif, Maisam
50d7d5287f
Removed unnecessary rocm-smi files (#217)
...
* Removed unnecessary rocm-smi files
* Moved the update wrapper script into the tools folder
---------
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-04-04 18:26:43 -05:00
Maisam Arif
0da6613b99
Updated imports on amdsmi_quick_start
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ideb0f2addf61fb6bdb728e549a8b0f133682d7d6
2025-04-04 18:25:17 -05:00
Galantsev, Dmitrii
a0e6c1c1bd
CI - Disable example builds after breakage
...
Change-Id: I8a070dd65ed752b2485c17e0eeb5bc1dc875931e
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-04-04 18:19:20 -05:00
Galantsev, Dmitrii
396afadd43
CMAKE - Fix examples and clean up unused variables
...
Change-Id: Ie072476a525b49bb7c9c0fb9e49393a482a7d0b0
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-04-04 18:19:20 -05:00
Justin Williams
7d5cb0d287
[SWDEV-521116] Added more_itertools workaround to 6.4.0 known issues
...
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-04-03 13:34:42 -05:00
Galantsev, Dmitrii
1517436b83
.clangd - Add -Wno-c++20-designator
...
Change-Id: I344f12e2f99e795e011de8d4426e76c282190918
2025-04-03 12:53:05 -05:00
Justin Williams
846f0f5688
[SWDEV-500518] Added RHEL10
...
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-04-03 09:04:23 -05:00
Kanangot Balakrishnan, Bindhiya
af9afacfbd
[SWDEV-524528] Modify the amd-smi monitor to use drm VRAM API (#228)
...
Updated the amd-smi monitor to use the drm VRAM API.
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-04-01 17:05:14 -05:00
Arif, Maisam
35fbe2cbf1
[SWDEV-521408] Fixed call to amdsmi_get_gpu_virtualization_mode (#230)
...
Change-Id: I29c86f8982b53cc139004ebc06b26a5d8f430091
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-04-01 16:57:23 -05:00
Mallya, Ameya Keshava
19a5f25829
Fix syntax to mainline
...
Signed-off-by: Mallya, Ameya Keshava <AmeyaKeshava.Mallya@amd.com>
2025-04-01 09:47:57 -07:00
Arif, Maisam
307de69149
[SWDEV-520754] Fixed UnboundLocalError for Multi-VF (#225)
...
Change-Id: Ib1c0826342a5882fde6ddd4f06f058462226b82d
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-04-01 11:21:56 -05:00
Mallya, Ameya Keshava
6a8de725b7
Adding !verify features
...
Signed-off-by: Mallya, Ameya Keshava <AmeyaKeshava.Mallya@amd.com>
2025-03-31 13:32:41 -07:00
Arif, Maisam
d7416c98d7
Update quick start tool (#219)
...
Added CLI libs to amdsmi_quick_start.py
Change-Id: I72428d083dbff6224e57a97b954f602c72d323e8
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-03-29 12:06:02 -05:00
Galantsev, Dmitrii
b0129c390c
Bump Version 25.4.0
...
Change-Id: Ief60ff2270e7e73d4e14b5181fa6fb18e32bcc1e
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-03-28 21:50:38 -05:00
Arif, Maisam
7ff8041afa
Revert "[SWDEV-493519] Fix Getting Version Information (#201)"
...
This reverts commit df8ee3db85.
2025-03-28 21:37:14 -05:00
Yuan, Perry
68e44c7f66
[SWDEV-482949] Add CPU model name querying support (#33)
...
- Add support to check CPU vendor info, which will be called by RDC to
  discover CPU information
- Move esmi headers declaration to impl/amd_smi_common.h
- Remove duplicated amdsmi_cpu_util_t
---------
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Deepak Mewar <deepak.mewar@amd.com>
2025-03-28 21:21:39 -05:00
Arif, Maisam
13c222a103
[SWDEV-524528] Nullptr check correction for TestErrCntRead (#211)
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I1331a69544a6f5b7b61ea4655b635b42bbb56444
2025-03-28 11:18:22 -05:00
Narlo, Joseph
df8ee3db85
[SWDEV-493519] Fix Getting Version Information (#201)
...
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
2025-03-28 11:12:21 -05:00
Maisam Arif
3aac3801d1
[SWDEV-524528] Nullptr check correction for TestErrCntRead
...
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I1331a69544a6f5b7b61ea4655b635b42bbb56444
2025-03-28 11:11:58 -05:00
Mallya, Ameya Keshava
edf70ea81a
Added KWS check for amd-mainline
...
Signed-off-by: Mallya, Ameya Keshava <AmeyaKeshava.Mallya@amd.com>
2025-03-28 08:07:02 -07:00
Arif, Maisam
a5707dfced
[SWDEV-500518] Updated AMDSMI sanity checks (#209)
2025-03-27 21:58:09 -05:00
Justin Williams
4e3a197dcc
[SWDEV-500518] Updated AMDSMI sanity checks
...
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
2025-03-27 18:35:13 -05:00
Liu, Shuzhou (Bill)
9b6e0432b2
[SWDEV-524147] Patch for handling new ras filenames (#205)
...
The code is changed to handle both the original and the ACA-based ECC
counters for backward compatibility.
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-03-27 15:36:43 -05:00