Commit Graph

1365 Commits

Author SHA1 Message Date
Joe Narlo 01d303806a SWDEV-504389 [AMD-SMI] Synching Comments in Linux BM
Sync comments from Unified Header to Linux BM

Signed-off-by: Joe Narlo <Joseph.Narlo@amd.com>
Change-Id: I9b1ae94db68761a7963ad87cd60177a57e93ad85


[ROCm/amdsmi commit: ef31bb7166]
2024-12-18 10:57:06 -06:00
Choudhary, Rahul 0375bc03b3 Update rocm_ci_caller.yml fixing base ref
base ref to cover both pull and push request

[ROCm/amdsmi commit: 6ffe28fb47]
2024-12-17 12:19:06 -08:00
Choudhary, Rahul 12ef77a426 Create rocm_ci_caller.yml adding workflow caller for PSDB and OSDB
[ROCm/amdsmi commit: 2c36a327de]
2024-12-16 22:06:03 -08:00
Choudhary, Rahul 5114f28549 Create codeql.yml
copied from previous repo

[ROCm/amdsmi commit: c11a7f6eb9]
2024-12-16 22:03:54 -08:00
Joe Narlo 68497c68e9 SWDEV-492272 [AMDSMI] Build/Compiler warnings messages
Fix compiler warnings

Signed-off-by: Joe Narlo <Joseph.Narlo@amd.com>
Change-Id: I10657b8f3ef18a9b45311e8f6509958297a57823


[ROCm/amdsmi commit: d0a7332d32]
2024-12-13 00:38:07 -05:00
gabrpham d71dac9766 [SWDEV-484382] Added fclk and socclk to amd-smi metric -c
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Ie7e19c757b05455693c0d26eeb5e8b6c1e238375


[ROCm/amdsmi commit: fe290a2056]
2024-12-13 00:33:12 -05:00
gabrpham 49a0904903 [SWDEV-484382] Added new command amd-smi set -c/--clk-level
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: If45152e3a3c94f65b6a8a960601b9ed16fa3d0d7


[ROCm/amdsmi commit: 5f9c2db6f3]
2024-12-13 00:32:19 -05:00
gabrpham cb42e8e444 [SWDEV-484382] Added new command amd-smi static --clock
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I49e1aa2e699734d81c40c76c62da1cecc5bd3c0e


[ROCm/amdsmi commit: bc16e1a5da]
2024-12-13 00:30:29 -05:00
Maisam Arif df85708b46 [SWDEV-489060] Added python3-setuptools & python3-wheel for base images
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I222395e656469f67405bc94a86ab7f8fd1ed34a2


[ROCm/amdsmi commit: aed7749a2c]
2024-12-11 16:40:51 -06:00
Charis Poag 323ebacde0 Fix amd-smi firmware not printing YAML-like dictionary correctly
List string should take into account dictionary value types

Change-Id: Icc08288cb0007d43eacd1aff6d44c40a84ea9448
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 57f45954b7]
2024-12-11 10:48:43 -05:00
Justin Williams 458f2fcbd8 [SWDEV-479339/498804] Added AMDSMI Dockerfile
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
Change-Id: Ic7cc6eb6417708cff3f4a33b91a8ef6dcd2b2807


[ROCm/amdsmi commit: 2a1e2eed18]
2024-12-10 16:18:42 -05:00
Maisam Arif 6447baeedd Fixed spacing in amd-smi --xgmi
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I9fbd20c50a25aa3be80c8aa68eea37b81a74dc67


[ROCm/amdsmi commit: 554203c13a]
2024-12-10 15:45:06 -05:00
Charis Poag b99a90ead4 [SWDEV-475712] Fix MI2x target_graphics_version
Stopped correcting target_graphics_version based on product name.
Instead, detect target_graphics_version values that need
correction and populate them accordingly.

Change-Id: Ie9240a049313d9338f831ef47be973cd5c228612
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 7543a058ea]
2024-12-10 13:43:02 -05:00
Charis Poag 95cc962509 [SWDEV-488288] Remove GFX_BUSY_ACC from amd-smi metric --usage
Output is not helpful to users.

Change-Id: I12a60e28b8eab2fc3ffca4ea88f03018bf0ef3ce
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: bc0015fd36]
2024-12-10 13:37:36 -05:00
Charis Poag 6829980152 [SWDEV-495824] AMD SMI reporting CPX partitions incorrectly
Updated changelog to provide options to users on how to fix.

Change-Id: I4fd04b1e65ff9d678b2d13109599f57a03c84d41
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: b911a0606a]
2024-12-10 11:20:03 -05:00
Maisam Arif 0db64ff2b3 [SWDEV-503491] Updated Market Names
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ib56c4c96190e18708ef4d0d6358dd8d5b1ee9e6a


[ROCm/amdsmi commit: ddcfe28520]
2024-12-09 15:40:06 -05:00
Bindhiya Kanangot Balakrishnan efa7286410 [SWDEV-496639] Align amd-smi xgmi statistics
The xgmi read and write values were displayed in KB, and misalignment
made the numbers unreadable. Converted the read and write values to
readable units using a helper function. Updated Changelog.
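
The conversion described above can be sketched roughly as follows. This is a minimal illustration, not the helper from the commit: the function name format_kb, the 1024-based unit steps, and the three-decimal formatting are all assumptions.

```python
def format_kb(value_kb):
    """Convert a size given in KB to a short human-readable string.

    Hypothetical helper: the name, the 1024 step, and the rounding
    are assumptions made for illustration only.
    """
    units = ["KB", "MB", "GB", "TB", "PB"]
    value = float(value_kb)
    index = 0
    # Step up one unit at a time while the value is still 1024 or more.
    while value >= 1024 and index < len(units) - 1:
        value /= 1024
        index += 1
    return f"{value:.3f} {units[index]}"


if __name__ == "__main__":
    # Right-justify so that a column of values lines up.
    for raw in (512, 2048, 1536 * 1024):
        print(format_kb(raw).rjust(12))
```

Formatting every value to a fixed width after conversion keeps the read and write columns aligned regardless of magnitude.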

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Change-Id: I4c90a1de8a58c29cbdf43fe3480a1546f3946673


[ROCm/amdsmi commit: 288b11df37]
2024-12-09 12:57:45 -05:00
Charis Poag 44adc457b2 [SWDEV-502744] Fix "amd-smi monitor" shows VCN ENC utilization & clock but not VCN DEC
Reason for this fix:
Navi products use vclk and dclk for both encode and decode.
On MI products, only decode is supported.
Navi products cannot support displaying ENC_UTIL % at this time.

Change-Id: I107bb761794ae4724949ac21c110b23a4f616700
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: d323ecff97]
2024-12-07 12:11:10 -05:00
gabrpham cdecce3658 Fixed post reset and ring_hang issues
Issues include:
	SWDEV-480250
	SWDEV-480255
	SWDEV-480248
Known issue:
	`amd-smi event` has threads taking events from the same device
which, in the case of resetting gpus, makes it seem like some gpus have
reset multiple times and others have not reset at all.

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Ic7dcc214e0366fc1532ece579d915d34d35d5407


[ROCm/amdsmi commit: bd01cfc203]
2024-12-06 17:46:00 -05:00
Bindhiya Kanangot Balakrishnan 3794b3796c [SWDEV-457845] Error code unification for amd-smi set
Previously, amd-smi set returned different outputs on Linux
and Windows. On Linux it returned ValueError. As part of
Error Code unification, corrected this output message.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Change-Id: Iba9ddd9c5b2bed0456f303e4373f6771c93608be


[ROCm/amdsmi commit: 1586005a5b]
2024-12-06 14:21:31 -05:00
Justin Williams 8976ba5628 [SWDEV-502001] Added amd_hsmp.h locally
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
Change-Id: I28e48913743f86fb5fc9082307ec326830d55960


[ROCm/amdsmi commit: 2c24cab86c]
2024-12-05 17:02:48 -05:00
Maisam Arif f707990d49 Added gpu_metrics table debug logs in monitor
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I8aa96629a65df7a2d52ef9ed42a884732d097a54


[ROCm/amdsmi commit: bc3ac61641]
2024-12-05 15:18:13 -06:00
Joe Narlo 6cfeef5a2c SWDEV-502330 [AMD-SMI][Unified Header] Convert struct to typedef struct
Change struct to a typedef struct

Signed-off-by: Joe Narlo <Joseph.Narlo@amd.com>
Change-Id: I6f3b22a5219c0db0aab2c308b71213ae75334476


[ROCm/amdsmi commit: 547db10384]
2024-12-04 09:14:05 -05:00
Justin Williams 8c9cb42a58 [SWDEV-469278] Removed PyYAML Dependency
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
Change-Id: Idec32cfb0de84cc255b506d7f972e2750992745e


[ROCm/amdsmi commit: 2370aa1b40]
2024-12-03 15:40:44 -05:00
Bindhiya Kanangot Balakrishnan d3a0e9a72e [SWDEV-499030] Fix truncated FRU_ID
The FRU_ID was truncated because the string copied from sysfs
was limited to 32 characters. This limit has been increased to
AMDSMI_MAX_STRING_LENGTH to accommodate longer FRU_IDs. Also
updated the deprecated string length macros.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Change-Id: I8becaf9f37609b2e5aecdf92b6ae60f4419ad8ef


[ROCm/amdsmi commit: bc77330a74]
2024-12-03 13:43:53 -06:00
Bindhiya Kanangot Balakrishnan bfd480c6ba [SWDEV-498507] Tool amd-smi could be more case insensitive
Modified amdsmi_cli to accept case-insensitive arguments if
the argument does not start with a single dash (-).
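
One way to picture the rule above is the sketch below. It is not the amdsmi_cli implementation: the function name normalize_args is hypothetical, and lowercasing every non-short-flag token (including values) is an assumption; a real parser may need to exempt values such as file paths.

```python
def normalize_args(argv):
    """Lowercase every argument except those starting with a single dash.

    Hypothetical sketch: short flags such as -g keep their case,
    while subcommands and double-dash options are lowercased.
    """
    normalized = []
    for arg in argv:
        # A short flag starts with "-" but not with "--".
        is_short_flag = arg.startswith("-") and not arg.startswith("--")
        normalized.append(arg if is_short_flag else arg.lower())
    return normalized


if __name__ == "__main__":
    print(normalize_args(["METRIC", "--USAGE", "-g", "0"]))
```

Leaving single-dash arguments untouched matters because short flags can be case-sensitive, so folding their case could change which option is selected.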

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Change-Id: I1b6320db0afaad0900d5a2049206002c3899fa71


[ROCm/amdsmi commit: fc7e1ddb4a]
2024-12-02 18:09:45 -05:00
Maisam Arif 835b438186 [SWDEV-502001] Fix link for amd_hsmp.h
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I402ee539cdd4c896acd7ccc83f3090c3a5eeba12


[ROCm/amdsmi commit: 664ade7354]
2024-12-02 16:30:06 -06:00
Charis Poag e3793942de [SWDEV-499029] Fix unable to change memory partition modes
Changes:
  * [API] Removed checking board name, fixes for other MI ASICs
  * [API] Fixed inability to restart the AMD GPU; libdrm was
    blocking this operation
  * [API] Added ability to unload/reload libdrm
    from within AMD SMI APIs
  * [CLI] Increased progress bar to change memory partition modes
    to 140 seconds, since driver reload is variable per system

Change-Id: I52f227f2ab850c4a6332ff3ecdc899903b1080f1
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 7d061f9ae4]
2024-11-25 09:28:02 -05:00
Joe Narlo b07c57b2cd SWDEV-497305 [AMDSMI] Consistent string lengths
Unify max string length to AMDSMI_MAX_STRING_LENGTH 256
Replace AMDSMI_NORMAL_STRING_LENGTH, AMDSMI_256_LENGTH

Signed-off-by: Joe Narlo <Joseph.Narlo@amd.com>
Change-Id: Ia81d738be0eefb9683ee53d51c969598fe587f50


[ROCm/amdsmi commit: 35d8e827b9]
2024-11-22 15:37:24 -05:00
Joe Narlo bad2cc9c23 SWDEV-495787 [AMDSMI] Different license headers
Change copyrights to MIT and remove date

Signed-off-by: Joe Narlo <Joseph.Narlo@amd.com>
Change-Id: I16f5b412f2b9ddefaaa1771aa714cc18829a1be4


[ROCm/amdsmi commit: 3052ad4220]
2024-11-22 08:55:28 -05:00
gabrpham f963f24d63 [SWDEV-498453] Enabled 'amd-smi set --clk-limit' for virtual environments
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I23e994502d4abc1a602d2341e77ad9c50fcf4839


[ROCm/amdsmi commit: 50eaf14b9e]
2024-11-19 16:17:29 -06:00
gabrpham f7a77c2539 [SWDEV-498453] Enabled for virtual environments
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Ic7b274cf8e579b733515efe84fc0f325256ef8b1


[ROCm/amdsmi commit: fc9d18dd3e]
2024-11-18 11:57:04 -05:00
Maisam Arif e578a46e54 Revert "[SWDEV-446215] Update cmake to put test libs in proper lib dir"
This reverts commit a33cdd7da6.

Reason for revert: Incorrect Path

Change-Id: I88bb304cfab997460a916e1a130fdb75435c648b


[ROCm/amdsmi commit: ed58196e35]
2024-11-18 11:15:22 -05:00
Adam Pryor 469b40d573 Revert "[SWDEV-446215] Update cmake to put test libs in proper lib dir"
This reverts commit a33cdd7da6.

Reason for revert: The gtest of amdsmi is different from that of other components, so it was installed in a share/amdsmi/lib folder. It cannot be installed in a common folder such as /usr/local/bin or /usr/bin because all other components search those folders first.

This is breaking ROCmValidationSuite and other tools. Per Wang, Yanyao this should be reverted.

Change-Id: Id61bc6056fe41800e738616f39293e9b8762a377


[ROCm/amdsmi commit: b7789d4699]
2024-11-15 15:08:12 -05:00
Maisam Arif 48da4536c7 Updated CLI exceptions
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I5c68eed7719c093727afa434e25ba2560dde894a


[ROCm/amdsmi commit: f1c3fbf226]
2024-11-15 11:44:51 -05:00
Maisam Arif 2ee6a3c178 Revert "SWDEV-489696 [AMD SMI] Update python integration test"
This reverts commit f34bfce669.

Reason for revert: Changes needed

Change-Id: I96cc956a2f1c73a2828c70ec9aa22931ba570d8f


[ROCm/amdsmi commit: afd06950c1]
2024-11-14 18:54:48 -05:00
Joe Narlo f34bfce669 SWDEV-489696 [AMD SMI] Update python integration test
Initial update

Signed-off-by: Joe Narlo <Joseph.Narlo@amd.com>
Change-Id: I7c5777159f591f8b402168576b14ef8c1157e8d9


[ROCm/amdsmi commit: 06e7bf8a98]
2024-11-14 17:52:01 -05:00
Maisam Arif c6722a33ad Corrected pyyaml debian package name
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ice1541b4c1fc2297ee8bef5a7c7336c93267e01a


[ROCm/amdsmi commit: dfcf5b4ae5]
2024-11-14 14:42:50 -06:00
Justin Williams 966d0d996f [SWDEV-492047] Removed setup.cfg.in
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
Change-Id: I97b14d05b17fefbb87368824f57bc4ab690f1bf0


[ROCm/amdsmi commit: d3d6157854]
2024-11-13 12:45:09 -05:00
Peter Park 49592ef24f remove duplicated changelog
black format docs/conf.py
add seealso to python api reference

Change-Id: I60fa754f0af662669282dc90eea4b7dc5c5030cc
Signed-off-by: Peter Park <peter.park@amd.com>


[ROCm/amdsmi commit: cbfe403b1d]
2024-11-13 11:46:47 -05:00
Charis Poag f01eea6077 [SWDEV-488276/SWDEV-497613] Update memory partition set functionality
Changes:
  - [CLI] Added warning screen to AMD SMI users
    setting memory partition
  - [CLI] Added a 40-second progress bar to the CLI set display
  - [API] Updated to wait until the driver reloads with SYSFS files active
  - [CLI] Users can now set or reset without providing -g all:
    amd-smi set -g all <set arguments>
    or amd-smi reset -g all <set arguments>
    can now be called directly as sudo amd-smi set <set arguments>
    or sudo amd-smi reset <set arguments>
  - [SWDEV-475712][CLI/API] Fixed target_graphics_version field
    not properly displaying for older MI or Navi ASICs.
  - [All APIs] Added a catch for the driver reporting invalid arguments;
    these APIs now show AMDSMI_STATUS_INVAL
    (ex. changing to NPS8 if the device does not support it)
  - [Install] Modified paths for Python install commands to support
    multi-ROCm installs
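
The "wait until the driver reloads with SYSFS files active" behavior in the list above could look something like the polling loop below. This is only a sketch under stated assumptions: the function name wait_for_path, the polling interval, and the example path are hypothetical, and the 40-second default simply mirrors the progress bar duration mentioned in this commit.

```python
import os
import time


def wait_for_path(path, timeout_s=40.0, poll_interval_s=0.5):
    """Poll until `path` exists or the timeout expires.

    Hypothetical helper: returns True if the path appeared in time,
    False otherwise. It does not check the file's contents.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if os.path.exists(path):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval_s)


if __name__ == "__main__":
    # Example path is an assumption; substitute the sysfs file of interest.
    ready = wait_for_path("/sys/class/drm/card0/device", timeout_s=2.0)
    print("driver files present" if ready else "timed out waiting for driver")
```

Using a monotonic clock for the deadline keeps the timeout stable even if the system clock is adjusted while the driver reloads.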

Change-Id: Id11f25d68a82d23c6b2d77ccb30b51e860dd0ca7
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 3ea4a42a6e]
2024-11-12 16:50:32 -04:00
gabrpham e883dd7c87 Documented and adjusted APIs for asic info, vram info, and P2P topology
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I7ac9a868148e29c92299b21540e057f64cb4123e


[ROCm/amdsmi commit: 19cc4718c0]
2024-11-11 20:45:37 -05:00
gabrpham 5337da2573 Documented and adjusted python apis for pm metrics and reg table info
* amdsmi_get_gpu_pm_metrics_info and amdsmi_get_gpu_reg_table_info
were added to python api documentation
* AmdSmiRegType added as enum
* amdsmi_get_gpu_reg_table_info reg_type changed to AmdSmiRegType

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I57239ecf048e82226151db071e8d9299e9182647


[ROCm/amdsmi commit: 4d26db84ca]
2024-11-11 20:45:37 -05:00
gabrpham 194c33852f [SWDEV-492739] Partial fix for sclk min/max out of bounds
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I1f0230955c890c11a735c8cb352c8a9ee4cebe27


[ROCm/amdsmi commit: 2273d95a6c]
2024-11-11 20:45:37 -05:00
Maisam Arif 17bae546f1 Bump Version to 24.7.1.0
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I0fc42fe55cb653102d189db9aa5eaf723280170e


[ROCm/amdsmi commit: 4b511a31e1]
2024-11-11 19:23:20 -06:00
gabrpham f9bfce707d updated cli tool examples doc to reflect current CLI
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Iab78a412464ba6d7919aeb7da04a031b063a7d09


[ROCm/amdsmi commit: 0f067488e1]
2024-11-11 17:12:40 -05:00
Maisam Arif 43efe1c39a Updated parser help text
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I8cc65edb1e629a55e0efbfc1109b1c549ed81101


[ROCm/amdsmi commit: 7932de967a]
2024-11-11 15:07:21 -06:00
Peter Park a003e50130 docs: Remove redundant/stale docs
bump rocm-docs-core to 1.8.2

rm unused files

rm stale docs

fix sphinx conf

reorg docs

SWDEV-482203 -- add note to usage guides

update readmes

Change-Id: I9e0111ac8fe2a691ac964b27436ba47747c27904
Signed-off-by: Peter Park <Peter.Park@amd.com>


[ROCm/amdsmi commit: e196f98dba]
2024-11-11 16:49:17 -04:00
Maisam Arif 16ffa7714c Updated amdsmi_get_energy_count() C API documentation
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Iac75a0dcd583f39eb97aada769c736c3305cc8a2


[ROCm/amdsmi commit: 6e843436f5]
2024-11-08 16:37:10 -05:00
Maisam Arif 27d81891be Adjusted private helper variables
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I0590b9ee5a1b4d5e6d4ae71c9587550c8d95033b


[ROCm/amdsmi commit: 5449d78cc4]
2024-11-08 11:25:50 -06:00