Commit Graph

475 Commits

Author SHA1 Message Date
Maisam Arif cc4dfd834f Version Bump 26.0.0
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I29ea6fa781dfc338a60b390ff498c46b4a1efe52
2025-05-30 20:48:29 -05:00
Kanangot Balakrishnan, Bindhiya 2eff0b3764 [SWDEV-530633] Use gpu_metric speed and BW for xgmi (#366)
The xgmi command was showing the PCIe bit rate and bandwidth instead of the xgmi values. Corrected the API to get xgmi data from the GPU metrics.
Added a Python API for amdsmi_get_link_metrics. Modified the amdsmi_link_metrics struct.
Added check to confirm non zero partition got xgmi command.

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-30 16:51:11 -05:00
Arif, Maisam 42441c78ea [SWDEV-488303] Adjusted process vram_mem data source (#411)
* [SWDEV-488303] Adjusted process vram_mem data source
* Standardized sscanf format strings

---------

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-05-29 23:26:12 -05:00
Arif, Maisam 0fdaebdbaa [SWDEV-488303] Updated CU occupancy for per-process retrieval (#243)
Change-Id: I2990597c6dd4b2e8cf3e11ce60f72049ebdd9a8c
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-29 20:35:27 -05:00
Maisam Arif fba62e2270 [SWDEV-534707] Adjust power value documentation
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I1c4516e403715b9a1fe9c78fae94848c89daa920
2025-05-29 18:55:44 -05:00
Liu, Shuzhou (Bill) 970560fc7c [SWDEV-520665] Add support for board voltage (#303)
* Add the API and CLI to show the board voltage. 

---------

Change-Id: Icb25bd653bb1d004704b5a21b378ca31b2b242c7
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
2025-05-29 18:55:08 -05:00
Kanangot Balakrishnan, Bindhiya e7f19b36f0 [SWDEV-463406] ViolationStatus Changes (#288)
* Expanded Violation Status tracking for GPU metrics 1.8
* Added new fields to `amdsmi_violation_status_t` and related interfaces for enhanced violation statuses
---------

Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>
2025-05-29 13:26:21 -05:00
Mewar, Deepak 9a49e454fd [SWDEV-512393] Fix for incorrect cpu set size input (#399)
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
2025-05-29 12:14:03 -05:00
Pryor, Adam d0a89393df Remove ring hang (#391)
Change-Id: I856cd0949d3661911ab9302148aa1bc6e72abeed

Signed-off-by: adapryor <Adam.pryor@amd.com>
2025-05-29 11:58:46 -05:00
Maisam Arif 2481573184 Removed leftover AMDSMI_MAX_DRIVER_VERSION_LENGTH
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Iee95728e6eb6d7962ed658b9a77feccb88e24e92
2025-05-29 10:34:21 -05:00
Narlo, Joseph 4cd0f3391e [SWDEV-522996] Syncing Unified Header and AMDSMI (#355)
* Update doxygen help text and formatting

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-05-28 19:06:10 -05:00
Narlo, Joseph b6d638d942 [SWDEV-532125] Remove_Unused_Definitions (#385)
Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com>
2025-05-28 18:49:08 -05:00
Narlo, Joseph 7c29b4eab8 [SWDEV-532131] Update String Lengths (#383)
Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com>
2025-05-28 18:31:30 -05:00
Narlo, Joseph 9862db63dd [SWDEV-532129] Update amdsmi asic info (#369)
* Added `subsystem_id` to `amdsmi_get_gpu_asic_info`
---------
Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com>
2025-05-28 18:26:58 -05:00
Narlo, Joseph f3a5cc9cd5 [SWDEV-533941] Align P2P input struct (#395)
* Removed `amdsmi_io_link_type_t` and replaced it with the already implemented `amdsmi_link_type_t`
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-05-28 18:22:19 -05:00
Narlo, Joseph 38a1fadf44 [SWDEV-535200] Remove deprecated function amdsmi_get_power_info_v2 (#397)
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com>
2025-05-28 18:09:13 -05:00
Narlo, Joseph 7b3c85e970 [SWDEV-534438] Update structure amdsmi_bdf_t (#388)
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com>
2025-05-28 18:05:43 -05:00
Narlo, Joseph f71ae88956 [SWDEV-529483] Get Vram Vendor Name from Driver (#323)
* Update to remove vram enum and instead use the string directly from the driver.

Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-28 17:57:49 -05:00
Maisam Arif cebc512b1a Spellcheck
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I3842ca7552c8d3525ac7fee8c94b15cfdd7defdd
2025-05-27 13:59:23 -05:00
Pham, Gabriel c40d4291f6 Updated docs with new KFD events (#382)
* Updated docs with new KFD events

---------

Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
2025-05-27 12:21:38 -05:00
Daniel Oliveira fe9b6eeb49 [SWDEV-529665] Add PLDM Bundle version
feat: Report PLDM Bundle from SMC to IB

Code changes related to the following:
  * APIs
  * CLI
  * Unit tests

Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
Change-Id: I35ccf01eb612ca80e3ae6b72039085c18c989222
2025-05-20 01:37:00 -05:00
Mewar, Deepak b999f86611 [SWDEV-512393] Added amdsmi_get_cpu_affinity_with_scope (#198)
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
2025-05-20 01:06:09 -05:00
Pryor, Adam 51e99965b3 [SWDEV-527092] - Fix ringhang event removal (#372)
Signed-off-by: adapryor <Adam.pryor@amd.com>
2025-05-16 16:41:31 -05:00
Pryor, Adam 8713305f80 [SWDEV-527092] - Process Start/Stop event addition (#368)
- Added more events to `amdsmi_evt_notification_type_t`

Change-Id: I6a256fe828e4bec3197c7fecbed374ab17c6f850
Signed-off-by: Adam Pryor <Adam.Pryor@amd.com>
2025-05-16 11:01:15 -05:00
Saeed, Oosman 1bb1f8acc2 [SWDEV-522623] Add afid functionality to API and CLI (#330)
Change-Id: I015bde926491d54e09da8f39b05650515711e09f

Signed-off-by: Oosman Saeed <oossaeed@amd.com>
Co-authored-by: Oosman Saeed <oossaeed@amd.com>
2025-05-16 10:49:56 +08:00
Arif, Maisam ace3b0901a Version & Doc update (#343)
Change-Id: Ibf8e1809913e30aba4b21ba889b72e5db7205736

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-08 12:19:04 -05:00
Poag, Charis b5a43b7744 [SWDEV-528647/SWDEV-528450] Reduce API load times and libdrm/libdrm_amdgpu dynamic loading (#333)
Changes:
- Removed libdrm/libdrm_amdgpu dependencies
- Added/updated new internal libdrm/libdrm_amdgpu/xf86drm APIs
  to allow our APIs to reference before dynamic loading
  the libdrm/libdrm_amdgpu libraries:
  1. Updated amdgpu_drm.h to match what's seen in mainline
  2. Added xf86drm.h matching what's seen in mainline
- Modified internal DRM capabilities:
  1. Require each API to independently connect to libdrm/libdrm_amdgpu
     + validate API handles responses accordingly
  2. Initialization of AMD SMI no longer has as strong of a tie to
     libdrm
- Updated internal implementations of several APIs which have
connections to libdrm/libdrm_amdgpu or APIs which have conflicts
with open libdrm/libdrm_amdgpu connections:
  1. amdsmi_init()
  2. amdsmi_get_gpu_vram_usage()
  3. amdsmi_get_gpu_asic_info()
  4. amdsmi_get_gpu_vram_info()
  5. amdsmi_get_gpu_vbios_info()
  6. amdsmi_get_gpu_driver_info()
  7. amdsmi_get_gpu_virtualization_mode()
  8. amdsmi_set_gpu_memory_partition()
  9. amdsmi_set_gpu_memory_partition_mode()
- Cleaned up affected tests/APIs

Change-Id: I96e2cf1b06b0cfee1b01a5e991ccc6116c4245a8
2025-05-02 21:58:53 -05:00
Narlo, Joseph d5ce95573f [SWDEV-522996] Sync Unified Header and AMDSMI (#305)
Sync Unified Header and AMDSMI

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-04-24 13:31:08 -05:00
Poag, Charis b58625cafa [SWDEV-528097] Unique ID fix for missing ID in KGD -> use KFD's (#292)
Changes:
   - Unique Id tries reading from KGD
     -> falls back to use KFD if not found

Change-Id: I05456dd79715e04d83f118b5bb4f1d3612822173
---------

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-04-22 16:27:33 -05:00
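
The fallback described in the commit above (try KGD first, use KFD if no ID is found) can be illustrated with a minimal sketch. This is not the library's implementation: `read_unique_id` and the two reader callables are hypothetical names introduced here, standing in for whatever driver queries the real code performs.

```python
from typing import Callable, Optional


def read_unique_id(
    read_from_kgd: Callable[[], Optional[str]],
    read_from_kfd: Callable[[], Optional[str]],
) -> Optional[str]:
    """Return the KGD-reported unique ID, or the KFD one if KGD has none."""
    unique_id = read_from_kgd()
    if unique_id:  # KGD exposed a non-empty ID, so use it
        return unique_id
    return read_from_kfd()  # otherwise fall back to KFD


# Hypothetical readers standing in for the real driver queries.
def kgd_missing() -> Optional[str]:
    return None  # simulates a KGD that does not expose the ID


def kfd_present() -> Optional[str]:
    return "0x1234abcd"  # made-up value for illustration only


resolved = read_unique_id(kgd_missing, kfd_present)
```

Passing the readers in as callables keeps the ordering logic separate from how each source is actually queried, which is an assumption of this sketch rather than something the commit states.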
AL Musaffar, Yazen d6954bcc62 Removed CPER tests and adjust the implementation (#269)
- Moved helper functions into amdsmi_utils.cc
- Removed tests since they are not working.

---------

Co-authored-by: Saeed, Oosman <Oosman.Saeed@amd.com>
2025-04-21 14:54:47 -05:00
Galantsev, Dmitrii 955ceac78a Add amdsmi_get_gpu_busy_percent
This is required for GPU busy percent in RDC

Change-Id: Idf2ab72993ecc8227958e6eb47f36fc68c93759f
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-04-14 10:40:13 -05:00
Charis Poag 4782528770 [SWDEV-518325/SWDEV-518320/SWDEV-443309] Fix Partition Enumeration
* Changes:
  - Updates to DRM renderD* / card* pathing for partition
  - Now use KFD to discover AMD devices and populate accordingly
    Device MUST have an accessible KFD node (via cgroups)
  - Updated several AMD SMI CLI outputs to handle SYSFS files
    which are not accessible on partition nodes
  - Tests are updated to handle not supported features
  - Added new method to help get card/drm info
    (rsmi_dev_device_identifiers_get) from ROCm SMI
  - Renamed device->get_card_id() & device->get_drm_render_minor()
    These can now be used on internal AMD SMI calls.
  - Removed warnings shown in build

Change-Id: Ice882fd9b97fb625a5bd4ef327f3ceaf247dc570
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-04-12 14:41:38 -05:00
Arif, Maisam d81871ef16 [SWDEV-511234] Added amdsmi_get_gpu_cper_entries & CLI implementation
Added amdsmi_get_gpu_cper_entries() in the python and C APIs

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
Co-authored-by: Saeed, Oosman <Oosman.Saeed@amd.com>
Co-authored-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
2025-04-12 01:54:57 -05:00
Galantsev, Dmitrii b0129c390c Bump Version 25.4.0
Change-Id: Ief60ff2270e7e73d4e14b5181fa6fb18e32bcc1e
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-03-28 21:50:38 -05:00
Yuan, Perry 68e44c7f66 [SWDEV-482949] Add CPU model name querying support (#33)
- Add support to check CPU vendor info, which will be called by RDC to
discover CPU information
- Move esmi headers declaration to impl/amd_smi_common.h
- remove duplicated amdsmi_cpu_util_t

---------

Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Deepak Mewar <deepak.mewar@amd.com>
2025-03-28 21:21:39 -05:00
Poag, Charis 0402bb4d75 [SWDEV-513807] Fix amd-smi partition --accelerator not returning AMDSMI_STATUS_NO_PERM (#192)
* [SWDEV-513807] Fix amd-smi partition --accelerator not returning AMDSMI_STATUS_NO_PERM

Changes:
- Fixed amdsmi_get_gpu_accelerator_partition_profile_config() not
  returning AMDSMI_STATUS_NO_PERM
- Changed amd-smi partition --accelerator to warn users who do not
  run it with sudo or root permissions.
- Updated changelog for fixes planned for 6.4.1 release

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-03-20 17:23:01 -05:00
Galantsev, Dmitrii 4a3c70136f Make amdsmi_get_power_info backwards compatible
Change-Id: Ie5b4c35265827e78934caa94c142d31efce597e4
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-03-19 23:23:48 -05:00
Castillo, Juan 7c882b2f69 SWDEV-518209: GPU Metrics 1.8 (#177)
- Updates:
    - Adding the following metrics to allow new calculations for violation status:
        - Per XCP metrics gfx_below_host_limit_ppt_acc
        - Per XCP metrics gfx_below_host_limit_thm_acc
        - Per XCP metrics gfx_low_utilization_acc
        - Per XCP metrics gfx_below_host_limit_total_acc
    - Increasing available JPEG engines to 40. Current ASICs may not support all 40; unsupported engines are reported as UINT16_MAX, or as N/A in the CLI.

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>
2025-03-19 10:24:02 -05:00
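
The sentinel convention mentioned in the commit above (unsupported JPEG engines reported as UINT16_MAX and shown as N/A) might be handled by a consumer roughly as follows. This is a sketch under stated assumptions: `format_engine_activity` and the sample values are hypothetical and are not part of the AMD SMI API or CLI.

```python
UINT16_MAX = 0xFFFF  # sentinel for an engine the ASIC does not support


def format_engine_activity(values):
    """Render per-engine readings, replacing the sentinel with 'N/A'."""
    return ["N/A" if v == UINT16_MAX else str(v) for v in values]


# Made-up readings: two supported engines and one unsupported engine.
sample = [12, 0, UINT16_MAX]
formatted = format_engine_activity(sample)
```

A reading of 0 stays "0" here, since only the sentinel value marks an engine as unsupported.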
Poag, Charis 48cb5529d2 [SWDEV-493274/SWDEV-514998] Add AMD SMI partition tests + Add Guest amd-smi static --partition (#127)
* [SWDEV-493274/SWDEV-514998] Add AMD SMI partition tests + Add Guest amd-smi static --partition

Changes:
    - Added amd-smi static --partition for guest systems
    - Added C++ tests for memory and compute (accelerator) partitions
    - Added Python tests for amdsmi_get_gpu_vram_info(),
       amdsmi_get_gpu_accelerator_partition_profile_config()
    - Updated Python tests for
      amdsmi_get_gpu_accelerator_partition_profile()
      Now includes more profile and resource detail
    - Added amdsmi_get_gpu_xcd_counter();
      Tests provided for both C++/Python APIs
    - Added AmdSmiVramType & AmdSmiVramVendor: they were missing,
      and Python testing required adding them.

Change-Id: Ib6549d8ccc5fb68726f38745b87c78f890186022
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-03-11 16:38:46 -05:00
Arif, Maisam 0e67568902 [SWDEV-501958] Doc Update deprecating pasid in 7.0 (#166)
Change-Id: Ie19ba271c901d0be324143474871241272166124

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I024f7e2b5e7a5fcd6e1d12181d21ffacfe29c00f
2025-03-07 14:56:46 -06:00
AL Musaffar, Yazen 2936e00fed [SWDEV-453922] AMD SMI to provide mapping feature of other enumeration methods (#51)
Added enumeration mapping for 
- drm render
- drm card
- hsa id 
- hip id
- hip uuid (rocminfo uuid)

Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-03-07 09:09:12 -06:00
Pham, Gabriel d5b2763aba [SWDEV-515730] Updated set partition documentation (#151)
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
2025-03-06 23:16:32 -06:00
Park, Peter 0b4a6ff149 [SWDEV-513210] Add references to AMDGPU RAS Support info in API docs (#144)
Add reference to AMDGPU RAS Support info in API docs
2025-03-04 09:32:23 -06:00
Narlo, Joseph d7c3ad0886 [SWDEV-515031] Change Header Version to 25.2.0 (#109)
Change Versioning Scheme to match https://semver.org/
Dropping the year enum and API fields in a future release.
Should not impact library versioning since we are now starting from 25.2.0
---------

Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>
Change-Id: Id090e23f156926d08f9c0b781447388adf268cf6
2025-02-26 19:17:09 -06:00
Joseph Narlo ddcdd60964 Merge branch 'SWDEV-517156/Synchronize_Unified_and_Amdsmi_Headers' of github.com:AMD-ROCm-Internal/amdsmi into SWDEV-517156/Synchronize_Unified_and_Amdsmi_Headers
2025-02-24 09:21:22 -06:00
Joseph Narlo b38c9aa1cc [SWDEV-517156] Synchronize Unified and Amdsmi Headers
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
2025-02-21 15:05:57 -06:00
Mewar, Deepak 2c591ffcc1 [SWDEV-499995] ESMI Build/Compiler warnings messages (#105)
* [SWDEV-499995] ESMI Build/Compiler warnings messages

Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
2025-02-18 16:20:28 -06:00
Narlo, Joseph dc4a16da6f [SWDEV-513651] Sync Unified And Linux Header (#98)
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
2025-02-06 22:25:50 -06:00
Pham, Gabriel 09379f8438 Changed default behavior of amdsmi_get_gpu_virtualization_mode (#97)
Changed return behavior of amdsmi_get_gpu_virtualization_mode

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-02-05 19:09:44 -06:00
Narlo, Joseph 8e454950ef [SWDEV-509782] Add tags and redefine groups (#73)
Add tags and redefine groups in amd-smi header

Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
2025-02-05 18:43:55 -06:00