نمودار کامیت

149 کامیت‌ها

مولف SHA1 پیام تاریخ
Poag, Charis d24dc7ef89 [SWDEV-518561] Separate Driver Reload from Memory Partition Sets (#582)
Description:
  - Added a new API `amdsmi_gpu_driver_reload()` to reload the AMD GPU driver independently.
  - Updated CLI (`sudo amd-smi reset -r`) and Python bindings to support driver reload functionality.
  - Removed automatic driver reload from `amdsmi_set_gpu_memory_partition()` and `amdsmi_set_gpu_memory_partition_mode()`.
  - Enhanced CLI and test cases to allow users to control when the driver reload occurs.
  - Updated documentation and changelog to reflect the new driver reload process.
  - Improved error handling and logging for driver reload operations.
  - Added progress bar and user confirmation prompts for driver reload commands.

* Update build/test strategy to only allow one test execution at a time
* Modify API verbage + modify systemctl error output
  - Systemctl is typically not enabled on docker.
  - And is an edge case for gpu being active process/etc for display devices.
* Remove AMDSMI_STATUS_AMDGPU_RESTART_ERR from the return values
* Move driver reload to after we save original compute partitions

---------

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-08-05 20:44:28 -05:00
Castillo, Juan 3b1957e674 [SWDEV-531904] Added test_get_gpu_revision (#533)
* [SWDEV-531904] Added test_get_gpu_revision
New:
- amdsmi_get_gpu_revision() previously not implemented in amdsmi_interface.py
- test_get_gpu_revision() missing integration test.

Updated:
-changelog.md added new doc fields for ROCm 7.1
-amdsmi-py-api.md added field|description doc fields

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
2025-07-15 19:35:54 -05:00
Narlo, Joseph 7c0802889b [SWDEV-489696] Improve AMD SMI Python APIs Functional and Unit Testing (#468)
* Adding python unit tests
* Remove duplicate functions definitions
* Added missing classes for __init__ for py-interface

---------

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-06-19 16:38:34 -05:00
Kanangot Balakrishnan, Bindhiya 2eff0b3764 [SWDEV-530633] Use gpu_metric speed and BW for xgmi (#366)
The xgmi command was showing pcie bit rate and bandwidth instead of xgmi. Corrected the API to get xgmi data from gpu metric.
Added python API for amdsmi_get_link_metrics. Modified the amdsmi_link_metrics struct.
Added check to confirm non zero partition got xgmi command.

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-30 16:51:11 -05:00
Narlo, Joseph f3a5cc9cd5 [SWDEV-533941] Align P2P input struct (#395)
* Removed `amdsmi_io_link_type_t` and replaced with alredy implemented amdsmi_link_type_t
Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-05-28 18:22:19 -05:00
Narlo, Joseph f71ae88956 [SWDEV-529483] Get Vram Vendor Name from Driver (#323)
* Update to remove vram enum and instead use the string directly from the driver.

Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-28 17:57:49 -05:00
Mewar, Deepak b999f86611 [SWDEV-512393] Added amdsmi_get_cpu_affinity_with_scope (#198)
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
2025-05-20 01:06:09 -05:00
Kanangot Balakrishnan, Bindhiya 9d7964dff5 [SWDEV-516592] Add python interface API for Bad Page Threshold (#141)
- Added python interface APIs for amdsmi_get_gpu_bad_page_threshold()
 - Updated the docs and changelog.

---------

Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-04-14 04:19:45 -05:00
Arif, Maisam d81871ef16 [SWDEV-511234] Added amdsmi_get_gpu_cper_entries & CLI implementation
Added amdsmi_get_gpu_cper_entries() in the python and C APIs

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
Co-authored-by: Saeed, Oosman <Oosman.Saeed@amd.com>
Co-authored-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
2025-04-12 01:54:57 -05:00
Arif, Maisam 35fbe2cbf1 [SWDEV-521408] Fixed call to amdsmi_get_gpu_virtualization_mode (#230)
Change-Id: I29c86f8982b53cc139004ebc06b26a5d8f430091

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-04-01 16:57:23 -05:00
Yuan, Perry 68e44c7f66 [SWDEV-482949] Add CPU model name querying support (#33)
- Add support to check CPU vendor info which will be called by RDC to
discovery CPU information
- Move esmi headers declaration to impl/amd_smi_common.h
- remove duplicated amdsmi_cpu_util_t

---------

Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Deepak Mewar <deepak.mewar@amd.com>
2025-03-28 21:21:39 -05:00
Poag, Charis 48cb5529d2 [SWDEV-493274/SWDEV-514998] Add AMD SMI partition tests + Add Guest amd-smi static --partition (#127)
* [SWDEV-493274/SWDEV-514998] Add AMD SMI partition tests + Add Guest amd-smi static --partition

Changes:
    - Added amd-smi static --partition for guest systems
    - Added C++ tests for memory and compute (accelerator) partitions
    - Added Python tests for amdsmi_get_gpu_vram_info(),
       amdsmi_get_gpu_accelerator_partition_profile_config()
    - Updated Python tests for
      amdsmi_get_gpu_accelerator_partition_profile()
      Now includes more profile and resource detail
    - Added amdsmi_get_gpu_xcd_counter();
      Tests provided for both C++/Python APIs
    - Added AmdSmiVramType & AmdSmiVramVendor: they were missing
      python testing required adding.

Change-Id: Ib6549d8ccc5fb68726f38745b87c78f890186022
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-03-11 16:38:46 -05:00
AL Musaffar, Yazen 2936e00fed [SWDEV-453922] AMD SMI to provide mapping feature of other enumeration methods (#51)
Added enumeration mapping for 
- drm render
- drm card
- hsa id 
- hip id
- hip uuid (rocminfo uuid)

Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-03-07 09:09:12 -06:00
Arif, Maisam 52b3ee2dc6 [SWDEV-503520] Add amdsmi_get_rocm_version() in python library (#76)
Changed amdsmi_get_rocm_version() to be an API in the python library only. 
Updated usage and version detection
Updated path detection of librocm-core.so
Updated docs to reflect both amdsmi_get_rocm_version and amdsmi_get_lib_version() do not require initialization.

Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-02-26 05:45:58 -06:00
Pham, Gabriel e663bed7d6 [SWDEV-462952] Updated passthrough to use virtualization mode struct
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-01-31 17:34:01 -06:00
Ramalingam, Muthusamy ced110dbb6 amdsmi: Adding Support to get hsmp Driver version
* amdsmi: Adding Support to get hsmp Driver version

Adding Support to fetch hsmp driver version from ESmi Interfaces.
Adding Support to fetch memory bandwidth per socket.

Signed-off-by: muthusamy <muthusamy.ramalingam@amd.com>
2025-01-29 13:45:02 -06:00
Castillo, Juan 9cc5c303a2 [SWDEV-508173] [AMDSMI] Python API missing function errors (#46)
* [SWDEV-508173] Updates include:
- Updating py-interface to import amdsmi_get_gpu_reg_table_info and amdsmi_get_gpu_pm_metrics_info.
- Updating the ctypes from byref to pointer.

Signed-off-by: Castillo, Juan <Juan.Castillo@amd.com>
2025-01-21 14:11:41 -06:00
Poag, Charis c1cd2b46ef [SWDEV-488276] Add partition 2.0 functionality (#44)
Changes:
* CLI:
  - Updated amd-smi partition
  - Updated amd-smi partition -c
  - Updated amd-smi partition -m
  - Updated amd-smi partition -a
  - Updated amd-smi set -M <NPS1/NPS2/NPS4/NPS8>
  - Updated amd-smi set -C <SPX/DPX/QPX/TPX/CPX>
  - Updated amd-smi set -C <ACCELERATOR_TYPE> or <PROFILE_INDEX>
    Where PROFILE_INDEX = available ACCELERATOR_TYPES
  - Updated amd-smi set --help, now includes more detail for
    amd-smi set -C <ACCELERATOR_TYPE> or <PROFILE_INDEX>

* API:
  - Added amdsmi_get_gpu_memory_partition_config
  - Added amdsmi_set_gpu_memory_partition_mode
  - Added amdsmi_get_gpu_accelerator_partition_profile_config
  - Updated amdsmi_get_gpu_accelerator_partition_profile_config
  - Added amdsmi_set_gpu_accelerator_partition_profile

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-01-16 00:53:46 -06:00
Juan Castillo f8b8347627 [SWDEV-496693]GPU Metrics 1.7
Features added:
- [SWDEV-475244] Add new interface to get max memory bandwidth
Updated API: amdsmi_get_gpu_vram_info
Updated: struct amdsmi_vram_info_t to include vram_max_bandwidth
CLI: amd-smi static --vram

- [SWDEV-488349] Add new interface for XGMI link status
New API: amdsmi_get_gpu_xgmi_link_status
CLI: amd-smi xgmi --link-status

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Change-Id: I1aa35b741136eb4f02f7ea9a95b865886273eb72
2024-12-18 10:57:06 -06:00
Joe Narlo 3052ad4220 SWDEV-495787 [AMDSMI] Different license headers
Change copyrights to MIT and remove date

Signed-off-by: Joe Narlo <Joseph.Narlo@amd.com>
Change-Id: I16f5b412f2b9ddefaaa1771aa714cc18829a1be4
2024-11-22 08:55:28 -05:00
gabrpham f5b7761ac7 [SWDEV-490187] reset gpu partition were removed
The reset gpu partition support for both compute and memory were removed

Code changes related to the following:
  * amdsmi_reset_gpu_compute_partition()
  * amdsmi_reset_gpu_memory_partition()
  * CLI

Change-Id: I372589074b4da172bedd39223edde18939e373ae
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2024-10-18 16:22:26 -05:00
Charis Poag 3a4abbd8c0 [SWDEV-422195/SWDEV-440985] GPU metrics 1.6
Changes:
    - Added new GPU metrics:
      1) Violation status' (ex. PVIOL/TVIOL) accumulators
      2) XCP (Graphics Compute Partitions) statistics
      3) pcie other end recovery counter
    - CLI/API/tests changes were made accordingly

Change-Id: I589b9b1f570f25dda12d95bb501feca85da8b3bb
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-09-27 12:04:21 -05:00
Lang Yu 7a557b1c50 SWDEV-463405: Add amdsmi_get_link_topology_nearest support
amdsmi_get_link_topology_nearest() is used to retrieve
the set of GPUs that are nearest to a given device
at a specific interconnectivity level.

Code changes related to the following:
    * API
    * CLI
    * Unit tests
    * Examples

Header Unification Change: "/amdsmi/+/1122408"

Change-Id: Id0317797c652c267742513936d321677793ec634
Signed-off-by: Lang Yu <lang.yu@amd.com>
2024-09-26 16:43:27 -05:00
gabrpham c9a489d437 Moved partition_id from static --asic-info to static --partition.
partition_id also removed from the `amdsmi_asic_info_t` struct and
supporting API has been added for querying partition information.

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Id5a6291a77d11bb97a1c7a200fc465898e86e081
2024-09-20 03:48:42 -04:00
Maisam Arif 3b7f661e71 Moved KFD information to separate structure and API
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: If6eaea589edc704cf408d6391b5f2154134035e7
2024-09-20 03:48:42 -04:00
gabrpham b7f779182d [SWDEV-448738] Added rocmsmi extremum command as 'set -L'
Change-Id: I997c630bd20cc61673813a2301eb5e3002619a32
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>

Change-Id: Ifa884303f9a0fa058af093a23f5be449bba54f29
2024-09-18 14:51:01 -04:00
Maisam Arif 105db1afcd Udpated License Dates
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I8ca199c129c06508bc3e23745ab5ac2d20dce928
2024-09-16 16:14:47 -04:00
Tim Huang 260edaa752 [SWDEV-463402] - Support retrieving connection type and P2P capabilities between two GPUs
1. Add a API interface amdsmi_topo_get_p2p_status to retrieve
connection type and P2P capabilities between 2 GPUs.

2. Add getting p2p status test in hw_topology_read
to print P2P capability information.

3. Add below tables for cli topology sub commands:
  - CACHE COHERANCY TABLE
  - ATOMICS TABLE
  - DMA TABLE
  - BI-DIRECTIONAL TABLE

Change-Id: I199173030d4170115cea27c472958a4826e4e1bf
Signed-off-by: Tim Huang <tim.huang@amd.com>
2024-09-06 09:42:34 -04:00
gabrpham 7d8e54d0e1 [SWDEV-450553] Added gpu memory overdrive to metric function
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: If7bd6865d641a5a83c594a4d3c57938b1b6dc18e
2024-09-04 12:54:14 -04:00
Bill(Shuzhou) Liu e3c63628e5 Change the clean shader API to clean local data
To be align with the unified API.

Change-Id: I2819339fba6f528204cebd3e9605109e82cbc5b4
2024-06-17 16:23:33 -05:00
Bill(Shuzhou) Liu 4cf59c4edb Change the name of clear sram to run cleaner shader
The function is to clean the local data in LDS/GPRs. The clear sram
is misleading.

Change-Id: I0385e6d6348602fe0f347d17e48ed8983f7ceb87
2024-06-05 12:07:39 -05:00
Maisam Arif e5d1ba4621 Use different sysfs for soc_pstate and xmgi_plpd
The sysfs is changed to use the pm_policy folder with multiple
dpm_policy files.

Change-Id: I40fac8de2d0cb127950d238b8196f6d2416778d0
2024-05-31 01:38:41 -04:00
Maisam Arif 614816ab7e Added new functions to py-interface __init__.py
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I4bd591f834b026793cc9158890e30999cba46e82
2024-04-24 14:26:23 -04:00
Maisam Arif c551c3caed SWDEV-455131 - Updated process APIs
- Removed amdsmi_get_gpu_process_info from python API
  - Updated documentation
  - Aligned process --json output format to unit & value format

Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I82bba1b6df71020b4a5995ff63b9aa62611ce4fe
2024-04-18 14:00:59 -05:00
Maisam Arif 55cfcf11d6 Added __version__ attr to Python Library
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ibbd90eaa60cc8b9dd0387d7fac8aef06a3a43375
2024-02-28 16:27:33 -05:00
Maisam Arif f831cf49f7 Renamed amdsmi_get_metrics_table to amdsmi_get_cpu_metrics_table
Renamed structs to be more conistent with what they are calling

Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I6f2be2fcb76f004aa592f0dad8545565700ccd4b
2024-02-12 16:30:18 -06:00
Deepak Mewar 6f7273fda5 Added amdsmi cpu family & cpu model
- Updated header and source files
- Updated python interface
- Generated python wrapper for updated header
- Updated the CLI to have cpu family & cpu model
  as part of metric table

Change-Id: Iea440251797270d5d29ffe883b0ad6db790be658
2024-02-06 18:46:27 -05:00
Maisam Arif 59d885a9ca Fixed gpu_metric and cache cli checks
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ic71e2b50dfa8fc106a17079842a7564a8e24b69d
2024-02-01 05:47:18 -05:00
Bill(Shuzhou) Liu 0b67c2ccc4 Unified API
amdsmi_get_link_metrics() and amdsmi_get_pcie_info()

Change-Id: Iea060e449813b842236243b772e8809497ce98fe
2024-01-24 18:27:20 -05:00
Naveen Krishna Chatradhi 94d3c563a3 amdsmi: py-interface: Add python interface for esmi api
Change-Id: I4a3ab1168a7d1bf011ecc9c508e111c281503520
2023-12-18 06:31:35 -05:00
Maisam Arif e9a6153836 Fix imports for partition API's
Change-Id: Ic3bc0230405ee5e662bfd2d5c6d0ed5bca42a671
2023-12-13 23:52:54 -06:00
Maisam Arif d790ebc62b Refactor gpu_metrics usage in libraries
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I763638d4b546bf49b234e823df81028c357e8f49
2023-11-22 03:32:15 -06:00
Deepak Mewar 14c50c9b4e another set of esmi python wrappers updated to amdsmi python library
Change-Id: I33557b9021ecfdab76daaf65ad63f624115aa322
2023-11-10 15:50:42 -05:00
Maisam Arif 6ec87b6a23 CLI Fixes & ESMI import handling
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I08e66d2bdbe37064fd73f42eeb9199f1e3bc3404
2023-10-16 19:29:17 -05:00
Deepak Mewar e6d82912e3 Added another set of esmi python wrappers for
amdsmi python library

Change-Id: Ib49efe21d4909c3e4a011eddcd96f587a9b6570c
2023-10-16 17:49:21 -04:00
Galantsev, Dmitrii 8568af65ff CMAKE - Generate ESMI wrapper
The wrapper is only generated if ENABLE_ESMI_LIB option is set.
./update_wrapper.sh will check the option if cmake was ran first.

Change-Id: I6267cdba8c6ecdff58ced75a2aa59afae964446c
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-10-16 16:49:00 -05:00
Maisam Arif 66eb3de5e4 Added static --cache to cli tool
Change-Id: I494d29aba7915a0b8815036977b2636a2da5264e
Signed-off-by: Maisam Arif <maisarif@amd.com>
2023-10-12 10:13:56 -04:00
Maisam Arif c6e54e09e7 Placeholders for new ras values
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I66cea253b19a5029172af0a3f4dd2f993c13b309
Signed-off-by: Maisam Arif <maisarif@amd.com>
2023-10-11 19:39:38 -04:00
Maisam Arif 9281bfbbfa Fixed log handling and exceptions
Updated exceptions
	Added driver load exception
	Fixed logging override by removing previous log handlers
	Updated debug output to use gpu_id vs C-pointer
	Removed AmdSmiRetcode class in favor of using the wrapper
directly
	Added traceback limits for clean errors (Not in debug)

Change-Id: Ia02bb842b8f60d9ab4b68b7f8b1afda30b1c021c
Signed-off-by: Maisam Arif <maisarif@amd.com>
2023-09-25 01:49:35 -05:00
Maisam Arif 6dcfd4a815 Added vram info to static --vram
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I48f6875d5131440848ef6f5875c6d385fee871e3
2023-09-23 23:52:50 -04:00