Commit graph

399 commits

Author SHA1 Message Date
Castillo, Juan e123e986f9 [SWDEV-530211] Fix for VCLK & DCLK N/A values + Update deep sleep logic (#342)
- Updated VCLK and DCLK min/max clock logic to populate N/A values.
- Updated VCLK and DCLK to show all available clocks.
- Updated deep_sleep logic to use the sysfs clk_deep_sleep true/false value.
- Added clarifying comments.
- Updated error output using e.get_error_info() instead of just error.
- Updated changelog

---------

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
2025-05-08 14:39:21 -05:00
Galantsev, Dmitrii bd82e881f5 [SWDEV-529762] CMAKE - Fix lintian issues (#325)
Change-Id: Ide3563a876cb530d0e80676e78f36f18a233a3ba

Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-05-06 17:59:47 -05:00
Galantsev, Dmitrii 42c77a5912 CMAKE - Format with cmake-format
Change-Id: I5b86b7b83e3d151c3d6e1c216ecb28f1313d538a
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-05-06 17:09:53 -05:00
Arif, Maisam ee14ef7b95 [SWDEV-531364] Removed Python API debug statements (#351)
Removed Python API debug statements

Change-Id: Ifc17a7b49b11bce56075d620a9b0e7cbbdb5f417

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-05-06 14:01:59 -05:00
Galantsev, Dmitrii fe98b8bd63 CMAKE - Clean-up cmake changes introduced in a9b8b6d369b390af0c00bbffab2b4fe1748b8bad
Change-Id: Ida0e9475a926a2495e36b0d9bc2468c48aee0e77
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-05-05 15:43:12 -05:00
Poag, Charis b5a43b7744 [SWDEV-528647/SWDEV-528450] Reduce API load times and libdrm/libdrm_amdgpu dynamic loading (#333)
Changes:
- Removed libdrm/libdrm_amdgpu dependencies
- Added/updated new internal libdrm/libdrm_amdgpu/xf86drm APIs
  to allow our APIs to reference before dynamic loading
  the libdrm/libdrm_amdgpu libraries:
  1. Updated amdgpu_drm.h to what's seen in mainline
  2. Added xf86drm.h to what's seen in mainline
- Modified internal DRM capabilities:
  1. Require each API to independently connect to libdrm/libdrm_amdgpu
     + validate API handles responses accordingly
  2. Initialization of AMD SMI no longer has as strong of a tie to
     libdrm
- Updated internal implementations of several APIs which have
connections to libdrm/libdrm_amdgpu or APIs which have conflicts
with open libdrm/libdrm_amdgpu connections:
  1. amdsmi_init()
  2. amdsmi_get_gpu_vram_usage()
  3. amdsmi_get_gpu_asic_info()
  4. amdsmi_get_gpu_vram_info()
  5. amdsmi_get_gpu_vbios_info()
  6. amdsmi_get_gpu_driver_info()
  7. amdsmi_get_gpu_virtualization_mode()
  8. amdsmi_set_gpu_memory_partition()
  9. amdsmi_set_gpu_memory_partition_mode()
- Cleaned up affected tests/APIs

Change-Id: I96e2cf1b06b0cfee1b01a5e991ccc6116c4245a8
2025-05-02 21:58:53 -05:00
Narlo, Joseph d5ce95573f [SWDEV-522996] Sync Unified Header and AMDSMI (#305)
Sync Unified Header and AMDSMI

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>

---------

Signed-off-by: josnarlo <Joseph.Narlo@amd.com>
2025-04-24 13:31:08 -05:00
Kanangot Balakrishnan, Bindhiya 8e5f6b1a8d [SWDEV-520371] Generate valid json format output (#273)
Previously, the amd-smi metric and static JSON output
was not valid JSON. The output is now emitted
in valid JSON format.

---------
Change-Id: I5576333269509f63b3c800f225c3d73127ce80cf

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-04-23 00:08:43 -05:00
Arif, Maisam 53dbb7bf58 CLI Help text and parser formatting updates (#218)
* Small Fixes
* CLI Help text and parser formatting updates
* Changed metavar for set partition

---------
Change-Id: Ia8809665f6fac670452cd4db4e5e8f9c7270faba
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Co-authored-by: Pham, Gabriel <Gabriel.Pham@amd.com>
2025-04-22 23:32:42 -05:00
Castillo, Juan 4d92dea079 [SWDEV-523794] Update to fix MIN_CLK and MAX_CLK incorrect values (#280)

- Fixed potential issue with min/max values when only one frequency is available
- Improve error handling in GPU frequency range detection
- Refactor clock frequency range detection for better readability
- Added special handling for current frequency indicator (*) in DPM output
- Added comments explaining special case handling for current frequency
- Cleaned up incorrect definitions in hsmp metric table definition

---------

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-04-17 17:46:04 -05:00
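
The min/max and current-frequency handling described in the entry above can be sketched roughly as follows. This is only an illustration, not the code from the commit: it assumes DPM text in the amdgpu sysfs `pp_dpm_*` style (`<index>: <value>Mhz`, with a trailing `*` on the current level), and `parse_dpm_levels` is a hypothetical helper name.

```python
import re

# Hypothetical helper; not part of the AMD SMI API.
# Assumed line format: "<index>: <value>Mhz", optionally followed by "*".
_LEVEL_RE = re.compile(r"^\s*(\d+):\s*(\d+)\s*[Mm][Hh][Zz]\s*(\*)?\s*$")


def parse_dpm_levels(text):
    """Return (min_mhz, max_mhz, current_mhz) from DPM-style text.

    Any value that cannot be determined is returned as the string "N/A".
    """
    freqs = []
    current = "N/A"
    for line in text.splitlines():
        match = _LEVEL_RE.match(line)
        if match is None:
            continue  # skip blank or unrecognized lines
        mhz = int(match.group(2))
        freqs.append(mhz)
        if match.group(3):  # a trailing '*' marks the current level
            current = mhz
    if not freqs:
        return "N/A", "N/A", "N/A"
    # With a single level, min and max are the same value.
    return min(freqs), max(freqs), current
```

Returning "N/A" rather than raising keeps a missing or empty file from aborting the rest of a report, which seems to match the CLI behavior the entry describes.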
Galantsev, Dmitrii 955ceac78a Add amdsmi_get_gpu_busy_percent
This is required for GPU busy percent in RDC

Change-Id: Idf2ab72993ecc8227958e6eb47f36fc68c93759f
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-04-14 10:40:13 -05:00
Kanangot Balakrishnan, Bindhiya 9d7964dff5 [SWDEV-516592] Add python interface API for Bad Page Threshold (#141)
- Added python interface APIs for amdsmi_get_gpu_bad_page_threshold()
- Updated the docs and changelog.

---------

Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-04-14 04:19:45 -05:00
Charis Poag 4782528770 [SWDEV-518325/SWDEV-518320/SWDEV-443309] Fix Partition Enumeration
* Changes:
  - Updates to DRM renderD* / card* pathing for partition
  - Now use KFD to discover AMD devices and populate accordingly
    Device MUST have an accessible KFD node (via cgroups)
  - Updated several AMD SMI CLI outputs to handle SYSFS files
    which are not accessible on partition nodes
  - Tests are updated to handle not supported features
  - Added new method to help get card/drm info
    (rsmi_dev_device_identifiers_get) from ROCm SMI
  - Renamed device->get_card_id() & device->get_drm_render_minor()
    These can now be used on internal AMD SMI calls.
  - Removed warnings shown in build

Change-Id: Ice882fd9b97fb625a5bd4ef327f3ceaf247dc570
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-04-12 14:41:38 -05:00
Arif, Maisam d81871ef16 [SWDEV-511234] Added amdsmi_get_gpu_cper_entries & CLI implementation
Added amdsmi_get_gpu_cper_entries() in the python and C APIs

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
Co-authored-by: Saeed, Oosman <Oosman.Saeed@amd.com>
Co-authored-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
2025-04-12 01:54:57 -05:00
Pham, Gabriel e2c371ece4 [SWDEV-524288] Fixed duplication of GPU id in events. (#233)
* Fixed duplication of GPU id in events.

---------

Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-04-04 18:31:08 -05:00
Arif, Maisam 35fbe2cbf1 [SWDEV-521408] Fixed call to amdsmi_get_gpu_virtualization_mode (#230)
Change-Id: I29c86f8982b53cc139004ebc06b26a5d8f430091

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-04-01 16:57:23 -05:00
Yuan, Perry 68e44c7f66 [SWDEV-482949] Add CPU model name querying support (#33)
- Add support to check CPU vendor info which will be called by RDC to
discover CPU information
- Move esmi headers declaration to impl/amd_smi_common.h
- Remove duplicated amdsmi_cpu_util_t

---------

Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Deepak Mewar <deepak.mewar@amd.com>
2025-03-28 21:21:39 -05:00
Galantsev, Dmitrii 4a3c70136f Make amdsmi_get_power_info backwards compatible
Change-Id: Ie5b4c35265827e78934caa94c142d31efce597e4
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-03-19 23:23:48 -05:00
Castillo, Juan 7c882b2f69 SWDEV-518209: GPU Metrics 1.8 (#177)
- Updates:
    - Adding the following metrics to allow new calculations for violation status:
        - Per XCP metrics gfx_below_host_limit_ppt_acc
        - Per XCP metrics gfx_below_host_limit_thm_acc
        - Per XCP metrics gfx_low_utilization_acc
        - Per XCP metrics gfx_below_host_limit_total_acc
    - Increasing available JPEG engines to 40. Current ASICs may not support all 40. These will be indicated as UINT16_MAX or N/A in CLI.

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>
2025-03-19 10:24:02 -05:00
Poag, Charis 48cb5529d2 [SWDEV-493274/SWDEV-514998] Add AMD SMI partition tests + Add Guest amd-smi static --partition (#127)
* [SWDEV-493274/SWDEV-514998] Add AMD SMI partition tests + Add Guest amd-smi static --partition

Changes:
    - Added amd-smi static --partition for guest systems
    - Added C++ tests for memory and compute (accelerator) partitions
    - Added Python tests for amdsmi_get_gpu_vram_info(),
       amdsmi_get_gpu_accelerator_partition_profile_config()
    - Updated Python tests for
      amdsmi_get_gpu_accelerator_partition_profile()
      Now includes more profile and resource detail
    - Added amdsmi_get_gpu_xcd_counter();
      Tests provided for both C++/Python APIs
    - Added AmdSmiVramType & AmdSmiVramVendor: they were missing,
      and Python testing required adding them.

Change-Id: Ib6549d8ccc5fb68726f38745b87c78f890186022
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-03-11 16:38:46 -05:00
AL Musaffar, Yazen a6c8bab856 [SWDEV-491051] Fixed drm_card reference in python interface
Update amdsmi_interface.py

Typo at line 1744: 
was: "drm_card": _validate_if_max_uint(enumeration_info.drm_render, MaxUIntegerTypes.UINT32_T), 
changed to: "drm_card": _validate_if_max_uint(enumeration_info.drm_card, MaxUIntegerTypes.UINT32_T) 

changed from drm_render to drm_card

Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
2025-03-08 16:08:48 -06:00
Arif, Maisam 0e67568902 [SWDEV-501958] Doc Update deprecating pasid in 7.0 (#166)
Change-Id: Ie19ba271c901d0be324143474871241272166124

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I024f7e2b5e7a5fcd6e1d12181d21ffacfe29c00f
2025-03-07 14:56:46 -06:00
AL Musaffar, Yazen 2936e00fed [SWDEV-453922] AMD SMI to provide mapping feature of other enumeration methods (#51)
Added enumeration mapping for 
- drm render
- drm card
- hsa id 
- hip id
- hip uuid (rocminfo uuid)

Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-03-07 09:09:12 -06:00
Narlo, Joseph d7c3ad0886 [SWDEV-515031] Change Header Version to 25.2.0 (#109)
Change Versioning Scheme to match https://semver.org/
Dropping the year enum and API fields in a future release.
Should not impact library versioning since we are now starting from 25.2.0
---------

Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>
Change-Id: Id090e23f156926d08f9c0b781447388adf268cf6
2025-02-26 19:17:09 -06:00
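
As a rough illustration of the MAJOR.MINOR.PATCH scheme the entry above refers to, version strings such as 25.2.0 can be compared numerically once split into integers. `parse_semver` is a hypothetical helper, not something from the AMD SMI headers, and the sketch handles only the three-part core version, not semver pre-release or build metadata.

```python
def parse_semver(version):
    """Split a 'MAJOR.MINOR.PATCH' string into a tuple of three ints."""
    parts = version.split(".")
    if len(parts) != 3 or not all(part.isdigit() for part in parts):
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {version!r}")
    return tuple(int(part) for part in parts)


def is_at_least(version, minimum):
    """True if `version` is the same as or newer than `minimum`."""
    # Tuples compare element by element, so 25.10.0 sorts after 25.2.0,
    # which a plain string comparison would get wrong.
    return parse_semver(version) >= parse_semver(minimum)
```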
Arif, Maisam 52b3ee2dc6 [SWDEV-503520] Add amdsmi_get_rocm_version() in python library (#76)
Changed amdsmi_get_rocm_version() to be an API in the python library only. 
Updated usage and version detection
Updated path detection of librocm-core.so
Updated docs to reflect that neither amdsmi_get_rocm_version() nor amdsmi_get_lib_version() requires initialization.

Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-02-26 05:45:58 -06:00
Narlo, Joseph af31967b6d Merge branch 'amd-staging' into SWDEV-517156/Synchronize_Unified_and_Amdsmi_Headers 2025-02-24 10:14:49 -06:00
Arif, Maisam 9d2bbcf14d Updated amdsmi_get_driver_info() to handle empty strings (#126)
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-02-21 19:15:18 -06:00
Joseph Narlo 695e619913 Merge branch 'SWDEV-517156/Synchronize_Unified_and_Amdsmi_Headers' of github.com:AMD-ROCm-Internal/amdsmi into SWDEV-517156/Synchronize_Unified_and_Amdsmi_Headers 2025-02-21 15:11:46 -06:00
Joseph Narlo b38c9aa1cc [SWDEV-517156] Synchronize Unified and Amdsmi Headers
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
2025-02-21 15:05:57 -06:00
Joseph Narlo 499ba8acb0 [SWDEV-517156] Synchronize Unified and Amdsmi Headers
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
2025-02-21 13:52:30 -06:00
Galantsev, Dmitrii 30a6cb02f2 [SWDEV-513769] Search standard locations for libamd_smi.so
Change-Id: I2c364b7dfa6ffa91e5b1c837c2d3d14ef58ee66b
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-02-19 01:34:03 +00:00
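
A search over standard locations, as the entry above describes, might look roughly like this. The directory list is an assumption made for illustration (the commit message does not say which directories are searched), and `find_library` is a hypothetical helper name.

```python
from pathlib import Path

# Illustrative defaults only; the commit does not list the actual directories.
DEFAULT_SEARCH_DIRS = ("/opt/rocm/lib", "/usr/lib", "/usr/lib64", "/usr/local/lib")


def find_library(name="libamd_smi.so", search_dirs=DEFAULT_SEARCH_DIRS):
    """Return the first existing path to `name` in `search_dirs`, else None."""
    for directory in search_dirs:
        candidate = Path(directory) / name
        if candidate.is_file():
            return candidate
    return None
```

The result can then be handed to a loader such as `ctypes.CDLL(str(path))`. Taking the directories as a parameter keeps the lookup order explicit and easy to test.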
Mewar, Deepak 2c591ffcc1 [SWDEV-499995] ESMI Build/Compiler warnings messages (#105)
* [SWDEV-499995] ESMI Build/Compiler warnings messages

Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
2025-02-18 16:20:28 -06:00
Poag, Charis 4cab72d692 [SWDEV-515298] Fix 'amdsmi_nps_flags_t' refactor (#110)
Earlier commit changing union object name needed to update Python/Rust
references.

Change-Id: I4097dbf3b114abe91d6ca1dfd8aae4ce39fea619
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-02-13 13:52:34 -06:00
Narlo, Joseph dc4a16da6f [SWDEV-513651] Sync Unified And Linux Header (#98)
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
2025-02-06 22:25:50 -06:00
Narlo, Joseph 8e454950ef [SWDEV-509782] Add tags and redefine groups (#73)
Add tags and redefine groups in amd-smi header

Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
2025-02-05 18:43:55 -06:00
Pham, Gabriel e663bed7d6 [SWDEV-462952] Updated passthrough to use virtualization mode struct
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-01-31 17:34:01 -06:00
Galantsev, Dmitrii 6dcb9322f9 update_wrapper.sh - Fix docker
Change-Id: Icb0d80dacfe17222b32bf5616765abc08cafd085

Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2025-01-31 14:55:36 -06:00
Pham, Gabriel 0f79efac78 [SWDEV-462952] Options enabled for GPU passthrough scenarios
Added Dynamic Passthrough detection

Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-01-30 18:12:03 -06:00
Ramalingam, Muthusamy ced110dbb6 amdsmi: Adding Support to get hsmp Driver version
* amdsmi: Adding Support to get hsmp Driver version

Adding Support to fetch hsmp driver version from ESmi Interfaces.
Adding Support to fetch memory bandwidth per socket.

Signed-off-by: muthusamy <muthusamy.ramalingam@amd.com>
2025-01-29 13:45:02 -06:00
Williams, Justin 21841f44a5 [DCSM-524] ESMI build fix (#72)
Fix amd_hsmp failure to copy new version

Signed-off-by: Justin Williams <Justin.Williams@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-01-29 13:39:19 -06:00
Maisam Arif 803b18fe95 Dropped count from amdsmi_get_link_topology_nearest() python API
The count field was neither Pythonic nor needed

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I212f43dc11f2f2c7eddd39900e6e3aaec03f3f8f
2025-01-22 19:07:01 -06:00
Kanangot Balakrishnan, Bindhiya 6fa991c39c [SWDEV-481004] Fix for incorrect gfx_version number (#52)
The target_graphics_version was not formatted properly and was
showing an incorrect Target Name. Corrected this by formatting the
major, minor and revision numbers.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-01-21 15:42:05 -06:00
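
One way to build a target name from major, minor and revision numbers is sketched below. It rests on two assumptions that the commit message does not state: that the packed value follows the KFD-style encoding major * 10000 + minor * 100 + revision, and that the name uses a decimal major followed by hexadecimal minor and revision digits (as in gfx90a). Both function names are hypothetical.

```python
def split_gfx_target_version(value):
    """Split a packed version (assumed major*10000 + minor*100 + revision)."""
    major = value // 10000
    minor = (value // 100) % 100
    revision = value % 100
    return major, minor, revision


def format_gfx_target(major, minor, revision):
    """Build a target name such as 'gfx90a' from its three components."""
    # Assumption: minor and revision are rendered as lowercase hex digits.
    return f"gfx{major}{minor:x}{revision:x}"
```

Under these assumptions, a packed value of 90010 splits into (9, 0, 10) and formats as gfx90a.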
Castillo, Juan 9cc5c303a2 [SWDEV-508173] [AMDSMI] Python API missing function errors (#46)
* [SWDEV-508173] Updates include:
- Updating py-interface to import amdsmi_get_gpu_reg_table_info and amdsmi_get_gpu_pm_metrics_info.
- Updating the ctypes from byref to pointer.

Signed-off-by: Castillo, Juan <Juan.Castillo@amd.com>
2025-01-21 14:11:41 -06:00
Poag, Charis c1cd2b46ef [SWDEV-488276] Add partition 2.0 functionality (#44)
Changes:
* CLI:
  - Updated amd-smi partition
  - Updated amd-smi partition -c
  - Updated amd-smi partition -m
  - Updated amd-smi partition -a
  - Updated amd-smi set -M <NPS1/NPS2/NPS4/NPS8>
  - Updated amd-smi set -C <SPX/DPX/QPX/TPX/CPX>
  - Updated amd-smi set -C <ACCELERATOR_TYPE> or <PROFILE_INDEX>
    Where PROFILE_INDEX = available ACCELERATOR_TYPES
  - Updated amd-smi set --help, now includes more detail for
    amd-smi set -C <ACCELERATOR_TYPE> or <PROFILE_INDEX>

* API:
  - Added amdsmi_get_gpu_memory_partition_config
  - Added amdsmi_set_gpu_memory_partition_mode
  - Added amdsmi_get_gpu_accelerator_partition_profile_config
  - Updated amdsmi_get_gpu_accelerator_partition_profile_config
  - Added amdsmi_set_gpu_accelerator_partition_profile

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-01-16 00:53:46 -06:00
Scaffidi, Salvatore 3793be7735 [SWDEV-463406] Update API with fields for gfx_clock_below_host_limit and low_utilization violations
Updated API with fields for gfx_clock_below_host_limit and low_utilization violations
Change-Id: I25647bae6e7b785f44dab024272767658688bcad

---------
Signed-off-by: Scaffidi, Salvatore <Salvatore.Scaffidi@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>
2025-01-08 22:07:23 -06:00
Arif, Maisam 490132748f Corrected spacing and simplified logic
Change-Id: I51c98339367d1cb9470a00ee05463ac8662d6b01

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-01-08 20:18:24 -06:00
Maisam Arif 8ca2c6e247 Deprecated amdsmi_get_energy_count() power field
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I1b5fe8e278b797458e57dff689e692347901bbfd
2025-01-07 12:45:55 -06:00
Charis Poag 3226a1d0ea [SWDEV-484382] Fix VCLK/DCLK outputs for monitor, static, metric
Units were off and VCLK/DCLK outputs were not coming in
properly through amdsmi_get_clk_freq()

Now we match units sent back through rsmi_dev_gpu_clk_freq_get (MHz).

CLI now shows a maximum of 2 VCLK/DCLKs; otherwise it shows N/A if
no current_freq is listed.

Change-Id: I8a7b66cbb5263e8d396f8568c104e1ce3512923d
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-12-20 14:11:08 -06:00
Juan Castillo f8b8347627 [SWDEV-496693] GPU Metrics 1.7
Features added:
- [SWDEV-475244] Add new interface to get max memory bandwidth
Updated API: amdsmi_get_gpu_vram_info
Updated: struct amdsmi_vram_info_t to include vram_max_bandwidth
CLI: amd-smi static --vram

- [SWDEV-488349] Add new interface for XGMI link status
New API: amdsmi_get_gpu_xgmi_link_status
CLI: amd-smi xgmi --link-status

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Change-Id: I1aa35b741136eb4f02f7ea9a95b865886273eb72
2024-12-18 10:57:06 -06:00
gabrpham 5f9c2db6f3 [SWDEV-484382] Added new command amd-smi set -c/--clk-level
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: If45152e3a3c94f65b6a8a960601b9ed16fa3d0d7
2024-12-13 00:32:19 -05:00