390 Commits

Author SHA1 Message Date
Maisam Arif dfbd0ab8ba Update spacing in amdsmi.h
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I6147b8e545fdb50f3d3ef37f4df994e7cd9c3046
2024-09-23 22:53:13 -05:00
gabrpham 8bc4abc88b Corrected partition changes in header and wrapper
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Iafd7de8f08924873da841ee6eca62100a17b2b6c
2024-09-20 17:01:55 -05:00
Maisam Arif 6a76f8a705 Bump Version to 24.6.5.0
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I93d6d397bd8d647f472017c28101dabe9ff8199b
2024-09-20 02:53:45 -05:00
gabrpham c9a489d437 Moved partition_id from static --asic-info to static --partition.
partition_id also removed from the `amdsmi_asic_info_t` struct and
supporting API has been added for querying partition information.

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Id5a6291a77d11bb97a1c7a200fc465898e86e081
2024-09-20 03:48:42 -04:00
Maisam Arif 3b7f661e71 Moved KFD information to separate structure and API
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: If6eaea589edc704cf408d6391b5f2154134035e7
2024-09-20 03:48:42 -04:00
Maisam Arif 105db1afcd Updated License Dates
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I8ca199c129c06508bc3e23745ab5ac2d20dce928
2024-09-16 16:14:47 -04:00
Maisam Arif 397d8d9339 Bump Version to 24.6.4.0
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I75b6039c221ecea1e36a451d93bb52b5406bd106
2024-09-11 17:36:07 -04:00
Charis Poag a33e4c9e14 [SWDEV-483526] Fix MI3x partitions not showing all logical nodes
Changes:
- Updates to amdsmi_asic_info_t structure to include:
  target_graphics_version, kfd_id, node_id, partition_id
- Updates to amd-smi static --asic to display new
  amdsmi_asic_info_t fields
- Updates to gpu enumeration during amdsmi_init()
  to discover all logical GPUs when in a non-SPX mode
  (ex. DPX, TPX, QPX, or CPX)
- Updates to amdsmi_get_gpu_bdf_id(..) to include
  partition_id details, carried either in the BDF or in optional bits.
    - bits [63:32] = Domain
    - bits [31:28] or bits [2:0] = Partition id
    - bits [27:16] = Reserved
    - bits [15:8]  = Bus
    - bits [7:3]   = Device
    - bits [2:0]   = Function (partition id may be in bits [2:0]) <-- Fallback for non-SPX modes

- C++/Python tests updated to reflect these outputs
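
The bit layout listed above can be unpacked with plain shifts and masks. The following is a minimal sketch, assuming the 64-bit value is already available as an integer; `DecodedBdf` and `decode_bdf_id` are hypothetical names used for illustration and are not part of the amdsmi API.

```python
from typing import NamedTuple


class DecodedBdf(NamedTuple):
    """Fields of a 64-bit BDF id (hypothetical helper type, not an amdsmi type)."""
    domain: int
    partition_id: int
    bus: int
    device: int
    function: int


def decode_bdf_id(bdf_id: int) -> DecodedBdf:
    """Split a 64-bit BDF id according to the layout in the commit message.

    Assumption: the partition id is read from the optional bits [31:28].
    In the fallback case the commit describes, it may instead be carried
    in the function bits [2:0]; this sketch returns both fields and leaves
    that decision to the caller.
    """
    return DecodedBdf(
        domain=(bdf_id >> 32) & 0xFFFFFFFF,   # bits [63:32]
        partition_id=(bdf_id >> 28) & 0xF,    # bits [31:28]
        bus=(bdf_id >> 8) & 0xFF,             # bits [15:8]
        device=(bdf_id >> 3) & 0x1F,          # bits [7:3]
        function=bdf_id & 0x7,                # bits [2:0]
    )


# Illustrative value only: domain 1, partition 2, bus 0x45, device 3, function 1.
example = (1 << 32) | (2 << 28) | (0x45 << 8) | (3 << 3) | 1
decoded = decode_bdf_id(example)
```

Bits [27:16] are reserved and are deliberately ignored here.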

Change-Id: I4be0ea35bb98f3109ae2ca9e82f6b21baa38de29
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-09-11 16:35:17 -05:00
Tim Huang 260edaa752 [SWDEV-463402] - Support retrieving connection type and P2P capabilities between two GPUs
1. Add an API interface amdsmi_topo_get_p2p_status to retrieve
connection type and P2P capabilities between 2 GPUs.

2. Add a p2p status test in hw_topology_read
to print P2P capability information.

3. Add the following tables for the CLI topology subcommands:
  - CACHE COHERANCY TABLE
  - ATOMICS TABLE
  - DMA TABLE
  - BI-DIRECTIONAL TABLE

Change-Id: I199173030d4170115cea27c472958a4826e4e1bf
Signed-off-by: Tim Huang <tim.huang@amd.com>
2024-09-06 09:42:34 -04:00
Maisam Arif 97c487372f Clean up unused files & Update License info
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I5b58e8fe3d9eeac207b07ce0fe4134dd717dbd90
2024-09-05 09:52:48 -04:00
gabrpham 7d8e54d0e1 [SWDEV-450553] Added gpu memory overdrive to metric function
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: If7bd6865d641a5a83c594a4d3c57938b1b6dc18e
2024-09-04 12:54:14 -04:00
gabrpham 95ca2b83a1 Changed power parameter in amdsmi_get_energy_count() to energy_accumulator
Issue linked here: https://github.com/ROCm/amdsmi/issues/38

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I622236eb3f0144aefeb6c82d2713b4822bfeeb11
2024-09-04 09:38:08 -04:00
Oliveira, Daniel b05849dad0 SWDEV-463401: amdsmi_get_gpu_asic_info() adds num_of_compute_units
The number of compute units `amdgpu_gpu_info.num_of_compute_units` is exposed through amdsmi_get_gpu_asic_info().

Code changes related to the following:
  * API
  * CLI
  * Unit tests
  * Examples

Change-Id: Ibeb612d079ed87437a0e56124b8504098fc2dcfd
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2024-08-28 10:15:07 -04:00
Oliveira, Daniel 893f13ab98 SWDEV-463399: amdsmi_get_gpu_vram_info() adds bit-width
Driver info `amdgpu_gpu_info.vram_bit_width` is exposed through amdsmi_get_gpu_vram_info().

Code changes related to the following:
  * API
  * CLI
  * Unit tests
  * Examples

Change-Id: I8abd8db7a603078b2b1c008b2685cecf35caf3d2
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2024-08-27 18:22:50 -04:00
Oliveira, Daniel af3670d758 SWDEV-463372: amdsmi_get_utilization_count() adds decoder_activity
GPU Metrics info `gpu_metrics.vcn_activity` is exposed through amdsmi_get_utilization_count().

Code changes related to the following:
  * API
  * CLI
  * Unit tests

Change-Id: I831b2a81bdc0e090a6698dcb689d10f91ed87dd9
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2024-08-27 16:58:34 -05:00
Tom St Denis f4506cfd65 Add amdsmi_get_gpu_pm_metrics_info and amdsmi_get_gpu_reg_table_info to py-interface (v3)
v2: drop depend on libc
v3: whitespace

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Change-Id: I2eff7aa9d4f0ca8635796f82b106ac0d36176346
2024-08-21 08:38:14 -04:00
Bill(Shuzhou) Liu 97e70d44cf Set soft min or max clock
Add an API to set the soft min or max clock.

Change-Id: Ia34381a721ef3c3d894d5a89d25afa757be46a79
2024-08-20 13:22:32 -04:00
Maisam Arif 40112f5b17 Bump Version to 24.6.3.0
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I902da5e5e9e7441002420afaaef01ca9c6c9666f
2024-08-08 01:30:51 -05:00
Maisam Arif 548938389d Bump Version to 24.6.2.0
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ic389b6783514e88c43958ff5d3413a4c4a8a884f
2024-07-10 19:15:17 -05:00
Oliveira, Daniel a20db864b8 fix: [SWDEV-466302] [rocm/amd_smi_lib]
Fixes `amdsmi_get_gpu_process_list` now requires sudo to access pid and memory information

Code changes related to the following:
  * amdsmi_get_gpu_process_list()
  * CLI

Change-Id: I72b154c220276b354c350fcc067c9a7c32e6c173
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2024-06-24 00:38:17 -04:00
muthusamy 057d688b55 amd-smi [CPU]: Added Support to get number of threads per core
Change-Id: I7e6500f3f53068a3483b64a54d78ac9e1d9cd183
2024-06-21 17:22:55 -04:00
Maisam Arif a3497702cb Bump Version to 24.6.1.0 and Update Changelog
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I88b8ad1341d7f1a2e14517db82979bb6b28575e8
2024-06-18 23:54:26 -05:00
Bill(Shuzhou) Liu e3c63628e5 Change the clean shader API to clean local data
To align with the unified API.

Change-Id: I2819339fba6f528204cebd3e9605109e82cbc5b4
2024-06-17 16:23:33 -05:00
Dalibor Stanisavljevic 80043adb81 Changed type to uint32_t oam_id due to header unification
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>
Change-Id: I351415f4a766ad6aa0c2e81adf8b416d066048ea
2024-06-13 05:12:55 -04:00
Bill(Shuzhou) Liu 4cf59c4edb Change the name of clear sram to run cleaner shader
The function cleans the local data in LDS/GPRs. The name "clear sram"
is misleading.

Change-Id: I0385e6d6348602fe0f347d17e48ed8983f7ceb87
2024-06-05 12:07:39 -05:00
Maisam Arif 68d8c1ab46 Bump Version to 24.6.0.0 and Update Changelog for 6.2.0
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I7f20094514cbfa32e40a6e4da36785d94839768c
2024-05-31 03:05:42 -05:00
Maisam Arif e5d1ba4621 Use different sysfs for soc_pstate and xgmi_plpd
The sysfs is changed to use the pm_policy folder with multiple
dpm_policy files.

Change-Id: I40fac8de2d0cb127950d238b8196f6d2416778d0
2024-05-31 01:38:41 -04:00
Dalibor Stanisavljevic 458dc8f180 SWDEV-457337 - Header alignment
Missing AMDSMI_STATUS prefix

Change-Id: I15d050a146c92f6897d48317d8fec51d046535d1
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>
2024-05-30 15:35:38 -04:00
Dalibor Stanisavljevic 7b2463abe0 SWDEV-457337 - Fix header alignment
Change-Id: I9f25f6c4f0d00c76b66d13162f30be11368f5b59
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>
2024-05-23 04:41:57 -04:00
Maisam Arif 721e3ed3ea Bump Version to 24.5.3.0
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I0d1ecddd650320287446a06cd8ce680c52a89342
2024-05-15 04:28:27 -04:00
Maisam Arif 7d999aa34c SWDEV-458102 - Updates to pp_od_clk_voltage parsing
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I650dae1a99856dcde914fe66917cf9111f3ce0e2
2024-05-15 03:18:24 -05:00
Charis Poag 4295bba37f [SWDEV-451104] Update static --board + amdsmi_get_gpu_board_info()
Updates:
    * Expanded the size of the amdsmi_board_info_t structure used by
      `amdsmi_get_gpu_board_info()`. The updated sizes work for retrieving
      relevant board information across AMD's ASIC products.
    * Fixed `amdsmi_get_gpu_board_info()` to no longer return junk char strings

Change-Id: Ie1553c6109d678d283d82c24e9284f8e19cd6ccc
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-05-13 23:05:32 -05:00
Maisam Arif 52843152a5 SWDEV-444567 - Added Ring Hang Event
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I2e73ba08ee0004f6f30660b2fa425ea94bafceca
2024-05-03 17:21:28 -04:00
Maisam Arif 11c72946eb Revert "SWDEV-458102 - Deprecated Voltage Curve API"
This reverts commit 1423fb632e.

Change-Id: I8a3eaf0a9f28200e09fb35d5260fbc070fe8a4a9
2024-05-02 15:27:16 -05:00
Charis Poag c24d66740e SWDEV-450580 - Fix powercap set
Updates:
     * CLI - Added AMDSMIHelpers.convert_SI_unit() to help
       conversion of units
     * API - Reverted to uW for power cap limits
     * CLI - amd-smi static --limit now includes MIN_POWER
     * Tests now all use uW units so that conversion to W
       happens only in the CLI
     * Python API now uses uW, the same units seen
       in the amdgpu driver
     * CLI - amd-smi metric --power:
       Fixed power seen on gpu_metrics v1.3
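
The unit split described above (uW in the API and tests, W only at the CLI display layer) can be sketched as follows. This is a minimal illustration, not the actual `AMDSMIHelpers.convert_SI_unit()` implementation, whose signature is not shown in this log; `microwatts_to_watts` and `watts_to_microwatts` are hypothetical helper names.

```python
MICROWATTS_PER_WATT = 1_000_000


def microwatts_to_watts(power_uw: int) -> float:
    """Convert a power value in microwatts (API units) to watts (display units)."""
    return power_uw / MICROWATTS_PER_WATT


def watts_to_microwatts(power_w: float) -> int:
    """Convert a user-supplied value in watts back to integer microwatts."""
    return round(power_w * MICROWATTS_PER_WATT)


# Illustrative values only: a 750 W cap as stored in uW, and a 300 W request.
cap_for_display = microwatts_to_watts(750_000_000)
cap_for_api = watts_to_microwatts(300)
```

Keeping the conversion in one place at the display boundary means every other layer can compare and pass values in a single unit.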

Change-Id: I32d9ba78d0d8806772f0860f9a803a885b3f316a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-05-02 10:13:39 -05:00
Maisam Arif 051d5a4d42 Bump Version to 24.5.2.0
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I2f51ed93a356e55156983c56bac293a5d7d3b5c1
2024-05-02 02:53:48 -04:00
Maisam Arif 1423fb632e SWDEV-458102 - Deprecated Voltage Curve API
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I111c3ce26d2ab66d5e755432f4b8a9bfa631f805
2024-05-02 02:53:29 -04:00
Bill(Shuzhou) Liu 7d2ab7970d Process isolation and clean shader
A few APIs and command line options are added to support process
isolation and clean shader.

Change-Id: I98ad3fc9fc7429799a21798b7fca1c307de7f403
2024-04-24 13:22:20 -04:00
Maisam Arif 1bd18c1a65 Added new ecc blocks and adjusted metric --ecc-block filtering
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ib2f69c7d59ee5108024794434fb202b5e4f58738
2024-04-18 15:01:41 -04:00
Maisam Arif 092908daee Bump Version to 24.5.1.0
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I842e223b78f337a39098f652fa6e7ef51948fbaf
2024-04-05 02:31:08 -05:00
Oliveira, Daniel 08e2e21bab fix: [SWDEV-442525] [rocm/amd_smi_lib]
Fixes gpu_process_list

Code changes related to the following:
  * amdsmi_get_gpu_process_list()
  * CLI
  * Examples
  * Unit tests
  * Changelog
  * Readme
  * rocm_smi_lib commit: 677433b367

Change-Id: I9210fbca7a5da92d0a8b472b72ca82597c8e4fb5
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2024-03-27 16:48:24 -05:00
Maisam Arif 51b3f8cccb SWDEV-452739 - Add CEM slot type to amd-smi
Updated CHANGELOG.md and re-added spaces after bolded lines

Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ic728b3e9b083c62fe4c9791b8ede991f5dacc1ca
2024-03-27 02:01:25 -04:00
Bill(Shuzhou) Liu e4085c6414 Get and set the XGMI PLPD
Update the API and CLI to support XGMI Per-Link Power Down Policy.

Change-Id: Iaf04a771eb8bb0829a5b3088d803a7355a8dfd0b
2024-03-26 01:48:14 -05:00
Oliveira, Daniel 1310c767ce fix: [SWDEV-448201] [rocm/amd_smi_lib]
Adds PCIe errors

Code changes related to the following:
  * amdsmi_get_pcie_info()
  * CLI
  * examples

Change-Id: Ie0b7053e77c88fb18309c16e74bce75d862c45a9
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2024-03-24 23:33:32 -04:00
Bill(Shuzhou) Liu 108e6d4ae6 Set and get DPM policy for GPU device
Add new APIs to set and get dpm policy for the GPU device.

Change-Id: I26fa49cd17d0ce66bda3446c38945a6cf35717ff
2024-03-12 10:32:31 -04:00
Bill(Shuzhou) Liu c489cb8f3f Add support for deferred RAS errors in API
The API will support the deferred errors

Change-Id: I221a146f09fefde1fc31e5f746d0870e07c93561
2024-03-04 22:46:44 -05:00
Maisam Arif 69caba8727 Bump Version to 24.4.0.0 & Corrected argument checks for set subcommand
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I651f8ca652c764f30845503dd869f435f728d5ba
2024-02-23 20:47:19 -06:00
Bill(Shuzhou) Liu db33cda0c1 Unify the amdsmi_get_pcie_info python interface
Make the python interface consistent with the C interface.

Change-Id: Idda08f888947c757e475d5a024b0ec3d8e1d846a
2024-02-22 03:33:59 -05:00
Maisam Arif f58613561c Refactor ESMI Initialization and Argument Parsing
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Iefab3a8110e0d3c525ee0cef1bdef9101550e9de
2024-02-21 19:02:14 -05:00
Deepak Mewar 84608807da Fix for multiple hsmp freq sources not reported on some setups
Change-Id: I8afe7076bd7790cf408ef104c50ac8d258b7d3fc
Signed-off-by: Maisam Arif <maisarif@amd.com>
2024-02-21 06:30:03 -06:00