614 commits

Author SHA1 Message Date
Joe Narlo bad2cc9c23 SWDEV-495787 [AMDSMI] Different license headers
Change copyrights to MIT and remove date

Signed-off-by: Joe Narlo <Joseph.Narlo@amd.com>
Change-Id: I16f5b412f2b9ddefaaa1771aa714cc18829a1be4


[ROCm/amdsmi commit: 3052ad4220]
2024-11-22 08:55:28 -05:00
Charis Poag f01eea6077 [SWDEV-488276/SWDEV-497613] Update memory partition set functionality
Changes:
  - [CLI] Added warning screen to AMD SMI users
    setting memory partition
  - [CLI] Added a progress bar (time bar) to the CLI set display, running up to 40 seconds
  - [API] Updated to wait until the driver reloads with SYSFS files active
  - [CLI] Now users can set or reset without providing:
    amd-smi set -g all <set arguments>
    or amd-smi reset -g all <set arguments>
    now can directly call -> sudo amd-smi set <set arguments>
    or sudo amd-smi reset <set arguments>
  - [SWDEV-475712][CLI/API] Fixed target_graphics_version field
    not properly displaying for older MI or Navi ASICs.
  - [All APIs] Added a catch for invalid arguments reported by the driver;
    these APIs now return AMDSMI_STATUS_INVAL
    (ex. changing to NPS8 if the device does not support it)
  - [Install] Modified paths for Python install commands to support
    multi-ROCm installs

Change-Id: Id11f25d68a82d23c6b2d77ccb30b51e860dd0ca7
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 3ea4a42a6e]
2024-11-12 16:50:32 -04:00
Maisam Arif 2d760697b3 [SWDEV-492031] Update Market Names
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I13c2047fd8c7af0dc566f88a3cac8b365697a092


[ROCm/amdsmi commit: 2678e1f3f7]
2024-11-05 17:52:02 -04:00
Charis Poag 6e0b0792ab [SWDEV-463406] Update sample rate + align metric output
Changes:
- Corrected the maximum rate at which users can sample from
  FW/driver: 100 ms
- Added a warning to amdsmi_get_violation_status() that a
  100 ms delay is required to sample
- Removed guest support; this API will not be supported on guests
- Updated CLI `amd-smi metric --throttle` outputs from
    XXX_active -> XXX_status
    XXX_percent -> XXX_activity
  to align with host
- Changelog updated
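
Scripts that parse the `amd-smi metric --throttle` output may need to follow the field renames listed above. A minimal sketch of such a migration helper is shown here. The function name `rename_throttle_keys` and the sample key names are hypothetical; only the suffix changes (`_active` to `_status`, `_percent` to `_activity`) come from the commit message.

```python
# Hypothetical helper: maps pre-change throttle field names to the new ones.
# Only the suffix mapping is taken from the commit message.
SUFFIX_MAP = {
    "_active": "_status",
    "_percent": "_activity",
}


def rename_throttle_keys(metrics: dict) -> dict:
    """Return a copy of `metrics` with old suffixes replaced by new ones."""
    renamed = {}
    for key, value in metrics.items():
        new_key = key
        for old_suffix, new_suffix in SUFFIX_MAP.items():
            if key.endswith(old_suffix):
                # Swap only the trailing suffix, keep the prefix intact.
                new_key = key[: -len(old_suffix)] + new_suffix
                break
        renamed[new_key] = value
    return renamed


# Example with placeholder field names (not actual CLI output).
old_output = {"xxx_active": "ACTIVE", "xxx_percent": 12, "timestamp": 1}
new_output = rename_throttle_keys(old_output)
```

Keys without a matching suffix pass through unchanged, so the helper can be applied to a whole output dictionary.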

Change-Id: Ib30dd35dcc04ff67904ca82c86a55a16689df226
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 0ceca28f41]
2024-10-23 17:36:35 -04:00
gabrpham 072e67c9c3 [SWDEV-490187] Reset GPU partition support was removed
Support for resetting both compute and memory GPU partitions was removed

Code changes related to the following:
  * amdsmi_reset_gpu_compute_partition()
  * amdsmi_reset_gpu_memory_partition()
  * CLI

Change-Id: I372589074b4da172bedd39223edde18939e373ae
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: f5b7761ac7]
2024-10-18 16:22:26 -05:00
Khader Basha Shaik 806d1a25c3 amdsmi [CPU]: Add implementation to get cpu handles and core handles API
- Update the API names, parameters to return cpu handles and core
handles in the system.
  - Update the amdsmi_wrapper.py.
  - Update the amdsmi_interface.py to use the processor handles and
    core handles API.

Change-Id: Ie24f62f345864f8b6773fdb3c6369993bca7e25b


[ROCm/amdsmi commit: 8308ede9e8]
2024-10-14 05:41:19 -04:00
Maisam Arif 5e3d644769 Corrected clean local data partition indexing
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ib0eeb065f160fccd3c3f4a2d13f0869af01a74ae


[ROCm/amdsmi commit: 27a48e69d8]
2024-10-10 10:54:45 -05:00
Charis Poag 5278e0c290 [SWDEV-463406] Add violation_status current counter/accumulated values
Changes:
  - amdsmi_violation_status_t now includes current accumulated/counter
   values
  - Tests/wrapper now include added values
  - Removed ASIC references in header for host/bm alignment
  - Fix violation_status->per_hbm_thrm /
    violation_status->active_hbm_thrm
    calculations.

Change-Id: Ic86a7cbad5198a41018f82f6b588b83158d9ba0b
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 5eff39915b]
2024-10-04 15:56:01 -04:00
Maisam Arif d759e8d704 Updated market name
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I71948b185b6ac60610fedf2d48dd9c95c26e5777


[ROCm/amdsmi commit: e402fe7f36]
2024-10-02 14:24:03 -05:00
Charis Poag 7a35c805b0 [SWDEV-422195/SWDEV-440985] GPU metrics 1.6
Changes:
    - Added new GPU metrics:
      1) Violation status' (ex. PVIOL/TVIOL) accumulators
      2) XCP (Graphics Compute Partitions) statistics
      3) pcie other end recovery counter
    - CLI/API/tests changes were made accordingly

Change-Id: I589b9b1f570f25dda12d95bb501feca85da8b3bb
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 3a4abbd8c0]
2024-09-27 12:04:21 -05:00
Lang Yu 94d349573d SWDEV-463405: Add amdsmi_get_link_topology_nearest support
amdsmi_get_link_topology_nearest() is used to retrieve
the set of GPUs that are nearest to a given device
at a specific interconnectivity level.

Code changes related to the following:
    * API
    * CLI
    * Unit tests
    * Examples

Header Unification Change: "/amdsmi/+/1122408"

Change-Id: Id0317797c652c267742513936d321677793ec634
Signed-off-by: Lang Yu <lang.yu@amd.com>


[ROCm/amdsmi commit: 7a557b1c50]
2024-09-26 16:43:27 -05:00
Bill(Shuzhou) Liu c20427e1f0 amdsmi cannot read power cap more than 10 characters
Extend the default read array size.

Change-Id: I2739981873cb3c360661e3ef5f6e70d4f36cb0e8


[ROCm/amdsmi commit: 69109de8d3]
2024-09-24 14:31:40 -04:00
gabrpham 0fd0b46b7f Moved partition_id from static --asic-info to static --partition.
partition_id was also removed from the `amdsmi_asic_info_t` struct, and a
supporting API has been added for querying partition information.

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Id5a6291a77d11bb97a1c7a200fc465898e86e081


[ROCm/amdsmi commit: c9a489d437]
2024-09-20 03:48:42 -04:00
Maisam Arif 82096d7f74 Moved KFD information to separate structure and API
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: If6eaea589edc704cf408d6391b5f2154134035e7


[ROCm/amdsmi commit: 3b7f661e71]
2024-09-20 03:48:42 -04:00
Eisuke Kawashima efed731082 chore: unset executable permission
Change-Id: I06727774f3b1657a7955b172a40d0dfc9c76d6b9
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 1b6ec8df07]
2024-09-16 17:34:39 -04:00
Maisam Arif c2b9cdfd2e Updated License Dates
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I8ca199c129c06508bc3e23745ab5ac2d20dce928


[ROCm/amdsmi commit: 105db1afcd]
2024-09-16 16:14:47 -04:00
Charis Poag df9d5d3ee5 [SWDEV-483526] Fix MI3x partitions not showing all logical nodes
Changes:
- Updates to amdsmi_asic_info_t structure to include:
  target_graphics_version, kfd_id, node_id, partition_id
- Updates to amd-smi static --asic to display new
  amdsmi_asic_info_t fields
- Updates to gpu enumeration during amdsmi_init()
  to discover all logical GPUs when in a non-SPX mode
  (ex. DPX, TPX, QPX, or CPX)
 - Updates to amdsmi_get_gpu_bdf_id(..) to include
   partition_id details when in BDF or optional bits.
     - bits [63:32] = domain
     - bits [31:28] or bits [2:0] = partition id
     - bits [27:16] = reserved
     - bits [15:8]  = Bus
     - bits [7:3] = Device
     - bits [2:0] = Function (partition id may be in bits [2:0]) <-- Fallback for non SPX modes

- C++/Python tests updated to reflect these outputs
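
The bit layout listed above can be unpacked with plain shifts and masks. The following is a minimal sketch based only on that layout, not the library's implementation; the function names `decode_bdf_id` and `encode_bdf_id` are hypothetical. It reads the partition id from bits [31:28] and does not attempt to detect the fallback case where the partition id is carried in the function bits.

```python
def decode_bdf_id(bdf_id: int) -> dict:
    """Split a 64-bit BDF id into the fields named in the commit message."""
    return {
        "domain": (bdf_id >> 32) & 0xFFFFFFFF,   # bits [63:32]
        "partition_id": (bdf_id >> 28) & 0xF,    # bits [31:28]
        # bits [27:16] are reserved and ignored here
        "bus": (bdf_id >> 8) & 0xFF,             # bits [15:8]
        "device": (bdf_id >> 3) & 0x1F,          # bits [7:3]
        "function": bdf_id & 0x7,                # bits [2:0]
    }


def encode_bdf_id(domain: int, partition_id: int, bus: int,
                  device: int, function: int) -> int:
    """Inverse of decode_bdf_id, useful for building test values."""
    return (
        ((domain & 0xFFFFFFFF) << 32)
        | ((partition_id & 0xF) << 28)
        | ((bus & 0xFF) << 8)
        | ((device & 0x1F) << 3)
        | (function & 0x7)
    )


# Round-trip an illustrative value (not taken from real hardware).
example = encode_bdf_id(domain=1, partition_id=2, bus=0x23, device=31, function=5)
fields = decode_bdf_id(example)
```

In the fallback case for non-SPX modes, a caller would presumably treat the `function` field as the partition id instead; the commit message does not say how to tell the two cases apart, so that decision is left to the caller.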

Change-Id: I4be0ea35bb98f3109ae2ca9e82f6b21baa38de29
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: a33e4c9e14]
2024-09-11 16:35:17 -05:00
Tim Huang 202ddc01aa [SWDEV-463402] - Support retrieving connection type and P2P capabilities between two GPUs
1. Add an API interface amdsmi_topo_get_p2p_status to retrieve
connection type and P2P capabilities between 2 GPUs.

2. Add a P2P status test in hw_topology_read
to print P2P capability information.

3. Add below tables for cli topology sub commands:
  - CACHE COHERANCY TABLE
  - ATOMICS TABLE
  - DMA TABLE
  - BI-DIRECTIONAL TABLE

Change-Id: I199173030d4170115cea27c472958a4826e4e1bf
Signed-off-by: Tim Huang <tim.huang@amd.com>


[ROCm/amdsmi commit: 260edaa752]
2024-09-06 09:42:34 -04:00
gabrpham 614c89889c [SWDEV-450553] Added gpu memory overdrive to metric function
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: If7bd6865d641a5a83c594a4d3c57938b1b6dc18e


[ROCm/amdsmi commit: 7d8e54d0e1]
2024-09-04 12:54:14 -04:00
Maisam Arif 37f3f625a0 Update market name device ids
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I10ce84c8466ff30e2486ed3664a9fe1b57d9c9e4


[ROCm/amdsmi commit: ae2c713d67]
2024-09-04 10:33:43 -05:00
Maisam Arif adc9b69f39 Updated cli init functions to not intersect with lib init functions
Added Quick start script to quickly test python APIs
"python3 -i tools/amdsmi_quick_start.py"
Fixed ESMI lib macros

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I55370a0cb79d631f7f2f2b91568f089b503ebfad


[ROCm/amdsmi commit: 1efb5e9910]
2024-09-04 10:23:36 -04:00
gabrpham 62d3348c9e Changed power parameter in amdsmi_get_energy_count() to energy_accumulator
Issue linked here: https://github.com/ROCm/amdsmi/issues/38

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I622236eb3f0144aefeb6c82d2713b4822bfeeb11


[ROCm/amdsmi commit: 95ca2b83a1]
2024-09-04 09:38:08 -04:00
muthusamy 0560419474 [SWDEV-481002] Fix in Update Market Names
Signed-off-by: muthusamy <muthusamy.ramalingam@amd.com>
Change-Id: I16ea6bdd70f7ed847ef56ddf99dfe66d42c7942a


[ROCm/amdsmi commit: 3c954e78fc]
2024-09-03 11:42:24 +00:00
Maisam Arif 801778d976 [SWDEV-481002] Update Market Names
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I23d129712fd7d7a0d9de73511c71a2eeeb3ec183


[ROCm/amdsmi commit: b5424c1c7e]
2024-08-30 13:31:25 -04:00
Charis Poag 31a2921b62 [SWDEV-451960] [WIP] Add Pytest
Updates:
- Added pytest to shared/pytest folder
- User can execute tests:

[pytest]
python3 -m pytest -p no:cacheprovider /opt/rocm/share/amd_smi/tests/pytest/unit_tests.py -s -v
python3 -m pytest -p no:cacheprovider /opt/rocm/share/amd_smi/tests/pytest/integration_test.py -s -v

[unittest]
/opt/rocm/share/amd_smi/tests/pytest/unit_tests.py -v
/opt/rocm/share/amd_smi/tests/pytest/integration_test.py -v

- Automatically installs pytest

Change-Id: Ia3281a9608aeeb803b91f8b83f87ff84b01037f4
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: d9d6637cb7]
2024-08-29 10:09:29 -04:00
Oliveira, Daniel 55b88706e1 SWDEV-463401: amdsmi_get_gpu_asic_info() adds num_of_compute_units
The number of compute units, `amdgpu_gpu_info.num_of_compute_units`, is exposed through amdsmi_get_gpu_asic_info().

Code changes related to the following:
  * API
  * CLI
  * Unit tests
  * Examples

Change-Id: Ibeb612d079ed87437a0e56124b8504098fc2dcfd
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: b05849dad0]
2024-08-28 10:15:07 -04:00
Oliveira, Daniel e2e63055a6 SWDEV-463399: amdsmi_get_gpu_vram_info() adds bit-width
Driver info `amdgpu_gpu_info.vram_bit_width` is exposed through amdsmi_get_gpu_vram_info().

Code changes related to the following:
  * API
  * CLI
  * Unit tests
  * Examples

Change-Id: I8abd8db7a603078b2b1c008b2685cecf35caf3d2
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: 893f13ab98]
2024-08-27 18:22:50 -04:00
Tom St Denis c59f7b7705 Add amdsmi_get_gpu_pm_metrics_info and amdsmi_get_gpu_reg_table_info to py-interface (v3)
v2: drop depend on libc
v3: whitespace

Signed-off-by: Tom St Denis <tom.stdenis@amd.com>
Change-Id: I2eff7aa9d4f0ca8635796f82b106ac0d36176346


[ROCm/amdsmi commit: f4506cfd65]
2024-08-21 08:38:14 -04:00
Bill(Shuzhou) Liu ef78459f75 Set soft min or max clock
Add the API to support set soft min or max clock.

Change-Id: Ia34381a721ef3c3d894d5a89d25afa757be46a79


[ROCm/amdsmi commit: 97e70d44cf]
2024-08-20 13:22:32 -04:00
Oliveira, Daniel 45652b301b fix: [SWDEV-466302] [rocm/amd_smi_lib]
Fixes `amdsmi_get_gpu_process_list` now requires sudo to access pid and memory information

Code changes related to the following:
  * amdsmi_get_gpu_process_list()
  * CLI

Change-Id: I72b154c220276b354c350fcc067c9a7c32e6c173
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: a20db864b8]
2024-06-24 00:38:17 -04:00
muthusamy 7793021ac5 amd-smi [CPU]: Added Support to get number of threads per core
Change-Id: I7e6500f3f53068a3483b64a54d78ac9e1d9cd183


[ROCm/amdsmi commit: 057d688b55]
2024-06-21 17:22:55 -04:00
Bill(Shuzhou) Liu f86ba0a7c4 Change the clean shader API to clean local data
To align with the unified API.

Change-Id: I2819339fba6f528204cebd3e9605109e82cbc5b4


[ROCm/amdsmi commit: e3c63628e5]
2024-06-17 16:23:33 -05:00
Dalibor Stanisavljevic a993473e13 Changed type to uint32_t oam_id due to header unification
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>
Change-Id: I351415f4a766ad6aa0c2e81adf8b416d066048ea


[ROCm/amdsmi commit: 80043adb81]
2024-06-13 05:12:55 -04:00
Maisam Arif 807ca0ad89 SWDEV-466598 - Fixed CLI process outputs
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I902e82b6e78311e99542b109435346889daa49fc


[ROCm/amdsmi commit: 9fb2c967de]
2024-06-08 18:31:08 -05:00
Bill(Shuzhou) Liu b517f3c214 Change the name of clear sram to run cleaner shader
The function cleans the local data in LDS/GPRs. The name "clear sram"
is misleading.

Change-Id: I0385e6d6348602fe0f347d17e48ed8983f7ceb87


[ROCm/amdsmi commit: 4cf59c4edb]
2024-06-05 12:07:39 -05:00
Maisam Arif 7dfe4276cc Use different sysfs for soc_pstate and xgmi_plpd
The sysfs is changed to use the pm_policy folder with multiple
dpm_policy files.

Change-Id: I40fac8de2d0cb127950d238b8196f6d2416778d0


[ROCm/amdsmi commit: e5d1ba4621]
2024-05-31 01:38:41 -04:00
Dalibor Stanisavljevic 5153032416 SWDEV-457337 - Header alignment
Missing AMDSMI_STATUS prefix

Change-Id: I15d050a146c92f6897d48317d8fec51d046535d1
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>


[ROCm/amdsmi commit: 458dc8f180]
2024-05-30 15:35:38 -04:00
Dalibor Stanisavljevic cdd24a7b0f SWDEV-457337 - Fix header alignment
Change-Id: I9f25f6c4f0d00c76b66d13162f30be11368f5b59
Signed-off-by: Dalibor Stanisavljevic <Dalibor.Stanisavljevic@amd.com>


[ROCm/amdsmi commit: 7b2463abe0]
2024-05-23 04:41:57 -04:00
Maisam Arif 914f5b2e3f Make product name empty when unable to find pciid
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: If2300bf2deb4fa099db695949bd4c74393dbbbfc


[ROCm/amdsmi commit: 1cee1baac2]
2024-05-21 02:19:19 -04:00
Charis Poag 312034409e SWDEV-462728 Add update-pciids to install + remove subsystem name
Install now runs update-pciids if there is a network connection.
Removed subsystem name from the output under model. Added a TODO
to add it later on.

Change-Id: I028269f2931f61e094116a85a7a1286de548122a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: c5da93ab90]
2024-05-20 12:03:31 -05:00
Charis Poag aaed28aaab [SWDEV-451104] Update static --board + amdsmi_get_gpu_board_info()
Updates:
    * Expanded `amdsmi_get_gpu_board_info()` amdsmi_board_info_t structure size
      Updated sizes that work for retrieving relevant board
      information across AMD's ASIC products.
    * Fixed `amdsmi_get_gpu_board_info()` to no longer return junk char strings

Change-Id: Ie1553c6109d678d283d82c24e9284f8e19cd6ccc
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 4295bba37f]
2024-05-13 23:05:32 -05:00
Charis Poag f7d3417a7f SWDEV-450580 - Fix powercap set
Updates:
     * CLI - Added AMDSMIHelpers.convert_SI_unit() to help
       conversion of units
     * API - Reverted to uW for power cap limits
     * CLI - amd-smi static --limit now includes MIN_POWER
     * Tests now are all using uW units to keep W conversion
       to only happen in CLI
     * Python API now reflects same units as uW (what is seen
       in amdgpu driver)
     * CLI - amd-smi metric --power:
       Fixed power seen on gpu_metrics v1.3
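
The split described above, where the API works in microwatts (uW) and only the CLI converts to watts, can be illustrated with a small conversion sketch. The helper names below are hypothetical and are not the signature of `AMDSMIHelpers.convert_SI_unit()`, which this log does not show.

```python
MICROWATTS_PER_WATT = 1_000_000


def microwatts_to_watts(power_uw: int) -> float:
    """Convert an API-level power cap in uW to watts for display."""
    return power_uw / MICROWATTS_PER_WATT


def watts_to_microwatts(power_w: float) -> int:
    """Convert a user-supplied power cap in watts to uW for the API."""
    return round(power_w * MICROWATTS_PER_WATT)


# Illustrative values only, not limits of any particular device.
display_value = microwatts_to_watts(300_000_000)
api_value = watts_to_microwatts(250)
```

Keeping the conversion at the display layer means tests and API callers compare values in the same unit the amdgpu driver reports.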

Change-Id: I32d9ba78d0d8806772f0860f9a803a885b3f316a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: c24d66740e]
2024-05-02 10:13:39 -05:00
Bill(Shuzhou) Liu 5b0f4638c4 Process isolation and clean shader
A few APIs and command line options are added to support process
isolation and clean shader.

Change-Id: I98ad3fc9fc7429799a21798b7fca1c307de7f403


[ROCm/amdsmi commit: 7d2ab7970d]
2024-04-24 13:22:20 -04:00
Oliveira, Daniel 9e2b1d8a09 fix: [SWDEV-442525] [rocm/amd_smi_lib]
Fixes gpu_process_list

Code changes related to the following:
  * amdsmi_get_gpu_process_list()
  * CLI
  * Examples
  * Unit tests
  * Changelog
  * Readme
  * rocm_smi_lib commit: 677433b367

Change-Id: I9210fbca7a5da92d0a8b472b72ca82597c8e4fb5
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: 08e2e21bab]
2024-03-27 16:48:24 -05:00
Maisam Arif 144ddec250 SWDEV-452739 - Add CEM slot type to amd-smi
Updated CHANGELOG.md and re-added spaces after bolded lines

Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ic728b3e9b083c62fe4c9791b8ede991f5dacc1ca


[ROCm/amdsmi commit: 51b3f8cccb]
2024-03-27 02:01:25 -04:00
Maisam Arif 4d62fc8bd6 SWDEV-445664 - Aligned metric --clock with Host
Change-Id: Ib4dc372aed61f6301680ac746eccf448e9d0ed00
Signed-off-by: Maisam Arif <maisarif@amd.com>


[ROCm/amdsmi commit: 93b81e5012]
2024-03-26 16:30:31 -04:00
Bill(Shuzhou) Liu b9b958b82c Get and set the XGMI PLPD
Update the API and CLI to support XGMI Per-Link Power Down Policy.

Change-Id: Iaf04a771eb8bb0829a5b3088d803a7355a8dfd0b


[ROCm/amdsmi commit: e4085c6414]
2024-03-26 01:48:14 -05:00
Oliveira, Daniel 51d25d8feb fix: [SWDEV-448201] [rocm/amd_smi_lib]
Adds PCIe errors

Code changes related to the following:
  * amdsmi_get_pcie_info()
  * CLI
  * examples

Change-Id: Ie0b7053e77c88fb18309c16e74bce75d862c45a9
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: 1310c767ce]
2024-03-24 23:33:32 -04:00
Bill(Shuzhou) Liu 46ab68f840 Set and get DPM policy for GPU device
Add new APIs to set and get dpm policy for the GPU device.

Change-Id: I26fa49cd17d0ce66bda3446c38945a6cf35717ff


[ROCm/amdsmi commit: 108e6d4ae6]
2024-03-12 10:32:31 -04:00
Bill(Shuzhou) Liu f0e5bffab3 Add support for deferred RAS errors in API
The API will support the deferred errors

Change-Id: I221a146f09fefde1fc31e5f746d0870e07c93561


[ROCm/amdsmi commit: c489cb8f3f]
2024-03-04 22:46:44 -05:00