Commit graph

1558 commits

Author SHA1 Message Date
Charis Poag 3a4abbd8c0 [SWDEV-422195/SWDEV-440985] GPU metrics 1.6
Changes:
    - Added new GPU metrics:
      1) Violation status (e.g. PVIOL/TVIOL) accumulators
      2) XCP (Graphics Compute Partitions) statistics
      3) PCIe other end recovery counter
    - CLI/API/tests changes were made accordingly

Change-Id: I589b9b1f570f25dda12d95bb501feca85da8b3bb
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-09-27 12:04:21 -05:00
Lang Yu 7a557b1c50 SWDEV-463405: Add amdsmi_get_link_topology_nearest support
amdsmi_get_link_topology_nearest() is used to retrieve
the set of GPUs that are nearest to a given device
at a specific interconnectivity level.

Code changes related to the following:
    * API
    * CLI
    * Unit tests
    * Examples

Header Unification Change: "/amdsmi/+/1122408"

Change-Id: Id0317797c652c267742513936d321677793ec634
Signed-off-by: Lang Yu <lang.yu@amd.com>
2024-09-26 16:43:27 -05:00
Ranjith Ramakrishnan f00a03ed2b Remove package provides field from RPM and DEB package
The provides tag is required when the package provides a virtual package.
Package name along with version will be provided by default and the provides tag is not required for this.

Change-Id: I6d42cd1a6e2247e33708a1fa2627897e86099815
2024-09-26 17:42:49 -04:00
Ryo Ficano 9979be8512 [SWDEV-482963] [Test updates] Add new tests for p0 items - BM v2
Updates:
- Added tests for these API calls:

amdsmi_get_socket_handles
amdsmi_get_processor_type
amdsmi_get_clk_freq
amdsmi_get_gpu_process_info
amdsmi_get_gpu_ras_block_features_enabled
amdsmi_get_gpu_ecc_count
amdsmi_get_gpu_memory_usage
amdsmi_get_gpu_vendor_name
amdsmi_get_utilization_count

- Added amdsmi_init() and amdsmi_shut_down() before and after each test.
- Updated README and removed all pytest references.

Change-Id: Ida0c165a466571b1df36c413161bd95c070f6ff1
Signed-off-by: Ryo Ficano <Ryo.Ficano@amd.com>
2024-09-26 14:08:13 -04:00
Zhang Ava 42418833da Merge amd-dev into amd-master 20240926
Signed-off-by: Zhang Ava <niandong.zhang@amd.com>
Change-Id: I8697f9b43e7da03b05f17efb6e6e26118fd2a139
2024-09-26 18:34:14 +08:00
Justin Williams 807f1e3111 Removed Post Install PyYAML and Pip Upgrades
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
Change-Id: I25f0e8087a212fd29d33a8a40303436279789029
2024-09-25 18:20:23 -05:00
muthusamy e037cde86b amdsmi: Optimize go shim to pick amdsmi by default
Optimize the go shim to pick amdsmi by default, plus other code cleanup in goshim.

Signed-off-by: muthusamy <muthusamy.ramalingam@amd.com>
Change-Id: I0e6a2d28404cbb751d2b6e90c793b359fec9be13
2024-09-25 16:30:02 -04:00
Bill(Shuzhou) Liu 69109de8d3 amdsmi cannot read a power cap longer than 10 characters
Extend the default read array size.

Change-Id: I2739981873cb3c360661e3ef5f6e70d4f36cb0e8
2024-09-24 14:31:40 -04:00
Harkirat Gill 3660724a08 Updated error message when driver modules not loaded
Small change to add sudo modprobe amdgpu/amd_hsmp suggestion if modules are not loaded. Requested per Maisam, will close https://github.com/ROCm/amdsmi/issues/45

Change-Id: Ia7ffcc99df18296c5c682f2082ff8dd8f007d557
2024-09-23 23:56:05 -04:00
Maisam Arif dfbd0ab8ba Update spacing in amdsmi.h
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I6147b8e545fdb50f3d3ef37f4df994e7cd9c3046
2024-09-23 22:53:13 -05:00
Maisam Arif 5fc8dc5eed Merge amd-dev into amd-master 20240920
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I25f83ee9a7329f5bce49ac3c0305e1ebaacf27b9
2024-09-20 17:11:47 -05:00
Maisam Arif 09c9574454 [SWDEV-469278] - Lowered PyYAML dependency
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Icfee09b84cf1071ec82b65fc2877be69e0283489
2024-09-20 18:03:00 -04:00
gabrpham 8bc4abc88b Corrected partition changes in header and wrapper
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Iafd7de8f08924873da841ee6eca62100a17b2b6c
2024-09-20 17:01:55 -05:00
Dmitrii Galantsev 6beec5f3ec Revert "[SWDEV-469278] Lowered PyYAML post install script dependency"
Revert submission 1125402

Reason for revert: Packaging a tar archive of 3rd party sources
Reverted Changes:
I8908451c0:[SWDEV-482058] Updated Packaging for offline insta...
I764c8bf01:[SWDEV-469278] Lowered PyYAML post install script ...

Change-Id: I3886b5370e352fc33a249c4657d7ed0c1ee75baf
2024-09-20 16:42:29 -04:00
Dmitrii Galantsev 9924574cbe Revert "[SWDEV-482058] Updated Packaging for offline installs"
Revert submission 1125402

Reason for revert: Packaging a tar archive of 3rd party sources
Reverted Changes:
I8908451c0:[SWDEV-482058] Updated Packaging for offline insta...
I764c8bf01:[SWDEV-469278] Lowered PyYAML post install script ...

Change-Id: Ib32fa5b9351b1cfc2a8d453e744c0d00209359eb
2024-09-20 16:42:29 -04:00
muthusamy 66c98fd722 amdsmi: Adding GO wrappers for amd_smi_exporter
Adding GO wrappers as part of amdsmi library, so that
amd_smi_exporter can fetch the cpu, gpu data directly from amdsmi library.

Signed-off-by: muthusamy <muthusamy.ramalingam@amd.com>
Change-Id: I8fba57c1d20d21758a1aed38ed2c00c9d5c9ecfa
2024-09-20 04:08:27 -04:00
Maisam Arif b40b405332 [SWDEV-456049] & [SWDEV-442181] Fix early exiting loop while enumerating GPU stats
Skip missing vram_str_path and sdma_str_path when their sysfs files are not created, which happens when passing some, but not all, GPUs to a docker image.

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I83b7a62331672810688a94e4023b0ae740436e6d
2024-09-20 03:01:22 -05:00
Maisam Arif 6a76f8a705 Bump Version to 24.6.5.0
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I93d6d397bd8d647f472017c28101dabe9ff8199b
2024-09-20 02:53:45 -05:00
gabrpham c9a489d437 Moved partition_id from static --asic-info to static --partition.
partition_id also removed from the `amdsmi_asic_info_t` struct and
supporting API has been added for querying partition information.

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Id5a6291a77d11bb97a1c7a200fc465898e86e081
2024-09-20 03:48:42 -04:00
Maisam Arif 3b7f661e71 Moved KFD information to separate structure and API
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: If6eaea589edc704cf408d6391b5f2154134035e7
2024-09-20 03:48:42 -04:00
Maisam Arif 2cfae06560 [SWDEV-482058] Updated Packaging for offline installs
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I8908451c013fc944645b5b5df3104a2ff73e72bd
2024-09-20 00:55:48 -04:00
Justin Williams f2f02aa317 [SWDEV-469278] Lowered PyYAML post install script dependency
Signed-off-by: Justin Williams <Justin.Williams@amd.com>
Change-Id: I764c8bf01e6cb6acb0b3fc1db396707099e5ed12
2024-09-20 00:55:48 -04:00
Charis Poag ede0e6318d Fix python unittest files not installing with the amd-smi-lib-test package
Moving to TESTS_COMPONENT allows files to be placed
within the amd-smi-lib-test package.
Previously, they were put within the amd-smi-lib package,
which will never be triggered for installation with
the latest changes.

Change-Id: Id49dbe69bfc7d5bd1af403c28b946fe1edf64d8e
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-09-18 19:25:48 -05:00
Charis Poag 5c778cadf1 Fix amd-smi CLI calls returning TypeError
$ amd-smi version
TypeError: unsupported operand type(s) for |: 'type' and 'type'

---------------
Python 3 versions lower than 3.10
do not support str | None

Using typing Optional and Union, we can create equivalent logic for
str | None
and
str | list | None

Change-Id: I1f4a7ab67333914b33639dc62652881e1127411e
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-09-18 16:59:12 -05:00
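
The typing change described in 5c778cadf1 can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from the amd-smi CLI; they only show how `Optional` and `Union` express the same annotations as `str | None` and `str | list | None` without failing at definition time on Python versions older than 3.10.

```python
from typing import List, Optional, Union


def format_version(version: Optional[str] = None) -> str:
    """Hypothetical helper: same meaning as `version: str | None`."""
    return version if version is not None else "unknown"


def normalize_gpu_arg(gpu: Union[str, List[str], None] = None) -> List[str]:
    """Hypothetical helper: same meaning as `gpu: str | list | None`."""
    if gpu is None:
        return []
    if isinstance(gpu, str):
        # A single value is wrapped so callers always get a list back.
        return [gpu]
    return list(gpu)
```

Writing the annotations as `str | None` in these signatures would raise the TypeError quoted in the commit message as soon as the module is imported on an older interpreter, because the annotation expression is evaluated when the function is defined.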
Harkirat Gill d263b53797 Fix for GitHub Issue #24: Update Event Stop Behavior
amd-smi event is failing to exit as it waits for all threads to complete before exiting. Each thread has to listen for a maximum of 10 seconds prior to exiting in the current implementation.

Lowered individual listen time for _event_thread allowing for a quicker exit while still capturing all events (Looped until escape sequence detected).

Added logging for escape character, not sure if needed but helps confirm that key press was registered.

Change-Id: I916608754798f966980a558342c7c62693252d7f
2024-09-18 14:54:40 -04:00
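
The idea behind d263b53797 (a short per-iteration listen time inside a loop, so a stop request is noticed quickly while events are still captured) can be sketched as follows. This is not the amd-smi implementation: a `queue.Queue` stands in for the blocking driver call, and every name here is an assumption made for illustration.

```python
import queue
import threading


def listen_for_events(events, stop, collected, poll_seconds=0.05):
    """Poll in short slices so the thread can exit soon after `stop` is set."""
    while not stop.is_set():
        try:
            item = events.get(timeout=poll_seconds)
        except queue.Empty:
            # Nothing arrived in this slice; loop again and re-check `stop`.
            continue
        collected.append(item)
        events.task_done()


def run_demo():
    events = queue.Queue()
    stop = threading.Event()
    collected = []
    worker = threading.Thread(
        target=listen_for_events, args=(events, stop, collected)
    )
    worker.start()
    events.put("event-1")
    events.put("event-2")
    events.join()          # wait until both events were handled
    stop.set()             # stands in for the escape key press
    worker.join(timeout=1.0)
    return collected, worker.is_alive()
```

With a single long blocking wait, the thread would only see the stop request after the full wait elapsed; with short slices, the exit delay is bounded by `poll_seconds`.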
gabrpham b7f779182d [SWDEV-448738] Added rocmsmi extremum command as 'set -L'
Change-Id: I997c630bd20cc61673813a2301eb5e3002619a32
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>

Change-Id: Ifa884303f9a0fa058af093a23f5be449bba54f29
2024-09-18 14:51:01 -04:00
Juan Castillo ac593f9fa0 [SWDEV-482966/ SWDEV-482967] Removing pytest dependency + install path change
Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Change-Id: I7aace93fcad18d67443e6849c10a1fbbc65d0fa8
2024-09-18 00:27:00 -04:00
gabrpham 0d4b332fe4 Removed _validate_positive function and replaced with _positive_int or _not_negative_int as appropriate
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: I01effcdf9bae31fd8bc926c5d4bdf58274838618
2024-09-17 18:37:16 -04:00
Maisam Arif 639daa3d90 Fixed amdsmi_get_utilization_count() wrapper generation
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ifd59fca042c4b3b0fc53e100b6892c6b4f7b3e95
2024-09-17 16:34:42 -04:00
Eisuke Kawashima 1b6ec8df07 chore: unset executable permission
Change-Id: I06727774f3b1657a7955b172a40d0dfc9c76d6b9
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-09-16 17:34:39 -04:00
Eisuke Kawashima 5fdcaf1248 fix(python): fix comparison to True/False
from PEP8 (https://peps.python.org/pep-0008/#programming-recommendations):

> Comparisons to singletons like None should always be done with is or
> is not, never the equality operators.

Change-Id: I710d64c380eaf420f0ad29e65623ee677b094051
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-09-16 17:34:39 -04:00
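
The PEP 8 recommendation quoted in 5fdcaf1248 can be shown with a small sketch. The helper names are hypothetical; the point is only that `==` and `is` give different answers for values that merely compare equal to a singleton.

```python
def is_enabled_loose(value):
    # Equality accepts anything that compares equal to True, such as 1.
    return value == True  # noqa: E712


def is_enabled_strict(value):
    # Identity accepts only the True singleton itself.
    return value is True


def is_unset(value):
    # None is a singleton, so identity is the recommended check.
    return value is None
```

For example, `is_enabled_loose(1)` is true while `is_enabled_strict(1)` is false, which is the kind of silent mismatch an identity check avoids.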
Maisam Arif 37847165c3 Revert "[SWDEV-482963] [Test updates] Add new tests for p0 items - BM"
This reverts commit f34eb94ef2.

Change-Id: Icf9fedaca2976b8ff1bc17aff8b598bfce18f095
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2024-09-16 16:17:16 -04:00
Maisam Arif 105db1afcd Updated License Dates
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I8ca199c129c06508bc3e23745ab5ac2d20dce928
2024-09-16 16:14:47 -04:00
Broderick Gardner a3b0bc5390 Fix amdsmi_get_clk_freq list size
Python list slice is exclusive for the end index, so this -1 is cutting off an element.

Change-Id: I309a0a41447405b1aac465472871e169f2c405e8
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-09-16 15:45:44 -04:00
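
The off-by-one described in a3b0bc5390 follows from Python slices excluding the end index. A minimal sketch, assuming a fixed-size buffer padded with zeros and a separate count of valid entries (both names are hypothetical, not the wrapper's actual variables):

```python
def take_frequencies(buffer, num_supported):
    # buffer[:n] already stops before index n, so it yields exactly n items.
    return buffer[:num_supported]


def take_frequencies_off_by_one(buffer, num_supported):
    # Subtracting 1 from an exclusive end index drops the last valid item.
    return buffer[:num_supported - 1]
```

With a buffer of `[500, 800, 1200, 0, 0]` and three supported entries, the first helper returns all three frequencies and the second returns only the first two.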
Michael John d75d127864 Properly escape Windows path \include in generator.py
Change-Id: I9042de7e9cb08c247b7bf21a8de2b8cbceb483da
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-09-16 15:45:15 -04:00
Michael John d9ccc44146 Use correct regex to avoid SyntaxWarning: invalid escape sequence '\.'
Change-Id: I1c6179be294bf21c0897a3abf7e8ab1d270ae238
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-09-16 15:45:15 -04:00
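
The two escaping fixes above (d75d127864 and d9ccc44146) both come down to backslashes in ordinary string literals. A minimal sketch of the usual remedy, assuming a hypothetical version pattern rather than the actual expressions in generator.py:

```python
import re

# A raw string keeps the backslash literally, so the regex engine sees `\.`.
VERSION_PATTERN = re.compile(r"(\d+)\.(\d+)")

# Equivalent spellings: a raw string or a doubled backslash.
RAW_DOT = r"\."
ESCAPED_DOT = "\\."
WINDOWS_INCLUDE = r"\include"  # same text as "\\include"


def parse_major_minor(text):
    """Return (major, minor) from the first `N.N` match, or None."""
    match = VERSION_PATTERN.search(text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))
```

Writing `"\."` or `"\include"` without the `r` prefix or a doubled backslash is what triggers the invalid escape sequence warning on recent Python versions.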
danzimm 91199279b0 Explicitly specify data_type in capture
Change-Id: I3a49ee3acc235df88c2df1d150803b2db2143aee
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-09-16 15:45:15 -04:00
Ryo Ficano f34eb94ef2 [SWDEV-482963] [Test updates] Add new tests for p0 items - BM
Change-Id: I3266ff7ab14959f1824f408a44e82b861d88d61f
Signed-off-by: Ryo Ficano <Ryo.Ficano@amd.com>
2024-09-13 22:29:45 -04:00
Maisam Arif 787d4462fa [SWDEV-482412] Optimized PCIe Bandwidth gpu_metrics calls
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ib37d232b94a080e9b490dd065628d2567aaf4642
2024-09-11 23:26:30 -05:00
Maisam Arif 8b3d45e301 Updated amdsmi_quick_start.py with cpus preloaded
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I4a5ca0d30d2fce3b4fa3a6a13599a18b0dd16ce7
2024-09-11 17:38:17 -04:00
Maisam Arif 397d8d9339 Bump Version to 24.6.4.0
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I75b6039c221ecea1e36a451d93bb52b5406bd106
2024-09-11 17:36:07 -04:00
Charis Poag a33e4c9e14 [SWDEV-483526] Fix MI3x partitions not showing all logical nodes
Changes:
- Updates to amdsmi_asic_info_t structure to include:
  target_graphics_version, kfd_id, node_id, partition_id
- Updates to amd-smi static --asic to display new
  amdsmi_asic_info_t fields
- Updates to gpu enumeration during amdsmi_init()
  to discover all logical GPUs when in a non-SPX mode
  (ex. DPX, TPX, QPX, or CPX)
- Updates to amdsmi_get_gpu_bdf_id(..) to include
  partition_id details when in BDF or optional bits.
    - bits [63:32] = domain
    - bits [31:28] or bits [2:0] = partition id
    - bits [27:16] = reserved
    - bits [15:8]  = Bus
    - bits [7:3] = Device
    - bits [2:0] = Function (partition id may be in bits [2:0]) <-- Fallback for non SPX modes

- C++/Python tests updated to reflect these outputs

Change-Id: I4be0ea35bb98f3109ae2ca9e82f6b21baa38de29
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-09-11 16:35:17 -05:00
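
The bit layout listed in a33e4c9e14 can be unpacked with plain shifts and masks. This is a sketch based only on the field positions given in that commit message; `decode_bdf_id` is a hypothetical helper, not part of the amdsmi API, and it does not decide which of the two partition id locations applies on a given system.

```python
def decode_bdf_id(bdf_id):
    """Split a 64-bit BDF id into the fields listed in the commit message."""
    return {
        "domain": (bdf_id >> 32) & 0xFFFFFFFF,   # bits [63:32]
        "partition_id": (bdf_id >> 28) & 0xF,    # bits [31:28]
        "bus": (bdf_id >> 8) & 0xFF,             # bits [15:8]
        "device": (bdf_id >> 3) & 0x1F,          # bits [7:3]
        # bits [2:0]; per the commit this may carry the partition id
        # instead, as a fallback in non-SPX modes.
        "function": bdf_id & 0x7,
    }
```

For instance, a value built as `(1 << 32) | (3 << 28) | (0x0C << 8) | (0x1F << 3) | 0x5` decodes to domain 1, partition id 3, bus 0x0C, device 0x1F, and function 5.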
Tim Huang 260edaa752 [SWDEV-463402] - Support retrieving connection type and P2P capabilities between two GPUs
1. Add an API interface, amdsmi_topo_get_p2p_status, to retrieve
the connection type and P2P capabilities between 2 GPUs.

2. Add a P2P status test in hw_topology_read
to print P2P capability information.

3. Add the tables below to the CLI topology subcommands:
  - CACHE COHERANCY TABLE
  - ATOMICS TABLE
  - DMA TABLE
  - BI-DIRECTIONAL TABLE

Change-Id: I199173030d4170115cea27c472958a4826e4e1bf
Signed-off-by: Tim Huang <tim.huang@amd.com>
2024-09-06 09:42:34 -04:00
Zhang Ava 92110c301d Merge amd-dev into amd-master 20240905
Signed-off-by: Zhang Ava <niandong.zhang@amd.com>
Change-Id: I19843f720dc105a343cd1921196549f8ac69e487
2024-09-06 14:36:21 +08:00
Maisam Arif 97c487372f Clean up unused files & Update License info
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I5b58e8fe3d9eeac207b07ce0fe4134dd717dbd90
2024-09-05 09:52:48 -04:00
Galantsev, Dmitrii fa4e488111 Remove python-clang dependency
python3-clang was only used to generate the python wrapper
We now use it only within the docker image for the generator

Change-Id: Id574f109b959d72f0734b0df4c26b3bbab3238fd
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-09-04 15:33:28 -05:00
Maisam Arif cf7c5813b7 Merge amd-dev into amd-master 20240904
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ibaaf5d0399ca1a1333aa789e50a8de66d5ddf426
2024-09-04 12:53:04 -05:00
Maisam Arif bc4ca45862 [SWDEV-450553] Added Subsystem Device ID to amd-smi static --asic
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: I428b4993cca027a6eb1bb9c617fe715118a59407
2024-09-04 12:51:02 -05:00
gabrpham 7d8e54d0e1 [SWDEV-450553] Added gpu memory overdrive to metric function
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: If7bd6865d641a5a83c594a4d3c57938b1b6dc18e
2024-09-04 12:54:14 -04:00
Maisam Arif e5569ee925 Fix C Library call in amdsmi_get_gpu_reg_table_info
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ib732ade8c0e48fdc7d09d920bca8b4fe4e773cac
2024-09-04 11:42:45 -05:00