614 Commits

Author SHA1 Message Date
Deepak Mewar d41232363c DCSM-371 - Observing previous mode details as null for amdsmi_set_cpu_pcie_link_rate
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
Change-Id: I79a61d7b10aaff27b07e3d108a9b817c5ead6cf3


[ROCm/amdsmi commit: f48e3f48a3]
2024-02-22 16:30:18 -05:00
Bill(Shuzhou) Liu 21cf0c1b5c Unify the amdsmi_get_pcie_info python interface
Make the python interface consistent with the C interface.

Change-Id: Idda08f888947c757e475d5a024b0ec3d8e1d846a


[ROCm/amdsmi commit: db33cda0c1]
2024-02-22 03:33:59 -05:00
Maisam Arif 2c3537e389 Refactor ESMI Initialization and Argument Parsing
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Iefab3a8110e0d3c525ee0cef1bdef9101550e9de


[ROCm/amdsmi commit: f58613561c]
2024-02-21 19:02:14 -05:00
Maisam Arif c3bfdbe806 Aligned cache property enum with Host
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ie64a33f55c9a9a7cc8c806419509897351f37c70


[ROCm/amdsmi commit: 703fdb0ed2]
2024-02-20 05:48:53 -06:00
Maisam Arif 4728d05c5f Align list and cache_info to Host
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I4fa55b360b74d5a202d0b9b4eb7aee660b0a1bcf


[ROCm/amdsmi commit: 77710921a4]
2024-02-15 01:47:59 -05:00
Oliveira, Daniel e9e246f23d fix: [rocm/amd_smi_lib] amdsmi_get_gpu_activity gfx/memory activity does not update
Checks and forces rereading gpu metrics unconditionally

Code changes related to the following:
  * Device::dev_log_gpu_metrics()
  * amdsmi_get_gpu_metrics_header_info()
    Removed unintentionally during work on 'header cleanup Remove non-unified headers'
  * Examples
  * Unit tests

Change-Id: I83710e173c0f7102d0b7f865c18474c979a95cd8
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: 78074d7d77]
2024-02-13 10:15:17 -06:00
Maisam Arif b6f62bb651 Renamed amdsmi_get_metrics_table to amdsmi_get_cpu_metrics_table
Renamed structs to be more consistent with what they are calling

Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I6f2be2fcb76f004aa592f0dad8545565700ccd4b


[ROCm/amdsmi commit: f831cf49f7]
2024-02-12 16:30:18 -06:00
Maisam Arif b3387610de Added Navi21 Device ID
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I0765283afda4c5cb04e2ad986863ec788da233cc


[ROCm/amdsmi commit: 9fe10fc98a]
2024-02-07 05:18:47 -05:00
Deepak Mewar 1bbb19c8b7 Added amdsmi cpu family & cpu model
- Updated header and source files
- Updated python interface
- Generated python wrapper for updated header
- Updated the CLI to have cpu family & cpu model
  as part of metric table

Change-Id: Iea440251797270d5d29ffe883b0ad6db790be658


[ROCm/amdsmi commit: 6f7273fda5]
2024-02-06 18:46:27 -05:00
Maisam Arif 39537d999d SWDEV-436533 - Cache Info Struct Update
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ic640fa657cdcc32d7b00ff78fc9452ec7e05dd07


[ROCm/amdsmi commit: 88192d8b6b]
2024-02-05 16:51:04 -05:00
Maisam Arif d5f2a6770a Fixed gpu_metric and cache cli checks
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ic71e2b50dfa8fc106a17079842a7564a8e24b69d


[ROCm/amdsmi commit: 59d885a9ca]
2024-02-01 05:47:18 -05:00
Oliveira, Daniel a2f04dd3bc fix: [rocm/amd_smi_lib] header cleanup Remove non-unified headers
Cleans up individual gpu metric APIs which will be implemented according to 'unified-headers' standards

Code changes related to the following:
  * '_get_gpu_metrics_' APIs
  * Functional tests

Change-Id: I2dd2ecde11c1d77e343e0ae0e10aeb9120ae9b99
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: 55734d2d7a]
2024-01-26 10:38:48 -05:00
Charis Poag f357c180e7 Fix metric type error output + re-align with ROCm SMI metrics
Changes:
* [CLI] Provide fix for "/opt/rocm/bin/amd-smi metric
TypeError: '>' not supported between instances of 'str' and 'i"
--> Python API was updated, CLI needed to reflect these changes
* [API] Updated amdsmi.h's with ROCm SMI
--> Incorrectly added mem_bandwidth_acc & mem_max_bandwidth
--> Realigned wrapper with updates
* [Test] Added metrics not shown in gpu_metrics_read.cc

Change-Id: Ia3a172377fd5a582254dd5a46d81dbec7e763cd9
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 34bd26c68e]
2024-01-24 21:23:40 -06:00
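
The TypeError quoted in the commit above is the kind Python raises when `>` is applied to a string and a number. A minimal sketch of guarding such a comparison follows; it is not the actual CLI fix. The function name `exceeds_threshold` and the `"N/A"` placeholder value are hypothetical, chosen only for illustration.

```python
def exceeds_threshold(value, threshold):
    """Return True only when value is numeric and greater than threshold.

    Hypothetical helper: a metric that comes back as a placeholder string
    (assumed here to look like "N/A") is treated as not exceeding anything,
    instead of being compared with '>' and raising TypeError.
    """
    # bool is a subclass of int, so exclude it explicitly.
    if isinstance(value, (int, float)) and not isinstance(value, bool):
        return value > threshold
    return False


if __name__ == "__main__":
    for sample in (90, 10, "N/A"):
        print(sample, exceeds_threshold(sample, 80))
```

The type check keeps the comparison from running on non-numeric values; whether a missing metric should count as "below threshold" or be reported separately is a design choice this sketch does not settle.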
Bill(Shuzhou) Liu 25ffbb0304 Unified API
amdsmi_get_link_metrics() and amdsmi_get_pcie_info()

Change-Id: Iea060e449813b842236243b772e8809497ce98fe


[ROCm/amdsmi commit: 0b67c2ccc4]
2024-01-24 18:27:20 -05:00
Maisam Arif 084d8f89d1 SWDEV-434348: Corrected Guest Vendor Name values
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Iee0d45fc64386f0417a0e30cce05608ca2186990


[ROCm/amdsmi commit: 53177525bf]
2024-01-24 07:34:06 -06:00
Maisam Arif 56f96b613e Handled unknown vram type out of bounds error
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I2d32c7043c78c0651f1b4db565a299b6b96abbcc


[ROCm/amdsmi commit: ee80c2cac4]
2024-01-24 06:50:17 -05:00
Deepak Mewar 6ef2131a21 amdsmi library updated for esmi error status mapping to amdsmi
Change-Id: I7e4dd146a1a9af496556efcf811b2e1ed565b09e


[ROCm/amdsmi commit: 5d0b479661]
2024-01-16 11:41:22 -06:00
Deepak Mewar 171f4818f4 amdsmi library updated for metric table structure
Change-Id: Ie8a9840a9020282599dd413e964d86bfb8850f6a


[ROCm/amdsmi commit: a0c95e855b]
2024-01-16 11:41:22 -06:00
Deepak Mewar 3a00172186 amdsmi library and sample code updated for amdsmi_get_metrics_table
Change-Id: Ie03c556f5c38fe4a0365743d3a94220e3aa62b23


[ROCm/amdsmi commit: 9f3a6dbd29]
2024-01-16 11:41:22 -06:00
Bill(Shuzhou) Liu 28f354796d Use the same mutex as rocm-smi
Share the same mutex as rocm-smi implementation. Handle the crash
when a user is not in render group.

Change-Id: I486b26569f9b523b41bbdaf95d51f4a730978cfd


[ROCm/amdsmi commit: 5a6b5d2a0a]
2024-01-15 13:12:49 -05:00
Charis Poag 31081fa8b0 Fix AMD-SMI test segmentation fault TestGpuMetricsRead
Issue: need to return on any failure.
Without the early return, the nullptr check test would segfault because
the values in the struct are not initialized.

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: I4987fb73ba9bcb182de7a439a4286333a41bf7eb


[ROCm/amdsmi commit: d74be3120e]
2024-01-14 19:27:34 -06:00
Charis Poag 601a254f37 Fix GPU metric tests & cleanup test output
- CLI: Added average_power to display if current_power is empty
- CLI: fixed PCIe current_speed not displaying GT/s
- ROCm API: 1.3 & 1.4
    -> commented out setting avg clocks to current clock value
       (leave as max uint value, not re-assign; these are not same values)
    -> commented out setting current_socket_power = average_power
       (leave as max uint value, not re-assign; these are not same values)
    -> For all non-array clocks, placed value in first
       array[0] to keep outputs consistent (helps xcd calc)
- ROCm API: rsmi_dev_metrics_curr_gfxclk_get fixed to count
  XCDs using backwards compatible rsmi_dev_gpu_metrics_info_get.
- ^ Fixes XCD count overall + assigning clock[0] in 1.3 to curr
  freq
- AMD SMI API: amdsmi_get_gpu_metrics_info() initialized all new
  1.5 metric values for all lower metric tables
- AMD SMI API: wrapper -> fix is here + returns correct AMD SMI return
- AMD SMI API: wrapper -> now displays amdsmi return status as
  string in logs
- gpu_metrics_read.cc -> now has better overview of backwards
  compatible output
- gpu_metrics_read.cc -> Cleaned up output, added units, and
  display all array output

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: Id5b60ded5b0ed2cdf0f96ca72c79e356f0410960


[ROCm/amdsmi commit: 5ff5af0b5a]
2023-12-19 14:18:15 -05:00
Naveen Krishna Chatradhi 4bd015f945 amd-smi: fix cpu specific apis and header
1. provide prototype and documentation for esmi specific api.
   define structures and update classes as required
2. update cmake files as required and add esmi api to the
   amdsmi esmi integration example.

Change-Id: I753ec176f9b381e74c9646525dfd9075237bf8d9


[ROCm/amdsmi commit: 65eed73f4d]
2023-12-18 06:28:15 -05:00
Charis Poag 4f502e5dab Add vcn and jpeg activity
Changes:
    - Add new engine field vcn_activity (from 1.4/1.5
      gpu_metrics)
    - Updated log output to enhance view of gpu_metric
      data as json pretty print
    - Added new fields provided in 1.5
    - Added unit overview in python API, CLI is WIP

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: I7d9f29e7ecc35dcd0697814c222cdd02b0d5518e


[ROCm/amdsmi commit: 8f3861e1d9]
2023-12-15 22:18:46 -05:00
Bill(Shuzhou) Liu 9dc60e00cb Support max_num_cu_shared and num_cache_instance
Add the above fields to cache info. Remove driver_date from the CLI and
remove the disable properties of cache.

Change-Id: I80672490908d9e32a149076cc37459fa56b8b0bf


[ROCm/amdsmi commit: 59b510de2b]
2023-12-14 09:59:35 -05:00
Bill(Shuzhou) Liu 985ddbc5d5 Collect compute partition devices under the same socket
The socket represents a physical device, and the partition devices
should belong to the socket. The partition devices are only
different in function id in BDF. Use the BD part of the BDF to
identify a socket.

Change-Id: I5d355a6f5db02faa7555b760a36c7351b8d8d835


[ROCm/amdsmi commit: de7e74f7db]
2023-11-29 08:23:23 -06:00
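
The grouping described in the commit above can be sketched as follows. This is an illustration under stated assumptions, not the library's implementation: it assumes BDF strings in the common `domain:bus:device.function` text form (for example `0000:03:00.1`), and it keeps the domain in the grouping key alongside bus and device. The names `socket_key` and `group_by_socket` are hypothetical.

```python
from collections import defaultdict


def socket_key(bdf: str) -> str:
    """Return the BDF string without its function number.

    Assumes the text form "domain:bus:device.function"; everything
    before the last '.' is used as the socket identifier.
    """
    prefix, _, function = bdf.rpartition(".")
    if not prefix or not function:
        raise ValueError(f"not a BDF string: {bdf!r}")
    return prefix


def group_by_socket(bdfs):
    """Group BDF strings that differ only in function id (hypothetical helper)."""
    sockets = defaultdict(list)
    for bdf in bdfs:
        sockets[socket_key(bdf)].append(bdf)
    return dict(sockets)


if __name__ == "__main__":
    devices = ["0000:03:00.0", "0000:03:00.1", "0000:83:00.0"]
    for key, members in group_by_socket(devices).items():
        print(key, members)
```

Partition devices that share bus and device numbers end up under one key, which matches the idea that a socket is one physical device.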
Maisam Arif a8138bfd5e Change xgmi_physical_id to oam_id
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I35fb36ec0e9f72a7135d8bb9070dbdc0e956b93a


[ROCm/amdsmi commit: b54086a037]
2023-11-22 12:16:38 -06:00
Maisam Arif 09f4046345 Refactor gpu_metrics usage in CLI
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I599878971ab94a768d008f046f2d303ad76fdb3b


[ROCm/amdsmi commit: 5b36b438b7]
2023-11-22 03:32:55 -06:00
Maisam Arif ff96f50145 Refactor gpu_metrics usage in libraries
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I763638d4b546bf49b234e823df81028c357e8f49


[ROCm/amdsmi commit: d790ebc62b]
2023-11-22 03:32:15 -06:00
Bill(Shuzhou) Liu c7f9cff2cb Add APIs for PM table and register table
Read the PM table and register table as name-value pairs.

Change-Id: Ie44fe67a28af3341bd6beb90d809e90f280351ac


[ROCm/amdsmi commit: ac1ba33371]
2023-11-20 12:31:18 -05:00
Deepak Mewar 09125ca639 Added AMDSMI_CHECK_INIT to esmi library wrappers
Change-Id: Id187a9152399cdefec21a0d310bdb78f593426af


[ROCm/amdsmi commit: baa2c94a86]
2023-11-14 11:58:18 -05:00
Maisam Arif 888f0d67cb SWDEV-425887 - Corrected vbios population
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I3728064c77fae8cfa006254769a2cc821b4d5362


[ROCm/amdsmi commit: 6441abdc1a]
2023-11-14 11:56:43 -05:00
Maisam Arif 37a41c3bc8 SWDEV-426130 - Updated firmware subcommand output
Corrected truncation
	corrected xgmi to ta_xgmi
	remapped smc (system management controller) to pm (power management)

Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I404cefa7b90a454d4f4b08f6490448b47cf32107


[ROCm/amdsmi commit: 545e57d3e3]
2023-11-14 11:56:43 -05:00
Deepak Mewar 591221eee6 modified local esmi functions called from amdsmi_init
for gtest compatibility

Change-Id: I627c9887a1f1e340c358f060818a1a7d74ce33f9


[ROCm/amdsmi commit: 0c790752ac]
2023-11-10 15:50:42 -05:00
Maisam Arif 0a20cc33ab Updated License Dates
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Id6fd66b03c602232ecc1a063a534a15fe3a03f56


[ROCm/amdsmi commit: 5dba2f3120]
2023-11-07 03:57:08 -05:00
Maisam Arif 0a90c7cfc8 SWDEV-429037 - Automatically install amdsmi python lib with package
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I9c8c1335982ebd61a29da6f66c635f0a97d95f6e


[ROCm/amdsmi commit: c2e12feb6a]
2023-11-02 15:15:27 -04:00
Bill(Shuzhou) Liu e05f594cba Support cache type in cache info
Add the cache type to the cache info.

Change-Id: Ic13ca9640b65d2b414eeebe7b884530f2036aac8


[ROCm/amdsmi commit: 56b246cc3c]
2023-11-02 04:53:38 -05:00
Oliveira, Daniel f13cbb8d10 amd_smi_lib: Fix missing sym link causes segfault
Changes AMDSmiDrm to use the versioned library for its dependency

Code changes related to the following:
  * AMDSmiDrm::init()

Build changes related to the following: None

Change-Id: Ibd5b3dd88f679912acdfa292502003f58b28daf5
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: e20fd12934]
2023-10-31 10:33:34 -04:00
Deepak Mewar a09863f550 Esmi Auxiliary API wrappers removed from amdsmi library
that are called during amdsmi initialization
    amdsmi_get_cpu_family,
    amdsmi_get_cpu_model,
    amdsmi_get_cpu_threads_per_core,
    amdsmi_get_number_of_cpu_cores,
    amdsmi_get_number_of_cpu_sockets

Added amdsmi_get_cpucore_info to amdsmi library

Change-Id: Ib88d580e1d85afdf578963247e585cfae05c58ad


[ROCm/amdsmi commit: 28f6383639]
2023-10-30 20:59:21 -04:00
Galantsev, Dmitrii 21fa9c0950 SWDEV-424983 - Fix supported metrics api checks
Change-Id: I5c95bb3057dd7546036cbd87bbf7025469d2b3d5
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 88d5e011e6]
2023-10-30 17:28:59 -04:00
Maisam Arif b14d1ca543 SWDEV-410051 - Updates to board_info struct & CLI
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I8735d8965140ee5da0c35106b388af1dca87ec71


[ROCm/amdsmi commit: 2b4637ff9f]
2023-10-27 16:52:56 -05:00
Maisam Arif 0bddd17717 Updated READMEs & Versioning for 6.0 Release
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Idadece3c1022ecba4291b96ddbe23112e27394de


[ROCm/amdsmi commit: 5018a57b62]
2023-10-16 16:57:49 -05:00
Maisam Arif 3588704718 Enabled events subcommand to non-virtual systems
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ied56ef015bba606b1bca1a1108a237d0c1cc7fdb


[ROCm/amdsmi commit: ec24a0f66d]
2023-10-16 16:52:47 -05:00
Suma Hegde 1bf35b5c05 esmi: Clone open-source esmi repo as part of build
1. Remove esmi (internal gerrit) repo as git submodule
2. Clone esmi (open-source) repo during cmake using "git clone"
3. Download amd_hsmp.h header file during cmake build

TODO:
We can update the amd_hsmp.h to mainline linux kernel repo after
next Linux kernel release.

Change-Id: I763b5e287e24337c8e9e25f4e421cdb8698b9322


[ROCm/amdsmi commit: 597fb00bef]
2023-10-16 15:06:02 -04:00
Maisam Arif e77abc0a1d Added memory & compute partitions to amd-smi lib
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: If3acea6ad281298f1f05785b2e6d8e70fae8d89b


[ROCm/amdsmi commit: 1f8d9cb9ef]
2023-10-13 21:47:59 -04:00
Deepak Mewar fad0af9ba2 esmi: remove energy reporting, fix errors from clang compiler
Clang compiler reporting errors while generating python wrappers for esmi lib

Change-Id: I62352aba3b87f9a6b044c97af6b9fd649612b622


[ROCm/amdsmi commit: ee890c5060]
2023-10-13 14:45:25 -04:00
Bill(Shuzhou) Liu b9073f2bf7 Add new API for RAS related information
The API to get the EEPROM version and ECC schema.

Change-Id: Iee6b3c555541a33bf16bf9ac1fd60100dfff5643


[ROCm/amdsmi commit: d92d4e4b38]
2023-10-13 02:06:14 -04:00
Galantsev, Dmitrii f29d776cf6 CMAKE - Fix amdsmi lib version
This allows for lib version to change

before: libamd_smi.so.1.0
after:  libamd_smi.so.23.4

Change-Id: Iaba991afac4e625d11df2bacdf6287c6f8bf5383
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 69c35a4cff]
2023-10-12 22:13:30 -04:00
Dmitrii Galantsev 837611e992 Merge "Merge rocmsmi/amd-staging into amd-dev 20231010" into amd-dev
[ROCm/amdsmi commit: cb9875b056]
2023-10-12 00:46:08 -04:00
Galantsev, Dmitrii e3ee60fc5e Merge rocmsmi/amd-staging into amd-dev 20231010
Change-Id: I492562094a004eb78b2cc2b52d14d013d9f97112
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 6d72d65c48]
2023-10-11 18:58:12 -05:00