Commit Graph

314 commits

Author SHA1 Message Date
Maisam Arif d790ebc62b Refactor gpu_metrics usage in libraries
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I763638d4b546bf49b234e823df81028c357e8f49
2023-11-22 03:32:15 -06:00
Bill(Shuzhou) Liu ac1ba33371 Add APIs for PM table and register table
Read the PM table and register table as name-value pairs.

Change-Id: Ie44fe67a28af3341bd6beb90d809e90f280351ac
2023-11-20 12:31:18 -05:00
Maisam Arif 545e57d3e3 SWDEV-426130 - Updated firmware subcommand output
Corrected truncation
	corrected xgmi to ta_xgmi
	remapped smc(system management controller) to pm(power
management)

Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I404cefa7b90a454d4f4b08f6490448b47cf32107
2023-11-14 11:56:43 -05:00
Deepak Mewar 0c790752ac modified local esmi functions called from amdsmi_init
for gtest compatibility

Change-Id: I627c9887a1f1e340c358f060818a1a7d74ce33f9
2023-11-10 15:50:42 -05:00
Maisam Arif 5dba2f3120 Updated License Dates
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Id6fd66b03c602232ecc1a063a534a15fe3a03f56
2023-11-07 03:57:08 -05:00
Bill(Shuzhou) Liu 56b246cc3c Support cache type in cache info
Add the cache type to the cache info.

Change-Id: Ic13ca9640b65d2b414eeebe7b884530f2036aac8
2023-11-02 04:53:38 -05:00
Deepak Mewar 28f6383639 Esmi Auxiliary API wrappers removed from amdsmi library
that are called during amdsmi initialization
    amdsmi_get_cpu_family,
    amdsmi_get_cpu_model,
    amdsmi_get_cpu_threads_per_core,
    amdsmi_get_number_of_cpu_cores,
    amdsmi_get_number_of_cpu_sockets

Added amdsmi_get_cpucore_info to amdsmi library

Change-Id: Ib88d580e1d85afdf578963247e585cfae05c58ad
2023-10-30 20:59:21 -04:00
Maisam Arif 2b4637ff9f SWDEV-410051 - Updates to board_info struct & CLI
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I8735d8965140ee5da0c35106b388af1dca87ec71
2023-10-27 16:52:56 -05:00
Maisam Arif 5018a57b62 Updated READMEs & Versioning for 6.0 Release
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Idadece3c1022ecba4291b96ddbe23112e27394de
2023-10-16 16:57:49 -05:00
Maisam Arif 1f8d9cb9ef Added memory & compute partitions to amd-smi lib
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: If3acea6ad281298f1f05785b2e6d8e70fae8d89b
2023-10-13 21:47:59 -04:00
Deepak Mewar ee890c5060 esmi: remove energy reporting, fix errors from clang compiler
Clang compiler reporting errors while generating python wrappers for esmi lib

Change-Id: I62352aba3b87f9a6b044c97af6b9fd649612b622
2023-10-13 14:45:25 -04:00
Bill(Shuzhou) Liu d92d4e4b38 Add new API for RAS related information
The API to get the EEPROM version and ECC schema.

Change-Id: Iee6b3c555541a33bf16bf9ac1fd60100dfff5643
2023-10-13 02:06:14 -04:00
Galantsev, Dmitrii 6d72d65c48 Merge rocmsmi/amd-staging into amd-dev 20231010
Change-Id: I492562094a004eb78b2cc2b52d14d013d9f97112
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-10-11 18:58:12 -05:00
Galantsev, Dmitrii 1b606acf73 Fix amdsmi.h and update wrapper
Having an unnamed struct confuses our wrapper generator.
Adding a name solved it.

Change-Id: Iab3e73317fb21fb3667beef04878d4f3da96eadf
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-10-10 17:58:25 -05:00
Bill(Shuzhou) Liu 6ca95c1a2d Add support to XGMI physical id
Get XGMI physical id from sysfs.

Change-Id: Ifd9e431bc2fbfd759d888a71b99046a5eb07b6ed
2023-10-10 09:29:05 -04:00
Charis Poag 31a1fcce7d Add rsmi_dev_power_get
* Updates:
  - [API] Added rsmi_dev_power_get(uint32_t dv_ind,
                                   uint64_t *power,
                                   RSMI_POWER_TYPE
                                   *type)
          provides generic get to average or
          current power & provides backwards
          compatibility
  - Added a utility function to get MonitorTypes
    (monitor_type_string(type)) &
    RSMI_POWER_TYPE (power_type_string(type))
    strings
  - [Tests] Added rsmi_dev_power_get tests and
    provided better verification of return values for
    all power APIs
  - [Tests] Updated power outputs to show correct
    units
  - [example] Now uses avg, current, and generic
    power functions with type output response

Change-Id: I5ca06ca37fd5f61e100f2835b664d6cdd1ca42e6
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-10-10 00:34:19 -05:00
Deepak Mewar 192fb538be added metric table wrapper APIs & test code
Change-Id: I24207b3c32d7294337140a1f5108b81f3bf33580
2023-10-10 00:03:11 -04:00
Oliveira, Daniel 4e4ebde640 rocm_smi_lib: Fix Modernize and refactor gpu_metrics
Adds support for 'gpu_metrics_v1_4' and new counters

Code changes related to the following:
  * rsmi gpu_metrics APIs
  * rsmi gpu_metrics Logs
  * The new gpu_metrics are now part of the Device

Build changes related to the following: None

Change-Id: Ie748e977cd0a01c6a2fb82260014c0699605dbb3
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2023-10-09 21:43:22 -05:00
Charis Poag b251bb0c9f Rename NPS -> memory partition + compute partition node fix
* Updates:
        - rocm_smi_lib + CLI:
          Rename all "NPS mode" -> "memory partition"
          related files/functions/API/CLI to align with correct
          technical naming
        - rocm_smi_main: fixed identifying primary card's unique id
          utilize rsmi_dev_unique_id_get to map which
          KFD nodes belong to it
        - rsmi_dev_*_partition*: now have better logging output
        - compute partition tests:
          Added 20 sec delay for workaround until GPU
          busy is confirmed as the issue
        - CPPLint fixes/formatting
        - [Example] Moved all endl to "\n" for efficiency
        - [Example] Added Edge & Junction temperature examples
        - [Example] Added rsmi_minmax_bandwidth_get() example - WIP

Change-Id: Ida6db6fda7e0ac9d696a34cb15b4746e69d58d51
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-10-06 11:51:09 -04:00
Bill(Shuzhou) Liu 1a233f93fb APIs for the cache level and size
Read the cache level and size from the topology sysfs file.

Change-Id: Id3c558c95bcb79139a19e4adbaa7ff333d06098f
2023-10-05 11:10:54 -05:00
Maisam Arif 572bf563d1 Added driver_name to amdsmi_cli tool
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I8f3d52e0b23298443b2b16afec418cbbbc5f77e0
2023-10-04 08:54:19 -04:00
Maisam Arif fadf1b6cc9 SWDEV-410230 - Added slot_type to amd-smi static --bus
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I2006a3525a8aa9091bf54501461d364f7237f00f
2023-10-02 10:15:34 -04:00
Bill(Shuzhou) Liu 9eccf20f0c Get PCIe slot type
Add API to get the PCIe slot type.

Change-Id: If6894af53894c524d61c7586c59768541bbf0ac6
2023-09-27 23:31:09 -04:00
Maisam Arif 95337c88fc Added sleep state to amd-smi metric --clock
Change-Id: Idb5fbc84a787ef1affdf0449b6dd77ab6e50e91d
Signed-off-by: Maisam Arif <maisarif@amd.com>
2023-09-26 15:21:25 -05:00
Galantsev, Dmitrii 21dcf6d66c SWDEV-423796 - Resolve stack smashing issue
Inconsistency between struct fields caused stack smashing

Change-Id: Ib06d67723e062d4306420854ba7ab45fb252ffe3
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-09-25 11:24:55 -05:00
Galantsev, Dmitrii 31cc2eecfb Merge remote-tracking branch 'rocmsmi/amd-staging' into HEAD
Change-Id: I0661926c10eef2bc32b83d9a63a3a6eb6991e781
2023-09-25 04:35:53 -05:00
Maisam Arif 25b055014d Updated tool & lib versions & README.md
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ic41a36bcfa988ce9c8304157593012752857e919
2023-09-25 02:02:22 -05:00
Charis Poag f078375350 Add Current (Instant) Socket Power
* Updates:
    - rocm_smi_logger:
      General cleanup &
      Aligned to cpplint rules for usage
    - rocm_smi_monitor:
      Fixed MonitorTypes
      from not displaying properly in logs
      & Added socket power label + current
      socket power MonitorTypes
    - rocm_smi API:
      Added rsmi_dev_current_socket_power_get API
    - rocm_smi CLI:
      General cleanup,
      Concise info now displays device data
      in variable width (see printLogSpacer's
      new field),
      printLogSpacer now has an adjustable
      variable that overrides appWidth,
      Added Socket Power to base rocm-smi +
      --showpower CLI calls,
      --showpower & base rocm-smi CLI defaults
      to printing socket power (if not available,
      displays average power)
    - Cleaned up temp label references
    - power_read gtests:
      Added current socket power to testing

Change-Id: Ica57e6f98ad96e2584e7c7955e188f68d2dab89d
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-09-25 01:38:54 -04:00
Galantsev, Dmitrii 3d40c4bb2c SWDEV-422836 - Add sleep frequency support
Change-Id: I0bde403b010bf036ce44ed0600cc7eb03742c6b6
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-09-25 01:38:27 -04:00
Ori Messinger d44a6ef523 ROCm SMI LIB: Add Missing Firmware Blocks
The purpose of this patch is to add the following missing firmware
blocks to the SMI LIB:
-RSMI_FW_BLOCK_MES
-RSMI_FW_BLOCK_MES_KIQ

Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I5d4d37d883878dd02ef8533d4eb8891d54d70630
2023-09-25 01:37:38 -04:00
Galantsev, Dmitrii 2589d677b0 actvity -> activity
Change-Id: Ie31d9faca2181cb2d47f7f4764b64ed8cc7f8007
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-09-22 11:45:21 -05:00
Maisam Arif e4fac177c1 SWDEV-417124 - Implement Power Management
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ib0d37038e49cec61d5415076a46a5666d95dcea2
2023-09-21 14:23:26 -05:00
Oliveira, Daniel e0483f2ee2 rocm_smi_lib: Fix [linux BM] [AMDSMI] Memory Bandwidth
Implements APIs for 'gpu_metrics_v1_3' utilization averages

Code changes related to the following:
  * rsmi_dev_activity_metric_get()
  * rsmi_dev_activity_avg_mm_get()
  * CLI shows "Avg.Memory Bandwidth" under "--showmemuse"

Change-Id: I8e4600f350a7c18499abf022534db2b875f09d5f
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
2023-09-21 11:00:29 -04:00
Bill(Shuzhou) Liu f86f62b3f7 Return the driver loading status
When initializing the library, it can return a status indicating whether
the driver is loaded.

Change-Id: Id26b8058e32881ebe2514067a639a2a871d1f252
2023-09-18 08:38:16 -05:00
Maisam Arif 42b030def3 Spell check bandwith to bandwidth
Change-Id: Icfb3b2398fe0590dbab6e531c8ec1cdceebe658d
Signed-off-by: Maisam Arif <maisarif@amd.com>
2023-09-14 18:43:49 -04:00
Maisam Arif d2ef113457 SWDEV-412847 - Changed junction to hotspot
Change-Id: I7f6c1a0a77e6a09d2a3e831463cf03e35266bf40
Signed-off-by: Maisam Arif <maisarif@amd.com>
2023-09-14 17:43:26 -05:00
Shuzhou Liu ab615f6b2a Merge "Add API for the memory type" into amd-dev
2023-09-12 09:34:03 -04:00
Charis Poag ed6777a8e7 Add GPU partition nodes
* Updates:
    - Fixed infinite loop on systems
      which did not have VRAM files
    - Fixed concise info from throwing exception
      with no amdgpu driver loaded
    - Fix for ability to see all nodes
      after switching partitions (mirrors
      original card display/settings)
    - Added to logs build type, lib path,
      and set env. variables

Change-Id: Ic0333df355144ce2242cecea93fe4ce51caf311c
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2023-09-07 22:17:54 -05:00
Galantsev, Dmitrii 4aef767596 Cleanup rocm_smi.cc
Change-Id: Ia676c237222b0dd5d9e8a054a93776f3b11e2225
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-09-07 15:50:40 -04:00
Bill(Shuzhou) Liu b52034fed8 Add API for the memory type
Get the memory type from libdrm and add a new API.

Change-Id: I89327bca2ef860f2e3f4f6ca20def2331eba66c0
2023-09-07 13:05:58 -05:00
Bill(Shuzhou) Liu fab0542ab1 Fix doxygen warning messages
Doxygen will treat warnings as errors.

Change-Id: Ie7a7c9a823388c4140f31489604d65ec43005772
2023-09-07 08:48:38 -04:00
Deepak Mewar 14cf5f2762 Updated esmi error checking for graceful return
Change-Id: I1bcd498e3482dc7acd92b1a762f892b3dd978ff2
2023-09-04 08:27:12 -04:00
Dmitrii Galantsev f96c7663b5 Merge "Update amdsmi_wrapper.py and name fields" into amd-dev
2023-08-30 17:30:38 -04:00
Galantsev, Dmitrii 03cfdeefd5 Update amdsmi_wrapper.py and name fields
When updating the wrapper I ran into an issue with anonymous structs.
Generated wrapper would contain a string split into multiple lines,
which is invalid python.

e.g.
    'struct_struct anonymous
    (struct.... amdsmi.h:355)'

After naming the structs - the issue is gone. BDF union now has to be
addressed with .fields

e.g.
    OLD: bdf.function_number
    NEW: bdf.fields.function_number

Change-Id: Ib3c640c088ad0cc67893d636827356902051f17f
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-08-30 16:30:03 -05:00
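
The `.fields` change described in commit 03cfdeefd5 can be illustrated with a small, self-contained ctypes sketch. This is not the generated amdsmi_wrapper.py: the class names `BdfFields` and `Bdf`, the `as_uint` member, and the bit widths are assumptions made for illustration. It only shows why, once the inner struct has a name, its members are reached through that name rather than directly on the union.

```python
import ctypes


class BdfFields(ctypes.Structure):
    # Hypothetical bit layout, assumed for illustration only.
    _fields_ = [
        ("function_number", ctypes.c_uint64, 3),
        ("device_number", ctypes.c_uint64, 5),
        ("bus_number", ctypes.c_uint64, 8),
        ("domain_number", ctypes.c_uint64, 48),
    ]


class Bdf(ctypes.Union):
    # The inner struct is a named member ("fields"), not an anonymous one,
    # so its members are not promoted onto the union itself.
    _fields_ = [
        ("fields", BdfFields),
        ("as_uint", ctypes.c_uint64),  # hypothetical raw view of the same bits
    ]


bdf = Bdf()
bdf.fields.function_number = 1   # NEW style: go through .fields
bdf.fields.bus_number = 0x03

# OLD style (bdf.function_number) no longer resolves on the union.
old_style_works = hasattr(bdf, "function_number")
```

Callers written against the old layout would need the same one-level change shown in the commit message (`bdf.function_number` becomes `bdf.fields.function_number`).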
Shuzhou Liu fc5b481124 Merge "Support PCIe vendor name" into amd-dev
2023-08-30 09:58:21 -04:00
Deepak Mewar f1ade88d47 wrapper API to get first online core on cpu socket
Change-Id: Ia1785f94ff687e53fdb868e56d4a83c2466ba2ed
2023-08-29 05:15:33 -04:00
Deepak Mewar 0baa3f6b6a Renamed esmi library APIs and bound the APIs
to cpusocket handle

Change-Id: I6e3d8aa667df475339c28b27294349843f32230c
2023-08-29 05:15:12 -04:00
Deepak Mewar 7c0e21ddc7 Wrapper API declared for esmi error status
Change-Id: Ie3e00a50740d9ba58d7f4955ea6b76ab8b46fb5e
2023-08-29 05:14:01 -04:00
Galantsev, Dmitrii 1d24dd93a6 Fix uint32* -> int32* conversion error
Change-Id: I23c2a842468896e8d120ac4b8b55ef433dff6d85
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-08-28 18:32:31 -05:00
Bill(Shuzhou) Liu 9021ef96dc Support PCIe vendor name
Add the support for PCIe vendor name.

Change-Id: Ibc1d289a08731e4c5a14f992f3b0d31b51482396
2023-08-28 16:46:43 -05:00