Commit Graph

984 Commits

Author SHA1 Message Date
Oliveira, Daniel a2f04dd3bc fix: [rocm/amd_smi_lib] header cleanup Remove non-unified headers
Cleans up individual gpu metric APIs which will be implemented according to 'unified-headers' standards

Code changes related to the following:
  * '_get_gpu_metrics_' APIs
  * Functional tests

Change-Id: I2dd2ecde11c1d77e343e0ae0e10aeb9120ae9b99
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>


[ROCm/amdsmi commit: 55734d2d7a]
2024-01-26 10:38:48 -05:00
Deepak Mewar 8eac06c5fd amdsmi README updated for python interface
Change-Id: I92c1e8eb646488a9cdc32d0933f27e5db8c172ef


[ROCm/amdsmi commit: 3aabb927b4]
2024-01-25 02:19:38 -05:00
Maisam Arif 95f4b4eaf3 Updated engine_activity api
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I3f62e093fdc0254015c0837dca59763551d3659c


[ROCm/amdsmi commit: 0550c9352c]
2024-01-24 22:23:48 -05:00
Charis Poag f357c180e7 Fix metric type error output + re-align with ROCm SMI metrics
Changes:
* [CLI] Provide fix for "/opt/rocm/bin/amd-smi metric
TypeError: '>' not supported between instances of 'str' and 'i"
--> Python API was updated, CLI needed to reflect these changes
* [API] Updated amdsmi.h's with ROCm SMI
--> Incorrectly added mem_bandwidth_acc & mem_max_bandwidth
--> Realigned wrapper with updates
* [Test] Added metrics not shown in gpu_metrics_read.cc

Change-Id: Ia3a172377fd5a582254dd5a46d81dbec7e763cd9
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 34bd26c68e]
2024-01-24 21:23:40 -06:00
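The TypeError quoted in commit f357c180e7 above is the error Python raises when an ordering comparison mixes a string with a number. The quoted message is truncated in the log, so the sketch below only assumes the right-hand operand was an int; the `exceeds` helper and the "N/A" placeholder are hypothetical and are not taken from the amd-smi CLI.

```python
def exceeds(value, threshold):
    """Return True only if value is a real number greater than threshold.

    Non-numeric placeholders (such as the hypothetical "N/A" string used
    below) yield False instead of raising TypeError.
    """
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        return False
    return value > threshold


# Unguarded comparison: a str compared with an int raises TypeError.
try:
    "N/A" > 0
except TypeError as err:
    message = str(err)

print(message)
print(exceeds("N/A", 0), exceeds(42, 0))
```

Checking the type before comparing is one option; another is to normalize values once where they are read, so that later code only ever sees numbers or an explicit missing-value marker.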
Bill(Shuzhou) Liu 25ffbb0304 Unified API
amdsmi_get_link_metrics() and amdsmi_get_pcie_info()

Change-Id: Iea060e449813b842236243b772e8809497ce98fe


[ROCm/amdsmi commit: 0b67c2ccc4]
2024-01-24 18:27:20 -05:00
Deepak Mewar a8b48ff1e5 amdsmi README updated for esmi library usage
Change-Id: I1406f0b0434e735b7d1cc1d931e7a2c92dfba728


[ROCm/amdsmi commit: 9375b6f820]
2024-01-24 14:30:26 -05:00
Maisam Arif 7e831b1992 24.2.0 Version update
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ied7c24d63ca38c2e5ea5eca6b411e0156f61a403


[ROCm/amdsmi commit: c400a22d4d]
2024-01-24 11:13:02 -06:00
Maisam Arif 9eef868334 24.1.0 Version update
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ibfe92d199b10dc48ece85dfdeda1041f5ea98626


[ROCm/amdsmi commit: c48c989bbc]
2024-01-24 12:09:48 -05:00
Maisam Arif d269c35312 Fix subvendor_id error handling
Change-Id: Ibb2e8e329233221e72247674b4f2fbaef51baa32
Signed-off-by: Maisam Arif <maisarif@amd.com>


[ROCm/amdsmi commit: 5e25c0771b]
2024-01-24 10:59:14 -06:00
Maisam Arif b6a3bb8109 Corrected AmdSmiCacheTypeNames interface class
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Iec9c6097aec460b180a112be2d24293a40bde125


[ROCm/amdsmi commit: 94f41f2b70]
2024-01-24 07:48:30 -06:00
Maisam Arif 084d8f89d1 SWDEV-434348: Corrected Guest Vendor Name values
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Iee0d45fc64386f0417a0e30cce05608ca2186990


[ROCm/amdsmi commit: 53177525bf]
2024-01-24 07:34:06 -06:00
Maisam Arif 5edb7a559f Corrected Cache Type Enum
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I7d861d022e2855c35e4a79681f83977cc633d1c6


[ROCm/amdsmi commit: 2c87d95ffb]
2024-01-24 07:28:04 -06:00
Maisam Arif 3273fb6239 SWDEV-440760: Removed specific gpu_metric calls & fixed pcie metrics
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I679ecede4825c119925de3c9140453653f3f84aa


[ROCm/amdsmi commit: fec1173321]
2024-01-24 05:51:36 -06:00
Maisam Arif 037a4283cd SWDEV-441635: Updated amdsmi_get_utilization_count python API
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I588e1a61e000d9a5f77f0e8c63f4fef1ec76063e


[ROCm/amdsmi commit: 1ed5080433]
2024-01-24 05:51:36 -06:00
Maisam Arif 56f96b613e Handled unknown vram type out of bounds error
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I2d32c7043c78c0651f1b4db565a299b6b96abbcc


[ROCm/amdsmi commit: ee80c2cac4]
2024-01-24 06:50:17 -05:00
Charis Poag f2587543e8 SWDEV-436533 [CLI/Python API] Align Cache Info BM UI to Host
- [CLI] Refactored cache info to display
cache flags as "cache_properties" names.
Names are displayed as a list of comma-separated
cache type strings. Previously, values
were shown one by one as ENABLED.

ex.
CACHE_PROPERTIES = <a,b,c>

- [JSON] mirrors CLI fields.
No longer display "cache_flags", renamed
field as "cache_properties" dictionary. This
allows users to better understand the
list of names provided.

- [Python API] Updated amdsmi_get_gpu_cache_info
to mirror Host return.

README.md - updated to reflect all changes.

Change-Id: Ife2ef5adcef30058937d1376efb01749e45c02fb
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: fe86afed8c]
2024-01-24 06:21:55 -05:00
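The change described in commit f2587543e8 above, showing the set cache flags as one comma-separated list of names instead of one ENABLED line per flag, can be sketched as follows. The flag names and bit positions here are hypothetical placeholders, not the values defined by amdsmi, and `cache_properties` and `format_cli` are illustrative helpers rather than the CLI's actual functions.

```python
from enum import IntFlag


class CacheFlag(IntFlag):
    # Hypothetical names and bit positions, for illustration only.
    DATA_CACHE = 1
    INST_CACHE = 2
    CPU_CACHE = 4
    SIMD_CACHE = 8


def cache_properties(flags):
    """Return the names of all flags set in the bitmask, in bit order."""
    return [flag.name for flag in CacheFlag if flags & flag]


def format_cli(flags):
    """Render the names as a single comma-separated line."""
    return "CACHE_PROPERTIES = " + ", ".join(cache_properties(flags))


print(format_cli(CacheFlag.DATA_CACHE | CacheFlag.CPU_CACHE))
```

Returning a list of names from the API layer keeps the JSON and human-readable outputs consistent, since both can be built from the same list.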
Charis Poag 6dc774c275 Add ROCm 6.0 change log
Update our change log to reflect a few of the major updates
for ROCm 6.0.

Change-Id: I82157fcfad22e63b62d2409bdc979b312356abe8
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: c260819003]
2024-01-23 17:48:13 -05:00
Maisam Arif 05210b2c16 SWDEV-440462: Fixed metric functionality to Linux Guest
Change-Id: Ia69d01251d1e9bb3717bda3a7d0f752c739393a6
Signed-off-by: Maisam Arif <maisarif@amd.com>


[ROCm/amdsmi commit: 6292ac513c]
2024-01-21 02:46:15 -06:00
Charis Poag c5ba765be0 Merge rocm-smi/amd-staging into amd-dev 20240119
Change-Id: Ie706473ff92a91b19e95d2d58f64904cad73a89a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 6132074089]
2024-01-19 03:57:00 -05:00
Deepak Mewar c081e9e6f8 amdsmi wrapper generated for updated amdsmi_get_esmi_err_msg
Change-Id: I2388cd75111774852ae6426071d890bbf2d9d0c9


[ROCm/amdsmi commit: 730a82417e]
2024-01-16 11:41:22 -06:00
Deepak Mewar 6ef2131a21 amdsmi library updated for esmi error status mapping to amdsmi
Change-Id: I7e4dd146a1a9af496556efcf811b2e1ed565b09e


[ROCm/amdsmi commit: 5d0b479661]
2024-01-16 11:41:22 -06:00
khashaik a08809a3ca amdsmi_cli: Update help section
- Update help section

Change-Id: Ida8022a27ecc9df3ebef94e27e89624c18a9cf46


[ROCm/amdsmi commit: 27fbbc3388]
2024-01-16 11:41:22 -06:00
khashaik 6395351ac3 amdsmi_cli: Updated README.md file in the amdsmi_cli
- Update the README.md file in amdsmi_cli folder to include information
    for CPUs and cores along with the GPUs

Change-Id: I7670811696bc5299a287a6bc8883afe40eeeb557


[ROCm/amdsmi commit: 994b956d5e]
2024-01-16 11:41:22 -06:00
Deepak Mewar 1f7c6771eb amdsmi wrapper generated for updated hsmp metric table
Change-Id: I18c795e18d9c95320826cb965f36d3fb5546ea5c


[ROCm/amdsmi commit: 19451cc508]
2024-01-16 11:41:22 -06:00
Deepak Mewar 171f4818f4 amdsmi library updated for metric table structure
Change-Id: Ie8a9840a9020282599dd413e964d86bfb8850f6a


[ROCm/amdsmi commit: a0c95e855b]
2024-01-16 11:41:22 -06:00
khashaik a66efce2da amdsmi_cli: Add checks for no gpu devices, cpu and core devices
- Add checks for no gpu devices, cpu and core devices
  - Update units for core energy and cpu energy

Change-Id: Ieea43f1bb7fc303ebbbdf72f1ab22644a28df25c


[ROCm/amdsmi commit: 18d8087711]
2024-01-16 11:41:22 -06:00
khashaik c500be9b35 amdsmi_cli: Update parser to add neg values check for the cpu and core arguments
Change-Id: Ia7959826637e7749d999a6570df590221e85cf50


[ROCm/amdsmi commit: 108ae03c23]
2024-01-16 11:41:22 -06:00
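A negative-value check like the one commit c500be9b35 above adds to the parser can be expressed in argparse with a custom `type` callable. This is a minimal sketch under assumptions: the `--cpu` and `--core` option names, the `example` program name, and the `non_negative_int` helper are illustrative and may not match the actual amdsmi_cli parser.

```python
import argparse


def non_negative_int(text):
    """argparse type: parse text as an int and reject negative values."""
    try:
        value = int(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"{text!r} is not an integer")
    if value < 0:
        raise argparse.ArgumentTypeError(f"{text!r} must not be negative")
    return value


def build_parser():
    # Hypothetical option names; each accepts one or more device indices.
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument("--cpu", type=non_negative_int, nargs="+")
    parser.add_argument("--core", type=non_negative_int, nargs="+")
    return parser


args = build_parser().parse_args(["--cpu", "0", "2"])
print(args.cpu)
```

Raising `argparse.ArgumentTypeError` lets argparse report the problem as a normal usage error and exit, instead of surfacing a traceback later when the index is used.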
khashaik 47ca69f2a6 amdsmi_cli: Fix issues for CPU related API's for DIMM
- Fix interface issues for dimm temperature, dimm refresh rate and dimm power consumption

Change-Id: I998209c8314e4d78a842187c5a0b127aea7dbef2


[ROCm/amdsmi commit: 4971466c22]
2024-01-16 11:41:22 -06:00
Deepak Mewar 1a2b556dce amdsmi interface updated to additionally return the freq src
from amdsmi_get_cpu_socket_current_active_freq_limit

Change-Id: I48f1026474115848a30352637415e7a1a52f3481


[ROCm/amdsmi commit: 7dcd5a3fd6]
2024-01-16 11:41:22 -06:00
Deepak Mewar 148ecb1805 amdsmi interface updated for amdsmi_get_metrics_table units
Change-Id: If211292e894df9d832b879252bebf91c17112d14


[ROCm/amdsmi commit: 898c4bc06f]
2024-01-16 11:41:22 -06:00
khashaik 323cf14a9c amdsmi_cli: Fix issues in cpu API "cpu_lclk_dpm_level"
- Fix issues in cpu API "cpu_lclk_dpm_level"
  - Fix issue for invalid core id
  - Update the error message for invalid devices

Change-Id: I71216ff72f89cfe0c86928ae3dce1f88eae91665


[ROCm/amdsmi commit: 256907989b]
2024-01-16 11:41:22 -06:00
Deepak Mewar 8bd95a26b4 amdsmi_cli: Enabled hsmp metric table from CLI
Change-Id: I7f9c13255f952136438249f5180dec5586d01bd7


[ROCm/amdsmi commit: c74f01f401]
2024-01-16 11:41:22 -06:00
Deepak Mewar 5d5bb11625 amdsmi interface updated for amdsmi_get_metrics_table encodings
Change-Id: Iffed4071d5b2b5645f8118f3fbce26ab258e7882


[ROCm/amdsmi commit: 1b1591571b]
2024-01-16 11:41:22 -06:00
khashaik 68a49c6c27 amdsmi_cli: Fix issues in "cpu_enable_apb" API
Change-Id: I8237fb4641f1a6aecec815fdc020abbf9a3195ba


[ROCm/amdsmi commit: 087a0d3ead]
2024-01-16 11:41:22 -06:00
Deepak Mewar 52c1014196 amdsmi interface updated for amdsmi_get_metrics_table
Change-Id: I0618dd411caf6d30f74793e937984273f9c5b70e


[ROCm/amdsmi commit: 31dc8d0ee8]
2024-01-16 11:41:22 -06:00
Deepak Mewar a45d2e1684 amdsmi wrapper generated for updated amdsmi_get_metrics_table
Change-Id: Id55a5647064998d8f546c806f857a8745afe52ea


[ROCm/amdsmi commit: 4ecf25e882]
2024-01-16 11:41:22 -06:00
Deepak Mewar 3a00172186 amdsmi library and sample code updated for amdsmi_get_metrics_table
Change-Id: Ie03c556f5c38fe4a0365743d3a94220e3aa62b23


[ROCm/amdsmi commit: 9f3a6dbd29]
2024-01-16 11:41:22 -06:00
Charis Poag 23a0cb827f GPU Usage/activity update
CLI:
Every usage field is notated by "activity"
gfx_usage -> gfx_activity
umc_usage -> umc_activity
vcn_activities -> vcn_activity
jpeg_activities[AID#] -> jpeg_activity

Wrapper: fixed metric output, misalignment
with generator

update_wrapper.sh:
DOCKER_BUILDKIT to 0 (if unset)

API:
amdsmi_get_gpu_metrics_info:
1.3: Removed commenting out avg socket power

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: Id3fcc20aef420c7b7a90ba22fa3bc643b2716333


[ROCm/amdsmi commit: 4575990ae7]
2024-01-15 23:34:08 -06:00
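For scripts that consumed the old CLI field names, the renames listed in commit 23a0cb827f above can be applied with a small mapping. This is only a sketch: the flat dictionary layout is an assumption, the `migrate_usage_fields` helper is hypothetical, and the per-AID suffix on `jpeg_activities` is ignored here.

```python
# Old-to-new field names, taken from the commit message above.
RENAMED_FIELDS = {
    "gfx_usage": "gfx_activity",
    "umc_usage": "umc_activity",
    "vcn_activities": "vcn_activity",
    "jpeg_activities": "jpeg_activity",
}


def migrate_usage_fields(record):
    """Return a copy of record with old field names replaced by new ones.

    Keys that are not in RENAMED_FIELDS are kept unchanged.
    """
    return {RENAMED_FIELDS.get(key, key): value for key, value in record.items()}


old_record = {"gfx_usage": 12, "umc_usage": 3, "mem_clock": 900}
print(migrate_usage_fields(old_record))
```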
Bill(Shuzhou) Liu 28f354796d Use the same mutex as rocm-smi
Share the same mutex as rocm-smi implementation. Handle the crash
when a user is not in render group.

Change-Id: I486b26569f9b523b41bbdaf95d51f4a730978cfd


[ROCm/amdsmi commit: 5a6b5d2a0a]
2024-01-15 13:12:49 -05:00
Charis Poag 31081fa8b0 Fix AMD-SMI test segmentation fault TestGpuMetricsRead
Issue: need to return on any failure.
The nullptr check test would segfault without-
all values in struct are not initialized.

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: I4987fb73ba9bcb182de7a439a4286333a41bf7eb


[ROCm/amdsmi commit: d74be3120e]
2024-01-14 19:27:34 -06:00
Galantsev, Dmitrii 64969f2c61 SWDEV-409184 - Exclude some tests in VM
Change-Id: Ic196a113426fc63a0b2aadfa04ab4b10ed6434e3
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: a60f5d2d4c]
2024-01-11 01:38:15 -06:00
Bill(Shuzhou) Liu 29f1584b9c Add the ROCm version in CLI
Print the ROCm version in CLI

Change-Id: I529201274e114bde44722aa9a6aec13c2bedecf7


[ROCm/amdsmi commit: 9dd24a2b5a]
2024-01-04 15:25:08 -06:00
khashaik 3557e98535 amdsmi_cli: Fix issue in static cpu to fetch only smu fw version when smu is passed as option
- Fix issue in static cpu to fetch only smu fw version or prototype when smu or prototype is
    passed as option

Change-Id: Idec3b4e571ae576d1f71df74fa9a5befea5a1585


[ROCm/amdsmi commit: 1c90b1dea7]
2023-12-21 00:34:18 -05:00
khashaik 313dbe77b9 amdsmi: Interface: Add units to the cpu related interfaces.
Change-Id: I294439c345a3e4ca399eb6b3f53eb1f18777180a


[ROCm/amdsmi commit: cdf31b8d6a]
2023-12-21 00:09:23 -05:00
Charis Poag 601a254f37 Fix GPU metric tests & cleanup test output
- CLI: Added average_power to display if current_power is empty
- CLI: fixed PCIe current_speed not displaying GT/s
- ROCm API: 1.3 & 1.4
  -> commented out setting avg clocks to current clock value
     (leave as max uint value, not re-assign; these are not same values)
  -> commented out setting current_socket_power = average_power
     (leave as max uint value, not re-assign; these are not same values)
  -> For all non-array clocks, placed value in first
     array[0] to keep outputs consistent
     (helps xcd calc)
- ROCm API: rsmi_dev_metrics_curr_gfxclk_get fixed to count
  XCDs using backwards compatible rsmi_dev_gpu_metrics_info_get.
- ^ Fixes XCD count overall + assigning clock[0] in 1.3 to curr
  freq
- AMD SMI API: amdsmi_get_gpu_metrics_info() initialized all new
  1.5 metric values for all lower metric tables
- AMD SMI API: wrapper -> fix is here + returns correct AMD SMI return
- AMD SMI API: wrapper -> now displays amdsmi return status as
  string in logs
- gpu_metrics_read.cc -> now has better overview of backwards
  compatible output
- gpu_metrics_read.cc -> Cleaned up output, added units, and
  display all array output

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: Id5b60ded5b0ed2cdf0f96ca72c79e356f0410960


[ROCm/amdsmi commit: 5ff5af0b5a]
2023-12-19 14:18:15 -05:00
Naveen Krishna Chatradhi e924266a25 amdsmi_cli: Add support for CPU specific API in amdsmi_cli tool
- Add support for only CPU if only the hsmp driver is present.
  - Add support for both the amdgpu and amdcpu's if both the amdgpu driver and cpu's are present.
  - Add support for socket power metrics
  - Add support for hsmp proto type version, prochot status, read current fclkmclk freq
    and current cclk freq limit, c0 residency, lclk dpm level range, socket frequency range
  - Add CPU socket current frequency limit.
  - Update tool for API's IO bandwidth, XGMI bandwidth,
    power telemetry rails, APB enable and APB disable API's
  - Add support set_pow_limit, set_xgmi_link_width, set_lclk_dpm_level, core_boost_limit,
    curr_active_freq_core_limit, set_soc_boost_limit and set_core_boost_limit.
  - Add support for the following cpu related API's in tool
    core_energy, socket energy, set power efficiency mode, ddr bandwidth,
    cpu temperature, dimm temperature range rate, dimm power consumption
    and dimm thermal temperature.
  - Add support for set_gmi3_link_width, set_pcie_lnk_rate, set_df_pstate_range

Change-Id: I5a35d1cceeb7df0bc8b7116df7c27bb7f376e839


[ROCm/amdsmi commit: 19030e5b72]
2023-12-18 06:31:49 -05:00
Naveen Krishna Chatradhi 37f1d47b0e amdsmi: py-interface: Add python interface for esmi api
Change-Id: I4a3ab1168a7d1bf011ecc9c508e111c281503520


[ROCm/amdsmi commit: 94d3c563a3]
2023-12-18 06:31:35 -05:00
Naveen Krishna Chatradhi 4bd015f945 amd-smi: fix cpu specific apis and header
1. provide prototype and documentation for esmi specific api.
   define structures and update classes as required
2. update cmake files as required and add esmi api to the
   amdsmi esmi integration example.

Change-Id: I753ec176f9b381e74c9646525dfd9075237bf8d9


[ROCm/amdsmi commit: 65eed73f4d]
2023-12-18 06:28:15 -05:00
Charis Poag 4f502e5dab Add vcn and jpeg activity
Changes:
    - Add new engine field vcn_activity (from 1.4/1.5
      gpu_metrics)
    - Updated log output to enhance view of gpu_metric
      data as json pretty print
    - Added new fields provided in 1.5
    - Added unit overview in python API, CLI is WIP

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: I7d9f29e7ecc35dcd0697814c222cdd02b0d5518e


[ROCm/amdsmi commit: 8f3861e1d9]
2023-12-15 22:18:46 -05:00
Maisam Arif 030a971ce4 SWDEV-437729 - Post install script fixes for pip & pyyaml
Change-Id: If5c4a7947764a0cb5717c906198436304fa62784
Signed-off-by: Maisam Arif <maisarif@amd.com>


[ROCm/amdsmi commit: 498fde5cf4]
2023-12-15 13:34:56 -06:00