Commit Graph

1045 Commits

Author SHA1 Message Date
Maisam Arif c400a22d4d 24.2.0 Version update
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ied7c24d63ca38c2e5ea5eca6b411e0156f61a403
2024-01-24 11:13:02 -06:00
Maisam Arif c48c989bbc 24.1.0 Version update
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ibfe92d199b10dc48ece85dfdeda1041f5ea98626
2024-01-24 12:09:48 -05:00
Maisam Arif 5e25c0771b Fix subvendor_id error handling
Change-Id: Ibb2e8e329233221e72247674b4f2fbaef51baa32
Signed-off-by: Maisam Arif <maisarif@amd.com>
2024-01-24 10:59:14 -06:00
Maisam Arif 94f41f2b70 Corrected AmdSmiCacheTypeNames interface class
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Iec9c6097aec460b180a112be2d24293a40bde125
2024-01-24 07:48:30 -06:00
Maisam Arif 53177525bf SWDEV-434348: Corrected Guest Vendor Name values
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Iee0d45fc64386f0417a0e30cce05608ca2186990
2024-01-24 07:34:06 -06:00
Maisam Arif 2c87d95ffb Corrected Cache Type Enum
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I7d861d022e2855c35e4a79681f83977cc633d1c6
2024-01-24 07:28:04 -06:00
Maisam Arif fec1173321 SWDEV-440760: Removed specific gpu_metric calls & fixed pcie metrics
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I679ecede4825c119925de3c9140453653f3f84aa
2024-01-24 05:51:36 -06:00
Maisam Arif 1ed5080433 SWDEV-441635: Updated amdsmi_get_utilization_count python API
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I588e1a61e000d9a5f77f0e8c63f4fef1ec76063e
2024-01-24 05:51:36 -06:00
Maisam Arif ee80c2cac4 Handled unknown vram type out of bounds error
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I2d32c7043c78c0651f1b4db565a299b6b96abbcc
2024-01-24 06:50:17 -05:00
Charis Poag fe86afed8c SWDEV-436533 [CLI/Python API] Align Cache Info BM UI to Host
- [CLI] Refactored cache info to display
cache flags as "cache_properties" names.
Names are displayed as a list of comma-separated
cache type strings. Previously, values
were shown one by one as ENABLED.

ex.
CACHE_PROPERTIES = <a,b,c>

- [JSON] mirrors CLI fields.
No longer display "cache_flags", renamed
field as "cache_properties" dictionary. This
allows users to better understand the
list of names provided.

- [Python API] Updated amdsmi_get_gpu_cache_info
to mirror Host return.

README.md - updated to reflect all changes.

Change-Id: Ife2ef5adcef30058937d1376efb01749e45c02fb
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-01-24 06:21:55 -05:00
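
The "cache_properties" change described in the commit above (flags shown as a comma-separated list of names rather than one ENABLED line per flag) could be sketched roughly as follows. This is only an illustration: the `CacheProperty` flag names and values, and both helper functions, are hypothetical and are not taken from the amdsmi sources.

```python
from enum import IntFlag


class CacheProperty(IntFlag):
    # Hypothetical flag names and bit values, chosen for illustration only.
    ENABLED = 0x1
    DATA_CACHE = 0x2
    INST_CACHE = 0x4
    CPU_CACHE = 0x8
    SIMD_CACHE = 0x10


def cache_property_names(flags: int) -> list[str]:
    """Return the names of all flags set in the bitmask, in declaration order."""
    return [prop.name for prop in CacheProperty if flags & prop.value]


def format_cache_properties(flags: int) -> str:
    """Render the set flags as one comma-separated CLI-style line."""
    return "CACHE_PROPERTIES = " + ", ".join(cache_property_names(flags))
```

A JSON view could reuse `cache_property_names` directly, so that the CLI and JSON outputs stay aligned.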
Charis Poag c260819003 Add ROCm 6.0 change log
Update our change log to reflect a few of the major updates
for ROCm 6.0.

Change-Id: I82157fcfad22e63b62d2409bdc979b312356abe8
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-01-23 17:48:13 -05:00
Maisam Arif 6292ac513c SWDEV-440462: Fixed metric functionality to Linux Guest
Change-Id: Ia69d01251d1e9bb3717bda3a7d0f752c739393a6
Signed-off-by: Maisam Arif <maisarif@amd.com>
2024-01-21 02:46:15 -06:00
Charis Poag 6132074089 Merge rocm-smi/amd-staging into amd-dev 20240119
Change-Id: Ie706473ff92a91b19e95d2d58f64904cad73a89a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-01-19 03:57:00 -05:00
Deepak Mewar 730a82417e amdsmi wrapper generated for updated amdsmi_get_esmi_err_msg
Change-Id: I2388cd75111774852ae6426071d890bbf2d9d0c9
2024-01-16 11:41:22 -06:00
Deepak Mewar 5d0b479661 amdsmi library updated for esmi error status mapping to amdsmi
Change-Id: I7e4dd146a1a9af496556efcf811b2e1ed565b09e
2024-01-16 11:41:22 -06:00
khashaik 27fbbc3388 amdsmi_cli: Update help section
- Update help section

Change-Id: Ida8022a27ecc9df3ebef94e27e89624c18a9cf46
2024-01-16 11:41:22 -06:00
khashaik 994b956d5e amdsmi_cli: Updated README.md file in the amdsmi_cli
- Update the README.md file in amdsmi_cli folder to include information
    for CPU's and CORE's along with the GPU's

Change-Id: I7670811696bc5299a287a6bc8883afe40eeeb557
2024-01-16 11:41:22 -06:00
Deepak Mewar 19451cc508 amdsmi wrapper generated for updated hsmp metric table
Change-Id: I18c795e18d9c95320826cb965f36d3fb5546ea5c
2024-01-16 11:41:22 -06:00
Deepak Mewar a0c95e855b amdsmi library updated for metric table structure
Change-Id: Ie8a9840a9020282599dd413e964d86bfb8850f6a
2024-01-16 11:41:22 -06:00
khashaik 18d8087711 amdsmi_cli: Add checks for no gpu devices, cpu and core devices
- Add checks for no gpu devices, cpu and core devices
  - Update units for core energy and cpu energy

Change-Id: Ieea43f1bb7fc303ebbbdf72f1ab22644a28df25c
2024-01-16 11:41:22 -06:00
khashaik 108ae03c23 amdsmi_cli: Update parser to add neg values check for the cpu and core arguments
Change-Id: Ia7959826637e7749d999a6570df590221e85cf50
2024-01-16 11:41:22 -06:00
khashaik 4971466c22 amdsmi_cli: Fix issues for CPU related API's for DIMM
- Fix interface issues for dimm temperature, dimm refresh rate and dimm power consumption

Change-Id: I998209c8314e4d78a842187c5a0b127aea7dbef2
2024-01-16 11:41:22 -06:00
Deepak Mewar 7dcd5a3fd6 amdsmi interface updated to additionally return the freq src
from amdsmi_get_cpu_socket_current_active_freq_limit

Change-Id: I48f1026474115848a30352637415e7a1a52f3481
2024-01-16 11:41:22 -06:00
Deepak Mewar 898c4bc06f amdsmi interface updated for amdsmi_get_metrics_table units
Change-Id: If211292e894df9d832b879252bebf91c17112d14
2024-01-16 11:41:22 -06:00
khashaik 256907989b amdsmi_cli: Fix issues in cpu API "cpu_lclk_dpm_level"
- Fix issues in cpu API "cpu_lclk_dpm_level"
  - Fix issue for invalid core id
  - Update the error message for invalid devices

Change-Id: I71216ff72f89cfe0c86928ae3dce1f88eae91665
2024-01-16 11:41:22 -06:00
Deepak Mewar c74f01f401 amdsmi_cli: Enabled hsmp metric table from CLI
Change-Id: I7f9c13255f952136438249f5180dec5586d01bd7
2024-01-16 11:41:22 -06:00
Deepak Mewar 1b1591571b amdsmi interface updated for amdsmi_get_metrics_table encodings
Change-Id: Iffed4071d5b2b5645f8118f3fbce26ab258e7882
2024-01-16 11:41:22 -06:00
khashaik 087a0d3ead amdsmi_cli: Fix issues in "cpu_enable_apb" API
Change-Id: I8237fb4641f1a6aecec815fdc020abbf9a3195ba
2024-01-16 11:41:22 -06:00
Deepak Mewar 31dc8d0ee8 amdsmi interface updated for amdsmi_get_metrics_table
Change-Id: I0618dd411caf6d30f74793e937984273f9c5b70e
2024-01-16 11:41:22 -06:00
Deepak Mewar 4ecf25e882 amdsmi wrapper generated for updated amdsmi_get_metrics_table
Change-Id: Id55a5647064998d8f546c806f857a8745afe52ea
2024-01-16 11:41:22 -06:00
Deepak Mewar 9f3a6dbd29 amdsmi library and sample code updated for amdsmi_get_metrics_table
Change-Id: Ie03c556f5c38fe4a0365743d3a94220e3aa62b23
2024-01-16 11:41:22 -06:00
Charis Poag 4575990ae7 GPU Usage/activity update
CLI:
Every usage field is notated by "activity"
gfx_usage -> gfx_activity
umc_usage -> umc_activity
vcn_activities -> vcn_activity
jpeg_activities[AID#] -> jpeg_activity

Wrapper: fixed metric output, misalignment
with generator

update_wrapper.sh:
DOCKER_BUILDKIT to 0 (if unset)

API:
amdsmi_get_gpu_metrics_info:
1.3: Removed commenting out avg socket power

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: Id3fcc20aef420c7b7a90ba22fa3bc643b2716333
2024-01-15 23:34:08 -06:00
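
For scripts that consumed the old field names, the rename listed in the commit above might be handled with a small mapping like the sketch below. It assumes the output is available as a flat dictionary, which is an assumption and not something the commit states; `migrate_usage_fields` is a hypothetical helper, and the "[AID#]" suffix on the old jpeg field is not handled here.

```python
# Old name -> new name, as listed in the commit message.
RENAMED_FIELDS = {
    "gfx_usage": "gfx_activity",
    "umc_usage": "umc_activity",
    "vcn_activities": "vcn_activity",
    "jpeg_activities": "jpeg_activity",  # any "[AID#]" suffix is not handled
}


def migrate_usage_fields(record: dict) -> dict:
    """Return a copy of the record with old usage keys renamed to activity keys."""
    return {RENAMED_FIELDS.get(key, key): value for key, value in record.items()}
```

Keys that are not in the mapping pass through unchanged.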
Bill(Shuzhou) Liu 5a6b5d2a0a Use the same mutex as rocm-smi
Share the same mutex as rocm-smi implementation. Handle the crash
when a user is not in render group.

Change-Id: I486b26569f9b523b41bbdaf95d51f4a730978cfd
2024-01-15 13:12:49 -05:00
Charis Poag d74be3120e Fix AMD-SMI test segmentation fault TestGpuMetricsRead
Issue: need to return on any failure.
Without the early return, the nullptr check test would segfault
because the values in the struct are not initialized.

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: I4987fb73ba9bcb182de7a439a4286333a41bf7eb
2024-01-14 19:27:34 -06:00
Galantsev, Dmitrii a60f5d2d4c SWDEV-409184 - Exclude some tests in VM
Change-Id: Ic196a113426fc63a0b2aadfa04ab4b10ed6434e3
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2024-01-11 01:38:15 -06:00
Bill(Shuzhou) Liu 9dd24a2b5a Add the ROCm version in CLI
Print the ROCm version in CLI

Change-Id: I529201274e114bde44722aa9a6aec13c2bedecf7
2024-01-04 15:25:08 -06:00
khashaik 1c90b1dea7 amdsmi_cli: Fix issue in static cpu to fetch only smu fw version when smu is passed as option
- Fix issue in static cpu to fetch only smu fw version or prototype when smu or prototype is
    passed as option

Change-Id: Idec3b4e571ae576d1f71df74fa9a5befea5a1585
2023-12-21 00:34:18 -05:00
khashaik cdf31b8d6a amdsmi: Interface: Add units to the cpu related interfaces.
Change-Id: I294439c345a3e4ca399eb6b3f53eb1f18777180a
2023-12-21 00:09:23 -05:00
Charis Poag 5ff5af0b5a Fix GPU metric tests & cleanup test output
- CLI: Added average_power to display if current_power is empty
- CLI: fixed PCIe current_speed not displaying GT/s
- ROCm API: 1.3 & 1.4
  -> commented out setting avg clocks to current clock value
     (leave as max uint value, not re-assign; these are not same values)
  -> commented out setting current_socket_power = average_power
     (leave as max uint value, not re-assign; these are not same values)
  -> For all non-array clocks, placed value in first
     array[0] to keep outputs consistent
     (helps xcd calc)
- ROCm API: rsmi_dev_metrics_curr_gfxclk_get fixed to count
  XCDs using backwards compatible rsmi_dev_gpu_metrics_info_get.
- ^ Fixes XCD count overall + assigning clock[0] in 1.3 to curr
  freq
- AMD SMI API: amdsmi_get_gpu_metrics_info() initialized all new
  1.5 metric values for all lower metric tables
- AMD SMI API: wrapper -> fix is here + returns correct AMD SMI return
- AMD SMI API: wrapper -> now displays amdsmi return status as
  string in logs
- gpu_metrics_read.cc -> now has better overview of backwards
  compatible output
- gpu_metrics_read.cc -> Cleaned up output, added units, and
  display all array output

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: Id5b60ded5b0ed2cdf0f96ca72c79e356f0410960
2023-12-19 14:18:15 -05:00
Naveen Krishna Chatradhi 19030e5b72 amdsmi_cli: Add support for CPU specific API in amdsmi_cli tool
- Add support for only CPU if only the hsmp driver is present.
  - Add support for both the amdgpu and amdcpu's if both the amdgpu driver and cpu's are present.
  - Add support for socket power metrics
  - Add support for hsmp proto type version, prochot status, read current fclkmclk freq
    and current cclk freq limit, c0 residency, lclk dpm level range, socket frequency range
  - Add CPU socket current frequency limit.
  - Update tool for API's IO bandwidth, XGMI bandwidth,
    power telemetry rails, APB enable and APB disable API's
  - Add support set_pow_limit, set_xgmi_link_width, set_lclk_dpm_level, core_boost_limit,
    curr_active_freq_core_limit, set_soc_boost_limit and set_core_boost_limit.
  - Add support for the following cpu related API's in tool
    core_energy, socket energy, set power efficiency mode, ddr bandwidth,
    cpu temperature, dimm temperature range rate, dimm power consumption
    and dimm thermal temperature.
  - Add support for set_gmi3_link_width, set_pcie_lnk_rate, set_df_pstate_range

Change-Id: I5a35d1cceeb7df0bc8b7116df7c27bb7f376e839
2023-12-18 06:31:49 -05:00
Naveen Krishna Chatradhi 94d3c563a3 amdsmi: py-interface: Add python interface for esmi api
Change-Id: I4a3ab1168a7d1bf011ecc9c508e111c281503520
2023-12-18 06:31:35 -05:00
Naveen Krishna Chatradhi 65eed73f4d amd-smi: fix cpu specific apis and header
1. provide prototype and documentation for esmi specific api.
   define structures and update classes as required
2. update cmake files as required and add esmi api to the
   amdsmi esmi integration example.

Change-Id: I753ec176f9b381e74c9646525dfd9075237bf8d9
2023-12-18 06:28:15 -05:00
Charis Poag 8f3861e1d9 Add vcn and jpeg activity
Changes:
    - Add new engine field vcn_activity (from 1.4/1.5
      gpu_metrics)
    - Updated log output to enhance view of gpu_metric
      data as json pretty print
    - Added new fields provided in 1.5
    - Added unit overview in python API, CLI is WIP

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: I7d9f29e7ecc35dcd0697814c222cdd02b0d5518e
2023-12-15 22:18:46 -05:00
Maisam Arif 498fde5cf4 SWDEV-437729 - Post install script fixes for pip & pyyaml
Change-Id: If5c4a7947764a0cb5717c906198436304fa62784
Signed-off-by: Maisam Arif <maisarif@amd.com>
2023-12-15 13:34:56 -06:00
Maisam Arif 16ed186760 SWDEV-436531 - Changed human readable topology output to tables
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I104ff21f650ea7fd6804d3b60da9b1feddb2a701
2023-12-14 22:32:12 -05:00
Bill(Shuzhou) Liu 59b510de2b Support max_num_cu_shared and num_cache_instance
Add the above fields to cache info. Remove driver_date in CLI and
remove the disable properties of cache.

Change-Id: I80672490908d9e32a149076cc37459fa56b8b0bf
2023-12-14 09:59:35 -05:00
Maisam Arif e9a6153836 Fix imports for partition API's
Change-Id: Ic3bc0230405ee5e662bfd2d5c6d0ed5bca42a671
2023-12-13 23:52:54 -06:00
Maisam Arif f91fc97fed SWDEV-413122 - Initial Monitor subcommand
Change-Id: Iaeaef77efeaa4289b19f1f676dcae6245f0e0c9e
2023-12-13 23:43:43 -06:00
Galantsev, Dmitrii 45dc83f81d SWDEV-436561 - Add CODEOWNERS
Change-Id: Id8c5e9bbbf92dc028fa1f66a7de5b3ab4fe4ab2a
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
2023-12-12 11:41:12 -06:00
Bill(Shuzhou) Liu de7e74f7db Collect compute partition devices under the same socket
The socket represents a physical device, and the partition devices
should belong to the socket. The partition devices are only
different in function id in BDF. Use the BD part of the BDF to
identify a socket.

Change-Id: I5d355a6f5db02faa7555b760a36c7351b8d8d835
2023-11-29 08:23:23 -06:00
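
The grouping rule described in the commit above (partition devices differ only in the function number, so the remainder of the BDF identifies the socket) can be illustrated with a minimal sketch. It assumes BDF strings in the common `domain:bus:device.function` text form and keeps the domain in the key; `socket_key` and `group_by_socket` are hypothetical helpers, not functions from the library.

```python
from collections import defaultdict


def socket_key(bdf: str) -> str:
    """Drop the function number from 'DDDD:BB:DD.F', leaving 'DDDD:BB:DD'."""
    prefix, sep, _function = bdf.rpartition(".")
    if not sep:
        raise ValueError(f"not a BDF string: {bdf!r}")
    return prefix


def group_by_socket(bdfs):
    """Group BDF strings that share everything except the function number."""
    sockets = defaultdict(list)
    for bdf in bdfs:
        sockets[socket_key(bdf)].append(bdf)
    return dict(sockets)
```

For example, `0000:23:00.0` and `0000:23:00.1` would land under the same key, `0000:23:00`, while `0000:43:00.0` would form a separate socket.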