Commit Graph

966 Commits

Author SHA1 Message Date
Charis Poag c5ba765be0 Merge rocm-smi/amd-staging into amd-dev 20240119
Change-Id: Ie706473ff92a91b19e95d2d58f64904cad73a89a
Signed-off-by: Charis Poag <Charis.Poag@amd.com>


[ROCm/amdsmi commit: 6132074089]
2024-01-19 03:57:00 -05:00
Deepak Mewar c081e9e6f8 amdsmi wrapper generated for updated amdsmi_get_esmi_err_msg
Change-Id: I2388cd75111774852ae6426071d890bbf2d9d0c9


[ROCm/amdsmi commit: 730a82417e]
2024-01-16 11:41:22 -06:00
Deepak Mewar 6ef2131a21 amdsmi library updated for esmi error status mapping to amdsmi
Change-Id: I7e4dd146a1a9af496556efcf811b2e1ed565b09e


[ROCm/amdsmi commit: 5d0b479661]
2024-01-16 11:41:22 -06:00
khashaik a08809a3ca amdsmi_cli: Update help section
- Update help section

Change-Id: Ida8022a27ecc9df3ebef94e27e89624c18a9cf46


[ROCm/amdsmi commit: 27fbbc3388]
2024-01-16 11:41:22 -06:00
khashaik 6395351ac3 amdsmi_cli: Updated README.md file in the amdsmi_cli
- Update the README.md file in the amdsmi_cli folder to include information
    for CPUs and cores along with the GPUs

Change-Id: I7670811696bc5299a287a6bc8883afe40eeeb557


[ROCm/amdsmi commit: 994b956d5e]
2024-01-16 11:41:22 -06:00
Deepak Mewar 1f7c6771eb amdsmi wrapper generated for updated hsmp metric table
Change-Id: I18c795e18d9c95320826cb965f36d3fb5546ea5c


[ROCm/amdsmi commit: 19451cc508]
2024-01-16 11:41:22 -06:00
Deepak Mewar 171f4818f4 amdsmi library updated for metric table structure
Change-Id: Ie8a9840a9020282599dd413e964d86bfb8850f6a


[ROCm/amdsmi commit: a0c95e855b]
2024-01-16 11:41:22 -06:00
khashaik a66efce2da amdsmi_cli: Add checks for no gpu devices, cpu and core devices
- Add checks for no gpu devices, cpu and core devices
- Update units for core energy and cpu energy

Change-Id: Ieea43f1bb7fc303ebbbdf72f1ab22644a28df25c


[ROCm/amdsmi commit: 18d8087711]
2024-01-16 11:41:22 -06:00
khashaik c500be9b35 amdsmi_cli: Update parser to add neg values check for the cpu and core arguments
Change-Id: Ia7959826637e7749d999a6570df590221e85cf50


[ROCm/amdsmi commit: 108ae03c23]
2024-01-16 11:41:22 -06:00
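The negative-value check described in commit c500be9b35 can be illustrated with a small argparse sketch. This is not the amdsmi_cli parser itself: the `--cpu` and `--core` flag names and the `non_negative_int` helper are assumptions chosen for illustration.

```python
import argparse


def non_negative_int(text: str) -> int:
    """Convert an argument to int and reject negative values (hypothetical helper)."""
    try:
        value = int(text)
    except ValueError:
        raise argparse.ArgumentTypeError(f"expected an integer, got {text!r}")
    if value < 0:
        raise argparse.ArgumentTypeError(f"value must be >= 0, got {value}")
    return value


# Illustrative parser; the real tool's options may differ.
parser = argparse.ArgumentParser(prog="example")
parser.add_argument("--cpu", type=non_negative_int)
parser.add_argument("--core", type=non_negative_int)

args = parser.parse_args(["--cpu", "0", "--core", "3"])
print(args.cpu, args.core)
```

Passing the check as a `type=` callable lets argparse report the bad value with its normal usage error instead of failing later inside the command handler.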
khashaik 47ca69f2a6 amdsmi_cli: Fix issues for CPU related API's for DIMM
- Fix interface issues for dimm temperature, dimm refresh rate and dimm power consumption

Change-Id: I998209c8314e4d78a842187c5a0b127aea7dbef2


[ROCm/amdsmi commit: 4971466c22]
2024-01-16 11:41:22 -06:00
Deepak Mewar 1a2b556dce amdsmi interface updated to additionally return the freq src
from amdsmi_get_cpu_socket_current_active_freq_limit

Change-Id: I48f1026474115848a30352637415e7a1a52f3481


[ROCm/amdsmi commit: 7dcd5a3fd6]
2024-01-16 11:41:22 -06:00
Deepak Mewar 148ecb1805 amdsmi interface updated for amdsmi_get_metrics_table units
Change-Id: If211292e894df9d832b879252bebf91c17112d14


[ROCm/amdsmi commit: 898c4bc06f]
2024-01-16 11:41:22 -06:00
khashaik 323cf14a9c amdsmi_cli: Fix issues in cpu API "cpu_lclk_dpm_level"
- Fix issues in cpu API "cpu_lclk_dpm_level"
  - Fix issue for invalid core id
  - Update the error message for invalid devices

Change-Id: I71216ff72f89cfe0c86928ae3dce1f88eae91665


[ROCm/amdsmi commit: 256907989b]
2024-01-16 11:41:22 -06:00
Deepak Mewar 8bd95a26b4 amdsmi_cli: Enabled hsmp metric table from CLI
Change-Id: I7f9c13255f952136438249f5180dec5586d01bd7


[ROCm/amdsmi commit: c74f01f401]
2024-01-16 11:41:22 -06:00
Deepak Mewar 5d5bb11625 amdsmi interface updated for amdsmi_get_metrics_table encodings
Change-Id: Iffed4071d5b2b5645f8118f3fbce26ab258e7882


[ROCm/amdsmi commit: 1b1591571b]
2024-01-16 11:41:22 -06:00
khashaik 68a49c6c27 amdsmi_cli: Fix issues in "cpu_enable_apb" API
Change-Id: I8237fb4641f1a6aecec815fdc020abbf9a3195ba


[ROCm/amdsmi commit: 087a0d3ead]
2024-01-16 11:41:22 -06:00
Deepak Mewar 52c1014196 amdsmi interface updated for amdsmi_get_metrics_table
Change-Id: I0618dd411caf6d30f74793e937984273f9c5b70e


[ROCm/amdsmi commit: 31dc8d0ee8]
2024-01-16 11:41:22 -06:00
Deepak Mewar a45d2e1684 amdsmi wrapper generated for updated amdsmi_get_metrics_table
Change-Id: Id55a5647064998d8f546c806f857a8745afe52ea


[ROCm/amdsmi commit: 4ecf25e882]
2024-01-16 11:41:22 -06:00
Deepak Mewar 3a00172186 amdsmi library and sample code updated for amdsmi_get_metrics_table
Change-Id: Ie03c556f5c38fe4a0365743d3a94220e3aa62b23


[ROCm/amdsmi commit: 9f3a6dbd29]
2024-01-16 11:41:22 -06:00
Charis Poag 23a0cb827f GPU Usage/activity update
CLI:
Every usage field is now named "activity":
gfx_usage -> gfx_activity
umc_usage -> umc_activity
vcn_activities -> vcn_activity
jpeg_activities[AID#] -> jpeg_activity

Wrapper: fixed metric output, misalignment
with generator

update_wrapper.sh:
DOCKER_BUILDKIT to 0 (if unset)

API:
amdsmi_get_gpu_metrics_info:
1.3: Removed commenting out avg socket power

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: Id3fcc20aef420c7b7a90ba22fa3bc643b2716333


[ROCm/amdsmi commit: 4575990ae7]
2024-01-15 23:34:08 -06:00
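For a script that still consumes the older field names, the renames listed in commit 23a0cb827f could be applied with a small mapping. This is a sketch under assumptions: `RENAMES` and `rename_usage_fields` are hypothetical names, the sample record is invented, and the per-AID indexing of `jpeg_activities[AID#]` is simplified to a plain key.

```python
# Old CLI field name -> new field name, as listed in the commit message.
RENAMES = {
    "gfx_usage": "gfx_activity",
    "umc_usage": "umc_activity",
    "vcn_activities": "vcn_activity",
    "jpeg_activities": "jpeg_activity",  # AID index omitted in this sketch
}


def rename_usage_fields(record: dict) -> dict:
    """Return a copy of record with old usage keys replaced by activity keys."""
    return {RENAMES.get(key, key): value for key, value in record.items()}


# Invented sample values, for illustration only.
old_record = {"gfx_usage": 12, "umc_usage": 3, "other_field": 1}
print(rename_usage_fields(old_record))
```

Keys that are not in the mapping pass through unchanged, so the helper is safe to run on records that already use the new names.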
Bill(Shuzhou) Liu 28f354796d Use the same mutex as rocm-smi
Share the same mutex as the rocm-smi implementation. Handle the crash
when a user is not in the render group.

Change-Id: I486b26569f9b523b41bbdaf95d51f4a730978cfd


[ROCm/amdsmi commit: 5a6b5d2a0a]
2024-01-15 13:12:49 -05:00
Charis Poag 31081fa8b0 Fix AMD-SMI test segmentation fault TestGpuMetricsRead
Issue: need to return on any failure.
Without the early return, the nullptr check test would segfault
because not all values in the struct are initialized.

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: I4987fb73ba9bcb182de7a439a4286333a41bf7eb


[ROCm/amdsmi commit: d74be3120e]
2024-01-14 19:27:34 -06:00
Galantsev, Dmitrii 64969f2c61 SWDEV-409184 - Exclude some tests in VM
Change-Id: Ic196a113426fc63a0b2aadfa04ab4b10ed6434e3
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: a60f5d2d4c]
2024-01-11 01:38:15 -06:00
Bill(Shuzhou) Liu 29f1584b9c Add the ROCm version in CLI
Print the ROCm version in CLI

Change-Id: I529201274e114bde44722aa9a6aec13c2bedecf7


[ROCm/amdsmi commit: 9dd24a2b5a]
2024-01-04 15:25:08 -06:00
khashaik 3557e98535 amdsmi_cli: Fix issue in static cpu to fetch only smu fw version when smu is passed as option
- Fix issue in static cpu to fetch only smu fw version or prototype when smu or prototype is
    passed as option

Change-Id: Idec3b4e571ae576d1f71df74fa9a5befea5a1585


[ROCm/amdsmi commit: 1c90b1dea7]
2023-12-21 00:34:18 -05:00
khashaik 313dbe77b9 amdsmi: Interface: Add units to the cpu related interfaces.
Change-Id: I294439c345a3e4ca399eb6b3f53eb1f18777180a


[ROCm/amdsmi commit: cdf31b8d6a]
2023-12-21 00:09:23 -05:00
Charis Poag 601a254f37 Fix GPU metric tests & cleanup test output
- CLI: Added average_power to display if current_power is empty
- CLI: fixed PCIe current_speed not displaying GT/s
- ROCm API: 1.3 & 1.4
    -> commented out setting avg clocks to current clock value
       (leave as max uint value, not re-assign; these are not same values)
    -> commented out setting current_socket_power = average_power
       (leave as max uint value, not re-assign; these are not same values)
    -> For all non-array clocks, placed value in first
       array[0] to keep outputs consistent
       (helps xcd calc)
- ROCm API: rsmi_dev_metrics_curr_gfxclk_get fixed to count
  XCDs using backwards compatible rsmi_dev_gpu_metrics_info_get.
- ^ Fixes XCD count overall + assigning clock[0] in 1.3 to curr
  freq
- AMD SMI API: amdsmi_get_gpu_metrics_info() initialized all new
  1.5 metric values for all lower metric tables
- AMD SMI API: wrapper -> fix is here + returns correct AMD SMI return
- AMD SMI API: wrapper -> now displays amdsmi return status as
  string in logs
- gpu_metrics_read.cc -> now has better overview of backwards
  compatible output
- gpu_metrics_read.cc -> Cleaned up output, added units, and
  display all array output

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: Id5b60ded5b0ed2cdf0f96ca72c79e356f0410960


[ROCm/amdsmi commit: 5ff5af0b5a]
2023-12-19 14:18:15 -05:00
Naveen Krishna Chatradhi e924266a25 amdsmi_cli: Add support for CPU specific API in amdsmi_cli tool
- Add support for only CPU if only the hsmp driver is present.
- Add support for both the amdgpu and amdcpu's if both the amdgpu driver and cpu's are present.
- Add support for socket power metrics
- Add support for hsmp proto type version, prochot status, read current fclkmclk freq
  and current cclk freq limit, c0 residency, lclk dpm level range, socket frequency range
- Add CPU socket current frequency limit.
- Update tool for API's IO bandwidth, XGMI bandwidth,
  power telemetry rails, APB enable and APB disable API's
- Add support set_pow_limit, set_xgmi_link_width, set_lclk_dpm_level, core_boost_limit,
  curr_active_freq_core_limit, set_soc_boost_limit and set_core_boost_limit.
- Add support for the following cpu related API's in tool
  core_energy, socket energy, set power efficiency mode, ddr bandwidth,
  cpu temperature, dimm temperature range rate, dimm power consumption
  and dimm thermal temperature.
- Add support for set_gmi3_link_width, set_pcie_lnk_rate, set_df_pstate_range

Change-Id: I5a35d1cceeb7df0bc8b7116df7c27bb7f376e839


[ROCm/amdsmi commit: 19030e5b72]
2023-12-18 06:31:49 -05:00
Naveen Krishna Chatradhi 37f1d47b0e amdsmi: py-interface: Add python interface for esmi api
Change-Id: I4a3ab1168a7d1bf011ecc9c508e111c281503520


[ROCm/amdsmi commit: 94d3c563a3]
2023-12-18 06:31:35 -05:00
Naveen Krishna Chatradhi 4bd015f945 amd-smi: fix cpu specific apis and header
1. provide prototype and documentation for esmi specific api.
   define structures and update classes as required
2. update cmake files as required and add esmi api to the
   amdsmi esmi integration example.

Change-Id: I753ec176f9b381e74c9646525dfd9075237bf8d9


[ROCm/amdsmi commit: 65eed73f4d]
2023-12-18 06:28:15 -05:00
Charis Poag 4f502e5dab Add vcn and jpeg activity
Changes:
    - Add new engine field vcn_activity (from 1.4/1.5
      gpu_metrics)
    - Updated log output to enhance view of gpu_metric
      data as json pretty print
    - Added new fields provided in 1.5
    - Added unit overview in python API, CLI is WIP

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Change-Id: I7d9f29e7ecc35dcd0697814c222cdd02b0d5518e


[ROCm/amdsmi commit: 8f3861e1d9]
2023-12-15 22:18:46 -05:00
Maisam Arif 030a971ce4 SWDEV-437729 - Post install script fixes for pip & pyyaml
Change-Id: If5c4a7947764a0cb5717c906198436304fa62784
Signed-off-by: Maisam Arif <maisarif@amd.com>


[ROCm/amdsmi commit: 498fde5cf4]
2023-12-15 13:34:56 -06:00
Maisam Arif 632443e2a3 SWDEV-436531 - Changed human readable topology output to tables
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I104ff21f650ea7fd6804d3b60da9b1feddb2a701


[ROCm/amdsmi commit: 16ed186760]
2023-12-14 22:32:12 -05:00
Bill(Shuzhou) Liu 9dc60e00cb Support max_num_cu_shared and num_cache_instance
Add the above fields for cache info. Remove driver_date in CLI and
remove the disable properties of cache.

Change-Id: I80672490908d9e32a149076cc37459fa56b8b0bf


[ROCm/amdsmi commit: 59b510de2b]
2023-12-14 09:59:35 -05:00
Maisam Arif aa654175f4 Fix imports for partition API's
Change-Id: Ic3bc0230405ee5e662bfd2d5c6d0ed5bca42a671


[ROCm/amdsmi commit: e9a6153836]
2023-12-13 23:52:54 -06:00
Maisam Arif 6f778a4c62 SWDEV-413122 - Initial Monitor subcommand
Change-Id: Iaeaef77efeaa4289b19f1f676dcae6245f0e0c9e


[ROCm/amdsmi commit: f91fc97fed]
2023-12-13 23:43:43 -06:00
Galantsev, Dmitrii b6e3d6bfb6 SWDEV-436561 - Add CODEOWNERS
Change-Id: Id8c5e9bbbf92dc028fa1f66a7de5b3ab4fe4ab2a
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 45dc83f81d]
2023-12-12 11:41:12 -06:00
Bill(Shuzhou) Liu 985ddbc5d5 Collect compute partition devices under the same socket
The socket represents a physical device, and the partition devices
should belong to the socket. The partition devices are only
different in function id in BDF. Use the BD part of the BDF to
identify a socket.

Change-Id: I5d355a6f5db02faa7555b760a36c7351b8d8d835


[ROCm/amdsmi commit: de7e74f7db]
2023-11-29 08:23:23 -06:00
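The grouping rule from commit 985ddbc5d5 (partitions differ only in the function id, so the rest of the BDF identifies the socket) can be sketched as below. This is an illustration, not the library's implementation: `socket_key` and `group_by_socket` are hypothetical names, the BDF strings are invented, and the sketch assumes the common `domain:bus:device.function` text form, keeping the domain as part of the key.

```python
from collections import defaultdict


def socket_key(bdf: str) -> str:
    """Strip the function id from a BDF string, e.g. '0000:0c:00.1' -> '0000:0c:00'."""
    prefix, sep, _function = bdf.rpartition(".")
    if not sep:
        raise ValueError(f"not a BDF string: {bdf!r}")
    return prefix


def group_by_socket(bdfs):
    """Group devices whose BDFs differ only in the function id."""
    sockets = defaultdict(list)
    for bdf in bdfs:
        sockets[socket_key(bdf)].append(bdf)
    return dict(sockets)


# Invented example addresses: two partitions on one device, plus a second device.
devices = ["0000:0c:00.0", "0000:0c:00.1", "0000:22:00.0"]
print(group_by_socket(devices))
```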
Maisam Arif a8138bfd5e Change xgmi_physical_id to oam_id
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I35fb36ec0e9f72a7135d8bb9070dbdc0e956b93a


[ROCm/amdsmi commit: b54086a037]
2023-11-22 12:16:38 -06:00
Maisam Arif 09f4046345 Refactor gpu_metrics usage in CLI
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I599878971ab94a768d008f046f2d303ad76fdb3b


[ROCm/amdsmi commit: 5b36b438b7]
2023-11-22 03:32:55 -06:00
Maisam Arif ff96f50145 Refactor gpu_metrics usage in libraries
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I763638d4b546bf49b234e823df81028c357e8f49


[ROCm/amdsmi commit: d790ebc62b]
2023-11-22 03:32:15 -06:00
Maisam Arif 662eaa6ad3 Merge rocmsmi/amd-staging into amd-dev 20231121
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I5cc6accced971479583954e0b93cd90c510ca814


[ROCm/amdsmi commit: 02d310e525]
2023-11-22 03:31:35 -06:00
Galantsev, Dmitrii b2785f6b7b SWDEV-423944 - Clean-up un/install scripts
Change-Id: Ib16935ca456f889dc8d1280a37693858afa82715
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/amdsmi commit: 4022889701]
2023-11-20 13:09:09 -05:00
Bill(Shuzhou) Liu c7f9cff2cb Add APIs for PM table and register table
Read the PM table and register table as name-value pairs.

Change-Id: Ie44fe67a28af3341bd6beb90d809e90f280351ac


[ROCm/amdsmi commit: ac1ba33371]
2023-11-20 12:31:18 -05:00
Deepak Mewar 09125ca639 Added AMDSMI_CHECK_INIT to esmi library wrappers
Change-Id: Id187a9152399cdefec21a0d310bdb78f593426af


[ROCm/amdsmi commit: baa2c94a86]
2023-11-14 11:58:18 -05:00
Maisam Arif 888f0d67cb SWDEV-425887 - Corrected vbios population
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I3728064c77fae8cfa006254769a2cc821b4d5362


[ROCm/amdsmi commit: 6441abdc1a]
2023-11-14 11:56:43 -05:00
Maisam Arif 37a41c3bc8 SWDEV-426130 - Updated firmware subcommand output
Corrected truncation
	corrected xgmi to ta_xgmi
	remapped smc (system management controller) to pm (power management)

Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: I404cefa7b90a454d4f4b08f6490448b47cf32107


[ROCm/amdsmi commit: 545e57d3e3]
2023-11-14 11:56:43 -05:00
Sam Wu 643042695d SWDEV-425457 - Correct typo in link to ESMI Library README
Change-Id: Ia16ba0175b0cd3d4a38bab71eb0f1281878d81d2


[ROCm/amdsmi commit: 2ef675b9db]
2023-11-10 18:23:53 -05:00
Deepak Mewar 591221eee6 modified local esmi functions called from amdsmi_init
for gtest compatibility

Change-Id: I627c9887a1f1e340c358f060818a1a7d74ce33f9


[ROCm/amdsmi commit: 0c790752ac]
2023-11-10 15:50:42 -05:00
Deepak Mewar dbc139a8b4 another set of esmi python wrappers updated to amdsmi python library
Change-Id: I33557b9021ecfdab76daaf65ad63f624115aa322


[ROCm/amdsmi commit: 14c50c9b4e]
2023-11-10 15:50:42 -05:00