Commit graph

354 Commits

Author SHA1 Message Date
Kanangot Balakrishnan, Bindhiya 7d109001ac [SWDEV-513855] Add power cap to power monitor (#193)
Added power cap to display on amd-smi monitor -p.
Updated help and Changelog as well.

Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>
2025-03-26 17:45:08 -05:00
Kanangot Balakrishnan, Bindhiya 9b64dcb61a [SWDEV-513958] Add help flag to valid commands (#204)
Added '-h' flag to valid first input command list

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-03-26 17:43:28 -05:00
Kanangot Balakrishnan, Bindhiya 3ddfbcc0a3 [SWDEV-520148] Modify VRAM details in monitor output (#199)
Previously, amd-smi monitor showed VRAM usage as used and total.
It now displays free VRAM and VRAM percentage. Updated
Changelog.

Signed-off-by: Kanangot Balakrishnan, Bindhiya <Bindhiya.KanangotBalakrishnan@amd.com>
2025-03-26 13:12:41 -05:00
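The arithmetic behind commit 3ddfbcc0a3 can be sketched roughly as follows. This is only an illustration: the helper name, the MB units, and the assumption that "VRAM percentage" means the share of VRAM in use are not taken from the amd-smi source.

```python
def vram_summary(used_mb: int, total_mb: int) -> dict:
    """Derive free VRAM and percent used from used/total values.

    Hypothetical helper; not part of amd-smi.
    """
    if total_mb <= 0:
        # Avoid dividing by zero when the total is unknown.
        return {"free_mb": "N/A", "used_percent": "N/A"}
    free_mb = total_mb - used_mb
    used_percent = round(100 * used_mb / total_mb, 1)
    return {"free_mb": free_mb, "used_percent": used_percent}


print(vram_summary(4096, 16384))
```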
Kanangot Balakrishnan, Bindhiya 3681f900ee [SWDEV-513958] Fix error message due to argparse behavior (#108)
When argparse parses multiple invalid arguments, the error
message displays only the last argument, which leads to
confusion. To avoid this, added a valid-command check
before argparse runs and a new exception that is raised when
the first command is invalid.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-03-26 13:11:17 -05:00
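A minimal sketch of the idea in commit 3681f900ee: validate the first argument before handing the list to argparse, so the error names the first bad command rather than the last one. The command set, the exception class, and the `demo-smi` program name are assumptions for illustration and are not the actual amd-smi code.

```python
import argparse

# Illustrative subset of commands; the real list is an assumption here.
VALID_COMMANDS = {"static", "metric", "monitor", "-h", "--help"}


class InvalidCommandError(Exception):
    """Hypothetical exception for an unknown first command."""


def check_first_command(argv):
    # Runs before argparse so the message points at the first argument.
    if argv and argv[0] not in VALID_COMMANDS:
        raise InvalidCommandError(f"'{argv[0]}' is not a valid command")


def parse(argv):
    check_first_command(argv)
    parser = argparse.ArgumentParser(prog="demo-smi")
    sub = parser.add_subparsers(dest="command")
    for name in ("static", "metric", "monitor"):
        sub.add_parser(name)
    return parser.parse_args(argv)
```

With this sketch, `parse(["bogus", "alsobad"])` raises `InvalidCommandError` mentioning `bogus`, while `parse(["monitor"])` proceeds to argparse as usual.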
Pham, Gabriel b72cd22225 [SWDEV-520754] Fixed str int concatenation issue (#186)
Signed-off-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
2025-03-25 17:43:59 -05:00
Poag, Charis 0402bb4d75 [SWDEV-513807] Fix amd-smi partition --accelerator not returning AMDSMI_STATUS_NO_PERM (#192)
Changes:
- Fixed amdsmi_get_gpu_accelerator_partition_profile_config() not
  returning AMDSMI_STATUS_NO_PERM
- Changed amd-smi partition --accelerator to warn the user
  if they do not use sudo or root permissions.
- Updated changelog for fixes planned for 6.4.1 release

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-03-20 17:23:01 -05:00
Poag, Charis 48cb5529d2 [SWDEV-493274/SWDEV-514998] Add AMD SMI partition tests + Add Guest amd-smi static --partition (#127)
Changes:
    - Added amd-smi static --partition for guest systems
    - Added C++ tests for memory and compute (accelerator) partitions
    - Added Python tests for amdsmi_get_gpu_vram_info(),
       amdsmi_get_gpu_accelerator_partition_profile_config()
    - Updated Python tests for
      amdsmi_get_gpu_accelerator_partition_profile()
      Now includes more profile and resource detail
    - Added amdsmi_get_gpu_xcd_counter();
      Tests provided for both C++/Python APIs
    - Added AmdSmiVramType & AmdSmiVramVendor: they were missing,
      and the Python tests required them.

Change-Id: Ib6549d8ccc5fb68726f38745b87c78f890186022
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-03-11 16:38:46 -05:00
Arif, Maisam d9fee767c3 [SWDEV-509342] Fixed incorrect import ordering affecting fallback (#175)
Change-Id: I10478cd5797811172d966f4eb845440b51c2b25c

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-03-10 17:31:27 -05:00
AL Musaffar, Yazen 2936e00fed [SWDEV-453922] AMD SMI to provide mapping feature of other enumeration methods (#51)
Added enumeration mapping for 
- drm render
- drm card
- hsa id 
- hip id
- hip uuid (rocminfo uuid)

Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-03-07 09:09:12 -06:00
Arif, Maisam a340a54ee6 [SWDEV-518976] Fix amd-smi metric clock checks (#154)
[SWDEV-518976] Fixed amd-smi version default args

Change-Id: Id88da4d8f31aa0b0cf55b3cf796c16981931d857

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-03-06 11:25:59 -06:00
Pryor, Adam ace5162735 [SWDEV-495057] Update check to ignore sudo (#150)
Change-Id: Id4f24c254e805647782ae68667903a8d467c49b1

Signed-off-by: adapryor <Adam.pryor@amd.com>
2025-03-05 23:54:24 -06:00
Arif, Maisam 2e00fd607f Fixed help text error in VM (#148)
Change-Id: I4b0bba8613837f04cbff1efdcea32523bf20a182

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-03-04 09:29:15 -06:00
Kanangot Balakrishnan, Bindhiya 2141e0336c [SWDEV-514182] Update amd-smi help with sudo requirement (#142)
To execute set and reset commands, amd-smi needs sudo
privileges. Updated the subcommand help text to show
'sudo' requirement for these commands.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-03-03 14:53:30 -06:00
Arif, Maisam 3a02cc6f61 [SWDEV-518976] Fix amd-smi metric clock checks (#146)
Change-Id: I7daf6f87c4c8331662b6b8f543e3e7f966724714

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-03-03 14:06:15 -06:00
Kanangot Balakrishnan, Bindhiya edd2268076 [SWDEV-512474] Conform amd-smi monitor output to 80 chars (#68)
Updated spacing and column headers
Updated Changelog

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-02-26 22:34:36 -06:00
Narlo, Joseph d7c3ad0886 [SWDEV-515031] Change Header Version to 25.2.0 (#109)
Change Versioning Scheme to match https://semver.org/
Dropping the year enum and API fields in a future release.
Should not impact library versioning since we are now starting from 25.2.0

Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>
Change-Id: Id090e23f156926d08f9c0b781447388adf268cf6
2025-02-26 19:17:09 -06:00
Pryor, Adam 7745f4f6c9 SWDEV-495057 Check video and render groups (#121)
Change-Id: I0a96dfbd84ae84fa29f41e1fe5ea89c05c5e5e7e

Signed-off-by: adapryor <Adam.pryor@amd.com>
2025-02-26 17:16:41 -06:00
Arif, Maisam c14397209d Moved clk-limit and process-isolation back to VMs (#135)
Change-Id: Ie46f1e94c09c71fcb8805f89749a5b1d8f9995e3

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-02-26 14:47:49 -06:00
Arif, Maisam 52b3ee2dc6 [SWDEV-503520] Add amdsmi_get_rocm_version() in python library (#76)
Changed amdsmi_get_rocm_version() to be an API in the python library only. 
Updated usage and version detection
Updated path detection of librocm-core.so
Updated docs to reflect that neither amdsmi_get_rocm_version() nor amdsmi_get_lib_version() requires initialization.

Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-02-26 05:45:58 -06:00
Mewar, Deepak 83c88f26d1 [SWDEV-511011] version command fails when driver isn't loaded (#123)
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>

Change-Id: Ibe77c904-1a5c-4361-8fad-291a203c6755
2025-02-26 03:13:22 -06:00
Pham, Gabriel 71a8f35a7d [SWDEV-509287] Fixed metric command issue with min_clk and deep sleep (#131)
Improved deep sleep detection

Change-Id: I4179084da6c2849275957adb7b57797846a0f748
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-02-26 03:00:04 -06:00
Arif, Maisam 9d2bbcf14d Updated amdsmi_get_driver_info() to handle empty strings (#126)
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-02-21 19:15:18 -06:00
Castillo, Juan 1b9841450a [SWDEV-514394] Added additional try catch statements for gpu_metrics (#117)
Broke apart the try/except clause that wrapped the entire gpu clocks functions into one try/except per individual gpu_metric, which allows valid data to populate even when another metric fails.

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
2025-02-18 15:18:18 -06:00
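The pattern described in commit 1b9841450a can be sketched as below: each metric is read inside its own try/except, so a single failing read does not discard values that were read successfully. The metric names and the `"N/A"` placeholder are assumptions for illustration, not the amd-smi implementation.

```python
def collect_metrics(readers):
    """Call each reader in its own try/except so one failure does not hide the rest."""
    results = {}
    for name, reader in readers.items():
        try:
            results[name] = reader()
        except Exception:
            # Only this metric is marked unavailable; the others keep their values.
            results[name] = "N/A"
    return results


def _unsupported():
    raise RuntimeError("metric not supported")


# Hypothetical readers standing in for individual gpu_metric lookups.
readers = {"gfx_clk": lambda: 1500, "mem_clk": _unsupported, "vclk": lambda: 800}
print(collect_metrics(readers))  # {'gfx_clk': 1500, 'mem_clk': 'N/A', 'vclk': 800}
```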
Pham, Gabriel ce526724d3 Corrected PERF_LEVELS in set --clk-level command to FREQ_LEVELS (#112)
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
2025-02-18 14:44:55 -06:00
Narlo, Joseph dc4a16da6f [SWDEV-513651] Sync Unified And Linux Header (#98)
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
2025-02-06 22:25:50 -06:00
Arif, Maisam 548ed781c7 [SWDEV-494072] Added Fallback to metric command for pcie replay_counter (#99)
Change-Id: I5392e8f881b1e69d9a76b01813a66b08fb70e006

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-02-06 19:33:49 -06:00
Maisam Arif 08be2db720 [SWDEV-513127] - Fixed spelling in amdsmi_parser.py
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Id15b7e1e15d8719c36d508db6cf2667234928b92
2025-02-04 17:57:16 -06:00
Poag, Charis 1d2272490e [SWDEV-513127] Fix AttributeError: 'AMDSMILogger' object has no attribute 'clear_multiple_devices_output' (#92)
Full output:
$ amd-smi metric:
 AttributeError: 'AMDSMILogger' object has no attribute 'clear_multiple_devices_output'. Did you mean: 'clear_multiple_devices_ouput'?

Changes:
* Changed CLI function definition clear_multiple_devices_ouput(self) ->
clear_multiple_devices_output(self)
* Updated all references to clear_multiple_devices_ouput() to use
  clear_multiple_devices_output()

Change-Id: Ibd4e210ea30c9dd51fba17981a524b823f2db054
2025-02-04 09:30:12 -06:00
Pham, Gabriel e663bed7d6 [SWDEV-462952] Updated passthrough to use virtualization mode struct
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-01-31 17:34:01 -06:00
Poag, Charis 3a94f5a880 Fix AttributeError: 'Namespace' object has no attribute 'cpu_pwr_svi_telemetry_rails'
Updated missing references to cpu_pwr_svi_telemetry_rails

Change-Id: I1828ad3122a602dc5c4253500f83c3910b682cb3
Signed-off-by: Poag, Charis <Charis.Poag@amd.com>
2025-01-31 08:12:42 -06:00
Pham, Gabriel 0f79efac78 [SWDEV-462952] Options enabled for GPU passthrough scenarios
Added Dynamic Passthrough detection

Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>
2025-01-30 18:12:03 -06:00
Pham, Gabriel 5b2c271eff [SWDEV-493207] Made fixes to enable hsmp version
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
2025-01-30 16:14:21 -06:00
Kanangot Balakrishnan, Bindhiya 51c705fd43 [SWDEV-511961] Wrap BM specific set help text
BM specific help text contained functions that required the driver to be loaded.
This was causing an amd-smi not supported error on Linux guests.
Fixed this by wrapping the help text in the proper checks.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-01-30 03:25:48 -06:00
Pham, Gabriel 0326c52ce9 [SWDEV-493207] Added amd_hsmp version to version command
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
2025-01-30 03:12:08 -06:00
Arif, Maisam 703415cb1f Updated Import Error Logging
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Change-Id: Ief4a5f100f54668c5bce001ea051136738fbc468
2025-01-28 15:56:49 -06:00
Kanangot Balakrishnan, Bindhiya e3e11835e4 [SWDEV-508042] Fix TypeError in specific clocks csv logging (#57)
Logging specific clocks in csv format was causing a TypeError because the levels were ints.
Fixed this by prepending the Level string to each level.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-01-22 18:06:13 -06:00
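As a rough illustration of the fix in commit e3e11835e4, the sketch below converts integer clock levels into strings prefixed with "Level" before writing CSV rows. The commit does not say exactly where the TypeError was raised, so the column names and the function are assumptions made for this example.

```python
import csv
import io


def write_clock_levels(levels):
    """Write {level_index: freq_mhz} as CSV, labelling each level as a string."""
    rows = [
        # str(idx) avoids the str + int TypeError that "Level " + idx would raise.
        {"level": "Level " + str(idx), "freq_mhz": freq}
        for idx, freq in levels.items()
    ]
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=["level", "freq_mhz"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


print(write_clock_levels({0: 500, 1: 1000}))
```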
Pham, Gabriel b779ce2831 [SWDEV-493207] Added amdgpu version to version command
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
2025-01-22 18:05:25 -06:00
Kanangot Balakrishnan, Bindhiya 834993e1c3 SWDEV-457845: Fix Linux VM clean_local_data error on set
Corrected clean_local_data error in Linux VMs when running
amd-smi set without args.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-01-20 14:45:27 -06:00
Poag, Charis c1cd2b46ef [SWDEV-488276] Add partition 2.0 functionality (#44)
Changes:
* CLI:
  - Updated amd-smi partition
  - Updated amd-smi partition -c
  - Updated amd-smi partition -m
  - Updated amd-smi partition -a
  - Updated amd-smi set -M <NPS1/NPS2/NPS4/NPS8>
  - Updated amd-smi set -C <SPX/DPX/QPX/TPX/CPX>
  - Updated amd-smi set -C <ACCELERATOR_TYPE> or <PROFILE_INDEX>
    Where PROFILE_INDEX = available ACCELERATOR_TYPES
  - Updated amd-smi set --help, now includes more detail for
    amd-smi set -C <ACCELERATOR_TYPE> or <PROFILE_INDEX>

* API:
  - Added amdsmi_get_gpu_memory_partition_config
  - Added amdsmi_set_gpu_memory_partition_mode
  - Added amdsmi_get_gpu_accelerator_partition_profile_config
  - Updated amdsmi_get_gpu_accelerator_partition_profile_config
  - Added amdsmi_set_gpu_accelerator_partition_profile

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2025-01-16 00:53:46 -06:00
Scaffidi, Salvatore 3793be7735 [SWDEV-463406] Update API with fields for gfx_clock_below_host_limit and low_utilization violations
Updated API with fields for gfx_clock_below_host_limit and low_utilization violations
Change-Id: I25647bae6e7b785f44dab024272767658688bcad

Signed-off-by: Scaffidi, Salvatore <Salvatore.Scaffidi@amd.com>
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>
2025-01-08 22:07:23 -06:00
Kanangot Balakrishnan, Bindhiya d0e770ffbc SWDEV-504130 Add temperature violation status to amd-smi monitor (#2)
Added boolean temperature violation status to amd-smi monitor.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-01-08 16:35:53 -06:00
Pham, Gabriel 129ad8ffad [SWDEV-502523] Made amd-smi reset command arguments mutually exclusive
Made reset arguments mutually exclusive so that users can only 
select one option at a time to prevent throwing of errors.

Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
2025-01-08 16:24:05 -06:00
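One common way to get the behavior described in commit 129ad8ffad is argparse's `add_mutually_exclusive_group()`. The sketch below assumes that approach; the flag names and the `demo-reset` program name are hypothetical and do not reflect the real amd-smi reset options.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser(prog="demo-reset")
    # argparse rejects any command line that supplies more than one flag from this group.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("--clocks", action="store_true")  # hypothetical flag
    group.add_argument("--fans", action="store_true")    # hypothetical flag
    return parser


args = build_parser().parse_args(["--clocks"])
print(args.clocks, args.fans)  # True False
```

Passing both `--clocks` and `--fans` makes argparse print a usage error and exit with status 2 before any reset logic runs.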
Kanangot Balakrishnan, Bindhiya 3897670757 [SWDEV-439701] Fix wrong error handling in MissingParameterValue (#32)
Error handling was not displaying the missing parameter details in
argument type validator functions. Fixed this by passing param name to
AmdSmiMissingParameterValueException.

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-01-07 17:13:00 -06:00
Pham, Gabriel 5ed340c08b [SWDEV-502523] Made set gpu arguments mutually exclusive (#31)
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
2025-01-07 16:48:01 -06:00
Pham, Gabriel 93a027ec95 [SWDEV-476303] Exposed valid values for set command (#8)
Updated amd-smi set help text

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
2024-12-20 15:32:10 -06:00
gabrpham 23da950ef0 Additional fixes for amd-smi static --clock
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
2024-12-20 14:45:20 -06:00
Charis Poag 3226a1d0ea [SWDEV-484382] Fix VCLK/DCLK outputs for monitor, static, metric
Units were off and VCLK/DCLK outputs were not coming in
properly through amdsmi_get_clk_freq()

Now we match units sent back through rsmi_dev_gpu_clk_freq_get (MHz).

The CLI now shows a maximum of 2 VCLK/DCLK values, or N/A if
no current_freq is listed.

Change-Id: I8a7b66cbb5263e8d396f8568c104e1ce3512923d
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
2024-12-20 14:11:08 -06:00
Juan Castillo f8b8347627 [SWDEV-496693] GPU Metrics 1.7
Features added:
- [SWDEV-475244] Add new interface to get max memory bandwidth
Updated API: amdsmi_get_gpu_vram_info
Updated: struct amdsmi_vram_info_t to include vram_max_bandwidth
CLI: amd-smi static --vram

- [SWDEV-488349] Add new interface for XGMI link status
New API: amdsmi_get_gpu_xgmi_link_status
CLI: amd-smi xgmi --link-status

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Change-Id: I1aa35b741136eb4f02f7ea9a95b865886273eb72
2024-12-18 10:57:06 -06:00
gabrpham fe290a2056 [SWDEV-484382] Added fclk and socclk to amd-smi metric -c
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Ie7e19c757b05455693c0d26eeb5e8b6c1e238375
2024-12-13 00:33:12 -05:00
gabrpham 5f9c2db6f3 [SWDEV-484382] Added new command amd-smi set -c/--clk-level
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: If45152e3a3c94f65b6a8a960601b9ed16fa3d0d7
2024-12-13 00:32:19 -05:00