The xgmi command was showing pcie bit rate and bandwidth instead of xgmi. Corrected the API to get xgmi data from gpu metric.
Added python API for amdsmi_get_link_metrics. Modified the amdsmi_link_metrics struct.
Added check to confirm non zero partition got xgmi command.
---------
Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
* [SWDEV-488303] Adjusted process vram_mem data source
* Standardized sscanf format strings
---------
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: gabrpham_amdeng <Gabriel.Pham@amd.com>
* Add the API and CLI to show the board voltage.
---------
Change-Id: Icb25bd653bb1d004704b5a21b378ca31b2b242c7
Signed-off-by: Arif, Maisam <Maisam.Arif@amd.com>
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
* Update to remove vram enum and instead use the string directly from the driver.
Signed-off-by: Narlo, Joseph <Joseph.Narlo@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
feat: Report PLDM Bundle from SMC to IB
Code changes related to the following:
* APIs
* CLI
* Unit tests
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
Change-Id: I35ccf01eb612ca80e3ae6b72039085c18c989222
- Added more events to `amdsmi_evt_notification_type_t`
Change-Id: I6a256fe828e4bec3197c7fecbed374ab17c6f850
Signed-off-by: Adam Pryor <Adam.Pryor@amd.com>
* Small Fixes
* CLI Help text and parser formatting updates
* Changed metavar for set partition
---------
Change-Id: Ia8809665f6fac670452cd4db4e5e8f9c7270faba
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Signed-off-by: Pham, Gabriel <Gabriel.Pham@amd.com>
Co-authored-by: Pham, Gabriel <Gabriel.Pham@amd.com>
This is required for GPU busy percent in RDC
Change-Id: Idf2ab72993ecc8227958e6eb47f36fc68c93759f
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
- Add support to check CPU vendor info which will be called by RDC to
discovery CPU information
- Move esmi headers declaration to impl/amd_smi_common.h
- remove duplicated amdsmi_cpu_util_t
---------
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Deepak Mewar <deepak.mewar@amd.com>
- Updates:
- Adding the following metrics to allow new calculations for violation status:
- Per XCP metrics gfx_below_host_limit_ppt_acc
- Per XCP metrics gfx_below_host_limit_thm_acc
- Per XCP metrics gfx_low_utilization_acc
- Per XCP metrics gfx_below_host_limit_total_acc
- Increasing available JPEG engines to 40. Current ASICs may not support all 40. These will be indicated as UINT16_MAX or N/A in CLI.
Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>
Added enumeration mapping for
- drm render
- drm card
- hsa id
- hip id
- hip uuid (rocminfo uuid)
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
* amdsmi: Adding Support to get hsmp Driver version
Adding Support to fetch hsmp driver version from ESmi Interfaces.
Adding Support to fetch memory bandwidth per socket.
Signed-off-by: muthusamy <muthusamy.ramalingam@amd.com>
Features added:
- [SWDEV-475244] Add new interface to get max memory bandwidth
Updated API: amdsmi_get_gpu_vram_info
Updated: struct amdsmi_vram_info_t to include vram_max_bandwidth
CLI: amd-smi static --vram
- [SWDEV-488349] Add new interface for XGMI link status
New API: amdsmi_get_gpu_xgmi_link_status
CLI: amd-smi xgmi --link-status
Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Change-Id: I1aa35b741136eb4f02f7ea9a95b865886273eb72
Issues include:
SWDEV-480250
SWDEV-480255
SWDEV-480248
Known issue:
`amd-smi event` has threads taking events from the same device
which, in the case of resetting gpus, makes it seem like some gpus have
reset mulitple times and other have not reset at all.
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Ic7dcc214e0366fc1532ece579d915d34d35d5407
Changes:
- [CLI] Added warning screen to AMD SMI users
setting memory partition
- [CLI] Added a progress bar time-bar for CLI sets display to 40 seconds
- [API] Updated to wait until the driver reloads with SYSFS files active
- [CLI] Now users can set or reset without providing:
amd-smi set -g all <set arguments>
or amd-smi reset -g all <set arguments>
now can directly call -> sudo amd-smi set <set arguments>
or sudo amd-smi reset <set arguments>
- [SWDEV-475712][CLI/API] Fixed target_graphics_version field
not properly displaying for older MI or Navi ASICs.
- [All APIs] Added a catch for the driver to report invalid arguments
now these APIs will show AMDSMI_STATUS_INVAL
(ex. changing to NPS8 if the device does not support it)
- [Install] Modified paths for Python install commands to support
multi-ROCm installs
Change-Id: Id11f25d68a82d23c6b2d77ccb30b51e860dd0ca7
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
The reset gpu partition support for both compute and memory were removed
Code changes related to the following:
* amdsmi_reset_gpu_compute_partition()
* amdsmi_reset_gpu_memory_partition()
* CLI
Change-Id: I372589074b4da172bedd39223edde18939e373ae
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
- Update the API names, parameters to return cpu handles and core
handles in the system.
- Update the amdsmi_wrapper.py.
- Update the amdsmi_interface.py to use the processor handles and
core handles API.
Change-Id: Ie24f62f345864f8b6773fdb3c6369993bca7e25b
Changes:
- amdsmi_violation_status_t now includes current accumulated/counter
values
- Tests/wrapper now include added values
- Removed ASIC references in header for host/bm alignment
- Fix violation_status->per_hbm_thrm /
violation_status->active_hbm_thrm
calculations.
Change-Id: Ic86a7cbad5198a41018f82f6b588b83158d9ba0b
Signed-off-by: Charis Poag <Charis.Poag@amd.com>