- Add support to check CPU vendor info which will be called by RDC to
discovery CPU information
- Move esmi headers declaration to impl/amd_smi_common.h
- remove duplicated amdsmi_cpu_util_t
---------
Signed-off-by: Perry Yuan <perry.yuan@amd.com>
Signed-off-by: Deepak Mewar <deepak.mewar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Deepak Mewar <deepak.mewar@amd.com>
* [SWDEV-513807] Fix amd-smi partition --accelerator not returning AMDSMI_STATUS_NO_PERM
Changes:
- Fixed amdsmi_get_gpu_accelerator_partition_profile_config() from not
returning AMDSMI_STATUS_NO_PERM
- Changed amd-smi partition --accelerator to provide user with a warning
if users does not use sudo or root permissions.
- Updated changelog for fixes planned for 6.4.1 release
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
- Updates:
- Adding the following metrics to allow new calculations for violation status:
- Per XCP metrics gfx_below_host_limit_ppt_acc
- Per XCP metrics gfx_below_host_limit_thm_acc
- Per XCP metrics gfx_low_utilization_acc
- Per XCP metrics gfx_below_host_limit_total_acc
- Increasing available JPEG engines to 40. Current ASICs may not support all 40. These will be indicated as UINT16_MAX or N/A in CLI.
Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>
Added enumeration mapping for
- drm render
- drm card
- hsa id
- hip id
- hip uuid (rocminfo uuid)
Signed-off-by: AL Musaffar, Yazen <Yazen.ALMusaffar@amd.com>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
Change Versioning Scheme to match https://semver.org/
Dropping the year enum and API fields in a future release.
Should not impact library versioning since we are now starting from 25.2.0
---------
Signed-off-by: Joseph Narlo <joseph.narlo@amd.com>
Co-authored-by: Arif, Maisam <Maisam.Arif@amd.com>
Change-Id: Id090e23f156926d08f9c0b781447388adf268cf6
* amdsmi: Adding Support to get hsmp Driver version
Adding Support to fetch hsmp driver version from ESmi Interfaces.
Adding Support to fetch memory bandwidth per socket.
Signed-off-by: muthusamy <muthusamy.ramalingam@amd.com>
Features added:
- [SWDEV-475244] Add new interface to get max memory bandwidth
Updated API: amdsmi_get_gpu_vram_info
Updated: struct amdsmi_vram_info_t to include vram_max_bandwidth
CLI: amd-smi static --vram
- [SWDEV-488349] Add new interface for XGMI link status
New API: amdsmi_get_gpu_xgmi_link_status
CLI: amd-smi xgmi --link-status
Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Change-Id: I1aa35b741136eb4f02f7ea9a95b865886273eb72
Issues include:
SWDEV-480250
SWDEV-480255
SWDEV-480248
Known issue:
`amd-smi event` has threads taking events from the same device
which, in the case of resetting gpus, makes it seem like some gpus have
reset mulitple times and other have not reset at all.
Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Change-Id: Ic7dcc214e0366fc1532ece579d915d34d35d5407
Changes:
* [API] Removed checking board name, fixes for other MI ASICs
* [API] Fixed unable to restart AMD GPU, libdrm blocked
doing this operation
* [API] Added ability to unload/reload libdrm
from within AMD SMI APIs
* [CLI] Increased progress bar to change memory partition modes
to 140 seconds, since driver reload is variable per system
Change-Id: I52f227f2ab850c4a6332ff3ecdc899903b1080f1
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Changes:
- [CLI] Added warning screen to AMD SMI users
setting memory partition
- [CLI] Added a progress bar time-bar for CLI sets display to 40 seconds
- [API] Updated to wait until the driver reloads with SYSFS files active
- [CLI] Now users can set or reset without providing:
amd-smi set -g all <set arguments>
or amd-smi reset -g all <set arguments>
now can directly call -> sudo amd-smi set <set arguments>
or sudo amd-smi reset <set arguments>
- [SWDEV-475712][CLI/API] Fixed target_graphics_version field
not properly displaying for older MI or Navi ASICs.
- [All APIs] Added a catch for the driver to report invalid arguments
now these APIs will show AMDSMI_STATUS_INVAL
(ex. changing to NPS8 if the device does not support it)
- [Install] Modified paths for Python install commands to support
multi-ROCm installs
Change-Id: Id11f25d68a82d23c6b2d77ccb30b51e860dd0ca7
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Move memory_caps defintion and correct the number in reserved to match Confluence
Signed-off-by: Joe Narlo <Joseph.Narlo@amd.com>
Change-Id: Id94144f4b3d2d3d7b4d7327211ffc1957ffd0a93
Changes:
- Corrected max speed users can sample from FW/driver
is 100 ms
- Added warning to amdsmi_get_violation_status()
call on delay required 100ms to sample
- Removed guest support, this API will not be supported
- Updated CLI `amd-smi metric --throttle` outputs from
XXX_active -> XXX_status
XXX_percent -> XXX_activity
to align with host
- Changelog updated
Change-Id: Ib30dd35dcc04ff67904ca82c86a55a16689df226
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
The reset gpu partition support for both compute and memory were removed
Code changes related to the following:
* amdsmi_reset_gpu_compute_partition()
* amdsmi_reset_gpu_memory_partition()
* CLI
Change-Id: I372589074b4da172bedd39223edde18939e373ae
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
- Update the API names, parameters to return cpu handles and core
handles in the system.
- Update the amdsmi_wrapper.py.
- Update the amdsmi_interface.py to use the processor handles and
core handles API.
Change-Id: Ie24f62f345864f8b6773fdb3c6369993bca7e25b
Changes:
- amdsmi_violation_status_t now includes current accumulated/counter
values
- Tests/wrapper now include added values
- Removed ASIC references in header for host/bm alignment
- Fix violation_status->per_hbm_thrm /
violation_status->active_hbm_thrm
calculations.
Change-Id: Ic86a7cbad5198a41018f82f6b588b83158d9ba0b
Signed-off-by: Charis Poag <Charis.Poag@amd.com>