* Updates:
- [API] After discovering all AMD GPUs, we now map the
correct BDF to each device (including XGMI nodes). This is
especially important for partition changes, i.e. secondary nodes.
- [API] Secondary nodes are now grouped better when added:
they are re-sorted based on the KFD properties list and
matched to the primary's unique id
- [API] All secondary nodes are now added through AddToDeviceList
with the correct BDF (location id), as provided by KFD
- [API] Modified AddToDeviceList(..., uint64_t bdfid):
added an optional parameter, bdfid. This allows working
around primary PCIe cards with XGMI nodes
- [API] Utils - cpplint minor fixes
- [Example] Replaced all endl references with newline, fixed
spacing, and fixed some values incorrectly displayed as hex
(decimal representation was needed)
- [API] kfd node functions - now print full path of file
for trace logs
- [Tests] power_read.cc: Added a generic power test to
confirm that specific return values are guaranteed
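
The grouping of secondary nodes under their primary can be
pictured with the minimal sketch below. It is not the library
code: KfdNode and GroupByUniqueId are hypothetical names, and
treating the lowest node_id in a group as the primary is an
assumption made only for illustration.

```cpp
#include <algorithm>
#include <cstdint>
#include <map>
#include <vector>

// Hypothetical stand-in for per-node data read from the KFD properties.
struct KfdNode {
  uint32_t node_id;    // KFD node index
  uint64_t unique_id;  // shared by all nodes of one physical card
  uint64_t bdf;        // location id reported by KFD
};

// Sort by (unique_id, node_id), then bucket the nodes by unique_id.
// Assumption: the lowest node_id in each bucket is the primary node.
std::map<uint64_t, std::vector<KfdNode>> GroupByUniqueId(
    std::vector<KfdNode> nodes) {
  std::sort(nodes.begin(), nodes.end(),
            [](const KfdNode& a, const KfdNode& b) {
              if (a.unique_id != b.unique_id) return a.unique_id < b.unique_id;
              return a.node_id < b.node_id;
            });
  std::map<uint64_t, std::vector<KfdNode>> groups;
  for (const KfdNode& n : nodes) groups[n.unique_id].push_back(n);
  return groups;
}
```

Each node keeps its own bdf inside its group, so secondary
nodes stay distinguishable from the primary they belong to.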
Change-Id: I143474e8d64c4915a966e789be6bcea4fa7f4472
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
* Updates:
- [API] Added rsmi_dev_power_get(uint32_t dv_ind,
uint64_t *power, RSMI_POWER_TYPE *type), which
provides a generic getter for average or
current power and keeps backwards
compatibility
- Added utility functions to get strings for
MonitorTypes (monitor_type_string(type)) and
RSMI_POWER_TYPE (power_type_string(type))
- [Tests] Added rsmi_dev_power_get tests and
provided better verification of return values for
all power APIs
- [Tests] Updated power outputs to show correct
units
- [Example] Now uses the average, current, and generic
power functions and prints the returned power type
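
A generic power getter of this kind can be sketched as below.
This is an illustration only, not the library implementation:
PowerType, PowerReader, GenericPowerGet and PowerTypeString are
hypothetical stand-ins, and trying current power before average
power is an assumed order.

```cpp
#include <cstdint>
#include <functional>
#include <string>

// Stand-in for illustration; the real RSMI_POWER_TYPE lives in the
// library headers and may differ.
enum class PowerType { kAverage, kCurrent, kInvalid };

// Hypothetical reader: returns true and fills *value on success.
using PowerReader = std::function<bool(uint64_t* value)>;

// Assumed order: try current power first, then fall back to average.
// The kind of reading that succeeded is reported through *type.
bool GenericPowerGet(const PowerReader& read_current,
                     const PowerReader& read_average,
                     uint64_t* power, PowerType* type) {
  if (power == nullptr || type == nullptr) return false;
  if (read_current(power)) {
    *type = PowerType::kCurrent;
    return true;
  }
  if (read_average(power)) {
    *type = PowerType::kAverage;
    return true;
  }
  *type = PowerType::kInvalid;
  return false;
}

// Human-readable label for the stand-in power type.
std::string PowerTypeString(PowerType t) {
  switch (t) {
    case PowerType::kAverage: return "average";
    case PowerType::kCurrent: return "current";
    default: return "invalid";
  }
}
```

Returning the type next to the value lets a caller that only
knows one of the two readings keep working unchanged.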
Change-Id: I5ca06ca37fd5f61e100f2835b664d6cdd1ca42e6
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
* Updates:
- rocm_smi_lib + CLI:
Renamed all "NPS mode" related files/functions/API/CLI
to "memory partition" to align with correct
technical naming
- rocm_smi_main: fixed identification of the primary card's
unique id; now uses rsmi_dev_unique_id_get to map which
KFD nodes belong to it
- rsmi_dev_*_partition*: now have better logging output
- compute partition tests:
Added a 20-second delay as a workaround until GPU
busy is confirmed as the issue
- CPPLint fixes/formatting
- [Example] Replaced all endl with "\n" for efficiency
- [Example] Added Edge & Junction temperature examples
- [Example] Added rsmi_minmax_bandwidth_get() example - WIP
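
The efficiency gain from dropping endl comes from flushing:
std::endl writes a newline and then flushes the stream, while
"\n" only writes the newline. The sketch below counts flushes
to make this visible. CountingBuf and CountFlushes are
hypothetical helper names, not part of the example code.

```cpp
#include <ostream>
#include <sstream>

// String buffer that counts how often the stream asks it to flush.
class CountingBuf : public std::stringbuf {
 public:
  int flushes = 0;

 protected:
  int sync() override {
    ++flushes;
    return 0;
  }
};

// Writes `lines` lines and returns the number of flushes triggered.
int CountFlushes(bool use_endl, int lines) {
  CountingBuf buf;
  std::ostream out(&buf);
  for (int i = 0; i < lines; ++i) {
    if (use_endl) {
      out << "line " << i << std::endl;  // newline plus flush
    } else {
      out << "line " << i << "\n";       // newline only
    }
  }
  return buf.flushes;
}
```

With endl every line triggers one flush; with "\n" the buffer
is left to flush on its own schedule.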
Change-Id: Ida6db6fda7e0ac9d696a34cb15b4746e69d58d51
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
- Return from freq_output function early if clock is unsupported
- Right-align frequencies
Change-Id: I799c9351dac8a5be161bc9243cd3816539728357
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Also change the TARGET from amd_smi_libraries to rocm_smi_libraries
This helps reduce confusion between rocm-smi and amd-smi
Change-Id: Ie54cedd831ba24bd9afc341ad15b7e8e20732059
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
* Updates:
- rocm_smi_logger:
General cleanup and
alignment with cpplint rules for usage
- rocm_smi_monitor:
Fixed MonitorTypes not displaying
properly in logs;
added socket power label and current
socket power MonitorTypes
- rocm_smi API:
Added rsmi_dev_current_socket_power_get API
- rocm_smi CLI:
General cleanup,
Concise info now displays device data
in variable width (see printLogSpacer's
new field),
printLogSpacer now has an adjustable
variable that overrides appWidth,
Added Socket Power to base rocm-smi +
--showpower CLI calls,
--showpower & base rocm-smi CLI defaults
to printing socket power (if not available,
displays average power)
- Cleaned up temp label references
- power_read gtests:
Added current socket power to testing
Change-Id: Ica57e6f98ad96e2584e7c7955e188f68d2dab89d
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
The purpose of this patch is to add the following missing firmware
blocks to the SMI LIB:
- RSMI_FW_BLOCK_MES
- RSMI_FW_BLOCK_MES_KIQ
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I5d4d37d883878dd02ef8533d4eb8891d54d70630
This commit makes sure GTest is always compiled with rocm_smi_lib_tests.
GTest installation was inconsistent outside of AMD CI environment.
libgtest.so wouldn't get installed with rocm_smi_lib_tests if gtest
existed on the build machine, which is undesirable when packaging.
Change-Id: I607df6c67c81480e3b6487b28f14924e8bf56ad4
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Implements APIs for 'gpu_metrics_v1_3' utilization averages
Code changes related to the following:
* rsmi_dev_activity_metric_get()
* rsmi_dev_activity_avg_mm_get()
* CLI shows "Avg.Memory Bandwidth" under "--showmemuse"
Change-Id: I8e4600f350a7c18499abf022534db2b875f09d5f
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
It is not guaranteed that power can be read or set for some GPUs
(MI300). It is also not guaranteed that frequencies can be set.
Because this is not a tool issue, we simply skip the failing test.
Change-Id: I134e96a476040cef513cd924f00e30cd6dea42a5
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
* Updates:
- Fixed infinite loop on systems
which did not have VRAM files
- Fixed concise info from throwing exception
with no amdgpu driver loaded
- Fixed ability to see all nodes
after switching partitions (mirrors
original card display/settings)
- Added build type, lib path, and
set env. variables to logs
Change-Id: Ic0333df355144ce2242cecea93fe4ce51caf311c
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Corrected the version from rocm-smi version to amd-smi version
Added newline characters in the gpu choices
Updated cli versioning to 23.2.1.0 to match amd-smi
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ia6db3a281e2349e05a09209bdcfdfa5ac48e3a86
Code changes related to the following:
* Added 'rsmi_dev_revision_get()' related code
* Test code
* Functional tests
Change-Id: I8c2097c65384a028c8c8437b717d05d52fe45250
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
The following read tests were failing:
*.TestIdInfoRead
*.TestSysInfoRead
1. *.TestIdInfoRead failed because rsmi_dev_brand_get did not specify
dependency on vbios_version.
2. *.TestSysInfoRead failed because the test didn't expect vbios_version to
be missing, which is new behavior in Aqua Vanjaram.
Change-Id: I9ee88a12fcf6cff2032049e2ecdfb2957efb03ab
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
See SWDEV-391039 and SWDEV-391040 for details
Change-Id: I662ba43363d949465454ea4af4d4586b3d47a811
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>