* Updates:
- rocm_smi_logger:
General cleanup and
alignment with cpplint rules
- rocm_smi_monitor:
Fixed MonitorTypes
not displaying properly in logs;
added socket power label and current
socket power MonitorTypes
- rocm_smi API:
Added rsmi_dev_current_socket_power_get API
- rocm_smi CLI:
General cleanup;
concise info now displays device data
in variable width (see printLogSpacer's
new field);
printLogSpacer now has an adjustable
variable that overrides appWidth;
added Socket Power to base rocm-smi and
--showpower CLI calls;
--showpower and base rocm-smi CLI default
to printing socket power (if not available,
average power is displayed)
- Cleaned up temp label references
- power_read gtests:
Added current socket power to testing
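The socket power fallback described for the CLI can be sketched as below. This is an illustrative Python sketch only, not the actual CLI code: `read_socket_power` and `read_average_power` are hypothetical callables standing in for the real device queries, the label strings are made up, and using `None` to mean "not available" is an assumption.

```python
from typing import Callable, Optional, Tuple


def pick_power_reading(
    read_socket_power: Callable[[], Optional[float]],
    read_average_power: Callable[[], Optional[float]],
) -> Tuple[str, Optional[float]]:
    """Prefer current socket power; fall back to average power."""
    # Hypothetical query: returns None when the device does not report it.
    socket_power = read_socket_power()
    if socket_power is not None:
        return ("Current Socket Power", socket_power)
    # Socket power unavailable, so display average power instead.
    return ("Average Power", read_average_power())
```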
Change-Id: Ica57e6f98ad96e2584e7c7955e188f68d2dab89d
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
The purpose of this patch is to add the following missing firmware
blocks to the SMI LIB:
-RSMI_FW_BLOCK_MES
-RSMI_FW_BLOCK_MES_KIQ
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
Change-Id: I5d4d37d883878dd02ef8533d4eb8891d54d70630
Implements APIs for 'gpu_metrics_v1_3' utilization averages
Code changes related to the following:
* rsmi_dev_activity_metric_get()
* rsmi_dev_activity_avg_mm_get()
* CLI shows "Avg.Memory Bandwidth" under "--showmemuse"
Change-Id: I8e4600f350a7c18499abf022534db2b875f09d5f
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
* Updates:
- Fixed infinite loop on systems
which did not have VRAM files
- Fixed concise info from throwing exception
with no amdgpu driver loaded
- Fixed ability to see all nodes
after switching partitions (mirrors
original card display/settings)
- Logs now include build type, lib path,
and the environment variables that are set
Change-Id: Ic0333df355144ce2242cecea93fe4ce51caf311c
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
When updating the wrapper I ran into an issue with anonymous structs.
The generated wrapper would contain a string split across multiple lines,
which is invalid Python.
e.g.
'struct_struct anonymous
(struct.... amdsmi.h:355)'
After naming the structs, the issue is gone. The BDF union now has to be
addressed with .fields
e.g.
OLD: bdf.function_number
NEW: bdf.fields.function_number
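A minimal ctypes sketch of why the access path changes once the inner struct is named. The class names and the bit widths below are assumptions made for illustration; they are not copied from the generated wrapper or from amdsmi.h.

```python
import ctypes


class BdfFields(ctypes.Structure):
    # Assumed bit layout, for illustration only.
    _fields_ = [
        ("function_number", ctypes.c_uint64, 3),
        ("device_number", ctypes.c_uint64, 5),
        ("bus_number", ctypes.c_uint64, 8),
        ("domain_number", ctypes.c_uint64, 48),
    ]


class Bdf(ctypes.Union):
    # The inner struct is a named member ("fields"), so its bit fields
    # are reached through that member instead of directly on the union.
    _fields_ = [
        ("fields", BdfFields),
        ("as_uint", ctypes.c_uint64),
    ]


bdf = Bdf()
bdf.fields.function_number = 1
bdf.fields.device_number = 2
bdf.fields.bus_number = 0x43
```

With this layout `bdf.function_number` raises AttributeError, while `bdf.fields.function_number` works.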
Change-Id: Ib3c640c088ad0cc67893d636827356902051f17f
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>
Return RSMI_STATUS_UNEXPECTED_DATA for drivers
which do not set some clock frequencies
Change-Id: I43a9515c2757dddd412bb25cfd54095e63367030
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
* Updates:
- Fix for devices which have junction sensors but no edge sensors
- Added partitioning (memory and dynamic) displays for
base rocm-smi CLI calls
- Added subheading for base rocm-smi call output
- Added better hwmon and device detection logging
Change-Id: I8219884b2e532d6ed379527cacdc1f2b232a5451
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Code changes related to the following:
* All reinforcement work moved to its own files
* Self-contained changes only to support them
* New files added to CMakeLists.txt
Change-Id: I761e91f54392824df9145eaed8b9805986861285
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>
* Updates:
- Env variable RSMI_LOGGING=0 or any value other
than 1, 2, or 3 -> all logging off
- Env variable RSMI_LOGGING=1 -> logs only
- Env variable RSMI_LOGGING=2 -> console only
- Env variable RSMI_LOGGING=3 -> both logs + console
- Metrics output includes hexdump of current file
and decoded metrics (functions: logHexDump
and log_gpu_metrics)
- System info gathered now includes the system's
perceived endianness (little or big endian),
helpful for viewing decoded hexdump or any
binary translation
- Added templates for printing unsigned hex
(print_unsigned_hex_and_int), unsigned integers
(print_unsigned_int), and printing both unsigned
hex and int with an optional header
(print_unsigned_hex_and_int)
- Fixed some build warnings/errors,
e.g. strncpy calls for sku or board names
(this operation is expected and needed);
unsuccessful temp file writes
now properly return RSMI_STATUS_FILE_ERROR
- Fixed logrotate not initializing properly
on RHEL 8.8/9.x
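The RSMI_LOGGING values listed above map to output targets roughly as in the sketch below. This is an illustrative Python sketch only, not the library's implementation: `logging_targets` is a hypothetical helper, and treating an unset variable the same as 0 is an assumption.

```python
import os
from typing import Mapping, Tuple


def logging_targets(env: Mapping[str, str] = os.environ) -> Tuple[bool, bool]:
    """Return (log_to_file, log_to_console) for the RSMI_LOGGING value."""
    modes = {
        "1": (True, False),  # logs only
        "2": (False, True),  # console only
        "3": (True, True),   # both logs and console
    }
    # 0, unset (assumed), or any unrecognized value: all logging off.
    return modes.get(env.get("RSMI_LOGGING", "").strip(), (False, False))
```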
Change-Id: Ifa0f0218c9cafd0a8cd6aa8e7f94d61e9107200f
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Corrected version output to use the amd-smi version instead of the rocm-smi version
Added newline characters in the gpu choices
Updated CLI versioning to 23.2.1.0 to match amd-smi
Signed-off-by: Maisam Arif <maisarif@amd.com>
Change-Id: Ia6db3a281e2349e05a09209bdcfdfa5ac48e3a86
Corrected asic serial fallback to use rsmi's unique id
Removed product serial due to duplication
Change-Id: Ib4e9ac00d2bf31ccbc35060bc84f7e79e5332d37
Signed-off-by: Maisam Arif <maisarif@amd.com>
1. Created new class files for cpu socket and cpu core
2. Implemented wrapper APIs for getting energy monitoring, system
statistics, and power monitoring values
3. Modified amdsmi init and cleanup functions for esmi lib support
4. Modified amdsmi system class for esmi lib support
5. Created sample test code in example dir
Change-Id: Ic41f31641c283a681de696bb4346b557265bad42
1. Added new processor types AMD_CPU_CORE, AMD_APU to ENUM
2. Added esmi error codes, and wrappers for structures and library APIs
3. Introduced a macro to enable/disable the esmi library code
Change-Id: Ia64b29303c231d3f17ac6b40fcd09b09b4380903
Code changes related to the following:
* Added 'rsmi_dev_revision_get()' related code
* Test code
* Functional tests
Change-Id: I8c2097c65384a028c8c8437b717d05d52fe45250
Signed-off-by: Oliveira, Daniel <daniel.oliveira@amd.com>