Changes:
- Added new GPU metrics:
1) XGMI link status - Up/Down; 1 = up; 0 = down
2) Graphics clocks below host limit (per XCP)
accumulators -> used to help calculate a violation status
3) VRAM max bandwidth at max memory clock
- Updated rocm-smi --showmetrics to include new metrics.
Units/values reflect as indicated by driver, may differ
from AMD SMI or other ROCm SMI interfaces which
use these fields.
- N/A fields means the device does not support providing this
data.
Change-Id: I17b313345f15070a76b3a30dd8d5645d212d601b
Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Use C++ in ROCm SMI
Device Indices
Many of the functions in the library take a "device index". The device index is a number greater than or equal to 0, and less than the number of devices detected, as determined by rsmi_num_monitor_devices(). The index is used to distinguish the detected devices from one another. It is important to note that a device may end up with a different index after a reboot, so an index should not be relied upon to be constant over reboots.
Hello ROCm SMI
The only required ROCm-SMI call for any program that wants to use ROCm-SMI is the rsmi_init() call. This call initializes some internal data structures that will be used by subsequent ROCm-SMI calls.
When ROCm-SMI is no longer being used, rsmi_shut_down() should be called. This provides a way to do any releasing of resources that ROCm-SMI may have held. In many cases, this may have no effect, but may be necessary in future versions of the library.
A simple "Hello World" type program that displays the device ID of detected devices would look like this:
#include <stdint.h>
#include "rocm_smi/rocm_smi.h"
int main() {
rsmi_status_t ret;
uint32_t num_devices;
uint16_t dev_id;
// We will skip return code checks for this example, but it
// is recommended to always check this as some calls may not
// apply for some devices or ROCm releases
ret = rsmi_init(0);
ret = rsmi_num_monitor_devices(&num_devices);
for (int i=0; i < num_devices; ++i) {
ret = rsmi_dev_id_get(i, &dev_id);
// dev_id holds the device ID of device i, upon a
// successful call
}
ret = rsmi_shut_down();
return 0;
}