Граф коммитов

2 Коммитов

Автор SHA1 Сообщение Дата
Bill(Shuzhou) Liu 66e4e790c3 Add SSL mutual authentication support for rdci
The RDC API is changed to pass the certificates to the gRPC.

Add the support to add all GPUs in the host to a group. Also before
add a GPU to a group, the RDC API will verify that GPU exists or not.

Add the support to fetch the temperature metrics.

Change-Id: I5857ef03fede233d16e8b2836be120f33172da93
2020-08-17 14:07:25 -05:00
Bill(Shuzhou) Liu a5f063f8b3 Support discovery and group management in rdc_lib
The rdc.h is modified for new discovery and grouping APIs.

The RdcGroupSettingsImpl.cc is added to implement the GPU group and
the field group management.

The RdcMetricFetcherImpl.cc is added to fetch the metrics from
rocm_smi_lib. Currently, only support power, memory, GPU utilization,
temperature, GPU clock, total device and device name.

A new example field_value_example.cc is added to demo how to record
the fields and retrieve data from cache.

Change-Id: I57acfa048fe9b3d848e2d441e768b3a63ccae3f8
2020-08-17 14:07:25 -05:00