In rdci dmon and fieldgroup, fields can now be specified
by either numeric ID or field name.
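
A minimal sketch of how one field token could be resolved under this rule. The
table contents, the EXAMPLE_* names, their numeric values, and the helper
names FieldTable and ResolveField are all hypothetical placeholders for
illustration, not the actual RDC field definitions or rdci code:

```cpp
#include <cctype>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical name-to-ID table. Names and numbers are placeholders only.
const std::unordered_map<std::string, std::uint32_t>& FieldTable() {
  static const std::unordered_map<std::string, std::uint32_t> table = {
      {"EXAMPLE_GPU_CLOCK", 100},
      {"EXAMPLE_MEM_CLOCK", 101},
      {"EXAMPLE_GPU_TEMP", 200},
  };
  return table;
}

// Resolve one command-line token to a field ID.
// A token made only of digits is taken as a numeric ID; anything else is
// looked up by name. Returns false if the token cannot be resolved.
bool ResolveField(const std::string& token, std::uint32_t* id) {
  if (token.empty()) return false;

  bool all_digits = true;
  for (unsigned char c : token) {
    if (!std::isdigit(c)) {
      all_digits = false;
      break;
    }
  }

  if (all_digits) {
    try {
      const unsigned long value = std::stoul(token);
      if (value > std::numeric_limits<std::uint32_t>::max()) return false;
      *id = static_cast<std::uint32_t>(value);
      return true;
    } catch (const std::out_of_range&) {
      return false;  // too many digits to fit in unsigned long
    }
  }

  const auto it = FieldTable().find(token);
  if (it == FieldTable().end()) return false;
  *id = it->second;
  return true;
}
```

In this sketch a numeric token is passed through without checking that it is a
known field; a real tool would probably validate it against its field list.
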
When RDC is fetching a metric asynchronously, the pending fetch is
no longer reported as an error.
Change-Id: I81331e2c239af987181147be5ac0e29ba1617ab4
[ROCm/rdc commit: d30cb81fdb]
Remove the * from rdci stats.
When a group is created, the GPUs can be added in the same command.
Add support for the memory temperature.
Add support for the memory clock.
Add support for reporting ECC errors.
Add support for reporting PCIe bandwidth throughput.
Since the RX/TX throughput may take 1 second to retrieve, an async fetch is implemented
in RdcMetricFetcherImpl.
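
The idea of an async fetch for a slow metric can be sketched as below. This is
not the RdcMetricFetcherImpl code; the class AsyncMetric, the enum
FetchStatus, and the use of std::async are assumptions chosen to illustrate
the pattern of returning a "pending" status instead of blocking the caller:

```cpp
#include <chrono>
#include <cstdint>
#include <functional>
#include <future>
#include <utility>

enum class FetchStatus { kOk, kPending };

// Hypothetical wrapper around one slow metric read (for example, a
// throughput sample that takes about a second to collect).
class AsyncMetric {
 public:
  explicit AsyncMetric(std::function<std::uint64_t()> slow_read)
      : slow_read_(std::move(slow_read)) {}

  // Never blocks. Starts a background read if none is in flight, picks up
  // the result if one has finished, and returns the most recent value.
  // Returns kPending while no value has been produced yet.
  FetchStatus Fetch(std::uint64_t* value) {
    if (!inflight_.valid()) {
      inflight_ = std::async(std::launch::async, slow_read_);
    }
    if (inflight_.wait_for(std::chrono::seconds(0)) ==
        std::future_status::ready) {
      cached_ = inflight_.get();  // get() leaves the future invalid
      has_value_ = true;
    }
    if (!has_value_) return FetchStatus::kPending;
    *value = cached_;
    return FetchStatus::kOk;
  }

  // Blocks until the in-flight read, if any, has finished.
  void Wait() {
    if (inflight_.valid()) inflight_.wait();
  }

 private:
  std::function<std::uint64_t()> slow_read_;
  std::future<std::uint64_t> inflight_;
  std::uint64_t cached_ = 0;
  bool has_value_ = false;
};
```

A caller polling on a timer would treat kPending as "no sample yet" and simply
try again on the next tick.
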
Change-Id: If04f602fe1f2d14dbf7c2fb189549fd030523f9a
[ROCm/rdc commit: f4a3fd4dda]
Add support for the stats subsystem in rdci
Modify the dmon system to handle the case where a group has no GPUs
Change-Id: I5a18e1201d24b5318b8e324a77551a757b108f25
[ROCm/rdc commit: 096dc2dadb]
Add functions to start and stop job recording.
Add a function to get the job stats for each GPU and a summary across multiple GPUs.
Add a function to remove jobs.
Add a class RdcLogger which can control the log level using the environment variable RDC_LOG.
This is similar to GRPC_VERBOSITY in gRPC. When customers run into issues, they can enable verbose
logging to help us troubleshoot.
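
A rough sketch of an environment-controlled log level follows. It is not the
RdcLogger implementation; the class name SketchLogger, the function
ParseLogLevel, the accepted values DEBUG, INFO and ERROR (borrowed from
GRPC_VERBOSITY), and the fallback to ERROR are assumptions for illustration:

```cpp
#include <cstdlib>
#include <iostream>
#include <string>

enum class LogLevel { kDebug = 0, kInfo = 1, kError = 2 };

// Map the environment variable text to a level. A missing or unrecognized
// value falls back to kError (an assumed default).
LogLevel ParseLogLevel(const char* text) {
  if (text == nullptr) return LogLevel::kError;
  const std::string s(text);
  if (s == "DEBUG") return LogLevel::kDebug;
  if (s == "INFO") return LogLevel::kInfo;
  return LogLevel::kError;
}

class SketchLogger {
 public:
  // Reads the threshold from RDC_LOG once, at construction.
  SketchLogger() : threshold_(ParseLogLevel(std::getenv("RDC_LOG"))) {}
  explicit SketchLogger(LogLevel threshold) : threshold_(threshold) {}

  // A message is emitted only if its level is at or above the threshold.
  bool Enabled(LogLevel level) const { return level >= threshold_; }

  void Log(LogLevel level, const std::string& message) const {
    if (Enabled(level)) std::cerr << message << '\n';
  }

 private:
  LogLevel threshold_;
};
```

Under these assumptions, running a tool with RDC_LOG=DEBUG in its environment
would make the default-constructed logger print debug messages as well.
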
Add -u support to rdci group, fieldgroup and dmon for connecting to rdcd without authentication.
Change-Id: I22c591823c1ee6485db106b911bed8271d1b2769
[ROCm/rdc commit: a547dc7efd]
Add support for create, delete and query in the rdci group subsystem.
Add support for create, delete and query in the rdci fieldgroup subsystem.
Add support for the rdci dmon system. The dmon system shows the stats every
few seconds until Ctrl-C is pressed. To clean up resources (for example, unwatch),
a signal handler is added.
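
The Ctrl-C handling described above can be sketched roughly as follows. This
is not the rdci code; the names g_stop, OnSigint and RunMonitorLoop are
hypothetical, and the sample and cleanup callbacks stand in for printing stats
and for releasing resources such as unwatching fields:

```cpp
#include <chrono>
#include <csignal>
#include <thread>

// Set by the signal handler. The handler only touches this flag, which
// keeps it async-signal-safe.
volatile std::sig_atomic_t g_stop = 0;

void OnSigint(int) { g_stop = 1; }

// Calls sample() repeatedly with a delay between calls until SIGINT
// arrives, then calls cleanup() once. Returns the number of samples taken.
template <typename Sample, typename Cleanup>
int RunMonitorLoop(Sample sample, Cleanup cleanup,
                   std::chrono::milliseconds delay) {
  g_stop = 0;
  auto previous = std::signal(SIGINT, OnSigint);

  int iterations = 0;
  while (!g_stop) {
    sample();
    ++iterations;
    std::this_thread::sleep_for(delay);
  }

  cleanup();                      // release resources before returning
  std::signal(SIGINT, previous);  // restore the earlier handler
  return iterations;
}
```

Doing the cleanup in the main loop after the flag is seen, and not inside the
handler itself, avoids calling non-reentrant code from signal context.
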
Change-Id: Ib22a8a43b7083c7c72819ca21145e22702d9ad6c
[ROCm/rdc commit: 16bce67835]