Quick start

If RDC is not installed, specify the RDC library path:

$ export LD_LIBRARY_PATH=<rdc_libs_path>
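
If the binding later fails to load the library, a quick check of the search path can help. The sketch below looks for a shared library in the directories listed in LD_LIBRARY_PATH. The helper name `find_in_library_path` is hypothetical, and the library file name in the usage comment (`librdc_bootstrap.so`) is an assumption about what the binding loads; adjust it to match your RDC build. Note that the dynamic loader reads LD_LIBRARY_PATH when a process starts, so the variable must be exported before launching Python.

```python
import os


def find_in_library_path(lib_name, env=None):
    """Return the first path to lib_name found via LD_LIBRARY_PATH, or None."""
    env = os.environ if env is None else env
    for directory in env.get("LD_LIBRARY_PATH", "").split(os.pathsep):
        if not directory:  # skip empty entries such as a trailing ':'
            continue
        candidate = os.path.join(directory, lib_name)
        if os.path.isfile(candidate):
            return candidate
    return None


if __name__ == "__main__":
    # Assumed library name; replace with the file shipped in <rdc_libs_path>.
    print(find_in_library_path("librdc_bootstrap.so"))
```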

Then run RdcReader from the python_binding folder:

$ python RdcReader.py

Prometheus plugin

Install the prometheus_client:

$ pip install prometheus_client

Start rdcd with authentication enabled, then run the plugin to connect to it:

$ python rdc_prometheus.py

Check the options of the plugin:

$ python rdc_prometheus.py --help

Verify the plugin is running:

$ curl localhost:5000
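
The same check can be scripted. The sketch below is a minimal, standard-library-only reader for the Prometheus text exposition format. It assumes the plugin serves its metrics at http://localhost:5000, as in the curl command above. The helper names (`parse_metrics`, `fetch_metrics`) are illustrative and not part of RDC, and label values containing escaped quotes are not handled.

```python
import re
from urllib.request import urlopen

# One sample line: metric name, optional {labels}, then the value.
# A trailing timestamp, if present, is ignored.
_SAMPLE = re.compile(r'^([A-Za-z_:][A-Za-z0-9_:]*)(?:\{(.*)\})?\s+(\S+)')
_LABEL = re.compile(r'([A-Za-z_][A-Za-z0-9_]*)="([^"]*)"')


def parse_metrics(text):
    """Parse Prometheus text exposition into (name, labels, value) tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        match = _SAMPLE.match(line)
        if match is None:
            continue
        name, raw_labels, value = match.groups()
        labels = dict(_LABEL.findall(raw_labels or ""))
        samples.append((name, labels, float(value)))
    return samples


def fetch_metrics(url="http://localhost:5000", timeout=5):
    """Fetch and parse metrics from a running plugin (the URL is an assumption)."""
    with urlopen(url, timeout=timeout) as response:
        return parse_metrics(response.read().decode("utf-8"))
```

Calling `fetch_metrics()` while the plugin is running returns a list of samples; an empty list or a connection error suggests the plugin is not serving on that port.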

On the management computer, install Prometheus from https://github.com/prometheus/prometheus

Edit prometheus_targets.json to add the compute nodes running the plugin, then start Prometheus:

$ prometheus --config.file=<full path of the rdc_prometheus_example.yml>
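
For clusters with many nodes, the targets file can be generated rather than edited by hand. The sketch below assumes prometheus_targets.json follows Prometheus's file-based service discovery format (a JSON list of objects, each with a `targets` array and optional `labels`). The host names are placeholders, port 5000 is taken from the curl check above, and `build_targets` and `write_targets` are hypothetical helpers.

```python
import json


def build_targets(hosts, port=5000, labels=None):
    """Return a file_sd-style target list for the given compute nodes."""
    group = {"targets": [f"{host}:{port}" for host in hosts]}
    if labels:
        group["labels"] = dict(labels)
    return [group]


def write_targets(path, hosts, port=5000, labels=None):
    """Write the target list as JSON to path."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(build_targets(hosts, port, labels), handle, indent=2)
        handle.write("\n")


if __name__ == "__main__":
    # Placeholder host names; replace with the nodes running rdc_prometheus.py.
    nodes = ["node1.example.com", "node2.example.com"]
    print(json.dumps(build_targets(nodes), indent=2))
```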

On the management computer, browse to localhost:9090 to view the RDC metrics.
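
Metrics can also be read programmatically through the Prometheus HTTP API instead of the browser. The sketch below builds an instant-query URL for `/api/v1/query` and extracts the values from the JSON response. It assumes Prometheus listens on localhost:9090 and that the query returns a vector result. The function names are illustrative, and `up` is used as the example expression because it is a built-in Prometheus series; substitute a metric name that appears in the plugin output.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def query_url(expr, base="http://localhost:9090"):
    """Build the instant-query URL for a PromQL expression."""
    return f"{base}/api/v1/query?{urlencode({'query': expr})}"


def extract_values(payload):
    """Return (labels, value) pairs from an instant-query vector response."""
    if payload.get("status") != "success":
        raise RuntimeError(f"query failed: {payload.get('error', 'unknown error')}")
    results = []
    for series in payload["data"]["result"]:
        _timestamp, value = series["value"]  # the value arrives as a string
        results.append((series["metric"], float(value)))
    return results


def query(expr, base="http://localhost:9090", timeout=5):
    """Run an instant query against a live Prometheus server."""
    with urlopen(query_url(expr, base), timeout=timeout) as response:
        return extract_values(json.load(response))


if __name__ == "__main__":
    for labels, value in query("up"):
        print(labels, value)
```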