İşleme Grafiği

5 İşleme

Yazar SHA1 Mesaj Tarih
Galantsev, Dmitrii 028355dff0 SWDEV-439576 - rocmsmi -> amdsmi
- Migrate to amdsmi library
- NOTE: raslib still uses rocmsmi
- Remove unused rocmsmi service
- Remove unused RDC client code
- Remove RSMI calls from protos/rdc.proto

Change-Id: Ifc34a264c506b0ec5792307ee56b34526268762d
Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>


[ROCm/rdc commit: 9702d0f2d7]
2024-04-09 20:19:28 -05:00
Bill(Shuzhou) Liu d1efa59fe8 Fallback to junction temperature and socket power
If the card does not have edge temperature, fallback to junction
temperature. If the card only have socket power, then use socket
power instead.

Change-Id: I053a67a89cf3b29a34e82123f522c08d7dd68916


[ROCm/rdc commit: 5cfe2b4169]
2024-02-05 10:10:26 -06:00
Bill(Shuzhou) Liu e16a8bcaf5 Identify GPUs using PCI device identifier in RDC Prometheus plugin
Add a new option --enable_pci_id to Prometheus plugin, which will map
the GPU index to the PCI Device Identifier.

Change-Id: I38a2a7e4841975da095391002397d4515ffb8e0d


[ROCm/rdc commit: 23ab2c0671]
2022-05-05 09:16:05 -04:00
Bill(Shuzhou) Liu d53a9c4e21 RDC Prometheus plugin return errors when use the --rdc_gpu_indexes
When above option is used, the plugin returns errors:
  result = rdc.rdc_group_gpu_add(rdc_handle, gpu_group_id, gpu)
  ctypes.ArgumentError: argument 3: <type 'exceptions.TypeError'>: wrong type

The rdc_prometheus.py is changed to convert string to integer.
The RdcUtil.py is also changed to raise Exception properly.

Change-Id: I9535091ff1fc8882cccd32e5f2810da5241768c3


[ROCm/rdc commit: 7ca7a571a7]
2021-02-23 14:15:04 -05:00
Bill(Shuzhou) Liu b91560f0a8 RDC Prometheus plugin
The rdc_prometheus.py is a Prometheus plugin for RDC
The rdc_prometheus_example.yml and prometheus_targets.json are
example Prometheus configuration. If there are multiple compute
nodes, they can be defined at prometheus_targets.json.

Change-Id: I3611b1e8a166f6608351f6e7644808bf72a4d3a0


[ROCm/rdc commit: 9c7a1347ea]
2020-08-17 14:09:37 -05:00