[SWDEV-463402] - Support retrieving connection type and P2P capabilities between two GPUs

1. Add an API interface, amdsmi_topo_get_p2p_status, to retrieve the
connection type and P2P capabilities between two GPUs.

2. Add a P2P status test to hw_topology_read
that prints P2P capability information.

3. Add the following tables to the CLI topology subcommand:
  - CACHE COHERANCY TABLE
  - ATOMICS TABLE
  - DMA TABLE
  - BI-DIRECTIONAL TABLE

Change-Id: I199173030d4170115cea27c472958a4826e4e1bf
Signed-off-by: Tim Huang <tim.huang@amd.com>


[ROCm/amdsmi commit: 260edaa752]
This commit is contained in:
Tim Huang
2024-08-21 11:26:36 +08:00
committed by Charis Poag
parent 104bf22265
commit 202ddc01aa
13 changed files with 842 additions and 11 deletions
+157
@@ -19,6 +19,163 @@ Added `amdsmi_get_gpu_mem_overdrive_level()` function to amd-smi C and Python Li
- **Added Subsystem Device ID to `amd-smi static --asic`**.
No underlying changes to amdsmi_get_gpu_asic_info
- **Added retrieving connection type and P2P capabilities between two GPUs**.
- Added `amdsmi_topo_get_p2p_status` function to amd-smi C and Python Libraries.
- Added retrieving P2P link capabilities to CLI `amd-smi topology`.
```shell
$ amd-smi topology -h
usage: amd-smi topology [-h] [--json | --csv] [--file FILE] [--loglevel LEVEL]
[-g GPU [GPU ...]] [-a] [-w] [-o] [-t] [-b]
If no GPU is specified, returns information for all GPUs on the system.
If no topology argument is provided all topology information will be displayed.
Topology arguments:
-h, --help show this help message and exit
-g, --gpu GPU [GPU ...] Select a GPU ID, BDF, or UUID from the possible choices:
ID: 0 | BDF: 0000:0c:00.0 | UUID: 5fff74a1-0000-1000-808c-324a4d24b37e
ID: 1 | BDF: 0000:22:00.0 | UUID: 06ff74a1-0000-1000-80d3-f5e97636ae62
ID: 2 | BDF: 0000:38:00.0 | UUID: 87ff74a1-0000-1000-80a0-d0a45576c5ed
ID: 3 | BDF: 0000:5c:00.0 | UUID: 5dff74a1-0000-1000-8054-a29c595fd7f3
ID: 4 | BDF: 0000:9f:00.0 | UUID: a8ff74a1-0000-1000-805b-92615ca9e7b4
ID: 5 | BDF: 0000:af:00.0 | UUID: ddff74a1-0000-1000-809e-5a98a60013bd
ID: 6 | BDF: 0000:bf:00.0 | UUID: 9aff74a1-0000-1000-80e8-cbefaf9f72c3
ID: 7 | BDF: 0000:df:00.0 | UUID: 48ff74a1-0000-1000-806e-3c0b30d78e00
all | Selects all devices
-a, --access Displays link accessibility between GPUs
-w, --weight Displays relative weight between GPUs
-o, --hops Displays the number of hops between GPUs
-t, --link-type Displays the link type between GPUs
-b, --numa-bw Display max and min bandwidth between nodes
-c, --coherent Display cache coherant (or non-coherant) link capability between nodes
-n, --atomics Display 32 and 64-bit atomic io link capability between nodes
-d, --dma Display P2P direct memory access (DMA) link capability between nodes
-z, --bi-dir Display P2P bi-directional link capability between nodes
Command Modifiers:
--json Displays output in JSON format (human readable by default).
--csv Displays output in CSV format (human readable by default).
--file FILE Saves output into a file on the provided path (stdout by default).
--loglevel LEVEL Set the logging level from the possible choices:
DEBUG, INFO, WARNING, ERROR, CRITICAL
```
```shell
$ amd-smi topology
ACCESS TABLE:
0000:0c:00.0 0000:22:00.0 0000:38:00.0 0000:5c:00.0 0000:9f:00.0 0000:af:00.0 0000:bf:00.0 0000:df:00.0
0000:0c:00.0 ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED
0000:22:00.0 ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED
0000:38:00.0 ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED
0000:5c:00.0 ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED
0000:9f:00.0 ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED
0000:af:00.0 ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED
0000:bf:00.0 ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED
0000:df:00.0 ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED ENABLED
WEIGHT TABLE:
0000:0c:00.0 0000:22:00.0 0000:38:00.0 0000:5c:00.0 0000:9f:00.0 0000:af:00.0 0000:bf:00.0 0000:df:00.0
0000:0c:00.0 0 15 15 15 15 15 15 15
0000:22:00.0 15 0 15 15 15 15 15 15
0000:38:00.0 15 15 0 15 15 15 15 15
0000:5c:00.0 15 15 15 0 15 15 15 15
0000:9f:00.0 15 15 15 15 0 15 15 15
0000:af:00.0 15 15 15 15 15 0 15 15
0000:bf:00.0 15 15 15 15 15 15 0 15
0000:df:00.0 15 15 15 15 15 15 15 0
HOPS TABLE:
0000:0c:00.0 0000:22:00.0 0000:38:00.0 0000:5c:00.0 0000:9f:00.0 0000:af:00.0 0000:bf:00.0 0000:df:00.0
0000:0c:00.0 0 1 1 1 1 1 1 1
0000:22:00.0 1 0 1 1 1 1 1 1
0000:38:00.0 1 1 0 1 1 1 1 1
0000:5c:00.0 1 1 1 0 1 1 1 1
0000:9f:00.0 1 1 1 1 0 1 1 1
0000:af:00.0 1 1 1 1 1 0 1 1
0000:bf:00.0 1 1 1 1 1 1 0 1
0000:df:00.0 1 1 1 1 1 1 1 0
LINK TYPE TABLE:
0000:0c:00.0 0000:22:00.0 0000:38:00.0 0000:5c:00.0 0000:9f:00.0 0000:af:00.0 0000:bf:00.0 0000:df:00.0
0000:0c:00.0 SELF XGMI XGMI XGMI XGMI XGMI XGMI XGMI
0000:22:00.0 XGMI SELF XGMI XGMI XGMI XGMI XGMI XGMI
0000:38:00.0 XGMI XGMI SELF XGMI XGMI XGMI XGMI XGMI
0000:5c:00.0 XGMI XGMI XGMI SELF XGMI XGMI XGMI XGMI
0000:9f:00.0 XGMI XGMI XGMI XGMI SELF XGMI XGMI XGMI
0000:af:00.0 XGMI XGMI XGMI XGMI XGMI SELF XGMI XGMI
0000:bf:00.0 XGMI XGMI XGMI XGMI XGMI XGMI SELF XGMI
0000:df:00.0 XGMI XGMI XGMI XGMI XGMI XGMI XGMI SELF
NUMA BW TABLE:
0000:0c:00.0 0000:22:00.0 0000:38:00.0 0000:5c:00.0 0000:9f:00.0 0000:af:00.0 0000:bf:00.0 0000:df:00.0
0000:0c:00.0 N/A 50000-50000 50000-50000 50000-50000 50000-50000 50000-50000 50000-50000 50000-50000
0000:22:00.0 50000-50000 N/A 50000-50000 50000-50000 50000-50000 50000-50000 50000-50000 50000-50000
0000:38:00.0 50000-50000 50000-50000 N/A 50000-50000 50000-50000 50000-50000 50000-50000 50000-50000
0000:5c:00.0 50000-50000 50000-50000 50000-50000 N/A 50000-50000 50000-50000 50000-50000 50000-50000
0000:9f:00.0 50000-50000 50000-50000 50000-50000 50000-50000 N/A 50000-50000 50000-50000 50000-50000
0000:af:00.0 50000-50000 50000-50000 50000-50000 50000-50000 50000-50000 N/A 50000-50000 50000-50000
0000:bf:00.0 50000-50000 50000-50000 50000-50000 50000-50000 50000-50000 50000-50000 N/A 50000-50000
0000:df:00.0 50000-50000 50000-50000 50000-50000 50000-50000 50000-50000 50000-50000 50000-50000 N/A
CACHE COHERANCY TABLE:
0000:0c:00.0 0000:22:00.0 0000:38:00.0 0000:5c:00.0 0000:9f:00.0 0000:af:00.0 0000:bf:00.0 0000:df:00.0
0000:0c:00.0 SELF C NC NC C C C NC
0000:22:00.0 C SELF NC C C C NC C
0000:38:00.0 NC NC SELF C C NC C NC
0000:5c:00.0 NC C C SELF NC C NC NC
0000:9f:00.0 C C C NC SELF NC NC C
0000:af:00.0 C C NC C NC SELF C C
0000:bf:00.0 C NC C NC NC C SELF NC
0000:df:00.0 NC C NC NC C C NC SELF
ATOMICS TABLE:
0000:0c:00.0 0000:22:00.0 0000:38:00.0 0000:5c:00.0 0000:9f:00.0 0000:af:00.0 0000:bf:00.0 0000:df:00.0
0000:0c:00.0 SELF 64,32 64,32 64 32 32 N/A 64,32
0000:22:00.0 64,32 SELF 64 32 32 N/A 64,32 64,32
0000:38:00.0 64,32 64 SELF 32 N/A 64,32 64,32 64,32
0000:5c:00.0 64 32 32 SELF 64,32 64,32 64,32 32
0000:9f:00.0 32 32 N/A 64,32 SELF 64,32 32 32
0000:af:00.0 32 N/A 64,32 64,32 64,32 SELF 32 N/A
0000:bf:00.0 N/A 64,32 64,32 64,32 32 32 SELF 64,32
0000:df:00.0 64,32 64,32 64,32 32 32 N/A 64,32 SELF
DMA TABLE:
0000:0c:00.0 0000:22:00.0 0000:38:00.0 0000:5c:00.0 0000:9f:00.0 0000:af:00.0 0000:bf:00.0 0000:df:00.0
0000:0c:00.0 SELF T T F F T F T
0000:22:00.0 T SELF F F T F T T
0000:38:00.0 T F SELF T F T T T
0000:5c:00.0 F F T SELF T T T F
0000:9f:00.0 F T F T SELF T F F
0000:af:00.0 T F T T T SELF F T
0000:bf:00.0 F T T T F F SELF F
0000:df:00.0 T T T F F T F SELF
BI-DIRECTIONAL TABLE:
0000:0c:00.0 0000:22:00.0 0000:38:00.0 0000:5c:00.0 0000:9f:00.0 0000:af:00.0 0000:bf:00.0 0000:df:00.0
0000:0c:00.0 SELF T T F F T F T
0000:22:00.0 T SELF F F T F T T
0000:38:00.0 T F SELF T F T T T
0000:5c:00.0 F F T SELF T T T F
0000:9f:00.0 F T F T SELF T F F
0000:af:00.0 T F T T T SELF F T
0000:bf:00.0 F T T T F F SELF F
0000:df:00.0 T T T F F T F SELF
Legend:
SELF = Current GPU
ENABLED / DISABLED = Link is enabled or disabled
N/A = Not supported
T/F = True / False
C/NC = Coherant / Non-Coherant io links
64,32 = 64 bit and 32 bit atomic support
<BW from>-<BW to>
```
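The cell labels in the four new tables follow from the fields of the `cap` dictionary returned by `amdsmi_topo_get_p2p_status`. The following is a minimal sketch of that mapping, assuming the field names added in this change and that any value other than 0 or 1 (for example `UINT8_MAX`) means "not defined". `cap_to_cells` is a hypothetical helper for illustration, not part of the library, and the sample values are hand-written rather than read from hardware.

```python
def cap_to_cells(cap):
    """Map a P2P capability dict to the cell labels used in the tables."""

    def tri_state(value, yes, no):
        # 1 = true, 0 = false, anything else (e.g. UINT8_MAX) = not defined
        return yes if value == 1 else no if value == 0 else "N/A"

    has_32 = cap["is_iolink_atomics_32bit"] == 1
    has_64 = cap["is_iolink_atomics_64bit"] == 1
    if has_32 and has_64:
        atomics = "64,32"
    elif has_32:
        atomics = "32"
    elif has_64:
        atomics = "64"
    else:
        atomics = "N/A"

    return {
        "coherent": tri_state(cap["is_iolink_coherent"], "C", "NC"),
        "atomics": atomics,
        "dma": tri_state(cap["is_iolink_dma"], "T", "F"),
        "bi_dir": tri_state(cap["is_iolink_bi_directional"], "T", "F"),
    }


# Hand-written sample values, not output from real hardware.
sample = {
    "is_iolink_coherent": 1,
    "is_iolink_atomics_32bit": 1,
    "is_iolink_atomics_64bit": 0,
    "is_iolink_dma": 255,
    "is_iolink_bi_directional": 0,
}
print(cap_to_cells(sample))
```

With a real device pair, the input would be `amdsmi_topo_get_p2p_status(src, dst)['cap']`.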
### Removals
- N/A
+239 -3
@@ -2822,7 +2822,8 @@ class AMDSMICommands():
def topology(self, args, multiple_devices=False, gpu=None, access=None,
weight=None, hops=None, link_type=None, numa_bw=None):
weight=None, hops=None, link_type=None, numa_bw=None,
coherent=None, atomics=None, dma=None, bi_dir=None):
""" Get topology information for target gpus
params:
args - argparser args to pass to subcommand
@@ -2833,6 +2834,10 @@ class AMDSMICommands():
hops (bool) - Value override for args.hops
link_type (bool) - Value override for args.link_type
numa_bw (bool) - Value override for args.numa_bw
coherent (bool) - Value override for args.coherent
atomics (bool) - Value override for args.atomics
dma (bool) - Value override for args.dma
bi_dir (bool) - Value override for args.bi_dir
return:
Nothing
"""
@@ -2849,6 +2854,14 @@ class AMDSMICommands():
args.link_type = link_type
if numa_bw:
args.numa_bw = numa_bw
if coherent:
args.coherent = coherent
if atomics:
args.atomics = atomics
if dma:
args.dma = dma
if bi_dir:
args.bi_dir = bi_dir
# Handle No GPU passed
if args.gpu is None:
@@ -2858,8 +2871,10 @@ class AMDSMICommands():
args.gpu = [args.gpu]
# Handle all args being false
if not any([args.access, args.weight, args.hops, args.link_type, args.numa_bw]):
args.access = args.weight = args.hops = args.link_type= args.numa_bw = True
if not any([args.access, args.weight, args.hops, args.link_type, args.numa_bw,
args.coherent, args.atomics, args.dma, args.bi_dir]):
args.access = args.weight = args.hops = args.link_type= args.numa_bw = \
args.coherent = args.atomics = args.dma = args.bi_dir = True
# Clear the table header
self.logger.table_header = ''.rjust(12)
@@ -2890,6 +2905,10 @@ class AMDSMICommands():
# "num_hops": num_hops - # of hops between devices
# "bandwidth": numa_bw - The NUMA "minimum bandwidth-maximum bandwidth" between src and dest nodes
# "N/A" - self node or not connected devices
# "coherent": coherent - Coherent / Non-coherent IO links
# "atomics": atomics - 32 and 64-bit atomic io link capability between nodes
# "dma": dma - P2P direct memory access (DMA) link capability between nodes
# "bi_dir": bi_dir - P2P bi-directional link capability between nodes
# }
for dest_gpu_index, dest_gpu in enumerate(args.gpu):
@@ -2928,6 +2947,42 @@ class AMDSMICommands():
else:
link_status = "DISABLED"
link_coherent = "SELF"
link_atomics = "SELF"
link_dma = "SELF"
link_bi_dir = "SELF"
if src_gpu != dest_gpu:
try:
cap = amdsmi_interface.amdsmi_topo_get_p2p_status(src_gpu, dest_gpu)['cap']
link_coherent = (
"C" if cap['is_iolink_coherent'] == 1 else
"NC" if cap['is_iolink_coherent'] == 0 else
"N/A"
)
link_atomics = (
"64,32" if cap['is_iolink_atomics_32bit'] == 1 and cap['is_iolink_atomics_64bit'] == 1 else
"32" if cap['is_iolink_atomics_32bit'] == 1 else
"64" if cap['is_iolink_atomics_64bit'] == 1 else
"N/A"
)
link_dma = (
"T" if cap['is_iolink_dma'] == 1 else
"F" if cap['is_iolink_dma'] == 0 else
"N/A"
)
link_bi_dir = (
"T" if cap['is_iolink_bi_directional'] == 1 else
"F" if cap['is_iolink_bi_directional'] == 0 else
"N/A"
)
except amdsmi_exception.AmdSmiLibraryException as e:
logging.debug("Failed to get link status for %s to %s | %s",
self.helpers.get_gpu_id_from_device_handle(src_gpu),
self.helpers.get_gpu_id_from_device_handle(dest_gpu),
e.get_error_info())
# link_status = amdsmi_is_P2P_accessible(src,dest)
dest_gpu_links = {
"gpu": self.helpers.get_gpu_id_from_device_handle(dest_gpu),
@@ -2937,6 +2992,10 @@ class AMDSMICommands():
"link_type": link_type,
"num_hops": num_hops,
"bandwidth": numa_bw,
"coherent": link_coherent,
"atomics": link_atomics,
"dma": link_dma,
"bi_dir": link_bi_dir
}
if not args.access:
del dest_gpu_links['link_status']
@@ -2948,6 +3007,14 @@ class AMDSMICommands():
del dest_gpu_links['num_hops']
if not args.numa_bw:
del dest_gpu_links['bandwidth']
if not args.coherent:
del dest_gpu_links['coherent']
if not args.atomics:
del dest_gpu_links['atomics']
if not args.dma:
del dest_gpu_links['dma']
if not args.bi_dir:
del dest_gpu_links['bi_dir']
links.append(dest_gpu_links)
dest_end = dest_gpu_index+1 == len(args.gpu)
isEndOfSrc = src_gpu_index+1 == len(args.gpu)
@@ -3165,6 +3232,175 @@ class AMDSMICommands():
self.logger.table_title = "NUMA BW TABLE"
self.logger.print_output(multiple_device_enabled=True, tabular=True)
if args.coherent:
tabular_output = []
for src_gpu_index, src_gpu in enumerate(args.gpu):
src_gpu_bdf = amdsmi_interface.amdsmi_get_gpu_device_bdf(src_gpu)
if self.logger.is_human_readable_format():
tabular_output_dict = {'gpu' : f"{src_gpu_bdf} "}
else:
tabular_output_dict = {'gpu' : src_gpu_bdf}
src_gpu_coherent = {}
for dest_gpu in args.gpu:
dest_gpu_id = self.helpers.get_gpu_id_from_device_handle(dest_gpu)
dest_gpu_key = f'gpu_{dest_gpu_id}'
if src_gpu == dest_gpu:
src_gpu_coherent[dest_gpu_key] = "SELF"
continue
try:
iolink_coherent = amdsmi_interface.amdsmi_topo_get_p2p_status(src_gpu, dest_gpu)['cap']['is_iolink_coherent']
src_gpu_coherent[dest_gpu_key] = "C" if iolink_coherent == 1 else "NC" if iolink_coherent == 0 else "N/A"
except amdsmi_exception.AmdSmiLibraryException as e:
src_gpu_coherent[dest_gpu_key] = "N/A"
logging.debug("Failed to get link coherent for %s to %s | %s",
self.helpers.get_gpu_id_from_device_handle(src_gpu),
self.helpers.get_gpu_id_from_device_handle(dest_gpu),
e.get_error_info())
topo_values[src_gpu_index]['coherent'] = src_gpu_coherent
tabular_output_dict.update(src_gpu_coherent)
tabular_output.append(tabular_output_dict)
if self.logger.is_human_readable_format():
self.logger.multiple_device_output = tabular_output
self.logger.table_title = "CACHE COHERANCY TABLE"
self.logger.print_output(multiple_device_enabled=True, tabular=True)
if args.atomics:
tabular_output = []
for src_gpu_index, src_gpu in enumerate(args.gpu):
src_gpu_bdf = amdsmi_interface.amdsmi_get_gpu_device_bdf(src_gpu)
if self.logger.is_human_readable_format():
tabular_output_dict = {'gpu' : f"{src_gpu_bdf} "}
else:
tabular_output_dict = {'gpu' : src_gpu_bdf}
src_gpu_atomics = {}
for dest_gpu in args.gpu:
dest_gpu_id = self.helpers.get_gpu_id_from_device_handle(dest_gpu)
dest_gpu_key = f'gpu_{dest_gpu_id}'
if src_gpu == dest_gpu:
src_gpu_atomics[dest_gpu_key] = "SELF"
continue
try:
cap = amdsmi_interface.amdsmi_topo_get_p2p_status(src_gpu, dest_gpu)['cap']
src_gpu_atomics[dest_gpu_key] = (
"64,32" if cap['is_iolink_atomics_32bit'] == 1 and cap['is_iolink_atomics_64bit'] == 1 else
"32" if cap['is_iolink_atomics_32bit'] == 1 else
"64" if cap['is_iolink_atomics_64bit'] == 1 else
"N/A"
)
except amdsmi_exception.AmdSmiLibraryException as e:
src_gpu_atomics[dest_gpu_key] = "N/A"
logging.debug("Failed to get link atomics for %s to %s | %s",
self.helpers.get_gpu_id_from_device_handle(src_gpu),
self.helpers.get_gpu_id_from_device_handle(dest_gpu),
e.get_error_info())
topo_values[src_gpu_index]['atomics'] = src_gpu_atomics
tabular_output_dict.update(src_gpu_atomics)
tabular_output.append(tabular_output_dict)
if self.logger.is_human_readable_format():
self.logger.multiple_device_output = tabular_output
self.logger.table_title = "ATOMICS TABLE"
self.logger.print_output(multiple_device_enabled=True, tabular=True)
if args.dma:
tabular_output = []
for src_gpu_index, src_gpu in enumerate(args.gpu):
src_gpu_bdf = amdsmi_interface.amdsmi_get_gpu_device_bdf(src_gpu)
if self.logger.is_human_readable_format():
tabular_output_dict = {'gpu' : f"{src_gpu_bdf} "}
else:
tabular_output_dict = {'gpu' : src_gpu_bdf}
src_gpu_dma = {}
for dest_gpu in args.gpu:
dest_gpu_id = self.helpers.get_gpu_id_from_device_handle(dest_gpu)
dest_gpu_key = f'gpu_{dest_gpu_id}'
if src_gpu == dest_gpu:
src_gpu_dma[dest_gpu_key] = "SELF"
continue
try:
iolink_dma = amdsmi_interface.amdsmi_topo_get_p2p_status(src_gpu, dest_gpu)['cap']['is_iolink_dma']
src_gpu_dma[dest_gpu_key] = "T" if iolink_dma == 1 else "F" if iolink_dma == 0 else "N/A"
except amdsmi_exception.AmdSmiLibraryException as e:
src_gpu_dma[dest_gpu_key] = "N/A"
logging.debug("Failed to get link dma for %s to %s | %s",
self.helpers.get_gpu_id_from_device_handle(src_gpu),
self.helpers.get_gpu_id_from_device_handle(dest_gpu),
e.get_error_info())
topo_values[src_gpu_index]['dma'] = src_gpu_dma
tabular_output_dict.update(src_gpu_dma)
tabular_output.append(tabular_output_dict)
if self.logger.is_human_readable_format():
self.logger.multiple_device_output = tabular_output
self.logger.table_title = "DMA TABLE"
self.logger.print_output(multiple_device_enabled=True, tabular=True)
if args.bi_dir:
tabular_output = []
for src_gpu_index, src_gpu in enumerate(args.gpu):
src_gpu_bdf = amdsmi_interface.amdsmi_get_gpu_device_bdf(src_gpu)
if self.logger.is_human_readable_format():
tabular_output_dict = {'gpu' : f"{src_gpu_bdf} "}
else:
tabular_output_dict = {'gpu' : src_gpu_bdf}
src_gpu_bi_dir = {}
for dest_gpu in args.gpu:
dest_gpu_id = self.helpers.get_gpu_id_from_device_handle(dest_gpu)
dest_gpu_key = f'gpu_{dest_gpu_id}'
if src_gpu == dest_gpu:
src_gpu_bi_dir[dest_gpu_key] = "SELF"
continue
try:
iolink_bi_dir = amdsmi_interface.amdsmi_topo_get_p2p_status(src_gpu, dest_gpu)['cap']['is_iolink_bi_directional']
src_gpu_bi_dir[dest_gpu_key] = "T" if iolink_bi_dir == 1 else "F" if iolink_bi_dir == 0 else "N/A"
except amdsmi_exception.AmdSmiLibraryException as e:
src_gpu_bi_dir[dest_gpu_key] = "N/A"
logging.debug("Failed to get link bi-directional for %s to %s | %s",
self.helpers.get_gpu_id_from_device_handle(src_gpu),
self.helpers.get_gpu_id_from_device_handle(dest_gpu),
e.get_error_info())
topo_values[src_gpu_index]['bi_dir'] = src_gpu_bi_dir
tabular_output_dict.update(src_gpu_bi_dir)
tabular_output.append(tabular_output_dict)
if self.logger.is_human_readable_format():
self.logger.multiple_device_output = tabular_output
self.logger.table_title = "BI-DIRECTIONAL TABLE"
self.logger.print_output(multiple_device_enabled=True, tabular=True)
if self.logger.is_human_readable_format():
# Populate the legend output
legend_parts = [
"\n\nLegend:",
" SELF = Current GPU",
" ENABLED / DISABLED = Link is enabled or disabled",
" N/A = Not supported",
" T/F = True / False",
" C/NC = Coherant / Non-Coherant io links",
" 64,32 = 64 bit and 32 bit atomic support",
" <BW from>-<BW to>"
]
legend_output = "\n".join(legend_parts)
if self.logger.destination == 'stdout':
print(legend_output)
else:
with self.logger.destination.open('a', encoding="utf-8") as output_file:
output_file.write(legend_output + '\n')
self.logger.multiple_device_output = topo_values
if self.logger.is_csv_format():
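The coherent, atomics, DMA and bi-directional blocks above share one shape: loop over source GPUs, loop over destination GPUs, write "SELF" on the diagonal, and fall back to "N/A" when the query raises. A minimal sketch of that shared loop, under stated assumptions: `build_table`, `fake_dma` and the string handles are hypothetical, `RuntimeError` stands in for `AmdSmiLibraryException`, and the `gpu` column holds the GPU ID here rather than the BDF the CLI prints.

```python
def build_table(gpus, gpu_id, cell):
    """Return one row dict per source GPU, keyed like the CLI's tabular rows."""
    rows = []
    for src in gpus:
        row = {"gpu": gpu_id(src)}
        for dst in gpus:
            key = f"gpu_{gpu_id(dst)}"
            if src == dst:
                row[key] = "SELF"
                continue
            try:
                row[key] = cell(src, dst)
            except RuntimeError:  # stand-in for AmdSmiLibraryException
                row[key] = "N/A"
        rows.append(row)
    return rows


# Fake handles and a fake query, for illustration only.
handles = ["h0", "h1", "h2"]


def fake_dma(src, dst):
    if "h2" in (src, dst):
        raise RuntimeError("query failed")
    return "T"


table = build_table(handles, handles.index, fake_dma)
for row in table:
    print(row)
```

Each of the four tables would then differ only in the `cell` callable it passes in.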
+8
@@ -972,6 +972,10 @@ class AMDSMIParser(argparse.ArgumentParser):
hops_help = "Displays the number of hops between GPUs"
link_type_help = "Displays the link type between GPUs"
numa_bw_help = "Display max and min bandwidth between nodes"
coherent_help = "Display cache coherant (or non-coherant) link capability between nodes"
atomics_help = "Display 32 and 64-bit atomic io link capability between nodes"
dma_help = "Display P2P direct memory access (DMA) link capability between nodes"
bi_dir_help = "Display P2P bi-directional link capability between nodes"
# Create topology subparser
topology_parser = subparsers.add_parser('topology', help=topology_help, description=topology_subcommand_help)
@@ -989,6 +993,10 @@ class AMDSMIParser(argparse.ArgumentParser):
topology_parser.add_argument('-o', '--hops', action='store_true', required=False, help=hops_help)
topology_parser.add_argument('-t', '--link-type', action='store_true', required=False, help=link_type_help)
topology_parser.add_argument('-b', '--numa-bw', action='store_true', required=False, help=numa_bw_help)
topology_parser.add_argument('-c', '--coherent', action='store_true', required=False, help=coherent_help)
topology_parser.add_argument('-n', '--atomics', action='store_true', required=False, help=atomics_help)
topology_parser.add_argument('-d', '--dma', action='store_true', required=False, help=dma_help)
topology_parser.add_argument('-z', '--bi-dir', action='store_true', required=False, help=bi_dir_help)
def _add_set_value_parser(self, subparsers, func):
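The four new switches combine with the existing "no flag means every table" rule in `topology()`. A minimal standalone sketch of that behaviour, assuming only the flag names shown in this hunk; `make_parser`, `parse` and the `topology-sketch` program name are hypothetical and not part of the CLI.

```python
import argparse

# Attribute names argparse derives from the long options (dashes become underscores).
FLAGS = ("access", "weight", "hops", "link_type", "numa_bw",
         "coherent", "atomics", "dma", "bi_dir")


def make_parser():
    parser = argparse.ArgumentParser(prog="topology-sketch")
    for short, long_opt in (("-a", "--access"), ("-w", "--weight"),
                            ("-o", "--hops"), ("-t", "--link-type"),
                            ("-b", "--numa-bw"), ("-c", "--coherent"),
                            ("-n", "--atomics"), ("-d", "--dma"),
                            ("-z", "--bi-dir")):
        parser.add_argument(short, long_opt, action="store_true")
    return parser


def parse(argv):
    args = make_parser().parse_args(argv)
    # No table selected: show all of them, as the CLI does.
    if not any(getattr(args, name) for name in FLAGS):
        for name in FLAGS:
            setattr(args, name, True)
    return args


print(parse(["-c", "--bi-dir"]))
print(parse([]))
```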
+41
@@ -690,6 +690,17 @@ typedef struct {
uint32_t reserved[4];
} amdsmi_proc_info_t;
/**
* @brief IO Link P2P Capability
*/
typedef struct {
uint8_t is_iolink_coherent; // 1 = true, 0 = false, UINT8_MAX = Not defined.
uint8_t is_iolink_atomics_32bit;
uint8_t is_iolink_atomics_64bit;
uint8_t is_iolink_dma;
uint8_t is_iolink_bi_directional;
} amdsmi_p2p_capability_t;
//! Guaranteed maximum possible number of supported frequencies
#define AMDSMI_MAX_NUM_FREQUENCIES 33
@@ -4283,6 +4294,36 @@ amdsmi_is_P2P_accessible(amdsmi_processor_handle processor_handle_src,
amdsmi_processor_handle processor_handle_dst,
bool *accessible);
/**
* @brief Retrieve connection type and P2P capabilities between 2 GPUs
*
* @platform{gpu_bm_linux} @platform{host} @platform{guest_1vf} @platform{guest_mvf}
*
* @details Given a source processor handle @p processor_handle_src,
* a destination processor handle @p processor_handle_dst, a pointer to an
* amdsmi_io_link_type_t @p type, and a pointer to an amdsmi_p2p_capability_t @p cap,
* this function writes the connection type and IO link capabilities between
* the devices @p processor_handle_src and @p processor_handle_dst to the memory
* pointed to by @p type and @p cap.
*
* @param[in] processor_handle_src the source processor handle
*
* @param[in] processor_handle_dst the destination processor handle
*
* @param[in,out] type A pointer to an ::amdsmi_io_link_type_t to which the
* type for the connection should be written.
*
* @param[in,out] cap A pointer to an ::amdsmi_p2p_capability_t to which the
* IO link capabilities should be written.
*
* @return ::amdsmi_status_t | ::AMDSMI_STATUS_SUCCESS on success, non-zero on fail
*/
amdsmi_status_t
amdsmi_topo_get_p2p_status(amdsmi_processor_handle processor_handle_src,
amdsmi_processor_handle processor_handle_dst,
amdsmi_io_link_type_t *type, amdsmi_p2p_capability_t *cap);
/** @} End HWTopo */
/*****************************************************************************/
+1
@@ -211,6 +211,7 @@ from .amdsmi_interface import amdsmi_topo_get_numa_node_number
from .amdsmi_interface import amdsmi_topo_get_link_weight
from .amdsmi_interface import amdsmi_get_minmax_bandwidth_between_processors
from .amdsmi_interface import amdsmi_topo_get_link_type
from .amdsmi_interface import amdsmi_topo_get_p2p_status
from .amdsmi_interface import amdsmi_is_P2P_accessible
from .amdsmi_interface import amdsmi_get_xgmi_info
+34
@@ -2540,6 +2540,40 @@ def amdsmi_topo_get_link_type(
return {"hops": hops.value, "type": type.value}
def amdsmi_topo_get_p2p_status(
processor_handle_src: amdsmi_wrapper.amdsmi_processor_handle,
processor_handle_dst: amdsmi_wrapper.amdsmi_processor_handle,
):
if not isinstance(processor_handle_src, amdsmi_wrapper.amdsmi_processor_handle):
raise AmdSmiParameterException(
processor_handle_src, amdsmi_wrapper.amdsmi_processor_handle
)
if not isinstance(processor_handle_dst, amdsmi_wrapper.amdsmi_processor_handle):
raise AmdSmiParameterException(
processor_handle_dst, amdsmi_wrapper.amdsmi_processor_handle
)
type = ctypes.c_uint32()
cap = amdsmi_wrapper.struct_amdsmi_p2p_capability_t()
_check_res(
amdsmi_wrapper.amdsmi_topo_get_p2p_status(
processor_handle_src, processor_handle_dst, ctypes.byref(type), ctypes.byref(cap)
)
)
return {
'type': type.value,
'cap': {
'is_iolink_coherent': cap.is_iolink_coherent,
'is_iolink_atomics_32bit': cap.is_iolink_atomics_32bit,
'is_iolink_atomics_64bit': cap.is_iolink_atomics_64bit,
'is_iolink_dma': cap.is_iolink_dma,
'is_iolink_bi_directional': cap.is_iolink_bi_directional
}
}
def amdsmi_is_P2P_accessible(
processor_handle_src: amdsmi_wrapper.amdsmi_processor_handle,
+21 -4
@@ -1065,6 +1065,19 @@ struct_amdsmi_proc_info_t._fields_ = [
]
amdsmi_proc_info_t = struct_amdsmi_proc_info_t
class struct_amdsmi_p2p_capability_t(Structure):
pass
struct_amdsmi_p2p_capability_t._pack_ = 1 # source:False
struct_amdsmi_p2p_capability_t._fields_ = [
('is_iolink_coherent', ctypes.c_ubyte),
('is_iolink_atomics_32bit', ctypes.c_ubyte),
('is_iolink_atomics_64bit', ctypes.c_ubyte),
('is_iolink_dma', ctypes.c_ubyte),
('is_iolink_bi_directional', ctypes.c_ubyte),
]
amdsmi_p2p_capability_t = struct_amdsmi_p2p_capability_t
# values for enumeration 'amdsmi_dev_perf_level_t'
amdsmi_dev_perf_level_t__enumvalues = {
@@ -2201,6 +2214,9 @@ amdsmi_topo_get_link_type.argtypes = [amdsmi_processor_handle, amdsmi_processor_
amdsmi_is_P2P_accessible = _libraries['libamd_smi.so'].amdsmi_is_P2P_accessible
amdsmi_is_P2P_accessible.restype = amdsmi_status_t
amdsmi_is_P2P_accessible.argtypes = [amdsmi_processor_handle, amdsmi_processor_handle, ctypes.POINTER(ctypes.c_bool)]
amdsmi_topo_get_p2p_status = _libraries['libamd_smi.so'].amdsmi_topo_get_p2p_status
amdsmi_topo_get_p2p_status.restype = amdsmi_status_t
amdsmi_topo_get_p2p_status.argtypes = [amdsmi_processor_handle, amdsmi_processor_handle, ctypes.POINTER(amdsmi_io_link_type_t), ctypes.POINTER(struct_amdsmi_p2p_capability_t)]
amdsmi_get_gpu_compute_partition = _libraries['libamd_smi.so'].amdsmi_get_gpu_compute_partition
amdsmi_get_gpu_compute_partition.restype = amdsmi_status_t
amdsmi_get_gpu_compute_partition.argtypes = [amdsmi_processor_handle, ctypes.POINTER(ctypes.c_char), uint32_t]
@@ -2726,9 +2742,9 @@ __all__ = \
'amdsmi_memory_partition_type_t', 'amdsmi_memory_type_t',
'amdsmi_mm_ip_t', 'amdsmi_name_value_t', 'amdsmi_od_vddc_point_t',
'amdsmi_od_volt_curve_t', 'amdsmi_od_volt_freq_data_t',
'amdsmi_pcie_bandwidth_t', 'amdsmi_pcie_info_t',
'amdsmi_power_cap_info_t', 'amdsmi_power_info_t',
'amdsmi_power_profile_preset_masks_t',
'amdsmi_p2p_capability_t', 'amdsmi_pcie_bandwidth_t',
'amdsmi_pcie_info_t', 'amdsmi_power_cap_info_t',
'amdsmi_power_info_t', 'amdsmi_power_profile_preset_masks_t',
'amdsmi_power_profile_status_t', 'amdsmi_power_type_t',
'amdsmi_proc_info_t', 'amdsmi_process_handle_t',
'amdsmi_process_info_t', 'amdsmi_processor_handle',
@@ -2761,7 +2777,7 @@ __all__ = \
'amdsmi_temp_range_refresh_rate_t', 'amdsmi_temperature_metric_t',
'amdsmi_temperature_type_t', 'amdsmi_topo_get_link_type',
'amdsmi_topo_get_link_weight', 'amdsmi_topo_get_numa_node_number',
'amdsmi_utilization_counter_t',
'amdsmi_topo_get_p2p_status', 'amdsmi_utilization_counter_t',
'amdsmi_utilization_counter_type_t', 'amdsmi_vbios_info_t',
'amdsmi_version_t', 'amdsmi_voltage_metric_t',
'amdsmi_voltage_type_t', 'amdsmi_vram_info_t',
@@ -2785,6 +2801,7 @@ __all__ = \
'struct_amdsmi_name_value_t', 'struct_amdsmi_od_vddc_point_t',
'struct_amdsmi_od_volt_curve_t',
'struct_amdsmi_od_volt_freq_data_t',
'struct_amdsmi_p2p_capability_t',
'struct_amdsmi_pcie_bandwidth_t', 'struct_amdsmi_pcie_info_t',
'struct_amdsmi_power_cap_info_t', 'struct_amdsmi_power_info_t',
'struct_amdsmi_power_profile_status_t',
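The `struct_amdsmi_p2p_capability_t` binding added in this file mirrors a C struct of five `uint8_t` fields. The sketch below shows how such a ctypes structure lays out in memory and how its fields read back as Python integers. `P2PCapability` and `as_dict` are hypothetical names, and the raw bytes are made up for illustration; nothing here calls into `libamd_smi.so`.

```python
import ctypes


class P2PCapability(ctypes.Structure):
    # Five one-byte fields need no padding, so the size is 5 bytes even
    # without the _pack_ = 1 setting used by the generated binding.
    _fields_ = [
        ("is_iolink_coherent", ctypes.c_ubyte),
        ("is_iolink_atomics_32bit", ctypes.c_ubyte),
        ("is_iolink_atomics_64bit", ctypes.c_ubyte),
        ("is_iolink_dma", ctypes.c_ubyte),
        ("is_iolink_bi_directional", ctypes.c_ubyte),
    ]


def as_dict(cap):
    """Copy every field into a plain dict of Python ints."""
    return {name: getattr(cap, name) for name, _ctype in cap._fields_}


# Made-up bytes in field order; 255 plays the role of UINT8_MAX.
raw = bytes([1, 1, 0, 255, 0])
cap = P2PCapability.from_buffer_copy(raw)
print(ctypes.sizeof(P2PCapability), as_dict(cap))
```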
+42
@@ -866,6 +866,17 @@ typedef struct {
typedef rsmi_frequencies_t rsmi_frequencies;
/// \endcond
/**
* @brief IO Link P2P Capability
*/
typedef struct {
uint8_t is_iolink_coherent; // 1 = true, 0 = false, UINT8_MAX = Not defined.
uint8_t is_iolink_atomics_32bit;
uint8_t is_iolink_atomics_64bit;
uint8_t is_iolink_dma;
uint8_t is_iolink_bi_directional;
} rsmi_p2p_capability_t;
/**
* @brief This structure holds information about the possible PCIe
* bandwidths. Specifically, the possible transfer rates and their
@@ -4326,6 +4337,37 @@ rsmi_status_t
rsmi_is_P2P_accessible(uint32_t dv_ind_src, uint32_t dv_ind_dst,
bool *accessible);
/**
* @brief Retrieve connection type and P2P capabilities between 2 GPUs
*
* @platform{gpu_bm_linux} @platform{host} @platform{guest_1vf} @platform{guest_mvf}
*
* @details Given a source device index @p dv_ind_src, a destination device
* index @p dv_ind_dst, a pointer to an RSMI_IO_LINK_TYPE @p type, and a
* pointer to an rsmi_p2p_capability_t @p cap, this function writes the
* connection type and IO link capabilities between the devices
* @p dv_ind_src and @p dv_ind_dst to the memory pointed to by
* @p type and @p cap.
*
* @param[in] dv_ind_src the source device index
*
* @param[in] dv_ind_dst the destination device index
*
* @param[in,out] type A pointer to an ::RSMI_IO_LINK_TYPE to which the
* type for the connection should be written.
*
* @param[in,out] cap A pointer to an ::rsmi_p2p_capability_t to which the
* io link capabilities should be written.
*
* @retval ::RSMI_STATUS_SUCCESS call was successful
* @retval ::RSMI_STATUS_INVALID_ARGS the provided arguments are not valid
* @retval ::RSMI_STATUS_NOT_SUPPORTED installed software or hardware does not
* support this function
*/
rsmi_status_t
rsmi_topo_get_p2p_status(uint32_t dv_ind_src, uint32_t dv_ind_dst,
RSMI_IO_LINK_TYPE *type, rsmi_p2p_capability_t *cap);
/** @} */ // end of HWTopo
/*****************************************************************************/
+8 -2
@@ -85,7 +85,8 @@ typedef enum _LINK_DIRECTORY_TYPE {
class IOLink {
public:
explicit IOLink(uint32_t node_indx, uint32_t link_indx, LINK_DIRECTORY_TYPE link_dir_type) :
node_indx_(node_indx), link_indx_(link_indx), link_dir_type_(link_dir_type) {}
node_indx_(node_indx), link_indx_(link_indx), link_dir_type_(link_dir_type),
link_cap_{UINT8_MAX, UINT8_MAX, UINT8_MAX, UINT8_MAX, UINT8_MAX} {}
~IOLink();
int Initialize();
@@ -96,23 +97,28 @@ class IOLink {
IO_LINK_TYPE type(void) const {return type_;}
uint32_t node_from(void) const {return node_from_;}
uint32_t node_to(void) const {return node_to_;}
uint32_t flag(void) const {return flags_;}
uint64_t weight(void) const {return weight_;}
LINK_DIRECTORY_TYPE get_directory_type(void) const {return link_dir_type_;}
uint64_t min_bandwidth(void) const {return min_bandwidth_;}
uint64_t max_bandwidth(void) const {return max_bandwidth_;}
const rsmi_p2p_capability_t& get_link_capability(void) const {return link_cap_;}
protected:
virtual int UpdateP2pCapability(void);
private:
uint32_t node_indx_;
uint32_t link_indx_;
IO_LINK_TYPE type_;
uint32_t node_from_;
uint32_t node_to_;
uint32_t flags_;
uint64_t weight_;
uint64_t min_bandwidth_;
uint64_t max_bandwidth_;
std::map<std::string, uint64_t> properties_;
LINK_DIRECTORY_TYPE link_dir_type_;
rsmi_p2p_capability_t link_cap_;
};
int
+75
@@ -5285,6 +5285,81 @@ rsmi_is_P2P_accessible(uint32_t dv_ind_src, uint32_t dv_ind_dst,
CATCH
}
rsmi_status_t
rsmi_topo_get_p2p_status(uint32_t dv_ind_src, uint32_t dv_ind_dst,
RSMI_IO_LINK_TYPE *type, rsmi_p2p_capability_t *cap) {
TRY
uint32_t dv_ind = dv_ind_src;
GET_DEV_AND_KFDNODE_FROM_INDX
DEVICE_MUTEX
if (type == nullptr || cap == nullptr) {
return RSMI_STATUS_INVALID_ARGS;
}
// If source device is same as destination, return invalid args
if (dv_ind_src == dv_ind_dst) {
return RSMI_STATUS_INVALID_ARGS;
}
uint32_t node_ind_src, node_ind_dst;
// Fetch the source and destination node index
if (smi.get_node_index(dv_ind_src, &node_ind_src) ||
smi.get_node_index(dv_ind_dst, &node_ind_dst)) {
return RSMI_STATUS_INVALID_ARGS;
}
// Hold the matching link by shared_ptr rather than by iterator: clearing the
// temporary map invalidates every iterator into it.
std::shared_ptr<amd::smi::IOLink> link;
std::map<uint32_t, std::shared_ptr<amd::smi::IOLink>> io_link_map_tmp;
// Search the P2P links first
if (DiscoverP2PLinksPerNode(node_ind_src, &io_link_map_tmp) != 0) {
  return RSMI_STATUS_FILE_ERROR;
}
auto it = io_link_map_tmp.find(node_ind_dst);
if (it != io_link_map_tmp.end()) {
  link = it->second;
}
io_link_map_tmp.clear();
if (!link) {
  // Fall back to the IO links
  if (DiscoverIOLinksPerNode(node_ind_src, &io_link_map_tmp) != 0) {
    return RSMI_STATUS_FILE_ERROR;
  }
  it = io_link_map_tmp.find(node_ind_dst);
  if (it != io_link_map_tmp.end()) {
    link = it->second;
  }
  io_link_map_tmp.clear();
}
if (link) {
  amd::smi::IO_LINK_TYPE io_link_type = link->type();
  if (io_link_type == amd::smi::IOLINK_TYPE_PCIEXPRESS) {
    *type = RSMI_IOLINK_TYPE_PCIEXPRESS;
  } else if (io_link_type == amd::smi::IOLINK_TYPE_XGMI) {
    *type = RSMI_IOLINK_TYPE_XGMI;
  } else {
    // Unexpected IO Link type read
    return RSMI_STATUS_NOT_SUPPORTED;
  }
  *cap = link->get_link_capability();
  return RSMI_STATUS_SUCCESS;
}
return RSMI_STATUS_NOT_SUPPORTED;
CATCH
}
static rsmi_status_t
get_compute_partition(uint32_t dv_ind, std::string &compute_partition) {
TRY
+42 -1
@@ -57,6 +57,15 @@
#include "rocm_smi/rocm_smi_utils.h"
#include "rocm_smi/rocm_smi_io_link.h"
#define CRAT_IOLINK_FLAGS_ENABLED (1 << 0)
#define CRAT_IOLINK_FLAGS_NON_COHERENT (1 << 1)
#define CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT (1 << 2)
#define CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT (1 << 3)
#define CRAT_IOLINK_FLAGS_NO_PEER_TO_PEER_DMA (1 << 4)
#define CRAT_IOLINK_FLAGS_BI_DIRECTIONAL (1U << 31)
#define CRAT_IOLINK_FLAGS_RESERVED_MASK 0x7fffffe0
namespace amd {
namespace smi {
@@ -76,7 +85,7 @@ static const char *kIOLinkPropMIN_BANDWIDTHStr = "min_bandwidth";
static const char *kIOLinkPropMAX_BANDWIDTHStr = "max_bandwidth";
// static const char *kIOLinkPropRECOMMENDED_TRANSFER_SIZEStr =
// "recommended_transfer_size";
// static const char *kIOLinkPropFLAGSStr = "flags";
static const char *kIOLinkPropFLAGSStr = "flags";
static bool is_number(const std::string &s) {
return !s.empty() && std::all_of(s.begin(), s.end(), ::isdigit);
@@ -380,6 +389,12 @@ IOLink::Initialize(void) {
ret = get_property_value(kIOLinkPropWEIGHTStr, &weight_);
if (ret) {return ret;}
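// flags_ is 32 bits wide; read into a 64-bit temporary, then narrow
uint64_t flags_value = 0;
ret = get_property_value(kIOLinkPropFLAGSStr, &flags_value);
if (ret) {return ret;}
flags_ = static_cast<uint32_t>(flags_value);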
ret = UpdateP2pCapability();
if (ret) {return ret;}
ret = get_property_value(kIOLinkPropMIN_BANDWIDTHStr, &min_bandwidth_);
if (ret) {return ret;}
@@ -401,5 +416,31 @@ IOLink::get_property_value(std::string property, uint64_t *value) {
return 0;
}
int IOLink::UpdateP2pCapability(void) {
const uint8_t cap_true = 1;
const uint8_t cap_false = 0;
if (!(flags_ & CRAT_IOLINK_FLAGS_ENABLED)) {
return 0;
}
link_cap_.is_iolink_coherent =
(flags_ & CRAT_IOLINK_FLAGS_NON_COHERENT) ? cap_false : cap_true;
link_cap_.is_iolink_atomics_32bit =
(flags_ & CRAT_IOLINK_FLAGS_NO_ATOMICS_32_BIT) ? cap_false : cap_true;
link_cap_.is_iolink_atomics_64bit =
(flags_ & CRAT_IOLINK_FLAGS_NO_ATOMICS_64_BIT) ? cap_false : cap_true;
link_cap_.is_iolink_bi_directional =
(flags_ & CRAT_IOLINK_FLAGS_BI_DIRECTIONAL) ? cap_true : cap_false;
link_cap_.is_iolink_dma =
(flags_ & CRAT_IOLINK_FLAGS_NO_PEER_TO_PEER_DMA) ? cap_false : cap_true;
return 0;
}
} // namespace smi
} // namespace amd
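The flag decoding in `UpdateP2pCapability()` above can be exercised in isolation. The following is a minimal sketch, not the library's code: `P2pCapSketch` and `DecodeIoLinkFlags` are hypothetical names, the bit positions are copied from the `CRAT_IOLINK_FLAGS_*` macros in this change, and using `UINT8_MAX` for "unknown" on a disabled link is an assumption borrowed from the sentinel the test code uses.

```cpp
#include <cassert>
#include <cstdint>

// Bit layout copied from the CRAT_IOLINK_FLAGS_* macros in this change.
constexpr uint32_t kEnabled       = 1u << 0;
constexpr uint32_t kNonCoherent   = 1u << 1;
constexpr uint32_t kNoAtomics32   = 1u << 2;
constexpr uint32_t kNoAtomics64   = 1u << 3;
constexpr uint32_t kNoP2pDma      = 1u << 4;
constexpr uint32_t kBiDirectional = 1u << 31;

// Hypothetical stand-in for rsmi_p2p_capability_t.
// Assumption: UINT8_MAX marks a field whose value is unknown.
struct P2pCapSketch {
  uint8_t coherent = UINT8_MAX;
  uint8_t atomics_32bit = UINT8_MAX;
  uint8_t atomics_64bit = UINT8_MAX;
  uint8_t dma = UINT8_MAX;
  uint8_t bi_directional = UINT8_MAX;
};

// Decode a raw io_link "flags" word into per-capability 0/1 values.
// Most bits are negative ("NO_..."), so a set bit means the capability
// is absent; only the bi-directional bit is positive.
P2pCapSketch DecodeIoLinkFlags(uint32_t flags) {
  P2pCapSketch cap;
  if (!(flags & kEnabled)) {
    return cap;  // link not enabled: leave every field unknown
  }
  cap.coherent = (flags & kNonCoherent) ? 0 : 1;
  cap.atomics_32bit = (flags & kNoAtomics32) ? 0 : 1;
  cap.atomics_64bit = (flags & kNoAtomics64) ? 0 : 1;
  cap.dma = (flags & kNoP2pDma) ? 0 : 1;
  cap.bi_directional = (flags & kBiDirectional) ? 1 : 0;
  return cap;
}
```

Under this sketch, a flags word of `0x80000001` (enabled and bi-directional, no restriction bits set) decodes to a link with every capability present, while `0x1F` decodes to an enabled link with none of them.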
+20
@@ -1053,6 +1053,26 @@ amdsmi_is_P2P_accessible(amdsmi_processor_handle processor_handle_src,
return amd::smi::rsmi_to_amdsmi_status(rstatus);
}
amdsmi_status_t
amdsmi_topo_get_p2p_status(amdsmi_processor_handle processor_handle_src,
amdsmi_processor_handle processor_handle_dst,
amdsmi_io_link_type_t *type, amdsmi_p2p_capability_t *cap) {
AMDSMI_CHECK_INIT();
amd::smi::AMDSmiGPUDevice* src_device = nullptr;
amd::smi::AMDSmiGPUDevice* dst_device = nullptr;
amdsmi_status_t r = get_gpu_device_from_handle(processor_handle_src, &src_device);
if (r != AMDSMI_STATUS_SUCCESS)
return r;
r = get_gpu_device_from_handle(processor_handle_dst, &dst_device);
if (r != AMDSMI_STATUS_SUCCESS)
return r;
auto rstatus = rsmi_topo_get_p2p_status(src_device->get_gpu_id(), dst_device->get_gpu_id(),
reinterpret_cast<RSMI_IO_LINK_TYPE*>(type),
reinterpret_cast<rsmi_p2p_capability_t*>(cap));
return amd::smi::rsmi_to_amdsmi_status(rstatus);
}
// Compute Partition functions
amdsmi_status_t
amdsmi_get_gpu_compute_partition(amdsmi_processor_handle processor_handle,
+154 -1
@@ -60,6 +60,7 @@ typedef struct {
uint64_t hops;
uint64_t weight;
bool accessible;
amdsmi_p2p_capability_t cap;
} gpu_link_t;
TestHWTopologyRead::TestHWTopologyRead() : TestBase() {
@@ -136,9 +137,11 @@ void TestHWTopologyRead::Run(void) {
gpu_links[dv_ind_src][dv_ind_dst].hops = 0;
gpu_links[dv_ind_src][dv_ind_dst].weight = 0;
gpu_links[dv_ind_src][dv_ind_dst].accessible = true;
gpu_links[dv_ind_src][dv_ind_dst].cap =
{UINT8_MAX, UINT8_MAX, UINT8_MAX, UINT8_MAX, UINT8_MAX};
} else {
amdsmi_io_link_type_t type;
err = amdsmi_topo_get_link_type(processor_handles_[dv_ind_src],
processor_handles_[dv_ind_dst],
&gpu_links[dv_ind_src][dv_ind_dst].hops, &type);
if (err != AMDSMI_STATUS_SUCCESS) {
@@ -170,6 +173,34 @@ void TestHWTopologyRead::Run(void) {
}
}
}
err = amdsmi_topo_get_p2p_status(processor_handles_[dv_ind_src],
processor_handles_[dv_ind_dst],
&type, &gpu_links[dv_ind_src][dv_ind_dst].cap);
if (err != AMDSMI_STATUS_SUCCESS) {
if (err == AMDSMI_STATUS_NOT_SUPPORTED) {
IF_VERB(STANDARD) {
std::cout <<
"\t**P2P status read: Not supported on this machine"
<< std::endl;
return;
}
} else {
CHK_ERR_ASRT(err)
}
} else {
switch (type) {
case AMDSMI_IOLINK_TYPE_PCIEXPRESS:
case AMDSMI_IOLINK_TYPE_XGMI:
// Do nothing, the type is printed by the previous test for amdsmi_topo_get_link_type
break;
default:
gpu_links[dv_ind_src][dv_ind_dst].type = "XXXX";
IF_VERB(STANDARD) {
std::cout << "\t**Invalid IO LINK type. type=" << type <<
std::endl;
}
}
}
err = amdsmi_topo_get_link_weight(processor_handles_[dv_ind_src],
processor_handles_[dv_ind_dst],
&gpu_links[dv_ind_src][dv_ind_dst].weight);
@@ -286,6 +317,7 @@ void TestHWTopologyRead::Run(void) {
std::cout << std::endl;
}
std::cout << std::endl;
std::cout << "**Access between two GPUs**" << std::endl;
std::cout << " ";
for (i = 0; i < num_devices; ++i) {
@@ -303,4 +335,125 @@ void TestHWTopologyRead::Run(void) {
std::cout << std::endl;
}
std::cout << std::endl;
std::cout << "**Cache coherency between two GPUs**" << std::endl;
std::cout << " ";
for (i = 0; i < num_devices; ++i) {
tmp = "GPU" + std::to_string(i);
std::cout << std::setw(12) << std::left << tmp;
}
std::cout << std::endl;
for (i = 0; i < num_devices; i++) {
tmp = "GPU" + std::to_string(i);
std::cout << std::setw(6) << std::left << tmp;
for (j = 0; j < num_devices; j++) {
if (i == j) {
std::cout << std::setw(12) << std::left << "X";
continue;
}
if (gpu_links[i][j].cap.is_iolink_coherent == UINT8_MAX) {
std::cout << std::setw(12) << std::left << "N/A";
continue;
}
std::cout << std::setw(12) << std::left
<< (gpu_links[i][j].cap.is_iolink_coherent ? "C" : "NC");
}
std::cout << std::endl;
}
std::cout << std::endl;
std::cout << "**Atomics between two GPUs**" << std::endl;
std::cout << " ";
for (i = 0; i < num_devices; ++i) {
tmp = "GPU" + std::to_string(i);
std::cout << std::setw(12) << std::left << tmp;
}
std::cout << std::endl;
for (i = 0; i < num_devices; i++) {
tmp = "GPU" + std::to_string(i);
std::cout << std::setw(6) << std::left << tmp;
for (j = 0; j < num_devices; j++) {
if (i == j) {
std::cout << std::setw(12) << std::left << "X";
continue;
}
if (gpu_links[i][j].cap.is_iolink_atomics_64bit == UINT8_MAX ||
gpu_links[i][j].cap.is_iolink_atomics_32bit == UINT8_MAX) {
std::cout << std::setw(12) << std::left << "N/A";
continue;
}
tmp = gpu_links[i][j].cap.is_iolink_atomics_64bit ? "64" : "";
if (gpu_links[i][j].cap.is_iolink_atomics_32bit) {
if (!tmp.empty()) {
tmp += ",";
}
tmp += "32";
}
std::cout << std::setw(12) << std::left << (tmp.empty() ? "N/A" : tmp);
}
std::cout << std::endl;
}
std::cout << std::endl;
std::cout << "**DMA between two GPUs**" << std::endl;
std::cout << " ";
for (i = 0; i < num_devices; ++i) {
tmp = "GPU" + std::to_string(i);
std::cout << std::setw(12) << std::left << tmp;
}
std::cout << std::endl;
for (i = 0; i < num_devices; i++) {
tmp = "GPU" + std::to_string(i);
std::cout << std::setw(6) << std::left << tmp;
for (j = 0; j < num_devices; j++) {
if (i == j) {
std::cout << std::setw(12) << std::left << "X";
continue;
}
if (gpu_links[i][j].cap.is_iolink_dma == UINT8_MAX) {
std::cout << std::setw(12) << std::left << "N/A";
continue;
}
std::cout << std::boolalpha;
std::cout << std::setw(12) << std::left
<< static_cast<bool>(gpu_links[i][j].cap.is_iolink_dma);
}
std::cout << std::endl;
}
std::cout << std::endl;
std::cout << "**BI-Directional between two GPUs**" << std::endl;
std::cout << " ";
for (i = 0; i < num_devices; ++i) {
tmp = "GPU" + std::to_string(i);
std::cout << std::setw(12) << std::left << tmp;
}
std::cout << std::endl;
for (i = 0; i < num_devices; i++) {
tmp = "GPU" + std::to_string(i);
std::cout << std::setw(6) << std::left << tmp;
for (j = 0; j < num_devices; j++) {
if (i == j) {
std::cout << std::setw(12) << std::left << "X";
continue;
}
if (gpu_links[i][j].cap.is_iolink_bi_directional == UINT8_MAX) {
std::cout << std::setw(12) << std::left << "N/A";
continue;
}
std::cout << std::boolalpha;
std::cout << std::setw(12) << std::left
<< static_cast<bool>(gpu_links[i][j].cap.is_iolink_bi_directional);
}
std::cout << std::endl;
}
std::cout << std::endl;
}
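The cell logic of the atomics table above can be pulled out into a small helper, which makes the sentinel handling easier to check on its own. This is a minimal sketch under stated assumptions: `AtomicsCapSketch` and `AtomicsCell` are hypothetical names, the struct mirrors only the two atomics fields of `amdsmi_p2p_capability_t`, and `UINT8_MAX` is the "not populated" sentinel the test assigns for a device paired with itself.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical mirror of the two atomics fields of amdsmi_p2p_capability_t.
struct AtomicsCapSketch {
  uint8_t atomics_32bit;
  uint8_t atomics_64bit;
};

// Build the text for one cell of the atomics table:
// "64,32", "64", "32", or "N/A" when nothing is supported or known.
std::string AtomicsCell(const AtomicsCapSketch &cap) {
  // Either field still holding the sentinel means the link was never queried.
  if (cap.atomics_64bit == UINT8_MAX || cap.atomics_32bit == UINT8_MAX) {
    return "N/A";
  }
  std::string cell = cap.atomics_64bit ? "64" : "";
  if (cap.atomics_32bit) {
    if (!cell.empty()) {
      cell += ",";
    }
    cell += "32";
  }
  return cell.empty() ? "N/A" : cell;
}
```

A caller would stream the result with the same `std::setw(12) << std::left` formatting the test uses. Note that "N/A" covers both "unknown" and "no atomics supported", which matches the table output in this change.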