rocr: Memory copy based on recommended SDMA engines

Recommended SDMA engines for DMA copies are now exposed for better
GPU-GPU performance. ROCr can now select those DMA engines.

Also pin host-to-device copies to SDMA0 and device-to-host copies to
SDMA1 for better stability and performance.
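
The selection policy described above could be sketched roughly as follows.
This is an illustration only, not the code in this commit: `CopyDir`,
`PickSdmaEngine` and `busy_mask` are hypothetical names, and the sketch
assumes that bit i of `rec_sdma_eng_id_mask` corresponds to SDMA engine i.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical copy directions for illustration; not ROCr's actual types.
enum class CopyDir { HostToDevice, DeviceToHost, DeviceToDevice };

// Returns the SDMA engine index to use for a copy, or std::nullopt when no
// recommended engine is currently free.
// Assumption: bit i of rec_sdma_eng_id_mask marks engine i as recommended,
// and bit i of busy_mask marks engine i as currently in use.
std::optional<uint32_t> PickSdmaEngine(CopyDir dir,
                                       uint32_t rec_sdma_eng_id_mask,
                                       uint32_t busy_mask) {
  // Host-device traffic is pinned to fixed engines, per the commit message.
  if (dir == CopyDir::HostToDevice) return 0u;
  if (dir == CopyDir::DeviceToHost) return 1u;

  // GPU-GPU copies: pick the lowest-numbered recommended engine that is free.
  uint32_t candidates = rec_sdma_eng_id_mask & ~busy_mask;
  if (candidates == 0) return std::nullopt;

  uint32_t idx = 0;
  while ((candidates & 1u) == 0) {
    candidates >>= 1;
    ++idx;
  }
  return idx;
}
```

A caller that receives no engine would presumably fall back to some other
copy path; that fallback is outside the scope of this sketch.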

Change-Id: Ideff2e13daf537104efecb8b837bd49ee5096cb5
Jonathan Kim
2024-08-13 14:54:13 -04:00
parent 2f588a2406
commit eb30a5bbc7
9 changed files with 73 additions and 22 deletions
@@ -268,10 +268,11 @@ void Runtime::SetLinkCount(size_t num_nodes) {
 }
 void Runtime::RegisterLinkInfo(uint32_t node_id_from, uint32_t node_id_to,
-                               uint32_t num_hop,
+                               uint32_t num_hop, uint32_t rec_sdma_eng_id_mask,
                                hsa_amd_memory_pool_link_info_t& link_info) {
   const uint32_t idx = GetIndexLinkInfo(node_id_from, node_id_to);
   link_matrix_[idx].num_hop = num_hop;
+  link_matrix_[idx].rec_sdma_eng_id_mask = rec_sdma_eng_id_mask;
   link_matrix_[idx].info = link_info;
   // Limit the number of hop to 1 since the runtime does not have enough