Allen Hubbe 4b80581422 gda ionic: restore functionality of ionic gda in rocshmem (#269)
* Revamp findibverbs to find ionic again

* gda ionic: rename ionic_sq_buf ionic_cq_buf

Avoid duplicating member names used by mlx5 gda.

Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>

* gda: move spin lock to util.hpp

Move spin lock out of ionic gda to util.hpp.

Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>

* gda ionic: assume latest fwabi changes

There is no firmware abi compatibility in this ionic gda code yet, so
assume we are using the latest firmware abi as of now.

Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>

* gda ionic: allow doorbell with incomplete wqes

Use spin lock to ensure doorbell is only written with an increasing
producer index.  Ring the doorbell after this wave has initialized its
wqes.  Wqes of other waves might not be fully initialized, but firmware
will not process them until the phase/color flag is updated in the
respecitve wqes.

Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>

* gda ionic: poll cq for additional completions

Keep polling the cq for more than just the minimum number of completions
for this wave of threads to make progress, as long as the cq is not
empty.  A part of wave-optimized cq polling, at the expense of one wave
polling additional completions, it was observed that nearly all other
waves avoid taking the cq lock at all.

Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>

* gda: max_rd_atomic in rts transition

In modify_qp(RTS), specify max_rd_atomic, not max_dest_rd_atomic.

By not speicfying max_rd_atomic (rather, max_rd_atomic=zero), the local
nic may get stuck transmitting the first read or atomic request.  One
read or atomic request is greater than the initiator depth of zero.

Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>

* gda ionic: allow specifying traffic class

Allow specifying a traffic class.  The network might have a specific
traffic class configured as no-drop, for example.

Co-authored-by: Aurelien Bouteiller <aurelien.bouteiller@amd.com>
Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>

* gda ionic: tweak uxdma assignment

The ideal arrangement will have an equal number of QPs active on each
uxdma pipeline.

Pre-rebase, the better arrangement for rocshmem funcitonal test
benchmarks was [0, 1], [1, 0], [0, 1], [1, 0], ...

Now, following changes that add 'ROCSHMEM_GDA_ALTERNATE_QP_PORTS=1' by
default, the better arrangement is [0, 1], [0, 1], [0, 1], [0, 1], ...

Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>

---------

Signed-off-by: Allen Hubbe <allen.hubbe@amd.com>
Co-authored-by: Aurelien Bouteiller <abouteil@amd.com>
Co-authored-by: Aurelien Bouteiller <aurelien.bouteiller@amd.com>

[ROCm/rocshmem commit: c84bbc250b]
2025-10-07 10:08:19 -04:00
S
説明
説明が提供されていません
282 MiB
言語
C++ 67.5%
C 20.6%
Python 6.6%
CMake 3.4%
Shell 0.6%
その他 1.1%