6034c27655
Fixes nccl-tests#37.
Direct offsets were still computed as 32-bit values in the low-level primitives.
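
For illustration only, a minimal sketch of why a 32-bit offset is a problem once a buffer reaches 4 GiB. The helper names `offset32` and `offset64` are hypothetical and are not taken from the NCCL or RCCL sources; the actual primitives may compute offsets differently.

```cpp
#include <cstdint>

// Hypothetical helper: byte offset of element `index`, computed in
// 32-bit unsigned arithmetic. The product wraps modulo 2^32.
constexpr std::uint64_t offset32(std::uint32_t index, std::uint32_t elemSize) {
  return static_cast<std::uint32_t>(index * elemSize);
}

// Hypothetical helper: the same offset computed in 64-bit arithmetic,
// so it stays correct for buffers of 4 GiB and larger.
constexpr std::uint64_t offset64(std::uint64_t index, std::uint64_t elemSize) {
  return index * elemSize;
}

// 2^30 four-byte elements start exactly 4 GiB into the buffer.
static_assert(offset32(1u << 30, 4) == 0, "32-bit offset wraps to zero");
static_assert(offset64(1u << 30, 4) == 4294967296ull, "64-bit offset is exact");
```

With these definitions, `offset32(1u << 30, 4)` wraps to 0 while `offset64(1u << 30, 4)` gives 4294967296, so an access through the 32-bit offset would land at the start of the buffer instead of 4 GiB into it.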
[ROCm/rccl commit: c38f174bd4]