SWDEV-198556 - [HIP] Use src/dstMemory->getContext instead of host_context.
Also relax the check for P2P copies in case of hipMemcpy(hostMalloced, hipMalloced(dev1), dev0)
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#67 edit
SWDEV-198556 - [HIP] Gnarly bug due to macros:
HIP_RETURN(ret) expands ret twice, first when setting the last error and
then in LogDebugInfo. So if HIP_RETURN is given a function call as its
argument, that function is called twice. As a result ihipMalloc and
ihipMemcpy were being called twice (and perhaps other functions too).
Also logging the pointer returned by ihipMalloc so we can track memory
in logs more easily.
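A minimal sketch of the double-evaluation problem, for illustration only. The macro and function names below (RETURN_TWICE, RETURN_ONCE, allocate, callsMadeBy) are hypothetical stand-ins, not the real HIP_RETURN or LogDebugInfo, and evaluating the argument once into the last-error slot is just one possible fix:

```cpp
#include <cassert>

// Hypothetical stand-ins; the real HIP_RETURN and its logging differ.
static int g_last_error = 0;
static int g_logged = 0;
static int g_calls = 0;

// A call with a visible side effect: it counts how often it runs.
inline int allocate() { ++g_calls; return 0; }

// Buggy shape: `ret` appears twice, so an expression argument runs twice.
#define RETURN_TWICE(ret) \
  do {                    \
    g_last_error = (ret); \
    g_logged = (ret);     \
    return g_last_error;  \
  } while (0)

// Fixed shape: evaluate `ret` once, then reuse the stored value.
#define RETURN_ONCE(ret)     \
  do {                       \
    g_last_error = (ret);    \
    g_logged = g_last_error; \
    return g_last_error;     \
  } while (0)

inline int buggyApi() { RETURN_TWICE(allocate()); }
inline int fixedApi() { RETURN_ONCE(allocate()); }

// Returns how many times allocate() ran during one API call.
inline int callsMadeBy(int (*api)()) {
  g_calls = 0;
  api();
  return g_calls;
}
```

With these stand-ins, callsMadeBy(buggyApi) reports two calls and callsMadeBy(fixedApi) reports one.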
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#33 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#65 edit
SWDEV-197168 - [HIP] handle width or height or src or dst being 0
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#63 edit
SWDEV-189500 - [HIP] Have to force async=false for host to device case as well
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#61 edit
SWDEV-194872 - [HIP] CUDA and HCC sync after a DeviceToHost async copy.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#60 edit
SWDEV-189383 - [HIP CQE][HIPonPAL][WIN] hipDeviceMalloc, hip_test_ldg, hipHostRegister, hipModule, hipStreamSync2 tests failed on VEGA10.
1. For pinned memory allocations add the host pointer and all of its respective device pointers to the memory object map.
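A rough sketch of the idea, assuming the memory object map is keyed by pointer. MemObj, g_memObjMap, registerPinned and findMemObj are hypothetical names; the runtime's actual map and types may differ:

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for the runtime's memory object.
struct MemObj { int id; };

// Hypothetical pointer-to-object map.
static std::map<const void*, MemObj*> g_memObjMap;

// Registers the host pointer and every per-device pointer of one pinned
// allocation, so a lookup by any of them finds the same object.
inline void registerPinned(const void* hostPtr,
                           const std::vector<const void*>& devicePtrs,
                           MemObj* obj) {
  g_memObjMap[hostPtr] = obj;
  for (const void* d : devicePtrs) g_memObjMap[d] = obj;
}

// Returns the object registered for ptr, or nullptr if there is none.
inline MemObj* findMemObj(const void* ptr) {
  auto it = g_memObjMap.find(ptr);
  return it == g_memObjMap.end() ? nullptr : it->second;
}
```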
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#57 edit
SWDEV-189488 - [HIP] Caffe2 TensorTest.TensorSerializationMultiDevices fails
1. Make sure to set attributes->device to the current device for host malloc'd memory
2. Return hipSuccess for hipDeviceCanAccessPeer
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#56 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_peer.cpp#4 edit
SWDEV-145570 - Check host_context when matching GPU device.
- In CL#1766264, `host_context` is introduced for mGPU support. Need to
match that context specially when trying to match GPU device context.
The following tests passed:
$ python test_dataloader.py TestDictDataLoader.test_pin_memory
.
----------------------------------------------------------------------
Ran 1 test in 0.004s
OK
$ python test_dataloader.py TestDataLoader.test_sequential_pin_memory
.
----------------------------------------------------------------------
Ran 1 test in 0.063s
OK
$ python test_dataloader.py TestDataLoader.test_shuffle_pin_memory
.
----------------------------------------------------------------------
Ran 1 test in 0.174s
OK
$ python test_dataloader.py TestStringDataLoader.test_shuffle_pin_memory
.
----------------------------------------------------------------------
Ran 1 test in 0.104s
OK
$ python test_torch.py TestTorch.test_pin_memory
.
----------------------------------------------------------------------
Ran 1 test in 0.124s
OK
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#52 edit
SWDEV-144570 - Fix pointer attribute query.
- For memory not registered with runtime, return
`hipErrorInvalidValue`. That's the behavior expected to check whether
a host buffer is pinned.
- Return `hipErrorInvalidDevice` in case a registered memory object
cannot find its matching device.
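A sketch of the two error paths, under stated assumptions. Status, Registration, g_registered, queryPointer and isPinned are hypothetical stand-ins; the enum values do not correspond to the real hipError_t values:

```cpp
#include <cassert>
#include <map>

// Hypothetical stand-ins for hipSuccess, hipErrorInvalidValue and
// hipErrorInvalidDevice.
enum Status { kSuccess, kInvalidValue, kInvalidDevice };

// deviceIndex < 0 models "no matching device found" in this sketch.
struct Registration { int deviceIndex; };

static std::map<const void*, Registration> g_registered;

inline Status queryPointer(const void* ptr, int* deviceOut) {
  auto it = g_registered.find(ptr);
  if (it == g_registered.end()) return kInvalidValue;     // not registered
  if (it->second.deviceIndex < 0) return kInvalidDevice;  // no device match
  *deviceOut = it->second.deviceIndex;
  return kSuccess;
}

// A caller can treat kInvalidValue as "this host buffer is not pinned".
inline bool isPinned(const void* ptr) {
  int device = 0;
  return queryPointer(ptr, &device) != kInvalidValue;
}
```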
RB: http://ocltc.amd.com/reviews/r/17094/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#51 edit
SWDEV-145570 - [HIP] Use a context with all devices in system for host register
hipHostRegister and hipMemcpy 0x10 and 0x20 fail in mGPU systems because
we only register the memory on the current device. But in HIP, the registering
needs to happen on all devices.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_context.cpp#17 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_internal.hpp#26 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#50 edit
SWDEV-145570 - [HIP] - Fix some issues in hip runtime
- Set stream for event
- Free mem needs to be reported in bytes, but the runtime backend reports it in Kb
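The unit conversion amounts to the following sketch. It assumes that "Kb" here means units of 1024 bytes; freeMemBytes and kBytesPerKb are hypothetical names:

```cpp
#include <cassert>
#include <cstddef>

// Assumption: the backend's "Kb" is a 1024-byte unit.
constexpr std::size_t kBytesPerKb = 1024;

// Converts a backend free-memory figure into the bytes the API reports.
inline std::size_t freeMemBytes(std::size_t freeMemKb) {
  return freeMemKb * kBytesPerKb;
}
```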
ReviewBoardURL = http://ocltc.amd.com/reviews/r/15586/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#40 edit
... //depot/stg/opencl/drivers/opencl/api/hip/hip_module.cpp#15 edit
SWDEV-145570 - [HIP] refactor hipMemcpy* functions to correctly handle copies using prepinned memory
The current implementation of the hipMemcpy functions picks the copy type based on a flag that the user passes. However, one can use the hipMemcpyHostToDevice/hipMemcpyDeviceToHost flag in combination with prepinned memory. Using WriteMemoryCommand/ReadMemoryCommand in this case pins the same host memory twice. This is fine on PAL/Linux, since pinning the same VA range is a no-op, but it will start failing once we switch to using device memory with HIP/VDI/HSA.
The solution is to ignore the hipMemcpyKind flag and let the runtime decide what kind of copy is best, except when hipMemcpyHostToHost is passed, since both host pointers may be prepinned.
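One way to picture the decision, as a sketch rather than the actual implementation. It assumes the runtime chooses a path from whether it already has a memory object for each pointer; Kind, Path, g_known and choosePath are hypothetical names, and the fallback to a host copy when neither pointer is known is also an assumption:

```cpp
#include <cassert>
#include <set>

// Hypothetical stand-ins for hipMemcpyKind and for the copy commands.
enum Kind { HostToHost, HostToDevice, DeviceToHost, DeviceToDevice };
enum Path { HostMemcpy, WriteMemory, ReadMemory, CopyMemory };

// Pointers the runtime already has a memory object for
// (device allocations and prepinned host memory).
static std::set<const void*> g_known;

inline bool known(const void* p) { return g_known.count(p) != 0; }

inline Path choosePath(const void* dst, const void* src, Kind kind) {
  // The one case where the user's flag is honored.
  if (kind == HostToHost) return HostMemcpy;
  const bool s = known(src);
  const bool d = known(dst);
  if (s && d) return CopyMemory;  // both registered: no extra pinning
  if (d) return WriteMemory;      // unpinned host source
  if (s) return ReadMemory;       // unpinned host destination
  return HostMemcpy;              // neither known to the runtime
}
```

In this sketch a prepinned source passed with HostToDevice takes the CopyMemory path, so the host range is not pinned a second time.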
ReviewBoardURL = http://ocltc.amd.com/reviews/r/15482/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#37 edit
SWDEV-145570 - [HIP] - Since getMemoryObject is now used in hip_texture it shouldn't be inline
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#32 edit
SWDEV-145570 - [HIP] Store HIP mem flags inside amd::Buffer's flags
Use the 16 upper bits of amd::Buffer's flags field instead of adding a new field.
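A sketch of the bit packing, assuming a 32-bit flags field (the actual width and layout of amd::Buffer's flags are not stated here). packHipFlags, unpackHipFlags and the constants are hypothetical names:

```cpp
#include <cassert>
#include <cstdint>

// Assumption: a 32-bit flags word whose low 16 bits hold the existing flags.
constexpr unsigned kHipFlagShift = 16;
constexpr std::uint32_t kLowMask = 0x0000FFFFu;

// Stores hipFlags in the upper 16 bits and keeps the lower 16 bits intact.
inline std::uint32_t packHipFlags(std::uint32_t flags, std::uint16_t hipFlags) {
  return (flags & kLowMask) |
         (static_cast<std::uint32_t>(hipFlags) << kHipFlagShift);
}

// Reads the HIP flags back out of the upper 16 bits.
inline std::uint16_t unpackHipFlags(std::uint32_t flags) {
  return static_cast<std::uint16_t>(flags >> kHipFlagShift);
}
```

Reusing the spare bits avoids growing the object, at the cost of reserving the upper half of the field for HIP.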
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#26 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#86 edit
SWDEV-145570 - [HIP] Fix offset calculation when getting a memory object. Also include the case when the destination VA may just be a CPU host VA and not necessarily device allocated.
- Fix hipMemset* to write each byte and not a dword as per the spec
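A host-side sketch of per-byte fill semantics, where the low byte of the value is written to every byte of the range, as with memset. fillBytes and bytePattern are hypothetical helpers, not the runtime's fill path:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Writes the low byte of value to each of the first sizeBytes bytes.
inline void fillBytes(std::uint8_t* dst, int value, std::size_t sizeBytes) {
  const std::uint8_t b = static_cast<std::uint8_t>(value);
  for (std::size_t i = 0; i < sizeBytes; ++i) dst[i] = b;
}

// Replicates the low byte into all four bytes of a 32-bit pattern, which is
// what a dword-wide fill would need to match the per-byte result.
inline std::uint32_t bytePattern(int value) {
  return 0x01010101u * static_cast<std::uint8_t>(value);
}
```

Writing the value as a whole dword instead would give a different result for sizes that are not a multiple of four and for any value other than a repeated byte.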
ReviewBoardURL = http://ocltc.amd.com/reviews/r/14787/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#24 edit
SWDEV-145570 - [HIP]
- Implement hipMemcpyHtoD/DtoH/DtoD and their Async APIs
- Combine logic for hipMemset/Memcpy/Memset2D/Memcpy2D that can be shared across multiple APIs
ReviewBoardURL = http://ocltc.amd.com/reviews/r/14782/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/hip/hip_memory.cpp#23 edit