SWDEV-225266: [HIP-VDI] HIP-VDI disabled tests (p2p_copy_coherency.cpp)
SWDEV-225388: hipTestDeviceSymbol.cpp & hipTestConstant.cpp failed to build on hip-vdi
For hipTestDeviceSymbol.cpp & hipTestConstant.cpp tests:
Currently the "__HIP_VDI__" flag is enabled in CMakeLists.txt, but when an application is compiled with hipcc,
__HIP_VDI__ is not defined, so the headers cannot tell whether they are being compiled for VDI or HCC.
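A minimal sketch of the kind of header guard this affects. Only the __HIP_VDI__ macro name comes from this change; runtime_path() is a hypothetical stand-in for header code that must pick a runtime-specific branch:

```cpp
#include <cassert>
#include <string>

// Hypothetical header fragment: selects a code path based on __HIP_VDI__.
// If the build system defines the macro but the compiler driver does not,
// the two builds silently take different branches.
#if defined(__HIP_VDI__)
inline std::string runtime_path() { return "vdi"; }
#else
inline std::string runtime_path() { return "hcc"; }
#endif
```

Compiled without -D__HIP_VDI__, runtime_path() takes the HCC branch, which is the mismatch described above.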
For ./src/runtimeApi/memory/p2p_copy_coherency.cpp:
Fixed the "<hc_am.hpp> not found" compilation error by including the header only when compiling for the HCC runtime.
The test is currently disabled on all platforms. When validated on a multi-GPU machine,
memcpy between multiple GPUs via GPU synchronization does not work on either the hcc or the vdi path.
The test needs to be validated on an NVIDIA machine to know whether it is valid. The GPU synchronization test is disabled for now.
For ./src/runtimeApi/module/hipModuleTexture2dDrv.cpp:
Updated the test to generate the tex2d_kernel.code object in the build directory, which is where ctest currently looks for it.
Change-Id: I629d395a919c2440d921422716944c7940ed6010
[ROCm/clr commit: ae465bc338]
~45% to 50% performance drop on rocBLAS_int8 test
Enable cudaSetDeviceFlags() api call. Use active wait by default
for all devices.
Change-Id: Ifc2ebe3dd9b0aa3fdbfbc9cb5c2cd8b3b726124f
[ROCm/clr commit: b93d997fb7]
Incoming changes from upstream split the struct hipMemcpy3DParms into two separate ones - hipMemcpy3DParms and HIP_MEMCPY3D, which are cudaMemcpy3DParms and CUDA_MEMCPY3D equivalents respectively.
Note that HIP_MEMCPY3D is missing half the members of CUDA_MEMCPY3D (this should be fixed in PR#1887). Work around this by using a substitute _HIP_MEMCPY3D struct for now.
Change-Id: Ic15134e6deb260189b662b3804d2309a9b8473e9
[ROCm/clr commit: d28b77bf23]
Add hip::syncStreams(dev) to sync blocking streams on a given device.
hip::syncStreams(void) should only sync streams on the current device.
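The intended split between the two overloads can be sketched as follows. This is a plain C++ model under assumptions, not the runtime's code: Stream, Runtime, and the pending flag are hypothetical, and only the syncStreams names mirror this change:

```cpp
#include <cassert>
#include <vector>

// Hypothetical model of a stream: owning device, blocking flag, pending work.
struct Stream { int device; bool blocking; bool pending; };

struct Runtime {
    int current_device = 0;
    std::vector<Stream> streams;

    // Sync only the blocking streams that belong to device `dev`.
    void syncStreams(int dev) {
        assert(dev >= 0);
        for (auto& s : streams)
            if (s.device == dev && s.blocking) s.pending = false;
    }

    // The no-argument form is limited to the current device.
    void syncStreams() { syncStreams(current_device); }
};
```

With streams on devices 0 and 1, calling syncStreams() while device 0 is current leaves the device 1 streams untouched.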
Change-Id: Ib6b0735215fa0ed12c646ebd029e9763ee3712ce
[ROCm/clr commit: e9af4c8794]
If a user passes a ptr that was allocated by hipMalloc(), the offset is guaranteed to be 0 and NULL may be passed as the offset parameter. We shouldn't return an error in this case.
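A sketch of the optional-output-parameter handling described here. The function name bindTextureSketch and the Status enum are hypothetical, and the real API takes more parameters; only the "NULL offset is allowed" rule is taken from this change:

```cpp
#include <cassert>
#include <cstddef>

enum Status { kSuccess, kInvalidValue };  // hypothetical status codes

// `offset` is an optional output: write it only when the caller supplied one.
// For a pointer that came straight from the allocator the offset is 0.
Status bindTextureSketch(std::size_t* offset, const void* devPtr) {
    if (devPtr == nullptr) return kInvalidValue;
    if (offset != nullptr) *offset = 0;
    return kSuccess;  // a NULL offset is not an error
}
```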
Change-Id: I4a8d645121e5a17d5e2861a0629356a3599de9ee
[ROCm/clr commit: fd76d220a5]
Do not create a new queue to call finish in hipFree if none was
created earlier elsewhere.
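The idea can be sketched in plain C++ as below. Queue, Device, and freeMemory are hypothetical stand-ins for the runtime's types, assuming a queue that is created lazily by other API calls:

```cpp
#include <cassert>
#include <memory>
#include <new>

// Hypothetical queue that counts how often finish() was called.
struct Queue { int finishes = 0; void finish() { ++finishes; } };

struct Device {
    std::unique_ptr<Queue> queue;  // stays null until some other call needs it

    Queue* getOrCreateQueue() {
        if (!queue) queue.reset(new Queue());
        return queue.get();
    }
    Queue* existingQueue() const { return queue.get(); }
};

// Free path: wait on the queue only if one already exists.
// With no queue there is no outstanding work, so none is created.
void freeMemory(Device& dev, void* ptr) {
    if (Queue* q = dev.existingQueue()) q->finish();
    ::operator delete(ptr);
}
```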
Change-Id: I87bb191e6b186ddbe607ab29d11e3ae5bc2ac8e6
[ROCm/clr commit: 1ccaea7ca8]
Also make D2H and H2D keep track of the chain of events
when we need to use a different HostQueue.
Change-Id: I1c5da6ea6104b37ad7aac00f0eb8ea9371e6ba1c
[ROCm/clr commit: 2bdfc73649]
The current texture implementation is based off the one for HIP-HCC. There are a lot of problems with it: it only creates images from buffers, hard-codes logic, and ignores user parameters. This leads to a lot of UB even with simple examples (as seen with RedShift's code).
This CL aims to bring the HIP-VDI texture implementation closer to what CUDA describes.
hipMemcpyAtoA() - image to image copy.
hipMemcpyHtoA()/hipMemcpyDtoA() - buffer to image copy.
hipMemcpyAtoH()/hipMemcpyAtoD() - image to buffer copy.
hipArrayCreate()/hipArray3DCreate()/hipMallocArray()/hipMalloc3DArray() - creates 1D/2D/3D/1D Array/2D Array images.
hipCreateTextureObject() - creates sampler, (optional) creates 1D/2D image from buffer, (optional) creates image views.
hipBindTexture() - creates 1D image from buffer (should create a typed buffer, however this is not compatible with HIP-HCC).
hipBindTexture2D() - creates 2D image from buffer.
hipBindTextureToArray() - creates image view.
hipTexRefSetAddress() - creates 1D image from buffer (should create a typed buffer, however this is not compatible with HIP-HCC).
hipTexRefSetAddress2D() - creates 2D image from buffer.
hipTexRefSetArray() - creates image view.
There are still a lot of TODOs in the code; here are a few important ones:
1. VDI doesn't support a lot of sampler flags.
2. VDI doesn't support device to image 2D/3D copy.
3. Mipmaps implementation is incomplete.
4. Image view implementation is incomplete.
Change-Id: Ia374ee27aa14f76451fee7667495036f4419a487
[ROCm/clr commit: f71817a342]
Review comments - generate hiprtc lib every time when HIP_PLATFORM is hcc
Changes for hip-clang
Removing preprocessor directive to simplify
Change-Id: Id38ab368362b58ee0458baeb8051fea709ae6bba
[ROCm/clr commit: 8bf287ef18]
The current implementation of surf2D{read/write} directly addresses into
the image buffer via the hipArray::data ptr. This is incorrect to do
since we don't know the layout of the image. Also with VDI we won't have
access to the underlying image buffer.
Disable the surface api until the device functions are switched to
using __ockl_image_{load/store}().
Change-Id: I19a33680176812d5aad3660e9045812061a1c443
[ROCm/clr commit: 8c8d963c65]
Make __gnu_h2f_ieee and __gnu_f2h_ieee visible so that hipTestHalf
test can succeed in Clang compiler + Hcc RT.
Change-Id: I5f7d5db19e559b3b66356f0170a8dbc1e5505f3e
[ROCm/clr commit: 314766b4c2]