Currently there is a clang bug on Windows causing duplicate -mllvm options in clang -cc1.
Temporarily disable -mllvm options for HIP-Clang on Windows until the bug is fixed.
Change-Id: I3a4393ba7745989398dc6c6001722837dad18704
HIP assumes that image width is in bytes, but VDI assumes that image width is in pixels. Need to perform byte -> pixel conversion before doing anything else.
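The conversion described above can be sketched as follows. This is a minimal illustration only: widthBytesToPixels is a hypothetical helper, not a HIP or VDI function, and it assumes the row width is an exact multiple of the per-pixel size.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not part of HIP or VDI): converts an image row width
// given in bytes into a width in pixels, given the size of one pixel in bytes.
inline std::size_t widthBytesToPixels(std::size_t widthInBytes,
                                      std::size_t bytesPerPixel) {
    assert(bytesPerPixel != 0);                 // avoid division by zero
    assert(widthInBytes % bytesPerPixel == 0);  // assumed: no partial pixels
    return widthInBytes / bytesPerPixel;
}
```

For example, a row of 1024 bytes with 4-byte pixels (such as four 8-bit channels) is 256 pixels wide.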
Change-Id: Ia9fd1f46d05db3fbe8049add10b4d7e5118a2b9a
Fix compilation error on hip-hcc+clang, hip-vdi+clang
Enabled hipExtLaunchMultiKernelMultiDevice test on hip-vdi path
Use a common hipExtLaunchMultiKernelMultiDevice declaration for all paths
Change-Id: I76031840614fce8e12a8e845548fa43a389a741a
SWDEV-225266: [HIP-VDI] HIP-VDI disabled tests (p2p_copy_coherency.cpp)
SWDEV-225388: hipTestDeviceSymbol.cpp & hipTestConstant.cpp failed to build on hip-vdi
For hipTestDeviceSymbol.cpp & hipTestConstant.cpp tests:
Currently the "__HIP_VDI__" flag is enabled in CMakeLists.txt, but when an application is compiled with hipcc,
__HIP_VDI__ is not defined, so the headers cannot tell whether they are being compiled for VDI or HCC.
For ./src/runtimeApi/memory/p2p_copy_coherency.cpp:
Fixed the "<hc_am.hpp> not found" compilation error by including the header only when compiling for the HCC runtime.
The test is currently disabled on all platforms. When validated on a multi-GPU machine,
memcpy between multiple GPUs via GPU synchronization does not work on the hcc and vdi paths.
The test needs to be validated on an NVIDIA machine to determine whether it is valid. Disabled the GPU synchronization test for now.
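A conditional include of this kind might look like the sketch below. The guard condition is an assumption: it treats __HIP_PLATFORM_HCC__ as marking an HCC-based build and __HIP_VDI__ as marking the VDI runtime, and the actual condition used in the test may differ. P2P_TEST_HAS_HC_AM and hasHcAm are hypothetical names introduced only for this illustration.

```cpp
// Assumption: __HIP_PLATFORM_HCC__ marks an HCC-based build and __HIP_VDI__
// marks the VDI runtime. The real test may use a different condition.
#if defined(__HIP_PLATFORM_HCC__) && !defined(__HIP_VDI__)
#include <hc_am.hpp>           // only shipped with the HCC runtime
#define P2P_TEST_HAS_HC_AM 1   // hypothetical macro for this sketch
#else
#define P2P_TEST_HAS_HC_AM 0
#endif

// Hypothetical helper: lets test code branch on header availability
// without repeating the preprocessor condition.
constexpr bool hasHcAm() { return P2P_TEST_HAS_HC_AM != 0; }
```

Code that needs hc_am.hpp can then be wrapped in the same condition, so builds for other runtimes never see the include.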
For ./src/runtimeApi/module/hipModuleTexture2dDrv.cpp:
Updated the test to generate the tex2d_kernel.code object in the build directory. Currently ctest looks for it in the build directory.
Change-Id: I629d395a919c2440d921422716944c7940ed6010
~45% to 50% performance drop on rocBLAS_int8 test
Enable cudaSetDeviceFlags() API call. Use active wait by default
for all devices.
Change-Id: Ifc2ebe3dd9b0aa3fdbfbc9cb5c2cd8b3b726124f
Incoming changes from upstream split the struct hipMemcpy3DParms into two separate ones - hipMemcpy3DParms and HIP_MEMCPY3D, which are cudaMemcpy3DParms and CUDA_MEMCPY3D equivalents respectively.
Note that HIP_MEMCPY3D is missing half the members of CUDA_MEMCPY3D (this should be fixed in PR#1887). Work around this by using a substitute _HIP_MEMCPY3D struct for now.
Change-Id: Ic15134e6deb260189b662b3804d2309a9b8473e9
Several error messages appeared even when the hipEnvVarDriver.exe test passed and executed successfully. These are now cleaned up. The instances were:
* When popen searched for the directed_test directory and did not find it, it printed an error, then found hipEnvVar at the same level. With the fix, the test prints an error only if both searches for hipEnvVar fail.
* In the latter half of the test, conditions are set specifically to hide the devices before the assertions run, which resulted in No Hip Device detected errors. The fix suppresses these errors, since the test intentionally finds no devices at that point. The assertions themselves are untouched.
HipEnvVarDriver.cpp has also been refactored. Reading HipEnvVar now happens in a helper function shared by getDeviceNumber and getDevicePCIBusNumRemote, as the code to read HipEnvVar was very similar in both.
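The "report an error only if every search fails" behaviour and the shared helper can be sketched roughly as below. This is not the code in HipEnvVarDriver.cpp: Runner and runFirstAvailable are hypothetical names, and the runner callback stands in for the popen-based invocation so the sketch stays portable.

```cpp
#include <cassert>
#include <functional>
#include <iostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical callback: runs one candidate command, stores its output in
// `out`, and returns true on success. In the real test this role is played
// by a popen-based call.
using Runner = std::function<bool(const std::string& cmd, std::string& out)>;

// Hypothetical shared helper: tries each candidate location in order and
// stays silent on individual failures. An error is written to `err` only
// when none of the candidates could be run.
bool runFirstAvailable(const std::vector<std::string>& candidates,
                       const Runner& run, std::string& out,
                       std::ostream& err) {
    assert(run && "runner callback must be set");
    for (const std::string& cmd : candidates) {
        if (run(cmd, out)) {
            return true;  // first successful candidate wins
        }
    }
    err << "error: could not run any of " << candidates.size()
        << " candidate commands\n";
    return false;
}
```

Both getDeviceNumber and getDevicePCIBusNumRemote could call a helper like this with the same candidate list and then parse `out` differently, which keeps the lookup logic in one place.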