There are now two implementations of printf in HIP:
1. The implementation for HCC is controlled by the HC_FEATURE_PRINTF
macro, and it works only with the HCC compiler used in combination
with the HCC runtime.
2. The implementation for hip-clang requires the VDI runtime, and is
always enabled with that combination.
Change-Id: Ibaeda7900ffe2ce602ca0094aafed0f1147ac2b6
[ROCm/clr commit: d48738856c]
A clang bug on Windows currently causes duplicate -mllvm options in clang -cc1.
Temporarily disable -mllvm options for HIP-Clang on Windows until the bug is fixed.
Change-Id: I3a4393ba7745989398dc6c6001722837dad18704
[ROCm/clr commit: e796a1ed78]
extern "C" on Windows implies nothrow. We shouldn't be throwing exceptions either way.
Change-Id: If0ed1f7ec194bf7f65b7cea1a5c250e768a8f190
[ROCm/clr commit: a15e895cfd]
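A minimal sketch of the pattern the preceding commit (a15e895cfd) describes: an extern "C" entry point that catches everything internally and reports failure through its return value. All names here (SketchStatus, guardedCall, sketchEntryPoint) are hypothetical and are not taken from the HIP sources; the status codes are illustrative, not the HIP error enum.

```cpp
#include <new>
#include <stdexcept>
#include <utility>

// Hypothetical status codes for illustration only.
enum SketchStatus { kSketchSuccess = 0, kSketchOutOfMemory = 1, kSketchUnknown = 2 };

// Runs `body` and converts any escaping exception into a status code,
// so nothing propagates across the C boundary.
template <typename Fn>
SketchStatus guardedCall(Fn&& body) noexcept {
    try {
        std::forward<Fn>(body)();
        return kSketchSuccess;
    } catch (const std::bad_alloc&) {
        return kSketchOutOfMemory;
    } catch (...) {
        return kSketchUnknown;
    }
}

// Hypothetical C entry point. It is marked noexcept explicitly so the
// contract is the same on every platform, not only where extern "C"
// is treated as non-throwing.
extern "C" int sketchEntryPoint(int shouldFail) noexcept {
    return guardedCall([&] {
        if (shouldFail) throw std::runtime_error("simulated failure");
    });
}
```

Calling sketchEntryPoint(0) takes the success path, while a non-zero argument exercises the catch-all branch.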
Handle the signed int cases when converting hipResourceViewFormat to a number of channels.
Change-Id: Ica8ae6f644edfaa0d4803d0b8e90e320479118e2
[ROCm/clr commit: e746584d24]
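The format-to-channel mapping in the preceding commit (e746584d24) could look roughly like the sketch below. SketchViewFormat is a hypothetical stand-in for a small subset of hipResourceViewFormat, and channelCount is an assumed helper name; the real enum has more members and the real conversion code may be structured differently.

```cpp
// Hypothetical stand-in for a subset of hipResourceViewFormat.
enum SketchViewFormat {
    kSketchNone,
    kSketchUnsignedInt1, kSketchUnsignedInt2, kSketchUnsignedInt4,
    kSketchSignedInt1,   kSketchSignedInt2,   kSketchSignedInt4
};

// Maps a view format to its channel count. Signed and unsigned variants
// share a case label so neither family can be forgotten.
inline int channelCount(SketchViewFormat format) {
    switch (format) {
        case kSketchUnsignedInt1:
        case kSketchSignedInt1:
            return 1;
        case kSketchUnsignedInt2:
        case kSketchSignedInt2:
            return 2;
        case kSketchUnsignedInt4:
        case kSketchSignedInt4:
            return 4;
        case kSketchNone:
            break;
    }
    return 0;  // unknown or "none" format: no channels
}
```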
HIP assumes that image width is in bytes, but VDI assumes that image width is in pixels. Perform the byte -> pixel conversion before doing anything else.
Change-Id: Ia9fd1f46d05db3fbe8049add10b4d7e5118a2b9a
[ROCm/clr commit: 801c70279f]
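The byte -> pixel conversion from the preceding commit (801c70279f) amounts to dividing the width by the size of one pixel. The sketch below assumes a hypothetical helper, widthBytesToPixels, and assumes that the pixel size is the channel count times the bytes per channel; neither the name nor the validation is taken from the actual change.

```cpp
#include <cstddef>

// Hypothetical helper: converts a width given in bytes (the HIP convention)
// to a width in pixels (the VDI convention).
// bytesPerPixel is assumed to be channels * bytes per channel.
// Returns false if the byte width is not a whole number of pixels.
inline bool widthBytesToPixels(std::size_t widthInBytes,
                               std::size_t bytesPerPixel,
                               std::size_t* widthInPixels) {
    if (bytesPerPixel == 0 || widthInBytes % bytesPerPixel != 0) {
        return false;
    }
    *widthInPixels = widthInBytes / bytesPerPixel;
    return true;
}
```

For example, a 1024-byte row of 4-channel float pixels (16 bytes each) is 64 pixels wide.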
Fix compilation error on hip-hcc+clang and hip-vdi+clang.
Enable the hipExtLaunchMultiKernelMultiDevice test on the hip-vdi path.
Use a common hipExtLaunchMultiKernelMultiDevice declaration for all paths.
Change-Id: I76031840614fce8e12a8e845548fa43a389a741a
[ROCm/clr commit: 5a6c605730]
SWDEV-225266: [HIP-VDI] HIP-VDI disabled tests (p2p_copy_coherency.cpp)
SWDEV-225388: hipTestDeviceSymbol.cpp & hipTestConstant.cpp failed to build on hip-vdi
For hipTestDeviceSymbol.cpp & hipTestConstant.cpp tests:
Currently the "__HIP_VDI__" flag is enabled in CMakeLists.txt, but when an application is compiled with hipcc,
__HIP_VDI__ is not defined, so the headers cannot tell whether they are being compiled for VDI or HCC.
For ./src/runtimeApi/memory/p2p_copy_coherency.cpp:
Fixed the "<hc_am.hpp> not found" compilation error by including the header only when compiling for the HCC runtime.
The test is currently disabled on all platforms. When validated on a multi-GPU machine,
memcpy between multiple GPUs via GPU synchronization does not work on the hcc and vdi paths.
Need to validate on an nvidia machine to know if the test is valid. Disabled the GPU synchronization test for now.
For ./src/runtimeApi/module/hipModuleTexture2dDrv.cpp:
Updated the test to generate the tex2d_kernel.code object in the build directory, which is where ctest currently looks for it.
Change-Id: I629d395a919c2440d921422716944c7940ed6010
[ROCm/clr commit: ae465bc338]
~45% to 50% performance drop on rocBLAS_int8 test
Enable the cudaSetDeviceFlags() API call. Use active wait by default
for all devices.
Change-Id: Ifc2ebe3dd9b0aa3fdbfbc9cb5c2cd8b3b726124f
[ROCm/clr commit: b93d997fb7]
Incoming changes from upstream split the struct hipMemcpy3DParms into two separate ones - hipMemcpy3DParms and HIP_MEMCPY3D, which are cudaMemcpy3DParms and CUDA_MEMCPY3D equivalents respectively.
Note that HIP_MEMCPY3D is missing half the members of CUDA_MEMCPY3D (this should be fixed in PR#1887). Work around this by using a substitute _HIP_MEMCPY3D struct for now.
Change-Id: Ic15134e6deb260189b662b3804d2309a9b8473e9
[ROCm/clr commit: d28b77bf23]