Fix issues of missing kernel function symbols and missing argument list via
using __hipRegister* functions.
Then the following tests can pass,
directed_tests/runtimeApi/module/hipFuncGetAttributes
directed_tests/runtimeApi/module/hipExtLaunchMultiKernelMultiDevice
directed_tests/gcc/LaunchKernel
Change-Id: I52135b61e8283eb4f9f10f77895151e4e55418d9
The hipExtModuleLaunchKernel and hipModuleLoadDataMultiThreaded tests keeps randomly failing on Jenkins.
Change-Id: I87e5d54fb7429c14ff1dcecb20e03a7816670fae
This is currently so buggy that it causes a runtime crash on Nvidia platfrom...
Disable the new version for hcc and vdi, header fixes are required for it to pass.
Currently tex1D<char, hipTextureType1D, hipReadModeNormalizedFloat> returns a char, when the actual sampled pixel value is a float, so the hi 3 bytes get lost.
Change-Id: I8222a4d8d1d8b101eb43f3f8dfbe4818f885f8ea
Fix compilation error on hip-hcc+clang , hip-vdi+clang
Enabled hipExtLaunchMultiKernelMultiDevice test on hip-vdi path
hipExtLaunchMultiKernelMultiDevice common declaration for all paths
Change-Id: I76031840614fce8e12a8e845548fa43a389a741a
SWDEV-225266: [HIP-VDI] HIP-VDI disabled tests (p2p_copy_coherency.cpp)
SWDEV-225388: hipTestDeviceSymbol.cpp & hipTestConstant.cpp failed to build on hip-vdi
For hipTestDeviceSymbol.cpp & hipTestConstant.cpp tests:
Currently "__HIP_VDI__" flag is enabled in CMakeLists.txt, but when application is compiled with hipcc,
__HIP_VDI__ is not defined to differentiate if compiled for VDI/HCC for headers.
For ./src/runtimeApi/memory/p2p_copy_coherency.cpp:
Fixed compilation issue to include only when compile for HCC runtime "<hc_am.hpp> not found"
Currently test is disabled to run on all platforms. When validated on multi-GPU machine,
memcpy between multiple GPUs via GPU synchronization is not working on hcc and vdi path.
Need to validate on nvidia machine to know if test is valid. Disabled GPU synchronization test for now.
For ./src/runtimeApi/module/hipModuleTexture2dDrv.cpp:
updated test to generate tex2d_kernel.code object in build directory. Currently ctest looks for it in build directory.
Change-Id: I629d395a919c2440d921422716944c7940ed6010
There were several error messages that appeared even if the hipEnvVarDriver.exe test passes and executes successfully. Now it is cleaned up. The following are those instances:
* When popen searches for directed_test directory but does not find it, it outputs an error, then finds the hipEnvVar at the same level. Currently the fix will prompt the test to only output an error if both searches for hipEnvVar fails.
* When assertion is used towards the later half of the test, conditions were set to specifically hide the devices, resulting in No Hip Device detected in the latter half of the test. The fix will make these errors not appear as they are intended to not find any devices. Assertions themselves are untouched.
HipEnvVarDriver.cpp has also been refactored. Reading HipEnvVar will now happen in a helper function for getDeviceNumber and getDevicePCIBusNumRemote, as the code to read HipEnvVar were really similar in them.
Review comments - generate hiprtc lib everytime when HIP_PLATFORM is hcc
Changes for hip-clang
Removing pre processor directive to simplify
Change-Id: Id38ab368362b58ee0458baeb8051fea709ae6bba
Temporarily comment out Hcc-specific template functions
hipExtLaunchKernelGGL and hipOccupancyMaxPotentialBlockSize for CLang
compiler so that all test cases under hip/samples can be built
successfully for Clang + Hip/Hcc runtime.
Change-Id: Iafc761257be4a7b34eafa6759a01f369570cd6ce
Fix the following issues:
1.Ignore hidden arguments of kernel functions.
2.Look up both origial function name and function name with .kd postfix
when argments are retrived from module.
3.Addition, fix compiling issue of LaunchKernel test app.
Change-Id: I9400943f2f02433cb4409b19c0cac3626c2bc454
* Use deque instead of vector for code readers so that the iterators and references will be stable
* Fix compile error
* Assign the iterator
* Add multithreaded test
* Make threads a multiple of hardware concurrency
* Output on failure
* Add setDevice to try and initialize the context on cuda
* Create context for cuda
* Set context on each thread
* Reduce threads on cuda
* Skip test on cuda
* Try to initialize the primary context on cuda
* Push ctx to the stack as current
* Revert "Push ctx to the stack as current"
This reverts commit bff8cbe950.
* Revert "Try to initialize the primary context on cuda"
This reverts commit fd98514113.
* updated test for nvidia path
* Add c++11 option for nvcc
Co-authored-by: satyanveshd <53337087+satyanveshd@users.noreply.github.com>