+ Affects cuDNN and CUB tests, whose library paths are defined by CUDA_DNN_ROOT_DIR and CUDA_CUB_ROOT_DIR
+ Warn when tests are excluded and state why, for instance:
"WARN: cuDNN tests are excluded due to unset CUDA_DNN_ROOT_DIR"
+ Add one matcher (more will follow)
+ Update Maps and Statistics
+ Add cub_01.cu unit test
+ Update lit harness to support standalone CUB
+ Update README.md
+ Update hipify-perl (only CUB header is supported for now)
[IMPORTANT]
clang (and hipify-clang) works correctly only with the official NVLabs version of CUB on GitHub.
Compiling CUB from the official CUDA release conflicts with THRUST.
Thus, to compile CUB sources, the "-I" option should point to the CUB clone from NVLabs on GitHub.
directed_tests/runtimeApi/module/hipLaunchCooperativeKernel.tst - Disabling test temporarily until driver support is available.
directed_tests/runtimeApi/memory/hipArray.tst - Disabling test temporarily to reimplement it correctly.
Added new memory APIs: hipMemAllocPitch, hipMemAllocHost, hipMemsetD16, hipMemsetD16Async, hipMemsetD8Async
Modified hipMemcpyParam2DAsync and hipMemcpyParam2D to support all scenarios.
[REASON]
1. hip-clang handles the templated kernel launch, so brackets are unneeded: HIP_KERNEL_NAME(...) __VA_ARGS__
2. HCC does not, thus: HIP_KERNEL_NAME(...) (__VA_ARGS__)
[TODO] Remove kernel name wrapping entirely once HCC is finally obsolete.
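The two expansions can be compared in plain C++ without HIP. This is a minimal sketch: KERNEL_NAME_BARE, KERNEL_NAME_PAREN, and scale are hypothetical names, and an ordinary function template stands in for a templated kernel.

```cpp
// Plain C++ stand-ins for the two HIP_KERNEL_NAME variants described above.
// KERNEL_NAME_BARE, KERNEL_NAME_PAREN and scale are hypothetical names used
// only for this illustration; they are not part of HIP.
#define KERNEL_NAME_BARE(...) __VA_ARGS__     // hip-clang style: no brackets
#define KERNEL_NAME_PAREN(...) (__VA_ARGS__)  // HCC style: bracketed

// A template with two parameters, so its instantiated name contains a comma.
template <typename T, int Factor>
T scale(T value) { return value * Factor; }

// The preprocessor splits "scale<int, 3>" at the comma into two arguments;
// the variadic macro joins them back into a single name.
inline int callBare(int v)  { return KERNEL_NAME_BARE(scale<int, 3>)(v); }
inline int callParen(int v) { return KERNEL_NAME_PAREN(scale<int, 3>)(v); }
```

Both forms compile in standard C++; the variadic parameter is what lets a name with a comma pass through the macro at all.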
+ Update perl generation, hipify-perl, and affected tests accordingly.
+ Perl part of [#1458]
+ Affected functions: hipFuncSetCacheConfig, hipFuncGetAttributes
+ Implement function generateHostFunctions() in hipify-clang for that purpose
+ Update hipify-perl accordingly
+ Affected functions: hipFuncSetCacheConfig, hipFuncGetAttributes
+ Add a corresponding Matcher cudaReinterpretCastArgFuncCall
+ Add reinterpret_cast.cu test
TODO: Do the same for hipify-perl
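A minimal sketch of the kind of call-site cast involved, in plain C++. It assumes the affected functions take the kernel as an untyped const void* argument; stubFuncHandle and axpyKernel are hypothetical stand-ins, not HIP APIs.

```cpp
#include <cstdint>

// Hypothetical stand-in for an API that is assumed to take the kernel
// as an untyped "const void*" handle.
inline std::uintptr_t stubFuncHandle(const void* func) {
    return reinterpret_cast<std::uintptr_t>(func);
}

// Ordinary function standing in for a __global__ kernel.
inline void axpyKernel(float, const float*, float*) {}

// A function pointer does not convert to const void* implicitly, so the
// function argument needs an explicit reinterpret_cast at the call site.
inline std::uintptr_t handleOf() {
    return stubFuncHandle(reinterpret_cast<const void*>(axpyKernel));
}
```

Converting a function pointer to an object pointer is conditionally supported in standard C++; GCC and Clang accept it.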
cudaMemcpyToSymbol, cudaMemcpyToSymbolAsync, cudaGetSymbolSize, cudaGetSymbolAddress, cudaMemcpyFromSymbol, cudaMemcpyFromSymbolAsync
+ Add a corresponding cudaSymbolFuncCall matcher.
+ Add device_symbols.cu test for the above 6 functions, update existing ones.
+ Fix dim3() type cast issue, update affected tests.
TODO: Do the same in hipify-perl
+ Do not treat somenamespace::device_function_name as a device function
+ Fix generation of warnUnsupportedDeviceFunctions function in hipify-clang
+ Update hipify-perl based on hipify-clang -perl generation
+ Update device test math_functions.cu for hipify-perl
[Restrictions]
- hipify-perl cannot yet handle function declarations in user namespaces
- hipify-perl cannot yet handle using directives
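The name clash behind these items can be sketched in plain C++. Only the qualified name somenamespace::device_function_name comes from the text above; the function body and the two helper functions are hypothetical.

```cpp
// User code that merely shares a name with a device function.
namespace somenamespace {
inline int device_function_name(int x) { return x + 1; }
}  // namespace somenamespace

// Qualified call: the namespace shows at the call site that this is user code.
inline int callQualified(int x) {
    return somenamespace::device_function_name(x);
}

// Unqualified call through a using directive: the namespace is no longer
// visible at the call site, which is the case listed as a restriction.
inline int callViaUsing(int x) {
    using namespace somenamespace;
    return device_function_name(x);
}
```

A purely textual tool sees the same identifier in both calls, so it needs the surrounding declarations to tell them apart.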
* [hip][tests] add a unit test for testing hipLaunchCooperativeKernel
* use __ockl_grid_sync function
* remove already defined __ockl_grid_sync function
* use sync function for grid synchronization
+ Add a corresponding matcher cudaDeviceFuncCall to match only (__device__ or __global__) and not __host__ functions.
+ Add a corresponding device functions mapping:
only unsupported functions are listed, because supported ones are exactly the same as in CUDA and need no transformation;
make FindAndReplace for device functions separate from host API calls.
+ Add a test to distinguish device functions from user-defined ones.
+ Start to translate preprocessor's false conditional blocks too:
based on clang's https://reviews.llvm.org/D66597;
available only starting from LLVM 10.0 or trunk.
+ Option -skip-excluded-preprocessor-conditional-blocks for skipping excluded conditional blocks:
the default behavior for hipify-clang built with LLVM < 10.0;
false by default for hipify-clang built with LLVM 10 or trunk.
+ Add 4 preprocessor unit tests, 2 of which are LLVM 10.0 only
+ Update a couple of existing tests by setting the -skip-excluded-preprocessor-conditional-blocks option:
update lit testing accordingly
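For reference, a false (excluded) conditional block is source text that the preprocessor drops before compilation. A minimal sketch in plain C++, where USE_LEGACY_PATH is a hypothetical macro that is deliberately left undefined:

```cpp
#include <string>

// USE_LEGACY_PATH is hypothetical and intentionally undefined, so the first
// branch is an excluded block: the compiler never sees it in this build.
#ifdef USE_LEGACY_PATH
inline std::string activeBranch() { return "legacy"; }   // excluded block
#else
inline std::string activeBranch() { return "default"; }  // compiled block
#endif
```

With -skip-excluded-preprocessor-conditional-blocks set, code such as the first branch is left as written; without it (LLVM 10.0 or trunk), the excluded branch is translated as well.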
* [dtests] refactor windows specific changes
* Refactor hipMemoryAllocateCoherentDriver - PR 1309
* Fix missing z in _putenv_s
* Revert "Fix missing z in _putenv_s"
This reverts commit 099a1b20a5c75c5f122d57c0ad2bca01745cdc9c.
* Refactor changes from PR 1299
* Update hipEnvVarDriver.cpp
* Removed the unneeded #include of sys/time.h, the gettimeofday() calls, and the timeval variables; this also avoids a compilation error on Windows, where no equivalent of gettimeofday() is available
* Changed the macro name from GPU_PRINT_TIME to MY_LAUNCH_MACRO
Changed the third arg of the functions __hip_as_write_block and __ockl_as_write_block from ulong to uint64_t to fix the compilation error on Windows
* Enabled gcc for hip host code
* Adding tests for hip code + (gcc & g++), without kernels
* Excluding nvcc platforms for gcc and g++ tests + Addressing review comments
* minor code clean-up
* Add rocm include path
* Added relative path for library
* Hiding unsupported functions for gcc
* Incorporating review comments
...while including the HIP main header file, which is now inserted after the #ifndef controlling macro, or after #pragma once, if it occurs earlier.
+ Add a couple of unit tests.
TODO: Check backward compatibility on older clang versions.
* Added support for the hipOccupancyMaxActiveBlocksPerMultiprocessor and hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags APIs
* Taking SGPR usage into account when determining the max active blocks in hipOccupancyMaxActiveBlocksPerMultiprocessor()
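A simplified sketch of how a register limit can bound the active block count. This is not the HIP implementation: the UnitLimits struct, all of its fields, and the min-of-limits model are assumptions made for illustration, and real limits depend on the GPU.

```cpp
#include <algorithm>

// Hypothetical per-compute-unit limits; the values are supplied by the
// caller and are not taken from the text.
struct UnitLimits {
    int maxWaves;    // hardware cap on resident waves
    int totalSgprs;  // scalar registers available
    int totalVgprs;  // vector registers available
};

// Apply each resource limit in turn; the smallest wave count wins, and
// whole blocks are then carved out of the waves that remain.
inline int maxActiveBlocks(const UnitLimits& hw, int wavesPerBlock,
                           int sgprsPerWave, int vgprsPerWave) {
    int waves = hw.maxWaves;
    if (sgprsPerWave > 0) waves = std::min(waves, hw.totalSgprs / sgprsPerWave);
    if (vgprsPerWave > 0) waves = std::min(waves, hw.totalVgprs / vgprsPerWave);
    return wavesPerBlock > 0 ? waves / wavesPerBlock : 0;
}
```

In this model a kernel with heavy SGPR usage can end up with fewer active blocks than the VGPR limit alone would allow, which is why the scalar registers need to be part of the calculation.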