- Fix 2 runtime API prototypes
`hipOccupancyMaxActiveBlocksPerMultiprocessor` and
`hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags`
- Add missing function templates of them in hip-clang.
[ROCm/hip commit: 5c8a7521f4]
The lifetime of the buffer given to
hsa_code_object_reader_create_from_memory must exceed that of the
code object reader. We need to create a copy of the code object
binary memory (file) that is kept allocated until the code object
reader is destroyed.
[ROCm/hip commit: 7473140a76]
This fixes a deadlock introduced by the switch to TTAS loops, and is therefore mildly urgent (to prevent the CI from hoovering in the broken code).
[ROCm/hip commit: a855a13c22]
Reverts changes made in #1399. This is a RT api test. For testing hipMemAllocPitch , a new test should be written and that should use correct memset API.
[ROCm/hip commit: 12e1a86ec1]
Adds hipMemcpy2DFromArray and hipMemcpy2DFromArrayAsync equivalent to cudaMemcpy2DFromArray and cudaMemcpy2DFromArrayAsync.
[ROCm/hip commit: 356765a223]
* Make CAS loops use the TTAS idiom.
* More efficient re-formulation of TTAS.
* Fix typo.
* The typo was not quite a typo
[ROCm/hip commit: 9ba25b42c8]
* [hip] add support for implicit kernel argument for multi-grid sync
* modified code for calculating the prev_sum
* change the impCoopArg type to size_t
* add memory clean up
* launch init_gws and main kernels into two separate loops
[ROCm/hip commit: 359dc79101]
[Algorithm]
[Release]
If CMAKE_INSTALL_PREFIX is set by the user:
If BIN_INSTALL_DIR is set by HIP, use it as CMAKE_INSTALL_PREFIX, otherwise CMAKE_INSTALL_PREFIX is used unchanged.
If the user does not set CMAKE_INSTALL_PREFIX (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT):
If BIN_INSTALL_DIR is set by HIP, use it as CMAKE_INSTALL_PREFIX, otherwise use PROJECT_BINARY_DIR/bin for installation.
[Debug]
If CMAKE_INSTALL_PREFIX is set by the user:
CMAKE_INSTALL_PREFIX is used unchanged.
If the user does not set CMAKE_INSTALL_PREFIX (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT):
use CMAKE_CURRENT_SOURCE_DIR/bin for installation.
Standalone build left unchanged: CMAKE_INSTALL_PREFIX is used if set.
[ROCm/hip commit: 75d70a6714]
By implicit unconditional passing -fno-delayed-template-parsing option (which appeared in LLVM 3.8.0, thus doesn't need compatibility wrapping) to hipify-clang.
[Reason] To parse uncalled template functions otherwise they are not parsed without calling, thus not hipified.
Affects cub_03.cu test, which has uncalled global template function.
[ROCm/hip commit: b6e6f12b54]