Evgeny Mankov
4e02b285d6
[HIP][doc] NVIDIA-nvcc -> HIP-nvcc
...
[ROCm/clr commit: 3df22b2fde ]
2019-10-28 22:46:33 +03:00
Evgeny Mankov
935dd4ce94
[HIP][doc] AMD-hcc -> HIP-hcc
...
[ROCm/clr commit: d312bce79d ]
2019-10-28 21:41:12 +03:00
Evgeny Mankov
20b127bf45
[HIP][doc] Fix typo: AMD-clang -> HIP-clang
...
HIP-clang is already used below instead of AMD-clang
[ROCm/clr commit: 6284b041e5 ]
2019-10-28 21:19:21 +03:00
Evgeny Mankov
1ef94a4f94
Merge pull request #1590 from emankov/doc
...
[HIPIFY][tests] Fix ambiguous call to cusparseGetErrorString declared in cusparse.h
[ROCm/clr commit: 7f367ff933 ]
2019-10-25 16:08:22 +03:00
Evgeny Mankov
bcc9d88b20
[HIPIFY][tests] Rename the ambiguous call as well
...
[ROCm/clr commit: f68bee02f5 ]
2019-10-25 16:07:31 +03:00
Evgeny Mankov
91732f98c0
[HIPIFY][tests] Fix ambiguous call to cusparseGetErrorString declared in cusparse.h
...
[ROCm/clr commit: 9529e1d91d ]
2019-10-25 16:04:20 +03:00
Alex Voicu
f22391c362
Add missing operators, fix GCC compilation. ( #1589 )
...
[ROCm/clr commit: 40522e2b6a ]
2019-10-25 15:44:24 +05:30
Alex Voicu
acbee5a48b
Fix deadlock, remove old __sync_* use. ( #1584 )
...
This fixes a deadlock introduced by the switch to TTAS loops, and is therefore mildly urgent (to prevent the CI from hoovering in the broken code).
[ROCm/clr commit: f909a393ff ]
2019-10-25 15:44:17 +05:30
Rahul Garg
9c599a3581
[dtest] Fix hipMemset2D test ( #1579 )
...
Reverts changes made in #1399 . This is a RT api test. For testing hipMemAllocPitch , a new test should be written and that should use correct memset API.
[ROCm/clr commit: 66a3c874c8 ]
2019-10-25 15:44:05 +05:30
Rahul Garg
7ea7a9c3b7
Add hipMemcpy2DFromArray ( #1510 )
...
Adds hipMemcpy2DFromArray and hipMemcpy2DFromArrayAsync equivalent to cudaMemcpy2DFromArray and cudaMemcpy2DFromArrayAsync.
[ROCm/clr commit: 14b870d1ce ]
2019-10-25 15:43:33 +05:30
Rahul Garg
6760e4065e
Update profiling doc ( #1576 )
...
[ROCm/clr commit: ff8d3fa446 ]
2019-10-24 17:51:55 +05:30
Jatin Chaudhary
e7f4cf4487
Adding New Analyze Target Merging with cppcheck ( #1583 )
...
[ROCm/clr commit: f53b1a1755 ]
2019-10-24 17:46:06 +05:30
Rahul Garg
7f429afe2e
Add HIP checks in texture driver sample ( #1581 )
...
[ROCm/clr commit: 170c4f0270 ]
2019-10-24 17:45:51 +05:30
gandryey
21a2925ee7
Hip vdi profiling header ( #1577 )
...
Add HIP-VDI profiling interface for GPU timing collection.
[ROCm/clr commit: f25692b399 ]
2019-10-24 17:45:42 +05:30
Alex Voicu
5b917afa5f
Make CAS loops use the TTAS idiom. ( #1573 )
...
* Make CAS loops use the TTAS idiom.
* More efficient re-formulation of TTAS.
* Fix typo.
* The typo was not quite a typo
[ROCm/clr commit: 26914ec76e ]
2019-10-24 17:45:20 +05:30
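
For readers unfamiliar with the TTAS (test-and-test-and-set) idiom named in the commit above, here is a minimal host-side sketch. It is not the code from the commit: `TtasSpinlock` and `count_with_lock` are hypothetical names, and the example uses `std::atomic` rather than the HIP device atomics. The idea it illustrates is that a waiter spins on a plain load and only attempts the compare-and-swap once the lock appears free.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical spinlock illustrating the TTAS idiom; not taken from HIP.
class TtasSpinlock {
public:
    void lock() {
        for (;;) {
            // "Test": spin on a read-only load while the lock is held,
            // so waiters do not issue read-modify-write operations.
            while (locked_.load(std::memory_order_relaxed)) {
                std::this_thread::yield();
            }
            // "Test-and-set": attempt the CAS only when the lock looked free.
            bool expected = false;
            if (locked_.compare_exchange_weak(expected, true,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed)) {
                return;
            }
        }
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Hypothetical helper: `threads` workers each increment a shared counter
// `iters` times under the lock and the final count is returned.
int count_with_lock(int threads, int iters) {
    TtasSpinlock lock;
    int counter = 0;
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&] {
            for (int i = 0; i < iters; ++i) {
                lock.lock();
                ++counter;
                lock.unlock();
            }
        });
    }
    for (auto& th : pool) th.join();
    return counter;
}
```

Calling `count_with_lock(4, 1000)` should return 4000 if the lock provides mutual exclusion. On a GPU, where threads in a wavefront execute in lockstep, spin loops of this kind need extra care, which may be related to the deadlock fixed in #1584.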
satyanveshd
ad1e409a24
Fix occupancy APIs ( #1560 )
...
Addresses SWDEV-205006
[ROCm/clr commit: 6c5fbf9b4a ]
2019-10-24 17:44:47 +05:30
searlmc1
510be4b5dc
Improve performance of v2 arg handling ( #1539 )
...
* Improve performance of v2 arg handling
* Missing change to `std::string`
[ROCm/clr commit: 15a699688e ]
2019-10-24 17:44:05 +05:30
Alex Voicu
fb411b56c2
Improve scalar access into vector types. ( #1531 )
...
The improvement is based on the ideas here: https://t0rakka.silvrback.com/simd-scalar-accessor . It yields significantly better ISA when the base's .xyzw members are used.
[ROCm/clr commit: 84d5b399f6 ]
2019-10-24 17:43:49 +05:30
Aryan Salmanpour
9e0eaef846
[hip] add support for implicit kernel argument for multi-grid sync ( #1456 )
...
* [hip] add support for implicit kernel argument for multi-grid sync
* modified code for calculating the prev_sum
* change the impCoopArg type to size_t
* add memory clean up
* launch init_gws and main kernels into two separate loops
[ROCm/clr commit: 93c688a0c9 ]
2019-10-24 17:43:30 +05:30
Rahul Garg
764135d242
Merge pull request #1559 from vsytch/win10_aligned_alloc
...
Fixes for hipMemcpy_simple on Windows
[ROCm/clr commit: 465581612e ]
2019-10-23 13:10:59 -07:00
Evgeny Mankov
48b264c154
Merge pull request #1578 from emankov/doc
...
[HIPIFY][cmake][#1571 ] Take into account building hipify-clang as a part of building HIP while installing
[ROCm/clr commit: 80bf79c2f8 ]
2019-10-23 21:23:05 +03:00
Evgeny Mankov
50d72e13ca
[HIPIFY][cmake][ #1571 ] Take into account building hipify-clang as a part of building HIP while installing
...
[Algorithm]
  [Release]
    If CMAKE_INSTALL_PREFIX is set by the user:
      If BIN_INSTALL_DIR is set by HIP, use it as CMAKE_INSTALL_PREFIX, otherwise CMAKE_INSTALL_PREFIX is used unchanged.
    If the user does not set CMAKE_INSTALL_PREFIX (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT):
      If BIN_INSTALL_DIR is set by HIP, use it as CMAKE_INSTALL_PREFIX, otherwise use PROJECT_BINARY_DIR/bin for installation.
  [Debug]
    If CMAKE_INSTALL_PREFIX is set by the user:
      CMAKE_INSTALL_PREFIX is used unchanged.
    If the user does not set CMAKE_INSTALL_PREFIX (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT):
      Use CMAKE_CURRENT_SOURCE_DIR/bin for installation.
Standalone build left unchanged: CMAKE_INSTALL_PREFIX is used if set.
[ROCm/clr commit: 2435567e70 ]
2019-10-23 18:54:45 +03:00
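
The install-prefix rules described in the commit above can be sketched as a small decision function. This is only an illustration of the stated algorithm for the case where hipify-clang is built as part of HIP, not the actual CMake code. The `InstallInputs` struct and `resolve_install_prefix` are hypothetical names, and an empty `bin_install_dir` is assumed to mean "BIN_INSTALL_DIR is not set by HIP".

```cpp
#include <string>

enum class BuildType { Release, Debug };

// Hypothetical container for the CMake variables named in the commit message.
struct InstallInputs {
    BuildType build_type;
    bool user_set_prefix;              // false models CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT
    std::string cmake_install_prefix;  // CMAKE_INSTALL_PREFIX
    std::string bin_install_dir;       // BIN_INSTALL_DIR; empty means not set by HIP (assumption)
    std::string project_binary_dir;    // PROJECT_BINARY_DIR
    std::string current_source_dir;    // CMAKE_CURRENT_SOURCE_DIR
};

// Returns the effective install prefix following the [Release]/[Debug] rules.
std::string resolve_install_prefix(const InstallInputs& in) {
    if (in.build_type == BuildType::Release) {
        // In Release, BIN_INSTALL_DIR wins whenever HIP sets it.
        if (!in.bin_install_dir.empty()) return in.bin_install_dir;
        return in.user_set_prefix ? in.cmake_install_prefix
                                  : in.project_binary_dir + "/bin";
    }
    // In Debug, BIN_INSTALL_DIR is not consulted.
    return in.user_set_prefix ? in.cmake_install_prefix
                              : in.current_source_dir + "/bin";
}
```

For example, a Release build with BIN_INSTALL_DIR set resolves to that directory regardless of whether the user supplied a prefix, while a Debug build with no user prefix falls back to the source directory's `bin` subdirectory.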
Evgeny Mankov
b34b56a761
Merge pull request #1574 from emankov/hipify-clang
...
[HIPIFY] Disable delayed template parsing
[ROCm/clr commit: 3d16a8b121 ]
2019-10-22 19:09:13 +03:00
Evgeny Mankov
0896e41987
[HIPIFY] Disable delayed template parsing
...
Done by implicitly and unconditionally passing the -fno-delayed-template-parsing option to hipify-clang (the option appeared in LLVM 3.8.0, so it needs no compatibility wrapping).
[Reason] Without it, template functions that are never called are not parsed and therefore not hipified.
Affects the cub_03.cu test, which has an uncalled global template function.
[ROCm/clr commit: 7ab06b3892 ]
2019-10-22 19:07:37 +03:00
Evgeny Mankov
7426cbee0d
Merge pull request #1570 from emankov/doc
...
[HIPIFY][#1569 ] Fix
[ROCm/clr commit: bf879f9c86 ]
2019-10-22 11:13:47 +03:00
Evgeny Mankov
82222bf945
[HIPIFY][ #1569 ] Fix
...
[ROCm/clr commit: e2191e23e6 ]
2019-10-22 11:08:37 +03:00
Evgeny Mankov
fe97898c1a
Merge pull request #1568 from emankov/hipify-clang
...
[HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA major…
[ROCm/clr commit: 62fc6f8487 ]
2019-10-21 17:52:02 +03:00
Evgeny Mankov
e3cf10192c
[HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA major.minor version
...
[Reason] To support maximum CUDA features in offline tests
+ Add defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 600 restriction for atomicAdd on doubles in atomics.cu.
So if LLVM < 7 and --cuda-gpu-arch doesn't work, __CUDA_ARCH__ is unset too (350 by default in clang);
if LLVM >= 7 --cuda-gpu-arch is used and __CUDA_ARCH__ is set based on it.
[ROCm/clr commit: 3233a845f6 ]
2019-10-21 17:50:00 +03:00
Evgeny Mankov
c021444d97
Merge pull request #1567 from emankov/hipify-clang
...
[HIPIFY][perl] Support of 'using namespace cub'
[ROCm/clr commit: ef25774dae ]
2019-10-21 17:16:32 +03:00
Evgeny Mankov
de849a44e7
[HIPIFY][perl] Support of 'using namespace cub'
...
[ROCm/clr commit: 9633cdbd8a ]
2019-10-21 17:15:05 +03:00
Evgeny Mankov
514f64146e
Merge pull request #1566 from emankov/hipify-clang
...
[HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA version
[ROCm/clr commit: 3cf3572237 ]
2019-10-21 15:54:34 +03:00
Evgeny Mankov
665a200247
[HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA version
...
[Reason] To support maximum CUDA features in offline tests
+ Add CUDA_VERSION >= 800 restriction for atomics.cu
[TODO] Find a way to use or exclude atomicAdd for doubles if LLVM < 7, because
LLVM 6.0.1 and older do not use --cuda-gpu-arch in clang's Driver code at all (option is only declared)
[ROCm/clr commit: 9fc7afa738 ]
2019-10-21 15:51:25 +03:00
Evgeny Mankov
50da6fc3ac
Merge pull request #1565 from emankov/hipify-clang
...
[HIPIFY][tests] Set -I for CUDA path instead of --cuda-path for LLVM < 4
[ROCm/clr commit: a47281e8ad ]
2019-10-20 20:10:25 +03:00
Evgeny Mankov
3a45daed0a
[HIPIFY][tests] Set -I for CUDA path instead of --cuda-path for LLVM < 4
...
[ROCm/clr commit: ff6057d1ff ]
2019-10-20 20:08:56 +03:00
Evgeny Mankov
c1ef259696
Merge pull request #1564 from emankov/hipify-clang
...
[HIPIFY][tests] Exclude all CUB tests if CUDA_CUB_ROOT_DIR is not set
[ROCm/clr commit: 8a4c860ae4 ]
2019-10-20 20:04:18 +03:00
Evgeny Mankov
e07be75489
[HIPIFY][tests] Exclude all CUB tests if CUDA_CUB_ROOT_DIR is not set
...
[ROCm/clr commit: 5bf1ff19ff ]
2019-10-20 20:03:18 +03:00
Vladislav Sytchenko
33acfa17c1
Remove extra #endif.
...
[ROCm/clr commit: 432380aa5d ]
2019-10-18 16:40:29 -04:00
Evgeny Mankov
7bd5ee880b
Merge pull request #1562 from emankov/doc
...
[HIPIFY][CUB][#1460 ] Add "using namespace cub" translation support
[ROCm/clr commit: 01819a4a24 ]
2019-10-18 18:56:34 +03:00
Evgeny Mankov
bb20336fa6
[HIPIFY][tests] Test clean-up
...
[ROCm/clr commit: 44a897a146 ]
2019-10-18 18:55:52 +03:00
Evgeny Mankov
85281b1d86
[HIPIFY][CUB][ #1460 ] Add "using namespace cub" translation support
...
+ Add cub_03.cu
[ROCm/clr commit: 86f6756b02 ]
2019-10-18 18:51:40 +03:00
Evgeny Mankov
a392a050d6
Merge pull request #1558 from aaronenyeshi/fix-hipify-cmake-version
...
[HIPIFY][cmake] Make CMakeLists use default 3.5.1 for Ubuntu 16.04
[ROCm/clr commit: eb6690bbba ]
2019-10-18 06:39:35 +03:00
Rahul Garg
30759e7c9b
Merge pull request #1550 from yxsamliu/new-launch
...
Add -fhip-new-launch-api to hipcc for HIP/VDI
[ROCm/clr commit: 07eed1e5bf ]
2019-10-17 19:07:32 -07:00
Vladislav Sytchenko
54eddfc8f0
_aligned_malloc() on Windows first takes size, then alignment, which is the opposite of how the similar function behaves on Linux. Memory allocated by it also has to be freed using _aligned_free(), unlike Linux where we can use regular free().
...
Edit the aligned_alloc() macro and add an aligned_free() one to match the above behaviour.
[ROCm/clr commit: f4440817cb ]
2019-10-17 18:58:32 -04:00
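
The argument-order and deallocation differences described in the commit above can be sketched with a pair of portable wrappers. This is an illustration only, not the macros from the commit: `portable_aligned_alloc` and `portable_aligned_free` are hypothetical names, and the non-Windows branch assumes C++17 `std::aligned_alloc` is the "similar function" meant there.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#if defined(_WIN32)
#include <malloc.h>  // _aligned_malloc, _aligned_free
#endif

// Hypothetical wrapper with a single argument order on every platform.
inline void* portable_aligned_alloc(std::size_t alignment, std::size_t size) {
#if defined(_WIN32)
    // Windows: size comes first, then alignment.
    return _aligned_malloc(size, alignment);
#else
    // Elsewhere (assumed C++17): alignment comes first, then size.
    // For portability, size should be a multiple of alignment.
    return std::aligned_alloc(alignment, size);
#endif
}

// Hypothetical wrapper for the matching deallocation function.
inline void portable_aligned_free(void* ptr) {
#if defined(_WIN32)
    // Memory from _aligned_malloc must be released with _aligned_free.
    _aligned_free(ptr);
#else
    // Memory from std::aligned_alloc is released with regular free().
    std::free(ptr);
#endif
}
```

A caller would pair the two, for example `void* p = portable_aligned_alloc(64, 256);` followed by `portable_aligned_free(p);`, and never pass the pointer to plain `free()` directly, since that is invalid on Windows.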
Aaron Enye Shi
489e3dda9a
[HIPIFY][cmake] Make CMakeLists use default 3.5.1 for Ubuntu 16.04
...
[ROCm/clr commit: b3ea58abe7 ]
2019-10-17 21:21:24 +00:00
Evgeny Mankov
91aeabeb39
Merge pull request #1557 from emankov/hipify-clang
...
[HIPIFY][doc] Update README.md
[ROCm/clr commit: ab9072cecd ]
2019-10-17 22:28:16 +03:00
Evgeny Mankov
9fb60fa36a
[HIPIFY][doc] Update README.md
...
+ Versions, testing
[ROCm/clr commit: 1165e6bd71 ]
2019-10-17 22:26:48 +03:00
Rahul Garg
714314fa66
Revert "hipcc defaults to code object v3 ( #1298 )"
...
This reverts commit e5a2ba9602 .
[ROCm/clr commit: 446718f990 ]
2019-10-17 13:27:28 -04:00
Evgeny Mankov
416adb365c
Merge pull request #1554 from emankov/clang
...
[HIPIFY][cmake] Add install rule for clang-resource-headers
[ROCm/clr commit: 27adf6911d ]
2019-10-17 16:50:25 +03:00
Evgeny Mankov
c8238e1fd4
[HIPIFY][cmake] Add install rule for clang-resource-headers
...
+ Fix: set destination for all installing files to ${CMAKE_INSTALL_PREFIX}
[ROCm/clr commit: 8c3dff7ab9 ]
2019-10-17 15:05:55 +03:00
Rahul Garg
685a4cd182
Merge pull request #1544 from vsytch/master
...
QoL changes to the hipMemset family
[ROCm/clr commit: a21fe1443b ]
2019-10-16 18:54:20 -07:00