Anusha Godavarthy Surya
5f47e99ffe
merge from master
2019-10-25 15:52:09 +05:30
Anusha Godavarthy Surya
259d8b4cdf
Merge branch 'master' into tex_unbind_issue_fix
2019-10-25 15:36:55 +05:30
Anusha Godavarthy Surya
ce04bdaa1a
Fixed CI build failure
2019-10-25 12:21:41 +05:30
Rahul Garg
70f2cd1317
Update profiling doc ( #1576 )
2019-10-24 17:51:55 +05:30
Jatin Chaudhary
770d3412f8
Adding New Analyze Target Merging with cppcheck ( #1583 )
2019-10-24 17:46:06 +05:30
Rahul Garg
04e10814d8
Add HIP checks in texture driver sample ( #1581 )
2019-10-24 17:45:51 +05:30
gandryey
81952ce5a7
Hip vdi profiling header ( #1577 )
...
Add HIP-VDI profiling interface for GPU timing collection.
2019-10-24 17:45:42 +05:30
Alex Voicu
9ba25b42c8
Make CAS loops use the TTAS idiom. ( #1573 )
...
* Make CAS loops use the TTAS idiom.
* More efficient re-formulation of TTAS.
* Fix typo.
* The typo was not quite a typo
2019-10-24 17:45:20 +05:30
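For readers unfamiliar with the idiom named in the commit above, here is a minimal host-side sketch of a test-and-test-and-set (TTAS) style CAS loop. It is not the HIP implementation: the helper names (`float_bits`, `bits_float`, `ttas_atomic_add`) are hypothetical, and using a float add as the operation is an assumption made only for illustration. The idea is to do a cheap relaxed load first and attempt the expensive compare-and-swap only when the observed value still matches the expected one.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helpers (not from the HIP sources): bit-cast float <-> uint32_t.
inline std::uint32_t float_bits(float f) {
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

inline float bits_float(std::uint32_t u) {
    float f;
    std::memcpy(&f, &u, sizeof f);
    return f;
}

// Adds `val` to the float stored bitwise in `cell` and returns the previous value.
inline float ttas_atomic_add(std::atomic<std::uint32_t>& cell, float val) {
    std::uint32_t expected = cell.load(std::memory_order_relaxed);
    for (;;) {
        // Test: a read-only check that does not need exclusive ownership of the line.
        const std::uint32_t seen = cell.load(std::memory_order_relaxed);
        if (seen != expected) {
            expected = seen;  // someone else changed it; retry with the fresh value
            continue;
        }
        const std::uint32_t desired = float_bits(bits_float(expected) + val);
        // Test-and-set: the read-modify-write is attempted only after the cheap test passed.
        if (cell.compare_exchange_weak(expected, desired, std::memory_order_relaxed)) {
            return bits_float(expected);  // on success `expected` still holds the old bits
        }
        // On failure compare_exchange_weak has refreshed `expected`; loop again.
    }
}

// Usage example.
inline void ttas_demo() {
    std::atomic<std::uint32_t> cell{float_bits(1.0f)};
    const float previous = ttas_atomic_add(cell, 0.5f);
    assert(previous == 1.0f);
    assert(bits_float(cell.load()) == 1.5f);
}
```

Relaxed ordering is used throughout because the sketch only needs atomicity of the update, not ordering with respect to other memory; real code may need stronger orderings.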
satyanveshd
af351d7e1b
Fix occupancy APIs ( #1560 )
...
Addresses SWDEV-205006
2019-10-24 17:44:47 +05:30
searlmc1
c4a51f3679
Improve performance of v2 arg handling ( #1539 )
...
* Improve performance of v2 arg handling
* Missing change to `std::string`
2019-10-24 17:44:05 +05:30
Alex Voicu
4a635add45
Improve scalar access into vector types. ( #1531 )
...
The improvement is based on the ideas here: https://t0rakka.silvrback.com/simd-scalar-accessor . It yields significantly better ISA when the base's .xyzw members are used.
2019-10-24 17:43:49 +05:30
Aryan Salmanpour
359dc79101
[hip] add support for implicit kernel argument for multi-grid sync ( #1456 )
...
* [hip] add support for implicit kernel argument for multi-grid sync
* modified code for calculating the prev_sum
* change the impCoopArg type to size_t
* add memory clean up
* launch init_gws and main kernels into two separate loops
2019-10-24 17:43:30 +05:30
Rahul Garg
fe5f7d4245
Merge pull request #1559 from vsytch/win10_aligned_alloc
...
Fixes for hipMemcpy_simple on Windows
2019-10-23 13:10:59 -07:00
Evgeny Mankov
29e04f99b5
Merge pull request #1578 from emankov/doc
...
[HIPIFY][cmake][#1571 ] Take into account building hipify-clang as a part of building HIP while installing
2019-10-23 21:23:05 +03:00
Evgeny Mankov
75d70a6714
[HIPIFY][cmake][ #1571 ] Take into account building hipify-clang as a part of building HIP while installing
...
[Algorithm]
[Release]
  If CMAKE_INSTALL_PREFIX is set by the user:
    If BIN_INSTALL_DIR is set by HIP, use it as CMAKE_INSTALL_PREFIX, otherwise CMAKE_INSTALL_PREFIX is used unchanged.
  If the user does not set CMAKE_INSTALL_PREFIX (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT):
    If BIN_INSTALL_DIR is set by HIP, use it as CMAKE_INSTALL_PREFIX, otherwise use PROJECT_BINARY_DIR/bin for installation.
[Debug]
  If CMAKE_INSTALL_PREFIX is set by the user:
    CMAKE_INSTALL_PREFIX is used unchanged.
  If the user does not set CMAKE_INSTALL_PREFIX (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT):
    Use CMAKE_CURRENT_SOURCE_DIR/bin for installation.
Standalone build left unchanged: CMAKE_INSTALL_PREFIX is used if set.
2019-10-23 18:54:45 +03:00
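The install-prefix rules in the commit above (for hipify-clang built as part of HIP) can be restated as a small pure function. This is only a sketch of the decision table, not the actual CMake code: the type and function names (`InstallInputs`, `choose_install_prefix`) are hypothetical, and an empty string is assumed to mean "not set".

```cpp
#include <cassert>
#include <string>

enum class BuildType { Release, Debug };

// Hypothetical input record; an empty string stands for "not set".
struct InstallInputs {
    BuildType build_type;
    std::string user_prefix;         // CMAKE_INSTALL_PREFIX as set by the user
    std::string bin_install_dir;     // BIN_INSTALL_DIR as set by HIP
    std::string project_binary_dir;  // PROJECT_BINARY_DIR
    std::string current_source_dir;  // CMAKE_CURRENT_SOURCE_DIR
};

inline std::string choose_install_prefix(const InstallInputs& in) {
    if (in.build_type == BuildType::Release) {
        // In Release, BIN_INSTALL_DIR wins whether or not the user set a prefix.
        if (!in.bin_install_dir.empty()) return in.bin_install_dir;
        if (!in.user_prefix.empty()) return in.user_prefix;
        return in.project_binary_dir + "/bin";
    }
    // In Debug, BIN_INSTALL_DIR is not consulted.
    if (!in.user_prefix.empty()) return in.user_prefix;
    return in.current_source_dir + "/bin";
}

// Usage example with made-up paths.
inline void install_prefix_demo() {
    const InstallInputs in{BuildType::Debug, "", "/opt/hip/bin", "/build", "/src"};
    assert(choose_install_prefix(in) == "/src/bin");
}
```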
Evgeny Mankov
cc9efa707c
Merge pull request #1574 from emankov/hipify-clang
...
[HIPIFY] Disable delayed template parsing
2019-10-22 19:09:13 +03:00
Evgeny Mankov
b6e6f12b54
[HIPIFY] Disable delayed template parsing
...
Implicitly and unconditionally pass the -fno-delayed-template-parsing option (which appeared in LLVM 3.8.0, so it needs no compatibility wrapping) to hipify-clang.
[Reason] Uncalled template functions are otherwise not parsed, and thus not hipified.
Affects the cub_03.cu test, which has an uncalled global template function.
2019-10-22 19:07:37 +03:00
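To illustrate the situation the commit above describes, here is a minimal plain C++ sketch (not taken from cub_03.cu; both function names are hypothetical). With delayed template parsing enabled, which clang does by default for MSVC-compatible targets, the body of a template that is never instantiated is not parsed, so a source-to-source tool built on clang's AST has nothing to match inside it.

```cpp
// Never instantiated in this translation unit: with delayed template parsing
// its body would be skipped, so any API calls inside it would go unseen.
template <typename T>
T scale(T value, T factor) {
    return value * factor;
}

// Instantiated below, so its body is parsed in either mode.
template <typename T>
constexpr T add_one(T value) {
    return value + T(1);
}

// The only instantiation in this file: add_one<int>.
inline int use_add_one() {
    return add_one(41);
}
```

Passing -fno-delayed-template-parsing makes clang parse `scale` as well, even though nothing calls it.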
Evgeny Mankov
76c8406449
Merge pull request #1570 from emankov/doc
...
[HIPIFY][#1569 ] Fix
2019-10-22 11:13:47 +03:00
Evgeny Mankov
6f88c81a78
[HIPIFY][ #1569 ] Fix
2019-10-22 11:08:37 +03:00
Evgeny Mankov
239fb0a098
Merge pull request #1568 from emankov/hipify-clang
...
[HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA major…
2019-10-21 17:52:02 +03:00
Evgeny Mankov
39e7d213cf
[HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA major.minor version
...
[Reason] To support maximum CUDA features in offline tests
+ Add defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 600 restriction for atomicAdd on doubles in atomics.cu.
So if LLVM < 7 and --cuda-gpu-arch doesn't work, __CUDA_ARCH__ is unset too (350 by default in clang);
if LLVM >= 7 --cuda-gpu-arch is used and __CUDA_ARCH__ is set based on it.
2019-10-21 17:50:00 +03:00
Evgeny Mankov
2ddde17039
Merge pull request #1567 from emankov/hipify-clang
...
[HIPIFY][perl] Support of 'using namespace cub'
2019-10-21 17:16:32 +03:00
Evgeny Mankov
b08f29a6fa
[HIPIFY][perl] Support of 'using namespace cub'
2019-10-21 17:15:05 +03:00
Evgeny Mankov
1caeb5613d
Merge pull request #1566 from emankov/hipify-clang
...
[HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA version
2019-10-21 15:54:34 +03:00
Evgeny Mankov
14b4df126c
[HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA version
...
[Reason] To support maximum CUDA features in offline tests
+ Add CUDA_VERSION >= 800 restriction for atomics.cu
[TODO] Find a way to use or exclude atomicAdd for doubles if LLVM < 7, because
LLVM 6.0.1 and older do not use --cuda-gpu-arch in clang's Driver code at all (option is only declared)
2019-10-21 15:51:25 +03:00
Evgeny Mankov
abb34bab8e
Merge pull request #1565 from emankov/hipify-clang
...
[HIPIFY][tests] Set -I for CUDA path instead of --cuda-path for LLVM < 4
2019-10-20 20:10:25 +03:00
Evgeny Mankov
6cfea9b600
[HIPIFY][tests] Set -I for CUDA path instead of --cuda-path for LLVM < 4
2019-10-20 20:08:56 +03:00
Evgeny Mankov
7fb633bcc7
Merge pull request #1564 from emankov/hipify-clang
...
[HIPIFY][tests] Exclude all CUB tests if CUDA_CUB_ROOT_DIR is not set
2019-10-20 20:04:18 +03:00
Evgeny Mankov
ccb075b1db
[HIPIFY][tests] Exclude all CUB tests if CUDA_CUB_ROOT_DIR is not set
2019-10-20 20:03:18 +03:00
Vladislav Sytchenko
664b115c44
Remove extra #endif.
2019-10-18 16:40:29 -04:00
Evgeny Mankov
3baf7f8d93
Merge pull request #1562 from emankov/doc
...
[HIPIFY][CUB][#1460 ] Add "using namespace cub" translation support
2019-10-18 18:56:34 +03:00
Evgeny Mankov
82adc93e69
[HIPIFY][tests] Test clean-up
2019-10-18 18:55:52 +03:00
Evgeny Mankov
98874c0e7f
[HIPIFY][CUB][ #1460 ] Add "using namespace cub" translation support
...
+ Add cub_03.cu
2019-10-18 18:51:40 +03:00
Evgeny Mankov
f0ed210b19
Merge pull request #1558 from aaronenyeshi/fix-hipify-cmake-version
...
[HIPIFY][cmake] Make CMakeLists use default 3.5.1 for Ubuntu 16.04
2019-10-18 06:39:35 +03:00
Rahul Garg
1fd16d7601
Merge pull request #1550 from yxsamliu/new-launch
...
Add -fhip-new-launch-api to hipcc for HIP/VDI
2019-10-17 19:07:32 -07:00
Vladislav Sytchenko
8f0a226660
_aligned_malloc() on Windows takes the size first and the alignment second, which is the opposite of how the similar function behaves on Linux. Memory allocated by it must also be freed with _aligned_free(), unlike on Linux, where regular free() can be used.
...
Edit the aligned_alloc() macro and add an aligned_free() macro to match the above behaviour.
2019-10-17 18:58:32 -04:00
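The platform difference described in the commit above can be sketched with two small wrappers. This is an illustration under assumptions, not the macros from the commit: the function names (`portable_aligned_alloc`, `portable_aligned_free`) are hypothetical, and the non-Windows branch assumes the C11 `aligned_alloc` from `<stdlib.h>` is the "similar function" meant.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>
#if defined(_WIN32)
#include <malloc.h>
#endif

// Allocates `size` bytes aligned to `alignment` (alignment first, as in C11).
inline void* portable_aligned_alloc(std::size_t alignment, std::size_t size) {
#if defined(_WIN32)
    // Windows: size comes first, then alignment.
    return _aligned_malloc(size, alignment);
#else
    // C11: alignment comes first; size should be a multiple of alignment.
    return aligned_alloc(alignment, size);
#endif
}

// Releases memory obtained from portable_aligned_alloc.
inline void portable_aligned_free(void* p) {
#if defined(_WIN32)
    _aligned_free(p);  // memory from _aligned_malloc must not go to free()
#else
    free(p);
#endif
}

// Usage example.
inline void aligned_alloc_demo() {
    void* p = portable_aligned_alloc(64, 256);
    assert(p != nullptr);
    assert(reinterpret_cast<std::uintptr_t>(p) % 64 == 0);
    portable_aligned_free(p);
}
```

Wrapping both the allocation and the release keeps call sites free of platform checks and makes it harder to pair `_aligned_malloc()` with `free()` by accident.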
Aaron Enye Shi
31e57f8b64
[HIPIFY][cmake] Make CMakeLists use default 3.5.1 for Ubuntu 16.04
2019-10-17 21:21:24 +00:00
Evgeny Mankov
10cc2f4ab3
Merge pull request #1557 from emankov/hipify-clang
...
[HIPIFY][doc] Update README.md
2019-10-17 22:28:16 +03:00
Evgeny Mankov
7ecbd71004
[HIPIFY][doc] Update README.md
...
+ Versions, testing
2019-10-17 22:26:48 +03:00
Rahul Garg
5f37f3174a
Revert "hipcc defaults to code object v3 ( #1298 )"
...
This reverts commit d39a2a0749 .
2019-10-17 13:27:28 -04:00
Evgeny Mankov
d8f512dcae
Merge pull request #1554 from emankov/clang
...
[HIPIFY][cmake] Add install rule for clang-resource-headers
2019-10-17 16:50:25 +03:00
Evgeny Mankov
f19e7c29df
[HIPIFY][cmake] Add install rule for clang-resource-headers
...
+ Fix: set destination for all installing files to ${CMAKE_INSTALL_PREFIX}
2019-10-17 15:05:55 +03:00
Rahul Garg
e1aac060da
Merge pull request #1544 from vsytch/master
...
QoL changes to the hipMemset family
2019-10-16 18:54:20 -07:00
Evgeny Mankov
84a73406e9
Merge pull request #1551 from emankov/clang
...
[HIPIFY][CUB][#1460 ] Add cub:: namespace support in TemplateInstantiation of cudaLaunchKernel
2019-10-16 19:05:18 +03:00
Evgeny Mankov
edfd05a86d
[HIPIFY][CUB][ #1460 ] Add cub:: namespace support in TemplateInstantiation of cudaLaunchKernel
...
+ Update cub_02.cu test accordingly
2019-10-16 19:02:13 +03:00
Vladislav Sytchenko
c747b77ac1
hipMemset2D and hipMemset3D tests should be passing by default.
2019-10-16 11:02:38 -04:00
Evgeny Mankov
e805e7d8cb
Merge pull request #1548 from emankov/clang
...
[HIPIFY] Refactor a couple of matcher functions
2019-10-16 13:45:59 +03:00
Evgeny Mankov
809a67a4f6
[HIPIFY] Refactor a couple of matcher functions
...
+ Separate out GetSubstrLocation function for finding substr SourceLocation in a given SourceRange
2019-10-16 13:43:56 +03:00
Evgeny Mankov
a80bad474b
Merge pull request #1547 from emankov/clang
...
[HIPIFY][CUB][#1460 ] Implement cubFunctionTemplateDecl matcher
2019-10-16 13:09:49 +03:00
Evgeny Mankov
6960574850
[HIPIFY][CUB][ #1460 ] Implement cubFunctionTemplateDecl matcher
...
+ Add cub_02.cu test
+ Partial fixes #1460
2019-10-16 13:08:11 +03:00