Commit graph

4097 Commits

Author SHA1 Message Date
Evgeny Mankov a4c7894255 [HIP][cmake] Simplify UNIX related code (the beginning)
[REASONS]
1. Make OS-dependent code clearer and more readable
2. Ease Windows support


[ROCm/clr commit: 4c5c6b4910]
2019-10-28 23:22:27 +03:00
Evgeny Mankov 1ef94a4f94 Merge pull request #1590 from emankov/doc
[HIPIFY][tests] Fix ambiguous call to cusparseGetErrorString declared in cusparse.h

[ROCm/clr commit: 7f367ff933]
2019-10-25 16:08:22 +03:00
Evgeny Mankov bcc9d88b20 [HIPIFY][tests] Rename the ambiguous call as well
[ROCm/clr commit: f68bee02f5]
2019-10-25 16:07:31 +03:00
Evgeny Mankov 91732f98c0 [HIPIFY][tests] Fix ambiguous call to cusparseGetErrorString declared in cusparse.h
[ROCm/clr commit: 9529e1d91d]
2019-10-25 16:04:20 +03:00
Alex Voicu f22391c362 Add missing operators, fix GCC compilation. (#1589)
[ROCm/clr commit: 40522e2b6a]
2019-10-25 15:44:24 +05:30
Alex Voicu acbee5a48b Fix deadlock, remove old __sync_* use. (#1584)
This fixes a deadlock introduced by the switch to TTAS loops, and is therefore mildly urgent (to prevent the CI from hoovering in the broken code).

[ROCm/clr commit: f909a393ff]
2019-10-25 15:44:17 +05:30
Rahul Garg 9c599a3581 [dtest] Fix hipMemset2D test (#1579)
Reverts changes made in #1399. This is an RT API test. To test hipMemAllocPitch, a new test should be written that uses the correct memset API.

[ROCm/clr commit: 66a3c874c8]
2019-10-25 15:44:05 +05:30
Rahul Garg 7ea7a9c3b7 Add hipMemcpy2DfromArray (#1510)
Adds hipMemcpy2DFromArray and hipMemcpy2DFromArrayAsync equivalent to cudaMemcpy2DFromArray and cudaMemcpy2DFromArrayAsync.

[ROCm/clr commit: 14b870d1ce]
2019-10-25 15:43:33 +05:30
Rahul Garg 6760e4065e Update profiling doc (#1576)
[ROCm/clr commit: ff8d3fa446]
2019-10-24 17:51:55 +05:30
Jatin Chaudhary e7f4cf4487 Adding New Analyze Target Merging with cppcheck (#1583)
[ROCm/clr commit: f53b1a1755]
2019-10-24 17:46:06 +05:30
Rahul Garg 7f429afe2e Add HIP checks in texture driver sample (#1581)
[ROCm/clr commit: 170c4f0270]
2019-10-24 17:45:51 +05:30
gandryey 21a2925ee7 Hip vdi profiling header (#1577)
Add HIP-VDI profiling interface for GPU timing collection.

[ROCm/clr commit: f25692b399]
2019-10-24 17:45:42 +05:30
Alex Voicu 5b917afa5f Make CAS loops use the TTAS idiom. (#1573)
* Make CAS loops use the TTAS idiom.

* More efficient re-formulation of TTAS.

* Fix typo.

* The typo was not quite a typo


[ROCm/clr commit: 26914ec76e]
2019-10-24 17:45:20 +05:30
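
The TTAS (test-and-test-and-set) idiom named in the commit above can be illustrated with a small spinlock. This is only a generic sketch in standard C++: the class name `TtasSpinLock` and its methods are hypothetical and are not taken from the HIP sources, which this log does not show.

```cpp
#include <atomic>

// Hypothetical illustration of the TTAS idiom; not the HIP implementation.
class TtasSpinLock {
public:
    void lock() {
        for (;;) {
            // "Test": spin on a plain load, which avoids issuing an atomic
            // read-modify-write while the lock is visibly held.
            while (locked_.load(std::memory_order_relaxed)) {
            }
            // "Test-and-set": attempt the CAS only once the lock looks free.
            bool expected = false;
            if (locked_.compare_exchange_weak(expected, true,
                                              std::memory_order_acquire,
                                              std::memory_order_relaxed)) {
                return;
            }
        }
    }

    bool try_lock() {
        // Same idea for a single attempt: cheap load first, CAS second.
        if (locked_.load(std::memory_order_relaxed)) {
            return false;
        }
        bool expected = false;
        return locked_.compare_exchange_strong(expected, true,
                                               std::memory_order_acquire,
                                               std::memory_order_relaxed);
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};
```

A plain CAS loop would run the read-modify-write on every iteration. The inner load loop is the only difference, and it is what keeps waiting threads from contending on the cache line with writes.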
satyanveshd ad1e409a24 Fix occupancy APIs (#1560)
Addresses SWDEV-205006

[ROCm/clr commit: 6c5fbf9b4a]
2019-10-24 17:44:47 +05:30
searlmc1 510be4b5dc Improve performance of v2 arg handling (#1539)
* Improve performance of v2 arg handling

* Missing change to `std::string`


[ROCm/clr commit: 15a699688e]
2019-10-24 17:44:05 +05:30
Alex Voicu fb411b56c2 Improve scalar access into vector types. (#1531)
The improvement is based on the ideas here: https://t0rakka.silvrback.com/simd-scalar-accessor. It yields significantly better ISA when the base's .xyzw members are used.

[ROCm/clr commit: 84d5b399f6]
2019-10-24 17:43:49 +05:30
Aryan Salmanpour 9e0eaef846 [hip] add support for implicit kernel argument for multi-grid sync (#1456)
* [hip] add support for implicit kernel argument for multi-grid sync

* modified code for calculating the prev_sum

* change the impCoopArg type to size_t

* add memory clean up

* launch init_gws and main kernels into two separate loops


[ROCm/clr commit: 93c688a0c9]
2019-10-24 17:43:30 +05:30
Rahul Garg 764135d242 Merge pull request #1559 from vsytch/win10_aligned_alloc
Fixes for hipMemcpy_simple on Windows

[ROCm/clr commit: 465581612e]
2019-10-23 13:10:59 -07:00
Evgeny Mankov 48b264c154 Merge pull request #1578 from emankov/doc
[HIPIFY][cmake][#1571] Take into account building hipify-clang as a part of building HIP while installing

[ROCm/clr commit: 80bf79c2f8]
2019-10-23 21:23:05 +03:00
Evgeny Mankov 50d72e13ca [HIPIFY][cmake][#1571] Take into account building hipify-clang as a part of building HIP while installing
[Algorithm]
  [Release]
    If CMAKE_INSTALL_PREFIX is set by the user:
       If BIN_INSTALL_DIR is set by HIP, use it as CMAKE_INSTALL_PREFIX, otherwise CMAKE_INSTALL_PREFIX is used unchanged.
    If the user does not set CMAKE_INSTALL_PREFIX (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT):
       If BIN_INSTALL_DIR is set by HIP, use it as CMAKE_INSTALL_PREFIX, otherwise use PROJECT_BINARY_DIR/bin for installation.
  [Debug]
    If CMAKE_INSTALL_PREFIX is set by the user:
       CMAKE_INSTALL_PREFIX is used unchanged.
    If the user does not set CMAKE_INSTALL_PREFIX (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT):
       use CMAKE_CURRENT_SOURCE_DIR/bin for installation.

Standalone build left unchanged: CMAKE_INSTALL_PREFIX is used if set.


[ROCm/clr commit: 2435567e70]
2019-10-23 18:54:45 +03:00
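
The install-prefix algorithm described in the commit above is a small decision table, which can be modelled as a pure function. The sketch below is written in C++ only to make the branches explicit; the real logic lives in CMake, and the names `InstallInputs` and `resolve_install_prefix` are hypothetical. An empty `bin_install_dir` stands for "BIN_INSTALL_DIR is not set by HIP", and any build type other than Release is treated as Debug, which is an assumption.

```cpp
#include <string>

enum class BuildType { Release, Debug };

// Hypothetical container for the CMake variables named in the commit message.
struct InstallInputs {
    BuildType build_type;
    bool prefix_set_by_user;           // false: CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT
    std::string cmake_install_prefix;  // CMAKE_INSTALL_PREFIX
    std::string bin_install_dir;       // BIN_INSTALL_DIR, empty if HIP did not set it
    std::string project_binary_dir;    // PROJECT_BINARY_DIR
    std::string current_source_dir;    // CMAKE_CURRENT_SOURCE_DIR
};

// Models the decision table for hipify-clang built as a part of HIP.
std::string resolve_install_prefix(const InstallInputs& in) {
    if (in.build_type == BuildType::Release) {
        // In Release, BIN_INSTALL_DIR wins whenever HIP has set it.
        if (!in.bin_install_dir.empty()) {
            return in.bin_install_dir;
        }
        return in.prefix_set_by_user ? in.cmake_install_prefix
                                     : in.project_binary_dir + "/bin";
    }
    // In Debug, BIN_INSTALL_DIR is not consulted.
    return in.prefix_set_by_user ? in.cmake_install_prefix
                                 : in.current_source_dir + "/bin";
}
```

For example, a Release build with no user prefix and no BIN_INSTALL_DIR resolves to `PROJECT_BINARY_DIR/bin`, while the same inputs in Debug resolve to `CMAKE_CURRENT_SOURCE_DIR/bin`.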
Evgeny Mankov b34b56a761 Merge pull request #1574 from emankov/hipify-clang
[HIPIFY] Disable delayed template parsing

[ROCm/clr commit: 3d16a8b121]
2019-10-22 19:09:13 +03:00
Evgeny Mankov 0896e41987 [HIPIFY] Disable delayed template parsing
By implicitly and unconditionally passing the -fno-delayed-template-parsing option (which appeared in LLVM 3.8.0, and thus doesn't need compatibility wrapping) to hipify-clang.

[Reason] To parse uncalled template functions; otherwise they are not parsed unless they are called, and thus are not hipified.

Affects the cub_03.cu test, which has an uncalled global template function.


[ROCm/clr commit: 7ab06b3892]
2019-10-22 19:07:37 +03:00
Evgeny Mankov 7426cbee0d Merge pull request #1570 from emankov/doc
[HIPIFY][#1569] Fix

[ROCm/clr commit: bf879f9c86]
2019-10-22 11:13:47 +03:00
Evgeny Mankov 82222bf945 [HIPIFY][#1569] Fix
[ROCm/clr commit: e2191e23e6]
2019-10-22 11:08:37 +03:00
Evgeny Mankov fe97898c1a Merge pull request #1568 from emankov/hipify-clang
[HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA major…

[ROCm/clr commit: 62fc6f8487]
2019-10-21 17:52:02 +03:00
Evgeny Mankov e3cf10192c [HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA major.minor version
[Reason] To support maximum CUDA features in offline tests

+ Add defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 600 restriction for atomicAdd on doubles in atomics.cu.
  So if LLVM < 7 and --cuda-gpu-arch doesn't work, __CUDA_ARCH__ is unset too (350 by default in clang);
  if LLVM >= 7 --cuda-gpu-arch is used and __CUDA_ARCH__ is set based on it.


[ROCm/clr commit: 3233a845f6]
2019-10-21 17:50:00 +03:00
Evgeny Mankov c021444d97 Merge pull request #1567 from emankov/hipify-clang
[HIPIFY][perl] Support of 'using namespace cub'

[ROCm/clr commit: ef25774dae]
2019-10-21 17:16:32 +03:00
Evgeny Mankov de849a44e7 [HIPIFY][perl] Support of 'using namespace cub'
[ROCm/clr commit: 9633cdbd8a]
2019-10-21 17:15:05 +03:00
Evgeny Mankov 514f64146e Merge pull request #1566 from emankov/hipify-clang
[HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA version

[ROCm/clr commit: 3cf3572237]
2019-10-21 15:54:34 +03:00
Evgeny Mankov 665a200247 [HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA version
[Reason] To support maximum CUDA features in offline tests

+ Add CUDA_VERSION >= 800 restriction for atomics.cu

[TODO] Find a way to use or exclude atomicAdd for doubles if LLVM < 7, because
LLVM 6.0.1 and older do not use --cuda-gpu-arch in clang's Driver code at all (option is only declared)


[ROCm/clr commit: 9fc7afa738]
2019-10-21 15:51:25 +03:00
Evgeny Mankov 50da6fc3ac Merge pull request #1565 from emankov/hipify-clang
[HIPIFY][tests] Set -I for CUDA path instead of --cuda-path for LLVM < 4

[ROCm/clr commit: a47281e8ad]
2019-10-20 20:10:25 +03:00
Evgeny Mankov 3a45daed0a [HIPIFY][tests] Set -I for CUDA path instead of --cuda-path for LLVM < 4
[ROCm/clr commit: ff6057d1ff]
2019-10-20 20:08:56 +03:00
Evgeny Mankov c1ef259696 Merge pull request #1564 from emankov/hipify-clang
[HIPIFY][tests] Exclude all CUB tests if CUDA_CUB_ROOT_DIR is not set

[ROCm/clr commit: 8a4c860ae4]
2019-10-20 20:04:18 +03:00
Evgeny Mankov e07be75489 [HIPIFY][tests] Exclude all CUB tests if CUDA_CUB_ROOT_DIR is not set
[ROCm/clr commit: 5bf1ff19ff]
2019-10-20 20:03:18 +03:00
Vladislav Sytchenko 33acfa17c1 Remove extra #endif.
[ROCm/clr commit: 432380aa5d]
2019-10-18 16:40:29 -04:00
Evgeny Mankov 7bd5ee880b Merge pull request #1562 from emankov/doc
[HIPIFY][CUB][#1460] Add "using namespace cub" translation support

[ROCm/clr commit: 01819a4a24]
2019-10-18 18:56:34 +03:00
Evgeny Mankov bb20336fa6 [HIPIFY][tests] Test clean-up
[ROCm/clr commit: 44a897a146]
2019-10-18 18:55:52 +03:00
Evgeny Mankov 85281b1d86 [HIPIFY][CUB][#1460] Add "using namespace cub" translation support
+ Add cub_03.cu


[ROCm/clr commit: 86f6756b02]
2019-10-18 18:51:40 +03:00
Evgeny Mankov a392a050d6 Merge pull request #1558 from aaronenyeshi/fix-hipify-cmake-version
[HIPIFY][cmake] Make CMakeLists use default 3.5.1 for Ubuntu 16.04

[ROCm/clr commit: eb6690bbba]
2019-10-18 06:39:35 +03:00
Rahul Garg 30759e7c9b Merge pull request #1550 from yxsamliu/new-launch
Add -fhip-new-launch-api to hipcc for HIP/VDI

[ROCm/clr commit: 07eed1e5bf]
2019-10-17 19:07:32 -07:00
Vladislav Sytchenko 54eddfc8f0 _aligned_malloc() on Windows first takes size, then alignment, which is the opposite of how the similar function behaves on Linux. Memory allocated by it also has to be freed using _aligned_free(), unlike Linux where we can use regular free().
Edit the aligned_alloc() macro and add an aligned_free() macro to match the above behaviour.


[ROCm/clr commit: f4440817cb]
2019-10-17 18:58:32 -04:00
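
The argument-order and deallocation differences described in the commit above can be hidden behind a pair of wrappers. The sketch below is an assumption-laden illustration, not the macros from the commit: the function names `portable_aligned_alloc` and `portable_aligned_free` are hypothetical, and on non-Windows platforms it uses POSIX `posix_memalign`, which may differ from the function the original macro wraps.

```cpp
#include <cstddef>
#include <cstdint>
#include <stdlib.h>
#if defined(_WIN32)
#include <malloc.h>
#endif

// Hypothetical wrapper: always takes (alignment, size), whatever the platform.
inline void* portable_aligned_alloc(std::size_t alignment, std::size_t size) {
#if defined(_WIN32)
    // Windows takes the size first and the alignment second.
    return _aligned_malloc(size, alignment);
#else
    // POSIX: alignment must be a power of two and a multiple of sizeof(void*).
    void* ptr = nullptr;
    if (posix_memalign(&ptr, alignment, size) != 0) {
        return nullptr;
    }
    return ptr;
#endif
}

// Hypothetical wrapper: releases memory obtained from portable_aligned_alloc.
inline void portable_aligned_free(void* ptr) {
#if defined(_WIN32)
    // Memory from _aligned_malloc must not be passed to free().
    _aligned_free(ptr);
#else
    free(ptr);
#endif
}
```

Pairing the allocation wrapper with a dedicated free wrapper keeps call sites identical on both platforms, which is the property the commit message describes.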
Aaron Enye Shi 489e3dda9a [HIPIFY][cmake] Make CMakeLists use default 3.5.1 for Ubuntu 16.04
[ROCm/clr commit: b3ea58abe7]
2019-10-17 21:21:24 +00:00
Evgeny Mankov 91aeabeb39 Merge pull request #1557 from emankov/hipify-clang
[HIPIFY][doc] Update README.md

[ROCm/clr commit: ab9072cecd]
2019-10-17 22:28:16 +03:00
Evgeny Mankov 9fb60fa36a [HIPIFY][doc] Update README.md
+ Versions, testing


[ROCm/clr commit: 1165e6bd71]
2019-10-17 22:26:48 +03:00
Rahul Garg 714314fa66 Revert "hipcc defaults to code object v3 (#1298)"
This reverts commit e5a2ba9602.


[ROCm/clr commit: 446718f990]
2019-10-17 13:27:28 -04:00
Evgeny Mankov 416adb365c Merge pull request #1554 from emankov/clang
[HIPIFY][cmake] Add install rule for clang-resource-headers

[ROCm/clr commit: 27adf6911d]
2019-10-17 16:50:25 +03:00
Evgeny Mankov c8238e1fd4 [HIPIFY][cmake] Add install rule for clang-resource-headers
+ Fix: set the destination for all installed files to ${CMAKE_INSTALL_PREFIX}


[ROCm/clr commit: 8c3dff7ab9]
2019-10-17 15:05:55 +03:00
Rahul Garg 685a4cd182 Merge pull request #1544 from vsytch/master
QoL changes to the hipMemset family

[ROCm/clr commit: a21fe1443b]
2019-10-16 18:54:20 -07:00
Evgeny Mankov 127e98dd46 Merge pull request #1551 from emankov/clang
[HIPIFY][CUB][#1460] Add cub:: namespace support in TemplateInstantiation of cudaLaunchKernel

[ROCm/clr commit: ada37f1b78]
2019-10-16 19:05:18 +03:00
Evgeny Mankov b357301610 [HIPIFY][CUB][#1460] Add cub:: namespace support in TemplateInstantiation of cudaLaunchKernel
+ Update cub_02.cu test accordingly


[ROCm/clr commit: e557563947]
2019-10-16 19:02:13 +03:00