Commit graph

4121 Commits

Author SHA1 Message Date
Rahul Garg aeb7cebbad Merge pull request #1515 from ansurya/tex_unbind_issue_fix
Fix undefined ref to hipUnbindTexture for texture types
2019-10-30 17:54:15 -07:00
Evgeny Mankov 961bc5737e Merge pull request #1593 from emankov/doc
[HIP][cmake] Move all *_INSTALL_DIR variables up before first add_subdirectory()
2019-10-30 22:10:05 +03:00
Rahul Garg b94f5bd667 Merge pull request #1607 from mhbliao/hliao/master/missing.api.hip.clang
[HIP] Correct headers and add missing function templates for hip-clang.
2019-10-30 07:48:57 -07:00
Michael LIAO 61bc68a5f4 [HIP] Correct headers and add missing function templates for hip-clang.
- Fix 2 runtime API prototypes
  `hipOccupancyMaxActiveBlocksPerMultiprocessor` and
  `hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags`
- Add missing function templates of them in hip-clang.
2019-10-29 22:00:11 -04:00
Rahul Garg 9840cdac99 Merge pull request #1602 from ROCm-Developer-Tools/revert-1560-satyanveshd/hipoccupy
Revert "Cooperative groups match with cuda SWDEV-205006"
2019-10-29 16:54:36 -07:00
Evgeny Mankov daab61e8e8 Merge pull request #1604 from emankov/hipify
[HIPIFY][#1603] Fix
2019-10-29 22:12:39 +03:00
Evgeny Mankov 050fdad7b7 [HIPIFY][#1603] Fix 2019-10-29 22:10:36 +03:00
Rahul Garg 27221bc823 Revert "Fix occupany APIs (#1560)"
This reverts commit 6c5fbf9b4a.
2019-10-29 11:41:08 -07:00
Evgeny Mankov 8a7e6fb747 Merge pull request #1601 from emankov/hipify
[HIPIFY][Linux] Rollback --cuda-compile-host-device on Linux
2019-10-29 20:55:29 +03:00
Evgeny Mankov dd2243f2fa [HIPIFY][Linux] Rollback --cuda-compile-host-device on Linux
[Reason] It doesn't work with LLVM 9 and higher; Windows is fine
2019-10-29 20:53:54 +03:00
Evgeny Mankov 99c4a40da1 Merge pull request #1600 from emankov/hipify
[HIPIFY] Introduce --cuda-compile-host-device for LLVM >= 9
2019-10-29 19:47:15 +03:00
Evgeny Mankov 411b18a124 [HIPIFY] Introduce --cuda-compile-host-device for LLVM >= 9
* LLVM < 9 continues using --cuda-host-only
2019-10-29 19:42:53 +03:00
Evgeny Mankov 50df94be1e Merge pull request #1599 from emankov/hipify
[HIPIFY] cudaMemcpy2DFromArray(Async) support
2019-10-29 19:14:00 +03:00
Evgeny Mankov 5dd00bdf52 [HIPIFY] cudaMemcpy2DFromArray(Async) support 2019-10-29 19:12:42 +03:00
Evgeny Mankov 3921ea9057 Merge pull request #1594 from emankov/HIP
[HIP][doc] Fix typo: AMD-clang -> HIP-clang
2019-10-28 23:22:57 +03:00
Evgeny Mankov 3df22b2fde [HIP][doc] NVIDIA-nvcc -> HIP-nvcc 2019-10-28 22:46:33 +03:00
Evgeny Mankov d312bce79d [HIP][doc] AMD-hcc -> HIP-hcc 2019-10-28 21:41:12 +03:00
Evgeny Mankov 6284b041e5 [HIP][doc] Fix typo: AMD-clang -> HIP-clang
HIP-clang is already used below instead of AMD-clang
2019-10-28 21:19:21 +03:00
Evgeny Mankov 8100e084b8 [HIP][cmake] Move all *_INSTALL_DIR variables up before first add_subdirectory()
[REASON]
Those vars may be used by cmake in subdirectories (#1571)
2019-10-28 21:07:00 +03:00
Evgeny Mankov 7f367ff933 Merge pull request #1590 from emankov/doc
[HIPIFY][tests] Fix ambiguous call to cusparseGetErrorString declared in cusparse.h
2019-10-25 16:08:22 +03:00
Evgeny Mankov f68bee02f5 [HIPIFY][tests] Rename the ambiguous call as well 2019-10-25 16:07:31 +03:00
Evgeny Mankov 9529e1d91d [HIPIFY][tests] Fix ambiguous call to cusparseGetErrorString declared in cusparse.h 2019-10-25 16:04:20 +03:00
Anusha Godavarthy Surya 9332a39838 Merge branch 'master' into tex_unbind_issue_fix 2019-10-25 15:54:25 +05:30
Anusha Godavarthy Surya ae838f8cee merge from master 2019-10-25 15:52:09 +05:30
Alex Voicu 40522e2b6a Add missing operators, fix GCC compilation. (#1589) 2019-10-25 15:44:24 +05:30
Alex Voicu f909a393ff Fix deadlock, remove old __sync_* use. (#1584)
This fixes a deadlock introduced by the switch to TTAS loops, and is therefore mildly urgent (to prevent the CI from hoovering in the broken code).
2019-10-25 15:44:17 +05:30
Rahul Garg 66a3c874c8 [dtest] Fix hipMemset2D test (#1579)
Reverts changes made in #1399. This is an RT API test. For testing hipMemAllocPitch, a new test should be written, and it should use the correct memset API.
2019-10-25 15:44:05 +05:30
Rahul Garg 14b870d1ce Add hipMemcpy2DfromArray (#1510)
Adds hipMemcpy2DFromArray and hipMemcpy2DFromArrayAsync equivalent to cudaMemcpy2DFromArray and cudaMemcpy2DFromArrayAsync.
2019-10-25 15:43:33 +05:30
Anusha Godavarthy Surya c0fc5e718c Merge branch 'master' into tex_unbind_issue_fix 2019-10-25 15:36:55 +05:30
Anusha Godavarthy Surya b9c8dd8ac6 Fixed CI build failure 2019-10-25 12:21:41 +05:30
Rahul Garg ff8d3fa446 Update profiling doc (#1576) 2019-10-24 17:51:55 +05:30
Jatin Chaudhary f53b1a1755 Adding New Analyze Target Merging with cppcheck (#1583) 2019-10-24 17:46:06 +05:30
Rahul Garg 170c4f0270 Add HIP checks in texture driver sample (#1581) 2019-10-24 17:45:51 +05:30
gandryey f25692b399 Hip vdi profiling header (#1577)
Add HIP-VDI profiling interface for GPU timing collection.
2019-10-24 17:45:42 +05:30
Alex Voicu 26914ec76e Make CAS loops use the TTAS idiom. (#1573)
* Make CAS loops use the TTAS idiom.

* More efficient re-formulation of TTAS.

* Fix typo.

* The typo was not quite a typo
2019-10-24 17:45:20 +05:30
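For readers unfamiliar with the TTAS (test-and-test-and-set) idiom named in the commit above, here is a minimal host-side sketch in standard C++. It is not the code changed by #1573; the class name `TtasLock` and its members are hypothetical and chosen only for illustration. The idea is to spin on a cheap read until the value looks free, and only then attempt the more expensive compare-and-swap.

```cpp
#include <atomic>
#include <thread>

// Hypothetical illustration of the TTAS idiom; not taken from the HIP sources.
class TtasLock {
public:
    // Single CAS attempt: succeeds only if the lock was free.
    bool try_lock() {
        bool expected = false;
        return locked_.compare_exchange_strong(
            expected, true,
            std::memory_order_acquire, std::memory_order_relaxed);
    }

    void lock() {
        for (;;) {
            // "Test": spin on a plain load, which avoids issuing
            // read-modify-write operations while the lock is held.
            while (locked_.load(std::memory_order_relaxed)) {
                std::this_thread::yield();
            }
            // "Test-and-set": the lock looked free, so try the CAS.
            if (try_lock()) {
                return;
            }
        }
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};
```

A plain CAS loop would call `compare_exchange` on every iteration; the inner load-only loop is what distinguishes TTAS and is the part that reduces contention on the shared location.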
satyanveshd 6c5fbf9b4a Fix occupany APIs (#1560)
Addresses SWDEV-205006
2019-10-24 17:44:47 +05:30
searlmc1 15a699688e Improve performance of v2 arg handling (#1539)
* Improve performance of v2 arg handling

* Missing change to `std::string`
2019-10-24 17:44:05 +05:30
Alex Voicu 84d5b399f6 Improve scalar access into vector types. (#1531)
The improvement is based on the ideas here: https://t0rakka.silvrback.com/simd-scalar-accessor. It yields significantly better ISA when the base's .xyzw members are used.
2019-10-24 17:43:49 +05:30
Aryan Salmanpour 93c688a0c9 [hip] add support for implicit kernel argument for multi-grid sync (#1456)
* [hip] add support for implicit kernel argument for multi-grid sync

* modified code for calculating the prev_sum

* change the impCoopArg type to size_t

* add memory clean up

* launch init_gws and main kernels into two separate loops
2019-10-24 17:43:30 +05:30
Rahul Garg 465581612e Merge pull request #1559 from vsytch/win10_aligned_alloc
Fixes for hipMemcpy_simple on Windows
2019-10-23 13:10:59 -07:00
Evgeny Mankov 80bf79c2f8 Merge pull request #1578 from emankov/doc
[HIPIFY][cmake][#1571] Take into account building hipify-clang as a part of building HIP while installing
2019-10-23 21:23:05 +03:00
Evgeny Mankov 2435567e70 [HIPIFY][cmake][#1571] Take into account building hipify-clang as a part of building HIP while installing
[Algorithm]
  [Release]
    If CMAKE_INSTALL_PREFIX is set by the user:
       If BIN_INSTALL_DIR is set by HIP, use it as CMAKE_INSTALL_PREFIX, otherwise CMAKE_INSTALL_PREFIX is used unchanged.
    If the user does not set CMAKE_INSTALL_PREFIX (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT):
       If BIN_INSTALL_DIR is set by HIP, use it as CMAKE_INSTALL_PREFIX, otherwise use PROJECT_BINARY_DIR/bin for installation.
  [Debug]
    If CMAKE_INSTALL_PREFIX is set by the user:
       CMAKE_INSTALL_PREFIX is used unchanged.
    If the user does not set CMAKE_INSTALL_PREFIX (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT):
       use CMAKE_CURRENT_SOURCE_DIR/bin for installation.

Standalone build left unchanged: CMAKE_INSTALL_PREFIX is used if set.
2019-10-23 18:54:45 +03:00
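The install-prefix rules listed in the commit above can be restated as a small decision function. The sketch below is a C++ paraphrase for reading convenience, not the CMake code from the commit: the struct `InstallInputs`, its field names, and `resolve_install_prefix` are hypothetical, and it assumes that "[Release]" and "[Debug]" refer to the build type and that an empty string means BIN_INSTALL_DIR was not set by HIP. It covers only the case where hipify-clang is built as part of HIP.

```cpp
#include <string>

// Hypothetical inputs mirroring the CMake variables named in the commit message.
struct InstallInputs {
    bool is_debug;                     // assumption: build type is Debug
    bool prefix_set_by_user;           // false when CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT
    std::string cmake_install_prefix;  // CMAKE_INSTALL_PREFIX
    std::string bin_install_dir;       // BIN_INSTALL_DIR, empty if HIP did not set it
    std::string project_binary_dir;    // PROJECT_BINARY_DIR
    std::string current_source_dir;    // CMAKE_CURRENT_SOURCE_DIR
};

// Returns the directory the commit message says is used for installation.
std::string resolve_install_prefix(const InstallInputs& in) {
    if (in.is_debug) {
        // Debug: a user-provided prefix is kept; otherwise install under the source dir.
        return in.prefix_set_by_user ? in.cmake_install_prefix
                                     : in.current_source_dir + "/bin";
    }
    // Release: BIN_INSTALL_DIR from HIP takes priority in both branches.
    if (!in.bin_install_dir.empty()) {
        return in.bin_install_dir;
    }
    // Release without BIN_INSTALL_DIR: user prefix, or the build dir as a fallback.
    return in.prefix_set_by_user ? in.cmake_install_prefix
                                 : in.project_binary_dir + "/bin";
}
```

Note that in Release mode the two user-set and default branches differ only in the fallback, which is why the function checks BIN_INSTALL_DIR first.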
Evgeny Mankov 3d16a8b121 Merge pull request #1574 from emankov/hipify-clang
[HIPIFY] Disable delayed template parsing
2019-10-22 19:09:13 +03:00
Evgeny Mankov 7ab06b3892 [HIPIFY] Disable delayed template parsing
Implicitly and unconditionally pass the -fno-delayed-template-parsing option (which appeared in LLVM 3.8.0, and thus needs no compatibility wrapping) to hipify-clang.

[Reason] Uncalled template functions are otherwise not parsed, and thus not hipified.

Affects the cub_03.cu test, which has an uncalled global template function.
2019-10-22 19:07:37 +03:00
Evgeny Mankov bf879f9c86 Merge pull request #1570 from emankov/doc
[HIPIFY][#1569] Fix
2019-10-22 11:13:47 +03:00
Evgeny Mankov e2191e23e6 [HIPIFY][#1569] Fix 2019-10-22 11:08:37 +03:00
Evgeny Mankov 62fc6f8487 Merge pull request #1568 from emankov/hipify-clang
[HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA major…
2019-10-21 17:52:02 +03:00
Evgeny Mankov 3233a845f6 [HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA major.minor version
[Reason] To support maximum CUDA features in offline tests

+ Add defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 600 restriction for atomicAdd on doubles in atomics.cu.
  So if LLVM < 7 and --cuda-gpu-arch doesn't work, __CUDA_ARCH__ is unset too (350 by default in clang);
  if LLVM >= 7 --cuda-gpu-arch is used and __CUDA_ARCH__ is set based on it.
2019-10-21 17:50:00 +03:00
Evgeny Mankov ef25774dae Merge pull request #1567 from emankov/hipify-clang
[HIPIFY][perl] Support of 'using namespace cub'
2019-10-21 17:16:32 +03:00
Evgeny Mankov 9633cdbd8a [HIPIFY][perl] Support of 'using namespace cub' 2019-10-21 17:15:05 +03:00