Commit graph

3648 commits

Author SHA1 Message Date
Rahul Garg 9c599a3581 [dtest] Fix hipMemset2D test (#1579)
Reverts changes made in #1399. This is an RT API test. To test hipMemAllocPitch, a new test should be written that uses the correct memset API.

[ROCm/clr commit: 66a3c874c8]
2019-10-25 15:44:05 +05:30
Rahul Garg 7ea7a9c3b7 Add hipMemcpy2DFromArray (#1510)
Adds hipMemcpy2DFromArray and hipMemcpy2DFromArrayAsync equivalent to cudaMemcpy2DFromArray and cudaMemcpy2DFromArrayAsync.

[ROCm/clr commit: 14b870d1ce]
2019-10-25 15:43:33 +05:30
Rahul Garg 6760e4065e Update profiling doc (#1576)
[ROCm/clr commit: ff8d3fa446]
2019-10-24 17:51:55 +05:30
Jatin Chaudhary e7f4cf4487 Adding New Analyze Target Merging with cppcheck (#1583)
[ROCm/clr commit: f53b1a1755]
2019-10-24 17:46:06 +05:30
Rahul Garg 7f429afe2e Add HIP checks in texture driver sample (#1581)
[ROCm/clr commit: 170c4f0270]
2019-10-24 17:45:51 +05:30
gandryey 21a2925ee7 Hip vdi profiling header (#1577)
Add HIP-VDI profiling interface for GPU timing collection.

[ROCm/clr commit: f25692b399]
2019-10-24 17:45:42 +05:30
Alex Voicu 5b917afa5f Make CAS loops use the TTAS idiom. (#1573)
* Make CAS loops use the TTAS idiom.

* More efficient re-formulation of TTAS.

* Fix typo.

* The typo was not quite a typo


[ROCm/clr commit: 26914ec76e]
2019-10-24 17:45:20 +05:30
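
The TTAS (test-and-test-and-set) idiom named in commit 5b917afa5f can be illustrated with a minimal host-side sketch. This is not the code from that commit: `atomic_fetch_max_ttas` is a hypothetical helper built on `std::atomic`, and it shows only one common reading of the idiom, namely testing with a plain load first and attempting the compare-and-swap only when an update is actually needed.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper (not the HIP implementation): atomically raise `target`
// to at least `value` and return the value observed before any update.
inline int atomic_fetch_max_ttas(std::atomic<int>& target, int value) {
    assert(target.is_lock_free() || true);  // informational only; works either way

    // First "test": a plain load, which does not modify the shared location.
    int observed = target.load(std::memory_order_relaxed);

    // Only enter the CAS path while the cheap test says an update is needed.
    while (observed < value) {
        // On failure, compare_exchange_weak writes the current value into
        // `observed`, so the loop condition re-tests before the next attempt.
        if (target.compare_exchange_weak(observed, value,
                                         std::memory_order_acq_rel,
                                         std::memory_order_relaxed)) {
            break;  // `observed` still holds the previous value on success
        }
    }
    return observed;
}
```

When the stored value is already large enough, the function returns after a single load and never issues a read-modify-write, which is the point of testing before setting.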
satyanveshd ad1e409a24 Fix occupancy APIs (#1560)
Addresses SWDEV-205006

[ROCm/clr commit: 6c5fbf9b4a]
2019-10-24 17:44:47 +05:30
searlmc1 510be4b5dc Improve performance of v2 arg handling (#1539)
* Improve performance of v2 arg handling

* Missing change to `std::string`


[ROCm/clr commit: 15a699688e]
2019-10-24 17:44:05 +05:30
Alex Voicu fb411b56c2 Improve scalar access into vector types. (#1531)
The improvement is based on the ideas here: https://t0rakka.silvrback.com/simd-scalar-accessor. It yields significantly better ISA when the base's .xyzw members are used.

[ROCm/clr commit: 84d5b399f6]
2019-10-24 17:43:49 +05:30
Aryan Salmanpour 9e0eaef846 [hip] add support for implicit kernel argument for multi-grid sync (#1456)
* [hip] add support for implicit kernel argument for multi-grid sync

* modified code for calculating the prev_sum

* change the impCoopArg type to size_t

* add memory clean up

* launch init_gws and main kernels into two separate loops


[ROCm/clr commit: 93c688a0c9]
2019-10-24 17:43:30 +05:30
Rahul Garg 764135d242 Merge pull request #1559 from vsytch/win10_aligned_alloc
Fixes for hipMemcpy_simple on Windows

[ROCm/clr commit: 465581612e]
2019-10-23 13:10:59 -07:00
Evgeny Mankov 50d72e13ca [HIPIFY][cmake][#1571] Take into account building hipify-clang as a part of building HIP while installing
[Algorithm]
  [Release]
    If CMAKE_INSTALL_PREFIX is set by the user:
       If BIN_INSTALL_DIR is set by HIP, use it as CMAKE_INSTALL_PREFIX, otherwise CMAKE_INSTALL_PREFIX is used unchanged.
    If the user does not set CMAKE_INSTALL_PREFIX (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT):
       If BIN_INSTALL_DIR is set by HIP, use it as CMAKE_INSTALL_PREFIX, otherwise use PROJECT_BINARY_DIR/bin for installation.
  [Debug]
    If CMAKE_INSTALL_PREFIX is set by the user:
       CMAKE_INSTALL_PREFIX is used unchanged.
    If the user does not set CMAKE_INSTALL_PREFIX (CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT):
       use CMAKE_CURRENT_SOURCE_DIR/bin for installation.

Standalone build left unchanged: CMAKE_INSTALL_PREFIX is used if set.


[ROCm/clr commit: 2435567e70]
2019-10-23 18:54:45 +03:00
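
The install-prefix rules listed in commit 50d72e13ca can be restated as a small decision function. The sketch below models only the decision logic in C++; the real logic lives in CMake, and the `InstallInputs` struct, its field names, and `resolve_install_prefix` are hypothetical names introduced for illustration. The standalone-build case, which the commit leaves unchanged, is not modeled.

```cpp
#include <cassert>
#include <string>

// Hypothetical model of the variables named in the commit message.
struct InstallInputs {
    bool is_release;                   // Release vs. Debug configuration
    bool user_set_prefix;              // false when CMAKE_INSTALL_PREFIX_INITIALIZED_TO_DEFAULT
    std::string cmake_install_prefix;  // CMAKE_INSTALL_PREFIX
    std::string bin_install_dir;       // BIN_INSTALL_DIR; empty when HIP did not set it
    std::string project_binary_dir;    // PROJECT_BINARY_DIR
    std::string current_source_dir;    // CMAKE_CURRENT_SOURCE_DIR
};

// Returns the directory that would be used as the install prefix.
inline std::string resolve_install_prefix(const InstallInputs& in) {
    // A user-supplied prefix is expected to be non-empty.
    assert(!in.user_set_prefix || !in.cmake_install_prefix.empty());

    if (in.is_release) {
        // In Release, BIN_INSTALL_DIR wins whenever HIP has set it.
        if (!in.bin_install_dir.empty()) return in.bin_install_dir;
        return in.user_set_prefix ? in.cmake_install_prefix
                                  : in.project_binary_dir + "/bin";
    }
    // In Debug, BIN_INSTALL_DIR is not consulted.
    return in.user_set_prefix ? in.cmake_install_prefix
                              : in.current_source_dir + "/bin";
}
```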
Evgeny Mankov 0896e41987 [HIPIFY] Disable delayed template parsing
By implicitly and unconditionally passing the -fno-delayed-template-parsing option (which appeared in LLVM 3.8.0, thus doesn't need compatibility wrapping) to hipify-clang.

[Reason] To parse uncalled template functions; otherwise they are not parsed unless called, and thus are not hipified.

Affects the cub_03.cu test, which has an uncalled global template function.


[ROCm/clr commit: 7ab06b3892]
2019-10-22 19:07:37 +03:00
Evgeny Mankov 82222bf945 [HIPIFY][#1569] Fix
[ROCm/clr commit: e2191e23e6]
2019-10-22 11:08:37 +03:00
Evgeny Mankov e3cf10192c [HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA major.minor version
[Reason] To support maximum CUDA features in offline tests

+ Add defined(__CUDA_ARCH__) && __CUDA_ARCH__ >= 600 restriction for atomicAdd on doubles in atomics.cu.
  So if LLVM < 7 and --cuda-gpu-arch doesn't work, __CUDA_ARCH__ is unset too (350 by default in clang);
  if LLVM >= 7 --cuda-gpu-arch is used and __CUDA_ARCH__ is set based on it.


[ROCm/clr commit: 3233a845f6]
2019-10-21 17:50:00 +03:00
Evgeny Mankov de849a44e7 [HIPIFY][perl] Support of 'using namespace cub'
[ROCm/clr commit: 9633cdbd8a]
2019-10-21 17:15:05 +03:00
Evgeny Mankov 665a200247 [HIPIFY][tests] Set max clang's CudaArch for corresponding CUDA version
[Reason] To support maximum CUDA features in offline tests

+ Add CUDA_VERSION >= 800 restriction for atomics.cu

[TODO] Find a way to use or exclude atomicAdd for doubles if LLVM < 7, because
LLVM 6.0.1 and older do not use --cuda-gpu-arch in clang's Driver code at all (option is only declared)


[ROCm/clr commit: 9fc7afa738]
2019-10-21 15:51:25 +03:00
Evgeny Mankov 3a45daed0a [HIPIFY][tests] Set -I for CUDA path instead of --cuda-path for LLVM < 4
[ROCm/clr commit: ff6057d1ff]
2019-10-20 20:08:56 +03:00
Evgeny Mankov e07be75489 [HIPIFY][tests] Exclude all CUB tests if CUDA_CUB_ROOT_DIR is not set
[ROCm/clr commit: 5bf1ff19ff]
2019-10-20 20:03:18 +03:00
Vladislav Sytchenko 33acfa17c1 Remove extra #endif.
[ROCm/clr commit: 432380aa5d]
2019-10-18 16:40:29 -04:00
Evgeny Mankov bb20336fa6 [HIPIFY][tests] Test clean-up
[ROCm/clr commit: 44a897a146]
2019-10-18 18:55:52 +03:00
Evgeny Mankov 85281b1d86 [HIPIFY][CUB][#1460] Add "using namespace cub" translation support
+ Add cub_03.cu


[ROCm/clr commit: 86f6756b02]
2019-10-18 18:51:40 +03:00
Evgeny Mankov a392a050d6 Merge pull request #1558 from aaronenyeshi/fix-hipify-cmake-version
[HIPIFY][cmake] Make CMakeLists use default 3.5.1 for Ubuntu 16.04

[ROCm/clr commit: eb6690bbba]
2019-10-18 06:39:35 +03:00
Rahul Garg 30759e7c9b Merge pull request #1550 from yxsamliu/new-launch
Add -fhip-new-launch-api to hipcc for HIP/VDI

[ROCm/clr commit: 07eed1e5bf]
2019-10-17 19:07:32 -07:00
Vladislav Sytchenko 54eddfc8f0 _aligned_malloc() on Windows first takes size, then alignment, which is the opposite of how the similar function behaves on Linux. Memory allocated by it also has to be freed using _aligned_free(), unlike Linux where we can use regular free().
Edit the aligned_alloc() macro and add an aligned_free() one to align with the above behaviour.


[ROCm/clr commit: f4440817cb]
2019-10-17 18:58:32 -04:00
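
The platform difference described in commit 54eddfc8f0 can be sketched with a pair of wrapper functions. These are not the macros from the commit: `portable_aligned_alloc` and `portable_aligned_free` are hypothetical names, and the non-Windows branch assumes the C11 `aligned_alloc` function is the "similar function" the message refers to.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdlib.h>
#if defined(_WIN32)
#include <malloc.h>
#endif

// Hypothetical wrapper: always takes (alignment, size) in the C11 order.
inline void* portable_aligned_alloc(std::size_t alignment, std::size_t size) {
    // Alignment must be a non-zero power of two on both platforms.
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
#if defined(_WIN32)
    // Windows: size comes first, then alignment.
    return _aligned_malloc(size, alignment);
#else
    // C11 aligned_alloc: alignment first; round size up to a multiple of it.
    std::size_t rounded = (size + alignment - 1) / alignment * alignment;
    return ::aligned_alloc(alignment, rounded);
#endif
}

// Hypothetical wrapper: releases memory obtained from portable_aligned_alloc.
inline void portable_aligned_free(void* ptr) {
#if defined(_WIN32)
    _aligned_free(ptr);  // memory from _aligned_malloc must not go to free()
#else
    ::free(ptr);         // regular free() is sufficient here
#endif
}
```

Keeping allocation and release behind one pair of names means callers never have to remember which argument order or which free function applies on the current platform.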
Aaron Enye Shi 489e3dda9a [HIPIFY][cmake] Make CMakeLists use default 3.5.1 for Ubuntu 16.04
[ROCm/clr commit: b3ea58abe7]
2019-10-17 21:21:24 +00:00
Evgeny Mankov 9fb60fa36a [HIPIFY][doc] Update README.md
+ Versions, testing


[ROCm/clr commit: 1165e6bd71]
2019-10-17 22:26:48 +03:00
Rahul Garg 714314fa66 Revert "hipcc defaults to code object v3 (#1298)"
This reverts commit e5a2ba9602.


[ROCm/clr commit: 446718f990]
2019-10-17 13:27:28 -04:00
Evgeny Mankov c8238e1fd4 [HIPIFY][cmake] Add install rule for clang-resource-headers
+ Fix: set destination for all installing files to ${CMAKE_INSTALL_PREFIX}


[ROCm/clr commit: 8c3dff7ab9]
2019-10-17 15:05:55 +03:00
Rahul Garg 685a4cd182 Merge pull request #1544 from vsytch/master
QoL changes to the hipMemset family

[ROCm/clr commit: a21fe1443b]
2019-10-16 18:54:20 -07:00
Evgeny Mankov b357301610 [HIPIFY][CUB][#1460] Add cub:: namespace support in TemplateInstantiation of cudaLaunchKernel
+ Update cub_02.cu test accordingly


[ROCm/clr commit: e557563947]
2019-10-16 19:02:13 +03:00
Vladislav Sytchenko 577bac5de8 hipMemset2D and hipMemset3D tests should be passing by default.
[ROCm/clr commit: 86d0c5fa5a]
2019-10-16 11:02:38 -04:00
Evgeny Mankov 97f10790eb [HIPIFY] Refactor a couple of matcher functions
+ Separate out GetSubstrLocation function for finding substr SourceLocation in a given SourceRange


[ROCm/clr commit: 0a20048759]
2019-10-16 13:43:56 +03:00
Evgeny Mankov 643a8bcf5b [HIPIFY][CUB][#1460] Implement cubFunctionTemplateDecl matcher
+ Add cub_02.cu test
+ Partial fixes #1460


[ROCm/clr commit: 5555d46e66]
2019-10-16 13:08:11 +03:00
kjayapra-amd 97c823d552 Use the correct return type in runTest in 11_texture_driver sample. (#1546)
Fixes SWDEV-203394.
Currently runTest() returns true even if the texture reference copy does not happen. Use the existing testResult flag as the return value of runTest().

[ROCm/clr commit: 9d571e3c9e]
2019-10-16 10:52:15 +05:30
vsytch 4b8d8034cf Update hipMathFunctions, hipTestHalf and hipTestNativeHalf tests to support Navi10 and Navi14. (#1545)
[ROCm/clr commit: c2aadd4d12]
2019-10-16 10:51:48 +05:30
kpyzhov 19f22b468b [hipcc] Temporarily add -D_OPENMP to clang options to work around a CMake issue (#1540)
* Temporarily add -D_OPENMP to clang options in hipcc to allow using CMake OpenMP detection with hip-clang (until an updated CMake version is available).

[ROCm/clr commit: 9773f94c71]
2019-10-16 10:51:28 +05:30
Nick Curtis a7d6c03e17 Guard against division by zero for no VGPR usage (e.g., in an empty kernel) (#1528)
* guard against division by zero for no VGPR usage (e.g., in an empty kernel)

* fix bracket format

* clean up parenthesis


[ROCm/clr commit: d16963c9d5]
2019-10-16 10:49:56 +05:30
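
The kind of guard described in commit a7d6c03e17 can be sketched as follows. The commit message does not show the actual formula, so everything here is an assumption: `waves_limited_by_vgprs`, its parameters, and the idea that the quotient is capped by a maximum wave count are all hypothetical and chosen only to show a division that needs protecting when a kernel uses no VGPRs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Hypothetical occupancy helper: how many waves fit given register usage.
// All parameters are illustrative; they are not taken from the HIP sources.
inline std::size_t waves_limited_by_vgprs(std::size_t vgprs_available,
                                          std::size_t vgprs_used,
                                          std::size_t max_waves) {
    assert(max_waves > 0);
    // An empty kernel may report zero VGPRs; dividing would be undefined,
    // so treat it as "registers impose no limit".
    if (vgprs_used == 0) {
        return max_waves;
    }
    return std::min(max_waves, vgprs_available / vgprs_used);
}
```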
Jatin Chaudhary 1ec284d333 Adding code object manager to rtc (#1526)
Adding Code Object Manager file to rtc to resolve address of Bundled_code_object in libhiprtc.so

[ROCm/clr commit: b3351561c5]
2019-10-16 10:49:16 +05:30
Xiaozhu Meng e7fb74b07f Fix struct declaration for C (#1524)
This change is necessary for HPCToolkit to use Roctracer to produce a code-centric profiling view.

[ROCm/clr commit: f9b8a01c77]
2019-10-16 10:48:55 +05:30
Yaxun (Sam) Liu abfe4248da Add -fhip-new-launch-api to hipcc for HIP/VDI
[ROCm/clr commit: 739530d53b]
2019-10-15 21:47:33 -04:00
Vladislav Sytchenko 2bc49fb55c In the hipMemset2D and hipMemset3D tests synchronize with the default stream after performing an async memset.
[ROCm/clr commit: cc5abec092]
2019-10-15 17:15:49 -04:00
Vladislav Sytchenko 91ceb7dd2b Update indentation in the hipMemset3D test. Replace all tabs with four spaces.
[ROCm/clr commit: f402b6d01a]
2019-10-15 15:29:14 -04:00
Vladislav Sytchenko d20c5251b1 Add async subtest to hipMemset3D
[ROCm/clr commit: c83b6adb33]
2019-10-15 14:24:04 -04:00
Vladislav Sytchenko e6f426dee3 hipMemset2D test should pass only if both async and sync subtests pass.
[ROCm/clr commit: 39e42d4056]
2019-10-15 14:20:14 -04:00
Vladislav Sytchenko e2c2025e3e Update the declarations of hipMemsetD8, hipMemsetD8Async, hipMemsetD16, hipMemsetD16Async. These functions are type aware and take in as their third argument the number of elements in the buffer, not the buffer size. Change the name of this argument from sizeBytes to count to align with the above description.
[ROCm/clr commit: 0200aa3a21]
2019-10-15 14:18:42 -04:00
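
The distinction between an element count and a byte size made in commit e2c2025e3e can be shown with a host-side analogue. This sketch does not call HIP: `fill_d16` and `element_count_d16` are hypothetical helpers that mimic a type-aware 16-bit fill whose third argument is the number of elements.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Host-side analogue of a type-aware 16-bit memset (not the HIP API):
// `count` is the number of 16-bit elements, not the size in bytes.
inline void fill_d16(std::uint16_t* dst, std::uint16_t value, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = value;
    }
}

// Converts a buffer size in bytes to the element count such a fill expects.
inline std::size_t element_count_d16(std::size_t size_bytes) {
    assert(size_bytes % sizeof(std::uint16_t) == 0);
    return size_bytes / sizeof(std::uint16_t);
}
```

Passing the byte size where a count is expected would ask for twice as many 16-bit elements as the buffer holds, which is why the parameter name matters.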
Evgeny Mankov 39a6f5e205 Merge pull request #1541 from emankov/doc
[HIPIFY][cmake] Make CMakeLists.txt compatible with default cmake 3.5.2 for Ubuntu 16.04

[ROCm/clr commit: aa4e34cfcf]
2019-10-15 17:11:39 +03:00
Evgeny Mankov 4b0e9e9f05 [HIPIFY][tests] Exclude tests for the libs, which are not defined in cmake command line
+ Affects cuDNN and CUB tests, whose library paths are defined by CUDA_DNN_ROOT_DIR and CUDA_CUB_ROOT_DIR
+ Warn about excluding and why, for instance:
  "WARN: cuDNN tests are excluded due to unset CUDA_DNN_ROOT_DIR"


[ROCm/clr commit: c0f7d02ced]
2019-10-15 14:20:23 +03:00
Evgeny Mankov d40dfe354a [HIPIFY][cmake] Make CMakeLists.txt compatible with default cmake 3.5.2 for Ubuntu 16.04
+ Update README.md accordingly


[ROCm/clr commit: 5dae577d67]
2019-10-15 11:26:03 +03:00