Commit graph

1101 commits

Author SHA1 Message Date
Rahul Garg aeb7cebbad Merge pull request #1515 from ansurya/tex_unbind_issue_fix
Fix undefined ref to hipUnbindTexture for texture types
2019-10-30 17:54:15 -07:00
Michael LIAO 61bc68a5f4 [HIP] Correct headers and add missing function templates for hip-clang.
- Fix 2 runtime API prototypes
  `hipOccupancyMaxActiveBlocksPerMultiprocessor` and
  `hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags`
- Add missing function templates of them in hip-clang.
2019-10-29 22:00:11 -04:00
Rahul Garg 27221bc823 Revert "Fix occupany APIs (#1560)"
This reverts commit 6c5fbf9b4a.
2019-10-29 11:41:08 -07:00
Anusha Godavarthy Surya 9332a39838 Merge branch 'master' into tex_unbind_issue_fix 2019-10-25 15:54:25 +05:30
Anusha Godavarthy Surya ae838f8cee merge from master 2019-10-25 15:52:09 +05:30
Alex Voicu 40522e2b6a Add missing operators, fix GCC compilation. (#1589) 2019-10-25 15:44:24 +05:30
Alex Voicu f909a393ff Fix deadlock, remove old __sync_* use. (#1584)
This fixes a deadlock introduced by the switch to TTAS loops, and is therefore mildly urgent (to prevent the CI from hoovering in the broken code).
2019-10-25 15:44:17 +05:30
Rahul Garg 14b870d1ce Add hipMemcpy2DFromArray (#1510)
Adds hipMemcpy2DFromArray and hipMemcpy2DFromArrayAsync equivalent to cudaMemcpy2DFromArray and cudaMemcpy2DFromArrayAsync.
2019-10-25 15:43:33 +05:30
Anusha Godavarthy Surya c0fc5e718c Merge branch 'master' into tex_unbind_issue_fix 2019-10-25 15:36:55 +05:30
Anusha Godavarthy Surya b9c8dd8ac6 Fixed CI build failure 2019-10-25 12:21:41 +05:30
gandryey f25692b399 Hip vdi profiling header (#1577)
Add HIP-VDI profiling interface for GPU timing collection.
2019-10-24 17:45:42 +05:30
Alex Voicu 26914ec76e Make CAS loops use the TTAS idiom. (#1573)
* Make CAS loops use the TTAS idiom.

* More efficient re-formulation of TTAS.

* Fix typo.

* The typo was not quite a typo
2019-10-24 17:45:20 +05:30
satyanveshd 6c5fbf9b4a Fix occupany APIs (#1560)
Addresses SWDEV-205006
2019-10-24 17:44:47 +05:30
searlmc1 15a699688e Improve performance of v2 arg handling (#1539)
* Improve performance of v2 arg handling

* Missing change to `std::string`
2019-10-24 17:44:05 +05:30
Alex Voicu 84d5b399f6 Improve scalar access into vector types. (#1531)
The improvement is based on the ideas here: https://t0rakka.silvrback.com/simd-scalar-accessor. It yields significantly better ISA when the base's .xyzw members are used.
2019-10-24 17:43:49 +05:30
Vladislav Sytchenko 0200aa3a21 Update the declarations of hipMemsetD8, hipMemsetD8Async, hipMemsetD16, hipMemsetD16Async. These functions are type aware and take in as their third argument the number of elements in the buffer, not the buffer size. Change the name of this argument from sizeBytes to count to align with the above description. 2019-10-15 14:18:42 -04:00
Evgeny Mankov 7a1301eab9 [HIP] Fix typo in a comment 2019-10-11 15:20:58 +03:00
Evgeny Mankov 3a83b3a62c [HIP][fix] Prefix libraryPropertyType to fix build of rocFFT and TensorFlow 2019-10-11 15:18:08 +03:00
Evgeny Mankov d8d9f16f17 [HIP] Introduce library_types.h as a common header for libs (#1509)
* [HIP] Introduce library_types.h as a common header for libs

[Reason]
Currently, hipFFT, hipBLAS and other HIP libs use their own data types, prefixed with HIPFFT or HIPBLAS, whereas in CUDA those types are common and declared in library_types.h

[TODO]
Switch hipFFT, hipBLAS and other HIP libs to use common library_types.h.

* [HIP] Move include for library_types.h to hip_runtime.h

[Reason]
Repeat CUDA's behaviour, where library_types.h is included in cuda_runtime.h
2019-10-10 19:57:28 +05:30
Philip Salzmann 11f23bba39 Fix uninitialized var in hipDeviceGetAttribute (#1497)
This fixes the usage of an uninitialized cdattr variable in hipDeviceGetAttribute for the CUDA backend when taking the switch default, as detailed in #1317.

Note that the directed_tests/runtimeApi/device/hipGetDeviceAttribute.tst test fails for me, but it already did before applying this patch. Let's see what CI says!
2019-10-04 13:39:19 +05:30
Rahul Garg d5a61736d8 Add texref get APIs support (#1471)
Added support for -
    hipTexRefGetArray
    hipTexRefGetAddressMode
    hipTexRefGetAddress
2019-10-04 13:38:45 +05:30
Sarbojit2019 a7f52f8ea1 Removed definition of abs(), real() & imag() from hip_complex.h (#1448)
Addresses SWDEV-201461.
2019-10-04 13:38:02 +05:30
ansurya 96890792b0 Added new Memory API's (#1399)
Added new memory API's hipMemAllocPitch, hipMemAllocHost, hipMemsetD16, hipMemsetD16Async, hipMemsetD8Async
Modified to support all scenarios hipMemcpyParam2DAsync, hipMemcpyParam2D.
2019-10-04 13:36:31 +05:30
Yaxun (Sam) Liu 7b3b82bd0d Fix cast of __half for HIP-clang (#1475) 2019-09-30 10:40:42 +05:30
satyanveshd ff1b23b558 Map clock64() to __builtin_readcyclecounter() (#1473)
Fixes SWDEV-203215.
2019-09-30 10:40:31 +05:30
eshcherb b8139f7da4 to include hip_prof_str.h under USE_PROF_API macro (#1470) 2019-09-30 10:39:41 +05:30
Alex Voicu 33c40ee510 Optimise the gridDim.n * blockDim.m idiom (#1468) 2019-09-30 10:39:23 +05:30
Yaxun (Sam) Liu f12b060a3b Add new kernel launching API for hip-clang 2019-09-26 20:15:24 -04:00
Sarbojit2019 db4c0c0811 [HIP] Add tccDriver info in hipDeviceProp
Fixes #1433.
2019-09-26 13:53:33 +05:30
mhbliao 4691c1ef8c [HIP] Remove a circular including. (#1418) 2019-09-16 08:32:47 +00:00
ansurya e2a934f377 Added new device attributes (#1377)
* Added new device attributes

* updated comment

* updated with new device attributes supported
2019-09-16 08:31:30 +00:00
mhbliao d511c3e363 [hip] Stop using noduplicate and replace it with convergent. (#1390) 2019-09-05 10:03:43 +00:00
Yaxun (Sam) Liu e72346dc24 Do not include cuda wrappers for OMP for hip-clang (#1382) 2019-09-03 05:13:59 +00:00
Sarbojit2019 4d270775c8 Removed hipLaunchKernel macro that got missed in merge (#1374) 2019-09-03 05:13:07 +00:00
Sarbojit2019 b5ff9e71ed Updated hipErrorString and CUDAErrorTohipError (#1365) 2019-08-29 01:02:59 +00:00
Sarbojit2019 1ae43cbeba [HIP] Reclaiming hipLaunchKernel API (#1353)
* [HIP] Reclaiming hipLaunchKernel API

* Reclaiming hipLaunchKernel : Incorporated review comments

* Incorporated review comments

* Removed hipLaunchKernel Macro from nvcc path
2019-08-29 01:02:41 +00:00
satyanveshd d2df21e58c [sample] add new cookbook sample - occupancy (#1352)
* occupancy.cpp with Makefile

* occupancy sample changes according to the comments

* Changes according to the review comments

* Occupancy Sample Changes

* Changes according to review comments
2019-08-29 01:01:49 +00:00
mshivama f74903851e Device side support for Cooperative Group feature (#1202)
* first cut of the header implementation of cooperative group feature

* add declarations for device library functions

* fixed various compile time issues in the CG headers

* enabled copy construction and copy assignment

* fixed a minor bug related to conditional compilation macro

* fixed few more CG constructor issues and added a unit testcase

* fixed typo

* extended unit testcase

* compute size of partitioned CG from mask

* bit of code refactoring

* removed boilerplate code

* fixed few of the review comments by Brian

* Changes to the signatures of few grid and multi-grid related OCKL functions

* changes to declarations of OCKL functions related to CG feature

* removed all the block level support as it is not planned for 2.9

* Have taken care of review comments by Brian

* Have taken care of review comments by Brian

* removed unused functions which were initially intended to use in block level cg support
2019-08-29 01:01:25 +00:00
Michael LIAO c84628f420 [hcc] Fix previous replacement of result_of_t.
- `result_of_t` is defined as the shortcut of
  ```
  template< class T >
  using result_of_t = typename result_of<T>::type;
  ```
2019-08-26 10:58:38 -04:00
ramcherukuri 6ea9e2b249 moving result_of_t to result_of 2019-08-24 08:59:58 -04:00
Rahul Garg d367fdf28c Make Bundled_code_header visible for hipRTC usage (#1359) 2019-08-23 09:20:02 +00:00
Aryan Salmanpour 32ce882d6e [hip] add initial implementation for hipLaunchCooperativeKernel API (#1339)
* [hip] add initial implementation for hipLaunchCooperativeKernel API

* [hip] use total number of work groups to initialize the GWS resource

* [hip] use only one argument for init_gws kernel

* [hip] use the device associated with the stream for checking the device properties
2019-08-23 09:19:35 +00:00
Sarbojit2019 684801fc1d Compilation failure on nvcc path when using hipChannelFormatKind (#1345)
Fix for reported GitHub issue #1183
2019-08-21 10:01:03 +00:00
kpyzhov d81267ae15 Corrected declaration of __ockl_clz_u64() (#1340) 2019-08-20 12:06:36 +00:00
Yaxun (Sam) Liu 68a59ddbe8 Fix missing decl for hip-clang
Add back decl for hipHccModuleLaunchKernel and hipExtModuleLaunchKernel for HIP/VDI only
2019-08-19 18:27:13 -04:00
mhbliao 282f06014e [hip] Allow from/to half conversion on host side. (#1334) 2019-08-16 02:13:59 +00:00
Yaxun (Sam) Liu c4655d167e Fix assert for windows. (#1329)
MSVC assert.h has no guard for include once. The macro assert overrides
device assert definition. Do not include it for device compilation.
2019-08-16 02:13:33 +00:00
Rahul Garg fbc9f7e20a Add hipMemcpy3DAsync (#1320)
* Add hipMemcpy3DAsync

* Fix CI build error

* Move back stream resolution to internal function

* Remove stream redefinition and check
2019-08-16 02:13:16 +00:00
Rahul Garg 7f9de881cb Fix undefined identifier issue for hipExtModuleLaunchKernel 2019-08-14 16:46:32 -04:00
Sarbojit2019 c4618f2c3b [HIP] Fix for hipArray_t failure on nvcc path
Fixes SWDEV-148407
2019-08-14 11:30:06 +00:00