1094 Commits

Author SHA1 Message Date
Anusha Godavarthy Surya 5f47e99ffe merge from master 2019-10-25 15:52:09 +05:30
Anusha Godavarthy Surya 259d8b4cdf Merge branch 'master' into tex_unbind_issue_fix 2019-10-25 15:36:55 +05:30
Anusha Godavarthy Surya ce04bdaa1a Fixed CI build failure 2019-10-25 12:21:41 +05:30
gandryey 81952ce5a7 Hip vdi profiling header (#1577)
Add HIP-VDI profiling interface for GPU timing collection.
2019-10-24 17:45:42 +05:30
Alex Voicu 9ba25b42c8 Make CAS loops use the TTAS idiom. (#1573)
* Make CAS loops use the TTAS idiom.

* More efficient re-formulation of TTAS.

* Fix typo.

* The typo was not quite a typo
2019-10-24 17:45:20 +05:30
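
Note on 9ba25b42c8: the log does not show the changed code, so the following is only a host-side sketch of the test-and-test-and-set (TTAS) idea applied to a compare-and-swap loop. It uses `std::atomic` rather than HIP device atomics, and the helper name `atomic_fetch_max_ttas` is hypothetical. The value is first read with a plain load, and the more expensive CAS is attempted only while that read says an update is still needed.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper (not part of HIP): atomically raise `target` to at
// least `value` and return the value observed before any update.
inline int atomic_fetch_max_ttas(std::atomic<int>& target, int value) {
    // First "test": a plain load, no read-modify-write traffic.
    int observed = target.load(std::memory_order_relaxed);

    // Second "test" plus "set": attempt the CAS only while an update is needed.
    // On failure compare_exchange_weak refreshes `observed` with the current
    // value, so the loop condition is re-evaluated against fresh data.
    while (observed < value) {
        if (target.compare_exchange_weak(observed, value,
                                         std::memory_order_relaxed)) {
            break;  // the store happened; `observed` still holds the old value
        }
    }
    return observed;
}

// Small usage example for the sketch above.
inline void ttas_usage_example() {
    std::atomic<int> counter{3};
    int before = atomic_fetch_max_ttas(counter, 5);  // update is needed
    assert(before == 3);
    assert(counter.load() == 5);
}
```

When the stored value is already large enough, the function returns after the initial load without issuing a CAS at all, which is the case the TTAS form is meant to make cheap.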
satyanveshd af351d7e1b Fix occupancy APIs (#1560)
Addresses SWDEV-205006
2019-10-24 17:44:47 +05:30
searlmc1 c4a51f3679 Improve performance of v2 arg handling (#1539)
* Improve performance of v2 arg handling

* Missing change to `std::string`
2019-10-24 17:44:05 +05:30
Alex Voicu 4a635add45 Improve scalar access into vector types. (#1531)
The improvement is based on the ideas here: https://t0rakka.silvrback.com/simd-scalar-accessor. It yields significantly better ISA when the base's .xyzw members are used.
2019-10-24 17:43:49 +05:30
Vladislav Sytchenko 0b52c1d9d8 Update the declarations of hipMemsetD8, hipMemsetD8Async, hipMemsetD16, hipMemsetD16Async. These functions are type aware and take in as their third argument the number of elements in the buffer, not the buffer size. Change the name of this argument from sizeBytes to count to align with the above description. 2019-10-15 14:18:42 -04:00
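
Note on 0b52c1d9d8: the distinction between an element count and a byte size can be illustrated with a host-only stand-in. This is a sketch under assumptions: `memset_d16_host` is a hypothetical function written for this note, not the HIP API, and it only mirrors the "third argument is a count of 16-bit elements" convention described in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side stand-in for a type-aware 16-bit memset.
// `count` is the number of 16-bit elements to write, not a size in bytes.
inline void memset_d16_host(std::uint16_t* dst, std::uint16_t value,
                            std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = value;
    }
}

// Usage example: a buffer of 8 elements occupies 16 bytes, and the call
// takes 8 (the element count), not 16 (the byte size).
inline void memset_d16_usage_example() {
    std::vector<std::uint16_t> buffer(8, 0);
    memset_d16_host(buffer.data(), 0xABCD, buffer.size());
    assert(buffer.front() == 0xABCD);
    assert(buffer.back() == 0xABCD);
}
```

Passing the byte size (16 here) to a count-based function would write past the end of the 8-element buffer, which is the kind of mistake the rename from sizeBytes to count is meant to discourage.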
Evgeny Mankov 28c23a7b1a [HIP] Fix typo in a comment 2019-10-11 15:20:58 +03:00
Evgeny Mankov 337b7ce06a [HIP][fix] Prefix libraryPropertyType to fix build of rocFFT and TensorFlow 2019-10-11 15:18:08 +03:00
Evgeny Mankov 94eb4155dd [HIP] Introduce library_types.h as a common header for libs (#1509)
* [HIP] Introduce library_types.h as a common header for libs

[Reason]
Currently, hipFFT, hipBLAS and other HIP libs use their own data types, prefixed with HIPFFT or HIPBLAS, whereas in CUDA those types are common and declared in library_types.h

[TODO]
Switch hipFFT, hipBLAS and other HIP libs to use common library_types.h.

* [HIP] Move include for library_types.h to hip_runtime.h

[Reason]
Repeat CUDA's behaviour, where library_types.h is included in cuda_runtime.h
2019-10-10 19:57:28 +05:30
Philip Salzmann 9ababa4276 Fix uninitialized var in hipDeviceGetAttribute (#1497)
This fixes the usage of an uninitialized cdattr variable in hipDeviceGetAttribute for the CUDA backend when taking the switch default, as detailed in #1317.

Note that the directed_tests/runtimeApi/device/hipGetDeviceAttribute.tst test fails for me, but it already did before applying this patch. Let's see what CI says!
2019-10-04 13:39:19 +05:30
Rahul Garg bec725dec2 Add texref get APIs support (#1471)
Added support for -
    hipTexRefGetArray
    hipTexRefGetAddressMode
    hipTexRefGetAddress
2019-10-04 13:38:45 +05:30
Sarbojit2019 58a476abc2 Removed definition of abs(), real() & imag() from hip_complex.h (#1448)
Addresses SWDEV-201461.
2019-10-04 13:38:02 +05:30
ansurya ba9c6e13e4 Added new memory APIs (#1399)
Added new memory APIs: hipMemAllocPitch, hipMemAllocHost, hipMemsetD16, hipMemsetD16Async, hipMemsetD8Async.
Modified hipMemcpyParam2DAsync and hipMemcpyParam2D to support all scenarios.
2019-10-04 13:36:31 +05:30
Yaxun (Sam) Liu 56193a7828 Fix cast of __half for HIP-clang (#1475) 2019-09-30 10:40:42 +05:30
satyanveshd 4b413739a9 Map clock64() to __builtin_readcyclecounter() (#1473)
Fixes SWDEV-203215.
2019-09-30 10:40:31 +05:30
eshcherb 8234da33b9 Include hip_prof_str.h under the USE_PROF_API macro (#1470) 2019-09-30 10:39:41 +05:30
Alex Voicu ab8fe8a3d8 Optimise the gridDim.n * blockDim.m idiom (#1468) 2019-09-30 10:39:23 +05:30
Yaxun (Sam) Liu 3c80389584 Add new kernel launching API for hip-clang 2019-09-26 20:15:24 -04:00
Sarbojit2019 0fa42af08c [HIP] Add tccDriver info in hipDeviceProp
Fixes #1433.
2019-09-26 13:53:33 +05:30
mhbliao 1f8c3bbd3b [HIP] Remove a circular include. (#1418) 2019-09-16 08:32:47 +00:00
ansurya ceb734b917 Added new device attributes (#1377)
* Added new device attributes

* updated comment

* updated with new device attributes supported
2019-09-16 08:31:30 +00:00
mhbliao 119ee4b671 [hip] Stop using noduplicate and replace it with convergent. (#1390) 2019-09-05 10:03:43 +00:00
Yaxun (Sam) Liu 8fe8fc18c0 Do not include cuda wrappers for OMP for hip-clang (#1382) 2019-09-03 05:13:59 +00:00
Sarbojit2019 e1f9e08ea7 Removed hipLaunchKernel macro that got missed in merge (#1374) 2019-09-03 05:13:07 +00:00
Sarbojit2019 0722704f35 Updated hipErrorString and CUDAErrorTohipError (#1365) 2019-08-29 01:02:59 +00:00
Sarbojit2019 5c4f78bac3 [HIP] Reclaiming hipLaunchKernel API (#1353)
* [HIP] Reclaiming hipLaunchKernel API

* Reclaiming hipLaunchKernel : Incorporated review comments

* Incorporated review comments

* Removed hipLaunchKernel Macro from nvcc path
2019-08-29 01:02:41 +00:00
satyanveshd f807cc1a7b [sample] add new cookbook sample - occupancy (#1352)
* occupancy.cpp with Makefile

* occupancy sample changes according to the comments

* Changes according to the review comments

* Occupancy Sample Changes

* Changes according to review comments
2019-08-29 01:01:49 +00:00
mshivama d75dc4eb29 Device side support for Cooperative Group feature (#1202)
* first cut of the header implementation of cooperative group feature

* add declarations for device library functions

* fixed various compile time issues in the CG headers

* enabled copy construction and copy assignment

* fixed a minor bug related to conditional compilation macro

* fixed few more CG constructor issues and added a unit testcase

* fixed typo

* extended unit testcase

* compute size of partitioned CG from mask

* bit of code refactoring

* removed boilerplate code

* fixed few of the review comments by Brian

* Changes to the signatures of a few grid and multi-grid related OCKL functions

* changes to declarations of OCKL functions related to CG feature

* removed all the block level support as it is not planned for 2.9

* Have taken care of review comments by Brian

* Have taken care of review comments by Brian

* removed unused functions which were initially intended to use in block level cg support
2019-08-29 01:01:25 +00:00
Michael LIAO 63e47e525b [hcc] Fix previous replacement of result_of_t.
- `result_of_t` is defined as the shortcut of
  ```
  template< class T >
  using result_of_t = typename result_of<T>::type;
  ```
2019-08-26 10:58:38 -04:00
ramcherukuri 3a6ca29815 moving result_of_t to result_of 2019-08-24 08:59:58 -04:00
Rahul Garg 0fd14a3e13 Make Bundled_code_header visible for hipRTC usage (#1359) 2019-08-23 09:20:02 +00:00
Aryan Salmanpour 5066700ace [hip] add initial implementation for hipLaunchCooperativeKernel API (#1339)
* [hip] add initial implementation for hipLaunchCooperativeKernel API

* [hip] use total number of work groups to initialize the GWS resource

* [hip] use only one argument for init_gws kernel

* [hip] use the device associated with the stream for checking the device properties
2019-08-23 09:19:35 +00:00
Sarbojit2019 84de192c9b Compilation failure on nvcc path when using hipChannelFormatKind (#1345)
Fix for reported GitHub issue #1183
2019-08-21 10:01:03 +00:00
kpyzhov 0e3198be25 Corrected declaration of __ockl_clz_u64() (#1340) 2019-08-20 12:06:36 +00:00
Yaxun (Sam) Liu 51f0b3f3a6 Fix missing decl for hip-clang
Add back decl for hipHccModuleLaunchKernel and hipExtModuleLaunchKernel for HIP/VDI only
2019-08-19 18:27:13 -04:00
mhbliao e919a8246e [hip] Allow from/to half conversion on host side. (#1334) 2019-08-16 02:13:59 +00:00
Yaxun (Sam) Liu 7aa7a4ce22 Fix assert for windows. (#1329)
MSVC assert.h has no guard for include once. The macro assert overrides
device assert definition. Do not include it for device compilation.
2019-08-16 02:13:33 +00:00
Rahul Garg 2405621f62 Add hipMemcpy3DAsync (#1320)
* Add hipMemcpy3DAsync

* Fix CI build error

* Move back stream resolution to internal function

* Remove stream redefinition and check
2019-08-16 02:13:16 +00:00
Rahul Garg 3dd0e988b1 Fix undefined identifier issue for hipExtModuleLaunchKernel 2019-08-14 16:46:32 -04:00
Sarbojit2019 b2fc64cc39 [HIP] Fix for hipArray_t failure on nvcc path
Fixes SWDEV-148407
2019-08-14 11:30:06 +00:00
Rahul Garg 45b73e0961 Add hipMemcpyParam2DAsync (#1296)
* Add hipMemcpyParam2DAsync

* Add NVCC path changes

* Clean up

* Fix build issue

* Fix else use in both sync and async apis
2019-08-09 11:50:37 +00:00
Siu Chi Chan 83af327ef2 Compile HIP runtime with hidden visibility by default (#1303)
* add default visibility to most APIs in program_state

* remove unwanted C++ headers

* Add symbol visibility pragmas and compiler flags

* Add visibility attribute to APIs in channel_descriptor and hip_hcc

* remove unused headers

* simplify build flags with hcc

* add pragma visibility hidden to functional_grid_launch

* [CMake] add gfx908 back
2019-08-08 08:33:04 +00:00
Rahul Garg 6ce86f409d Add support for hipFuncGetAttribute (#1279)
* Add support for hipFuncGetAttribute

* Support NVCC path

* Test using sample module_api_global

* Try fixing CI build failure due to hip_prof_gen scan

* Fix for CI build issue

* Resolve conflict

* Rebase and resolve conflicts with master

* Fix build error

* Fix NVCC path build error
2019-08-08 08:27:41 +00:00
Rahul Garg 59bda14979 Enable temporarily disabled device properties on HIP/VDI 2019-08-06 22:03:19 -04:00
Maneesh Gupta 4ee600ed5e Merge pull request #1280 from ROCm-Developer-Tools/fix_dont_break_hcc_just_because
This difference makes absolutely no sense.
2019-08-05 09:51:53 +00:00
Sarbojit2019 3bfff0a23d Enabled gcc for hip host code (#1214)
* Enabled gcc for hip host code

* Adding tests for hip code + (gcc & g++), without kernels

* Excluding nvcc platforms for gcc and g++ tests + Addressing review comments

* minor code clean-up

* Add rocm include path

* Added relative path for library

* Hiding non supported functions for gcc

* Incorporating review comments
2019-08-05 09:51:36 +00:00
Jeff Daily 1eb3dbf065 consolidate thread local storage (#915)
* all thread local access now through single struct

* clean up old commented-out code, more use of GET_TLS()

* fewer calls to GET_TLS by passing tls as a function argument

* revert unnecessary change to printf

* fix failing tests due to TLS change

* fix merge conflicts in ihipOccupancyMaxActiveBlocksPerMultiprocessor
2019-08-05 09:51:02 +00:00
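
Note on 1eb3dbf065: the pattern described in this entry (one struct holding all thread-local state, fetched once per API call and then passed down as an argument) might look roughly like the sketch below. Everything here is an assumption made for illustration: the struct `TlsData`, its fields, and the functions `tls_get`, `set_last_error`, and `api_call_example` are hypothetical and are not taken from the HIP sources, where the accessor is the `GET_TLS()` macro.

```cpp
#include <cassert>

// Hypothetical aggregate of per-thread state. Grouping the fields in one
// struct means a single thread-local lookup yields all of them.
struct TlsData {
    int last_error = 0;      // illustrative field
    int default_device = 0;  // illustrative field
};

// Single access point for thread-local state (stands in for GET_TLS()).
inline TlsData* tls_get() {
    static thread_local TlsData data;
    return &data;
}

// Internal helper that receives the TLS pointer as an argument instead of
// performing its own lookup.
inline void set_last_error(TlsData* tls, int error) {
    tls->last_error = error;
}

// An API-style entry point: one lookup at the top, then the pointer is
// passed to every helper that needs it.
inline int api_call_example(int device) {
    TlsData* tls = tls_get();
    tls->default_device = device;
    set_last_error(tls, 0);
    return tls->last_error;
}

// Usage example for the sketch above.
inline void tls_usage_example() {
    assert(api_call_example(2) == 0);
    assert(tls_get()->default_device == 2);
}
```

Each thread gets its own `TlsData` instance, so state written by one thread through `tls_get()` is not visible to another.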