Commit graph

1129 commits

Author SHA1 Message Date
mhbliao 209f31525f Fix hipExtLaunchMultiKernelMultiDevice refactoring. (#1714)
- Use the correct condition for HIP VDI runtime.
2019-12-06 09:49:17 -08:00
Rahul Garg a369bd4418 Revert - Changes related to hipMemcpyWithStream (#1718)
Reverting #1673, #1697 and #1707.
Support for hipMemcpyWithStream and the memcpy optimizations will be brought in again once the issues seen with them are resolved independently.
2019-12-06 09:51:53 +05:30
Aryan Salmanpour 8eaea4d114 [hip] refactoring hipExtLaunchMultiKernelMultiDevice API (#1698)
[Background] It was found that if lazy linking is used for a library that calls the hipExtLaunchMultiKernelMultiDevice API, then this API can get the wrong program_state object for looking up device kernels, leading to a "No device code available" error in this API.

To fix this issue, the API was refactored to be inline and get and pass the correct program_state to an internal hip API to request a multi-device kernel launch.
2019-12-04 11:50:51 +05:30
Maneesh Gupta 4c92bd50c4 Revert changes for atomic FADD support when address is in LDS (#1701)
This reverts PR #1591 and follow-on PR #1695
2019-11-29 11:58:12 +05:30
Alex Voicu 17a4780dc6 Uniform is_shared query. (#1695) 2019-11-28 13:39:05 +05:30
Alex Voicu 306d50291e Unary operators were too restrictive in the type of their argument. (#1683) 2019-11-22 07:54:53 +05:30
ansurya e5fc5aa41c Fix rocBLAS compilation failure (#1677)
SWDEV-212749:
o Recent changes to “add support for extended launch” require hip_runtime.h to be included in hip_ext.h
o Order in which external applications include hip_hcc.h/hip_runtime.h causes compilation failure
2019-11-22 07:54:17 +05:30
Alex Voicu 2ed3a0873c Use native support for atomic FADD when address is in LDS (#1591) 2019-11-22 07:53:48 +05:30
satyanveshd d4dde7a27d fixed directed tests fail when hcc bumped to 3.0 (#1678)
Handled the HCC version check appropriately, as a few of the directed tests (SWDEV-212161) were failing when hcc was bumped to 3.0.
2019-11-20 21:37:52 +05:30
Alex Voicu 022ac3cb0a General sync memcpy improvements. Add hipMemcpyWithStream (#1673)
* General sync memcpy improvements. Add `hipMemcpyWithStream`

* Update hip_memory.cpp
2019-11-20 21:36:37 +05:30
Rahul Garg 13c2a31d7e Update error codes for hipGetDevice for doxygen and move up null check (#1668)
* [docs] Update error codes for hipGetDevice

* Move up out ptr check
2019-11-20 21:35:27 +05:30
Paul Fultz II 57b1b03261 Fix helper header when using c++17 (#1666)
This will fix issue #1621. It also adds tests for is_callable with c++11, c++14, and c++17.

The fallback implementation was completely broken, so I rewrote it so that it passes the tests as well. This should be used instead of PR #1631.
2019-11-20 21:33:42 +05:30
Alex Voicu c383f20691 Extend vector type capabilities and add tests to reflect it. (#1656) 2019-11-20 21:32:32 +05:30
mhbliao a45de95113 Fix mathlib and app builds with hip-clang. (#1665) 2019-11-18 08:18:20 -08:00
Rahul Garg ff31f734fe Fix gcc build on NVCC path (#1661)
* Fix gcc build on NVCC path

* Fix CI build errors

* [dtest] Fix texture and surface obj2D tests
2019-11-18 12:19:22 +05:30
Nick Curtis cae9b13020 fix complex conjugate for double-complex (#1659)
The sign of the y component returned from hipConj was incorrect for double-complex. Fixed to match hipConjf above.
2019-11-18 12:19:12 +05:30
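
For reference, the conjugate of a complex number keeps the real part and negates the imaginary part. The sketch below illustrates the sign convention that the hipConj fix restores. `Double2`, `conj_double` and `check_conj` are hypothetical stand-ins written for this note, not the HIP types or the code touched by the commit.

```cpp
#include <cassert>

// Hypothetical stand-in for a double-precision complex value:
// x holds the real part, y holds the imaginary part.
struct Double2 {
    double x;
    double y;
};

// Complex conjugate: the real part is unchanged, the imaginary part flips sign.
inline Double2 conj_double(Double2 z) {
    return Double2{z.x, -z.y};
}

// Small self-check of the sign convention.
inline void check_conj() {
    const Double2 c = conj_double(Double2{1.5, -2.0});
    assert(c.x == 1.5);  // real part preserved
    assert(c.y == 2.0);  // imaginary part negated
}
```

A single-precision variant would follow the same pattern, which is the consistency the commit message asks for.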
Sarbojit2019 7985ad218f Added null check in hipEventSynchronize (#1627)
* Added missing null check in hipEventSynchronize
* Minor correction in the Event API description
2019-11-18 12:18:55 +05:30
Alex Voicu 355d0bdf95 Add support for extended launch syntax. (#1530)
* Add support for extended launch syntax.

* Add unit test.

* Fix typo

* hipExtLaunchKernelGGL lives in hip_ext.h

Change-Id: Ice32dab0d43475fda65c6a910c11416871a8f2ff

* [dtest] remove redundant include from hipModuleGetGlobal dtest
2019-11-16 22:24:07 -08:00
Michael LIAO d28ad401c9 Remove redundant declarations.
- The revised `hip/hip_ext.h` has those declarations.
2019-11-07 10:11:22 -05:00
Alex Voicu d38cc8efba Remove native vector support from the GCC case, since it never worked (#1637) 2019-11-07 13:19:14 +05:30
ansurya dc8f556460 Fixed texture 2D mapping for pitched arrays & 3D Texture read (#1415)
Texture 2D image mapping for pitched arrays:
github issue: Texture Object's Buffer seems to be Misaligned #886
JIRA ticket: SWDEV-199313

SWDEV-151670 : Fixed issue with 3D texture with 4 components
SWDEV-151671 : Issue with 2D layered texture with 4 components
2019-11-07 13:17:46 +05:30
Rahul Garg dfee3ae279 Rename hip/hip_hcc.h to hip/hip_ext.h (#1341)
* Rename hip/hip_hcc.h to hip/hip_ext.h

* Deprecate hip_hcc.h
2019-11-07 13:17:10 +05:30
Alex Voicu 1df423165b Remove leftover noise. 2019-11-06 02:46:21 +02:00
Alex Voicu 55fd1363e2 __half2 should walk like CUDA and talk like CUDA 2019-11-06 02:43:04 +02:00
Michael LIAO 7ca43b98d1 Use portable macro for deprecation message. 2019-11-05 11:51:00 -05:00
Rahul Garg 8b3fce8069 Deprecate HIP Markers (#1622)
* Deprecate HIP markers

* Deprecate profiler start/stop
2019-11-05 12:32:59 +05:30
Alex Voicu ed0d6ec51e Separate volatile for clarity. Handle assignment. 2019-11-02 22:02:08 +02:00
Alex Voicu 2d76dde05b Accessors should work even when oddly volatile. 2019-11-01 22:18:01 +02:00
Rahul Garg aeb7cebbad Merge pull request #1515 from ansurya/tex_unbind_issue_fix
Fix undefined ref to hipUnbindTexture for texture types
2019-10-30 17:54:15 -07:00
Michael LIAO 61bc68a5f4 [HIP] Correct headers and add missing function templates for hip-clang.
- Fix 2 runtime API prototypes
  `hipOccupancyMaxActiveBlocksPerMultiprocessor` and
  `hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags`
- Add missing function templates of them in hip-clang.
2019-10-29 22:00:11 -04:00
Rahul Garg 27221bc823 Revert "Fix occupany APIs (#1560)"
This reverts commit 6c5fbf9b4a.
2019-10-29 11:41:08 -07:00
Anusha Godavarthy Surya 9332a39838 Merge branch 'master' into tex_unbind_issue_fix 2019-10-25 15:54:25 +05:30
Anusha Godavarthy Surya ae838f8cee merge from master 2019-10-25 15:52:09 +05:30
Alex Voicu 40522e2b6a Add missing operators, fix GCC compilation. (#1589) 2019-10-25 15:44:24 +05:30
Alex Voicu f909a393ff Fix deadlock, remove old __sync_* use. (#1584)
This fixes a deadlock introduced by the switch to TTAS loops, and is therefore mildly urgent (to prevent the CI from hoovering in the broken code).
2019-10-25 15:44:17 +05:30
Rahul Garg 14b870d1ce Add hipMemcpy2DFromArray (#1510)
Adds hipMemcpy2DFromArray and hipMemcpy2DFromArrayAsync equivalent to cudaMemcpy2DFromArray and cudaMemcpy2DFromArrayAsync.
2019-10-25 15:43:33 +05:30
Anusha Godavarthy Surya c0fc5e718c Merge branch 'master' into tex_unbind_issue_fix 2019-10-25 15:36:55 +05:30
Anusha Godavarthy Surya b9c8dd8ac6 Fixed CI build failure 2019-10-25 12:21:41 +05:30
gandryey f25692b399 Hip vdi profiling header (#1577)
Add HIP-VDI profiling interface for GPU timing collection.
2019-10-24 17:45:42 +05:30
Alex Voicu 26914ec76e Make CAS loops use the TTAS idiom. (#1573)
* Make CAS loops use the TTAS idiom.

* More efficient re-formulation of TTAS.

* Fix typo.

* The typo was not quite a typo
2019-10-24 17:45:20 +05:30
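
For readers unfamiliar with the term, TTAS (test-and-test-and-set) means spinning on a cheap read first and attempting the expensive atomic read-modify-write only when the read suggests it can succeed. The sketch below shows the generic idiom on a host-side spinlock built on `std::atomic`. It is an illustration under that assumption, not the CAS loops changed in this commit, and `TtasLock` is a hypothetical name.

```cpp
#include <atomic>
#include <cassert>

// Minimal test-and-test-and-set spinlock (illustrative only).
class TtasLock {
public:
    void lock() {
        for (;;) {
            // First "test": spin on a plain load, which does not
            // request exclusive ownership of the cache line.
            while (locked_.load(std::memory_order_relaxed)) {
            }
            // "Test-and-set": only now attempt the atomic exchange.
            if (!locked_.exchange(true, std::memory_order_acquire)) {
                return;
            }
        }
    }

    // Non-blocking variant: same two-step check, but no spinning.
    bool try_lock() {
        return !locked_.load(std::memory_order_relaxed) &&
               !locked_.exchange(true, std::memory_order_acquire);
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};
```

The design point is that waiting threads generate read traffic only, so contention on the lock word is lower than in a loop that retries the atomic operation unconditionally.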
satyanveshd 6c5fbf9b4a Fix occupany APIs (#1560)
Addresses SWDEV-205006
2019-10-24 17:44:47 +05:30
searlmc1 15a699688e Improve performance of v2 arg handling (#1539)
* Improve performance of v2 arg handling

* Missing change to `std::string`
2019-10-24 17:44:05 +05:30
Alex Voicu 84d5b399f6 Improve scalar access into vector types. (#1531)
The improvement is based on the ideas here: https://t0rakka.silvrback.com/simd-scalar-accessor. It yields significantly better ISA when the base's .xyzw members are used.
2019-10-24 17:43:49 +05:30
Vladislav Sytchenko 0200aa3a21 Update the declarations of hipMemsetD8, hipMemsetD8Async, hipMemsetD16, hipMemsetD16Async. These functions are type aware and take in as their third argument the number of elements in the buffer, not the buffer size. Change the name of this argument from sizeBytes to count to align with the above description. 2019-10-15 14:18:42 -04:00
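
The distinction between an element count and a byte size can be sketched on the host as follows. `memset_d16_host` is a hypothetical helper written for this note. It mirrors only the "third argument is a number of elements" convention described above and is not the HIP implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical host-side analogue of a 16-bit typed memset.
// `count` is the number of 16-bit elements to write, not a size in bytes.
inline void memset_d16_host(std::uint16_t* dst, std::uint16_t value,
                            std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = value;
    }
}

// Fills the whole buffer by passing the element count.
inline void fill_all(std::vector<std::uint16_t>& buf, std::uint16_t value) {
    // Correct: buf.size() elements.
    // Passing buf.size() * sizeof(std::uint16_t) would write past the end.
    memset_d16_host(buf.data(), value, buf.size());
}
```

With a parameter named `sizeBytes`, a caller could reasonably pass twice the intended value for 16-bit data, which is the confusion the rename to `count` avoids.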
Evgeny Mankov 7a1301eab9 [HIP] Fix typo in a comment 2019-10-11 15:20:58 +03:00
Evgeny Mankov 3a83b3a62c [HIP][fix] Prefix libraryPropertyType to fix build of rocFFT and TensorFlow 2019-10-11 15:18:08 +03:00
Evgeny Mankov d8d9f16f17 [HIP] Introduce library_types.h as a common header for libs (#1509)
* [HIP] Introduce library_types.h as a common header for libs

[Reason]
Currently, hipFFT, hipBLAS and other HIP libs use their own data types, prefixed with HIPFFT or HIPBLAS, whereas in CUDA those types are common and declared in library_types.h

[TODO]
Switch hipFFT, hipBLAS and other HIP libs to use common library_types.h.

* [HIP] Move include for library_types.h to hip_runtime.h

[Reason]
Repeat CUDA's behaviour, where library_types.h is included in cuda_runtime.h
2019-10-10 19:57:28 +05:30
Philip Salzmann 11f23bba39 Fix uninitialized var in hipDeviceGetAttribute (#1497)
This fixes the usage of an uninitialized cdattr variable in hipDeviceGetAttribute for the CUDA backend when taking the switch default, as detailed in #1317.

Note that the directed_tests/runtimeApi/device/hipGetDeviceAttribute.tst test fails for me, but it already did before applying this patch. Let's see what CI says!
2019-10-04 13:39:19 +05:30
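
The class of bug described here, a variable that is assigned in every `case` but read after the `default` branch leaves it unset, can be sketched as follows. The enums, values and `translate_attr` are hypothetical and chosen for illustration. The actual patch may be structured differently.

```cpp
#include <cassert>

// Hypothetical front-end attribute and status enums.
enum class Attr { MaxThreads, WarpSize, Unsupported };
enum class Status { Success, InvalidValue };

// Maps a front-end attribute to a (made-up) backend attribute id.
inline Status translate_attr(Attr a, int* backend_attr) {
    int mapped = 0;  // initialised, so no path can read an indeterminate value
    switch (a) {
        case Attr::MaxThreads:
            mapped = 1;
            break;
        case Attr::WarpSize:
            mapped = 10;
            break;
        default:
            // Return early instead of falling through with an unset value.
            return Status::InvalidValue;
    }
    *backend_attr = mapped;
    return Status::Success;
}
```

Either safeguard alone (initialising the variable or returning from `default`) removes the undefined read. Using both keeps the function safe if a new `case` is added later without an assignment.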
Rahul Garg d5a61736d8 Add texref get APIs support (#1471)
Added support for -
    hipTexRefGetArray
    hipTexRefGetAddressMode
    hipTexRefGetAddress
2019-10-04 13:38:45 +05:30
Sarbojit2019 a7f52f8ea1 Removed definition of abs(), real() & imag() from hip_complex.h (#1448)
Addresses SWDEV-201461.
2019-10-04 13:38:02 +05:30