Alex Voicu
be70b9f7e7
static inline in a header, just like excess sugar in a diet, causes bloat ( #1692 )
2019-12-23 19:09:38 +05:30
Sarbojit2019
e2fc00da65
Fix for windows dtest build failure ( #1742 )
2019-12-19 13:10:43 -08:00
mhbliao
99a3b66110
[hip] Add macro guarding the enum conversion for scalar accessor. ( #1748 )
...
- That's a high overhead part, which needs enabling ONLY if necessary.
2019-12-19 10:08:37 -08:00
Evgeny Mankov
d8737ba50c
[HIP] Unify hipError_t (Step 1)
...
Step 1. Set the RT error codes to the same values as the analogous Driver error codes.
[Reason] The RT and Driver error codes were unified in CUDA 10.2.
2019-12-13 19:40:16 +03:00
mhbliao
444c931641
Only add hipExtLaunchMultiKernelMultiDevice for non-HCC compilers. ( #1729 )
2019-12-10 10:32:25 -08:00
jglaser
00d735cdc9
fix linking of vector types with gcc ( #1690 )
...
* fix linking of vector types when linking hipcc objects with gcc
* use __attribute__((vector_size)) with both clang and gcc
and reinstate nonaligned n=3 vector type
* use implicit conversion to value and ext_vector_type when available
* Alternate formulation for GCC compatibility
* Built-in arrays don't mix well with placement new
* Fix typo
* Add conversions to enum
* Fix Scalar_accessor assignment.
* Update hip_vector_types.h
* stir up the underlying_type hideous mess
This fixes the HIP build issue "error: only enumeration types have underlying types".
2019-12-10 09:40:15 +05:30
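The build error quoted in this entry is what compilers report when `std::underlying_type` is instantiated for a type that is not an enumeration. A minimal sketch of one common guard follows, assuming a hypothetical `safe_underlying` helper (this name and code are illustrative and are not taken from hip_vector_types.h). The trait is reached only through a partial specialization, so it is never instantiated for non-enum types:

```cpp
#include <type_traits>

// Hypothetical helper: yields the underlying type for enums and T itself otherwise.
// The primary template handles non-enum types and never touches std::underlying_type.
template <typename T, bool IsEnum = std::is_enum<T>::value>
struct safe_underlying {
    using type = T;
};

// Only this specialization, selected for enums, instantiates std::underlying_type.
template <typename T>
struct safe_underlying<T, true> {
    using type = typename std::underlying_type<T>::type;
};

// Illustrative enum with a fixed underlying type.
enum Color : unsigned char { Red, Green };

static_assert(std::is_same<safe_underlying<Color>::type, unsigned char>::value,
              "an enum maps to its underlying type");
static_assert(std::is_same<safe_underlying<float>::type, float>::value,
              "a non-enum maps to itself");
```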
mhbliao
e9da934ac6
Fix hipExtLaunchMultiKernelMultiDevice refactoring. ( #1714 )
...
- Use the correct condition for HIP VDI runtime.
2019-12-06 09:49:17 -08:00
Rahul Garg
e53fc316f1
Revert - Changes related to hipMemcpyWithStream ( #1718 )
...
Reverting #1673 , #1697 and #1707 .
Support for hipMemcpyWithStream and the memcpy optimizations will be brought in again once the issues seen with them are resolved independently.
2019-12-06 09:51:53 +05:30
Aryan Salmanpour
68cc787781
[hip] refactoring hipExtLaunchMultiKernelMultiDevice API ( #1698 )
...
[Background] It was found that if lazy linking is used for a library that calls the hipExtLaunchMultiKernelMultiDevice API, the API can get the wrong program_state object when looking up device kernels, leading to a "No device code available" error.
To fix this issue, the API was refactored to be inline, so that it obtains the correct program_state and passes it to an internal HIP API that requests the multi-device kernel launch.
2019-12-04 11:50:51 +05:30
Maneesh Gupta
32442c6506
Revert changes for atomic FADD support when address is in LDS ( #1701 )
...
This reverts PR #1591 and follow-on PR #1695
2019-11-29 11:58:12 +05:30
Alex Voicu
b6514fffb9
Uniform is_shared query. ( #1695 )
2019-11-28 13:39:05 +05:30
Alex Voicu
aaf31b6b96
Unary operators were too restrictive in the type of their argument. ( #1683 )
2019-11-22 07:54:53 +05:30
ansurya
e60dec51da
Fix rocBLAS compilation failure ( #1677 )
...
SWDEV-212749:
o Recent changes to “add support for extended launch” require hip_runtime.h to be included in hip_ext.h
o The order in which external applications include hip_hcc.h/hip_runtime.h causes a compilation failure
2019-11-22 07:54:17 +05:30
Alex Voicu
d597e7ca20
Use native support for atomic FADD when address is in LDS ( #1591 )
2019-11-22 07:53:48 +05:30
satyanveshd
6b06911ef1
fixed directed tests fail when hcc bumped to 3.0 ( #1678 )
...
Handled the HCC version check appropriately, as a few of the directed tests (SWDEV-212161) were failing when hcc was bumped to 3.0.
2019-11-20 21:37:52 +05:30
Alex Voicu
5a1f823739
General sync memcpy improvements. Add hipMemcpyWithStream ( #1673 )
...
* General sync memcpy improvements. Add `hipMemcpyWithStream`
* Update hip_memory.cpp
2019-11-20 21:36:37 +05:30
Rahul Garg
b3161e9fa0
Update error codes for hipGetDevice for doxygen and move up null check ( #1668 )
...
* [docs] Update error codes for hipGetDevice
* Move up out ptr check
2019-11-20 21:35:27 +05:30
Paul Fultz II
8519a1411c
Fix helper header when using c++17 ( #1666 )
...
This will fix issue #1621 . It also adds tests for is_callable with c++11, c++14, and c++17.
The fallback implementation was completely broken, so I rewrote it so that it passes the tests as well. This should be used instead of PR #1631 .
2019-11-20 21:33:42 +05:30
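For readers unfamiliar with callable detection, here is a minimal C++11-compatible sketch of the general technique. It assumes hypothetical names (`is_callable_sketch`, `void_t_`) and is not the helper header's actual implementation. It covers only plain call expressions `f(args...)`, not pointers to members:

```cpp
#include <type_traits>
#include <utility>

// C++11 stand-in for std::void_t; the indirection through make_void keeps
// SFINAE reliable on older compilers.
template <typename... Ts>
struct make_void {
    using type = void;
};
template <typename... Ts>
using void_t_ = typename make_void<Ts...>::type;

// Primary template: not callable unless the specialization below matches.
template <typename Signature, typename = void>
struct is_callable_sketch : std::false_type {};

// Matches when the expression F(Args...) is well-formed.
template <typename F, typename... Args>
struct is_callable_sketch<
    F(Args...),
    void_t_<decltype(std::declval<F>()(std::declval<Args>()...))>>
    : std::true_type {};

// Illustrative functor taking two ints.
struct Adder {
    int operator()(int a, int b) const { return a + b; }
};

static_assert(is_callable_sketch<Adder(int, int)>::value, "two ints are accepted");
static_assert(!is_callable_sketch<Adder(int)>::value, "one int is rejected");
```

From C++17 onward the standard library offers `std::is_invocable`, which also handles member pointers.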
Alex Voicu
b5b3d1bbaa
Extend vector type capabilities and add tests to reflect it. ( #1656 )
2019-11-20 21:32:32 +05:30
mhbliao
ebe0c56f4f
Fix mathlib and app builds with hip-clang. ( #1665 )
2019-11-18 08:18:20 -08:00
Rahul Garg
e39d7497ec
Fix gcc build on NVCC path ( #1661 )
...
* Fix gcc build on NVCC path
* Fix CI build errors
* [dtest] Fix texture and surface obj2D tests
2019-11-18 12:19:22 +05:30
Nick Curtis
3f2316086f
fix complex conjugate for double-complex ( #1659 )
...
The sign of the y component returned from hipConj was incorrect for double-complex. Fixed to match hipConjf above.
2019-11-18 12:19:12 +05:30
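As a reminder of the intended behaviour, conjugation keeps the real part and negates the imaginary part. The sketch below assumes a hypothetical `Complex64` struct that stores the value as (x, y) = (real, imaginary); it is a host-only illustration, not the HIP header code:

```cpp
// Hypothetical stand-in for a double-precision complex value.
struct Complex64 {
    double x;  // real part
    double y;  // imaginary part
};

// Conjugate: the real part is unchanged and the imaginary part changes sign.
constexpr Complex64 conj_sketch(Complex64 z) {
    return Complex64{z.x, -z.y};
}

static_assert(conj_sketch(Complex64{1.0, 2.0}).x == 1.0, "real part is preserved");
static_assert(conj_sketch(Complex64{1.0, 2.0}).y == -2.0, "imaginary part is negated");
```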
Sarbojit2019
b865a50e44
Added null check in hipEventSynchronize ( #1627 )
...
* Added missing null check in hipEventSynchronize
* Minor correction in the Event API description
2019-11-18 12:18:55 +05:30
Alex Voicu
69e74c3e96
Add support for extended launch syntax. ( #1530 )
...
* Add support for extended launch syntax.
* Add unit test.
* Fix typo
* hipExtLaunchKernelGGL lives in hip_ext.h
Change-Id: Ice32dab0d43475fda65c6a910c11416871a8f2ff
* [dtest] remove redundant include from hipModuleGetGlobal dtest
2019-11-16 22:24:07 -08:00
Michael LIAO
d6ff22510e
Remove redundant declarations.
...
- The revised `hip/hip_ext.h` already has those declarations.
2019-11-07 10:11:22 -05:00
Alex Voicu
5530c15cc3
Remove native vector support from the GCC case, since it never worked ( #1637 )
2019-11-07 13:19:14 +05:30
ansurya
e07926ce0f
Fixed texture 2D mapping for pitched arrays & 3D Texture read ( #1415 )
...
Texture 2D image mapping for pitched arrays:
github issue: Texture Object's Buffer seems to be Misaligned #886
JIRA ticket: SWDEV-199313
SWDEV-151670 : Fixed issue with 3D texture with 4 components
SWDEV-151671 : Issue with 2D layered texture with 4 components
2019-11-07 13:17:46 +05:30
Rahul Garg
579a4f36fa
Rename hip/hip_hcc.h to hip/hip_ext.h ( #1341 )
...
* Rename hip/hip_hcc.h to hip/hip_ext.h
* Deprecate hip_hcc.h
2019-11-07 13:17:10 +05:30
Alex Voicu
b9faa9f8ae
Remove leftover noise.
2019-11-06 02:46:21 +02:00
Alex Voicu
e5bd00d06b
__half2 should walk like CUDA and talk like CUDA
2019-11-06 02:43:04 +02:00
Michael LIAO
a7f311cc14
Use portable macro for deprecation message.
2019-11-05 11:51:00 -05:00
Rahul Garg
54fab7c35c
Deprecate HIP Markers ( #1622 )
...
* Deprecate HIP markers
* Deprecate profiler start/stop
2019-11-05 12:32:59 +05:30
Alex Voicu
99b9d5449f
Separate volatile for clarity. Handle assignment.
2019-11-02 22:02:08 +02:00
Alex Voicu
ee5097f2c2
Accessors should work even when oddly volatile.
2019-11-01 22:18:01 +02:00
Rahul Garg
ba8105e0cd
Merge pull request #1515 from ansurya/tex_unbind_issue_fix
...
Fix undefined ref to hipUnbindTexture for texture types
2019-10-30 17:54:15 -07:00
Michael LIAO
5c8a7521f4
[HIP] Correct headers and add missing function templates for hip-clang.
...
- Fix 2 runtime API prototypes
`hipOccupancyMaxActiveBlocksPerMultiprocessor` and
`hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags`
- Add missing function templates of them in hip-clang.
2019-10-29 22:00:11 -04:00
Rahul Garg
e4a1e44162
Revert "Fix occupancy APIs ( #1560 )"
...
This reverts commit af351d7e1b .
2019-10-29 11:41:08 -07:00
Anusha Godavarthy Surya
03623cc3f1
Merge branch 'master' into tex_unbind_issue_fix
2019-10-25 15:54:25 +05:30
Anusha Godavarthy Surya
5f47e99ffe
merge from master
2019-10-25 15:52:09 +05:30
Alex Voicu
dabd939048
Add missing operators, fix GCC compilation. ( #1589 )
2019-10-25 15:44:24 +05:30
Alex Voicu
a855a13c22
Fix deadlock, remove old __sync_* use. ( #1584 )
...
This fixes a deadlock introduced by the switch to TTAS loops, and is therefore mildly urgent (to prevent the CI from hoovering in the broken code).
2019-10-25 15:44:17 +05:30
Rahul Garg
356765a223
Add hipMemcpy2DFromArray ( #1510 )
...
Adds hipMemcpy2DFromArray and hipMemcpy2DFromArrayAsync equivalent to cudaMemcpy2DFromArray and cudaMemcpy2DFromArrayAsync.
2019-10-25 15:43:33 +05:30
Anusha Godavarthy Surya
259d8b4cdf
Merge branch 'master' into tex_unbind_issue_fix
2019-10-25 15:36:55 +05:30
Anusha Godavarthy Surya
ce04bdaa1a
Fixed CI build failure
2019-10-25 12:21:41 +05:30
gandryey
81952ce5a7
Hip vdi profiling header ( #1577 )
...
Add HIP-VDI profiling interface for GPU timing collection.
2019-10-24 17:45:42 +05:30
Alex Voicu
9ba25b42c8
Make CAS loops use the TTAS idiom. ( #1573 )
...
* Make CAS loops use the TTAS idiom.
* More efficient re-formulation of TTAS.
* Fix typo.
* The typo was not quite a typo
2019-10-24 17:45:20 +05:30
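For context, test-and-test-and-set (TTAS) means spinning on a plain load and attempting the atomic read-modify-write only once the value looks favourable, which reduces contention on the cache line. The sketch below shows the idiom on a hypothetical host-side `TtasLock` built on `std::atomic`; the HIP device-side CAS loops this commit touched are not reproduced here:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical spinlock illustrating the TTAS idiom with a CAS as the "set" step.
class TtasLock {
public:
    void lock() {
        for (;;) {
            // Test: wait with read-only loads while the lock is held.
            while (locked_.load(std::memory_order_relaxed)) {
            }
            // Test-and-set: attempt the CAS only once the lock looks free.
            bool expected = false;
            if (locked_.compare_exchange_strong(expected, true,
                                                std::memory_order_acquire,
                                                std::memory_order_relaxed)) {
                return;
            }
        }
    }

    // Single attempt: returns true if the lock was acquired.
    bool try_lock() {
        if (locked_.load(std::memory_order_relaxed)) {
            return false;  // cheap early exit without an atomic write
        }
        bool expected = false;
        return locked_.compare_exchange_strong(expected, true,
                                               std::memory_order_acquire,
                                               std::memory_order_relaxed);
    }

    void unlock() { locked_.store(false, std::memory_order_release); }

private:
    std::atomic<bool> locked_{false};
};

// Single-threaded usage check of the lock's state transitions.
inline bool ttas_demo() {
    TtasLock lock;
    lock.lock();
    const bool second_attempt_fails = !lock.try_lock();
    lock.unlock();
    const bool reacquired = lock.try_lock();
    return second_attempt_fails && reacquired;
}
```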
satyanveshd
af351d7e1b
Fix occupancy APIs ( #1560 )
...
Addresses SWDEV-205006
2019-10-24 17:44:47 +05:30
searlmc1
c4a51f3679
Improve performance of v2 arg handling ( #1539 )
...
* Improve performance of v2 arg handling
* Missing change to `std::string`
2019-10-24 17:44:05 +05:30
Alex Voicu
4a635add45
Improve scalar access into vector types. ( #1531 )
...
The improvement is based on the ideas here: https://t0rakka.silvrback.com/simd-scalar-accessor . It yields significantly better ISA when the base's .xyzw members are used.
2019-10-24 17:43:49 +05:30
Vladislav Sytchenko
0b52c1d9d8
Update the declarations of hipMemsetD8, hipMemsetD8Async, hipMemsetD16, hipMemsetD16Async. These functions are type aware and take in as their third argument the number of elements in the buffer, not the buffer size. Change the name of this argument from sizeBytes to count to align with the above description.
2019-10-15 14:18:42 -04:00
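The difference between an element count and a byte size can be illustrated with a host-only sketch. The function `memset_d16_sketch` is a hypothetical analogue of a type-aware 16-bit memset, not the HIP API itself:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical host-side analogue of a type-aware 16-bit memset.
// `count` is the number of 16-bit elements to write, not the number of bytes.
inline void memset_d16_sketch(std::uint16_t* dst, std::uint16_t value, std::size_t count) {
    for (std::size_t i = 0; i < count; ++i) {
        dst[i] = value;
    }
}

// Usage check: writing 3 elements touches 6 bytes and leaves the 4th element alone.
inline bool memset_d16_demo() {
    std::uint16_t buf[4] = {0, 0, 0, 0};
    memset_d16_sketch(buf, 0xABCD, 3);
    return buf[0] == 0xABCD && buf[2] == 0xABCD && buf[3] == 0;
}
```

Passing a byte size where a count is expected would write twice as many elements as intended for 16-bit values, which is the confusion the renamed argument guards against.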