Commit Graph

1199 Commits

Author SHA1 Message Date
ansurya 770e76e752 Initial support for bfloat16 (#1980) 2020-04-06 15:35:43 +05:30
Yaxun (Sam) Liu 4af2106d10 Fix ambiguity of fma for _Float16 for libc++ (#1976)
libc++ defines fma as a template function for auto promotion of mixed-type
arguments. libc++ does not handle _Float16 because _Float16 is not a type
supported by the C++ standard. As such, it is unlikely we can commit our fix for
_Float16 to libc++ trunk.

Therefore we handle _Float16 with a template specialization of
__numeric_type in HIP headers.

Change-Id: If01960a657ebf1a7a67463cdcf66fab7458dff3c
2020-04-06 15:35:18 +05:30
Maneesh Gupta cbc3d1713f Remove address_space(1) typecast and use __ockl_atomic_add_noret_f32 (#1956)
* Remove address_space(1) typecast for ockl_global_atomic_add_f32
* use __ockl_atomic_add_noret_f32
2020-03-28 17:28:33 +05:30
Siu Chi Chan 43abf84f54 don't expose symbols from code_object_bundle (#1971)
Change-Id: I56479485aad42c3d517fe6d9055be1cd846eeb00
2020-03-27 14:09:07 +05:30
Sarbojit2019 5024f9057a Fix for __usad issue (#1972)
Fixes #1930
2020-03-26 17:09:44 +05:30
Benjamin Sherman 3d38135ae2 Add const qualifiers to HIP_vector_type unary arithmetic operators (#1965)
Resolves issue #1960
2020-03-26 17:09:00 +05:30
Joseph Greathouse f61b79d9a3 Fix cooperative launch APIs to set hipGetLastError (#1935)
* Fix cooperative launch APIs to set hipGetLastError

Previously, the cooperative launch APIs did not properly log their
errors in the global hipGetLastError variable before returning back
to the user. As such, the APIs would leave hipSuccess in the
last error, which would break some use cases.

This fixes that problem by making a trampoline function that does
the HIP_INIT_API and ihipLogStatus.

* Add missing flag to the log of multi-GPU launch
2020-03-25 14:39:24 -07:00
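
The trampoline pattern described in the commit above can be sketched in plain host C++. This is only an illustration under stated assumptions: every `demo_*` name is hypothetical, `demo_log_status` merely stands in for the role the message assigns to `ihipLogStatus`, and the real HIP macros do more work than this.

```cpp
// Hypothetical stand-ins; none of these names come from the HIP sources.
enum demo_error_t { demoSuccess = 0, demoErrorInvalidValue = 1 };

// Per-thread "last error" slot, playing the role of the state behind hipGetLastError.
static thread_local demo_error_t demo_last_error = demoSuccess;

// Records the status and passes it through (assumed analogue of ihipLogStatus).
static demo_error_t demo_log_status(demo_error_t status) {
    demo_last_error = status;
    return status;
}

// Internal implementation: returns a status but does not record it anywhere.
static demo_error_t demo_launch_impl(int num_blocks) {
    return num_blocks > 0 ? demoSuccess : demoErrorInvalidValue;
}

// Public entry point: a trampoline that forwards to the implementation
// and logs whatever status comes back before returning it to the caller.
demo_error_t demo_launch(int num_blocks) {
    return demo_log_status(demo_launch_impl(num_blocks));
}

// Returns the recorded status and resets the slot to success.
demo_error_t demo_get_last_error() {
    demo_error_t recorded = demo_last_error;
    demo_last_error = demoSuccess;
    return recorded;
}
```

Calling `demo_launch(0)` fails and leaves the failure visible to a later `demo_get_last_error()` call. Without the trampoline, the slot would still hold the success value, which is the symptom the commit message describes.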
Nick Curtis b4c69a2e4a Update hip_runtime_api.h (#1966)
Correct URL for deprecated api list
2020-03-23 10:16:24 -07:00
Yaxun (Sam) Liu 08d9759eba Workaround for libc++ include path for HIP-Clang (#1917)
HIP-Clang cuda_wrapper headers require the clang include path to come before the standard C++ include path.
However, the libc++ include path must come before the clang include path.
To work around this, we pass -isystem with the parent directory of the clang include
path instead of the clang include path itself.
2020-03-18 11:20:21 +05:30
Jatin Chaudhary 16a6a94fbf Adding Half Abs APIs (#1902) 2020-03-17 14:13:19 +05:30
Sameer Sahasrabuddhe 899c878703 enable HCC printf when using hip-clang (#1947)
This allows printf to work with hip-clang and HCC runtime. See comments under #1919 for a reported bug and feature request.
2020-03-17 14:03:27 +05:30
Joseph Greathouse f7e85649f4 Fix compiler warning on NVCC path (#1942)
GCC emits a warning about using static functions like
hipCUDAErrorTohipError inside this function, because it has an
inline directive, but it's not static. Adding static to this function
to silence warnings (and prevent potential problems in the future).
2020-03-17 14:02:59 +05:30
Joseph Greathouse 4128d68ed7 Fix occupancy calculations API on NVCC (#1941)
NVCC warned if you tried to use hipOccupancyMaxActiveBlocksPerMultiprocessor
because when passing in a device function pointer, "const void* func" was
insufficient to describe it accurately. Adding a C++ templated class type
definition for this function.
2020-03-17 14:02:48 +05:30
Sarbojit2019 320742e8a0 Fix __sad signature match with Cuda (#1936)
Fix for issue #1930
2020-03-17 14:02:00 +05:30
Aryan Salmanpour 015895a265 [HIP] add cooperative kernel launch APIs on NVCC (#1929) 2020-03-17 14:01:11 +05:30
Maneesh Gupta eee5cc8621 Annotate __constant__ (#1901) 2020-03-17 13:59:44 +05:30
mhbliao 774035d869 [hip] Improve the portability of the header for vector type support. (#1873)
- Need to check the availability of `__has_attribute` builtin macro
  instead of compiler versions. That's more reliable and portable among
  various compilers.
- Provide very basic vector support for unknown compilers.
2020-03-17 13:59:24 +05:30
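
The feature-detection idea in the commit above can be illustrated with a small, compiler-neutral sketch. It is not the HIP header itself: `DEMO_HAS_ATTRIBUTE`, `demo_float4`, and `demo_sum` are hypothetical names, and the struct fallback is just one plausible reading of "very basic support".

```cpp
// Use __has_attribute when the compiler provides it; otherwise answer "no".
// The wrapper avoids writing __has_attribute(...) in a #if on compilers
// that do not define the macro at all.
#if defined(__has_attribute)
#  define DEMO_HAS_ATTRIBUTE(x) __has_attribute(x)
#else
#  define DEMO_HAS_ATTRIBUTE(x) 0
#endif

#if DEMO_HAS_ATTRIBUTE(ext_vector_type)
// Clang-style native vector type.
typedef float demo_float4 __attribute__((ext_vector_type(4)));
#elif DEMO_HAS_ATTRIBUTE(vector_size)
// GCC-style native vector type (size given in bytes).
typedef float demo_float4 __attribute__((vector_size(4 * sizeof(float))));
#else
// Plain-struct fallback for compilers with neither attribute.
struct demo_float4 {
    float data[4];
    float& operator[](int i) { return data[i]; }
    const float& operator[](int i) const { return data[i]; }
};
#endif

// Works identically on all three definitions because each supports operator[].
float demo_sum(const demo_float4& v) {
    return v[0] + v[1] + v[2] + v[3];
}
```

A value such as `demo_float4 v = {1.0f, 2.0f, 3.0f, 4.0f};` can be passed to `demo_sum` regardless of which branch was selected. Keying on the attribute rather than on a compiler version means a new compiler that supports the attribute is picked up without further changes.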
Evgeny Mankov 70f5646f8a Merge pull request #1908 from asalmanp/prop_mulit_coop
[HIP] add hip specific properties for cooperative kernel multi device
2020-03-12 19:12:11 +03:00
Alex Voicu 1c5f526e6b Merge branch 'master' of https://github.com/ROCm-Developer-Tools/HIP into feature_robust_constant 2020-03-12 14:20:26 +00:00
Maneesh Gupta 0726abf424 Expose support for non-returning atomic FADD (#1909)
Change-Id: If5359488324477315a9bd4f308a75f606c065b39
2020-03-11 14:33:15 +05:30
Nick Curtis 09edc7e49c Fix incorrect shfl_xor for Windows
copy/paste error, need __shfl_xor w/ lane_mask
2020-03-10 12:04:05 -05:00
Sameer Sahasrabuddhe 09130b3b92 separate printf declaration for vdi/clang
There are now two implementations of printf in HIP:

1. The implementation for HCC is controlled by the HC_FEATURE_PRINTF
   macro, and it works only with the HCC compiler used in combination
   with the HCC runtime.

2. The implementation for hip-clang requires the VDI runtime, and is
   always enabled with that combination.
2020-03-09 09:40:05 +05:30
Aryan Salmanpour 7e45c54ea6 move new enums to the end to maintain compatibility 2020-03-06 11:38:44 -05:00
Maneesh Gupta 4a40010ac6 Expose support for non-returning atomic FADD
Change-Id: If5359488324477315a9bd4f308a75f606c065b39
2020-03-05 10:30:52 +05:30
Aryan Salmanpour 03797ae986 [HIP] add hip specific properties for cooperative kernel multi device 2020-03-03 13:25:36 -05:00
Alex Voicu 27480ff5a2 Annotate __constant__ 2020-02-28 22:54:00 +02:00
saleelk 3e1f41c165 Fix HIPRTC headers to export C style symbols (#1879) 2020-02-28 16:47:29 +05:30
Rahul Garg 6c5fa32815 Remove deprecated HIP markers (#1876) 2020-02-28 16:47:15 +05:30
Rahul Garg edc97f3073 Add hipDrvOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags] (#1854)
Equivalent to cuOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags].
2020-02-28 16:46:55 +05:30
Nick Curtis b7dd073d93 fix long shuffle implementations for windows (#1895)
Fixes for SWDEV-223694
2020-02-26 15:53:56 +05:30
Rahul Garg 8c5e5e435b Fix hipMemcpy3D (#1798)
Fixes #1790 and #1791. hipMemcpy3D still requires further refactoring for different input and output combinations.
2020-02-17 19:35:35 +05:30
Nick Curtis 797a929a65 Implement long / long long shuffles (#1829)
Implement additional data-types for shuffles (long and long long).
Based upon the double implementation.
2020-02-15 09:51:09 +05:30
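
One common way to build a 64-bit shuffle on top of a 32-bit one is to split the value into two 32-bit halves, shuffle each half, and reassemble. The host-side sketch below simulates that idea with an array standing in for the lanes of a wavefront. It assumes this is the approach meant by "based upon the double implementation"; the `demo_*` names and the four-lane width are purely illustrative.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

constexpr std::size_t kDemoLanes = 4;  // hypothetical, tiny "wavefront"

// Host stand-in for a 32-bit shuffle: read the register held by src_lane.
std::uint32_t demo_shfl32(const std::array<std::uint32_t, kDemoLanes>& regs,
                          int src_lane) {
    return regs[static_cast<std::size_t>(src_lane) % kDemoLanes];
}

// 64-bit shuffle composed from two 32-bit shuffles.
long long demo_shfl64(const std::array<long long, kDemoLanes>& regs,
                      int src_lane) {
    static_assert(sizeof(long long) == 2 * sizeof(std::uint32_t),
                  "sketch assumes a 64-bit long long");
    std::array<std::uint32_t, kDemoLanes> first{}, second{};
    for (std::size_t lane = 0; lane < kDemoLanes; ++lane) {
        // Split each lane's value into two 32-bit words without aliasing issues.
        std::uint32_t halves[2];
        std::memcpy(halves, &regs[lane], sizeof(halves));
        first[lane] = halves[0];
        second[lane] = halves[1];
    }
    // Shuffle each word independently, then reassemble in the same order.
    std::uint32_t out[2] = {demo_shfl32(first, src_lane),
                            demo_shfl32(second, src_lane)};
    long long result;
    std::memcpy(&result, out, sizeof(result));
    return result;
}
```

Because both words are split and rejoined in the same memory order, the sketch does not depend on host endianness, and negative values round-trip unchanged.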
ansurya 8c6934223b Reduce GPU copying based on arch it runs on (#1751)
Implements SWDEV-213230.
2020-02-13 14:21:51 +05:30
Aryan Salmanpour 959f1b0f0e fix build error in nvcc path 2020-02-11 12:16:51 -05:00
Aryan Salmanpour 5a29f27455 Fix a typo causing a build error 2020-02-10 11:44:40 -05:00
Aryan Salmanpour 874b201ee2 resolve merge conflict 2020-02-10 10:30:55 -05:00
Maneesh Gupta f8e1c01900 Revert "Match Occupancy APIs syntax with CUDA (#1625)" (#1857)
Reverting this for now till we figure out how to avoid the build
breakage.

This reverts commit fa98798b63.
2020-02-10 10:45:28 +05:30
Alex Voicu dd34ea95d6 (Maybe) Match alignment between Clang and GCC. (#1789)
Should fix #1740 and the related internal bug.
2020-02-10 10:44:49 +05:30
vsytch ef514eef71 Device texture functions should not normalize the sampled pixel (#1826)
* Device texture functions should not normalize the sampled pixel. This is already done by HW.
* Add support for using the hardware capability for normalized float data conversion in the driver APIs

Co-authored-by: ansurya <50609411+ansurya@users.noreply.github.com>
2020-02-05 20:56:17 +05:30
Aryan Salmanpour c8137263d6 code clean up 2020-01-31 13:08:25 -05:00
Aryan Salmanpour 6e867eacb6 [HIP][HIPIFY] Add some missing flags for cooperative launch and occupancy APIs 2020-01-30 15:05:53 -05:00
satyanveshd fa98798b63 Match Occupancy APIs syntax with CUDA (#1625)
* Match Occupancy APIs syntax with CUDA and fix tests using these APIs
2020-01-29 13:05:53 -08:00
vsytch f72a669487 Add missing texturePitchAlignment member to the hipDeviceProp_t struct. (#1802)
* Add missing texturePitchAlignment member to the hipDeviceProp_t struct.

* Add missing hipDeviceAttributeTexturePitchAlignment enumerator to the hipDeviceAttribute_t enum.

* Initialize texturePitchAlignment to 256. This works for gfx9+, but is technically overaligned in most cases for pre-gfx9.

* Add the texturePitchAlignment property to the NVCC path.
2020-01-27 16:37:00 -08:00
vsytch 9cfada0f9d Update the HIP_TRSF_* flags to match their Cuda equivalents. (#1801) 2020-01-24 11:41:15 -08:00
mshivama bed8f1c1b8 SWDEV-220503: this_grid().thread_rank() gives incorrect result (#1808)
* fix a minor bug while computing this_grid().thread_rank()
2020-01-24 16:23:28 +05:30
kpyzhov 566adc4594 Don't use accelerated vector element access for hip-clang. (#1796) 2020-01-15 18:17:08 -08:00
kpyzhov fae85cf6d2 Add missing constructors for Scalar_accessor class. (#1792) 2020-01-14 11:30:21 -08:00
Evgeny Mankov a005a8550d [HIP] Unify hipError_t (Step 3): Sync nvcc path (#1778)
* [HIP] Unify hipError_t (Step 3): Sync nvcc path

* [HIP][fix] Add CUDA 10.x support to nvcc path
2020-01-10 13:47:18 +05:30
Maneesh Gupta 00bd5d1cec Revert PRs that break ROCm builds (#1781)
Fixes SWDEV-218626 and SWDEV-218629

Changes:
- Revert "`static inline` in a header, just like excess sugar in a diet, causes bloat (#1692)"
   This reverts commit be70b9f7e7.
- Revert "Fix rocFFT build failure (#1777)"
   This reverts commit 753277422a.
2020-01-08 15:11:58 +05:30
ansurya 753277422a Fix rocFFT build failure (#1777)
Fixes SWDEV-217761
2020-01-07 08:12:37 +05:30