Commit graph

1171 commits

Author SHA1 Message Date
Rahul Garg edc97f3073 Add hipDrvOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags] (#1854)
Equivalent to cuOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags].
2020-02-28 16:46:55 +05:30
Nick Curtis b7dd073d93 fix long shuffle implementations for windows (#1895)
Fixes for SWDEV-223694
2020-02-26 15:53:56 +05:30
Rahul Garg 8c5e5e435b Fix hipMemcpy3D (#1798)
Fixes #1790 and #1791. hipMemcpy3D still requires further refactoring for different input and output combinations.
2020-02-17 19:35:35 +05:30
Nick Curtis 797a929a65 Implement long / long long shuffles (#1829)
Implement additional data-types for shuffles (long and long long).
Based upon the double implementation.
2020-02-15 09:51:09 +05:30
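One common way to shuffle a 64-bit value is to split it into two 32-bit halves, shuffle each half, and reassemble the result. The sketch below simulates that on the host in plain C++. It is an illustration under stated assumptions, not the code from this commit: the `Lanes32` type, the four-lane "warp", and the `shfl32`/`shfl64` helpers are all hypothetical stand-ins for the device intrinsics.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a warp: one 32-bit register value per lane.
constexpr std::size_t kWarpSize = 4;
using Lanes32 = std::array<std::int32_t, kWarpSize>;

// Simulated 32-bit shuffle: returns the value held by lane `src_lane`.
inline std::int32_t shfl32(const Lanes32& lanes, std::size_t src_lane) {
    assert(src_lane < kWarpSize);
    return lanes[src_lane];
}

// 64-bit shuffle built from two 32-bit shuffles (assumed technique).
inline long long shfl64(const std::array<long long, kWarpSize>& values,
                        std::size_t src_lane) {
    static_assert(sizeof(long long) == 2 * sizeof(std::int32_t),
                  "this sketch expects a 64-bit long long");
    Lanes32 lo{}, hi{};
    for (std::size_t lane = 0; lane < kWarpSize; ++lane) {
        // Split each lane's 64-bit value into two 32-bit halves.
        std::int32_t halves[2];
        std::memcpy(halves, &values[lane], sizeof(halves));
        lo[lane] = halves[0];
        hi[lane] = halves[1];
    }
    // Shuffle the halves independently, then reassemble them in the same order.
    const std::int32_t out[2] = {shfl32(lo, src_lane), shfl32(hi, src_lane)};
    long long result;
    std::memcpy(&result, out, sizeof(result));
    return result;
}
```

Using `std::memcpy` for the split and the reassembly avoids type-punning through pointers, and the round trip is independent of byte order because both steps use the same layout.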
ansurya 8c6934223b Reduce GPU copying based on arch it runs on (#1751)
Implements SWDEV-213230.
2020-02-13 14:21:51 +05:30
Aryan Salmanpour 959f1b0f0e fix build error in nvcc path 2020-02-11 12:16:51 -05:00
Aryan Salmanpour 5a29f27455 Fix a typo causing a build error 2020-02-10 11:44:40 -05:00
Aryan Salmanpour 874b201ee2 resolve merge conflict 2020-02-10 10:30:55 -05:00
Maneesh Gupta f8e1c01900 Revert "Match Occupancy APIs syntax with CUDA (#1625)" (#1857)
Reverting this for now till we figure out how to avoid the build
breakage.

This reverts commit fa98798b63.
2020-02-10 10:45:28 +05:30
Alex Voicu dd34ea95d6 (Maybe) Match alignment between Clang and GCC. (#1789)
Should fix #1740 and the related internal bug.
2020-02-10 10:44:49 +05:30
vsytch ef514eef71 Device texture functions should not normalize the sampled pixel (#1826)
* Device texture functions should not normalize the sampled pixel. This is already done by HW.
* Add support to use h/w capability for normalized float data conversion for driver APIs

Co-authored-by: ansurya <50609411+ansurya@users.noreply.github.com>
2020-02-05 20:56:17 +05:30
Aryan Salmanpour c8137263d6 code clean up 2020-01-31 13:08:25 -05:00
Aryan Salmanpour 6e867eacb6 [HIP][HIPIFY] Add some missing flags for cooperative launch and occupancy APIs 2020-01-30 15:05:53 -05:00
satyanveshd fa98798b63 Match Occupancy APIs syntax with CUDA (#1625)
* Match Occupancy APIs syntax with CUDA and fix tests using these APIs
2020-01-29 13:05:53 -08:00
vsytch f72a669487 Add missing texturePitchAlignment member to the hipDeviceProp_t struct. (#1802)
* Add missing texturePitchAlignment member to the hipDeviceProp_t struct.

* Add missing hipDeviceAttributeTexturePitchAlignment enumerator to the hipDeviceAttribute_t enum.

* Initialize texturePitchAlignment to 256. This works for gfx9+, but is technically overaligned in most cases for pre-gfx9.

* Add the texturePitchAlignment property to the NVCC path.
2020-01-27 16:37:00 -08:00
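A pitch alignment of 256 means that each row's byte pitch is rounded up to a multiple of 256. A minimal host-side sketch of that rounding follows. The helper names `align_up` and `aligned_row_pitch` are hypothetical and are not part of the HIP API; only the value 256 comes from the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Value taken from the commit message above; everything else is illustrative.
constexpr std::size_t kTexturePitchAlignment = 256;

// Rounds `bytes` up to the next multiple of `alignment`.
// `alignment` must be a power of two for the bit mask to be valid.
constexpr std::size_t align_up(std::size_t bytes, std::size_t alignment) {
    return (bytes + alignment - 1) & ~(alignment - 1);
}

// Hypothetical helper: byte pitch for a row of `width` elements.
inline std::size_t aligned_row_pitch(std::size_t width, std::size_t element_size) {
    assert(element_size > 0);
    return align_up(width * element_size, kTexturePitchAlignment);
}
```

For example, a row of 100 four-byte texels occupies 400 bytes, so its pitch under this rule is 512 bytes.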
vsytch 9cfada0f9d Update the HIP_TRSF_* flags to match their CUDA equivalents. (#1801) 2020-01-24 11:41:15 -08:00
mshivama bed8f1c1b8 SWDEV-220503: this_grid().thread_rank() gives incorrect result (#1808)
* Fix a minor bug in computing this_grid().thread_rank()
2020-01-24 16:23:28 +05:30
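The commit message does not say what the bug was. For reference, the conventional definition of a grid-wide thread rank is the block's flattened index times the block size, plus the thread's flattened index within its block. The host-side sketch below computes that value. The `Dim3` struct and both function names are hypothetical, and the code is not taken from the HIP sources.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical host-side mirror of a three-component index or extent.
struct Dim3 {
    std::uint32_t x, y, z;
};

// Flattens a 3D index into a linear one, with x varying fastest.
inline std::uint64_t flatten(Dim3 idx, Dim3 dim) {
    assert(idx.x < dim.x && idx.y < dim.y && idx.z < dim.z);
    return (static_cast<std::uint64_t>(idx.z) * dim.y + idx.y) * dim.x + idx.x;
}

// Rank of a thread within the whole grid, in the range [0, total threads).
inline std::uint64_t grid_thread_rank(Dim3 block_idx, Dim3 grid_dim,
                                      Dim3 thread_idx, Dim3 block_dim) {
    const std::uint64_t block_size =
        static_cast<std::uint64_t>(block_dim.x) * block_dim.y * block_dim.z;
    return flatten(block_idx, grid_dim) * block_size +
           flatten(thread_idx, block_dim);
}
```

With a 2x2x1 grid of 4x2x1 blocks, thread (3, 1, 0) of block (1, 1, 0) is the last thread of the last block, so its rank is 3 * 8 + 7 = 31.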
kpyzhov 566adc4594 Don't use accelerated vector element access for hip-clang. (#1796) 2020-01-15 18:17:08 -08:00
kpyzhov fae85cf6d2 Add missing constructors for Scalar_accessor class. (#1792) 2020-01-14 11:30:21 -08:00
Evgeny Mankov a005a8550d [HIP] Unify hipError_t (Step 3): Sync nvcc path (#1778)
* [HIP] Unify hipError_t (Step 3): Sync nvcc path

* [HIP][fix] Add CUDA 10.x support to nvcc path
2020-01-10 13:47:18 +05:30
Maneesh Gupta 00bd5d1cec Revert PRs that break ROCm builds (#1781)
Fixes SWDEV-218626 and SWDEV-218629

Changes:
- Revert "`static inline` in a header, just like excess sugar in a diet, causes bloat (#1692)"
   This reverts commit be70b9f7e7.
- Revert "Fix rocFFT build failure (#1777)"
   This reverts commit 753277422a.
2020-01-08 15:11:58 +05:30
ansurya 753277422a Fix rocFFT build failure (#1777)
Fixes SWDEV-217761
2020-01-07 08:12:37 +05:30
Rahul Garg a5d7e7d8d3 Add hipBindTexture2D on NVCC path (#1773) 2020-01-06 12:33:50 +05:30
Rahul Garg f3cafd5855 Fix hipcc warning related to hipVersion (#1767)
* Fix hipcc warning related to hipVersion
* Rename hipVersion.h to hip_version.h
* Remove HIP_VERSION splitting
* Update .gitignore
- Ignore generated include/hip/hip_version.h
- Removed some stale entries
- Added executables from samples/1_Utils/*/ for consistency with bin/ entries.
2020-01-06 12:33:23 +05:30
Evgeny Mankov 0dadb23327 Merge pull request #1759 from emankov/master
[HIP] Unify hipError_t (Step 2)
2019-12-30 19:21:09 +03:00
Sarbojit2019 aa4aea0754 Change to generate hipVersion.h (#1726)
HIP_VERSION_MAJOR, HIP_VERSION_MINOR, HIP_VERSION_PATCH and HIP_VERSION pre-processor macros are now defined in hipVersion.h instead of being set by hipcc.
2019-12-30 12:44:24 +05:30
Aryan Salmanpour 6968aeb841 [hip] refactoring cooperative kernel launch APIs (#1737)
This PR is a follow-up to PR #1698. It makes two more APIs (hipLaunchCooperativeKernel and hipLaunchCooperativeKernelMultiDevice) inline so that they work correctly with lazy binding.
2019-12-30 12:42:17 +05:30
Evgeny Mankov 4921678b6c [HIP] Clean-up deprecated HIP error codes
hipErrorMemoryAllocation -> hipErrorOutOfMemory
hipErrorInitializationError -> hipErrorNotInitialized
hipErrorMapBufferObjectFailed -> hipErrorMapFailed
hipErrorInvalidResourceHandle -> hipErrorInvalidHandle
2019-12-23 17:01:35 +03:00
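The four renames listed in this commit can be captured in a small lookup table, for instance when migrating code or log messages that still use the old names. The sketch below works on names as strings only and does not touch the HIP headers. The function `modern_error_name` is a hypothetical helper, not part of HIP or hipify.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Maps a deprecated HIP error-code name to its replacement.
// The four pairs come from the commit message above; names that are not
// listed there are returned unchanged.
inline std::string modern_error_name(const std::string& name) {
    static const std::unordered_map<std::string, std::string> renamed = {
        {"hipErrorMemoryAllocation", "hipErrorOutOfMemory"},
        {"hipErrorInitializationError", "hipErrorNotInitialized"},
        {"hipErrorMapBufferObjectFailed", "hipErrorMapFailed"},
        {"hipErrorInvalidResourceHandle", "hipErrorInvalidHandle"},
    };
    assert(!name.empty());
    const auto it = renamed.find(name);
    return it == renamed.end() ? name : it->second;
}
```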
Yaxun (Sam) Liu 3c90d57072 Add macro __HIP_ENABLE_CUDA_WRAPPER_FOR_OPENMP__ (#1761)
This allows force-enabling the CUDA wrapper for OpenMP, for flexibility.
2019-12-23 19:24:54 +05:30
saleelk 080b0b9a68 Fix the return type of demangle function so that it's compatible across ABIs (#1744) 2019-12-23 19:11:40 +05:30
Alex Voicu 75a11330aa Fix late-coming issues. (#1724)
Implementation for hipMemcpyWithStream.
2019-12-23 19:11:24 +05:30
Maneesh Gupta 7d6634ce9d replace array designator C99 (#1694)
* replace array designator C99

* Update texture_functions.h

Highlight valid and invalid values in texFormatToSize

Co-authored-by: Maneesh Gupta <maneesh.gupta@amd.com>
2019-12-23 19:10:24 +05:30
Alex Voicu be70b9f7e7 static inline in a header, just like excess sugar in a diet, causes bloat (#1692) 2019-12-23 19:09:38 +05:30
Evgeny Mankov 9544682e2c [HIP] Fix typo 2019-12-23 12:06:44 +03:00
Evgeny Mankov dbad4d9b7f [HIP] Unify hipError_t (Step 2)
Step 2. Make a few hipError codes deprecated
Update hipify-clang, hipify-perl, docs and samples accordingly
2019-12-22 02:05:31 +03:00
Maneesh Gupta d92169c05a Update texture_functions.h
Highlight valid and invalid values in texFormatToSize
2019-12-21 12:25:36 +05:30
Sarbojit2019 e2fc00da65 Fix for windows dtest build failure (#1742) 2019-12-19 13:10:43 -08:00
mhbliao 99a3b66110 [hip] Add macro guarding the enum conversion for scalar accessor. (#1748)
- This is a high-overhead part that should be enabled ONLY if necessary.
2019-12-19 10:08:37 -08:00
Evgeny Mankov d8737ba50c [HIP] Unify hipError_t (Step 1)
Step 1. Set the same values for runtime (RT) error codes as for the analogous driver error codes

[Reason] Runtime and driver error codes were unified in CUDA 10.2
2019-12-13 19:40:16 +03:00
mhbliao 444c931641 Only add hipExtLaunchMultiKernelMultiDevice for non-HCC compilers. (#1729) 2019-12-10 10:32:25 -08:00
jglaser 00d735cdc9 fix linking of vector types with gcc (#1690)
* fix linking of vector types when linking hipcc objects with gcc

* use __attribute__((vector_size)) with both clang and gcc

and reinstate nonaligned n=3 vector type

* use implicit conversion to value and ext_vector_type when available

* Alternate formulation for GCC compatibility

* Built-in arrays don't mix well with placement new

* Fix typo

* Add conversions to enum

* Fix Scalar_accessor assignment.

* Update hip_vector_types.h

* stir up the underlying_type hideous mess

This fixes the HIP build issue "error: only enumeration types have underlying types".
2019-12-10 09:40:15 +05:30
mhbliao e9da934ac6 Fix hipExtLaunchMultiKernelMultiDevice refactoring. (#1714)
- Use the correct condition for HIP VDI runtime.
2019-12-06 09:49:17 -08:00
Rahul Garg e53fc316f1 Revert - Changes related to hipMemcpyWithStream (#1718)
Reverting #1673, #1697 and #1707.
Support for hipMemcpyWithStream and the memcpy optimizations will be brought in again once the issues seen with them are resolved independently.
2019-12-06 09:51:53 +05:30
Aryan Salmanpour 68cc787781 [hip] refactoring hipExtLaunchMultiKernelMultiDevice API (#1698)
[Background] When lazy linking is used for a library that calls the hipExtLaunchMultiKernelMultiDevice API, the API can get the wrong program_state object when looking up device kernels, leading to a "No device code available" error.

To fix this, the API was refactored to be inline, so that it obtains the correct program_state and passes it to an internal HIP API that requests the multi-device kernel launch.
2019-12-04 11:50:51 +05:30
Maneesh Gupta 32442c6506 Revert changes for atomic FADD support when address is in LDS (#1701)
This reverts PR #1591 and follow-on PR #1695
2019-11-29 11:58:12 +05:30
Alex Voicu b6514fffb9 Uniform is_shared query. (#1695) 2019-11-28 13:39:05 +05:30
Anusha Godavarthy Surya edf29b8673 replace array designator C99 2019-11-25 16:51:49 +05:30
Alex Voicu aaf31b6b96 Unary operators were too restrictive in the type of their argument. (#1683) 2019-11-22 07:54:53 +05:30
ansurya e60dec51da Fix rocBLAS compilation failure (#1677)
SWDEV-212749:
o Recent changes to “add support for extended launch” require hip_runtime.h to be included in hip_ext.h
o Order in which external applications include hip_hcc.h/hip_runtime.h causes compilation failure
2019-11-22 07:54:17 +05:30
Alex Voicu d597e7ca20 Use native support for atomic FADD when address is in LDS (#1591) 2019-11-22 07:53:48 +05:30