Maneesh Gupta
|
0c2f985553
|
Update hip_hcc_internal.h
Adding missing include for hip_hcc_internal in order to build with HCC
|
2018-07-04 09:33:51 +05:30 |
|
Maneesh Gupta
|
946c8da88a
|
Merge pull request #490 from ROCm-Developer-Tools/feature_decouple_atomics_from_hc
Switch the atomic implementation to use Clang builtins.
|
2018-06-20 14:16:43 +05:30 |
|
Maneesh Gupta
|
836627279f
|
Merge pull request #457 from whchung/hip-reinit
HIP program state re-initialization logic
|
2018-06-20 09:37:27 +05:30 |
|
Wen-Heng (Jack) Chung
|
32789a8b7d
|
Keep the map which tracks GPU kernel symbols to grow monotonically
|
2018-06-18 16:54:18 -05:00 |
|
Wen-Heng (Jack) Chung
|
ece4539c1d
|
Improve performance of re-initialization logic
Keep track of shared libraries already discovered. Do not build HSA executables
for them.
|
2018-06-15 18:07:33 -05:00 |
|
Rahul Garg
|
cd23905897
|
TEMP- fix memcpy2dAsync for trsm issue
|
2018-06-15 16:08:29 +05:30 |
|
Wen-Heng (Jack) Chung
|
379b7a2241
|
HIP program state re-initialization logic
This commit supports kernels loaded dynamically through means such as
dlopen() after the HIP runtime initializes.
|
2018-06-14 15:46:49 +00:00 |
|
Rahul Garg
|
069e2c34c9
|
Fix stream resolution in memcpy2dasync
|
2018-06-14 11:58:56 +05:30 |
|
Rahul Garg
|
00f8a36bc7
|
Fix retrieved locked ptr offset
|
2018-06-13 23:10:05 +05:30 |
|
Maneesh Gupta
|
203dd6cb70
|
Merge pull request #482 from ROCm-Developer-Tools/feature_clean_up_hip_math
Switch to using ROCDL directly, as opposed to via HC. Add missing bits.
|
2018-06-06 16:07:22 +05:30 |
|
Maneesh Gupta
|
9e9c039ee4
|
Merge pull request #497 from gargrahul/fix_memcpy3d_fastpath
Fix hipMemcpy3D for fast path
|
2018-06-06 14:44:02 +05:30 |
|
Maneesh Gupta
|
216f34eea8
|
Merge pull request #492 from gargrahul/fix_depth_3d_alloc
Fix depth value for 3D allocations
|
2018-06-06 14:41:23 +05:30 |
|
Maneesh Gupta
|
7311b60220
|
Merge pull request #491 from scchan/fix_wait
callback handling: don't need to wait for the thread to become ready
|
2018-06-06 14:38:25 +05:30 |
|
Rahul Garg
|
a46ff2afd5
|
Fix hipMemcpy3D for fast path
|
2018-06-05 18:54:33 +05:30 |
|
Siu Chi Chan
|
a1f3b587fb
|
remove the _ready flag in ihipStreamCallback_t and the mutex that protects it.
|
2018-06-04 17:29:04 -04:00 |
|
Rahul Garg
|
276c948a16
|
Fix depth value for 3D allocations
|
2018-06-04 18:00:22 +05:30 |
|
Siu Chi Chan
|
d3a9985f10
|
callback handler: don't need to wait for the thread to become ready
|
2018-06-02 17:55:37 -04:00 |
|
Alex Voicu
|
089ab3b947
|
Switch the atomic implementation to use Clang builtins.
|
2018-06-02 12:27:17 +01:00 |
|
Alex Voicu
|
14e449b5bb
|
Remove vestigial implementations.
|
2018-06-02 11:37:08 +01:00 |
|
Rahul Garg
|
1a02bc364f
|
Add integrated device property
|
2018-06-02 13:11:16 +05:30 |
|
Alex Voicu
|
417869821d
|
Re-sync with upstream.
|
2018-06-01 15:49:05 +01:00 |
|
Maneesh Gupta
|
df450c6680
|
Merge pull request #484 from gargrahul/fix_malloc_hiphostreg
Fix memcpy2D for malloc + hostRegister
|
2018-06-01 16:53:25 +05:30 |
|
Maneesh Gupta
|
bdf2645713
|
Merge pull request #466 from ROCm-Developer-Tools/feature_use_Float16
Feature use _Float16 and match CUDA __half behaviour.
|
2018-06-01 13:50:12 +05:30 |
|
Alex Voicu
|
ab4b2a650b
|
Re-sync with upstream. Add integer abs.
|
2018-05-31 16:38:00 +01:00 |
|
Rahul Garg
|
8d6357669d
|
Fix memcpy2D for malloc + hostRegister
|
2018-05-31 13:14:27 +05:30 |
|
Alex Voicu
|
59db16fd36
|
Switch to using ROCDL directly, as opposed to via HC. Add missing bits.
|
2018-05-31 03:17:26 +01:00 |
|
Maneesh Gupta
|
57fb96013c
|
Merge pull request #472 from Jorghi12/patch-3
Adding double/long int signatures for abs
|
2018-05-30 08:32:14 +05:30 |
|
Jorghi12
|
ec2edb2c92
|
Update math_functions.cpp
CUDA also has a function named labs.
|
2018-05-26 16:21:14 -04:00 |
|
Jorghi12
|
4383d6c6de
|
Adding double/long int signatures for abs
Adding overloads for abs that are found in cuda's math_functions.
|
2018-05-26 00:40:14 -04:00 |
|
Rahul Garg
|
d8cb47242b
|
Use 64x4 grid dims
|
2018-05-24 23:51:52 +05:30 |
|
Rahul Garg
|
4ff059d641
|
Clean up and fix remaining bytes copy
|
2018-05-24 23:30:27 +05:30 |
|
Alex Voicu
|
9c7fbdb597
|
Remove vestigial inline LLVMIR.
|
2018-05-24 12:46:14 +01:00 |
|
Rahul Garg
|
981e56a68f
|
Fix memcpy2d kernel dims
|
2018-05-24 17:00:12 +05:30 |
|
Rahul Garg
|
dc179e0c33
|
Correct remaining bytes in copy 2d kernel
|
2018-05-24 08:27:24 +05:30 |
|
Alex Voicu
|
6f819f226b
|
Missing commit.
|
2018-05-23 17:57:47 +01:00 |
|
Rahul Garg
|
9a76d5b94c
|
Optimize memcpy2D kernel use
|
2018-05-23 14:43:47 +05:30 |
|
Maneesh Gupta
|
323a6226b0
|
Merge pull request #464 from gargrahul/fix_memcpy2d_pinned_mem_case
Fixed memcpy2D for pinned memory case using 2D kernel
|
2018-05-22 10:42:28 +05:30 |
|
Maneesh Gupta
|
df3bb9fc32
|
Merge pull request #445 from ROCm-Developer-Tools/feature_func_attributes
Add support for the hipFuncGetAttributes interface.
|
2018-05-22 09:37:41 +05:30 |
|
Rahul Garg
|
f47a8236d7
|
Fixed memcpy2D for pinned memory case using 2D kernel
|
2018-05-21 22:14:45 +05:30 |
|
Maneesh Gupta
|
0180a82963
|
hipMemcpy returns success if sizeBytes is 0.
Fixes SWDEV-153754 & SWDEV-154178.
|
2018-05-21 15:38:44 +05:30 |
|
Maneesh Gupta
|
cac3f1c7cd
|
Merge pull request #455 from ROCm-Developer-Tools/magic
Change HIP fat binary magic number
|
2018-05-21 09:52:03 +05:30 |
|
Alex Voicu
|
cd6c979c27
|
Update hip_module.cpp
Typo.
|
2018-05-18 17:50:45 +01:00 |
|
Rahul Garg
|
afe62e7030
|
Fix for memcpy2DAsync for pinned host memory case
|
2018-05-18 21:09:50 +05:30 |
|
Maneesh Gupta
|
03ac8e6a92
|
Merge pull request #433 from gargrahul/add_hipmemset3d
Added hipMemset3D
|
2018-05-18 14:54:15 +05:30 |
|
Maneesh Gupta
|
ac7713fa34
|
Merge pull request #448 from 949f45ac/master
Provide correct __mul64hi and __umul64hi builtins, using code from ROCm-Device-Libs
|
2018-05-18 13:18:16 +05:30 |
|
Yaxun (Sam) Liu
|
d079463887
|
Change HIP fat binary magic number
|
2018-05-17 17:04:51 -04:00 |
|
949f45ac
|
8303bfdffd
|
Reinstate accidentally deleted uchar2Holder
|
2018-05-17 10:55:45 +02:00 |
|
Rahul Garg
|
8f010ac68e
|
Fixed hipMemcpy2D to handle 1D memcpy case
|
2018-05-16 11:07:10 +05:30 |
|
Alex Voicu
|
5325b6535e
|
Update hip_module.cpp
|
2018-05-14 17:15:36 +01:00 |
|
949f45ac
|
79480d7cbd
|
Provide correct __mul64hi and __umul64hi builtins, using code from ROCm-Device-Libs
|
2018-05-14 08:34:56 +02:00 |
|