Maneesh Gupta
|
27e2af1773
|
Merge pull request #490 from ROCm-Developer-Tools/feature_decouple_atomics_from_hc
Switch the atomic implementation to use Clang builtins.
[ROCm/hip commit: 946c8da88a]
|
2018-06-20 14:16:43 +05:30 |
|
Maneesh Gupta
|
4cdc20a6ce
|
Merge pull request #457 from whchung/hip-reinit
HIP program state re-initialization logic
[ROCm/hip commit: 836627279f]
|
2018-06-20 09:37:27 +05:30 |
|
Wen-Heng (Jack) Chung
|
d214b045c3
|
Keep the map which tracks GPU kernel symbols to grow monotonically
[ROCm/hip commit: 32789a8b7d]
|
2018-06-18 16:54:18 -05:00 |
|
Wen-Heng (Jack) Chung
|
c6b975bf13
|
Improve performance of re-initialization logic
Keep track of shared libraries already discovered. Do not build HSA executables
for them.
[ROCm/hip commit: ece4539c1d]
|
2018-06-15 18:07:33 -05:00 |
|
Rahul Garg
|
388679efc8
|
TEMP- fix memcpy2dAsync for trsm issue
[ROCm/hip commit: cd23905897]
|
2018-06-15 16:08:29 +05:30 |
|
Wen-Heng (Jack) Chung
|
e5ca9eb081
|
HIP program state re-initialization logic
This commit supports kernels loaded dynamically, through means such as
dlopen(), after the HIP runtime initializes.
[ROCm/hip commit: 379b7a2241]
|
2018-06-14 15:46:49 +00:00 |
|
Rahul Garg
|
312999de41
|
Fix stream resolution in memcpy2dasync
[ROCm/hip commit: 069e2c34c9]
|
2018-06-14 11:58:56 +05:30 |
|
Rahul Garg
|
1d6396dfb9
|
Fix retrieved locked ptr offset
[ROCm/hip commit: 00f8a36bc7]
|
2018-06-13 23:10:05 +05:30 |
|
Maneesh Gupta
|
b54be20b05
|
Merge pull request #482 from ROCm-Developer-Tools/feature_clean_up_hip_math
Switch to using ROCDL directly, as opposed to via HC. Add missing bits.
[ROCm/hip commit: 203dd6cb70]
|
2018-06-06 16:07:22 +05:30 |
|
Maneesh Gupta
|
ac027e4092
|
Merge pull request #497 from gargrahul/fix_memcpy3d_fastpath
Fix hipMemcpy3D for fast path
[ROCm/hip commit: 9e9c039ee4]
|
2018-06-06 14:44:02 +05:30 |
|
Maneesh Gupta
|
f4dd7fd056
|
Merge pull request #492 from gargrahul/fix_depth_3d_alloc
Fix depth value for 3D allocations
[ROCm/hip commit: 216f34eea8]
|
2018-06-06 14:41:23 +05:30 |
|
Maneesh Gupta
|
f44e77944d
|
Merge pull request #491 from scchan/fix_wait
callback handling: don't need to wait for the thread to become ready
[ROCm/hip commit: 7311b60220]
|
2018-06-06 14:38:25 +05:30 |
|
Rahul Garg
|
e7bc68d347
|
Fix hipMemcpy3D for fast path
[ROCm/hip commit: a46ff2afd5]
|
2018-06-05 18:54:33 +05:30 |
|
Siu Chi Chan
|
417dde9d73
|
remove the _ready flag in ihipStreamCallback_t and the mutex that protects it.
[ROCm/hip commit: a1f3b587fb]
|
2018-06-04 17:29:04 -04:00 |
|
Rahul Garg
|
6592b35c39
|
Fix depth value for 3D allocations
[ROCm/hip commit: 276c948a16]
|
2018-06-04 18:00:22 +05:30 |
|
Siu Chi Chan
|
4b25b76898
|
callback handler: don't need to wait for the thread to become ready
[ROCm/hip commit: d3a9985f10]
|
2018-06-02 17:55:37 -04:00 |
|
Alex Voicu
|
f7fd20ec17
|
Switch the atomic implementation to use Clang builtins.
[ROCm/hip commit: 089ab3b947]
|
2018-06-02 12:27:17 +01:00 |
|
Alex Voicu
|
980fa8050d
|
Remove vestigial implementations.
[ROCm/hip commit: 14e449b5bb]
|
2018-06-02 11:37:08 +01:00 |
|
Rahul Garg
|
07115e0c02
|
Add integrated device property
[ROCm/hip commit: 1a02bc364f]
|
2018-06-02 13:11:16 +05:30 |
|
Alex Voicu
|
b9bf931765
|
Re-sync with upstream.
[ROCm/hip commit: 417869821d]
|
2018-06-01 15:49:05 +01:00 |
|
Maneesh Gupta
|
eded0da7b5
|
Merge pull request #484 from gargrahul/fix_malloc_hiphostreg
Fix memcpy2D for malloc+ hostRegister
[ROCm/hip commit: df450c6680]
|
2018-06-01 16:53:25 +05:30 |
|
Maneesh Gupta
|
7f9b00ba19
|
Merge pull request #466 from ROCm-Developer-Tools/feature_use_Float16
Feature use _Float16 and match CUDA __half behaviour.
[ROCm/hip commit: bdf2645713]
|
2018-06-01 13:50:12 +05:30 |
|
Alex Voicu
|
63c8aa6fcb
|
Re-sync with upstream. Add integer abs.
[ROCm/hip commit: ab4b2a650b]
|
2018-05-31 16:38:00 +01:00 |
|
Rahul Garg
|
46e623fb31
|
Fix memcpy2D for malloc+ hostRegister
[ROCm/hip commit: 8d6357669d]
|
2018-05-31 13:14:27 +05:30 |
|
Alex Voicu
|
1d220f8867
|
Switch to using ROCDL directly, as opposed to via HC. Add missing bits.
[ROCm/hip commit: 59db16fd36]
|
2018-05-31 03:17:26 +01:00 |
|
Maneesh Gupta
|
b60b8591ac
|
Merge pull request #472 from Jorghi12/patch-3
Adding double/long int signatures for abs
[ROCm/hip commit: 57fb96013c]
|
2018-05-30 08:32:14 +05:30 |
|
Jorghi12
|
c46563addc
|
Update math_functions.cpp
CUDA also has a function named labs.
[ROCm/hip commit: ec2edb2c92]
|
2018-05-26 16:21:14 -04:00 |
|
Jorghi12
|
2708f3325d
|
Adding double/long int signatures for abs
Adding overloads for abs that are found in CUDA's math_functions.
[ROCm/hip commit: 4383d6c6de]
|
2018-05-26 00:40:14 -04:00 |
|
Rahul Garg
|
4021f68f64
|
Use 64x4 grid dims
[ROCm/hip commit: d8cb47242b]
|
2018-05-24 23:51:52 +05:30 |
|
Rahul Garg
|
35169c5191
|
Clean up and fix remaining bytes copy
[ROCm/hip commit: 4ff059d641]
|
2018-05-24 23:30:27 +05:30 |
|
Alex Voicu
|
a7da1ccf2e
|
Remove vestigial inline LLVMIR.
[ROCm/hip commit: 9c7fbdb597]
|
2018-05-24 12:46:14 +01:00 |
|
Rahul Garg
|
fb745baa7e
|
Fix memcpy2d kernel dims
[ROCm/hip commit: 981e56a68f]
|
2018-05-24 17:00:12 +05:30 |
|
Rahul Garg
|
fb1425959e
|
Correct remaining bytes in copy 2d kernel
[ROCm/hip commit: dc179e0c33]
|
2018-05-24 08:27:24 +05:30 |
|
Alex Voicu
|
4ceb9cbc09
|
Missing commit.
[ROCm/hip commit: 6f819f226b]
|
2018-05-23 17:57:47 +01:00 |
|
Rahul Garg
|
08f750571d
|
Optimize memcpy2D kernel use
[ROCm/hip commit: 9a76d5b94c]
|
2018-05-23 14:43:47 +05:30 |
|
Maneesh Gupta
|
06db862856
|
Merge pull request #464 from gargrahul/fix_memcpy2d_pinned_mem_case
Fixed memcpy2D for pinned memory case using 2D kernel
[ROCm/hip commit: 323a6226b0]
|
2018-05-22 10:42:28 +05:30 |
|
Maneesh Gupta
|
51e2e27488
|
Merge pull request #445 from ROCm-Developer-Tools/feature_func_attributes
Add support for the hipFuncGetAttributes interface.
[ROCm/hip commit: df3bb9fc32]
|
2018-05-22 09:37:41 +05:30 |
|
Rahul Garg
|
f02803c527
|
Fixed memcpy2D for pinned memory case using 2D kernel
[ROCm/hip commit: f47a8236d7]
|
2018-05-21 22:14:45 +05:30 |
|
Maneesh Gupta
|
182f8ff28f
|
hipMemcpy returns success if sizeBytes is 0.
Fixes SWDEV-153754 & SWDEV-154178.
[ROCm/hip commit: 0180a82963]
|
2018-05-21 15:38:44 +05:30 |
|
Maneesh Gupta
|
d56eb91596
|
Merge pull request #455 from ROCm-Developer-Tools/magic
Change HIP fat binary magic number
[ROCm/hip commit: cac3f1c7cd]
|
2018-05-21 09:52:03 +05:30 |
|
Alex Voicu
|
69a32877d7
|
Update hip_module.cpp
Typo.
[ROCm/hip commit: cd6c979c27]
|
2018-05-18 17:50:45 +01:00 |
|
Rahul Garg
|
14030c3f17
|
Fix for memcpy2DAsync for pinned host memory case
[ROCm/hip commit: afe62e7030]
|
2018-05-18 21:09:50 +05:30 |
|
Maneesh Gupta
|
3d1d7ccf30
|
Merge pull request #433 from gargrahul/add_hipmemset3d
Added hipMemset3D
[ROCm/hip commit: 03ac8e6a92]
|
2018-05-18 14:54:15 +05:30 |
|
Maneesh Gupta
|
5fc95e7a30
|
Merge pull request #448 from 949f45ac/master
Provide correct __mul64hi and __umul64hi builtins, using code from ROCm-Device-Libs
[ROCm/hip commit: ac7713fa34]
|
2018-05-18 13:18:16 +05:30 |
|
Yaxun (Sam) Liu
|
0a7d196cb5
|
Change HIP fat binary magic number
[ROCm/hip commit: d079463887]
|
2018-05-17 17:04:51 -04:00 |
|
949f45ac
|
c9db90b077
|
Reinstate accidentally deleted uchar2Holder
[ROCm/hip commit: 8303bfdffd]
|
2018-05-17 10:55:45 +02:00 |
|
Rahul Garg
|
4c44cd4a88
|
Fixed hipMemcpy2D to handle 1D memcpy case
[ROCm/hip commit: 8f010ac68e]
|
2018-05-16 11:07:10 +05:30 |
|
Alex Voicu
|
554aaee804
|
Update hip_module.cpp
[ROCm/hip commit: 5325b6535e]
|
2018-05-14 17:15:36 +01:00 |
|
949f45ac
|
40681fef87
|
Provide correct __mul64hi and __umul64hi builtins, using code from ROCm-Device-Libs
[ROCm/hip commit: 79480d7cbd]
|
2018-05-14 08:34:56 +02:00 |
|
Alex Voicu
|
15e0bb5d15
|
Don't use magic constants, they're evil.
Also clarify that the register count cannot be queried at the moment.
[ROCm/hip commit: 1ba8a35dba]
|
2018-05-11 11:31:46 +01:00 |
|