Commit Graph

779 Commits

Author SHA1 Message Date
Maneesh Gupta 0c2f985553 Update hip_hcc_internal.h
Adding missing include for hip_hcc_internal in order to build with HCC
2018-07-04 09:33:51 +05:30
Maneesh Gupta 946c8da88a Merge pull request #490 from ROCm-Developer-Tools/feature_decouple_atomics_from_hc
Switch the atomic implementation to use Clang builtins.
2018-06-20 14:16:43 +05:30
Maneesh Gupta 836627279f Merge pull request #457 from whchung/hip-reinit
HIP program state re-initialization logic
2018-06-20 09:37:27 +05:30
Wen-Heng (Jack) Chung 32789a8b7d Keep the map which tracks GPU kernel symbols to grow monotonically 2018-06-18 16:54:18 -05:00
Wen-Heng (Jack) Chung ece4539c1d Improve performance of re-initialization logic
Keep track of shared libraries already discovered. Do not build HSA executables
for them.
2018-06-15 18:07:33 -05:00
Rahul Garg cd23905897 TEMP- fix memcpy2dAsync for trsm issue 2018-06-15 16:08:29 +05:30
Wen-Heng (Jack) Chung 379b7a2241 HIP program state re-initialization logic
This commit supports kernels loaded dynamically, through means such as
dlopen(), after the HIP runtime initializes.
2018-06-14 15:46:49 +00:00
Rahul Garg 069e2c34c9 Fix stream resolution in memcpy2dasync 2018-06-14 11:58:56 +05:30
Rahul Garg 00f8a36bc7 Fix retrieved locked ptr offset 2018-06-13 23:10:05 +05:30
Maneesh Gupta 203dd6cb70 Merge pull request #482 from ROCm-Developer-Tools/feature_clean_up_hip_math
Switch to using ROCDL directly, as opposed to via HC. Add missing bits.
2018-06-06 16:07:22 +05:30
Maneesh Gupta 9e9c039ee4 Merge pull request #497 from gargrahul/fix_memcpy3d_fastpath
Fix hipMemcpy3D for fast path
2018-06-06 14:44:02 +05:30
Maneesh Gupta 216f34eea8 Merge pull request #492 from gargrahul/fix_depth_3d_alloc
Fix depth value for 3D allocations
2018-06-06 14:41:23 +05:30
Maneesh Gupta 7311b60220 Merge pull request #491 from scchan/fix_wait
callback handling: don't need to wait for the thread to become ready
2018-06-06 14:38:25 +05:30
Rahul Garg a46ff2afd5 Fix hipMemcpy3D for fast path 2018-06-05 18:54:33 +05:30
Siu Chi Chan a1f3b587fb remove the _ready flag in ihipStreamCallback_t and the mutex that protects it. 2018-06-04 17:29:04 -04:00
Rahul Garg 276c948a16 Fix depth value for 3D allocations 2018-06-04 18:00:22 +05:30
Siu Chi Chan d3a9985f10 callback handler: don't need to wait for the thread to become ready 2018-06-02 17:55:37 -04:00
Alex Voicu 089ab3b947 Switch the atomic implementation to use Clang builtins. 2018-06-02 12:27:17 +01:00
Alex Voicu 14e449b5bb Remove vestigial implementations. 2018-06-02 11:37:08 +01:00
Rahul Garg 1a02bc364f Add integrated device property 2018-06-02 13:11:16 +05:30
Alex Voicu 417869821d Re-sync with upstream. 2018-06-01 15:49:05 +01:00
Maneesh Gupta df450c6680 Merge pull request #484 from gargrahul/fix_malloc_hiphostreg
Fix memcpy2D for malloc+ hostRegister
2018-06-01 16:53:25 +05:30
Maneesh Gupta bdf2645713 Merge pull request #466 from ROCm-Developer-Tools/feature_use_Float16
Feature use _Float16 and match CUDA __half behaviour.
2018-06-01 13:50:12 +05:30
Alex Voicu ab4b2a650b Re-sync with upstream. Add integer abs. 2018-05-31 16:38:00 +01:00
Rahul Garg 8d6357669d Fix memcpy2D for malloc+ hostRegister 2018-05-31 13:14:27 +05:30
Alex Voicu 59db16fd36 Switch to using ROCDL directly, as opposed to via HC. Add missing bits. 2018-05-31 03:17:26 +01:00
Maneesh Gupta 57fb96013c Merge pull request #472 from Jorghi12/patch-3
Adding double/long int signatures for abs
2018-05-30 08:32:14 +05:30
Jorghi12 ec2edb2c92 Update math_functions.cpp
CUDA also has a function named labs.
2018-05-26 16:21:14 -04:00
Jorghi12 4383d6c6de Adding double/long int signatures for abs
Adding overloads for abs that are found in cuda's math_functions.
2018-05-26 00:40:14 -04:00
Rahul Garg d8cb47242b Use 64x4 grid dims 2018-05-24 23:51:52 +05:30
Rahul Garg 4ff059d641 Clean up and fix remaining bytes copy 2018-05-24 23:30:27 +05:30
Alex Voicu 9c7fbdb597 Remove vestigial inline LLVMIR. 2018-05-24 12:46:14 +01:00
Rahul Garg 981e56a68f Fix memcpy2d kernel dims 2018-05-24 17:00:12 +05:30
Rahul Garg dc179e0c33 Correct remaining bytes in copy 2d kernel 2018-05-24 08:27:24 +05:30
Alex Voicu 6f819f226b Missing commit. 2018-05-23 17:57:47 +01:00
Rahul Garg 9a76d5b94c Optimize memcpy2D kernel use 2018-05-23 14:43:47 +05:30
Maneesh Gupta 323a6226b0 Merge pull request #464 from gargrahul/fix_memcpy2d_pinned_mem_case
Fixed memcpy2D for pinned memory case using 2D kernel
2018-05-22 10:42:28 +05:30
Maneesh Gupta df3bb9fc32 Merge pull request #445 from ROCm-Developer-Tools/feature_func_attributes
Add support for the hipFuncGetAttributes interface.
2018-05-22 09:37:41 +05:30
Rahul Garg f47a8236d7 Fixed memcpy2D for pinned memory case using 2D kernel 2018-05-21 22:14:45 +05:30
Maneesh Gupta 0180a82963 hipMemcpy returns success if sizeBytes is 0.
Fixes SWDEV-153754 & SWDEV-154178.
2018-05-21 15:38:44 +05:30
Maneesh Gupta cac3f1c7cd Merge pull request #455 from ROCm-Developer-Tools/magic
Change HIP fat binary magic number
2018-05-21 09:52:03 +05:30
Alex Voicu cd6c979c27 Update hip_module.cpp
Typo.
2018-05-18 17:50:45 +01:00
Rahul Garg afe62e7030 Fix for memcpy2DAsync for pinned host memory case 2018-05-18 21:09:50 +05:30
Maneesh Gupta 03ac8e6a92 Merge pull request #433 from gargrahul/add_hipmemset3d
Added hipMemset3D
2018-05-18 14:54:15 +05:30
Maneesh Gupta ac7713fa34 Merge pull request #448 from 949f45ac/master
Provide correct __mul64hi and __umul64hi builtins, using code from ROCm-Device-Libs
2018-05-18 13:18:16 +05:30
Yaxun (Sam) Liu d079463887 Change HIP fat binary magic number 2018-05-17 17:04:51 -04:00
949f45ac 8303bfdffd Reinstate accidentally deleted uchar2Holder 2018-05-17 10:55:45 +02:00
Rahul Garg 8f010ac68e Fixed hipMemcpy2D to handle 1D memcpy case 2018-05-16 11:07:10 +05:30
Alex Voicu 5325b6535e Update hip_module.cpp 2018-05-14 17:15:36 +01:00
949f45ac 79480d7cbd Provide correct __mul64hi and __umul64hi builtins, using code from ROCm-Device-Libs 2018-05-14 08:34:56 +02:00