Rahul Garg
90f57d452a
Return hipSuccess when sizeBytes=0 in hipMemset
2018-09-26 12:47:36 +05:30
Maneesh Gupta
66f863d1f3
Merge branch 'master' into support-malloc
2018-09-17 10:17:25 +05:30
Yaxun Sam Liu
1299b65e15
Add HIP_DB=fatbin for debugging fat binary issues
2018-08-17 11:53:45 -04:00
Maneesh Gupta
4cf851f416
Merge pull request #621 from ROCm-Developer-Tools/disable_startup_loader
...
Disable startup loader by default and guard with env var
2018-08-10 10:18:25 +05:30
sunway513
17f38937e0
resolve a segfault bug when env var not set; remove startup_kernel_loader class
2018-08-09 16:40:26 +00:00
sunway513
30dfa6f129
Add more checks to ensure the startup loader is only enabled when the env var is set to 1
2018-08-04 01:52:27 +00:00
sunway513
3a68ab4919
Add startup loader under HIP_STARTUP_LOADER env var, disable by default
2018-08-04 01:48:06 +00:00
Wen-Heng (Jack) Chung
2604f33930
Revert "HIP program state re-initialization logic"
...
This reverts commit 379b7a2241.
2018-08-03 17:03:04 -05:00
Wen-Heng (Jack) Chung
3426f15171
Revert "Improve performance of re-initialization logic"
...
This reverts commit ece4539c1d.
2018-08-03 17:02:58 -05:00
Wen-Heng (Jack) Chung
136bcc2981
Revert "Keep the map which tracks GPU kernel symbols to grow monotonically"
...
This reverts commit 32789a8b7d.
2018-08-03 17:02:50 -05:00
Sarunya Pumma
8111fd3b8b
Remove device mapping from shareWithAll memory
...
When shareWithAll memory (e.g., host memory) is allocated, set appId
in hc::AmPointerInfo to -1 to indicate that this memory is not mapped
to any device. Peer checking in ihipStream_t::canSeeMemory is
unnecessary for memory shared with all devices, so it is skipped.
Previously, host memory was always mapped to device 0, and HIP
performed peer checking for every kind of hipMemcpy. Because peer
checking requires locking the context, any hipMemcpy from or to host
memory grabbed device 0's context lock. If another thread held that
lock (e.g., in hipDeviceSynchronize on device 0), hipMemcpy had to
wait for it before performing the copy, which can significantly
degrade performance.
Signed-off-by: Sarunya Pumma <sarunya.pumma@amd.com>
2018-07-28 23:15:16 -07:00
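The idea in the commit message above (8111fd3b8b) can be sketched as a small stand-alone model. This is not the HIP source: PointerInfo, DeviceContext, and the lockAcquisitions counter are hypothetical stand-ins introduced for illustration, and only the appId == -1 convention comes from the commit message. The sketch shows that a pointer marked as shared with all devices is accepted without touching any device's context lock.

```cpp
#include <algorithm>
#include <array>
#include <mutex>
#include <vector>

// Hypothetical stand-in for hc::AmPointerInfo; only the appId field
// mentioned in the commit message is modeled.
struct PointerInfo {
    int appId;  // owning device index, or -1 if shared with all devices
};

// Hypothetical per-device context: a lock, a peer list, and a counter
// that records how often the lock was taken (for illustration only).
struct DeviceContext {
    std::mutex lock;
    std::vector<int> peers;  // devices allowed to access this device's memory
    int lockAcquisitions = 0;
};

// Simplified model of the peer check. Memory with appId == -1 is visible
// to every device, so the check returns early without locking anything.
bool canSeeMemory(int thisDevice, const PointerInfo& info,
                  std::array<DeviceContext, 2>& contexts) {
    if (info.appId == -1) {
        return true;  // shared with all devices: no peer check, no lock
    }
    DeviceContext& owner = contexts[info.appId];
    std::lock_guard<std::mutex> guard(owner.lock);
    ++owner.lockAcquisitions;
    if (info.appId == thisDevice) {
        return true;  // a device can always see its own memory
    }
    return std::find(owner.peers.begin(), owner.peers.end(), thisDevice) !=
           owner.peers.end();
}
```

In this model a copy involving host memory never waits on device 0's lock, which is the contention the commit message describes.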
Yaxun Sam Liu
02d0e93601
Support malloc/free for hip-clang
2018-07-27 16:24:51 -04:00
Rahul Garg
bc4cdf7e41
Null check before setting offset
2018-07-24 12:25:40 +05:30
Rahul Garg
867a4aa971
Set offset in hipGetTextureAlignmentOffset
2018-07-24 10:11:26 +05:30
Alex Voicu
9938edb636
It is unclear what I was thinking when authoring the original code...
2018-07-17 14:04:57 +01:00
Maneesh Gupta
b5cfa773ef
Merge branch 'master' into move-memcpy
2018-07-17 10:51:42 +05:30
Maneesh Gupta
fbbe2599dd
Merge pull request #515 from ROCm-Developer-Tools/hipclang-add-amdgcn-funcs
...
Add hipclang amdgcn functions
2018-07-17 09:25:09 +05:30
Maneesh Gupta
06f4579c3a
Merge pull request #562 from ROCm-Developer-Tools/fix-build-failure
...
Fix build failure in code_object_bundle.cpp
2018-07-12 07:49:41 +05:30
Maneesh Gupta
296dce2e2b
Merge pull request #546 from gargrahul/fix_bindtex_offset_null_check
...
Fixed offset null check in bind texture functions
2018-07-11 12:52:31 +05:30
Maneesh Gupta
86e10fed99
Merge pull request #545 from ROCm-Developer-Tools/revert-521-temp_fixmemcpy2dasync_trsmissue
...
Revert "Use memcpy kernel for all pinned memory cases in hipMemcpy2DAsync"
2018-07-11 12:52:09 +05:30
Yaxun (Sam) Liu
8136a348ab
Move __hip_hc_memcpy and __hip_hc_memset from device_utils.cpp to device_functions.h as inline functions
2018-07-10 18:12:41 -04:00
Yaxun (Sam) Liu
7db7cce9e4
Fix build failure in code_object_bundle.cpp
2018-07-10 16:49:59 -04:00
Aaron Enye Shi
76f86ef097
Implement hip_ldg Functions into HIP header
...
Move all the function definitions from hip_ldg.cpp into the hip_ldg.h header and enable them for the HIP clang path.
2018-07-05 20:38:46 +00:00
Aaron Enye Shi
47d78e372e
Implement min/max functions in HIP header
...
Remove using hc::precise_math min and max. Instead we can use ocml directly for device and std:: for host.
2018-07-05 20:15:41 +00:00
Aaron Enye Shi
930a16bccd
Implement Memory Fence Functions in header
...
Enabled __llvm_fence_* functions for seq_cst.
2018-07-04 23:35:24 +00:00
Aaron Enye Shi
2975f2a10a
Merge branch 'master' into hipclang-add-amdgcn-funcs
2018-07-04 17:36:08 +00:00
Maneesh Gupta
0c2f985553
Update hip_hcc_internal.h
...
Adding missing include for hip_hcc_internal in order to build with HCC
2018-07-04 09:33:51 +05:30
Rahul Garg
feff0aeea4
Fixed offset null check in bind texture functions
2018-07-03 08:54:17 +05:30
Rahul Garg
7cd1d5e644
Revert "Use memcpy kernel for all pinned memory cases in hipMemcpy2DAsync"
2018-07-02 14:32:11 +05:30
Aaron Enye Shi
9ac31e0bb6
Implement __shfl_* funcs into HIP headers
2018-06-26 18:32:11 +00:00
Aaron Enye Shi
6dc16bbf04
Implement __ballot, __any, __all into HIP headers
2018-06-20 17:39:39 +00:00
Aaron Enye Shi
2142eb4d12
Implement hip_hc.ll into HIP headers
...
Move all __hip_hc_ir_* functions from hip_hc.ll into HIP header as inline asm. Remove hip_hc.ll and build dependencies from HIP.
2018-06-20 17:39:31 +00:00
Aaron Enye Shi
e02fc7e680
Implement device_functions.cpp into HIP headers
...
Move all Integer Intrinsics, device_functions.cpp definitions and HIP specific device functions into HIP headers. Implement the device functions using llvm_intrinsics and device-libs functions instead of calling hc::__* functions. Remove device_functions.cpp since everything is now defined in header.
2018-06-20 17:39:23 +00:00
Aaron Enye Shi
c453b42bff
Add hipclang amdgcn functions
...
These functions are moving from hipclang in the device library to the HIP headers. They are required for the HIPclang project to function.
2018-06-20 17:38:37 +00:00
Maneesh Gupta
946c8da88a
Merge pull request #490 from ROCm-Developer-Tools/feature_decouple_atomics_from_hc
...
Switch the atomic implementation to use Clang builtins.
2018-06-20 14:16:43 +05:30
Maneesh Gupta
836627279f
Merge pull request #457 from whchung/hip-reinit
...
HIP program state re-initialization logic
2018-06-20 09:37:27 +05:30
Wen-Heng (Jack) Chung
32789a8b7d
Keep the map which tracks GPU kernel symbols to grow monotonically
2018-06-18 16:54:18 -05:00
Wen-Heng (Jack) Chung
ece4539c1d
Improve performance of re-initialization logic
...
Keep track of shared libraries already discovered. Do not build HSA executables
for them.
2018-06-15 18:07:33 -05:00
Rahul Garg
cd23905897
TEMP- fix memcpy2dAsync for trsm issue
2018-06-15 16:08:29 +05:30
Wen-Heng (Jack) Chung
379b7a2241
HIP program state re-initialization logic
...
This commit supports kernels dynamically loaded through means such as
dlopen() after the HIP runtime initializes.
2018-06-14 15:46:49 +00:00
Rahul Garg
069e2c34c9
Fix stream resolution in memcpy2dasync
2018-06-14 11:58:56 +05:30
Rahul Garg
00f8a36bc7
Fix retrieved locked ptr offset
2018-06-13 23:10:05 +05:30
Maneesh Gupta
203dd6cb70
Merge pull request #482 from ROCm-Developer-Tools/feature_clean_up_hip_math
...
Switch to using ROCDL directly, as opposed to via HC. Add missing bits.
2018-06-06 16:07:22 +05:30
Maneesh Gupta
9e9c039ee4
Merge pull request #497 from gargrahul/fix_memcpy3d_fastpath
...
Fix hipMemcpy3D for fast path
2018-06-06 14:44:02 +05:30
Maneesh Gupta
216f34eea8
Merge pull request #492 from gargrahul/fix_depth_3d_alloc
...
Fix depth value for 3D allocations
2018-06-06 14:41:23 +05:30
Maneesh Gupta
7311b60220
Merge pull request #491 from scchan/fix_wait
...
callback handling: don't need to wait for the thread to become ready
2018-06-06 14:38:25 +05:30
Rahul Garg
a46ff2afd5
Fix hipMemcpy3D for fast path
2018-06-05 18:54:33 +05:30
Siu Chi Chan
a1f3b587fb
remove the _ready flag in ihipStreamCallback_t and the mutex that protects it.
2018-06-04 17:29:04 -04:00
Rahul Garg
276c948a16
Fix depth value for 3D allocations
2018-06-04 18:00:22 +05:30
Siu Chi Chan
d3a9985f10
callback handler: don't need to wait for the thread to become ready
2018-06-02 17:55:37 -04:00