Commit Graph

166 Commits

Author SHA1 Message Date
Sarunya Pumma 84aadb9274 Remove device mapping from shareWithAll memory
When shareWithAll memory (e.g., host memory) is allocated, set appId
in hc::AmPointerInfo to -1 to indicate that this memory is not mapped
to any device.  Peer checking in ihipStream_t::canSeeMemory is not
necessary if memory is shared with all devices.  Thus, it is skipped.

Note that earlier host memory is always mapped to device 0 and HIP
always performs peer checking for all kinds of hipMemcpy.  Since the
peer checking process requires context locking, hipMemcpy from/to host
memory always grabs device 0's context lock.  Therefore, if there is
another thread holding the context lock of device 0 (e.g.,
hipDeviceSynchronize on device 0), hipMemcpy will have to wait for the
lock until it can actually perform memcpy.  This can significantly
deteriorate execution performance.

Signed-off-by: Sarunya Pumma <sarunya.pumma@amd.com>
2018-07-28 23:15:16 -07:00
Rahul Garg f554e48db3 Revert "Use memcpy kernel for all pinned memory cases in hipMemcpy2DAsync" 2018-07-02 14:32:11 +05:30
Rahul Garg 007e2a4b5f TEMP- fix memcpy2dAsync for trsm issue 2018-06-15 16:08:29 +05:30
Rahul Garg 2ae3be9773 Fix stream resolution in memcpy2dasync 2018-06-14 11:58:56 +05:30
Rahul Garg 68554e155b Fix retrieved locked ptr offset 2018-06-13 23:10:05 +05:30
Maneesh Gupta 53037472ff Merge pull request #497 from gargrahul/fix_memcpy3d_fastpath
Fix hipMemcpy3D for fast path
2018-06-06 14:44:02 +05:30
Rahul Garg 163d4a5b03 Fix hipMemcpy3D for fast path 2018-06-05 18:54:33 +05:30
Rahul Garg fa6ce7a724 Fix depth value for 3D allocations 2018-06-04 18:00:22 +05:30
Rahul Garg a3609eaf61 Fix memcpy2D for malloc+ hostRegister 2018-05-31 13:14:27 +05:30
Rahul Garg 27e0566c3a Use 64x4 grid dims 2018-05-24 23:51:52 +05:30
Rahul Garg aed2653857 Clean up and fix remaining bytes copy 2018-05-24 23:30:27 +05:30
Rahul Garg 53a7c61e9b Fix memcpy2d kernel dims 2018-05-24 17:00:12 +05:30
Rahul Garg 96b4618d26 Correct remaining bytes in copy 2d kernel 2018-05-24 08:27:24 +05:30
Rahul Garg 2c3d1498d4 Optimize memcpy2D kernel use 2018-05-23 14:43:47 +05:30
Maneesh Gupta 7042fe6067 Merge pull request #464 from gargrahul/fix_memcpy2d_pinned_mem_case
Fixed memcpy2D for pinned memory case using 2D kernel
2018-05-22 10:42:28 +05:30
Rahul Garg 40fb44dbe6 Fixed memcpy2D for pinned memory case using 2D kernel 2018-05-21 22:14:45 +05:30
Maneesh Gupta 66d05e6fc3 hipMemcpy returns success if sizeBytes is 0.
Fixes SWDEV-153754 & SWDEV-154178.
2018-05-21 15:38:44 +05:30
Rahul Garg 4f5bdb071c Fix for memcpy2DAsync for pinned host memory case 2018-05-18 21:09:50 +05:30
Maneesh Gupta 1c93e11cdf Merge pull request #433 from gargrahul/add_hipmemset3d
Added hipMemset3D
2018-05-18 14:54:15 +05:30
Rahul Garg 8413fb51e1 Fixed hipMemcpy2D to handle 1D memcpy case 2018-05-16 11:07:10 +05:30
Rahul Garg 78568435da Added hipMemset3D 2018-05-07 10:24:30 +05:30
Lakhan Singh 12d8a47c0c Null checks added for hipmallocpitch and hipmemcpy apis 2018-05-03 09:27:50 +05:30
Rahul Garg 1d76c48e3d Fix texture 3D for HIP/NVCC 2018-05-02 11:56:37 +05:30
Lakhan Singh 74faa61d52 SWDEV-141024 2018-04-20 17:40:00 +05:30
Rahul Garg fcc0866681 Added hipMemset2DAsync support 2018-04-17 18:27:27 +05:30
Rahul Garg 2ee80397f2 Correct missed ihipMemsetCopyDataType change 2018-04-12 10:27:19 +05:30
Rahul Garg 2658963f5b Changed ihipMemsetCopyDataType to ihipMemsetDataType 2018-04-12 09:29:22 +05:30
Rahul Garg 88073a17c1 Fix hipMemset stream resolution 2018-04-11 19:01:53 +05:30
Rahul Garg 40846f6f8e hipMemset refactoring 2018-04-11 15:58:48 +05:30
Maneesh Gupta 445de66560 hipMemcpyAsync returns success when trying to copy 0 bytes
Change-Id: I4c0ee7ccc7563e2df657b50356cdd7fec9a1ef15
2018-04-09 12:39:44 +05:30
Maneesh Gupta 9e47fccc89 Apply .clangformat to all repo source files
Change-Id: I7e79c6058f0303f9a98911e3b7dd2e8596079344
2018-03-12 11:29:03 +05:30
Alex Voicu 696f0595a3 Change directory name to match HIP lowercase style. 2018-02-22 13:15:10 +00:00
Maneesh Gupta 647d1ba310 Merge pull request #321 from gargrahul/hipMemcpyArray_Functions
Added support for hipMemcpy Array functions-
2018-02-12 10:36:38 +05:30
Rahul Garg b8c23f979b Fixed host allocated globals address lookup for host usage
Fixed texture driver APIs failure
2018-01-30 18:06:31 +05:30
Rahul Garg ca5bcb5af4 Added support for -
- hipMemcpyFromArray
- hipMemcpyAtoH
- hipMemcpyHtoA
2018-01-16 11:44:19 +05:30
Rahul Garg 105df94cd0 Added support for
- 3D texture driver APIs
- hipMalloc3D
- hipMemcpy3D for destination other than array
2017-12-05 14:11:13 +05:30
Ben Sander 5933ca7300 Fix some cppcheck style issues. 2017-12-01 20:45:34 +00:00
Ben Sander 743c54a012 Fix warning from default cppchek. 2017-12-01 20:45:33 +00:00
Alex Voicu 33bb425013 Fix legacy mode detection of the address of an agent allocated variable. In this mode, there exist two executables per each code object, one created by HCC and one created by HIP. Since we dispatch through HCC in legacy mode, we should obtain the address for an agent allocated variable from the latter's executable. Also add two omitted validity checks, whose absence could lead to segfaults when the current process had no .kernel section and / or when an invalid or empty blob was extracted from the latter. 2017-11-30 03:29:04 +00:00
Alex Voicu 4966518846 Revert "Revert adoption of CUDA indexing in general - this can only work with later versions of the compiler, just like module based dispatch, and thus must be guarded against usage in earlier (e.g. 1.6) versions."
This reverts commit 4792475
2017-11-29 21:49:10 +00:00
Alex Voicu 2557000b56 Revert "Revert adoption of CUDA indexing in general - this can only work with later versions of the compiler, just like module based dispatch, and thus must be guarded against usage in earlier (e.g. 1.6) versions."
This reverts commit d2fd1f5
2017-11-29 21:36:29 +00:00
Alex Voicu d5c8de3f41 Change memset kernel to use memcpy instead of placement new. Simplify indexers. 2017-11-28 19:45:47 +00:00
Alex Voicu d37a5a6008 Merge remote-tracking branch 'origin/master' into feature_use_module_based_dispatch_instead_of_pfe
# Conflicts:
#	src/hip_module.cpp
2017-11-28 17:29:11 +00:00
Rahul Garg 38029f2849 Fixed review comments 2017-11-21 21:19:06 +05:30
Rahul Garg 24307fe5c4 Changed function hipMemcpy_2D to hipMemcpyParam2D 2017-11-21 12:36:24 +05:30
Alex Voicu f8c1c1b38e Refactor the __device__ versions of memset and memcpy to be less awkward i.e. not return nullptr as opposed to the destination pointer (it can only be assumed it was done for maximum confusion) and actually unroll as they claim to. Change all of the {to, from}Symbol functions to use hipModuleGetGlobal, as opposed to hc::accelerator::get_symbol_address which is no longer valid with module based dispatch. 2017-11-21 02:40:34 +00:00
Rahul Garg 1851c153f6 Texture driver APIs support 2017-11-09 22:10:55 +05:30
Alex Voicu 0ad1308efd Clean up trailing whitespace so as to reduce noise in #246. 2017-11-08 00:08:55 +00:00
Alex Voicu 17753cbd92 Merge remote-tracking branch 'origin/master' into feature_use_module_based_dispatch_instead_of_pfe
# Conflicts:
#	src/hip_module.cpp
2017-11-03 10:53:39 +00:00
Alex Voicu 4693c5e56c Correctly deal with functions from shared objects, wherein the program visible VA == so_base_va + st_value(function_symbol). Remove quaint usage of pfe for hipMemset (which is actually fill_n). 2017-11-01 22:33:13 +00:00
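
The reasoning in commit 84aadb9274 (skip peer checking, and therefore context locking, for memory that is shared with all devices) can be illustrated with a minimal sketch. This is not the HIP implementation: `PointerInfo`, `DeviceContext`, `kNoDevice`, and the free function `canSeeMemory` below are hypothetical stand-ins for `hc::AmPointerInfo`, the per-device context, and `ihipStream_t::canSeeMemory`. The only detail taken from the commit message is that `appId` is set to -1 when the memory is not mapped to any device.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical stand-in for hc::AmPointerInfo; only the field named in the
// commit message is modeled here.
struct PointerInfo {
    int appId;  // device index, or -1 when the memory is not mapped to a device
};

constexpr int kNoDevice = -1;

// Hypothetical per-device context holding the lock that peer checking takes.
struct DeviceContext {
    std::mutex lock;
    int lockAcquisitions = 0;  // counts lock acquisitions, for illustration only
};

// Returns true if the copy may proceed. Memory shared with all devices skips
// the peer check entirely, so no device context lock is taken on that path.
bool canSeeMemory(std::vector<DeviceContext>& devices, const PointerInfo& info) {
    if (info.appId == kNoDevice) {
        return true;  // visible to every device: no peer check, no lock
    }
    assert(info.appId >= 0 &&
           static_cast<std::size_t>(info.appId) < devices.size());
    DeviceContext& ctx = devices[static_cast<std::size_t>(info.appId)];
    std::lock_guard<std::mutex> guard(ctx.lock);  // may block on another holder
    ++ctx.lockAcquisitions;
    return true;  // the actual peer-access check is elided in this sketch
}
```

In this sketch, a copy involving host memory tagged with `kNoDevice` never waits on device 0's lock, whereas memory mapped to device 0 would block while another thread (for example one inside a device synchronize call) holds that lock.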