Evgeny
0a58dc9b7b
adding activity prof layer
...
[ROCm/hip commit: b8b1637ef7 ]
2018-11-13 15:33:26 +00:00
Siu Chi Chan
1159b4aa05
Move the global arrays for hip malloc/free
...
from a header into a source file so that
there is only one copy per executable,
which avoids wasting static memory on the host
Change-Id: Id5b62766f77809c8d7b47892cb7149c490dcbdb9
[ROCm/hip commit: 0ff408a56c ]
2018-11-01 16:20:35 -04:00
Anton Gorenko
f2ce51bdf5
Fix allocation size of arrays with multiple and/or non-32-bit channels
...
hipMallocArray and hipMalloc3DArray must use the sum of the bits
of all components.
[ROCm/hip commit: 21f044eac8 ]
2018-10-29 18:12:00 +06:00
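The sizing rule described in f2ce51bdf5 can be illustrated with a minimal sketch. `ChannelDesc` is a hypothetical stand-in for a channel format descriptor holding per-component bit widths, and `bytesPerElement` / `arrayBytes` are illustrative helpers, not the functions the commit actually changed.

```cpp
#include <cstddef>

// Hypothetical stand-in for a channel format descriptor:
// bit width of each of up to four components (0 = unused).
struct ChannelDesc {
    int x, y, z, w;
};

// Bytes per element: sum the bits of all components, then convert to bytes.
inline std::size_t bytesPerElement(const ChannelDesc& d) {
    return static_cast<std::size_t>(d.x + d.y + d.z + d.w) / 8;
}

// Total allocation size for a width x height x depth array.
// Assumption: a height or depth of 0 is treated as 1 (1D / 2D arrays).
inline std::size_t arrayBytes(const ChannelDesc& d, std::size_t width,
                              std::size_t height, std::size_t depth) {
    const std::size_t h = height ? height : 1;
    const std::size_t dp = depth ? depth : 1;
    return width * h * dp * bytesPerElement(d);
}
```

With this rule a two-channel 32-bit format takes 8 bytes per element, where a size computed from a single 32-bit channel would give 4.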
Rahul Garg
6d53af5a60
Return hipSuccess when sizeBytes=0 in hipMemset
...
[ROCm/hip commit: 90f57d452a ]
2018-09-26 12:47:36 +05:30
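The zero-size behaviour from 6d53af5a60 can be sketched as an early return. `Status` and `memsetChecked` are hypothetical names standing in for the HIP error type and entry point; checking the size before the pointer is an assumption of this sketch, not something the commit message states.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical status codes standing in for hipError_t values.
enum class Status { Success, InvalidValue };

// Fill sizeBytes bytes at dst with value. A zero-byte request is treated
// as a successful no-op (assumed here to be checked before the pointer).
inline Status memsetChecked(void* dst, int value, std::size_t sizeBytes) {
    if (sizeBytes == 0) {
        return Status::Success;
    }
    if (dst == nullptr) {
        return Status::InvalidValue;
    }
    std::memset(dst, value, sizeBytes);
    return Status::Success;
}
```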
Sarunya Pumma
a68ea730c2
Remove device mapping from shareWithAll memory
...
When shareWithAll memory (e.g., host memory) is allocated, set appId
in hc::AmPointerInfo to -1 to indicate that this memory is not mapped
to any device. Peer checking in ihipStream_t::canSeeMemory is not
necessary if memory is shared with all devices. Thus, it is skipped.
Note that previously, host memory was always mapped to device 0 and HIP
always performed peer checking for all kinds of hipMemcpy. Since the
peer checking process requires context locking, hipMemcpy from/to host
memory always grabbed device 0's context lock. Therefore, if another
thread held the context lock of device 0 (e.g., during
hipDeviceSynchronize on device 0), hipMemcpy had to wait for the
lock before it could actually perform the copy. This could significantly
degrade performance.
Signed-off-by: Sarunya Pumma <sarunya.pumma@amd.com>
[ROCm/hip commit: 8111fd3b8b ]
2018-07-28 23:15:16 -07:00
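The idea behind a68ea730c2 can be sketched as a short-circuit that runs before any context lock is taken. `PointerInfo` and `DeviceContext` are hypothetical, simplified types (the real field lives in hc::AmPointerInfo and the real check in ihipStream_t::canSeeMemory), and `peerVisible` is a placeholder for the actual peer lookup.

```cpp
#include <mutex>

// Hypothetical, simplified pointer record.
// appId == -1 means "shared with all devices, not mapped to any one device".
struct PointerInfo {
    int appId;
};

// Hypothetical per-device context holding the lock that peer checks need.
struct DeviceContext {
    std::mutex lock;
    bool peerVisible = true;  // placeholder result of a peer lookup
};

// Returns true if the memory is accessible.
// Memory shared with all devices returns before the lock is touched.
inline bool canSeeMemory(const PointerInfo& info, DeviceContext& owner) {
    if (info.appId == -1) {
        return true;  // shareWithAll: no peer check, no context lock
    }
    std::lock_guard<std::mutex> guard(owner.lock);
    return owner.peerVisible;
}
```

Because the shared case never acquires the lock, a copy involving host memory no longer waits on another thread that is holding that device's context lock.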
Rahul Garg
c957c42c20
Revert "Use memcpy kernel for all pinned memory cases in hipMemcpy2DAsync"
...
[ROCm/hip commit: 7cd1d5e644 ]
2018-07-02 14:32:11 +05:30
Rahul Garg
388679efc8
TEMP- fix memcpy2dAsync for trsm issue
...
[ROCm/hip commit: cd23905897 ]
2018-06-15 16:08:29 +05:30
Rahul Garg
312999de41
Fix stream resolution in memcpy2dasync
...
[ROCm/hip commit: 069e2c34c9 ]
2018-06-14 11:58:56 +05:30
Rahul Garg
1d6396dfb9
Fix retrieved locked ptr offset
...
[ROCm/hip commit: 00f8a36bc7 ]
2018-06-13 23:10:05 +05:30
Maneesh Gupta
ac027e4092
Merge pull request #497 from gargrahul/fix_memcpy3d_fastpath
...
Fix hipMemcpy3D for fast path
[ROCm/hip commit: 9e9c039ee4 ]
2018-06-06 14:44:02 +05:30
Rahul Garg
e7bc68d347
Fix hipMemcpy3D for fast path
...
[ROCm/hip commit: a46ff2afd5 ]
2018-06-05 18:54:33 +05:30
Rahul Garg
6592b35c39
Fix depth value for 3D allocations
...
[ROCm/hip commit: 276c948a16 ]
2018-06-04 18:00:22 +05:30
Rahul Garg
46e623fb31
Fix memcpy2D for malloc + hostRegister
...
[ROCm/hip commit: 8d6357669d ]
2018-05-31 13:14:27 +05:30
Rahul Garg
4021f68f64
Use 64x4 grid dims
...
[ROCm/hip commit: d8cb47242b ]
2018-05-24 23:51:52 +05:30
Rahul Garg
35169c5191
Clean up and fix remaining bytes copy
...
[ROCm/hip commit: 4ff059d641 ]
2018-05-24 23:30:27 +05:30
Rahul Garg
fb745baa7e
Fix memcpy2d kernel dims
...
[ROCm/hip commit: 981e56a68f ]
2018-05-24 17:00:12 +05:30
Rahul Garg
fb1425959e
Correct remaining bytes in copy 2d kernel
...
[ROCm/hip commit: dc179e0c33 ]
2018-05-24 08:27:24 +05:30
Rahul Garg
08f750571d
Optimize memcpy2D kernel use
...
[ROCm/hip commit: 9a76d5b94c ]
2018-05-23 14:43:47 +05:30
Maneesh Gupta
06db862856
Merge pull request #464 from gargrahul/fix_memcpy2d_pinned_mem_case
...
Fixed memcpy2D for pinned memory case using 2D kernel
[ROCm/hip commit: 323a6226b0 ]
2018-05-22 10:42:28 +05:30
Rahul Garg
f02803c527
Fixed memcpy2D for pinned memory case using 2D kernel
...
[ROCm/hip commit: f47a8236d7 ]
2018-05-21 22:14:45 +05:30
Maneesh Gupta
182f8ff28f
hipMemcpy returns success if sizeBytes is 0.
...
Fixes SWDEV-153754 & SWDEV-154178.
[ROCm/hip commit: 0180a82963 ]
2018-05-21 15:38:44 +05:30
Rahul Garg
14030c3f17
Fix for memcpy2DAsync for pinned host memory case
...
[ROCm/hip commit: afe62e7030 ]
2018-05-18 21:09:50 +05:30
Maneesh Gupta
3d1d7ccf30
Merge pull request #433 from gargrahul/add_hipmemset3d
...
Added hipMemset3D
[ROCm/hip commit: 03ac8e6a92 ]
2018-05-18 14:54:15 +05:30
Rahul Garg
4c44cd4a88
Fixed hipMemcpy2D to handle 1D memcpy case
...
[ROCm/hip commit: 8f010ac68e ]
2018-05-16 11:07:10 +05:30
Rahul Garg
e2a2b5bdcf
Added hipMemset3D
...
[ROCm/hip commit: da302c3e93 ]
2018-05-07 10:24:30 +05:30
Lakhan Singh
51dbf4f5ca
Null checks added for hipMallocPitch and hipMemcpy APIs
...
[ROCm/hip commit: 6411ca1f6d ]
2018-05-03 09:27:50 +05:30
Rahul Garg
ab1dabe61b
Fix texture 3D for HIP/NVCC
...
[ROCm/hip commit: 9de5f23d54 ]
2018-05-02 11:56:37 +05:30
Lakhan Singh
701de3092b
SWDEV-141024
...
[ROCm/hip commit: 1c2509dc04 ]
2018-04-20 17:40:00 +05:30
Rahul Garg
e1e88f3bff
Added hipMemset2DAsync support
...
[ROCm/hip commit: 3cfb9c0d40 ]
2018-04-17 18:27:27 +05:30
Rahul Garg
89511823f0
Correct missed ihipMemsetCopyDataType change
...
[ROCm/hip commit: 16c89d101a ]
2018-04-12 10:27:19 +05:30
Rahul Garg
abe14442a5
Changed ihipMemsetCopyDataType to ihipMemsetDataType
...
[ROCm/hip commit: 3d6eb75828 ]
2018-04-12 09:29:22 +05:30
Rahul Garg
6c4236dfb6
Fix hipMemset stream resolution
...
[ROCm/hip commit: 294bf50f68 ]
2018-04-11 19:01:53 +05:30
Rahul Garg
36c6e0019d
hipMemset refactoring
...
[ROCm/hip commit: 412a35be20 ]
2018-04-11 15:58:48 +05:30
Maneesh Gupta
4077c681c9
hipMemcpyAsync returns success when trying to copy 0 bytes
...
Change-Id: I4c0ee7ccc7563e2df657b50356cdd7fec9a1ef15
[ROCm/hip commit: 03eca1c57e ]
2018-04-09 12:39:44 +05:30
Maneesh Gupta
4f42ee762d
Apply .clangformat to all repo source files
...
Change-Id: I7e79c6058f0303f9a98911e3b7dd2e8596079344
[ROCm/hip commit: 1ba06f63c4 ]
2018-03-12 11:29:03 +05:30
Alex Voicu
717a01660a
Change directory name to match HIP lowercase style.
...
[ROCm/hip commit: dc7560ef22 ]
2018-02-22 13:15:10 +00:00
Maneesh Gupta
d4a4a8a1c1
Merge pull request #321 from gargrahul/hipMemcpyArray_Functions
...
Added support for hipMemcpy Array functions-
[ROCm/hip commit: 4b8ae78891 ]
2018-02-12 10:36:38 +05:30
Rahul Garg
e2ade308cf
Fixed host allocated globals address lookup for host usage
...
Fixed texture driver APIs failure
[ROCm/hip commit: 24ab820a11 ]
2018-01-30 18:06:31 +05:30
Rahul Garg
5da7dcbd3b
Added support for -
...
- hipMemcpyFromArray
- hipMemcpyAtoH
- hipMemcpyHtoA
[ROCm/hip commit: 487a430b5a ]
2018-01-16 11:44:19 +05:30
Rahul Garg
299c873e1a
Added support for
...
- 3D texture driver APIs
- hipMalloc3D
- hipMemcpy3D for destination other than array
[ROCm/hip commit: 115c7f2b79 ]
2017-12-05 14:11:13 +05:30
Ben Sander
905389741c
Fix some cppcheck style issues.
...
[ROCm/hip commit: 9bba97fdcc ]
2017-12-01 20:45:34 +00:00
Ben Sander
419a80db24
Fix warning from default cppcheck.
...
[ROCm/hip commit: 4313686d6e ]
2017-12-01 20:45:33 +00:00
Alex Voicu
08a0d96448
Fix legacy mode detection of the address of an agent allocated variable. In this mode, there exist two executables per each code object, one created by HCC and one created by HIP. Since we dispatch through HCC in legacy mode, we should obtain the address for an agent allocated variable from the latter's executable. Also add two omitted validity checks, whose absence could lead to segfaults when the current process had no .kernel section and / or when an invalid or empty blob was extracted from the latter.
...
[ROCm/hip commit: 7c0b9a005b ]
2017-11-30 03:29:04 +00:00
Alex Voicu
8d51eaafb6
Revert "Revert adoption of CUDA indexing in general - this can only work with later versions of the compiler, just like module based dispatch, and thus must be guarded against usage in earlier (e.g. 1.6) versions."
...
This reverts commit 1c50968
[ROCm/hip commit: 32e11e7dc6 ]
2017-11-29 21:49:10 +00:00
Alex Voicu
fcc42f035e
Revert "Revert adoption of CUDA indexing in general - this can only work with later versions of the compiler, just like module based dispatch, and thus must be guarded against usage in earlier (e.g. 1.6) versions."
...
This reverts commit d2fd1f5
[ROCm/hip commit: fbaf729f88 ]
2017-11-29 21:36:29 +00:00
Alex Voicu
9668003fe3
Change memset kernel to use memcpy instead of placement new. Simplify indexers.
...
[ROCm/hip commit: 6e4ca3fbb4 ]
2017-11-28 19:45:47 +00:00
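The change in 9668003fe3 (memcpy instead of placement new) can be illustrated on the host with a minimal sketch. `fillWithMemcpy` is a hypothetical helper, not the kernel from the commit; it assumes the element type is trivially copyable.

```cpp
#include <cstddef>
#include <cstring>
#include <type_traits>

// Write `value` into `count` elements of type T starting at `dst`,
// copying the object representation with memcpy rather than
// constructing each element with placement new.
template <typename T>
void fillWithMemcpy(void* dst, const T& value, std::size_t count) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "memcpy-based fill requires a trivially copyable type");
    auto* bytes = static_cast<unsigned char*>(dst);
    for (std::size_t i = 0; i < count; ++i) {
        std::memcpy(bytes + i * sizeof(T), &value, sizeof(T));
    }
}
```

One practical difference is that memcpy places no alignment requirement on the destination, whereas placement new needs storage suitably aligned for T.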
Alex Voicu
11f7d895f4
Merge remote-tracking branch 'origin/master' into feature_use_module_based_dispatch_instead_of_pfe
...
# Conflicts:
# src/hip_module.cpp
[ROCm/hip commit: dc67ca3feb ]
2017-11-28 17:29:11 +00:00
Rahul Garg
13879387e5
Fixed review comments
...
[ROCm/hip commit: 56862b1c35 ]
2017-11-21 21:19:06 +05:30
Rahul Garg
03552f2e94
Changed function hipMemcpy_2D to hipMemcpyParam2D
...
[ROCm/hip commit: 9866fa250d ]
2017-11-21 12:36:24 +05:30
Alex Voicu
45ff3c31c4
Refactor the __device__ versions of memset and memcpy to be less awkward i.e. not return nullptr as opposed to the destination pointer (it can only be assumed it was done for maximum confusion) and actually unroll as they claim to. Change all of the {to, from}Symbol functions to use hipModuleGetGlobal, as opposed to hc::accelerator::get_symbol_address which is no longer valid with module based dispatch.
...
[ROCm/hip commit: 9d088d2283 ]
2017-11-21 02:40:34 +00:00
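The two properties named in 45ff3c31c4, returning the destination pointer and actually unrolling, can be sketched in plain host C++. `copyBytes` and `fillBytes` are hypothetical names, and the unroll factor of four is an assumption made for illustration; the real functions are `__device__` code.

```cpp
#include <cstddef>

// Byte-wise copy that returns the destination pointer, matching the
// contract of the standard memcpy. The main loop is manually unrolled
// four times; a tail loop handles the remaining 0-3 bytes.
inline void* copyBytes(void* dst, const void* src, std::size_t n) {
    auto* d = static_cast<unsigned char*>(dst);
    const auto* s = static_cast<const unsigned char*>(src);
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        d[i] = s[i];
        d[i + 1] = s[i + 1];
        d[i + 2] = s[i + 2];
        d[i + 3] = s[i + 3];
    }
    for (; i < n; ++i) {
        d[i] = s[i];
    }
    return dst;
}

// Byte-wise fill with the same structure; also returns the destination.
inline void* fillBytes(void* dst, int value, std::size_t n) {
    auto* d = static_cast<unsigned char*>(dst);
    const auto v = static_cast<unsigned char>(value);
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        d[i] = v;
        d[i + 1] = v;
        d[i + 2] = v;
        d[i + 3] = v;
    }
    for (; i < n; ++i) {
        d[i] = v;
    }
    return dst;
}
```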