Commit graph

209 commits

Author SHA1 Message Date
Rahul Garg 4aa011eec6 Fix memcpy with IPC slowness (#1321)
* Fix memcpy with IPC slowness

* Make early erroneous returns

* Real Clean up

* Real Clean up++


[ROCm/hip commit: 71559200c0]
2019-08-23 09:19:18 +00:00
Jatin Chaudhary 7dca0455e9 Using HSA API for hipMemsetAsync (#1346)
[ROCm/hip commit: ac62d7a5c0]
2019-08-21 10:00:10 +00:00
Rahul Garg a984acf245 Add hipMemcpy3DAsync (#1320)
* Add hipMemcpy3DAsync

* Fix CI build error

* Move back stream resolution to internal function

* Remove stream redefinition and check


[ROCm/hip commit: 2405621f62]
2019-08-16 02:13:16 +00:00
Rahul Garg d42844182c Add hipMemcpyParam2DAsync (#1296)
* Add hipMemcpyParam2DAsync

* Add NVCC path changes

* Clean up

* Fix build issue

* Fix else use in both sync and async apis


[ROCm/hip commit: 45b73e0961]
2019-08-09 11:50:37 +00:00
Jeff Daily 9b44993343 consolidate thread local storage (#915)
* all thread local access now through single struct

* clean up old commented-out code, more use of GET_TLS()

* fewer calls to GET_TLS by passing tls as a function argument

* revert unnecessary change to printf

* fix failing tests due to TLS change

* fix merge conflicts in ihipOccupancyMaxActiveBlocksPerMultiprocessor


[ROCm/hip commit: 1eb3dbf065]
2019-08-05 09:51:02 +00:00
Rahul Garg 0009dc1067 Change hipErrorUnknown to hipErrorInvalidValue
[ROCm/hip commit: 483aab031f]
2019-07-31 00:28:30 +05:30
Evgeny Mankov d11063e64c [HIP] Fix segfault on uninitialized struct members in hipArrayCreate and hipArray3DCreate
[ROCm/hip commit: 09162d9a53]
2019-07-12 16:38:26 +03:00
Evgeny Mankov 96801f7b3a [HIP][HIPIFY] Split HIP_ARRAY_DESCRIPTOR struct to HIP_ARRAY_DESCRIPTOR and HIP_ARRAY3D_DESCRIPTOR
[Reason] To be compatible with CUDA [#1133]

Update HIP code, hipify-clang, tests and docs

[TODO] Add support of the corresponding functions on nvcc fallback path


[ROCm/hip commit: c7117df91b]
2019-07-11 14:58:16 +03:00
Jatin Chaudhary c7f8ffe41e Adding bounds check before hipMemset (#1190)
* Adding bounds check in ihipMemset

* Adding ihipMemPtrGetInfo to hipMemPtrGetInfo


[ROCm/hip commit: 5ed16432f8]
2019-07-08 11:00:38 +00:00
Anusha Godavarthy Surya f1d6b56fc4 Added missing NULL checks and corrected API return values as per validation
[ROCm/hip commit: 3d5f6be1c7]
2019-06-27 00:19:05 +05:30
Evgeny Mankov cd309b6638 [HIP][HIPIFY] Make hipMemcpyParam2D coherent with cuMemcpy2D
+ Makes hip_Memcpy2D struct compatible with CUDA_MEMCPY2D struct
+ Add hipMemcpyParam2D support in nvcc fallback path
+ Update hipify-clang, tests and docs accordingly


[ROCm/hip commit: 8f059b0ee9]
2019-05-22 18:31:39 +03:00
Rahul Garg c4567ad01a Add fine grained host memory lock support (#1095)
* Add fine grained host memory lock support

* Fix default flag check


[ROCm/hip commit: aeeab1b23f]
2019-05-13 11:48:26 +05:30
Rahul Garg c01236f679 Add hipMallocManaged default functional support (#1036)
* Add hipMallocManaged default functional support

* Fix build error

* Add dtest


[ROCm/hip commit: 2bc2c46d4d]
2019-04-24 16:50:03 +05:30
Jeff Daily a0172ca884 In hipFree, synchronize owner of memory (#1018)
* In hipFree, if memory is associated with a device, synchronize that device's streams.

This changes the behavior from synchronizing the currently set TLS device.

* All devices sync in hipFree for _appId=-1 case.

* Revert "All devices sync in hipFree for _appId=-1 case."

This reverts commit 1efb34d6a8426661e45bc5f763422a1147aeac10.

* add HIP_SYNC_FREE env var


[ROCm/hip commit: 2b3037a6ea]
2019-04-16 08:35:55 +05:30
Rahul Garg 0eaa29ad06 Handle D2D in memcpy2D
[ROCm/hip commit: 0c55db8552]
2019-03-28 02:21:45 +05:30
Rahul Garg d98d5ca12a Let hipHostMalloc always share/map pinned host ptr
[ROCm/hip commit: f0af073793]
2019-03-26 10:19:13 +05:30
Rahul Garg c6ef785464 Avoid double mapping of devices to hostMalloc buffer
[ROCm/hip commit: 5e917d70f3]
2019-03-25 23:07:05 +05:30
Maneesh Gupta 82fd86e63f Merge pull request #970 from mangupta/swdev-172995
hipExtMallocWithFlags implementation

[ROCm/hip commit: 30b5c02ec4]
2019-03-25 07:46:53 +00:00
Maneesh Gupta 67819c0395 hipExtMallocWithFlags needs hcc workweek 19115 or higher
[ROCm/hip commit: cab119c8b2]
2019-03-25 11:41:20 +05:30
Maneesh Gupta 9ac6005d35 hipExtMallocWithFlags implementation
Change-Id: Iee9e119796472200b2933d5e23be60813f33bc75


[ROCm/hip commit: 73ec5d54b5]
2019-03-19 11:59:22 +05:30
Rahul Garg a3fb908a0a Add 2D fallback to use copy kernel
[ROCm/hip commit: 918d7e3a40]
2019-03-14 13:03:06 +05:30
Alex Voicu 0c16497abd dlopen() fixes (#929)
* Initial attempt to switch over to internally linked state.

* Add missing CMake update.

* hipLaunchKernelGGLImpl must be inline as well. Ensure internal linkage.

* Ensure global retrieval uses internally linked state.

* Hide HC in the implementation. Minimise ADL woes.

* Strange software exists, and must be catered to.

* Use a less spammy mechanism for ensuring internal linkage / non-export.

* Remove leftover internal detail.


[ROCm/hip commit: ea0fcf3e61]
2019-03-06 17:31:44 +05:30
Wen-Heng (Jack) Chung da589e38ed Address code review comments to use hipDeviceptr_t
[ROCm/hip commit: 5cbd28f29b]
2019-03-05 05:51:05 +00:00
Wen-Heng (Jack) Chung 0b7f38d100 Add hipMemsetD32 and hipMemsetD32Async
Add 2 extra memset functions which fills memory with integer-typed data

Also change the parameters of ihipMemset to better explain the semantic


[ROCm/hip commit: 7ebbbd3525]
2019-03-04 17:00:33 +00:00
Wilkin Chau 4a0d68ba3f Fix hipMemset3D test
Calculate the allocated size based on the width, height and depth.


[ROCm/hip commit: 8d92d1ebd7]
2019-02-28 22:42:46 +00:00
Evgeny 3f7ff3450e fixing HSA_INIT_API cid args
[ROCm/hip commit: 0164464bcc]
2019-01-16 23:45:44 -06:00
Maneesh Gupta 3cf96f31d0 Merge pull request #797 from gargrahul/fixhipPointerGetAttributes
Fixed hipPointerGetAttributes for hostmalloced ptr

[ROCm/hip commit: 56ce3e37d5]
2018-12-12 10:16:07 +05:30
Maneesh Gupta 07dcdff9e5 Merge pull request #608 from gargrahul/add_pinned_2d_sdma_copy
Added support for pinned 2D SDMA copy

[ROCm/hip commit: 0dd26b4f63]
2018-12-12 07:44:16 +05:30
Rahul Garg b304ff5210 Fixed hipPointerGetAttributes for hostmalloced ptr
[ROCm/hip commit: 5f12067708]
2018-12-08 01:42:08 +05:30
Maneesh Gupta 05e09614be Merge pull request #760 from eshcherb/roctracer-hip-frontend-181113
Roctracer hip frontend 181113

[ROCm/hip commit: 160c509e23]
2018-11-23 11:08:25 +05:30
Maneesh Gupta e2bc3e49a5 Merge pull request #748 from mkuron/getsymboladdress
Implement hipGetSymbolAddress and hipGetSymbolSize

[ROCm/hip commit: bcea027bf1]
2018-11-21 10:32:01 +05:30
Michael Kuron c35dfb71d5 Merge branch 'master' into getsymboladdress
[ROCm/hip commit: 8610128c3e]
2018-11-20 12:03:22 +01:00
Rahul Garg 89efed29d7 Fix hipHostRegister
[ROCm/hip commit: 1a038879a9]
2018-11-17 05:38:35 +05:30
Evgeny 73e3c4ec42 renaming HIP_INIT_CB_API to HIP_INIT_API
[ROCm/hip commit: e5ba097afd]
2018-11-13 15:33:26 +00:00
Evgeny 0a58dc9b7b adding activity prof layer
[ROCm/hip commit: b8b1637ef7]
2018-11-13 15:33:26 +00:00
Rahul Garg ecea878072 Fixed hipMemcpyToSymbol doesn't work on GPU other than device 0 SWDEV-166881
[ROCm/hip commit: 11e7ab8879]
2018-11-13 00:49:20 +05:30
Michael Kuron f69866eecc Use correct trace macro in hipGetSymbolAddress/hipGetSymbolSize
[ROCm/hip commit: 6ebcc2922c]
2018-11-06 20:46:30 +01:00
Michael Kuron cbba8221ee Introduce ihipModuleGetGlobal
[ROCm/hip commit: 31acf1c268]
2018-11-06 09:54:34 +01:00
Michael Kuron bc455ccf50 Implement hipGetSymbolAddress and hipGetSymbolSize
[ROCm/hip commit: 73616582d6]
2018-11-04 10:39:34 +01:00
Siu Chi Chan 1159b4aa05 Move the global arrays for hip malloc/free
from a header into a source file such that
there's only a unique copy in an executable
and prevent wasting static memory on the host

Change-Id: Id5b62766f77809c8d7b47892cb7149c490dcbdb9


[ROCm/hip commit: 0ff408a56c]
2018-11-01 16:20:35 -04:00
Anton Gorenko f2ce51bdf5 Fix allocation size of arrays with multiple and/or non-32-bit channels
hipMallocArray and hipMalloc3DArray must use sum of bits
of all components.


[ROCm/hip commit: 21f044eac8]
2018-10-29 18:12:00 +06:00
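
The sizing rule named in commit f2ce51bdf5 above (use the sum of the bits of all components) can be sketched in plain C++. This is only an illustration under stated assumptions, not the HIP implementation: `ChannelDesc`, `bytesPerElement`, and `arrayBytes` are hypothetical names, and both the round-up to whole bytes and the treatment of a zero height or depth as 1 are assumptions made for the example.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a channel format descriptor:
// each field holds the bit width of one component (0 if unused).
struct ChannelDesc {
    int x, y, z, w;
};

// Bytes per element: add up the bits of ALL components, then
// round up to whole bytes (the rounding is an assumption).
inline std::size_t bytesPerElement(const ChannelDesc& d) {
    return static_cast<std::size_t>(d.x + d.y + d.z + d.w + 7) / 8;
}

// Total bytes for a width x height x depth array.
// A zero height or depth is treated as 1 (assumption), so the same
// helper covers 1D, 2D, and 3D shapes.
inline std::size_t arrayBytes(const ChannelDesc& d, std::size_t width,
                              std::size_t height, std::size_t depth) {
    const std::size_t h = height ? height : 1;
    const std::size_t z = depth ? depth : 1;
    return width * h * z * bytesPerElement(d);
}
```

Under these assumptions, a four-component descriptor with 32 bits per component occupies 16 bytes per element, whereas sizing from the first component alone would give 4 bytes and under-allocate.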
Rahul Garg 6d53af5a60 Return hipSuccess when sizeBytes=0 in hipMemset
[ROCm/hip commit: 90f57d452a]
2018-09-26 12:47:36 +05:30
Rahul Garg 81074364c8 Added support for pinned 2D SDMA copy
[ROCm/hip commit: 1e57764378]
2018-07-31 14:05:35 +05:30
Sarunya Pumma a68ea730c2 Remove device mapping from shareWithAll memory
When shareWithAll memory (e.g., host memory) is allocated, set appId
in hc::AmPointerInfo to -1 to indicate that this memory is not mapped
to any device.  Peer checking in ihipStream_t::canSeeMemory is not
necessary if memory is shared with all devices.  Thus, it is skipped.

Note that earlier host memory is always mapped to device 0 and HIP
always performs peer checking for all kinds of hipMemcpy.  Since the
peer checking process requires context locking, hipMemcpy from/to host
memory always grabs device 0's context lock.  Therefore, if there is
another thread holding the context lock of device 0 (e.g.,
hipDeviceSynchronize on device 0), hipMemcpy will have to wait for the
lock until it can actually perform memcpy.  This can significantly
deteriorate execution performance.

Signed-off-by: Sarunya Pumma <sarunya.pumma@amd.com>


[ROCm/hip commit: 8111fd3b8b]
2018-07-28 23:15:16 -07:00
Rahul Garg c957c42c20 Revert "Use memcpy kernel for all pinned memory cases in hipMemcpy2DAsync"
[ROCm/hip commit: 7cd1d5e644]
2018-07-02 14:32:11 +05:30
Rahul Garg 388679efc8 TEMP- fix memcpy2dAsync for trsm issue
[ROCm/hip commit: cd23905897]
2018-06-15 16:08:29 +05:30
Rahul Garg 312999de41 Fix stream resolution in memcpy2dasync
[ROCm/hip commit: 069e2c34c9]
2018-06-14 11:58:56 +05:30
Rahul Garg 1d6396dfb9 Fix retrieved locked ptr offset
[ROCm/hip commit: 00f8a36bc7]
2018-06-13 23:10:05 +05:30
Maneesh Gupta ac027e4092 Merge pull request #497 from gargrahul/fix_memcpy3d_fastpath
Fix hipMemcpy3D for fast path

[ROCm/hip commit: 9e9c039ee4]
2018-06-06 14:44:02 +05:30
Rahul Garg e7bc68d347 Fix hipMemcpy3D for fast path
[ROCm/hip commit: a46ff2afd5]
2018-06-05 18:54:33 +05:30