Commit graph

211 commits

Author SHA1 Message Date
Rahul Garg fe47b2185c [HACK] Temporary fix for hipFree for hipManagedMalloc 2019-09-13 02:10:21 +05:30
Rahul Garg 6545521d6c Revert "Using HSA API for hipMemsetAsync (#1346)" (#1381)
This reverts commit ac62d7a5c0.
2019-09-03 05:13:46 +00:00
Rahul Garg 71559200c0 Fix memcpy with IPC slowness (#1321)
* Fix memcpy with IPC slowness

* Make early erroneous returns

* Real Clean up

* Real Clean up++
2019-08-23 09:19:18 +00:00
Jatin Chaudhary ac62d7a5c0 Using HSA API for hipMemsetAsync (#1346) 2019-08-21 10:00:10 +00:00
Rahul Garg 2405621f62 Add hipMemcpy3DAsync (#1320)
* Add hipMemcpy3DAsync

* Fix CI build error

* Move back stream resolution to internal function

* Remove stream redefinition and check
2019-08-16 02:13:16 +00:00
Rahul Garg 45b73e0961 Add hipMemcpyParam2DAsync (#1296)
* Add hipMemcpyParam2DAsync

* Add NVCC path changes

* Clean up

* Fix build issue

* Fix else use in both sync and async apis
2019-08-09 11:50:37 +00:00
Jeff Daily 1eb3dbf065 consolidate thread local storage (#915)
* all thread local access now through single struct

* clean up old commented-out code, more use of GET_TLS()

* fewer calls to GET_TLS by passing tls as a function argument

* revert unnecessary change to printf

* fix failing tests due to TLS change

* fix merge conflicts in ihipOccupancyMaxActiveBlocksPerMultiprocessor
2019-08-05 09:51:02 +00:00
Rahul Garg 483aab031f Change hipErrorUnknown to hipErrorInvalidValue 2019-07-31 00:28:30 +05:30
Evgeny Mankov 09162d9a53 [HIP] Fix segfault on uninitialized struct members in hipArrayCreate and hipArray3DCreate 2019-07-12 16:38:26 +03:00
Evgeny Mankov c7117df91b [HIP][HIPIFY] Split HIP_ARRAY_DESCRIPTOR struct to HIP_ARRAY_DESCRIPTOR and HIP_ARRAY3D_DESCRIPTOR
[Reason] To be compatible with CUDA [#1133]

Update HIP code, hipify-clang, tests and docs

[TODO] Add support of the corresponding functions on nvcc fallback path
2019-07-11 14:58:16 +03:00
Jatin Chaudhary 5ed16432f8 Adding bounds check before hipMemset (#1190)
* Adding bounds check in ihipMemset

* Adding ihipMemPtrGetInfo to hipMemPtrGetInfo
2019-07-08 11:00:38 +00:00
Anusha Godavarthy Surya 3d5f6be1c7 Added missing NULL checks and corrected API return values as per validation 2019-06-27 00:19:05 +05:30
Evgeny Mankov 8f059b0ee9 [HIP][HIPIFY] Make hipMemcpyParam2D coherent with cuMemcpy2D
+ Makes hip_Memcpy2D struct compatible with CUDA_MEMCPY2D struct
+ Add hipMemcpyParam2D support in nvcc fallback path
+ Update hipify-clang, tests and docs accordingly
2019-05-22 18:31:39 +03:00
Rahul Garg aeeab1b23f Add fine grained host memory lock support (#1095)
* Add fine grained host memory lock support

* Fix default flag check
2019-05-13 11:48:26 +05:30
Rahul Garg 2bc2c46d4d Add hipMallocManaged default functional support (#1036)
* Add hipMallocManaged default functional support

* Fix build error

* Add dtest
2019-04-24 16:50:03 +05:30
Jeff Daily 2b3037a6ea In hipFree, synchronize owner of memory (#1018)
* In hipFree, if memory is associated with a device, synchronize that device's streams.

This changes the behavior from synchronizing the currently set TLS device.

* All devices sync in hipFree for _appId=-1 case.

* Revert "All devices sync in hipFree for _appId=-1 case."

This reverts commit 1efb34d6a8426661e45bc5f763422a1147aeac10.

* add HIP_SYNC_FREE env var
2019-04-16 08:35:55 +05:30
Rahul Garg 0c55db8552 Handle D2D in memcpy2D 2019-03-28 02:21:45 +05:30
Rahul Garg f0af073793 Let hipHostMalloc always share/map pinned host ptr 2019-03-26 10:19:13 +05:30
Rahul Garg 5e917d70f3 Avoid double mapping of devices to hostMalloc buffer 2019-03-25 23:07:05 +05:30
Maneesh Gupta 30b5c02ec4 Merge pull request #970 from mangupta/swdev-172995
hipExtMallocWithFlags implementation
2019-03-25 07:46:53 +00:00
Maneesh Gupta cab119c8b2 hipExtMallocWithFlags needs hcc workweek 19115 or higher 2019-03-25 11:41:20 +05:30
Maneesh Gupta 73ec5d54b5 hipExtMallocWithFlags implementation
Change-Id: Iee9e119796472200b2933d5e23be60813f33bc75
2019-03-19 11:59:22 +05:30
Rahul Garg 918d7e3a40 Add 2D fallback to use copy kernel 2019-03-14 13:03:06 +05:30
Alex Voicu ea0fcf3e61 dlopen() fixes (#929)
* Initial attempt to switch over to internally linked state.

* Add missing CMake update.

* hipLaunchKernelGGLImpl must be inline as well. Ensure internal linkage.

* Ensure global retrieval uses internally linked state.

* Hide HC in the implementation. Minimise ADL woes.

* Strange software exists, and must be catered to.

* Use a less spammy mechanism for ensuring internal linkage / non-export.

* Remove leftover internal detail.
2019-03-06 17:31:44 +05:30
Wen-Heng (Jack) Chung 5cbd28f29b Address code review comments to use hipDeviceptr_t 2019-03-05 05:51:05 +00:00
Wen-Heng (Jack) Chung 7ebbbd3525 Add hipMemsetD32 and hipMemsetD32Async
Add 2 extra memset functions which fills memory with integer-typed data

Also change the parameters of ihipMemset to better explain the semantic
2019-03-04 17:00:33 +00:00
Wilkin Chau 8d92d1ebd7 Fix hipMemset3D test
Calculate the allocated size based on the width, height and depth.
2019-02-28 22:42:46 +00:00
Evgeny 0164464bcc fixing HSA_INIT_API cid args 2019-01-16 23:45:44 -06:00
Maneesh Gupta 56ce3e37d5 Merge pull request #797 from gargrahul/fixhipPointerGetAttributes
Fixed hipPointerGetAttributes for hostmalloced ptr
2018-12-12 10:16:07 +05:30
Maneesh Gupta 0dd26b4f63 Merge pull request #608 from gargrahul/add_pinned_2d_sdma_copy
Added support for pinned 2D SDMA copy
2018-12-12 07:44:16 +05:30
Rahul Garg 5f12067708 Fixed hipPointerGetAttributes for hostmalloced ptr 2018-12-08 01:42:08 +05:30
Maneesh Gupta 160c509e23 Merge pull request #760 from eshcherb/roctracer-hip-frontend-181113
Roctracer hip frontend 181113
2018-11-23 11:08:25 +05:30
Maneesh Gupta bcea027bf1 Merge pull request #748 from mkuron/getsymboladdress
Implement hipGetSymbolAddress and hipGetSymbolSize
2018-11-21 10:32:01 +05:30
Michael Kuron 8610128c3e Merge branch 'master' into getsymboladdress 2018-11-20 12:03:22 +01:00
Rahul Garg 1a038879a9 Fix hipHostRegister 2018-11-17 05:38:35 +05:30
Evgeny e5ba097afd renaming HIP_INIT_CB_API to HIP_INIT_API 2018-11-13 15:33:26 +00:00
Evgeny b8b1637ef7 adding activity prof layer 2018-11-13 15:33:26 +00:00
Rahul Garg 11e7ab8879 Fixed hipMemcpyToSymbol doesn't work on GPU other than device 0 SWDEV-166881 2018-11-13 00:49:20 +05:30
Michael Kuron 6ebcc2922c Use correct trace macro in hipGetSymbolAddress/hipGetSymbolSize 2018-11-06 20:46:30 +01:00
Michael Kuron 31acf1c268 Introduce ihipModuleGetGlobal 2018-11-06 09:54:34 +01:00
Michael Kuron 73616582d6 Implement hipGetSymbolAddress and hipGetSymbolSize 2018-11-04 10:39:34 +01:00
Siu Chi Chan 0ff408a56c Move the global arrays for hip malloc/free
from a header into a source file such that
there's only a unique copy in an executable
and prevent wasting static memory on the host

Change-Id: Id5b62766f77809c8d7b47892cb7149c490dcbdb9
2018-11-01 16:20:35 -04:00
Anton Gorenko 21f044eac8 Fix allocation size of arrays with multiple and/or non-32-bit channels
hipMallocArray and hipMalloc3DArray must use sum of bits
of all components.
2018-10-29 18:12:00 +06:00
Rahul Garg 90f57d452a Return hipSuccess when sizeBytes=0 in hipMemset 2018-09-26 12:47:36 +05:30
Rahul Garg 1e57764378 Added support for pinned 2D SDMA copy 2018-07-31 14:05:35 +05:30
Sarunya Pumma 8111fd3b8b Remove device mapping from shareWithAll memory
When shareWithAll memory (e.g., host memory) is allocated, set appId
in hc::AmPointerInfo to -1 to indicate that this memory is not mapped
to any device.  Peer checking in ihipStream_t::canSeeMemory is not
necessary if memory is shared with all devices.  Thus, it is skipped.

Note that earlier host memory is always mapped to device 0 and HIP
always performs peer checking for all kinds of hipMemcpy.  Since the
peer checking process requires context locking, hipMemcpy from/to host
memory always grabs device 0's context lock.  Therefore, if there is
another thread holding the context lock of device 0 (e.g.,
hipDeviceSynchronize on device 0), hipMemcpy will have to wait for the
lock until it can actually perform memcpy.  This can significantly
deteriorate execution performance.

Signed-off-by: Sarunya Pumma <sarunya.pumma@amd.com>
2018-07-28 23:15:16 -07:00
Rahul Garg 7cd1d5e644 Revert "Use memcpy kernel for all pinned memory cases in hipMemcpy2DAsync" 2018-07-02 14:32:11 +05:30
Rahul Garg cd23905897 TEMP- fix memcpy2dAsync for trsm issue 2018-06-15 16:08:29 +05:30
Rahul Garg 069e2c34c9 Fix stream resolution in memcpy2dasync 2018-06-14 11:58:56 +05:30
Rahul Garg 00f8a36bc7 Fix retrieved locked ptr offset 2018-06-13 23:10:05 +05:30