Commit graph

210 Commits

Author SHA1 Message Date
Rahul Garg d433f6fb58 Revert "Using HSA API for hipMemsetAsync (#1346)" (#1381)
This reverts commit 9bbd09b04f.
2019-09-03 05:13:46 +00:00
Rahul Garg a786728939 Fix memcpy with IPC slowness (#1321)
* Fix memcpy with IPC slowness

* Make early erroneous returns

* Real Clean up

* Real Clean up++
2019-08-23 09:19:18 +00:00
Jatin Chaudhary 9bbd09b04f Using HSA API for hipMemsetAsync (#1346) 2019-08-21 10:00:10 +00:00
Rahul Garg fbc9f7e20a Add hipMemcpy3DAsync (#1320)
* Add hipMemcpy3DAsync

* Fix CI build error

* Move back stream resolution to internal function

* Remove stream redefinition and check
2019-08-16 02:13:16 +00:00
Rahul Garg 569f35a258 Add hipMemcpyParam2DAsync (#1296)
* Add hipMemcpyParam2DAsync

* Add NVCC path changes

* Clean up

* Fix build issue

* Fix else use in both sync and async apis
2019-08-09 11:50:37 +00:00
Jeff Daily f337ae1edb consolidate thread local storage (#915)
* all thread local access now through single struct

* clean up old commented-out code, more use of GET_TLS()

* fewer calls to GET_TLS by passing tls as a function argument

* revert unnecessary change to printf

* fix failing tests due to TLS change

* fix merge conflicts in ihipOccupancyMaxActiveBlocksPerMultiprocessor
2019-08-05 09:51:02 +00:00
Rahul Garg 1c49943ac3 Change hipErrorUnknown to hipErrorInvalidValue 2019-07-31 00:28:30 +05:30
Evgeny Mankov 299fbd4842 [HIP] Fix segfault on uninitialized struct members in hipArrayCreate and hipArray3DCreate 2019-07-12 16:38:26 +03:00
Evgeny Mankov f0832fd968 [HIP][HIPIFY] Split HIP_ARRAY_DESCRIPTOR struct to HIP_ARRAY_DESCRIPTOR and HIP_ARRAY3D_DESCRIPTOR
[Reason] To be compatible with CUDA [#1133]

Update HIP code, hipify-clang, tests and docs

[TODO] Add support of the corresponding functions on nvcc fallback path
2019-07-11 14:58:16 +03:00
Jatin Chaudhary fcb0a3d4e2 Adding bounds check before hipMemset (#1190)
* Adding bounds check in ihipMemset

* Adding ihipMemPtrGetInfo to hipMemPtrGetInfo
2019-07-08 11:00:38 +00:00
Anusha Godavarthy Surya 4989452413 Added missing NULL checks and corrected API return values as per validation 2019-06-27 00:19:05 +05:30
Evgeny Mankov 9cb3e9aa5e [HIP][HIPIFY] Make hipMemcpyParam2D coherent with cuMemcpy2D
+ Makes hip_Memcpy2D struct compatible with CUDA_MEMCPY2D struct
+ Add hipMemcpyParam2D support in nvcc fallback path
+ Update hipify-clang, tests and docs accordingly
2019-05-22 18:31:39 +03:00
Rahul Garg e1f3dc0c80 Add fine grained host memory lock support (#1095)
* Add fine grained host memory lock support

* Fix default flag check
2019-05-13 11:48:26 +05:30
Rahul Garg 94769fc8dd Add hipMallocManaged default functional support (#1036)
* Add hipMallocManaged default functional support

* Fix build error

* Add dtest
2019-04-24 16:50:03 +05:30
Jeff Daily cf8fb43e6b In hipFree, synchronize owner of memory (#1018)
* In hipFree, if memory is associated with a device, synchronize that device's streams.

This changes the behavior from synchronizing the currently set TLS device.

* All devices sync in hipFree for _appId=-1 case.

* Revert "All devices sync in hipFree for _appId=-1 case."

This reverts commit 1efb34d6a8426661e45bc5f763422a1147aeac10.

* add HIP_SYNC_FREE env var
2019-04-16 08:35:55 +05:30
Rahul Garg 50d623981e Handle D2D in memcpy2D 2019-03-28 02:21:45 +05:30
Rahul Garg 9b38380c03 Let hipHostMalloc always share/map pinned host ptr 2019-03-26 10:19:13 +05:30
Rahul Garg ad11972f47 Avoid double mapping of devices to hostMalloc buffer 2019-03-25 23:07:05 +05:30
Maneesh Gupta c20d233585 Merge pull request #970 from mangupta/swdev-172995
hipExtMallocWithFlags implementation
2019-03-25 07:46:53 +00:00
Maneesh Gupta 45255ab492 hipExtMallocWithFlags needs hcc workweek 19115 or higher 2019-03-25 11:41:20 +05:30
Maneesh Gupta e44de376f7 hipExtMallocWithFlags implementation
Change-Id: Iee9e119796472200b2933d5e23be60813f33bc75
2019-03-19 11:59:22 +05:30
Rahul Garg af72cde0a1 Add 2D fallback to use copy kernel 2019-03-14 13:03:06 +05:30
Alex Voicu ed48847237 dlopen() fixes (#929)
* Initial attempt to switch over to internally linked state.

* Add missing CMake update.

* hipLaunchKernelGGLImpl must be inline as well. Ensure internal linkage.

* Ensure global retrieval uses internally linked state.

* Hide HC in the implementation. Minimise ADL woes.

* Strange software exists, and must be catered to.

* Use a less spammy mechanism for ensuring internal linkage / non-export.

* Remove leftover internal detail.
2019-03-06 17:31:44 +05:30
Wen-Heng (Jack) Chung 8b7baa0bd9 Address code review comments to use hipDeviceptr_t 2019-03-05 05:51:05 +00:00
Wen-Heng (Jack) Chung 392271f4db Add hipMemsetD32 and hipMemsetD32Async
Add 2 extra memset functions which fill memory with integer-typed data

Also change the parameters of ihipMemset to better explain the semantics
2019-03-04 17:00:33 +00:00
Wilkin Chau 99540373cf Fix hipMemset3D test
Calculate the allocated size based on the width, height and depth.
2019-02-28 22:42:46 +00:00
Evgeny 47625cb8fd fixing HSA_INIT_API cid args 2019-01-16 23:45:44 -06:00
Maneesh Gupta a778f7cdf7 Merge pull request #797 from gargrahul/fixhipPointerGetAttributes
Fixed hipPointerGetAttributes for hostmalloced ptr
2018-12-12 10:16:07 +05:30
Maneesh Gupta 6ce99b066c Merge pull request #608 from gargrahul/add_pinned_2d_sdma_copy
Added support for pinned 2D SDMA copy
2018-12-12 07:44:16 +05:30
Rahul Garg 77fd517e09 Fixed hipPointerGetAttributes for hostmalloced ptr 2018-12-08 01:42:08 +05:30
Maneesh Gupta 99bb89b756 Merge pull request #760 from eshcherb/roctracer-hip-frontend-181113
Roctracer hip frontend 181113
2018-11-23 11:08:25 +05:30
Maneesh Gupta 40d3184dd1 Merge pull request #748 from mkuron/getsymboladdress
Implement hipGetSymbolAddress and hipGetSymbolSize
2018-11-21 10:32:01 +05:30
Michael Kuron e9b88711e2 Merge branch 'master' into getsymboladdress 2018-11-20 12:03:22 +01:00
Rahul Garg aae87e21d2 Fix hipHostRegister 2018-11-17 05:38:35 +05:30
Evgeny e362688adf renaming HIP_INIT_CB_API to HIP_INIT_API 2018-11-13 15:33:26 +00:00
Evgeny 084a68be63 adding activity prof layer 2018-11-13 15:33:26 +00:00
Rahul Garg ac32566d9b Fixed hipMemcpyToSymbol not working on GPUs other than device 0 (SWDEV-166881) 2018-11-13 00:49:20 +05:30
Michael Kuron 357dc8be11 Use correct trace macro in hipGetSymbolAddress/hipGetSymbolSize 2018-11-06 20:46:30 +01:00
Michael Kuron 4da2d92281 Introduce ihipModuleGetGlobal 2018-11-06 09:54:34 +01:00
Michael Kuron 0b6f5791f8 Implement hipGetSymbolAddress and hipGetSymbolSize 2018-11-04 10:39:34 +01:00
Siu Chi Chan cdd0109e70 Move the global arrays for hip malloc/free
from a header into a source file so that
there is only one copy in an executable,
which avoids wasting static memory on the host

Change-Id: Id5b62766f77809c8d7b47892cb7149c490dcbdb9
2018-11-01 16:20:35 -04:00
Anton Gorenko 6e6297f3cd Fix allocation size of arrays with multiple and/or non-32-bit channels
hipMallocArray and hipMalloc3DArray must use sum of bits
of all components.
2018-10-29 18:12:00 +06:00
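The sizing rule described in 6e6297f3cd (element size is the sum of the bits of all components) can be illustrated with a minimal sketch. `ChannelDesc`, `element_bytes` and `array_bytes` are hypothetical names introduced for illustration only, not the HIP types or functions, and the sketch assumes each component width is a multiple of 8 bits.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelDesc:
    """Hypothetical stand-in: bit width of each channel component (0 = unused)."""
    x: int
    y: int = 0
    z: int = 0
    w: int = 0


def element_bytes(desc: ChannelDesc) -> int:
    """Bytes per element: sum of the bits of all components, divided by 8."""
    total_bits = desc.x + desc.y + desc.z + desc.w
    return total_bits // 8


def array_bytes(desc: ChannelDesc, width: int, height: int = 1, depth: int = 1) -> int:
    """Allocation size for a width x height x depth array of such elements."""
    # Assumption: a zero extent means "dimension unused", so it counts as 1.
    return element_bytes(desc) * width * max(height, 1) * max(depth, 1)
```

Under these assumptions, a four-component descriptor with 32 bits per component gives 16 bytes per element, whereas sizing from a single 32-bit component would give only 4.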
Rahul Garg bd27310127 Return hipSuccess when sizeBytes=0 in hipMemset 2018-09-26 12:47:36 +05:30
Rahul Garg 5eb11b58f3 Added support for pinned 2D SDMA copy 2018-07-31 14:05:35 +05:30
Sarunya Pumma 84aadb9274 Remove device mapping from shareWithAll memory
When shareWithAll memory (e.g., host memory) is allocated, set appId
in hc::AmPointerInfo to -1 to indicate that this memory is not mapped
to any device.  Peer checking in ihipStream_t::canSeeMemory is not
necessary if memory is shared with all devices.  Thus, it is skipped.

Previously, host memory was always mapped to device 0 and HIP always
performed peer checking for all kinds of hipMemcpy.  Because peer
checking requires context locking, hipMemcpy from/to host memory always
grabbed device 0's context lock.  If another thread held that lock
(e.g., during hipDeviceSynchronize on device 0), hipMemcpy had to wait
for it before it could perform the copy, which could significantly
degrade performance.

Signed-off-by: Sarunya Pumma <sarunya.pumma@amd.com>
2018-07-28 23:15:16 -07:00
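The idea in 84aadb9274 (skip peer checking, and therefore context locking, when memory is shared with all devices) can be sketched roughly as follows. This is a simplified model, not the HIP implementation: `DeviceContext`, `can_see_memory` and the `lock_acquisitions` counter are hypothetical, and only the `-1` marker value comes from the commit message.

```python
import threading

SHARED_WITH_ALL = -1  # appId value the commit message uses for "not mapped to any device"


class DeviceContext:
    """Hypothetical per-device context guarded by a lock."""

    def __init__(self, device_id, peers=()):
        self.device_id = device_id
        self.peers = set(peers)
        self.lock = threading.Lock()
        self.lock_acquisitions = 0  # counted only to make the effect observable


def can_see_memory(contexts, device_id, app_id):
    """Return True if device_id may access memory whose owner is app_id."""
    if app_id == SHARED_WITH_ALL:
        # Shared with every device: no peer check, so no context lock is taken.
        return True
    owner = contexts[app_id]
    with owner.lock:  # the peer check needs the owning device's context lock
        owner.lock_acquisitions += 1
        return device_id == app_id or device_id in owner.peers
```

In this model, a copy involving shared host memory never waits on device 0's lock, while a copy involving device-owned memory still does.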
Rahul Garg f554e48db3 Revert "Use memcpy kernel for all pinned memory cases in hipMemcpy2DAsync" 2018-07-02 14:32:11 +05:30
Rahul Garg 007e2a4b5f TEMP- fix memcpy2dAsync for trsm issue 2018-06-15 16:08:29 +05:30
Rahul Garg 2ae3be9773 Fix stream resolution in memcpy2dasync 2018-06-14 11:58:56 +05:30
Rahul Garg 68554e155b Fix retrieved locked ptr offset 2018-06-13 23:10:05 +05:30
Maneesh Gupta 53037472ff Merge pull request #497 from gargrahul/fix_memcpy3d_fastpath
Fix hipMemcpy3D for fast path
2018-06-06 14:44:02 +05:30