rocm-systems

Author	SHA1	Message	Date
Rahul Garg	d433f6fb58	Revert "Using HSA API for hipMemsetAsync (#1346)" (#1381) This reverts commit `9bbd09b04f`.	2019-09-03 05:13:46 +00:00
Rahul Garg	a786728939	Fix memcpy with IPC slowness (#1321) * Fix memcpy with IPC slowness * Make early erroneous returns * Real Clean up * Real Clean up++	2019-08-23 09:19:18 +00:00
Jatin Chaudhary	9bbd09b04f	Using HSA API for hipMemsetAsync (#1346)	2019-08-21 10:00:10 +00:00
Rahul Garg	fbc9f7e20a	Add hipMemcpy3DAsync (#1320) * Add hipMemcpy3DAsync * Fix CI build error * Move back stream resolution to internal function * Remove stream redefinition and check	2019-08-16 02:13:16 +00:00
Rahul Garg	569f35a258	Add hipMemcpyParam2DAsync (#1296) * Add hipMemcpyParam2DAsync * Add NVCC path changes * Clean up * Fix build issue * Fix else use in both sync and async apis	2019-08-09 11:50:37 +00:00
Jeff Daily	f337ae1edb	consolidate thread local storage (#915) * all thread local access now through single struct * clean up old commented-out code, more use of GET_TLS() * fewer calls to GET_TLS by passing tls as a function argument * revert unnecessary change to printf * fix failing tests due to TLS change * fix merge conflicts in ihipOccupancyMaxActiveBlocksPerMultiprocessor	2019-08-05 09:51:02 +00:00
Rahul Garg	1c49943ac3	Change hipErrorUnknown to hipErrorInvalidValue	2019-07-31 00:28:30 +05:30
Evgeny Mankov	299fbd4842	[HIP] Fix segfault on uninitialized struct members in hipArrayCreate and hipArray3DCreate	2019-07-12 16:38:26 +03:00
Evgeny Mankov	f0832fd968	[HIP][HIPIFY] Split HIP_ARRAY_DESCRIPTOR struct to HIP_ARRAY_DESCRIPTOR and HIP_ARRAY3D_DESCRIPTOR [Reason] To be compatible with CUDA [#1133] Update HIP code, hipify-clang, tests and docs [TODO] Add support of the corresponding functions on nvcc fallback path	2019-07-11 14:58:16 +03:00
Jatin Chaudhary	fcb0a3d4e2	Adding bounds check before hipMemset (#1190) * Adding bounds check in ihipMemset * Adding ihipMemPtrGetInfo to hipMemPtrGetInfo	2019-07-08 11:00:38 +00:00
Anusha Godavarthy Surya	4989452413	Added missing NULL checks and corrected API return values as per validation	2019-06-27 00:19:05 +05:30
Evgeny Mankov	9cb3e9aa5e	[HIP][HIPIFY] Make hipMemcpyParam2D coherent with cuMemcpy2D + Makes hip_Memcpy2D struct compatible with CUDA_MEMCPY2D struct + Add hipMemcpyParam2D support in nvcc fallback path + Update hipify-clang, tests and docs accordingly	2019-05-22 18:31:39 +03:00
Rahul Garg	e1f3dc0c80	Add fine grained host memory lock support (#1095) * Add fine grained host memory lock support * Fix default flag check	2019-05-13 11:48:26 +05:30
Rahul Garg	94769fc8dd	Add hipMallocManaged default functional support (#1036) * Add hipMallocManaged default functional support * Fix build error * Add dtest	2019-04-24 16:50:03 +05:30
Jeff Daily	cf8fb43e6b	In hipFree, synchronize owner of memory (#1018) * In hipFree, if memory is associated with a device, synchronize that device's streams. This changes the behavior from synchronizing the currently set TLS device. * All devices sync in hipFree for _appId=-1 case. * Revert "All devices sync in hipFree for _appId=-1 case." This reverts commit 1efb34d6a8426661e45bc5f763422a1147aeac10. * add HIP_SYNC_FREE env var	2019-04-16 08:35:55 +05:30
Rahul Garg	50d623981e	Handle D2D in memcpy2D	2019-03-28 02:21:45 +05:30
Rahul Garg	9b38380c03	Let hipHostMalloc always share/map pinned host ptr	2019-03-26 10:19:13 +05:30
Rahul Garg	ad11972f47	Avoid double mapping of devices to hostMalloc buffer	2019-03-25 23:07:05 +05:30
Maneesh Gupta	c20d233585	Merge pull request #970 from mangupta/swdev-172995 hipExtMallocWithFlags implementation	2019-03-25 07:46:53 +00:00
Maneesh Gupta	45255ab492	hipExtMallocWithFlags needs hcc workweek 19115 or higher	2019-03-25 11:41:20 +05:30
Maneesh Gupta	e44de376f7	hipExtMallocWithFlags implementation Change-Id: Iee9e119796472200b2933d5e23be60813f33bc75	2019-03-19 11:59:22 +05:30
Rahul Garg	af72cde0a1	Add 2D fallback to use copy kernel	2019-03-14 13:03:06 +05:30
Alex Voicu	ed48847237	dlopen() fixes (#929) * Initial attempt to switch over to internally linked state. * Add missing CMake update. * hipLaunchKernelGGLImpl must be inline as well. Ensure internal linkage. * Ensure global retrieval uses internally linked state. * Hide HC in the implementation. Minimise ADL woes. * Strange software exists, and must be catered to. * Use a less spammy mechanism for ensuring internal linkage / non-export. * Remove leftover internal detail.	2019-03-06 17:31:44 +05:30
Wen-Heng (Jack) Chung	8b7baa0bd9	Address code review comments to use hipDeviceptr_t	2019-03-05 05:51:05 +00:00
Wen-Heng (Jack) Chung	392271f4db	Add hipMemsetD32 and hipMemsetD32Async Add 2 extra memset functions which fills memory with integer-typed data Also change the parameters of ihipMemset to better explain the semantic	2019-03-04 17:00:33 +00:00
Wilkin Chau	99540373cf	Fix hipMemset3D test Calculate the allocated size based on the width, height and depth.	2019-02-28 22:42:46 +00:00
Evgeny	47625cb8fd	fixing HSA_INIT_API cid args	2019-01-16 23:45:44 -06:00
Maneesh Gupta	a778f7cdf7	Merge pull request #797 from gargrahul/fixhipPointerGetAttributes Fixed hipPointerGetAttributes for hostmalloced ptr	2018-12-12 10:16:07 +05:30
Maneesh Gupta	6ce99b066c	Merge pull request #608 from gargrahul/add_pinned_2d_sdma_copy Added support for pinned 2D SDMA copy	2018-12-12 07:44:16 +05:30
Rahul Garg	77fd517e09	Fixed hipPointerGetAttributes for hostmalloced ptr	2018-12-08 01:42:08 +05:30
Maneesh Gupta	99bb89b756	Merge pull request #760 from eshcherb/roctracer-hip-frontend-181113 Roctracer hip frontend 181113	2018-11-23 11:08:25 +05:30
Maneesh Gupta	40d3184dd1	Merge pull request #748 from mkuron/getsymboladdress Implement hipGetSymbolAddress and hipGetSymbolSize	2018-11-21 10:32:01 +05:30
Michael Kuron	e9b88711e2	Merge branch 'master' into getsymboladdress	2018-11-20 12:03:22 +01:00
Rahul Garg	aae87e21d2	Fix hipHostRegister	2018-11-17 05:38:35 +05:30
Evgeny	e362688adf	renaming HIP_INIT_CB_API to HIP_INIT_API	2018-11-13 15:33:26 +00:00
Evgeny	084a68be63	adding activity prof layer	2018-11-13 15:33:26 +00:00
Rahul Garg	ac32566d9b	Fixed hipMemcpyToSymbol doesn't work on GPU other than device 0 SWDEV-166881	2018-11-13 00:49:20 +05:30
Michael Kuron	357dc8be11	Use correct trace macro in hipGetSymbolAddress/hipGetSymbolSize	2018-11-06 20:46:30 +01:00
Michael Kuron	4da2d92281	Introduce ihipModuleGetGlobal	2018-11-06 09:54:34 +01:00
Michael Kuron	0b6f5791f8	Implement hipGetSymbolAddress and hipGetSymbolSize	2018-11-04 10:39:34 +01:00
Siu Chi Chan	cdd0109e70	Move the global arrays for hip malloc/free from a header into a source file such that there's only a unique copy in an executable and prevent wasting static memory on the host Change-Id: Id5b62766f77809c8d7b47892cb7149c490dcbdb9	2018-11-01 16:20:35 -04:00
Anton Gorenko	6e6297f3cd	Fix allocation size of arrays with multiple and/or non-32-bit channels hipMallocArray and hipMalloc3DArray must use sum of bits of all components.	2018-10-29 18:12:00 +06:00
Rahul Garg	bd27310127	Return hipSuccess when sizeBytes=0 in hipMemset	2018-09-26 12:47:36 +05:30
Rahul Garg	5eb11b58f3	Added support for pinned 2D SDMA copy	2018-07-31 14:05:35 +05:30
Sarunya Pumma	84aadb9274	Remove device mapping from shareWithAll memory When shareWithAll memory (e.g., host memory) is allocated, set appId in hc::AmPointerInfo to -1 to indicate that this memory is not mapped to any device. Peer checking in ihipStream_t::canSeeMemory is not necessary if memory is shared with all devices. Thus, it is skipped. Note that earlier host memory is always mapped to device 0 and HIP always performs peer checking for all kinds of hipMemcpy. Since the peer checking process requires context locking, hipMemcpy from/to host memory always grabs device 0's context lock. Therefore, if there is another thread holding the context lock of device 0 (e.g., hipDeviceSynchronize on device 0), hipMemcpy will have to wait for the lock until it can actually perform memcpy. This can significantly deteriorate execution performance. Signed-off-by: Sarunya Pumma <sarunya.pumma@amd.com>	2018-07-28 23:15:16 -07:00
Rahul Garg	f554e48db3	Revert "Use memcpy kernel for all pinned memory cases in hipMemcpy2DAsync"	2018-07-02 14:32:11 +05:30
Rahul Garg	007e2a4b5f	TEMP- fix memcpy2dAsync for trsm issue	2018-06-15 16:08:29 +05:30
Rahul Garg	2ae3be9773	Fix stream resolution in memcpy2dasync	2018-06-14 11:58:56 +05:30
Rahul Garg	68554e155b	Fix retrieved locked ptr offset	2018-06-13 23:10:05 +05:30
Maneesh Gupta	53037472ff	Merge pull request #497 from gargrahul/fix_memcpy3d_fastpath Fix hipMemcpy3D for fast path	2018-06-06 14:44:02 +05:30

210 Commits