Graf commitů

116 Commity

Autor SHA1 Zpráva Datum
Alex Voicu 2cacda91bb Correctly deal with functions from shared objects, wherein the program visible VA == so_base_va + st_value(function_symbol). Remove quaint usage of pfe for hipMemset (which is actually fill_n). 2017-11-01 22:33:13 +00:00
Alex Voicu c2482d1255 This switches HIP from its currently convoluted macro + pfe based dispatch mechanism to a more natural one partially based on the existing module API. The basic idea is that HCC will always correctly emit __global__ functions: as empty-bodied stubs, on host, and as kernels, on device. It then becomes trivial to obtain the mangled name on host, at dispatch, from the function's address, and then to use the mangled name to retrieve the kernel. This should address all problems stemming from serialisation, dubious mismatches due to the manufactured functor, macro-isms et al. It also immediately enables support for generalised globals as a consequence of that being available in the module API. Finally, it will make debug much easier, since the actual names of the __global__ functions will automatically be used in traces etc. One detail is that due to how dispatch works now (hipLaunchKernel and hipLaunchKernelGGL are themselves variadic function templates which deduce the function type of the callee), in certain cases it may be necesssary to insert explicit casts to ensure that the variadic argument list selects a viable overload - this can be observed in some unit tests. Eventually we may be able to remove this limitation, but for now it does not appear terribly onerous. The code is not extremely HIPpie, nor is it fully optimised, but rather is intended as a starting point for the HIP team to make its own. 2017-11-01 15:09:59 +00:00
Ben Sander 3ed5f42dad Merge pull request #179 from gargrahul/fix_hipmallocarray
Fixed hipMallocArray for 1D cases
2017-09-14 12:59:33 -05:00
Ben Sander cea80cd8b3 Add HIP_INIT_ALLOC to init allocated memory. 2017-09-13 23:31:48 +00:00
Rahul Garg 2a915b2790 Fixed hipMallocArray for 1D cases 2017-09-12 21:52:11 +05:30
Rahul Garg 38c9febe21 Null check on input pointer arguments 2017-08-25 08:46:34 +05:30
Maneesh Gupta e40047f2bf [texture] guard new HCC APIs under workweek
Change-Id: I4f60a64fb0b0496ca1eb01ffe6ddda121c25d976
2017-08-15 15:51:38 +05:30
Weixing Zhang 4264a4ce56 [HIP Texture] The GPU virtual address for texture memory needs to be
aligned.

In hcc_am, a bigger buffer will be allocated for alignment purpose
and _unalignedDevicePointer is added in struct AmPointerInfo for
original allocated address.
2017-08-08 11:18:00 -04:00
Rahul Garg 0053e4db03 fix hipMemcpy2DAsync 2017-07-29 06:50:56 +05:30
Maneesh Gupta 526fcb1223 Merge pull request #116 from bensander/enable_coh_host
Make host memory allocations coherent by default.
2017-07-27 10:45:32 +05:30
Ben Sander 8b8e97ff28 Make host memory allocations coherent by default.
Associated change is to optimize event recording so it uses
agent-scope releaes (since it was only using system-scope release
to support non-coherent host mem).

Flags and environment variables exist to obtain previous behavior
if desired.  Options are documented in new performance guide.
2017-07-26 19:20:34 -05:00
Weixing Zhang e671cec367 HIP Texture Support 2017-07-17 15:16:12 -04:00
Sun, Peng 1df08626c8 Add support of HIP_HIDDEN_FREE_MEM, to deduct the returned available
memory from hipMemGetInfo API, measured in MB.

Change-Id: I7a8260c12e032e04e26611db4c38c893a29f2653
2017-06-26 15:29:38 -05:00
Rahul Garg ba51d7f676 Validity check of input arguments in Ipc Mem APIs
Change-Id: Ia48e949d19f354f10c7e44cc2457fd4154bf6d76
2017-06-14 15:18:57 +05:30
Ben Sander 99e9c7cca5 Use amHostCoherentFlag. Requires new HCC version. 2017-06-07 09:06:40 -05:00
Ben Sander 75f691ec2f Add hipHostMallocCoherent, hipHostMallocNonCoherent
Provide per-allocation control over coherent/non-coherent mem.
These overrid the default HIP_COHERENT_HOST_ALLOC setting.
2017-05-24 00:48:10 -05:00
Ben Sander d0ef9d8462 Remove HIP_MAX_QUEUES (replaced with HCC_MAX_QUEUES) 2017-05-23 23:48:01 -05:00
Ben Sander 7cfe07cff4 Fix trace category for hipHostMalloc 2017-05-23 23:15:45 -05:00
Ben Sander 46030bb2d2 Return precise address for hipHostGetDevicePointer. 2017-05-17 07:36:06 -05:00
Ben Sander 27877f8854 Add initial HIP_SYNC_NULL_STREAM=0 mode.
This eliminates host-synchronization for null stream.  Instead, the
null-stream uses GPU-side events to wait for other streams.
Default is OFF pending additional testing.

Add enhanced null-stream test.

Also refine HIP_TRACE_API.
2017-05-16 19:04:25 -05:00
Ben Sander 0edab14139 Add HIP_TRACE_API=4. Only display memory allocation/free apis. 2017-05-16 19:04:25 -05:00
Ben Sander c3ccaa01e5 hipHostMalloc allocation are mapped to all devices by default.
Support hipHostMallocPortable flag.
Default flags are hipHostMallocPortable | hipHostMallocMapped.

Also:
-refactor tests to move addCount and addCountReverse into HipTest
namespace.
-test multi-GPU host memory.
2017-05-10 17:34:36 -05:00
Ben Sander 687809104b Fix some typos, add additional guidance for -BSymbolic 2017-05-05 17:29:04 -05:00
Rahul Garg ec27c695c4 Added support for hipMemcpy2DAsync in HIP/HCC
Change-Id: Ia4a8306f2dc1e33a81a7195ec29aef652fcccc4b
2017-05-03 22:29:12 +05:30
Ben Sander 59df70662a Fix hipMalloc to return error code if allocation fails. 2017-04-24 22:30:54 -05:00
Aditya Atluri 59ab3659ee fixed build issues with hipPointerGetAttributes
Change-Id: I3f5fbc05bdaef720884ba949075928752a070377
2017-04-24 15:31:07 -05:00
Aditya Atluri 72bcfa438b changed arguments for hipPointerGetAttributes
Change-Id: Ia7a7c4722c1f7d0a23f0e5cc3dd6dea6c01c1fd8
2017-04-24 15:24:16 -05:00
Rahul Garg fc61b793fe mgpu IPC support fix
Change-Id: I12e4b2fd189c3658efd3b07defa18ece3853b0eb
2017-04-04 15:51:10 +05:30
Aditya Atluri b9091ba818 added new api hipHccModuleLaunchKernel
1. hipHccModuleLaunchKernel is same as hipModuleLaunchKernel with OpenCL workitem model
2. Added copy right
3. Fixed header naming

Change-Id: I6a7c35a3566e2f8d3f5056613e34193775d4b236
2017-03-31 12:11:34 -05:00
Rahul Garg 1d18006ab4 Fix for MemcpyFromSymbol on HIP/NVCC path
Change-Id: Ice38307f72870ae468cbf0861e104f0fa46dfd56
2017-03-27 00:35:10 +05:30
Rahul Garg f649c225a7 Fix for hipMemcpyFromSymbolAsync
Change-Id: I449c669c8f0ef041deaf0a1bc812a71b2f0cc5a6
2017-03-24 10:30:33 +05:30
Rahul Garg 4395582810 Fix for hipMemcpyFromSymbol (sync)
Change-Id: I66afec5443ce904a63ced1fafece5144ca59393e
2017-03-21 23:48:04 +05:30
Rahul Garg 1aba3c4375 Added hipMemsetD8
Change-Id: I6a230a036c9c46c72a77d5f93c16ce8a00c3f837
2017-03-14 22:11:34 +05:30
Ben Sander 524e007db5 Refactor registered memory calls. 2017-03-11 09:18:27 -06:00
Ben Sander 0a554f4dc1 Update hipHostRegister debug and pointerTracker debug and notes 2017-03-11 09:18:27 -06:00
Ben Sander 94c85fd4fc Fix copying of registered memory.
Set device properly so copying can recover context.
Enhance test to catch this case.
2017-03-11 09:18:27 -06:00
Rahul Garg 32d8a58f18 IPC supported using ROCR APIs
Change-Id: I0a353b1240098f4b20fa266a871f5f5826290af9
2017-03-10 23:45:28 +05:30
Aditya Atluri 1546732604 Added new API, hipMemPtrGetInfo
1. This API returns memory allocation size of pointer
2. Added test to check its functionality

Change-Id: I87976d817b5a6ca5530336c09e7cb0420601cb2c
2017-03-07 13:46:29 -06:00
Aditya Atluri 2e245ae58c Added initial support for hipMemcpyFromSymbol. But not working!
Change-Id: I48d8c7de4ec9f85c6c942be995fb488a3931f5d7
2017-02-23 11:29:06 -06:00
Aditya Atluri 639fd4dd5e added runtime api hipMemcpyFromSymbolAsync
Change-Id: Ibaf925faf0ba464dd0ed6c5ea74c224c2ce38889
2017-02-22 19:16:35 -06:00
Aditya Atluri 2790e9a448 fixed symbol memcpy issue
Change-Id: I89d7401be51d194bcbf771020ba66e3d3b6a18f8
2017-02-01 17:54:59 -06:00
Ben Sander 96eac67929 Add debug tips to docs 2017-01-23 22:34:41 -06:00
Ben Sander 927ac3d81c Add HIP_SYNC_HOST_ALLOC, HipReadEnv 2017-01-19 23:55:24 -06:00
Aditya Atluri 73fcce26f9 changed copyright year from 2016 to 2017 in src directory
Change-Id: Idb97db509b2b4b1656b2df7a14a62ade38c9d574
2017-01-11 18:05:41 -06:00
Ben Sander a3e0012567 Add HIP_MAX_QUEUES feature.
Includes some tricky manipulation of the locks for contexts and streams.
issue is that stealing a stream requires we lock the context to
walk the streams to find a victim.  To avoid deadlock, we can't
have a stream locked when we lock the context.  This implementation
releases the stream lock, then acquires the context and selects the
victim.
A more stable implemenation might be to copy the stream list
from a context so that a lock is not required to walk all streams.
Smart shared_ptr could be used to prevent the streams from being
deallocated during the walk.
2017-01-09 21:02:56 -06:00
Ben Sander 93fbc9cf7b First pass at virtualized queue support.
Also updated stream debug messages to consistently use trace_helper.
2017-01-09 21:02:53 -06:00
Rahul Garg 5fb09879c7 Added state for hipDevice.
Change-Id: Idbc3c04cd054a01b634856a1e0a23ff172e991aa
2017-01-09 23:54:01 +05:30
Ben Sander c325c988b1 Support size_t in memset kernel.
Add disable for HSA_AMD_AGENT_INFO_MAX_WAVES_PER_CU
Remove one copy of completion_future in memset.
2016-12-22 12:25:09 -06:00
Ben Sander b30e4b4781 Add USE_IPC to disable use of IPC APIs. Set to 0. 2016-12-13 15:07:04 -06:00
pensun a53d35fd6c HIP IPC implementation on ROCr IPC APIs
Change-Id: I1ca9d520f5d0b1b56694211471b81eb7c6c23d16
2016-12-07 15:38:36 -06:00