Граф коммитов

139 Коммитов

Автор SHA1 Сообщение Дата
Rahul Garg 294bf50f68 Fix hipMemset stream resolution 2018-04-11 19:01:53 +05:30
Rahul Garg 412a35be20 hipMemset refactoring 2018-04-11 15:58:48 +05:30
Maneesh Gupta 03eca1c57e hipMemcpyAsync returns success when trying to copy 0 bytes
Change-Id: I4c0ee7ccc7563e2df657b50356cdd7fec9a1ef15
2018-04-09 12:39:44 +05:30
Maneesh Gupta 1ba06f63c4 Apply .clangformat to all repo source files
Change-Id: I7e79c6058f0303f9a98911e3b7dd2e8596079344
2018-03-12 11:29:03 +05:30
Alex Voicu dc7560ef22 Change directory name to match HIP lowercase style. 2018-02-22 13:15:10 +00:00
Maneesh Gupta 4b8ae78891 Merge pull request #321 from gargrahul/hipMemcpyArray_Functions
Added support for hipMemcpy Array functions-
2018-02-12 10:36:38 +05:30
Rahul Garg 24ab820a11 Fixed host allocated globals address lookup for host usage
Fixed texture driver APIs failure
2018-01-30 18:06:31 +05:30
Rahul Garg 487a430b5a Added support for -
- hipMemcpyFromArray
- hipMemcpyAtoH
- hipMemcpyHtoA
2018-01-16 11:44:19 +05:30
Rahul Garg 115c7f2b79 Added support for
- 3D texture driver APIs
- hipMalloc3D
- hipMemcpy3D for destination other than array
2017-12-05 14:11:13 +05:30
Ben Sander 9bba97fdcc Fix some cppcheck style issues. 2017-12-01 20:45:34 +00:00
Ben Sander 4313686d6e Fix warning from default cppchek. 2017-12-01 20:45:33 +00:00
Alex Voicu 7c0b9a005b Fix legacy mode detection of the address of an agent allocated variable. In this mode, there exist two executables per each code object, one created by HCC and one created by HIP. Since we dispatch through HCC in legacy mode, we should obtain the address for an agent allocated variable from the latter's executable. Also add two omitted validity checks, whose absence could lead to segfaults when the current process had no .kernel section and / or when an invalid or empty blob was extracted from the latter. 2017-11-30 03:29:04 +00:00
Alex Voicu 32e11e7dc6 Revert "Revert adoption of CUDA indexing in general - this can only work with later versions of the compiler, just like module based dispatch, and thus must be guarded against usage in earlier (e.g. 1.6) versions."
This reverts commit d2fd1f5
2017-11-29 21:49:10 +00:00
Alex Voicu fbaf729f88 Revert "Revert adoption of CUDA indexing in general - this can only work with later versions of the compiler, just like module based dispatch, and thus must be guarded against usage in earlier (e.g. 1.6) versions."
This reverts commit d2fd1f5
2017-11-29 21:36:29 +00:00
Alex Voicu 6e4ca3fbb4 Change memset kernel to use memcpy instead of placement new. Simplify indexers. 2017-11-28 19:45:47 +00:00
Alex Voicu dc67ca3feb Merge remote-tracking branch 'origin/master' into feature_use_module_based_dispatch_instead_of_pfe
# Conflicts:
#	src/hip_module.cpp
2017-11-28 17:29:11 +00:00
Rahul Garg 56862b1c35 Fixed review comments 2017-11-21 21:19:06 +05:30
Rahul Garg 9866fa250d Changed function hipMemcpy_2D to hipMemcpyParam2D 2017-11-21 12:36:24 +05:30
Alex Voicu 9d088d2283 Refactor the __device__ versions of memset and memcpy to be less awkward i.e. not return nullptr as opposed to the destination pointer (it can only be assumed it was done for maximum confusion) and actually unroll as they claim to. Change all of the {to, from}Symbol functions to use hipModuleGetGlobal, as opposed to hc::accelerator::get_symbol_address which is no longer valid with module based dispatch. 2017-11-21 02:40:34 +00:00
Rahul Garg ef09c4918d Texture driver APIs support 2017-11-09 22:10:55 +05:30
Alex Voicu d8e323d4b5 Clean up trailing whitespace so as to reduce noise in #246. 2017-11-08 00:08:55 +00:00
Alex Voicu bb1176001f Merge remote-tracking branch 'origin/master' into feature_use_module_based_dispatch_instead_of_pfe
# Conflicts:
#	src/hip_module.cpp
2017-11-03 10:53:39 +00:00
Alex Voicu 2cacda91bb Correctly deal with functions from shared objects, wherein the program visible VA == so_base_va + st_value(function_symbol). Remove quaint usage of pfe for hipMemset (which is actually fill_n). 2017-11-01 22:33:13 +00:00
Alex Voicu c2482d1255 This switches HIP from its currently convoluted macro + pfe based dispatch mechanism to a more natural one partially based on the existing module API. The basic idea is that HCC will always correctly emit __global__ functions: as empty-bodied stubs, on host, and as kernels, on device. It then becomes trivial to obtain the mangled name on host, at dispatch, from the function's address, and then to use the mangled name to retrieve the kernel. This should address all problems stemming from serialisation, dubious mismatches due to the manufactured functor, macro-isms et al. It also immediately enables support for generalised globals as a consequence of that being available in the module API. Finally, it will make debug much easier, since the actual names of the __global__ functions will automatically be used in traces etc. One detail is that due to how dispatch works now (hipLaunchKernel and hipLaunchKernelGGL are themselves variadic function templates which deduce the function type of the callee), in certain cases it may be necesssary to insert explicit casts to ensure that the variadic argument list selects a viable overload - this can be observed in some unit tests. Eventually we may be able to remove this limitation, but for now it does not appear terribly onerous. The code is not extremely HIPpie, nor is it fully optimised, but rather is intended as a starting point for the HIP team to make its own. 2017-11-01 15:09:59 +00:00
Ben Sander 7d30f32332 Fix bug with peer-to-peer combined with context API
- Store context inside the tracker rather than using int deviceID that
  was always mapped to primary context
- IsPeerWatcher now based on device IDs rather than specific peers.
2017-10-26 19:44:22 +00:00
Ben Sander 3ed5f42dad Merge pull request #179 from gargrahul/fix_hipmallocarray
Fixed hipMallocArray for 1D cases
2017-09-14 12:59:33 -05:00
Ben Sander cea80cd8b3 Add HIP_INIT_ALLOC to init allocated memory. 2017-09-13 23:31:48 +00:00
Rahul Garg 2a915b2790 Fixed hipMallocArray for 1D cases 2017-09-12 21:52:11 +05:30
Rahul Garg 38c9febe21 Null check on input pointer arguments 2017-08-25 08:46:34 +05:30
Maneesh Gupta e40047f2bf [texture] guard new HCC APIs under workweek
Change-Id: I4f60a64fb0b0496ca1eb01ffe6ddda121c25d976
2017-08-15 15:51:38 +05:30
Weixing Zhang 4264a4ce56 [HIP Texture] The GPU virtual address for texture memory needs to be
aligned.

In hcc_am, a bigger buffer will be allocated for alignment purpose
and _unalignedDevicePointer is added in struct AmPointerInfo for
original allocated address.
2017-08-08 11:18:00 -04:00
Rahul Garg 0053e4db03 fix hipMemcpy2DAsync 2017-07-29 06:50:56 +05:30
Maneesh Gupta 526fcb1223 Merge pull request #116 from bensander/enable_coh_host
Make host memory allocations coherent by default.
2017-07-27 10:45:32 +05:30
Ben Sander 8b8e97ff28 Make host memory allocations coherent by default.
Associated change is to optimize event recording so it uses
agent-scope releaes (since it was only using system-scope release
to support non-coherent host mem).

Flags and environment variables exist to obtain previous behavior
if desired.  Options are documented in new performance guide.
2017-07-26 19:20:34 -05:00
Weixing Zhang e671cec367 HIP Texture Support 2017-07-17 15:16:12 -04:00
Sun, Peng 1df08626c8 Add support of HIP_HIDDEN_FREE_MEM, to deduct the returned available
memory from hipMemGetInfo API, measured in MB.

Change-Id: I7a8260c12e032e04e26611db4c38c893a29f2653
2017-06-26 15:29:38 -05:00
Rahul Garg ba51d7f676 Validity check of input arguments in Ipc Mem APIs
Change-Id: Ia48e949d19f354f10c7e44cc2457fd4154bf6d76
2017-06-14 15:18:57 +05:30
Ben Sander 99e9c7cca5 Use amHostCoherentFlag. Requires new HCC version. 2017-06-07 09:06:40 -05:00
Ben Sander 75f691ec2f Add hipHostMallocCoherent, hipHostMallocNonCoherent
Provide per-allocation control over coherent/non-coherent mem.
These overrid the default HIP_COHERENT_HOST_ALLOC setting.
2017-05-24 00:48:10 -05:00
Ben Sander d0ef9d8462 Remove HIP_MAX_QUEUES (replaced with HCC_MAX_QUEUES) 2017-05-23 23:48:01 -05:00
Ben Sander 7cfe07cff4 Fix trace category for hipHostMalloc 2017-05-23 23:15:45 -05:00
Ben Sander 46030bb2d2 Return precise address for hipHostGetDevicePointer. 2017-05-17 07:36:06 -05:00
Ben Sander 27877f8854 Add initial HIP_SYNC_NULL_STREAM=0 mode.
This eliminates host-synchronization for null stream.  Instead, the
null-stream uses GPU-side events to wait for other streams.
Default is OFF pending additional testing.

Add enhanced null-stream test.

Also refine HIP_TRACE_API.
2017-05-16 19:04:25 -05:00
Ben Sander 0edab14139 Add HIP_TRACE_API=4. Only display memory allocation/free apis. 2017-05-16 19:04:25 -05:00
Ben Sander c3ccaa01e5 hipHostMalloc allocation are mapped to all devices by default.
Support hipHostMallocPortable flag.
Default flags are hipHostMallocPortable | hipHostMallocMapped.

Also:
-refactor tests to move addCount and addCountReverse into HipTest
namespace.
-test multi-GPU host memory.
2017-05-10 17:34:36 -05:00
Ben Sander 687809104b Fix some typos, add additional guidance for -BSymbolic 2017-05-05 17:29:04 -05:00
Rahul Garg ec27c695c4 Added support for hipMemcpy2DAsync in HIP/HCC
Change-Id: Ia4a8306f2dc1e33a81a7195ec29aef652fcccc4b
2017-05-03 22:29:12 +05:30
Ben Sander 59df70662a Fix hipMalloc to return error code if allocation fails. 2017-04-24 22:30:54 -05:00
Aditya Atluri 59ab3659ee fixed build issues with hipPointerGetAttributes
Change-Id: I3f5fbc05bdaef720884ba949075928752a070377
2017-04-24 15:31:07 -05:00
Aditya Atluri 72bcfa438b changed arguments for hipPointerGetAttributes
Change-Id: Ia7a7c4722c1f7d0a23f0e5cc3dd6dea6c01c1fd8
2017-04-24 15:24:16 -05:00