Revision graph

157 revisions

Author SHA1 Message Date
Rahul Garg 27e0566c3a Use 64x4 grid dims 2018-05-24 23:51:52 +05:30
Rahul Garg aed2653857 Clean up and fix remaining bytes copy 2018-05-24 23:30:27 +05:30
Rahul Garg 53a7c61e9b Fix memcpy2d kernel dims 2018-05-24 17:00:12 +05:30
Rahul Garg 96b4618d26 Correct remaining bytes in copy 2d kernel 2018-05-24 08:27:24 +05:30
Rahul Garg 2c3d1498d4 Optimize memcpy2D kernel use 2018-05-23 14:43:47 +05:30
Maneesh Gupta 7042fe6067 Merge pull request #464 from gargrahul/fix_memcpy2d_pinned_mem_case
Fixed memcpy2D for pinned memory case using 2D kernel
2018-05-22 10:42:28 +05:30
Rahul Garg 40fb44dbe6 Fixed memcpy2D for pinned memory case using 2D kernel 2018-05-21 22:14:45 +05:30
Maneesh Gupta 66d05e6fc3 hipMemcpy returns success if sizeBytes is 0.
Fixes SWDEV-153754 & SWDEV-154178.
2018-05-21 15:38:44 +05:30
Rahul Garg 4f5bdb071c Fix for memcpy2DAsync for pinned host memory case 2018-05-18 21:09:50 +05:30
Maneesh Gupta 1c93e11cdf Merge pull request #433 from gargrahul/add_hipmemset3d
Added hipMemset3D
2018-05-18 14:54:15 +05:30
Rahul Garg 8413fb51e1 Fixed hipMemcpy2D to handle 1D memcpy case 2018-05-16 11:07:10 +05:30
Rahul Garg 78568435da Added hipMemset3D 2018-05-07 10:24:30 +05:30
Lakhan Singh 12d8a47c0c Null checks added for hipmallocpitch and hipmemcpy apis 2018-05-03 09:27:50 +05:30
Rahul Garg 1d76c48e3d Fix texture 3D for HIP/NVCC 2018-05-02 11:56:37 +05:30
Lakhan Singh 74faa61d52 SWDEV-141024 2018-04-20 17:40:00 +05:30
Rahul Garg fcc0866681 Added hipMemset2DAsync support 2018-04-17 18:27:27 +05:30
Rahul Garg 2ee80397f2 Correct missed ihipMemsetCopyDataType change 2018-04-12 10:27:19 +05:30
Rahul Garg 2658963f5b Changed ihipMemsetCopyDataType to ihipMemsetDataType 2018-04-12 09:29:22 +05:30
Rahul Garg 88073a17c1 Fix hipMemset stream resolution 2018-04-11 19:01:53 +05:30
Rahul Garg 40846f6f8e hipMemset refactoring 2018-04-11 15:58:48 +05:30
Maneesh Gupta 445de66560 hipMemcpyAsync returns success when trying to copy 0 bytes
Change-Id: I4c0ee7ccc7563e2df657b50356cdd7fec9a1ef15
2018-04-09 12:39:44 +05:30
Maneesh Gupta 9e47fccc89 Apply .clangformat to all repo source files
Change-Id: I7e79c6058f0303f9a98911e3b7dd2e8596079344
2018-03-12 11:29:03 +05:30
Alex Voicu 696f0595a3 Change directory name to match HIP lowercase style. 2018-02-22 13:15:10 +00:00
Maneesh Gupta 647d1ba310 Merge pull request #321 from gargrahul/hipMemcpyArray_Functions
Added support for hipMemcpy Array functions-
2018-02-12 10:36:38 +05:30
Rahul Garg b8c23f979b Fixed host allocated globals address lookup for host usage
Fixed texture driver APIs failure
2018-01-30 18:06:31 +05:30
Rahul Garg ca5bcb5af4 Added support for -
- hipMemcpyFromArray
- hipMemcpyAtoH
- hipMemcpyHtoA
2018-01-16 11:44:19 +05:30
Rahul Garg 105df94cd0 Added support for
- 3D texture driver APIs
- hipMalloc3D
- hipMemcpy3D for destination other than array
2017-12-05 14:11:13 +05:30
Ben Sander 5933ca7300 Fix some cppcheck style issues. 2017-12-01 20:45:34 +00:00
Ben Sander 743c54a012 Fix warning from default cppchek. 2017-12-01 20:45:33 +00:00
Alex Voicu 33bb425013 Fix legacy mode detection of the address of an agent allocated variable. In this mode, there exist two executables per each code object, one created by HCC and one created by HIP. Since we dispatch through HCC in legacy mode, we should obtain the address for an agent allocated variable from the latter's executable. Also add two omitted validity checks, whose absence could lead to segfaults when the current process had no .kernel section and / or when an invalid or empty blob was extracted from the latter. 2017-11-30 03:29:04 +00:00
Alex Voicu 4966518846 Revert "Revert adoption of CUDA indexing in general - this can only work with later versions of the compiler, just like module based dispatch, and thus must be guarded against usage in earlier (e.g. 1.6) versions."
This reverts commit 4792475
2017-11-29 21:49:10 +00:00
Alex Voicu 2557000b56 Revert "Revert adoption of CUDA indexing in general - this can only work with later versions of the compiler, just like module based dispatch, and thus must be guarded against usage in earlier (e.g. 1.6) versions."
This reverts commit d2fd1f5
2017-11-29 21:36:29 +00:00
Alex Voicu d5c8de3f41 Change memset kernel to use memcpy instead of placement new. Simplify indexers. 2017-11-28 19:45:47 +00:00
Alex Voicu d37a5a6008 Merge remote-tracking branch 'origin/master' into feature_use_module_based_dispatch_instead_of_pfe
# Conflicts:
#	src/hip_module.cpp
2017-11-28 17:29:11 +00:00
Rahul Garg 38029f2849 Fixed review comments 2017-11-21 21:19:06 +05:30
Rahul Garg 24307fe5c4 Changed function hipMemcpy_2D to hipMemcpyParam2D 2017-11-21 12:36:24 +05:30
Alex Voicu f8c1c1b38e Refactor the __device__ versions of memset and memcpy to be less awkward i.e. not return nullptr as opposed to the destination pointer (it can only be assumed it was done for maximum confusion) and actually unroll as they claim to. Change all of the {to, from}Symbol functions to use hipModuleGetGlobal, as opposed to hc::accelerator::get_symbol_address which is no longer valid with module based dispatch. 2017-11-21 02:40:34 +00:00
Rahul Garg 1851c153f6 Texture driver APIs support 2017-11-09 22:10:55 +05:30
Alex Voicu 0ad1308efd Clean up trailing whitespace so as to reduce noise in #246. 2017-11-08 00:08:55 +00:00
Alex Voicu 17753cbd92 Merge remote-tracking branch 'origin/master' into feature_use_module_based_dispatch_instead_of_pfe
# Conflicts:
#	src/hip_module.cpp
2017-11-03 10:53:39 +00:00
Alex Voicu 4693c5e56c Correctly deal with functions from shared objects, wherein the program visible VA == so_base_va + st_value(function_symbol). Remove quaint usage of pfe for hipMemset (which is actually fill_n). 2017-11-01 22:33:13 +00:00
Alex Voicu 28f87f7d2e This switches HIP from its currently convoluted macro + pfe based dispatch mechanism to a more natural one partially based on the existing module API. The basic idea is that HCC will always correctly emit __global__ functions: as empty-bodied stubs, on host, and as kernels, on device. It then becomes trivial to obtain the mangled name on host, at dispatch, from the function's address, and then to use the mangled name to retrieve the kernel. This should address all problems stemming from serialisation, dubious mismatches due to the manufactured functor, macro-isms et al. It also immediately enables support for generalised globals as a consequence of that being available in the module API. Finally, it will make debug much easier, since the actual names of the __global__ functions will automatically be used in traces etc. One detail is that due to how dispatch works now (hipLaunchKernel and hipLaunchKernelGGL are themselves variadic function templates which deduce the function type of the callee), in certain cases it may be necesssary to insert explicit casts to ensure that the variadic argument list selects a viable overload - this can be observed in some unit tests. Eventually we may be able to remove this limitation, but for now it does not appear terribly onerous. The code is not extremely HIPpie, nor is it fully optimised, but rather is intended as a starting point for the HIP team to make its own. 2017-11-01 15:09:59 +00:00
Ben Sander a417241507 Fix bug with peer-to-peer combined with context API
- Store context inside the tracker rather than using int deviceID that
  was always mapped to primary context
- IsPeerWatcher now based on device IDs rather than specific peers.
2017-10-26 19:44:22 +00:00
Ben Sander b54ab82694 Merge pull request #179 from gargrahul/fix_hipmallocarray
Fixed hipMallocArray for 1D cases
2017-09-14 12:59:33 -05:00
Ben Sander fff42fd591 Add HIP_INIT_ALLOC to init allocated memory. 2017-09-13 23:31:48 +00:00
Rahul Garg baef2321c8 Fixed hipMallocArray for 1D cases 2017-09-12 21:52:11 +05:30
Rahul Garg 765d7f3c7b Null check on input pointer arguments 2017-08-25 08:46:34 +05:30
Maneesh Gupta 172a568aa6 [texture] guard new HCC APIs under workweek
Change-Id: I4f60a64fb0b0496ca1eb01ffe6ddda121c25d976
2017-08-15 15:51:38 +05:30
Weixing Zhang e4de2d1138 [HIP Texture] The GPU virtual address for texture memory needs to be
aligned.

In hcc_am, a bigger buffer will be allocated for alignment purpose
and _unalignedDevicePointer is added in struct AmPointerInfo for
original allocated address.
2017-08-08 11:18:00 -04:00
Rahul Garg 320ae86d44 fix hipMemcpy2DAsync 2017-07-29 06:50:56 +05:30