Commit Graph

254 Commits

Author SHA1 Message Date
Aditya Atluri 8b918b065a Added NVCC support and name changes
- Added NVCC support for module APIs
- Changed hipFunction and hipModule data types to hipFunction_t and hipModule_t
- Created new intenal ihipModuleGetFunction as it is used twice
- Changed test to match with the new data types

Change-Id: I300a1c7fd40ed7065b1b8b9de97e3a06b96ed729
2016-08-26 10:32:01 -05:00
Rahul Garg 1211cc931c Added logic to update primary ctx when ctx stack is empty, updated hipCtxDestroy and ctxGetCurrent functions
Change-Id: Ia0a8943c121bc1279788a1cfa9be59af614b04a6
2016-08-26 19:03:23 +05:30
Rahul Garg ae77d4b6d7 Resolved errors due to hipCtxXXX APIs
Change-Id: Iffac0095c4352864eca622ea318d2291571b5153
2016-08-26 15:32:49 +05:30
Rahul Garg 5108140087 NVCC path support for hipCtxXXX APIs
Change-Id: Ic7dbfbdaee9d00c0de1363c50758e5e29a96a8b2
2016-08-26 14:10:36 +05:30
Rahul Garg 524eb687d3 Addition of hipCtxEnablePeerAccess and hipCtxDisablePeerAccess functions
Change-Id: I381c8cbbde17eae7d9bb5d4cb1596cebf4bda039
2016-08-26 13:51:33 +05:30
Aditya Atluri 842553a6e1 Changed how hipEvent_t is typedefed internall
- Mapped hipEvent_t directly to ihipEvent_t* instead of a handle

Change-Id: I5a8bcca0ef962932e0738c03eb1fc914d23022ae
2016-08-25 14:34:41 -05:00
Aditya Atluri 79e88a6af6 Added hipModuleGetGlobal and hipModuleLoadData
Change-Id: Iaec873f7d86b72911b6ad32e067a4dfe3d552fe6
2016-08-25 14:16:53 -05:00
Aditya Atluri cb996c7b7a changed internal structure of hipFunction and hipModule
Change-Id: Ifa343782e29d7e056efc47e56253311013005093
2016-08-24 09:47:11 -05:00
Aditya Atluri 2287af23a1 Module test correction and hipModuleUnload API
- Corrected the hipModule.cpp test to minimal code
- Added hipModuleUnload API
- Added hipModuleUnload API test

Change-Id: I9c40337043d7972a570b795e1bfc104bd2c4d8aa
2016-08-23 14:19:15 -05:00
Aditya Atluri 8f0f97f8f9 Added stream synchronisation for hipLaunchModuleKernel
- The module kernel launch is now in sync with commands in its stream
- Moved launch kernel inside ihipStream

Change-Id: Ic00cfcf4882bf81b6203c36881a52575ea68b529
2016-08-22 14:17:55 -05:00
Aditya Atluri 0806958a72 Added nvcc path for hipComplex APIs
- Changed from inline to static inline for hipComplex AMD APIs
- Added NVCC path for hipComplex APIs mapped to cuComplex APIs

Change-Id: I809cf3a11b5b1c8bbc7a57c5fbcc3dc6745ccb95
2016-08-22 10:29:46 -05:00
Rahul Garg a498753041 Added support for hipCtxSynchronize and hipCtxGetFlags,modified hipDeviceSynchronize
Change-Id: If7bac667a262fa8c0cb3dc93e97f2534855acd07
2016-08-22 16:15:27 +05:30
Aditya Atluri 24f6251b99 Added more complex apis and copyright
- New header which redirects to CUDA/HIP path added for hipComplex.h
- Added more complex device api including fma
- Added copyright to new files

Change-Id: Iff0dece4c438e97d0ae33efa4312975d465a6464
2016-08-19 23:02:04 -05:00
Aditya Atluri 78b15bf062 Added support for complex device functions
- Added complex number arithmetic operation for float and double datatypes
- TODO: make them host functions and support half
- Added new function which is not in CUDA, hipCsqabs which is square of absolute value

Change-Id: Ib96e194ad45dc64fcba29eb19ad0376542e0591d
2016-08-19 21:48:23 -05:00
Aditya Atluri 921736782e Added support for executable and symbols for data structures
- symbol handle is added to hipFunction
- executable handle is added to hipModule
- This way, the APIs doesn't need to track the values

Change-Id: I7cf05329cf79fe946319d7746bd9f5503268fda4
2016-08-19 08:49:34 -05:00
Aditya Atluri e51ce8fc09 Added hipLaunchModuleKernel and new error codes
- hipLaunchModuleKernel maps to cuLaunchKernel
- Whole lot of new error codes added for the use of driver api
 - KernelParams arguments is not yet supported
 - hipLaunchModuleKernel is a synchronous api (will change eventually)
 - All the commands in a stream will wait on host when hipLaunchModuleKernel is called on it

Change-Id: Ib4a4fae1db06fbb3a81d5a5575b026aa821264ed
2016-08-18 11:26:55 -05:00
Rahul Garg 0962ca43d9 Added further hipCtxXXX Apis
Change-Id: I286d962a06cee656c1c652b3f6b45078587fbc41
2016-08-17 16:28:22 +05:30
pensun 2d228d8e0c add occupancy support for NV path; fix hipPeekAtLastError on HCC path
Change-Id: I26b0e1875c19d7c636ffcc18f1738926572ded81
2016-08-16 16:25:03 -05:00
Aditya Atluri 3d27bbd3db Added kernel compilation driver apis
1. Added 2 new driver apis, hipModuleLoad, hipModuleGetFunction

Change-Id: If464a7fad178121e3da791c7ac9e17ebc01a9cd0
Issues: When a sample written with them shows Aborted (core dumped) when exiting
2016-08-16 14:36:25 -05:00
Evgeny Mankov 7015675983 #define HIP_DYNAMIC_SHARED_ATTRIBUTE is added 2016-08-16 17:58:57 +03:00
Rahul Garg eec9edef80 Implementation of hipCtxGetDevice
Change-Id: I067572e486323c3aad6f744a2c0c4997c8696af6
2016-08-13 01:17:46 +05:30
Rahul Garg 62d390da58 First implementation of hipCtxXXX functions
Change-Id: I4609cbe6bd90a1fff8655bff4fdd773864397aba
2016-08-13 00:09:08 +05:30
Jeffrey Poznanovic 48491a3978 Adding hipblas include files
Change-Id: I73064d410acd8f655dc62eaeb6f4bdefc5381e35
2016-08-12 11:59:25 +05:30
Ben Sander 89164259ab Context update.
- Remove tls_deviceID.
- Add first passing test.

Change-Id: If3e2f254abf589028cfe4f9e6369745f04160de0
2016-08-10 08:59:47 -05:00
Rahul Garg 2ac93c340d Changed StagingBuffer class to UnpinnedCopyEngine
Change-Id: I1e212bfc8030dcf225ecf78fd7b23fda9b1de92f
2016-08-09 21:29:42 +05:30
Rahul Garg 023b1ecf33 Moved sync copy decision logic to staging buffer class
Change-Id: I5c398772375fcc1f174a7597eea1215ce7bf80b4
2016-08-09 09:28:18 +05:30
Ben Sander 8f402132ba Add initial context implementation.
APIs: hipInit, hipCtxCreate.
Track TLS default ctx.  Set deviceID now changes the ctx.
Add first context test.

Change-Id: If1cb9989b5a04a36147e25e84904336c7b6f3d88
2016-08-08 17:49:02 -05:00
Ben Sander ed0a2c02fe Code cleanup, use camelCase where appropriate.
Change-Id: I5a7ec50df8bbb3e7a3b313c0b12e2dd55ae4a09c
2016-08-08 14:54:38 -05:00
Ben Sander 2a798152d4 Move copy kernel templates into hip_memory.cpp
Change-Id: I862529f3fa8232372c6bacaa5d36f035bbdd32a1
2016-08-08 12:07:12 -05:00
Ben Sander cfdacab32f Split ihipCtx_t into ihipCtx_t and ihipDevice_t .
Major change to existing code base.
    Ctx holds streams, enables peers, and flags.
    Device holds accelerator, hsa-agent, device props.

Add hipCtx_t.

Add peer APIs that accept hipCtx_t (in addition to deviceId)

Compiles and passes directed tests.

Change-Id: Iddab1eb9edbf90caad2ef5959c6b811d658197f1
2016-08-08 11:55:57 -05:00
Ben Sander 2dc3d3238b Change Device->Ctx
Change ihipDevice_t -> ihipCtx_t (new)
Change ihipGetTlsDefaultDevice->ihipGetTlsDefaultCtx
Some other changes from device->ctx where appropriate.

Change-Id: I5c4ae93b2fd42c6303aa23d748eb166b7431925d
2016-08-07 21:47:12 -05:00
Ben Sander e7d7c5cbe8 Remove ihipStream_r::_device_index
Replace with direct pointer to device.  Cleaner, and prep
for transition to contexts.

Change-Id: I0e550f34412923d46c541c0a14bb7d29c3fd4b11
2016-08-07 20:47:06 -05:00
Rahul Garg fcb2fcce1e Region based apis to pool based api changes
Change-Id: If53019eebafe051ab4e811863995f78315297080
2016-08-05 15:05:57 +05:30
Ben Sander 02dd7a7399 Cleanup sync code.
Remove dead depFutures, enqueueBarrier call.
Rename some parms to reflect usage.
Add comments to better explain tricky parts of sync code.

Change-Id: I763296421d9c2b3b58fc8cef5f010b12ab49553c
2016-07-27 18:31:11 -05:00
Aditya Atluri 1859c6e515 Signal Fix: Added signal limit to allocSignal
1. Did not change the logic in allocSignal
2. Added guard to wait on signal limit

Change-Id: I78f29097e6a584b3c3d78319dac19869067bd1fe
2016-07-27 13:48:49 -05:00
Aditya Atluri 0a31b47e2e Signal Fix: Moved kernel count to critical stream
1. Added environment variable HIP_NUM_KERNELS_INFLIGHT
2. Moved kernelcount variable inside stream critical section

Change-Id: I51d24d0a2a109467209170de117a6d02ba4e308e
2016-07-26 17:09:27 -05:00
Aditya Atluri 53d7629a85 Signal Fix: Changed global signal count to per stream signal count
1. The number of kernels that can use signals are increased to 128
2. The kernel count is now specific to the stream

Change-Id: Ie6d1aa3f437aad8f08c3333fe48bd3f46e551e60
2016-07-26 14:03:51 -05:00
Aditya Atluri 4bdf26a82e Added re-fix for memcpy kernel sync
1. The patch uses HIP signal pools to sync between copy and kernel commands
2. The hsa_signal_create is removed
3. Left the redundant enqueueBarrier method just in case

Change-Id: I3dff3e8ee57fff3cd49bec802ff735ed128e5ca1
2016-07-26 09:22:59 -05:00
Rahul Garg 42a3ed544c D2H and H2D unpinned memory transfer support
Change-Id: If6d6c970f435e5d917d5cc6cddc2ee2918cd1c37

Conflicts:
	src/hip_hcc.cpp
2016-07-25 14:36:07 +05:30
Aditya Atluri c756bb3398 Partial fix async after kernel launch signal issue
Change-Id: Ib48d6564379160035bded9493b93663fba361710
2016-07-23 14:54:20 -05:00
pensun f31668fee4 Add empty stubs for threadfence family routines, changes include:
- stubs and documentation in include/hcc_details/hip_runtime.h
    - stubs with "no-op" in src/hip_memory.cpp
    - document update in hip_kernel_language.md, add suggestions to
    disable L1 and L2 caches when using the threadfence routines.

Change-Id: Ic0753170f802003055bca9d7476d7f48817b98b7
2016-07-22 10:40:58 -05:00
Maneesh Gupta 71d51170ef Replace calls to ihipInit with use of HIP_INIT_API macro
Change-Id: Iabf7df79f0238a8ddffea4607fe945df36642850
2016-07-22 15:46:55 +05:30
Maneesh Gupta b23fad53cc Fix using ATP markers
Change-Id: If2d04f80b580237426c569737551e2001a8cd35a
2016-07-21 16:02:51 +05:30
Maneesh Gupta 7022986ab2 Merge branch 'hiparray' into amd-develop
Change-Id: I63ca7b1db7b593ac5cfb3fd7cd5d08d6e4075a4c
2016-07-21 12:29:56 +05:30
Maneesh Gupta d7b040bdba Merge branch 'amd-master' into amd-develop 2016-07-05 21:40:22 +05:30
Aditya Atluri 36b81c1be6 added more nvcc event functions
Change-Id: I79ee20ef444d4c1ab6ada3c0d56730ce754ab6b6
2016-06-30 21:03:19 -05:00
Maneesh Gupta 3f204b8580 Merge branch 'amd-develop' into amd-master
Change-Id: I04f85b207e15e66c1a546675dc0937726ee08362
2016-06-30 18:36:07 +05:30
Aditya Atluri 5633cc34cc moved half support to a source file
Change-Id: I7c09b41877e22c1b743dea25a585e5307427dafd
2016-06-30 18:23:29 +05:30
Aditya Atluri 83210c8ac3 added fp16 software support
Change-Id: Ic0fdd9f8248a66911169fc00d3af71f50b36e233
2016-06-30 18:23:29 +05:30
7SK 8264d5d6bd NVCC_COMPAT
add support for both cuda compatible implementation and hcc(faster)
implementation with test

Change-Id: I79a22344f458391d7dffac5f147619a542e97e4e
2016-06-28 09:36:06 +05:30