Commit Graph

6428 Commits

Author SHA1 Message Date
Jeffrey Poznanovic 48491a3978 Adding hipblas include files
Change-Id: I73064d410acd8f655dc62eaeb6f4bdefc5381e35
2016-08-12 11:59:25 +05:30
Aditya Atluri 3be747c41e Added fix for using HCC_VERSION
1. The variable is brought outside the conditional so that its scope is increased

Change-Id: I2d2689553e67930050fe5b3648739f0f72c3bbc8
2016-08-11 16:13:44 -05:00
Aditya Atluri ef68f2f293 Change hipcc to take HCC_HOME from hipconfig by default
1. Current implementation checks both env var and value in hipconfig and reports error
2. New implementation gives value in hipconfig with highest priority
3. If hipconfig is not present, fall back to env variables.

To Devs: No need to switch between environment variables for different HCC + different HIP.

Change-Id: I6cdf37e1429d7f07be3a68c7e5ba1533d832962b
2016-08-11 15:31:24 -05:00
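
The lookup order described in the commit above (value from hipconfig first, environment variable as fallback) can be sketched roughly as follows. This is an illustration in Python, not the hipcc implementation: the function names `parse_hipconfig` and `resolve_hcc_home`, the `KEY=VALUE` file layout, and the choice to also fall back when the file has no `HCC_HOME` entry are all assumptions.

```python
import os
from typing import Mapping, Optional


def parse_hipconfig(text: str) -> dict:
    """Parse KEY=VALUE lines (an assumed layout for the hipconfig file)."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        # Skip blanks, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip()
    return values


def resolve_hcc_home(hipconfig_text: Optional[str],
                     env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return HCC_HOME, preferring the hipconfig value over the environment."""
    if hipconfig_text is not None:
        value = parse_hipconfig(hipconfig_text).get("HCC_HOME")
        if value:
            return value
    # hipconfig absent (or, by assumption, lacking HCC_HOME): use the environment.
    return env.get("HCC_HOME")
```

Passing `None` for `hipconfig_text` models the case where no hipconfig file exists, so the environment variable is the only source.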
Evgeny Mankov 2aacb02358 clang-hipify: Add support for nested macro expansion and translation.
Fixes bug “HIPIFY: nested macro is not hipified”
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/33

Example:
#include "cuda_runtime.h"
#define MY_MACRO(func, flags) (func, flags)
...
cudaEvent_t *event = NULL;
MY_MACRO(cudaEventCreateWithFlags(event, cudaEventDisableTiming), NULL);

where cudaEventDisableTiming is a defined numeric literal and thus a nested MACRO:
#define cudaEventDisableTiming 0x02 /**< Event will not record timing data */

After hipifying now:
MY_MACRO(hipEventCreateWithFlags(event, cudaEventDisableTiming), NULL);

Should be:
MY_MACRO(hipEventCreateWithFlags(event, hipEventDisableTiming), NULL);
2016-08-11 22:29:55 +03:00
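
The expected result in the commit above can be reproduced with a simple identifier-level rename that also reaches names nested inside macro arguments. The sketch below is a plain textual substitution in Python and is only meant to show the intended output; clang-hipify itself works on the Clang front end, and the table `CUDA_TO_HIP` and the function `hipify_line` are hypothetical names introduced for this illustration.

```python
import re

# Illustrative subset of a CUDA -> HIP rename table, limited to the names
# that appear in the commit message's example.
CUDA_TO_HIP = {
    "cudaEvent_t": "hipEvent_t",
    "cudaEventCreateWithFlags": "hipEventCreateWithFlags",
    "cudaEventDisableTiming": "hipEventDisableTiming",
}

# Matches whole C identifiers so partial names are never rewritten.
IDENTIFIER = re.compile(r"\b[A-Za-z_]\w*\b")


def hipify_line(line: str) -> str:
    """Rename every known CUDA identifier, including ones nested in macro arguments."""
    return IDENTIFIER.sub(lambda m: CUDA_TO_HIP.get(m.group(0), m.group(0)), line)
```

Unlike a compiler-based tool, this approach would also rewrite matching names inside string literals and comments, so it is suitable only as a demonstration of the target output.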
pensun 4553e4e7f7 Add note in hip_faq on the workaround of adding the static keyword to all forceinline functions
Change-Id: Ia13ba59b1e54df8ead5a96a952084144431ec72a
2016-08-10 11:31:13 -05:00
Ben Sander 89164259ab Context update.
- Remove tls_deviceID.
- Add first passing test.

Change-Id: If3e2f254abf589028cfe4f9e6369745f04160de0
2016-08-10 08:59:47 -05:00
Ben Sander 1786b120ed Document workaround for parenthesis+macro+hipLaunchKernel
Change-Id: Ie04c99db92d6499ddde93028a96f9d8f72d3f992
2016-08-10 08:59:47 -05:00
Maneesh Gupta 2e9adefd71 Allow cmake to be run multiple times in directed tests
Change-Id: I9d68fdefd9f72895ad4bdb310fcf3c6e52dbbf02
2016-08-10 09:49:07 +05:30
Maneesh Gupta 76eeaf4fbb Fix build issues due to refactoring changes
Change-Id: I0a709ff4864244ba1b49e1a25327e3901ed6e17f
2016-08-09 22:49:32 +05:30
Rahul Garg 2ac93c340d Changed StagingBuffer class to UnpinnedCopyEngine
Change-Id: I1e212bfc8030dcf225ecf78fd7b23fda9b1de92f
2016-08-09 21:29:42 +05:30
Rahul Garg 023b1ecf33 Moved sync copy decision logic to staging buffer class
Change-Id: I5c398772375fcc1f174a7597eea1215ce7bf80b4
2016-08-09 09:28:18 +05:30
Ben Sander 8f402132ba Add initial context implementation.
APIs: hipInit, hipCtxCreate.
Track TLS default ctx.  Set deviceID now changes the ctx.
Add first context test.

Change-Id: If1cb9989b5a04a36147e25e84904336c7b6f3d88
2016-08-08 17:49:02 -05:00
Ben Sander ed0a2c02fe Code cleanup, use camelCase where appropriate.
Change-Id: I5a7ec50df8bbb3e7a3b313c0b12e2dd55ae4a09c
2016-08-08 14:54:38 -05:00
Ben Sander b1d8f9d00d Coding guidelines update
Change-Id: Ib8d8da4c3897d157aeb26eb2e99718d66fd260b1
2016-08-08 13:12:22 -05:00
Ben Sander 2a798152d4 Move copy kernel templates into hip_memory.cpp
Change-Id: I862529f3fa8232372c6bacaa5d36f035bbdd32a1
2016-08-08 12:07:12 -05:00
Ben Sander cfdacab32f Split ihipCtx_t into ihipCtx_t and ihipDevice_t .
Major change to existing code base.
    Ctx holds streams, enables peers, and flags.
    Device holds accelerator, hsa-agent, device props.

Add hipCtx_t.

Add peer APIs that accept hipCtx_t (in addition to deviceId)

Compiles and passes directed tests.

Change-Id: Iddab1eb9edbf90caad2ef5959c6b811d658197f1
2016-08-08 11:55:57 -05:00
Ben Sander 6aeb2dc8d6 Add initial/partial coding guidelines
Change-Id: Ifd8cb3ad74b15d3ab2f38c3daa038a2808af6fa9
2016-08-08 11:55:41 -05:00
Ben Sander 2dc3d3238b Change Device->Ctx
Change ihipDevice_t -> ihipCtx_t (new)
Change ihipGetTlsDefaultDevice->ihipGetTlsDefaultCtx
Some other changes from device->ctx where appropriate.

Change-Id: I5c4ae93b2fd42c6303aa23d748eb166b7431925d
2016-08-07 21:47:12 -05:00
Ben Sander e7d7c5cbe8 Remove ihipStream_r::_device_index
Replace with direct pointer to device.  Cleaner, and prep
for transition to contexts.

Change-Id: I0e550f34412923d46c541c0a14bb7d29c3fd4b11
2016-08-07 20:47:06 -05:00
Evgeny Mankov 114d5bfddf clang-hipify: Transformation of declarations with external linkage and shared attribute for IncompleteArrayType (aka C array[]) only.
Example:
extern __shared__ uint sRadix1[]; =>  HIP_DYNAMIC_SHARED(unsigned int, sRadix1);
2016-08-05 21:35:58 +03:00
Rahul Garg fcb2fcce1e Region based apis to pool based api changes
Change-Id: If53019eebafe051ab4e811863995f78315297080
2016-08-05 15:05:57 +05:30
Maneesh Gupta b151f6f997 Merge branch 'amd-develop' into amd-master
Change-Id: I817585dfe3077250acbfd56cbb0a7a4a120361e3
2016-08-03 14:16:22 +05:30
Maneesh Gupta bc394505cc .hipconfig now lives in lib
Also moved reading .hipconfig to beginning of hipcc

Change-Id: Ibb9d32bef64a79f189aa037c36814759c8cc8052
2016-08-03 12:01:48 +05:30
scchan 2f6ac8c36e compile all hip objects with -fPIC so that they work in a shared library
Change-Id: I1f8355d4a81dbd5f408062c317c16a5efc668876
2016-08-03 11:48:46 +05:30
Maneesh Gupta 703a287bb0 hipcc reads .hipconfig
hipcc dies if HCC_VERSION is not same as the one used to build HIP

Change-Id: I180c0108812fe5cb6c5304477557c524a4a61f75
2016-08-03 11:32:08 +05:30
Maneesh Gupta c38d0466b8 Only create .hipconfig on install
Change-Id: I9c20440023401f5794b33a0165e0918372783a68
2016-08-03 11:25:59 +05:30
Maneesh Gupta de39d01922 Merge branch 'release_0.92.00' into amd-develop
2016-08-03 09:10:44 +05:30
Maneesh Gupta e818f2378d Fix hipcc to work correctly when HIP_ATP_MARKER is not defined
Change-Id: I7cc525daccf896704e6ccf6d04ed395fda622031
2016-08-03 09:08:40 +05:30
Maneesh Gupta b2035c98ac hip_hcc rpm package does not depend on rocm-profiler
Change-Id: I291c046371e704ce395781d5de3f6430c0dcba7b
2016-08-03 09:07:58 +05:30
Aditya Atluri 9c45d9eaed Added configuration file to be used by hipcc
1. CMake will create .hip-config file in bin directory
Future Work: Need to make changes to hipcc to read the file

Change-Id: Ia7dc48d43787921d5af4ab07d7a5befbcf904465
2016-08-02 15:45:46 -05:00
Ben Sander f43d02027e Remove faulty assert for kernelCnt==0
Change-Id: I8a925c95f48e857c0a31f44561499e90dc6df552
2016-08-01 13:38:47 -05:00
Maneesh Gupta 03df23df2b Merge branch 'release_0.92.00' into amd-develop
Conflicts:
	RELEASE.md
	docs/markdown/hip_faq.md

Change-Id: Ifae1b64b6255a7872dfdc885bb8fb52f622464b7
2016-08-01 10:47:25 +05:30
Aditya Atluri 9062ebcf3a Signal Fix: The signals in a stream are re-used
1. Before, the signal pool is increased depending on the usage
2. After, a static number of signals are allocated to the pool
Only these are used by hip in a stream
3. If the signals required are more than the pool size, the
stream has to wait to make sure all the signals are available
4. Once they are available, the stream can use them
5. Removed HIP_NUM_SIGNALS_PER_STREAM because of redundancy with HIP_STREAM_SIGNALS
6. Increased signal count from 2 to 32.
Future Work: Dynamically increase the pool size depending on the number of
streams allocated by the application. And, null stream should have more signals

Change-Id: I6be36e084f26bb04766fabf776c7210aee0f9e91
2016-07-28 23:01:35 -05:00
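
The idea in the commit above, a pool whose size is fixed up front and whose users wait when it runs dry, can be sketched as below. This is a simplified Python model, not the HIP runtime code: the class `FixedSignalPool` is hypothetical, integer slot ids stand in for HSA signals, and a caller here waits for a single free slot, whereas the commit describes the stream waiting until all signals are available again. The default of 32 mirrors the signal count mentioned in item 6.

```python
import threading


class FixedSignalPool:
    """Fixed-size pool of reusable slots standing in for HSA signals."""

    def __init__(self, size: int = 32):
        self._free = list(range(size))      # allocated once, never grown
        self._cond = threading.Condition()

    def acquire(self, timeout=None):
        """Take a slot, waiting until one is released if the pool is empty.

        Returns the slot id, or None if the timeout expires first.
        """
        with self._cond:
            if not self._cond.wait_for(lambda: bool(self._free), timeout):
                return None
            return self._free.pop()

    def release(self, slot: int) -> None:
        """Return a slot to the pool and wake one waiter."""
        with self._cond:
            self._free.append(slot)
            self._cond.notify()
```

Because the pool never grows, memory use is bounded, at the cost of callers stalling when more slots are requested than exist, which is the trade-off the commit's "Future Work" note points at.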
Ben Sander 666c227c7d Remove dead enqueueBarrier function.
Change-Id: Ib18fe6bd96ce24dbeb342961ddb5721f7d03f2b2
2016-07-28 22:48:22 -05:00
Ben Sander 02dd7a7399 Cleanup sync code.
Remove dead depFutures, enqueueBarrier call.
Rename some parms to reflect usage.
Add comments to better explain tricky parts of sync code.

Change-Id: I763296421d9c2b3b58fc8cef5f010b12ab49553c
2016-07-27 18:31:11 -05:00
Ben Sander 4cf0f86e99 Fix API string message for hipDeviceGetAttribute
Change-Id: I30f54627630c8ee835506be8c9921742bb68a43a
2016-07-27 16:18:14 -05:00
Aditya Atluri 1859c6e515 Signal Fix: Added signal limit to allocSignal
1. Did not change the logic in allocSignal
2. Added guard to wait on signal limit

Change-Id: I78f29097e6a584b3c3d78319dac19869067bd1fe
2016-07-27 13:48:49 -05:00
Maneesh Gupta 059a8d51ba Update release notes for 0.92.00 release
Change-Id: I9ca588cd0d5d752dc6521e76ba943500eb55525f
2016-07-27 20:30:04 +05:30
Aditya Atluri 0a31b47e2e Signal Fix: Moved kernel count to critical stream
1. Added environment variable HIP_NUM_KERNELS_INFLIGHT
2. Moved kernelcount variable inside stream critical section

Change-Id: I51d24d0a2a109467209170de117a6d02ba4e308e
2016-07-26 17:09:27 -05:00
Aditya Atluri 53d7629a85 Signal Fix: Changed global signal count to per stream signal count
1. The number of kernels that can use signals is increased to 128
2. The kernel count is now specific to the stream

Change-Id: Ie6d1aa3f437aad8f08c3333fe48bd3f46e551e60
2016-07-26 14:03:51 -05:00
Aditya Atluri fa7933eb91 removed redundant signal destroy
Change-Id: Icf0cd76b2620d34c87cfb6c7a83049087c0a0bc4
2016-07-26 13:35:35 -05:00
Ben Sander 99bc0f03e1 Doc update for FAQ and future RELEASE notes
Change-Id: I7e7c32d9a19fdaea0a0e41f1d4fa4652e53640f2
2016-07-26 12:14:15 -05:00
Ben Sander 6ea9d2eace Doc update for README.md - add more intro text, example
Change-Id: I99b8eaacd6460dfdbdbc8ddba3fe589647d877e7
2016-07-26 12:14:02 -05:00
Ben Sander 0a74e00ac8 Make HCC ignore register keyword
- (previously would emit a warning)
- Also tweak documentation.

Change-Id: I0f4f00f82f8cc53d420112570f2d7675535e6aea
2016-07-26 12:13:48 -05:00
Aditya Atluri 4bdf26a82e Added re-fix for memcpy kernel sync
1. The patch uses HIP signal pools to sync between copy and kernel commands
2. The hsa_signal_create is removed
3. Left the redundant enqueueBarrier method just in case

Change-Id: I3dff3e8ee57fff3cd49bec802ff735ed128e5ca1
2016-07-26 09:22:59 -05:00
Maneesh Gupta 75f804fbf7 Merge branch 'amd-develop' into amd-master
Change-Id: Ic6421e5f4cb170f196495e14395c4fc62e4b0899
2016-07-26 11:42:56 +05:30
Maneesh Gupta 90a3e4e1da Documentation updates
Change-Id: Ia624d86915c4c96da0ac0242f767135f30ff73c6
2016-07-25 14:53:15 +05:30
Rahul Garg 42a3ed544c D2H and H2D unpinned memory transfer support
Change-Id: If6d6c970f435e5d917d5cc6cddc2ee2918cd1c37

Conflicts:
	src/hip_hcc.cpp
2016-07-25 14:36:07 +05:30
Aditya Atluri c756bb3398 Partial fix async after kernel launch signal issue
Change-Id: Ib48d6564379160035bded9493b93663fba361710
2016-07-23 14:54:20 -05:00
pensun f31668fee4 Add empty stubs for threadfence family routines, changes include:
- stubs and documentation in include/hcc_details/hip_runtime.h
- stubs with "no-op" in src/hip_memory.cpp
- document update in hip_kernel_language.md, add suggestions to
  disable L1 and L2 caches when using the threadfence routines.

Change-Id: Ic0753170f802003055bca9d7476d7f48817b98b7
2016-07-22 10:40:58 -05:00