Ben Sander
f43d02027e
Remove faulty assert for kernelCnt==0
...
Change-Id: I8a925c95f48e857c0a31f44561499e90dc6df552
2016-08-01 13:38:47 -05:00
Maneesh Gupta
03df23df2b
Merge branch 'release_0.92.00' into amd-develop
...
Conflicts:
RELEASE.md
docs/markdown/hip_faq.md
Change-Id: Ifae1b64b6255a7872dfdc885bb8fb52f622464b7
2016-08-01 10:47:25 +05:30
Aditya Atluri
9062ebcf3a
Signal Fix: The signals in a stream are re-used
...
1. Before, the signal pool is increased depending on the usage
2. After, a static number of signals are allocated to the pool
Only these are used by hip in a stream
3. If the signals required are more than the pool size, the
stream has to wait to make sure all the signals are available
4. Once they are available, the stream can use them
5. Removed HIP_NUM_SIGNALS_PER_STREAM because of redundancy with HIP_STREAM_SIGNALS
6. Increased signal count from 2 to 32.
Future Work: Dynamically increase the pool size depending on the number of
streams allocated by the application. And, null stream should have more signals
Change-Id: I6be36e084f26bb04766fabf776c7210aee0f9e91
2016-07-28 23:01:35 -05:00
Ben Sander
666c227c7d
Remove dead enqueueBarrier function.
...
Change-Id: Ib18fe6bd96ce24dbeb342961ddb5721f7d03f2b2
2016-07-28 22:48:22 -05:00
Ben Sander
02dd7a7399
Cleanup sync code.
...
Remove dead depFutures, enqueueBarrier call.
Rename some parms to reflect usage.
Add comments to better explain tricky parts of sync code.
Change-Id: I763296421d9c2b3b58fc8cef5f010b12ab49553c
2016-07-27 18:31:11 -05:00
Ben Sander
4cf0f86e99
Fix API string message for hipDeviceGetAttribute
...
Change-Id: I30f54627630c8ee835506be8c9921742bb68a43a
2016-07-27 16:18:14 -05:00
Aditya Atluri
1859c6e515
Signal Fix: Added signal limit to allocSignal
...
1. Did not change the logic in allocSignal
2. Added guard to wait on signal limit
Change-Id: I78f29097e6a584b3c3d78319dac19869067bd1fe
2016-07-27 13:48:49 -05:00
Maneesh Gupta
059a8d51ba
Update release notes for 0.92.00 release
...
Change-Id: I9ca588cd0d5d752dc6521e76ba943500eb55525f
2016-07-27 20:30:04 +05:30
Aditya Atluri
0a31b47e2e
Signal Fix: Moved kernel count to critical stream
...
1. Added environment variable HIP_NUM_KERNELS_INFLIGHT
2. Moved kernelcount variable inside stream critical section
Change-Id: I51d24d0a2a109467209170de117a6d02ba4e308e
2016-07-26 17:09:27 -05:00
Aditya Atluri
53d7629a85
Signal Fix: Changed global signal count to per stream signal count
...
1. The number of kernels that can use signals are increased to 128
2. The kernel count is now specific to the stream
Change-Id: Ie6d1aa3f437aad8f08c3333fe48bd3f46e551e60
2016-07-26 14:03:51 -05:00
Aditya Atluri
fa7933eb91
removed redundant signal destroy
...
Change-Id: Icf0cd76b2620d34c87cfb6c7a83049087c0a0bc4
2016-07-26 13:35:35 -05:00
Ben Sander
99bc0f03e1
Doc update for FAQ and future RELEASE notes
...
Change-Id: I7e7c32d9a19fdaea0a0e41f1d4fa4652e53640f2
2016-07-26 12:14:15 -05:00
Ben Sander
6ea9d2eace
Doc update for README.md - add more intro text, example
...
Change-Id: I99b8eaacd6460dfdbdbc8ddba3fe589647d877e7
2016-07-26 12:14:02 -05:00
Ben Sander
0a74e00ac8
Make HCC ignore register keyword
...
- (previously would emit a warning)
- Also tweak documentation.
Change-Id: I0f4f00f82f8cc53d420112570f2d7675535e6aea
2016-07-26 12:13:48 -05:00
Aditya Atluri
4bdf26a82e
Added re-fix for memcpy kernel sync
...
1. The patch uses HIP signal pools to sync between copy and kernel commands
2. The hsa_signal_create is removed
3. Left the redundant enqueueBarrier method just in case
Change-Id: I3dff3e8ee57fff3cd49bec802ff735ed128e5ca1
2016-07-26 09:22:59 -05:00
Maneesh Gupta
90a3e4e1da
Documentation updates
...
Change-Id: Ia624d86915c4c96da0ac0242f767135f30ff73c6
2016-07-25 14:53:15 +05:30
Rahul Garg
42a3ed544c
D2H and H2D unpinned memory transfer support
...
Change-Id: If6d6c970f435e5d917d5cc6cddc2ee2918cd1c37
Conflicts:
src/hip_hcc.cpp
2016-07-25 14:36:07 +05:30
Aditya Atluri
c756bb3398
Partial fix async after kernel launch signal issue
...
Change-Id: Ib48d6564379160035bded9493b93663fba361710
2016-07-23 14:54:20 -05:00
pensun
f31668fee4
Add empty stubs for threadfence family routines, changes include:
...
- stubs and documentation in include/hcc_details/hip_runtime.h
- stubs with "no-op" in src/hip_memory.cpp
- document update in hip_kernel_language.md, add suggestions to
disable L1 and L2 caches when using the threadfence routines.
Change-Id: Ic0753170f802003055bca9d7476d7f48817b98b7
2016-07-22 10:40:58 -05:00
Maneesh Gupta
71d51170ef
Replace calls to ihipInit with use of HIP_INIT_API macro
...
Change-Id: Iabf7df79f0238a8ddffea4607fe945df36642850
2016-07-22 15:46:55 +05:30
Maneesh Gupta
1bf661223e
hip_hcc package now depends on rocm-profiler
...
Change-Id: I80d1c6048cc18c47c2024efb90368b17139a09ad
2016-07-22 15:45:32 +05:30
Maneesh Gupta
b23fad53cc
Fix using ATP markers
...
Change-Id: If2d04f80b580237426c569737551e2001a8cd35a
2016-07-21 16:02:51 +05:30
Maneesh Gupta
7022986ab2
Merge branch 'hiparray' into amd-develop
...
Change-Id: I63ca7b1db7b593ac5cfb3fd7cd5d08d6e4075a4c
2016-07-21 12:29:56 +05:30
Maneesh Gupta
b25443b96a
Bump HIP version to 0.92.00
...
Change-Id: I4d653213dcf3ddf7d09d36433afe31e21d17a7ee
2016-07-21 09:52:08 +05:30
Aditya Atluri
272820a620
added kernel launch stress test
...
Change-Id: Ib50d47e55079839ba31aabb524b70b704d7a46fe
2016-07-19 13:57:56 -05:00
Aditya Atluri
c8542007b5
added fix for signal overflow in kernels
...
Change-Id: Ie0b1f97f69b7d7b34e445f6f120472819be03a0e
2016-07-19 13:51:44 -05:00
Maneesh Gupta
ab0bea5b15
Add markdown documentation to hip_docs package
...
Change-Id: I4075e6baaf287356ac8b485cccc231bfb729f078
2016-07-19 14:30:03 +05:30
Maneesh Gupta
779502e4a2
Set cmake policy CMP0037 to old behavior for newer cmake versions
...
Change-Id: Ib2a7da53a238a489e73d6c006c50f12f07f866a1
2016-07-19 06:52:38 +05:30
Ben Sander
c3b4c6b23f
Update documentation.
...
- Add more detailed feature comparison HIP vs specific CUDA SDK revs.
- Remove reference to old codenames.
- Fix TOC in some md files.
Change-Id: If0f6a05dc61237c4710c4d4bf2d933d556dcaeae
2016-07-18 13:56:22 -05:00
Maneesh Gupta
307ea434cc
README.md: Fix link to clang-hipify documentation
...
Change-Id: I4c668c0e3c5ee2b16dfc213b48ccd1dbaac7b2a2
2016-07-15 14:18:59 +05:30
Maneesh Gupta
e15ff6a28e
Handle linker warning correctly
...
Change-Id: Ifca5886e31b4551603944c2b90a8e532442e889f
2016-07-15 10:40:56 +05:30
Maneesh Gupta
1be8097c69
Add specifying ROCm target support to HIP
...
Change-Id: I42930d8ccf0dafe408ce5bebe2d95f99e544ee8d
2016-07-15 10:08:27 +05:30
Maneesh Gupta
19c0bb0250
Merge branch 'release_0.90.00' into amd-develop
...
Conflicts:
bin/hipcc
Change-Id: Iced6290fe205551f5d30f656088530b0ea813193
2016-07-15 10:04:03 +05:30
Maneesh Gupta
fecea08416
Whitespace cleanup in hipcc
...
Change-Id: I36d32808e9e393df44ffd80b49d09ccf68b5751e
2016-07-13 11:26:03 +05:30
Maneesh Gupta
e88ff3eae8
Cleanup hip_hcc packaging script
...
Change-Id: I41168b3135b36263369b97974a2d2646516e8401
2016-07-12 13:08:21 +05:30
Maneesh Gupta
9e5f5c79c8
Fix whitespace in cmakefile
...
Change-Id: Ide671e19ea15d57cc8619ec57ad10bca615b3858
2016-07-12 13:00:42 +05:30
Maneesh Gupta
723e0def03
cmake queries hip version from hipconfig
...
Change-Id: I8c65ca508ddc288111eaa1d818213c72ef27ac1b
2016-07-12 12:40:35 +05:30
Maneesh Gupta
3cbbcc30f9
Move hip version information to hipconfig
...
Change-Id: I2892a9eb9356ac5310b89a433d858c1c3eb986ee
2016-07-11 16:38:41 +05:30
Evgeny Mankov
19a6f1486a
clang-hipify: Populate replacement rules and fix typos.
...
+ fix typo in hipStreamWaitEvent.
+ hipHostAlloc -> hipHostMalloc.
+ Memory Types, Pointer Attributes are added.
+ Event, Host Malloc and Register, Device and Stream Flags are added.
+ exclude cudacommon.h.prehip from includes' replacement rules.
2016-07-07 18:01:41 +03:00
Maneesh Gupta
e5ae34bb03
Merge branch 'amd-develop' into amd-master
...
Change-Id: I8743f4653c63b6b19d9f055b2c2df76e1ad185cb
2016-07-05 21:42:43 +05:30
Maneesh Gupta
d7b040bdba
Merge branch 'amd-master' into amd-develop
2016-07-05 21:40:22 +05:30
Maneesh Gupta
b5ed40d591
Merge branch 'hipblas' into amd-master
...
Change-Id: I73bcf3476174b1d01ef7109fe026323fba577ad8
2016-07-05 10:05:43 +05:30
Maneesh Gupta
054aef9006
Merge remote-tracking branch 'hipblas' into hipblas
2016-07-05 10:04:10 +05:30
Maneesh Gupta
1c05dc932b
Merge commit 'hipblas~5' into hipblas
2016-07-05 10:03:51 +05:30
Maneesh Gupta
8f27e78dd7
don't bundle source in rocm packages
...
Change-Id: I36bd93e2bff9db1310724dd79c6c736ee17aa806
2016-07-05 08:18:30 +05:30
Evgeny Mankov
fd1e556cf2
clang-hipify: add Replacement Excludes
...
Excludes are not replaced, for instance, CHECK_CUDA_ERROR and CUDA_SAFE_CALL.
Add check for excludes in MacroExpands and CallExpr routines.
2016-07-01 19:58:14 +03:00
Aditya Atluri
36b81c1be6
added more nvcc event functions
...
Change-Id: I79ee20ef444d4c1ab6ada3c0d56730ce754ab6b6
2016-06-30 21:03:19 -05:00
Maneesh Gupta
3f204b8580
Merge branch 'amd-develop' into amd-master
...
Change-Id: I04f85b207e15e66c1a546675dc0937726ee08362
2016-06-30 18:36:07 +05:30
Fan Cao
6a2bbbcb75
Replace GPU agent with CPU agent properly for memory async copy API
...
ihipStream_t::copySync use GPU agent in memory async copy API, even
if the src/dst memory does not belong to GPU, which cause the hsa
runtime to choose a slower copy engine.
SWDEV-95191
Change-Id: If3cab3d493c0c96ed63721cdcf28247a1193887c
2016-06-30 18:23:29 +05:30
Aditya Atluri
5633cc34cc
moved half support to a source file
...
Change-Id: I7c09b41877e22c1b743dea25a585e5307427dafd
2016-06-30 18:23:29 +05:30