Ben Sander
daae691cdb
Enable HCC_OPT_FLUSH=1.
...
Requires an appropriate HCC with this support:
commit 38e392b517a46a09a3b1c8f388e6a0db3741c510
[ROCm/clr commit: c2baa4f6e6 ]
2017-06-07 00:15:05 -05:00
Sun, Peng
450a26e5d4
Improve HIP kernel names, attributes and codegen, contributed by Alex Voicu
...
Change-Id: I2cafbdc5a98e26c7f4fad84739c915e7dc09993c
[ROCm/clr commit: 3b6a863eef ]
2017-06-05 11:39:00 -05:00
Ben Sander
d1fe7f1683
Enable HIP_SYNC_NULL_STREAM=0 optimization.
...
[ROCm/clr commit: 344b6cb0c0 ]
2017-06-05 08:50:41 -05:00
Ben Sander
7237ed04f3
Fix HIP_SYNC_NULL_STREAM=0 mode.
...
- Fix null-stream sync
- hipStreamDestroy of null stream returns hipErrorInvalidResourceHandle
- Update documentation.
- Add tests for null stream sync, hipEventElapsedTime.
- Rename internal enum hipEventStatusRecorded to hipEventStatusComplete
- Refactor hipStreamWaitEvent to streamline control flow
[ROCm/clr commit: 823281dcba ]
2017-06-05 08:50:22 -05:00
Ben Sander
be21cd1a91
Update tests.
...
Fix some NVCC issues.
Add hipStreamSync2, record_event tests.
[ROCm/clr commit: 863b7c3f56 ]
2017-06-04 20:18:37 -05:00
Ben Sander
6aaeed821d
Update tests, add p2p coherency test.
...
[ROCm/clr commit: 15f54fb943 ]
2017-06-03 17:11:34 -05:00
Aditya Atluri
97fa7aeef6
added half data type and vector destructors
...
1. Added half data types to hip_fp16.h
2. Added destructor to vector data types
Change-Id: Id5ae76a663bb90a4bde2839ec79c58fbaee5072f
[ROCm/clr commit: fdcc223842 ]
2017-06-02 11:19:33 -05:00
emankov
6235e4bc7f
[HIPIFY] annotation
...
[ROCm/clr commit: c5f9758f4b ]
2017-06-02 16:33:48 +03:00
emankov
ef444588e1
[HIPIFY] rename legacy hipify perl script and its usage to hipify-perl
...
[ROCm/clr commit: e7779650e9 ]
2017-06-02 16:30:43 +03:00
Evgeny Mankov
b30b1acc5c
[HIPIFY] All CUDA 8.0.44 API functions update
...
(for both Driver and Runtime APIs)
1) P2P
   cuDeviceGetP2PAttribute cudaDeviceGetP2PAttribute
2) Memory Management
   cuMemPrefetchAsync cudaMemPrefetchAsync
   cuMemAdvise cudaMemAdvise
   cuMemRangeGetAttribute cudaMemRangeGetAttribute
   cuMemRangeGetAttributes cudaMemRangeGetAttributes
3) Streams (Driver API only, no analogues in Runtime API)
   cuStreamWaitValue32
   cuStreamWriteValue32
4) Texture Reference Management (Driver API only, no analogues in Runtime API)
   cuTexRefSetBorderColor
   cuTexRefGetBorderColor
[ROCm/clr commit: ee85243bcd ]
2017-06-01 21:08:33 +03:00
Siu Chi Chan
8514cf513a
fix atomicCAS: remove load for the return value after CAS
...
[ROCm/clr commit: 969931b1ce ]
2017-05-31 15:20:19 -04:00
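The atomicCAS fix above (8514cf513a) rests on a general property of compare-and-swap: the operation itself already yields the value that was in memory, so a separate load afterwards is redundant and can race with other writers. A minimal host-side sketch of that idea follows. It uses std::atomic rather than the HIP device intrinsic, and the helper name cas_return_old is hypothetical, not part of HIP:

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper mimicking CUDA/HIP-style atomicCAS semantics on the host:
// if addr == compare, store val; always return the value observed at addr.
inline int cas_return_old(std::atomic<int>& addr, int compare, int val) {
    int expected = compare;
    // On failure, compare_exchange_strong writes the observed value into
    // `expected`; on success, `expected` still equals the old value.
    addr.compare_exchange_strong(expected, val);
    // No extra addr.load() here: another thread could modify the value
    // between the CAS and the load, so a reload may report the wrong result.
    return expected;
}
```

The return value comes from the CAS itself in both the success and the failure case, which is the behaviour callers of atomicCAS expect.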
Evgeny Mankov
afbf55a9dc
[HIP] [HIPIFY] CUDA Driver API 8.0.44 JIT options support.
...
[ROCm/clr commit: 463c026976 ]
2017-05-31 18:55:29 +03:00
Maneesh Gupta
0e4e17db27
Fix hipMemoryAllocate test for single GPU
...
Change-Id: If121c18ab490ba125dc689ffc08a8839fd280c38
[ROCm/clr commit: 06ee0d3704 ]
2017-05-31 10:16:57 +05:30
Maneesh Gupta
2985fa3814
Disable rcbrtf, scalblnf, scalbnf in single precision device test
...
Change-Id: I8a250a64a0cb05132d022a11d9766ced9cdf11a7
[ROCm/clr commit: 2145e94049 ]
2017-05-31 10:16:19 +05:30
Maneesh Gupta
13896d6fb9
Disable rcbrt, scalbln and scalbn double precision device test
...
Change-Id: I46bd895701c46d3592b553090cafba99e41a2e2d
[ROCm/clr commit: da19087ae2 ]
2017-05-31 10:15:41 +05:30
Sandeep Kumar
ac8089e773
Add readme for inline asm and unroll cookbook samples
...
Change-Id: I71b7a5652c3dad181c5df60ab0dd1b81d79f1bfb
[ROCm/clr commit: f6b98854ba ]
2017-05-31 09:25:50 +05:30
Sandeep Kumar
c3167f463d
Add inline asm hip directed tests for v_add and v_mac
...
Change-Id: Ie5ace2e42d5da89b16e040537df2bb13d3883c6d
[ROCm/clr commit: c964a5f208 ]
2017-05-31 09:25:40 +05:30
Sandeep Kumar
be31ebb8a7
Add unroll and inline asm cookbook samples
...
Change-Id: Ie5a0fbb01b7fca82959090d89299533d49e092f1
[ROCm/clr commit: 5696eaf842 ]
2017-05-31 09:25:35 +05:30
Sandeep Kumar
b22fdeb171
Print msg for single gpu
...
Change-Id: I2d23c73542add8973990ba96592016726994422e
[ROCm/clr commit: e104c2e3bf ]
2017-05-31 09:25:17 +05:30
Ben Sander
81354999e8
Set event->_stream on hipHccModuleLaunchKernel path if start/stop used
...
Ensure _stream is always non-null in recorded events.
Fixes isDefaultStream fault.
[ROCm/clr commit: 6cc5dc0326 ]
2017-05-30 21:55:46 -05:00
Evgeny Mankov
54b3c90964
[HIPIFY] Add the rest of the CUDA Runtime API 8.0.44 Data structures.
...
+ sync with corresponding CUDA Driver API Data structures.
P.S.
There are no new changes in CUDA Runtime API 8.0.61 Data structures since 8.0.44.
[ROCm/clr commit: 997ed19bb8 ]
2017-05-30 19:45:59 +03:00
Evgeny Mankov
306dca2c78
[HIPIFY] Add the rest of the CUDA Driver API 8.0.44 Data structures.
...
+ Memory advise values
+ Memory Range Attributes
+ P2P Attributes
P.S.
There are no new changes in CUDA Driver API 8.0.61 Data structures since 8.0.44.
[ROCm/clr commit: a020eb76dd ]
2017-05-30 18:29:14 +03:00
Evgeny Mankov
ffb0d43b07
[HIPIFY] Add more CUDA Driver API 8.0.44 Data structures.
...
[ROCm/clr commit: ef86f943ac ]
2017-05-30 17:58:13 +03:00
Maneesh Gupta
06cdafe311
Disable normcdfinvf on __host__
...
Change-Id: If7bfc9826a09eb9b7675ea2a417b9418759b7912
[ROCm/clr commit: 445012d451 ]
2017-05-30 15:45:22 +05:30
Ben Sander
0ca3262f0a
Add event controls for release fences.
...
Env var: HIP_EVENT_SYS_RELEASE
Event allocation flags: hipEventReleaseToDevice, hipEventReleaseToSystem
(remove hipEventDisableSystemRelease)
Update test for new functionality.
[ROCm/clr commit: 942ec0eff8 ]
2017-05-27 16:02:34 -05:00
Ben Sander
d6e8f5bbdc
Cleanup hipEvent. (Intermediate checkpoint)
...
Support hipEventDisableSystemRelease flag.
Update test.
Remove stray printf
[ROCm/clr commit: c8178c6838 ]
2017-05-27 16:02:34 -05:00
Ben Sander
9442c6dd2d
Updates so hip compiles on CUDA.
...
[ROCm/clr commit: 8dc968f036 ]
2017-05-27 15:55:07 -05:00
Ben Sander
d9587ae2f0
Add isDefaultStream() accessor.
...
Fix code that checked for stream==nullptr after stream had been
resolved to a "true stream".
[ROCm/clr commit: b2b620c12b ]
2017-05-26 13:46:48 -05:00
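The bug described in d9587ae2f0 can be pictured as follows: once a null stream handle has been resolved to a real stream object, comparing the resolved pointer against nullptr can never detect the default stream, so the check has to move into an accessor on the object. A minimal sketch, in which the Stream struct and the resolve_stream helper are hypothetical stand-ins rather than the actual HIP internals:

```cpp
#include <cassert>

// Hypothetical stand-in for the runtime's internal stream object.
struct Stream {
    bool is_default;
    // Accessor that stays valid after the null handle has been resolved.
    bool isDefaultStream() const { return is_default; }
};

// Hypothetical resolver: maps the user-visible null handle to the
// real default-stream object, and passes other streams through.
inline Stream* resolve_stream(Stream* user_stream, Stream* default_stream) {
    return user_stream ? user_stream : default_stream;
}
```

After resolve_stream returns, the pointer is never null, so any later `stream == nullptr` test silently takes the wrong branch; asking the object via isDefaultStream() does not have that problem.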
Siu Chi Chan
6c3a05ac5b
fix hip_fast_dsqrt* to call a double fp sqrt function
...
[ROCm/clr commit: a3595d2e8c ]
2017-05-25 23:15:30 -04:00
Evgeny Mankov
ed54e3d0ee
[FIX] [HIPIFY] Add matchers for function return types.
...
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/73
Examples (https://github.com/thrust/thrust/blob/master/thrust/system/cuda/detail/trivial_copy.inl):
template<typename System1, typename System2>
cudaStream_t cuda_memcpy_stream(const thrust::cpp::execution_policy<System1> &,
                                const thrust::cuda::execution_policy<System2> &exec)
template<typename System1, typename System2>
cudaMemcpyKind cuda_memcpy_kind(const thrust::cuda::execution_policy<System1> &,
                                const thrust::cpp::execution_policy<System2> &)
[ROCm/clr commit: a19ecab3f2 ]
2017-05-24 18:25:40 +03:00
Ben Sander
b4363ffcba
Remove HIP_NUM_KERNELS_INFLIGHT. (redundant with HCC controls)
...
[ROCm/clr commit: 35212632e7 ]
2017-05-24 01:03:28 -05:00
Ben Sander
d302498787
Add hipHostMallocCoherent, hipHostMallocNonCoherent
...
Provide per-allocation control over coherent/non-coherent mem.
These override the default HIP_COHERENT_HOST_ALLOC setting.
[ROCm/clr commit: dda70ae514 ]
2017-05-24 00:48:10 -05:00
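The precedence described in d302498787 (per-allocation flag first, HIP_COHERENT_HOST_ALLOC as the fallback) can be sketched as a small resolution function. This is an illustration only: the flag bit values, the Coherency enum, and resolve_coherency are hypothetical, and rejecting a request that sets both flags is an assumption rather than something the commit message states:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits for illustration; the real values live in the HIP headers.
constexpr std::uint32_t kHostMallocCoherent    = 1u << 0;
constexpr std::uint32_t kHostMallocNonCoherent = 1u << 1;

enum class Coherency { Coherent, NonCoherent, Invalid };

// Resolve the coherency of a single host allocation.
// env_default_coherent models the process-wide HIP_COHERENT_HOST_ALLOC setting.
inline Coherency resolve_coherency(std::uint32_t flags, bool env_default_coherent) {
    const bool want_coherent     = (flags & kHostMallocCoherent) != 0;
    const bool want_non_coherent = (flags & kHostMallocNonCoherent) != 0;
    // Assumption: a contradictory request is treated as an error.
    if (want_coherent && want_non_coherent) return Coherency::Invalid;
    // An explicit per-allocation flag overrides the environment default.
    if (want_coherent) return Coherency::Coherent;
    if (want_non_coherent) return Coherency::NonCoherent;
    // No explicit flag: fall back to the environment default.
    return env_default_coherent ? Coherency::Coherent : Coherency::NonCoherent;
}
```

With this shape, an allocation that passes neither flag keeps following the environment variable, while one that passes a flag is unaffected by it.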
Ben Sander
ae983e1b09
Remove HIP_MAX_QUEUES (replaced with HCC_MAX_QUEUES)
...
[ROCm/clr commit: d43d57d39c ]
2017-05-23 23:48:01 -05:00
Ben Sander
59e07db865
Expand test to cover copy followed by event sync
...
[ROCm/clr commit: 92bd54d7b3 ]
2017-05-23 23:15:45 -05:00
Ben Sander
2e8625a208
Use accelerator_scope for create_marker and create_blocking_marker.
...
As an optimization when system scope is not needed.
[ROCm/clr commit: 2d5b3359c6 ]
2017-05-23 23:15:45 -05:00
Ben Sander
0cde8e5db4
Fix trace category for hipHostMalloc
...
[ROCm/clr commit: ca07615c37 ]
2017-05-23 23:15:45 -05:00
Evgeny Mankov
9e7a50b1e0
[FIX] [HIPIFY] Matcher for new operator is missing.
...
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/80
Example from CUDA 8.0.44 sample (CUDASamples\0_Simple\matrixMulDrv\matrixMulDrv.cpp):
CUjit_option *jitOptions = new CUjit_option[jitNumOptions];
where CUjit_option is an enum; should be:
hipJitOption *jitOptions = new hipJitOption[jitNumOptions];
[ROCm/clr commit: 21d74f09b9 ]
2017-05-23 19:45:38 +03:00
Aditya Atluri
e7813b1933
fixed erfinv build error as it is implemented in hcc
...
Change-Id: I27a512147c53f658a63fdf3e90f5e9cfac09ada8
[ROCm/clr commit: 490355203b ]
2017-05-23 09:32:19 -05:00
pensun
f848e9f117
fix GGL helper header file, reorder for C++17
...
Change-Id: I3d9ddfe670bf7e3e8e7bd85e52cc61f48c19c213
[ROCm/clr commit: 2523f8a492 ]
2017-05-22 08:52:43 -05:00
Evgeny Mankov
aebc80c8e0
[FIX] [HIPIFY] Matcher for pointer to enum var declaration is missing.
...
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/79
Example from CUDA 8.0.44 sample (CUDASamples\0_Simple\matrixMulDrv\matrixMulDrv.cpp):
CUjit_option *jitOptions = new CUjit_option[jitNumOptions];
where CUjit_option is an enum; should be:
hipJitOption *jitOptions = new hipJitOption[jitNumOptions];
[TODO]
1. new CUjit_option -> new hipJitOption.
Matcher for new operator is missing:
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/79
2. Merge matchers cudaEnumDecl and cudaEnumVarPtr.
[ROCm/clr commit: 51b7420359 ]
2017-05-19 17:39:09 +03:00
Evgeny Mankov
052f630bd3
[HIP] [HIPIFY] [FIX] cuModuleLoadDataEx -> hipModuleLoadDataEx
...
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/81
1. Do not use JIT options on HCC path, call hipModuleLoadData instead.
2. NVCC path is unchanged and still calls cuModuleLoadDataEx with all options.
3. Get rid of manual hipification, based on #ifdef #else for NVCC/HIP.
4. Update documentation accordingly.
[ROCm/clr commit: ae9f14ef9c ]
2017-05-19 17:22:14 +03:00
Aditya Atluri
905ab9a689
fixed vector type issues by reverting to old code, changed __half2 to map to vector types in llvm
...
Change-Id: I7317408c25e8c1a0c02a346042c9137e160c8bbd
[ROCm/clr commit: 0433a2e608 ]
2017-05-18 10:51:07 -05:00
Ben Sander
8c50285d30
Return precise address for hipHostGetDevicePointer.
...
[ROCm/clr commit: ee37a31799 ]
2017-05-17 07:36:06 -05:00
Aditya Atluri
0318e91450
changed vector types to make sure they generate proper LLVM vector types
...
Change-Id: I6c4616dae137dc4eac35e5827dc5b7f3251e0247
[ROCm/clr commit: 93fa174900 ]
2017-05-16 21:35:40 -05:00
Ben Sander
33cfd1a35e
Make hipMultiThreadStreams1 test a little harsher.
...
Fail faster if synchronization rules are violated.
Run vectorAddRevers to read last elements of array first - if the
vector add kernel starts before preceding copy finishes we
will read stale data and flag the error.
Increase default array sizes, so synchronization errors are more easily
exposed.
[ROCm/clr commit: 848d77ebb5 ]
2017-05-16 19:04:25 -05:00
Ben Sander
c098221a46
Fix HIP_TRACE_API so kernel launch is only printed when requested.
...
[ROCm/clr commit: fad9104b5f ]
2017-05-16 19:04:25 -05:00
Ben Sander
4ac6ac9d1d
Add initial HIP_SYNC_NULL_STREAM=0 mode.
...
This eliminates host-synchronization for null stream. Instead, the
null-stream uses GPU-side events to wait for other streams.
Default is OFF pending additional testing.
Add enhanced null-stream test.
Also refine HIP_TRACE_API.
[ROCm/clr commit: 8bc6ee5932 ]
2017-05-16 19:04:25 -05:00
Ben Sander
db102ab82f
Update tests README
...
[ROCm/clr commit: 5d2072aba1 ]
2017-05-16 19:04:25 -05:00
Ben Sander
b8b6cfe02e
Doc update - split hip_debugging.md into separate file.
...
[ROCm/clr commit: a5a12942b2 ]
2017-05-16 19:04:25 -05:00
Ben Sander
a8d917c092
split debugging into separate .md file
...
[ROCm/clr commit: 61c8633113 ]
2017-05-16 19:04:25 -05:00