Ben Sander
65f91dbb0d
Enable HIP_SYNC_NULL_STREAM=0 optimization.
...
[ROCm/hip commit: decf3eee18 ]
2017-06-05 08:50:41 -05:00
Ben Sander
445042f916
Fix HIP_SYNC_NULL_STREAM=0 mode.
...
- Fix null-stream sync
- hipStreamDestroy of null stream returns hipErrorInvalidResourceHandle
- Update documentation.
- Add tests for null stream sync, hipEventElapsedTime.
- Rename internal enum hipEventStatusRecorded to hipEventStatusComplete
- refactor hipStreamWaitEvent to streamline control-flow
[ROCm/hip commit: 39c18e5e5f ]
2017-06-05 08:50:22 -05:00
Ben Sander
f3950e0748
Update tests.
...
Fix some NVCC issues.
Add hipStreamSync2, record_event tests.
[ROCm/hip commit: 8ce6d17983 ]
2017-06-04 20:18:37 -05:00
Ben Sander
a1bdd5f585
Update tests, add p2p coherency test.
...
[ROCm/hip commit: d5c1616324 ]
2017-06-03 17:11:34 -05:00
Aditya Atluri
47ec040140
added half data type and vector destructors
...
1. Added half data types to hip_fp16.h
2. Added destructor to vector data types
Change-Id: Id5ae76a663bb90a4bde2839ec79c58fbaee5072f
[ROCm/hip commit: cb7c4c423c ]
2017-06-02 11:19:33 -05:00
emankov
737e0cc93e
[HIPIFY] annotation
...
[ROCm/hip commit: ccc4cd1a3e ]
2017-06-02 16:33:48 +03:00
emankov
ca707fe693
[HIPIFY] rename legacy hipify perl script and its usage to hipify-perl
...
[ROCm/hip commit: 4d035caedf ]
2017-06-02 16:30:43 +03:00
Evgeny Mankov
0c12f09ba5
[HIPIFY] All CUDA 8.0.44 API functions update
...
(for both Driver and Runtime APIs)
1) P2P
cuDeviceGetP2PAttribute cudaDeviceGetP2PAttribute
2) Memory Management
cuMemPrefetchAsync cudaMemPrefetchAsync
cuMemAdvise cudaMemAdvise
cuMemRangeGetAttribute cudaMemRangeGetAttribute
cuMemRangeGetAttributes cudaMemRangeGetAttributes
3) Streams (Driver API only, no analogues in Runtime API)
cuStreamWaitValue32
cuStreamWriteValue32
4) Texture Reference Management (Driver API only, no analogues in Runtime API)
cuTexRefSetBorderColor
cuTexRefGetBorderColor
[ROCm/hip commit: 4a5484c616 ]
2017-06-01 21:08:33 +03:00
Siu Chi Chan
cc54bc4d85
fix atomicCAS: remove load for the return value after CAS
...
[ROCm/hip commit: 741eb844fe ]
2017-05-31 15:20:19 -04:00
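The atomicCAS fix above relies on the compare-and-swap itself yielding the previous value, so no separate load is needed afterwards (a later load could observe a value written by another thread). The following is a minimal host-side sketch of that idea using `std::atomic`. It is not the HIP implementation, and the names `cas_return_old` and `cas_selfcheck` are hypothetical, introduced only for illustration.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical host-side stand-in for a device atomicCAS: returns the value
// that was stored at *address immediately before the compare-and-swap attempt.
inline int cas_return_old(std::atomic<int>* address, int compare, int val) {
    int expected = compare;
    // On failure, compare_exchange_strong writes the observed value into
    // `expected`; on success, `expected` still equals the old value.
    address->compare_exchange_strong(expected, val);
    // The old value comes from the CAS itself; no second load is issued.
    return expected;
}

// Small self-check covering both the successful and the failed swap.
inline void cas_selfcheck() {
    std::atomic<int> x{5};
    assert(cas_return_old(&x, 5, 9) == 5);  // swap happens, old value returned
    assert(cas_return_old(&x, 5, 1) == 9);  // no swap, current value returned
    assert(x.load() == 9);
}
```

Calling `cas_selfcheck()` exercises both paths; the return value is the pre-CAS contents in either case.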
Evgeny Mankov
7a3befc555
[HIP] [HIPIFY] CUDA Driver API 8.0.44 JIT options support.
...
[ROCm/hip commit: 6e99e388ea ]
2017-05-31 18:55:29 +03:00
Maneesh Gupta
404c3f6751
Fix hipMemoryAllocate test for single GPU
...
Change-Id: If121c18ab490ba125dc689ffc08a8839fd280c38
[ROCm/hip commit: 502a74fcd6 ]
2017-05-31 10:16:57 +05:30
Maneesh Gupta
be2c61b72a
Disable rcbrtf, scalblnf, scalbnf in single precision device test
...
Change-Id: I8a250a64a0cb05132d022a11d9766ced9cdf11a7
[ROCm/hip commit: 5cdd1b2bf5 ]
2017-05-31 10:16:19 +05:30
Maneesh Gupta
ed41fcd51a
Disable rcbrt, scalbln and scalbn double precision device test
...
Change-Id: I46bd895701c46d3592b553090cafba99e41a2e2d
[ROCm/hip commit: 4919863d3e ]
2017-05-31 10:15:41 +05:30
Sandeep Kumar
ee62cf4869
Add readme for inline asm and unroll cookbook samples
...
Change-Id: I71b7a5652c3dad181c5df60ab0dd1b81d79f1bfb
[ROCm/hip commit: 0154c97ddd ]
2017-05-31 09:25:50 +05:30
Sandeep Kumar
96a7f1853e
Add inline asm hip directed tests for v_add and v_mac
...
Change-Id: Ie5ace2e42d5da89b16e040537df2bb13d3883c6d
[ROCm/hip commit: 830f2b100d ]
2017-05-31 09:25:40 +05:30
Sandeep Kumar
1859451523
Add unroll and inline asm cookbook samples
...
Change-Id: Ie5a0fbb01b7fca82959090d89299533d49e092f1
[ROCm/hip commit: e4f0b28bb9 ]
2017-05-31 09:25:35 +05:30
Sandeep Kumar
a28399f456
Print msg for single gpu
...
Change-Id: I2d23c73542add8973990ba96592016726994422e
[ROCm/hip commit: dfcba01db6 ]
2017-05-31 09:25:17 +05:30
Ben Sander
59833bea80
Set event->_stream on hipHccModuleLaunchKernel path if start/stop used
...
Ensure _stream is always non-null in recorded events.
Fixes isDefaultStream fault.
[ROCm/hip commit: cb60763737 ]
2017-05-30 21:55:46 -05:00
Evgeny Mankov
f99ab93b61
[HIPIFY] Add the rest of the CUDA Runtime API 8.0.44 Data structures.
...
+ sync with corresponding CUDA Driver API Data structures.
P.S.
There are no new changes in CUDA Runtime API 8.0.61 Data structures since 8.0.44.
[ROCm/hip commit: 063539308e ]
2017-05-30 19:45:59 +03:00
Evgeny Mankov
8a05cf882f
[HIPIFY] Add the rest of the CUDA Driver API 8.0.44 Data structures.
...
+ Memory advise values
+ Memory Range Attributes
+ P2P Attributes
P.S.
There are no new changes in CUDA Driver API 8.0.61 Data structures since 8.0.44.
[ROCm/hip commit: 1cc5f42e34 ]
2017-05-30 18:29:14 +03:00
Evgeny Mankov
8574437d36
[HIPIFY] Add more CUDA Driver API 8.0.44 Data structures.
...
[ROCm/hip commit: 3e99bc23e7 ]
2017-05-30 17:58:13 +03:00
Maneesh Gupta
56bf96c6df
Disable normcdfinvf on __host__
...
Change-Id: If7bfc9826a09eb9b7675ea2a417b9418759b7912
[ROCm/hip commit: 4ff01c971f ]
2017-05-30 15:45:22 +05:30
Ben Sander
b3e01ed74f
Add event controls for release fences.
...
Env var : HIP_EVENT_SYS_RELEASE
Event allocation flags : hipEventReleaseToDevice, hipEventReleaseToSystem
(remove hipEventDisableSystemRelease)
Update test for new functionality.
[ROCm/hip commit: ec7102f461 ]
2017-05-27 16:02:34 -05:00
Ben Sander
e24627aa80
Cleanup hipEvent. (Intermediate checkpoint)
...
Support hipEventDisableSystemRelease flag.
Update test.
Remove stray printf
[ROCm/hip commit: 620eb30691 ]
2017-05-27 16:02:34 -05:00
Ben Sander
c88190d698
Updates so hip compiles on CUDA.
...
[ROCm/hip commit: be8d0ba644 ]
2017-05-27 15:55:07 -05:00
Ben Sander
715aeef97c
Add isDefaultStream() accessor.
...
Fix code that checked for stream==nullptr after stream had been
resolved to a "true stream".
[ROCm/hip commit: b251d72917 ]
2017-05-26 13:46:48 -05:00
Siu Chi Chan
ef1a8c2788
fix hip_fast_dsqrt* to call a double fp sqrt function
...
[ROCm/hip commit: 1dce01f9bb ]
2017-05-25 23:15:30 -04:00
Evgeny Mankov
bc9b970f82
[FIX] [HIPIFY] Add matchers for function return types.
...
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/73
Examples (https://github.com/thrust/thrust/blob/master/thrust/system/cuda/detail/trivial_copy.inl ):
template<typename System1,
         typename System2>
cudaStream_t cuda_memcpy_stream(const thrust::cpp::execution_policy<System1> &,
                                const thrust::cuda::execution_policy<System2> &exec)
template<typename System1,
         typename System2>
cudaMemcpyKind cuda_memcpy_kind(const thrust::cuda::execution_policy<System1> &,
                                const thrust::cpp::execution_policy<System2> &)
[ROCm/hip commit: c863215611 ]
2017-05-24 18:25:40 +03:00
Ben Sander
b7b7c7b7ac
Remove HIP_NUM_KERNELS_INFLIGHT. (redundant with HCC controls)
...
[ROCm/hip commit: 578d430bb3 ]
2017-05-24 01:03:28 -05:00
Ben Sander
5b9ce41633
Add hipHostMallocCoherent, hipHostMallocNonCoherent
...
Provide per-allocation control over coherent/non-coherent mem.
These override the default HIP_COHERENT_HOST_ALLOC setting.
[ROCm/hip commit: 75f691ec2f ]
2017-05-24 00:48:10 -05:00
Ben Sander
07865c3a02
Remove HIP_MAX_QUEUES (replaced with HCC_MAX_QUEUES)
...
[ROCm/hip commit: d0ef9d8462 ]
2017-05-23 23:48:01 -05:00
Ben Sander
247c34195f
Expand test to cover copy followed by event sync
...
[ROCm/hip commit: 236ce70e94 ]
2017-05-23 23:15:45 -05:00
Ben Sander
97bdbe6590
Use accelerator_scope for create_marker and create_blocking_marker.
...
As an optimization when system scope is not needed.
[ROCm/hip commit: 2b253a48b6 ]
2017-05-23 23:15:45 -05:00
Ben Sander
1e0046f737
Fix trace category for hipHostMalloc
...
[ROCm/hip commit: 7cfe07cff4 ]
2017-05-23 23:15:45 -05:00
Evgeny Mankov
0a43773e0d
[FIX] [HIPIFY] Matcher for new operator is missing.
...
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/80
Example from CUDA 8.0.44 sample (CUDASamples\0_Simple\matrixMulDrv\matrixMulDrv.cpp):
CUjit_option *jitOptions = new CUjit_option[jitNumOptions];
where CUjit_option is enum, should be:
hipJitOption *jitOptions = new hipJitOption[jitNumOptions];
[ROCm/hip commit: 3d973dc4da ]
2017-05-23 19:45:38 +03:00
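The rename described above (`CUjit_option` becoming `hipJitOption` on both sides of the `new` expression) can be illustrated with a toy, text-based pass. This is only a sketch under stated assumptions: the real HIPIFY tool works on the Clang AST with matchers, not on regular expressions, and the function names `toy_hipify` and `toy_hipify_selfcheck` are hypothetical. The `cudaStream_t` and `cudaMemcpyKind` entries are included as further familiar CUDA-to-HIP type renames.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical toy stand-in for a HIPIFY rename pass. It rewrites whole
// identifiers in source text; it does not parse C++ the way Clang does.
inline std::string toy_hipify(std::string src) {
    static const std::vector<std::pair<std::string, std::string>> renames = {
        {"CUjit_option", "hipJitOption"},
        {"cudaStream_t", "hipStream_t"},
        {"cudaMemcpyKind", "hipMemcpyKind"},
    };
    for (const auto& r : renames) {
        // \b restricts the match to whole identifiers, so longer names
        // that merely start with a CUDA name are left untouched.
        const std::regex word("\\b" + r.first + "\\b");
        src = std::regex_replace(src, word, r.second);
    }
    return src;
}

// Self-check using the example line quoted in the commit message.
inline void toy_hipify_selfcheck() {
    assert(toy_hipify("CUjit_option *jitOptions = new CUjit_option[jitNumOptions];") ==
           "hipJitOption *jitOptions = new hipJitOption[jitNumOptions];");
}
```

A purely textual pass like this would also rewrite names inside comments and string literals, which is one reason an AST-based matcher is the more robust design.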
Aditya Atluri
6e117234bf
fixed erfinv build error as it is implemented in hcc
...
Change-Id: I27a512147c53f658a63fdf3e90f5e9cfac09ada8
[ROCm/hip commit: 0559fc69e9 ]
2017-05-23 09:32:19 -05:00
pensun
b19e862dc4
fix GGL helper header file, reorder for C++17
...
Change-Id: I3d9ddfe670bf7e3e8e7bd85e52cc61f48c19c213
[ROCm/hip commit: 6d0f58b939 ]
2017-05-22 08:52:43 -05:00
Evgeny Mankov
cb7e073c42
[FIX] [HIPIFY] Matcher for pointer to enum var declaration is missing.
...
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/79
Example from CUDA 8.0.44 sample (CUDASamples\0_Simple\matrixMulDrv\matrixMulDrv.cpp):
CUjit_option *jitOptions = new CUjit_option[jitNumOptions];
where CUjit_option is enum, should be:
hipJitOption *jitOptions = new hipJitOption[jitNumOptions];
[TODO]
1. new CUjit_option -> new hipJitOption.
Matcher for new operator is missing:
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/79
2. Merge matchers cudaEnumDecl and cudaEnumVarPtr.
[ROCm/hip commit: b5a1d47e68 ]
2017-05-19 17:39:09 +03:00
Evgeny Mankov
042de3e175
[HIP] [HIPIFY] [FIX] cuModuleLoadDataEx -> hipModuleLoadDataEx
...
https://github.com/GPUOpen-ProfessionalCompute-Tools/HIP/issues/81
1. Do not use JIT options on HCC path, call hipModuleLoadData instead.
2. NVCC path is unchanged: it still calls cuModuleLoadDataEx with all options.
3. Get rid of manual hipification, based on #ifdef #else for NVCC/HIP.
4. Update documentation accordingly.
[ROCm/hip commit: 270f643c9c ]
2017-05-19 17:22:14 +03:00
Aditya Atluri
721cb0f7db
fixed vector type issues by reverting to old code, changed __half2 to map to vector types in llvm
...
Change-Id: I7317408c25e8c1a0c02a346042c9137e160c8bbd
[ROCm/hip commit: bdc08fcf10 ]
2017-05-18 10:51:07 -05:00
Ben Sander
324dfb870f
Return precise address for hipHostGetDevicePointer.
...
[ROCm/hip commit: 46030bb2d2 ]
2017-05-17 07:36:06 -05:00
Aditya Atluri
4dbebe0409
changed vector types to make sure they generate proper LLVM vector types
...
Change-Id: I6c4616dae137dc4eac35e5827dc5b7f3251e0247
[ROCm/hip commit: 9dceccf136 ]
2017-05-16 21:35:40 -05:00
Ben Sander
bd7a374f20
Make hipMultiThreadStreams1 test a little harsher.
...
Fail faster if synchronization rules are violated.
Run vectorAddReverse to read the last elements of the array first: if the
vector add kernel starts before the preceding copy finishes, we
will read stale data and flag the error.
Increase default array sizes so synchronization errors are more easily
exposed.
[ROCm/hip commit: 2e1fec47ab ]
2017-05-16 19:04:25 -05:00
Ben Sander
f3dcff0906
Fix HIP_TRACE_API so kernel launches are only printed when requested.
...
[ROCm/hip commit: 427f8472aa ]
2017-05-16 19:04:25 -05:00
Ben Sander
a55ce5bee4
Add initial HIP_SYNC_NULL_STREAM=0 mode.
...
This eliminates host-synchronization for null stream. Instead, the
null-stream uses GPU-side events to wait for other streams.
Default is OFF pending additional testing.
Add enhanced null-stream test.
Also refine HIP_TRACE_API.
[ROCm/hip commit: 27877f8854 ]
2017-05-16 19:04:25 -05:00
Ben Sander
cfe81dfbf4
Update tests README
...
[ROCm/hip commit: c67b828a5a ]
2017-05-16 19:04:25 -05:00
Ben Sander
3107e70ea2
Doc update - split hip_debugging.md into separate file.
...
[ROCm/hip commit: 704ba30b32 ]
2017-05-16 19:04:25 -05:00
Ben Sander
7d07d804d8
split debugging into separate .md file
...
[ROCm/hip commit: db097ab392 ]
2017-05-16 19:04:25 -05:00
Ben Sander
2cf05ad54b
Add HIP_TRACE_API=4. Only display memory allocation/free apis.
...
[ROCm/hip commit: 0edab14139 ]
2017-05-16 19:04:25 -05:00
emankov
dce8786e26
[HIPIFY] *.inl extension support for batch processing
...
[ROCm/hip commit: 30000ef130 ]
2017-05-16 19:52:39 +03:00