Alex Voicu
d035cb9000
Merge branch 'master' of https://github.com/ROCm-Developer-Tools/HIP into feature_get_alignment_and_size_from_metadata
2018-10-30 23:34:46 +00:00
Alex Voicu
185fa122ed
Merge branch 'master' of https://github.com/ROCm-Developer-Tools/HIP into feature_get_alignment_and_size_from_metadata
2018-10-28 17:02:10 +00:00
Alex Voicu
fe1e963299
Rely on code object metadata for kernarg argument alignof and sizeof.
2018-10-28 17:01:00 +00:00
Alex Voicu
e4181b85be
hipLaunchKernel, hipLaunchParm are deprecated, and shall be removed.
2018-10-25 13:32:17 +01:00
Maneesh Gupta
f6f160fa6b
Merge pull request #723 from mangupta/fix_double_shfl
...
Fix logic for double variants of __shfl*
2018-10-25 06:01:38 +05:30
Maneesh Gupta
19404e603d
Fix logic for double variants of __shfl*
...
Change-Id: I604f00b54cf4bd9c5f26ca6fa680fca5e9629417
2018-10-24 12:39:09 +05:30
Maneesh Gupta
0703a2d0f0
Make HIP functional again with HCC from ROCm 1.9.x
...
Change-Id: I214acdfd0b79dcf783993e44fe31baee64fd4dc3
2018-10-24 10:41:56 +05:30
Maneesh Gupta
4a00b244a3
Merge pull request #705 from ROCm-Developer-Tools/feature_minimal_changes_for_hc_next
...
Feature minimal changes for hc next
2018-10-19 06:58:31 +05:30
Alex Voicu
5ccaf2fa7d
Dumb workaround is still needed, so add it back.
2018-10-18 15:33:46 +01:00
Alex Voicu
fe959f7bd7
Re-sync with upstream.
2018-10-18 12:27:03 +01:00
Maneesh Gupta
1a5025c57e
Merge pull request #688 from aaronenyeshi/fix-sinf-cosf-ocml
...
Use sinf and cosf from ocml device libs
2018-10-18 16:39:20 +05:30
Maneesh Gupta
d133493669
Merge pull request #692 from whchung/hip-reinit-take2
...
HIP program state re-initialization logic (take 2)
2018-10-18 12:06:41 +05:30
Maneesh Gupta
c24b06fa0a
Merge pull request #703 from mangupta/stream_create_with_priority
...
Implementation for stream priority
2018-10-17 10:53:43 +05:30
Maneesh Gupta
dbe4431d98
Merge pull request #702 from aaronenyeshi/fix-missing-irif-lib
...
Replace IRIF fences with atomic_work_item_fence
2018-10-17 10:53:27 +05:30
Maneesh Gupta
64d1cf86b7
Add missing hipHostRegister flags on nvcc path
...
Change-Id: I69f09204d9c544935104d4168ab8d3626666a623
2018-10-15 15:30:24 +05:30
Alex Voicu
5312336ce2
Minimal should mean minimal.
2018-10-11 00:21:41 +01:00
Alex Voicu
3e4dbd32a1
Address Aaron's comments
2018-10-11 00:03:01 +01:00
Alex Voicu
4bc40551b5
Merge branch 'master' of https://github.com/ROCm-Developer-Tools/HIP into feature_minimal_changes_for_hc_next
2018-10-10 11:44:09 +01:00
Alex Voicu
ca375cb8c5
Re-sync with upstream.
2018-10-10 11:43:49 +01:00
Maneesh Gupta
da64156fb2
Implementation for stream priority
...
- Requires ROCm 1.9.x or higher
- Requires HCC with PR#886 merged
Change-Id: Id7c95ea091ee610e80c9ad815f1cb989cba570ca
2018-10-05 16:27:46 +05:30
Aaron Enye Shi
0787f74ac2
Replace IRIF fences with atomic_work_item_fence
2018-10-04 21:47:28 +00:00
Aaron Enye Shi
5dd35576f6
Fix hip_vector_types.h for long long vectors
...
There was a missing long in the declaration for [u]longlongN types.
2018-10-03 13:57:52 -04:00
Wen-Heng (Jack) Chung
dab1a0f9db
HIP program state re-initialization logic
...
This commit supports kernels loaded dynamically, through means such as
dlopen(), after the HIP runtime initializes.
2018-09-26 19:48:47 +00:00
Aaron Enye Shi
77c07d4118
Use sinf and cosf from ocml device libs
...
Using the llvm_amdgcn builtin fails to produce accurate values; we should move to the ocml device library versions.
2018-09-25 19:31:39 +00:00
Maneesh Gupta
3d67c9f952
Merge pull request #614 from ROCm-Developer-Tools/fma
...
Add overloading resolution functions for fma
2018-09-20 13:38:03 +05:30
Yaxun Sam Liu
a5c961e26c
Silence warnings about duplicate static keyword
...
static is already in __DEVICE__, so it should be removed.
2018-09-19 10:39:45 -04:00
Yaxun Sam Liu
bd622a4b4a
Add fma function with float and _Float16 arguments
2018-09-19 09:59:33 -04:00
Yaxun Sam Liu
cf184460e9
Fix build failure of hipTestHalf and hipTestIncludeMath for hip-clang
2018-09-18 21:00:15 -04:00
Maneesh Gupta
9ee70fca8a
Merge pull request #672 from iotamudelta/fp16_fix
...
Only LLVM6 and higher contain the necessary intrinsics.
2018-09-18 08:43:33 +05:30
Maneesh Gupta
32787fa1fc
Merge pull request #674 from mangupta/fix_dtests_on_nvcc
...
[dtests] Fix hipTestClock, hipTestNew, hipTestGlobalVariable, hipSimpleAtomicsTest & hipTestIncludeMath tests on nvcc path
2018-09-18 07:50:52 +05:30
Maneesh Gupta
5cf281071d
Merge pull request #677 from yxsamliu/fix-launch-decay
...
Fix hipLaunchKernelGGL for hip-clang
2018-09-18 07:50:37 +05:30
Yaxun Sam Liu
cdfd82f1de
Disable device code for gcc in hip_memory.h
...
This device code should only be seen by HCC or hip-clang. It caused a build failure
for the HIP-VDI runtime and should be disabled for gcc.
2018-09-17 16:50:42 -04:00
Yaxun Sam Liu
fc228c7ea6
Fix hipLaunchKernelGGL for hip-clang
...
Do not decay the function pointer type of the kernel argument passed to hipLaunchKernelGGL
and hipLaunchKernel; otherwise some type information is lost, which may cause
type inference failure for the template.
This issue caused a compilation error in FeatureLPPooling in Caffe2/PyTorch, and this patch
fixes it.
2018-09-17 11:20:41 -04:00
Maneesh Gupta
cef5261fa9
Add mappings for __clock* in nvcc_detail/hip_runtime.h
...
Change-Id: Ibcecf52f3e69298268d921efc036090544fa0ed0
2018-09-17 15:23:30 +05:30
Alex Voicu
c6720e882b
Align with HC Next.
2018-09-17 11:50:29 +03:00
Maneesh Gupta
66f863d1f3
Merge branch 'master' into support-malloc
2018-09-17 10:17:25 +05:30
Maneesh Gupta
cb348421d7
Merge pull request #650 from ROCm-Developer-Tools/hip-clang-new
...
Support placement new in hip-clang
2018-09-15 11:21:01 +05:30
Maneesh Gupta
8fe4e22b19
Merge pull request #665 from aaronenyeshi/fix-min-funcs
...
Use templates for min to prevent ambiguity
2018-09-14 13:21:38 +05:30
Aaron Enye Shi
6b811ca6d1
Fix Tensorflow ambiguous min issue
2018-09-13 23:16:20 +00:00
Johannes M Dieterich
cf12a9c049
Only LLVM6 and higher contain the necessary intrinsics.
2018-09-13 13:55:43 -05:00
Maneesh Gupta
aed5ad31ba
Merge pull request #669 from ROCm-Developer-Tools/feature_automatic_cast
...
Remove potential for mismatch between runtime passed actuals and defined formals
2018-09-13 07:54:22 +05:30
Maneesh Gupta
411e53a665
Merge pull request #661 from yxsamliu/add-empty-printf
...
Add empty printf for hip-clang
2018-09-13 07:54:03 +05:30
Aaron Enye Shi
894cbdd749
Avoid AMP-restrict call to CPU-restrict
2018-09-12 14:54:31 +00:00
Alex Voicu
cdfea3ef7b
Remove potential for mismatch between runtime passed actuals and defined formals.
2018-09-12 10:30:48 +01:00
Maneesh Gupta
8249cf037b
Merge pull request #664 from lcskrishna/master
...
added __host__ to float2half and half2float functions.
2018-09-12 14:50:01 +05:30
Maneesh Gupta
133d665a88
Merge pull request #663 from yxsamliu/fix-launch
...
Use template for hipLaunchKernelGGL for hip-clang
2018-09-12 14:49:38 +05:30
carlushuang
d577f27d1a
fix __longlong_as_double() problem, return the double value
...
the previous version returned a long long value *as* a double, hence we could get the wrong result.
this also affects atomicAdd(double * ...), which uses a long long pointer to mimic a double pointer.
Signed-off-by: carlushuang <carlus.huang@amd.com>
2018-09-12 13:25:00 +08:00
Aaron Enye Shi
ffd89dde9c
Avoid host min func conflict with gcc min
2018-09-11 18:48:31 +00:00
Aaron Enye Shi
0121ec13aa
Use templates for min to prevent ambiguity
2018-09-11 18:21:54 +00:00
Yaxun Sam Liu
9e9a93e10a
Use template for hipLaunchKernelGGL for hip-clang
2018-09-07 16:20:00 -04:00