Commit Graph

864 Commits

Author SHA1 Message Date
Alex Voicu d035cb9000 Merge branch 'master' of https://github.com/ROCm-Developer-Tools/HIP into feature_get_alignment_and_size_from_metadata 2018-10-30 23:34:46 +00:00
Alex Voicu 185fa122ed Merge branch 'master' of https://github.com/ROCm-Developer-Tools/HIP into feature_get_alignment_and_size_from_metadata 2018-10-28 17:02:10 +00:00
Alex Voicu fe1e963299 Rely on code object metadata for kernarg arguments alignof and sizeof. 2018-10-28 17:01:00 +00:00
Alex Voicu e4181b85be hipLaunchKernel, hipLaunchParm are deprecated, and shall be removed. 2018-10-25 13:32:17 +01:00
Maneesh Gupta f6f160fa6b Merge pull request #723 from mangupta/fix_double_shfl
Fix logic for double variants of __shfl*
2018-10-25 06:01:38 +05:30
Maneesh Gupta 19404e603d Fix logic for double variants of __shfl*
Change-Id: I604f00b54cf4bd9c5f26ca6fa680fca5e9629417
2018-10-24 12:39:09 +05:30
Maneesh Gupta 0703a2d0f0 Make HIP functional again with HCC from ROCm 1.9.x
Change-Id: I214acdfd0b79dcf783993e44fe31baee64fd4dc3
2018-10-24 10:41:56 +05:30
Maneesh Gupta 4a00b244a3 Merge pull request #705 from ROCm-Developer-Tools/feature_minimal_changes_for_hc_next
Feature minimal changes for hc next
2018-10-19 06:58:31 +05:30
Alex Voicu 5ccaf2fa7d Dumb workaround is still needed, so add it back. 2018-10-18 15:33:46 +01:00
Alex Voicu fe959f7bd7 Re-sync with upstream. 2018-10-18 12:27:03 +01:00
Maneesh Gupta 1a5025c57e Merge pull request #688 from aaronenyeshi/fix-sinf-cosf-ocml
Use sinf and cosf from ocml device libs
2018-10-18 16:39:20 +05:30
Maneesh Gupta d133493669 Merge pull request #692 from whchung/hip-reinit-take2
HIP program state re-initialization logic (take 2)
2018-10-18 12:06:41 +05:30
Maneesh Gupta c24b06fa0a Merge pull request #703 from mangupta/stream_create_with_priority
Implementation for stream priority
2018-10-17 10:53:43 +05:30
Maneesh Gupta dbe4431d98 Merge pull request #702 from aaronenyeshi/fix-missing-irif-lib
Replace IRIF fences with atomic_work_item_fence
2018-10-17 10:53:27 +05:30
Maneesh Gupta 64d1cf86b7 Add missing hipHostRegister flags on nvcc path
Change-Id: I69f09204d9c544935104d4168ab8d3626666a623
2018-10-15 15:30:24 +05:30
Alex Voicu 5312336ce2 Minimal should mean minimal. 2018-10-11 00:21:41 +01:00
Alex Voicu 3e4dbd32a1 Address Aaron's comments 2018-10-11 00:03:01 +01:00
Alex Voicu 4bc40551b5 Merge branch 'master' of https://github.com/ROCm-Developer-Tools/HIP into feature_minimal_changes_for_hc_next 2018-10-10 11:44:09 +01:00
Alex Voicu ca375cb8c5 Re-sync with upstream. 2018-10-10 11:43:49 +01:00
Maneesh Gupta da64156fb2 Implementation for stream priority
- Requires ROCm 1.9.x or higher
- Requires HCC with PR#886 merged

Change-Id: Id7c95ea091ee610e80c9ad815f1cb989cba570ca
2018-10-05 16:27:46 +05:30
Aaron Enye Shi 0787f74ac2 Replace IRIF fences with atomic_work_item_fence 2018-10-04 21:47:28 +00:00
Aaron Enye Shi 5dd35576f6 Fix hip_vector_types.h for long long vectors
There was a missing long in the declaration for [u]longlongN types.
2018-10-03 13:57:52 -04:00
Wen-Heng (Jack) Chung dab1a0f9db HIP program state re-initialization logic
This commit is to support kernels dynamically loaded thru means such as
dlopen() after HIP runtime initializes.
2018-09-26 19:48:47 +00:00
Aaron Enye Shi 77c07d4118 Use sinf and cosf from ocml device libs
Using the llvm_amdgcn builtin fails to produce accurate values; we should move to using the ocml device library versions.
2018-09-25 19:31:39 +00:00
Maneesh Gupta 3d67c9f952 Merge pull request #614 from ROCm-Developer-Tools/fma
Add overloading resolution functions for fma
2018-09-20 13:38:03 +05:30
Yaxun Sam Liu a5c961e26c Silence warnings about duplicate static keyword
static is already in __DEVICE__, so should be removed.
2018-09-19 10:39:45 -04:00
Yaxun Sam Liu bd622a4b4a Add fma function with float and _Float16 arguments 2018-09-19 09:59:33 -04:00
Yaxun Sam Liu cf184460e9 Fix build failure of hipTestHalf and hipTestIncludeMath for hip-clang 2018-09-18 21:00:15 -04:00
Maneesh Gupta 9ee70fca8a Merge pull request #672 from iotamudelta/fp16_fix
Only LLVM6 and higher contain the necessary intrinsics.
2018-09-18 08:43:33 +05:30
Maneesh Gupta 32787fa1fc Merge pull request #674 from mangupta/fix_dtests_on_nvcc
[dtests] Fix hipTestClock, hipTestNew, hipTestGlobalVariable, hipSimpleAtomicsTest & hipTestIncludeMath tests on nvcc path
2018-09-18 07:50:52 +05:30
Maneesh Gupta 5cf281071d Merge pull request #677 from yxsamliu/fix-launch-decay
Fix hipLaunchKernelGGL for hip-clang
2018-09-18 07:50:37 +05:30
Yaxun Sam Liu cdfd82f1de Disable device code for gcc in hip_memory.h
This device code should only be seen by HCC or hip-clang. It caused a build failure
for the HIP-VDI runtime and should be disabled for gcc.
2018-09-17 16:50:42 -04:00
Yaxun Sam Liu fc228c7ea6 Fix hipLaunchKernelGGL for hip-clang
Do not decay function pointer type of the kernel argument passed to hipLaunchKernelGGL
and hipLaunchKernel, otherwise some type information is lost which may cause
type inference failure for the template.

This issue caused compilation error of FeatureLPPooling in Caffe2/PyTorch and this patch
fixes that.
2018-09-17 11:20:41 -04:00
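The decay described in fc228c7ea6 can be illustrated with plain C++ template deduction. This is a minimal sketch, not the hipLaunchKernelGGL implementation: kernel_stub, deduced_as_pointer, deduced_as_function and check_decay are hypothetical names. It only shows that a function passed by value is deduced as a pointer type, while a forwarding reference keeps the function type.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a kernel; an ordinary host function here.
void kernel_stub(int) {}

// By-value parameter: the function argument decays, so F is void (*)(int).
template <typename F>
bool deduced_as_pointer(F) {
    return std::is_pointer<F>::value;
}

// Forwarding reference: F is void (&)(int), so the function type survives.
template <typename F>
bool deduced_as_function(F&&) {
    return std::is_function<typename std::remove_reference<F>::type>::value;
}

// Small self-check tying the two cases together.
void check_decay() {
    assert(deduced_as_pointer(kernel_stub));
    assert(deduced_as_function(kernel_stub));
}
```

Whether the original failure in Caffe2/PyTorch came from exactly this deduction difference is an assumption; the commit message only states that decaying the kernel argument loses type information.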
Maneesh Gupta cef5261fa9 Add mappings for __clock* in nvcc_detail/hip_runtime.h
Change-Id: Ibcecf52f3e69298268d921efc036090544fa0ed0
2018-09-17 15:23:30 +05:30
Alex Voicu c6720e882b Align with HC Next. 2018-09-17 11:50:29 +03:00
Maneesh Gupta 66f863d1f3 Merge branch 'master' into support-malloc 2018-09-17 10:17:25 +05:30
Maneesh Gupta cb348421d7 Merge pull request #650 from ROCm-Developer-Tools/hip-clang-new
Support placement new in hip-clang
2018-09-15 11:21:01 +05:30
Maneesh Gupta 8fe4e22b19 Merge pull request #665 from aaronenyeshi/fix-min-funcs
Use templates for min to prevent ambiguity
2018-09-14 13:21:38 +05:30
Aaron Enye Shi 6b811ca6d1 Fix Tensorflow ambiguous min issue 2018-09-13 23:16:20 +00:00
Johannes M Dieterich cf12a9c049 Only LLVM6 and higher contain the necessary intrinsics. 2018-09-13 13:55:43 -05:00
Maneesh Gupta aed5ad31ba Merge pull request #669 from ROCm-Developer-Tools/feature_automatic_cast
Remove potential for mismatch between runtime passed actuals and defined formals
2018-09-13 07:54:22 +05:30
Maneesh Gupta 411e53a665 Merge pull request #661 from yxsamliu/add-empty-printf
Add empty printf for hip-clang
2018-09-13 07:54:03 +05:30
Aaron Enye Shi 894cbdd749 Avoid AMP-restrict call to CPU-restrict 2018-09-12 14:54:31 +00:00
Alex Voicu cdfea3ef7b Remove potential for mismatch between runtime passed actuals and defined formals. 2018-09-12 10:30:48 +01:00
Maneesh Gupta 8249cf037b Merge pull request #664 from lcskrishna/master
added __host__ to float2half and half2float functions.
2018-09-12 14:50:01 +05:30
Maneesh Gupta 133d665a88 Merge pull request #663 from yxsamliu/fix-launch
Use template for hipLaunchKernelGGL for hip-clang
2018-09-12 14:49:38 +05:30
carlushuang d577f27d1a fix __longlong_as_double() problem, return the double value
The previous version returned a long long value *as* double, hence we may get the wrong result.
This also affects atomicAdd(double * ...), which uses a long long pointer to mimic a double pointer.

Signed-off-by: carlushuang <carlus.huang@amd.com>
2018-09-12 13:25:00 +08:00
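The distinction behind d577f27d1a, converting a long long's numeric value to double versus reinterpreting its 64 bits as a double, can be sketched on the host as follows. The helper names value_convert, bit_reinterpret and check_longlong_as_double are hypothetical and this is not the HIP device code; it assumes IEEE-754 doubles of the same size as long long.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical host-side stand-ins for illustration; not the HIP implementations.

// Value conversion: turns the integer's numeric value into a double.
double value_convert(long long x) { return static_cast<double>(x); }

// Bit reinterpretation: copies the 64 bits unchanged into a double.
double bit_reinterpret(long long x) {
    static_assert(sizeof(double) == sizeof(long long), "expects 64-bit double");
    double d;
    std::memcpy(&d, &x, sizeof d);
    return d;
}

// Assumes IEEE-754 doubles, where 0x3FF0000000000000 encodes 1.0.
void check_longlong_as_double() {
    const long long bits = 0x3FF0000000000000LL;
    assert(bit_reinterpret(bits) == 1.0);  // same bits, read as a double
    assert(value_convert(bits) != 1.0);    // numeric conversion gives a huge value
}
```

An atomicAdd on double that is emulated through a compare-and-swap on a long long pointer needs the bit-preserving form, which is presumably why the commit notes that atomicAdd(double * ...) was affected.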
Aaron Enye Shi ffd89dde9c Avoid host min func conflict with gcc min 2018-09-11 18:48:31 +00:00
Aaron Enye Shi 0121ec13aa Use templates for min to prevent ambiguity 2018-09-11 18:21:54 +00:00
Yaxun Sam Liu 9e9a93e10a Use template for hipLaunchKernelGGL for hip-clang 2018-09-07 16:20:00 -04:00