Michael LIAO
f644d4daaa
[test] Add device variant of std::declval.
...
- Current clang disallows any invocation of wrong-side functions, even
in contexts that only inspect types. Work around that by adding a
variant of `std::declval` with the `__device__` attribute.
[ROCm/clr commit: 32f69c8bc4 ]
2019-05-03 15:58:31 -04:00
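A minimal sketch of what a device-side `std::declval` variant could look like, not the code from this commit. `DEVICE_ATTR`, `device_declval`, `Widget`, and `scale_returns_float` are hypothetical names introduced for illustration. The macro falls back to nothing in plain C++ so the sketch also builds with a host-only compiler.

```cpp
#include <cassert>
#include <type_traits>

// Assumption: DEVICE_ATTR is a hypothetical macro. It expands to __device__
// when compiling HIP/CUDA and to nothing otherwise.
#if defined(__HIP__) || defined(__CUDACC__)
#define DEVICE_ATTR __device__
#else
#define DEVICE_ATTR
#endif

// Declaration only, like std::declval: usable solely in unevaluated
// contexts such as decltype, and never called at run time.
template <typename T>
DEVICE_ATTR typename std::add_rvalue_reference<T>::type device_declval() noexcept;

struct Widget {
    DEVICE_ATTR float scale(int) const;  // declared only; never called
};

// Type inspection from device-side code: both the helper and the member
// function carry the same attribute, so no wrong-side call appears.
DEVICE_ATTR inline bool scale_returns_float() {
    using Result = decltype(device_declval<const Widget&>().scale(0));
    return std::is_same<Result, float>::value;
}
```

The helper has no definition on purpose: giving it a body would invite accidental run-time use, while a bare declaration is enough for `decltype`.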
Maneesh Gupta
13b13c3493
Merge pull request #1062 from mhbliao/hliao/master/icmp
...
[hip] Re-implement ballot using AMDGCN builtins
[ROCm/clr commit: 2eafa5dcf9 ]
2019-05-03 17:48:19 +05:30
Maneesh Gupta
c73c864fdf
Merge pull request #1058 from mhbliao/hliao/master/devfunc
...
[Device Function] Fix implementation
[ROCm/clr commit: ad070d4da5 ]
2019-05-03 17:47:51 +05:30
emankov
c27f0b952e
[HIPIFY][tests] Add cuSPARSE CSR-BCSR-SPMV-conversions example
...
[ROCm/clr commit: d5c3e5ea71 ]
2019-04-30 17:37:34 +03:00
Michael LIAO
eb43303d0b
[Device Function] Fix implementation of __bitinsert_u64
...
- It's a common mistake to assume that `1 << shamt` is promoted to
64-bit when shamt is a 64-bit integer. That's not the case. Replace
that left shift with a 64-bit one so that it cannot fall into undefined
behavior.
- Fix the host-side implementation as well for device function testing.
[ROCm/clr commit: 2380eb8ecc ]
2019-04-30 08:59:13 -04:00
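The shift-width pitfall described in this commit can be sketched as follows. This is an illustration under stated assumptions, not the HIP source: `bitinsert_u64` is a hypothetical host-side helper, and masking `offset` and `width` to 6 bits is an assumed convention. The type of a shift expression is the promoted type of its left operand, so `1 << shamt` stays a 32-bit `int` shift no matter how wide `shamt` is.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: replace `width` bits of `dst`, starting at bit
// `offset`, with the low `width` bits of `src`.
inline std::uint64_t bitinsert_u64(std::uint64_t dst, std::uint64_t src,
                                   std::uint64_t offset, std::uint64_t width) {
    offset &= 63;  // assumed convention: only the low 6 bits are used
    width &= 63;
    // 1ULL makes the left operand 64-bit, so the shift is a 64-bit shift.
    // Writing `1 << width` here would shift a 32-bit int, which is
    // undefined behavior once width reaches 32.
    const std::uint64_t mask = (1ULL << width) - 1;
    return (dst & ~(mask << offset)) | ((src & mask) << offset);
}
```

For example, `bitinsert_u64(0, ~0ULL, 32, 8)` sets only bits 32 to 39, which a 32-bit mask could never reach.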
Michael LIAO
e8de293fc5
[devfunc] Re-implement ballot using AMDGCN builtins
...
- As the signature of `amdgcn.icmp` is changed for next-gen chips, using
clang builtins is a portable way to hide those details.
[ROCm/clr commit: a7a4d80f54 ]
2019-04-29 17:21:25 -04:00
Evgeny Mankov
7efe95bc42
[HIPIFY][doc] Update Readme.md: latest cuDNN 7.5.1.10 is supported
...
+ tested with CUDA 9.0, 9.2, 10.0 and 10.1
[ROCm/clr commit: 116daa3a1f ]
2019-04-29 15:41:08 +03:00
Aaron Enye Shi
f8d108a815
Revert "Use COMgr to read Kernel Args Metadata ( #1006 )"
...
This reverts commit 10048a5631 .
[ROCm/clr commit: 235c6877c8 ]
2019-04-26 16:04:56 -04:00
Aaron Enye Shi
2105ed24fc
Revert "Add COMGR relative path for build machines"
...
This reverts commit 9cd137f2e4 .
[ROCm/clr commit: acfa46bbbc ]
2019-04-26 16:04:56 -04:00
Aaron Enye Shi
9fbc8c0b58
Revert "Add dependency on amd_comgr in hip-config-*.cmake.in"
...
This reverts commit a1aa1f6f10 .
[ROCm/clr commit: 2378c7a20b ]
2019-04-26 16:04:56 -04:00
Maneesh Gupta
a1aa1f6f10
Add dependency on amd_comgr in hip-config-*.cmake.in
...
Change-Id: Iac1d851a8cfb99224e9c5926780273d9b9b08426
[ROCm/clr commit: b8fe5ba572 ]
2019-04-25 15:26:33 -04:00
Evgeny Mankov
729de93385
[HIPIFY][perl][fix][258] Memory fence device functions are supported now
...
[ROCm/clr commit: f0c2fdc6d7 ]
2019-04-25 13:27:30 +03:00
Evgeny Mankov
3c3255fbe5
[HIPIFY][DNN] cudnnSetFilter4dDescriptor support
...
[ROCm/clr commit: 72a809caf6 ]
2019-04-25 12:18:51 +03:00
Evgeny Mankov
6c114ca626
[HIPIFY][fix][ #204 ] Suppress warning message: #pragma once in main file
...
[ROCm/clr commit: df9418c3cd ]
2019-04-24 20:35:52 +03:00
Evgeny Mankov
ac39a25328
[HIPIFY][doc] Update README.md
...
+ A few words about clang patches to work with CUDA 9.2 - 10.0 on Windows;
+ Fix cuDNN versions with correct values.
[ROCm/clr commit: 1049031d98 ]
2019-04-24 17:40:35 +03:00
Maneesh Gupta
dad6abcd7a
Merge pull request #1043 from mhbliao/hliao/master/fp16
...
[hip] Fix including of hip_fp16.h
[ROCm/clr commit: 7f81c72f1c ]
2019-04-24 16:50:46 +05:30
Maneesh Gupta
f8f49d57dc
Merge pull request #1042 from mhbliao/hliao/master/ldg
...
[hip] Fix use of `__HIP_CLANG_ONLY__` in `hip_ldg.h`.
[ROCm/clr commit: 63ab2ea945 ]
2019-04-24 16:50:37 +05:30
Maneesh Gupta
b41d81d74e
Merge pull request #1040 from eshcherb/roctracer-hip-frontend-190422
...
hip_prof_api.h include under __cplusplus
[ROCm/clr commit: 54cdeabe6e ]
2019-04-24 16:50:27 +05:30
Maneesh Gupta
a709855e9d
Merge pull request #1039 from gargrahul/fix_ptrgetattr_nvcc
...
Fix hipPointerGetAttributes for NVCC
[ROCm/clr commit: 7edb43bc83 ]
2019-04-24 16:50:18 +05:30
Rahul Garg
d69edbbb7f
Add hipMallocManaged default functional support ( #1036 )
...
* Add hipMallocManaged default functional support
* Fix build error
* Add dtest
[ROCm/clr commit: 94769fc8dd ]
2019-04-24 16:50:03 +05:30
Maneesh Gupta
9835229370
Merge pull request #1034 from kpyzhov/master
...
Minor fixes for 64-bit device functions.
[ROCm/clr commit: 61861faddc ]
2019-04-24 16:49:36 +05:30
Maneesh Gupta
a0fdc93902
Merge pull request #1031 from yxsamliu/fix-init
...
Fix missing arg in HIP_INIT_API
[ROCm/clr commit: 3ac21336eb ]
2019-04-24 16:49:23 +05:30
Maneesh Gupta
c2011c2a2d
Merge pull request #1028 from gargrahul/fix_d2d_async_test
...
[dtest] Fix D2DAsync test
[ROCm/clr commit: 3ba7afcfc1 ]
2019-04-24 16:49:13 +05:30
Aaron Enye Shi
9cd137f2e4
Add COMGR relative path for build machines
...
[ROCm/clr commit: 6b3095f7cb ]
2019-04-23 17:16:26 -04:00
Evgeny Mankov
bc3583c9bd
[HIPIFY][doc] Provide patches for clang's bug 38811
...
+ Update Readme.md accordingly
[ROCm/clr commit: 87fa81f7be ]
2019-04-23 21:13:00 +03:00
Evgeny Mankov
176db946b2
[HIPIFY][hipify-perl] Formatting
...
[ROCm/clr commit: 65dd1d4c7d ]
2019-04-23 17:55:47 +03:00
Michael LIAO
59e6127969
[hip] Fix including of hip_fp16.h
...
- Separate the definitions of `__HCC_OR_HIP_CLANG__`, `__HCC_ONLY__`, and
`__HIP_CLANG_ONLY__` into hip_common.h so that they can be included in
hip_fp16.h, which applications may include on its own.
[ROCm/clr commit: d086dbd0e5 ]
2019-04-23 09:16:00 -04:00
Michael LIAO
619050ae96
[hip] Fix use of __HIP_CLANG_ONLY__ in hip_ldg.h.
...
- Check its value instead of whether it's defined or not.
[ROCm/clr commit: ca6a5c07eb ]
2019-04-22 23:22:32 -04:00
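The difference between the two checks can be sketched as below. `FEATURE_ONLY` is a hypothetical stand-in, and the sketch assumes a macro that is always defined and carries 0 or 1 as its value. Under that assumption a `defined(...)` test is always true and only a value test can tell the two states apart.

```cpp
#include <cassert>

// Hypothetical stand-in for a macro that is always defined,
// with 0 meaning "off" and 1 meaning "on".
#define FEATURE_ONLY 0

// Checking for definition: taken even though the value is 0.
#if defined(FEATURE_ONLY)
constexpr bool enabled_by_defined_check = true;
#else
constexpr bool enabled_by_defined_check = false;
#endif

// Checking the value: correctly reports "off".
#if FEATURE_ONLY
constexpr bool enabled_by_value_check = true;
#else
constexpr bool enabled_by_value_check = false;
#endif

static_assert(enabled_by_defined_check, "defined() ignores the value");
static_assert(!enabled_by_value_check, "value check sees the 0");
```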
Evgeny
79df39e3c3
hip_prof_api.h include under __cplusplus
...
[ROCm/clr commit: 165c42483b ]
2019-04-22 21:14:18 -05:00
Rahul Garg
0198199780
Fix hipPointerGetAttributes for NVCC
...
[ROCm/clr commit: c0e0f0b7fd ]
2019-04-23 03:22:25 +05:30
Konstantin Pyzhov
a525cc8f47
Fix for __popcll() device function implementation.
...
[ROCm/clr commit: f6fbf8751d ]
2019-04-19 08:53:22 -04:00
Yaxun (Sam) Liu
cb81018121
Fix missing arg in HIP_INIT_API
...
[ROCm/clr commit: 710e633bdd ]
2019-04-18 16:18:31 -04:00
Konstantin Pyzhov
d1bbf23181
Fix for __ffsll() device functions.
...
[ROCm/clr commit: 5664ed3206 ]
2019-04-18 13:07:24 -04:00
David Salinas
5624c3837d
Revert "append the ELF flags for sram-ecc and xnack to the target triple per code object"
...
This reverts commit d1f4e7ea54 .
[ROCm/clr commit: 1237a0b691 ]
2019-04-18 11:49:40 -04:00
Rahul Garg
c6dd7c9678
Fix D2DAsync test
...
[ROCm/clr commit: f1dc017167 ]
2019-04-18 07:35:06 +05:30
Evgeny Mankov
3586f0be35
[HIPIFY][SPARSE] cuSPARSE 10.1 support
...
[ROCm/clr commit: 95aca4f9a9 ]
2019-04-16 14:59:44 +03:00
Evgeny Mankov
e08460eb09
[HIPIFY][BLAS] cuBLAS 10.1 support
...
[ROCm/clr commit: bbcacd0146 ]
2019-04-16 12:52:58 +03:00
Evgeny Mankov
999640effd
Merge pull request #1023 from emankov/master
...
[HIPIFY][cuDNN] Add partial cudnnRNNBiasMode_t support
[ROCm/clr commit: f5f1636181 ]
2019-04-16 11:03:22 +03:00
Evgeny Mankov
26129109df
[HIPIFY][cuDNN] Add partial cudnnRNNBiasMode_t support
...
[ROCm/clr commit: 5fa84735a6 ]
2019-04-16 11:01:01 +03:00
Maneesh Gupta
7c43b9ee4b
Merge pull request #995 from david-salinas/add_sram-ecc_and_xnack_flags_to_triple
...
Append the ELF flags for sram-ecc and xnack to the target triple per code object
[ROCm/clr commit: 715a500b97 ]
2019-04-16 09:10:04 +05:30
Maneesh Gupta
dac817873f
Merge pull request #1019 from scchan/lazy_binding
...
minor workaround for lazy binding
[ROCm/clr commit: 22660bed74 ]
2019-04-16 08:36:10 +05:30
Jeff Daily
cf4e198a91
In hipFree, synchronize owner of memory ( #1018 )
...
* In hipFree, if memory is associated with a device, synchronize that device's streams.
This changes the behavior from synchronizing the currently set TLS device.
* All devices sync in hipFree for _appId=-1 case.
* Revert "All devices sync in hipFree for _appId=-1 case."
This reverts commit 1efb34d6a8426661e45bc5f763422a1147aeac10.
* add HIP_SYNC_FREE env var
[ROCm/clr commit: cf8fb43e6b ]
2019-04-16 08:35:55 +05:30
Mr-LiuSw
e909811963
add little changes in hip_runtime_api.h to work with c language ( #1017 )
...
* Update hip_runtime_api.h
When I try to use mpicc or gcc to compile C code that calls HIP runtime APIs, the following error occurs:
> /path/to/hcc_detail/hip_runtime_api.h:2268:33: error: unknown type name ‘hipFuncAttributes’;
> hipFuncGetAttributes(hipFuncAttributes* attr, const void* func);
Adding `struct` to the first parameter of hipFuncGetAttributes gets rid of this problem.
[ROCm/clr commit: 64bdf82265 ]
2019-04-16 08:35:36 +05:30
Aaron Enye Shi
10048a5631
Use COMgr to read Kernel Args Metadata ( #1006 )
...
* Add CMAKE dep to amd_comgr
* Use COMGR for read_kernarg_metadata in COV2
* Do not assume kernargs exist
* Add proper metadata destroy cleanup
* Use a process function for easier destroy
* Remove old read_kernarg_metadata
* Clean up HCC, prints, names
* Use COMGR in CMAKE by default
* Move metadata lookup for keyword values into helper
* Remove C string usage for lookup_keyword_value
* Guard COMGR for non-NVCC path
* Add hip_hcc dependency on comgr package
* Add lifetime to metadata nodes
* Find COMGR config file for amd_comgr target
* Move set_active data earlier
[ROCm/clr commit: 2c80975e9c ]
2019-04-16 08:34:39 +05:30
Evgeny Mankov
27275c9e8e
[HIPIFY] cuDNN 7.5.0.56 support
...
[ROCm/clr commit: 64f0f29111 ]
2019-04-15 15:46:46 +03:00
Maneesh Gupta
c82695a311
[ci] Enable tests on ROCm 2.3
...
Change-Id: Id344ef600b0868f36f2e7ac08d5664234d88835b
[ROCm/clr commit: 72e17e3c92 ]
2019-04-15 12:38:01 +05:30
Yaxun (Sam) Liu
7d58d7b02a
hip-clang: Add __align__
...
CUDA has __align__. Define an equivalent for hip-clang.
[ROCm/clr commit: e200ece4da ]
2019-04-10 14:17:18 -04:00
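One way such an equivalent could be written is a macro over the GCC/Clang `aligned` attribute. This is a sketch, not necessarily the definition this commit adds: `MY_ALIGN` and `Vec4` are hypothetical names, chosen so the example does not collide with a real `__align__`.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical macro; assumes a GCC- or Clang-compatible compiler.
#define MY_ALIGN(n) __attribute__((aligned(n)))

// Used in the same position as CUDA's __align__: between the struct
// keyword and the type name.
struct MY_ALIGN(16) Vec4 {
    float x, y, z, w;
};

static_assert(alignof(Vec4) == 16, "Vec4 should be 16-byte aligned");
```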
Evgeny Mankov
ca6021640b
[HIPIFY] CUDA 10.1 Runtime API support
...
[ROCm/clr commit: b11bf48270 ]
2019-04-10 18:41:36 +03:00
Evgeny Mankov
61a6949cf0
[HIPIFY] CUDA 10.1 Driver API support
...
[ROCm/clr commit: 9a660c0d48 ]
2019-04-10 15:03:34 +03:00
Maneesh Gupta
f8194fd6f1
Merge pull request #1013 from yxsamliu/config
...
Fix hip-config.cmake for hip-clang
[ROCm/clr commit: 75691ff3e4 ]
2019-04-10 07:53:22 +00:00