Maneesh Gupta
35ebb36cbd
Merge pull request #951 from tycho/nvcc-hipDeviceSetCacheConfig
...
nvcc_detail/hip_runtime_api.h: add missing hipDeviceSetCacheConfig API
[ROCm/clr commit: 44309ee879 ]
2019-03-08 09:22:25 +05:30
Maneesh Gupta
ab318d047f
Merge pull request #953 from yxsamliu/vdi
...
Fix HIP/VDI build failure due to dlopen change
[ROCm/clr commit: 8bc97d6859 ]
2019-03-08 06:02:47 +05:30
Yaxun Sam Liu
d1721cc124
Fix HIP/VDI build failure due to dlopen change
...
[ROCm/clr commit: 2dc87b6019 ]
2019-03-07 14:45:45 -05:00
Evgeny Mankov
f26d036c39
Merge pull request #950 from emankov/master
...
[HIPIFY][tests] Update lit testing infrastructure
[ROCm/clr commit: 6b664dcf77 ]
2019-03-07 15:19:15 +03:00
Steven Noonan
9f642e40ad
nvcc_detail/hip_runtime_api.h: add missing hipDeviceSetCacheConfig API
...
Signed-off-by: Steven Noonan <steven@uplinklabs.net>
[ROCm/clr commit: 27d6755552 ]
2019-03-06 11:21:36 -08:00
Evgeny Mankov
36f569cba6
[HIPIFY][tests] Update lit testing infrastructure
...
+ Set -D__LP64__ in case of 64-bit hipify-clang binary
[partial workaround for clang's bug https://bugs.llvm.org/show_bug.cgi?id=38811]
C:/GIT/LLVM/trunk/llvm-64-release-vs2017/dist/lib/clang/9.0.0\include\__clang_cuda_device_functions.h(1609,45): error GEF7559A7: no matching function for call to 'roundf'
__DEVICE__ long lroundf(float __a) { return roundf(__a); }
#if defined(__LP64__)
__DEVICE__ long lround(double __a) { return llround(__a); }
__DEVICE__ long lroundf(float __a) { return llroundf(__a); } // ok: llroundf should be used when 64-bit
#else
__DEVICE__ long lround(double __a) { return round(__a); }
__DEVICE__ long lroundf(float __a) { return roundf(__a); } // error
#endif
+ Print more system info while testing in the following form:
========================================
CUDA 9.0 - will be used for testing
LLVM 9.0.0svn - will be used for testing
AMD64 - Platform architecture
Windows 10 - Platform OS
64 - hipify-clang binary bitness
32 - python 3.7.2 binary bitness
========================================
[ROCm/clr commit: f138f89bc8 ]
2019-03-06 19:26:05 +03:00
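The `__LP64__` branch quoted in the entry above can be illustrated with a minimal host-side sketch. The name `sketch_lroundf` is hypothetical and this is not the clang header code; it only mirrors the same pattern of forwarding to the `long long` rounding routine when `long` is 64 bits wide.

```cpp
#include <cmath>

// Hypothetical host-side stand-in for the quoted device helper.
// It mirrors the #if defined(__LP64__) pattern; it is not the clang header.
inline long sketch_lroundf(float a) {
#if defined(__LP64__)
    // long is 64-bit in this configuration: forward to the long long variant.
    return static_cast<long>(std::llround(a));
#else
    // long is 32-bit in this configuration: round to nearest, then convert.
    return static_cast<long>(std::round(a));
#endif
}
```

Both branches round halfway cases away from zero, so the result is the same for values that fit in a 32-bit `long`; only the helper being called differs.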
eshcherb
e4b8389619
roctracer-proto header find_path (#884)
...
[ROCm/clr commit: 7c3499198c ]
2019-03-06 17:36:34 +05:30
Maneesh Gupta
2a11a5529e
Merge pull request #949 from gargrahul/single_stream_concurrent_kernels
...
Add extension for kernel concurrency on same stream
[ROCm/clr commit: 3955f2c131 ]
2019-03-06 17:34:54 +05:30
Maneesh Gupta
b1fdc46049
Merge pull request #932 from ROCm-Developer-Tools/feature_maybe_dlopen_test
...
Add one test case for complex dynamic loading behavior
[ROCm/clr commit: b9809cb2b6 ]
2019-03-06 17:32:23 +05:30
Alex Voicu
45f4ac5023
dlopen() fixes (#929)
...
* Initial attempt to switch over to internally linked state.
* Add missing CMake update.
* hipLaunchKernelGGLImpl must be inline as well. Ensure internal linkage.
* Ensure global retrieval uses internally linked state.
* Hide HC in the implementation. Minimise ADL woes.
* Strange software exists, and must be catered to.
* Use a less spammy mechanism for ensuring internal linkage / non-export.
* Remove leftover internal detail.
[ROCm/clr commit: ed48847237 ]
2019-03-06 17:31:44 +05:30
Rahul Garg
f364b32e29
Add extension for kernel concurrency on same stream
...
[ROCm/clr commit: 263e82a67a ]
2019-03-06 12:55:39 +05:30
Maneesh Gupta
9d4d74e5dc
Merge pull request #936 from mangupta/swdev-174923
...
[hipconfig] Update HIP_PLATFORM detection logic
[ROCm/clr commit: 8099d81788 ]
2019-03-06 06:08:11 +05:30
Evgeny Mankov
dd5466b6ac
Merge pull request #948 from emankov/master
...
[HIPIFY] Change CUDA Driver's functions' cuMemsetD32(Async) mapping
[ROCm/clr commit: a1a205c849 ]
2019-03-05 18:18:39 +03:00
Evgeny Mankov
ed9ea6cde4
[HIPIFY] Change CUDA Driver's functions' cuMemsetD32(Async) mapping
...
cuMemsetD32(Async) -> hipMemsetD32(Async) (was hipMemset(Async))
based on:
[#933] https://github.com/ROCm-Developer-Tools/HIP/pull/933
[ROCm/clr commit: dfc631fb44 ]
2019-03-05 18:13:18 +03:00
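The renaming described in the entry above can be sketched as a simple lookup table. The helper `map_cuda_name` and the table layout are assumptions made for illustration, not the actual hipify-clang data structures; only the two name pairs come from the commit message.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical lookup helper, not the actual hipify-clang implementation.
// Only the two name pairs below are taken from the commit message.
inline std::string map_cuda_name(const std::string& cuda_name) {
    static const std::unordered_map<std::string, std::string> kRenames = {
        {"cuMemsetD32", "hipMemsetD32"},
        {"cuMemsetD32Async", "hipMemsetD32Async"},
    };
    const auto it = kRenames.find(cuda_name);
    // Names without an entry are returned unchanged.
    return it != kRenames.end() ? it->second : cuda_name;
}
```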
Maneesh Gupta
fd6647184b
Merge pull request #933 from ROCm-Developer-Tools/fix_hipmemset
...
Add HIP memset APIs to cope with non-zero initial values of integer types
[ROCm/clr commit: b525d21011 ]
2019-03-05 14:31:38 +05:30
Maneesh Gupta
5c939d7f27
Update hipMemset.cpp
...
Address build issues on nvcc path.
[ROCm/clr commit: 8af4e2b5e4 ]
2019-03-05 12:11:11 +05:30
Maneesh Gupta
34554f5730
Update hip_runtime_api.h
...
Use hipCUResultTohipError instead of hipCUDAErrorTohipError in hipMemsetD32 & hipMemsetD32Async.
[ROCm/clr commit: 38b7a43b43 ]
2019-03-05 12:10:01 +05:30
Wen-Heng (Jack) Chung
e34b0ccd48
Address code review comments to use hipDeviceptr_t
...
[ROCm/clr commit: 8b7baa0bd9 ]
2019-03-05 05:51:05 +00:00
Wen-Heng (Jack) Chung
a4b654a5af
Add implementation for NVCC path
...
[ROCm/clr commit: b46e684d2e ]
2019-03-04 20:11:12 -08:00
Wen-Heng (Jack) Chung
21cf5b0ae4
Add direct test for hipMemsetD32 and hipMemsetD32Async
...
[ROCm/clr commit: 365d08535b ]
2019-03-04 17:20:32 +00:00
Wen-Heng (Jack) Chung
2706bf46f2
Add hipMemsetD32 and hipMemsetD32Async
...
Add 2 extra memset functions that fill memory with integer-typed data.
Also change the parameters of ihipMemset to better explain the semantics.
[ROCm/clr commit: 392271f4db ]
2019-03-04 17:00:33 +00:00
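Why a byte-wise memset is not enough for non-zero integer values can be shown with a small host-only sketch. The helpers `fill_bytes` and `fill_words` are hypothetical and do not call the HIP runtime; they only contrast filling every byte with filling every 32-bit element, which is the distinction the entry above relies on.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical host-only helpers; neither one calls the HIP runtime.

// Byte-wise fill: every byte of the buffer receives the low 8 bits of value.
inline std::vector<std::int32_t> fill_bytes(std::size_t count, int value) {
    std::vector<std::int32_t> buf(count);
    if (count > 0) {
        std::memset(buf.data(), value, count * sizeof(std::int32_t));
    }
    return buf;
}

// Element-wise fill: every 32-bit element receives the full value.
inline std::vector<std::int32_t> fill_words(std::size_t count, int value) {
    return std::vector<std::int32_t>(count, static_cast<std::int32_t>(value));
}
```

With a value of 1, the byte-wise fill produces elements equal to 0x01010101, while the element-wise fill produces elements equal to 1. The two agree only for values such as 0 whose bytes are all identical.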
Maneesh Gupta
1269a45f8c
Merge pull request #939 from gargrahul/update_hipmemset_test
...
[dtest] Update hipMemset test
[ROCm/clr commit: 24570ab72a ]
2019-03-03 20:29:55 +05:30
Rahul Garg
b3ba23ba81
Fix review comments
...
[ROCm/clr commit: 5900416629 ]
2019-03-02 23:38:37 +05:30
Maneesh Gupta
0900e28473
Merge pull request #945 from wkwchau/hipMemset3D_fix
...
Fix hipMemset3D test
[ROCm/clr commit: 8bd168febf ]
2019-03-01 21:18:12 +05:30
Maneesh Gupta
b451ed974b
Merge pull request #942 from yxsamliu/v3
...
revert hipcc changes about code object v3
[ROCm/clr commit: d5e4c68f30 ]
2019-03-01 21:17:10 +05:30
Wilkin Chau
19b361281a
Fix hipMemset3D test
...
Calculate the allocated size based on the width, height and depth.
[ROCm/clr commit: 99540373cf ]
2019-02-28 22:42:46 +00:00
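A minimal sketch of the size calculation mentioned in the entry above, assuming a densely packed volume with no row padding. The function `volume_bytes` and its `elem_size` parameter are hypothetical; the actual test may instead use the pitch reported by the allocation.

```cpp
#include <cstddef>

// Hypothetical helper: byte size of a densely packed width x height x depth
// volume. Assumes no row padding, which a pitched allocation may add.
inline std::size_t volume_bytes(std::size_t width, std::size_t height,
                                std::size_t depth, std::size_t elem_size) {
    return width * height * depth * elem_size;
}
```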
Rahul Garg
e6f5850ebb
Fix hipMemset test for HIP/NVCC
...
[ROCm/clr commit: 41afe4d947 ]
2019-03-01 03:46:57 +05:30
Yaxun Sam Liu
169bfb6b75
Revert "hipcc should consume -mcode-object-v3 flag"
...
This reverts commit ac4b2b03ac.
[ROCm/clr commit: b40d9c7849 ]
2019-02-28 11:21:47 -05:00
Yaxun Sam Liu
89721c8ce8
Revert "Change code-object flag to only HIP-Clang"
...
This reverts commit 1e483af21b.
[ROCm/clr commit: 9002c7d09d ]
2019-02-28 11:20:04 -05:00
Yaxun Sam Liu
768d00f5e7
Revert "Consume the code obj args to prevent duplicates"
...
This reverts commit 0e1fc751ea.
[ROCm/clr commit: 510590ac1d ]
2019-02-28 11:19:35 -05:00
Maneesh Gupta
b2a859754d
Merge pull request #938 from gargrahul/fix_hipBusBW_p2p_bidir
...
Fix hipBusBW sample for P2P bidirectional test
[ROCm/clr commit: 4eff6bd09a ]
2019-02-28 07:14:38 +05:30
Maneesh Gupta
7627e3dfe9
Merge pull request #937 from yxsamliu/nan2
...
Fix nan for windows
[ROCm/clr commit: c1ff2c95a4 ]
2019-02-28 07:14:27 +05:30
Maneesh Gupta
ca193fe866
Merge pull request #935 from gargrahul/fix_hipbusbw_beatsoverflow
...
Fix hipBusBW overflow with setting beats/iterations
[ROCm/clr commit: c6b050b7f2 ]
2019-02-28 07:14:16 +05:30
Maneesh Gupta
192131a9de
Merge pull request #934 from gargrahul/fix_forceinline_non_hcc
...
Fix forceinline for non HCC compilation
[ROCm/clr commit: 267b0b3a30 ]
2019-02-28 07:14:05 +05:30
Rahul Garg
1a97c0fafc
Update hipMemset test
...
[ROCm/clr commit: 0156388a6b ]
2019-02-28 06:54:49 +05:30
Rahul Garg
1985d70605
Fix hipBusBW sample for P2P bidirectional test
...
[ROCm/clr commit: 828e62fe4f ]
2019-02-28 00:56:07 +05:30
Yaxun Sam Liu
0b8586ab26
Fix nan for windows
...
[ROCm/clr commit: 0ebe23512f ]
2019-02-27 12:33:26 -05:00
Maneesh Gupta
ec494ba628
[hipconfig] Update HIP_PLATFORM detection logic
...
HIP_PLATFORM detection logic relied on finding a working KFD. If it was
found, the platform was set as hcc else as nvcc.
However this logic is flawed since it is possible for the development
system to only have the user mode bits to build HIP application code.
Hence the better logic is to rely on finding a suitable compiler.
The new logic is as follows:
- look for a working HCC. If found, platform is set as hcc.
- else look for a working NVCC. If found, platform is set as nvcc.
- else the platform defaults to hcc for now.
Change-Id: Ifcc42c29a19f722153d5c23c55f1a8765dceaf6b
[ROCm/clr commit: 03adc12474 ]
2019-02-27 14:10:21 +05:30
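The detection order listed in the entry above can be sketched as a small decision function. The name `detect_hip_platform` and its boolean inputs are assumptions for illustration; the real hipconfig logic probes the compilers itself and is not written in C++.

```cpp
#include <string>

// Hypothetical sketch of the detection order from the commit message.
// The two flags stand in for "a working HCC / NVCC was found".
inline std::string detect_hip_platform(bool hcc_works, bool nvcc_works) {
    if (hcc_works) {
        return "hcc";   // a working HCC takes priority
    }
    if (nvcc_works) {
        return "nvcc";  // otherwise fall back to a working NVCC
    }
    return "hcc";       // neither found: default to hcc for now
}
```

Because the decision depends only on which compiler is usable, a build machine without a working KFD still resolves to the expected platform.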
Rahul Garg
a835072903
Fix hipBusBW overflow with setting beats/iterations
...
[ROCm/clr commit: 4fef69afdc ]
2019-02-27 00:18:52 +05:30
Rahul Garg
1c95abbb18
Fix forceinline for non HCC compilation
...
[ROCm/clr commit: 832142234b ]
2019-02-26 07:50:09 +05:30
Wen-Heng (Jack) Chung
3faa98b08d
Add one test case for complex dynamic loading behavior
...
Existing HIT syntax doesn't seem to support the expected build and run steps
for this test.
[ROCm/clr commit: 8c5a92a789 ]
2019-02-25 17:03:31 +00:00
Evgeny Mankov
089cb31bff
Merge pull request #930 from emankov/master
...
[HIPIFY][doc] Update README.md
[ROCm/clr commit: a4e3fa4f1b ]
2019-02-25 18:29:26 +03:00
Evgeny Mankov
123d3e3dd8
[HIPIFY][doc] Update README.md
...
+ Populate Dependencies section with upcoming LLVM versions
+ Add clang bugs for non-working LLVM+CUDA configs
+ Update Testing section
[ROCm/clr commit: 52fdd6e3cc ]
2019-02-25 18:26:25 +03:00
Evgeny Mankov
063175cfa4
Merge pull request #928 from emankov/master
...
[HIPIFY][tests] caffe2 test fix
[ROCm/clr commit: 2738db936a ]
2019-02-25 17:39:06 +03:00
Evgeny Mankov
7d96cc8b63
[HIPIFY][tests] caffe2 test fix
...
[ROCm/clr commit: 9c000f4b57 ]
2019-02-25 17:12:32 +03:00
Evgeny Mankov
a163e01be3
Merge pull request #927 from emankov/master
...
[HIPIFY][Caffe2] Initial Caffe2 support
[ROCm/clr commit: b24cb99a12 ]
2019-02-25 16:41:17 +03:00
Evgeny Mankov
e42eda2744
[HIPIFY][Caffe2] Initial Caffe2 support
...
[ROCm/clr commit: 59533a7309 ]
2019-02-23 20:46:22 +03:00
Maneesh Gupta
e6a5d8c9ea
Merge pull request #925 from yxsamliu/h2f
...
Add __gnu_h2f_ieee and __gnu_f2h_ieee
[ROCm/clr commit: f4e97f26bb ]
2019-02-22 13:38:15 +05:30
Maneesh Gupta
e24b0f1924
Merge pull request #923 from aaronenyeshi/fix-co-v3-arg
...
Consume the code obj args to prevent duplicates
[ROCm/clr commit: 33d14dd7cb ]
2019-02-22 13:38:08 +05:30
Yaxun Sam Liu
ed6efc2c0b
Add __gnu_h2f_ieee and __gnu_f2h_ieee
...
The implementation is copied from HCC runtime.
For hcc it has no effect since apps can find them in either hcc runtime or HIP
runtime.
hip-clang needs it in HIP/HCC runtime so that HIP/HCC and HIP/VDI runtime are
swappable.
[ROCm/clr commit: 972ca06c4c ]
2019-02-21 12:48:28 -05:00