Evgeny Mankov
aa05b3d84e
Merge pull request #262 from ChrisKitching/frontendaction
...
[HIPIFY] Mostly fix preprocessor-or-template induced issues
2017-11-27 17:30:11 +03:00
Alex Voicu
907d41df77
Re-sync with upstream.
2017-11-24 13:04:12 +00:00
Jenkins
37cdfb41dc
Merge 'master' into 'amd-master'
...
Change-Id: I9b15fd04a369c32b6c9bef1c0d67a90b28dfe69b
2017-11-24 04:10:50 -06:00
Rahul Garg
04bc5a1d1f
Porting guides update for texture APIs usage
2017-11-24 12:00:55 +05:30
Ben Sander
814548407d
Merge pull request #273 from mangupta/swdev-129574
...
Fix float2int rounding functions
2017-11-23 12:04:36 -06:00
Jenkins
0b12b71ab2
Merge 'master' into 'amd-master'
...
Change-Id: I74751b8fe8b223a9bdcb18c6e340c5ff8c771a64
2017-11-23 04:10:36 -06:00
Maneesh Gupta
4c96882366
Fix float2int rounding functions
...
Change-Id: I67943859a6344c5eec0eaa23418c9b802ef72468
2017-11-23 09:57:24 +05:30
Alex Voicu
08f252e4bf
Remove leftover comment.
2017-11-22 19:37:03 +00:00
Evgeny Mankov
4e92a034d0
Merge pull request #263 from ChrisKitching/headers
...
[HIPIFY] Add hipify mappings for all CUDA headers that have HIP equivalents
2017-11-22 21:24:21 +03:00
Alex Voicu
4131b47134
Modify the set component of the memcpy test (unclear why there is a memset component to begin with).
2017-11-21 17:52:01 +00:00
Rahul Garg
56862b1c35
Fixed review comments
2017-11-21 21:19:06 +05:30
Alex Voicu
5e16ee0d1f
This corrects how addresses are formed for symbols which reside in shared objects. For this case, the .value component of an ELF symbol holds the offset from the base VA where the shared object was loaded. Thus, to correctly obtain the VA of the object referred to by the symbol, we must add the offset to the VA where the shared object is loaded. We were already doing this correctly for symbols denoting functions, but we were incorrect for those denoting objects.
2017-11-21 13:15:13 +00:00
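The address computation described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the commit: `SymbolInfo`, `symbol_address` and `example_symbol_address` are hypothetical names, and the struct only models the one symbol field the message mentions.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical stand-in for the part of an ELF symbol that matters here.
struct SymbolInfo {
    std::uint64_t value;  // for a symbol in a shared object: offset from the load base
};

// Virtual address of the entity the symbol refers to, given the address
// at which the containing shared object was loaded.
inline std::uint64_t symbol_address(std::uint64_t load_base, const SymbolInfo& sym) {
    // Guard against wrap-around; a real loader would report an error instead.
    assert(sym.value <= std::numeric_limits<std::uint64_t>::max() - load_base);
    return load_base + sym.value;
}

// Usage: an object at offset 0x1040 in a library loaded at 0x7f0000000000.
inline void example_symbol_address() {
    const SymbolInfo sym{0x1040};
    assert(symbol_address(0x7f0000000000ull, sym) == 0x7f0000001040ull);
}
```

The same addition applies to function and object symbols alike, which is the inconsistency the commit message says was corrected.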
Jenkins
619fce9daf
Merge 'master' into 'amd-master'
...
Change-Id: Ie37ee84b9c4f2808bd0dc2986e20feeb4449cea3
2017-11-21 04:10:37 -06:00
Rahul Garg
9866fa250d
Changed function hipMemcpy_2D to hipMemcpyParam2D
2017-11-21 12:36:24 +05:30
Alex Voicu
9d088d2283
Refactor the __device__ versions of memset and memcpy to be less awkward, i.e. return the destination pointer rather than nullptr (it can only be assumed it was done for maximum confusion), and actually unroll as they claim to. Change all of the {to, from}Symbol functions to use hipModuleGetGlobal, as opposed to hc::accelerator::get_symbol_address, which is no longer valid with module based dispatch.
2017-11-21 02:40:34 +00:00
Alex Voicu
1824fb7698
Clean-up some remaining noise in program_state.cpp.
2017-11-20 22:41:46 +00:00
Alex Voicu
7d5a45ac1a
Correct ill-formed merge in earlier commit and adjust for differences with the new CUDA natural indexing mechanism.
2017-11-20 16:33:52 +00:00
Alex Voicu
c5f2b22d0d
Re-sync with upstream.
2017-11-20 15:34:50 +00:00
Ben Sander
e8ede28ec4
Merge pull request #264 from pzins/missing_end_marker
...
Fix missing MARKER_END
2017-11-20 06:08:01 -06:00
Rahul Garg
f97c5f9a64
-Moved coGlobals into hipModule class (takes care of multi module case)
...
-Used mutex scope for updating coGlobals
2017-11-20 16:23:18 +05:30
Jenkins
53b48673f3
Merge 'master' into 'amd-master'
...
Change-Id: I4cbe49d40a046c80e4968aac8fa7fd760e2d9709
2017-11-20 04:10:56 -06:00
Maneesh Gupta
db378fbc9e
Merge pull request #266 from gargrahul/fix_half2_gfx900
...
Fixed half2 issue on gfx900
2017-11-20 07:28:41 +05:30
Maneesh Gupta
1174534e85
Merge pull request #265 from phani544/nvccTests
...
[nvccTests]Enabled inline_asm_vadd on nvcc
2017-11-20 07:28:29 +05:30
Ben Sander
59956a57ca
Fix test on cuda
2017-11-19 15:31:02 -06:00
Ben Sander
5a7a28ad29
Merge branch 'feature_natural_indexing' of https://github.com/AlexVlx/HIP
2017-11-19 15:25:17 -06:00
Ben Sander
e0c3f684ae
Temporarily disable P2P on nvidia (fails on dual GPU)
2017-11-19 15:21:37 -06:00
Rahul Garg
c7d60a7a75
Update hipModuleGetTexRef API
2017-11-19 22:10:46 +05:30
Alex Voicu
cffd0e14eb
This implements the trivial change needed to move back from the hip{Something}_{x, y, z} macros to the natural CUDA syntax of Something.{x, y, z}. This is contained in lines 384-404 in hip_runtime.h. All of the other changes have to do with changing unit tests to use this syntax. The macros are retained for backwards compatibility.
2017-11-19 01:54:12 +00:00
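The relationship between the two spellings in the commit above can be sketched on the host as follows. This is only an illustration under stated assumptions: every name ending in `_sketch` is hypothetical, the struct is a plain host-side stand-in for a built-in coordinate variable, and none of this is the content of hip_runtime.h.

```cpp
#include <cassert>

// Hypothetical host-side stand-in for a built-in such as threadIdx.
struct Coordinates {
    unsigned x, y, z;
};
static Coordinates threadIdx_sketch{0, 0, 0};

// A compatibility macro in the older style, forwarding to the member access.
#define hipThreadIdx_x_sketch (threadIdx_sketch.x)

// Older macro-based spelling.
inline unsigned global_index_macro(unsigned block, unsigned block_size) {
    return block * block_size + hipThreadIdx_x_sketch;
}

// Natural CUDA-style spelling: Something.x
inline unsigned global_index_natural(unsigned block, unsigned block_size) {
    return block * block_size + threadIdx_sketch.x;
}

// Usage: both spellings yield the same index.
inline void example_natural_indexing() {
    threadIdx_sketch = Coordinates{5, 0, 0};
    assert(global_index_macro(2, 64) == 133);
    assert(global_index_natural(2, 64) == 133);
}
```

Keeping the macro as a thin forwarder is one way old code can keep compiling while new code uses the member syntax.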
Alex Voicu
6fa7adf077
This actually tries to do the right thing all the way, by using memcpy for bitcasting, and not relying on undefined behaviour of a different flavour as a substitute for the original undefined behaviour. Note that the compiler will (should) optimise down to the same emitted code, since this is a pattern it understands.
2017-11-18 01:16:31 +00:00
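The memcpy-based bitcast pattern the commit above refers to can be sketched in standard C++ as follows. The names `bit_cast_via_memcpy`, `float_as_uint`, `uint_as_float` and `example_bit_cast` are hypothetical and not taken from the HIP headers; the usage check additionally assumes a 32-bit IEEE-754 `float`.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Reinterpret the bytes of `from` as a value of type To without type punning
// through unions or reinterpret_cast. Copying the object representation with
// memcpy is well defined for trivially copyable types of equal size.
template <typename To, typename From>
To bit_cast_via_memcpy(const From& from) {
    static_assert(sizeof(To) == sizeof(From), "sizes must match");
    static_assert(std::is_trivially_copyable<To>::value &&
                      std::is_trivially_copyable<From>::value,
                  "types must be trivially copyable");
    To to;
    std::memcpy(&to, &from, sizeof(To));
    return to;
}

inline std::uint32_t float_as_uint(float x) {
    return bit_cast_via_memcpy<std::uint32_t>(x);
}

inline float uint_as_float(std::uint32_t x) {
    return bit_cast_via_memcpy<float>(x);
}

// Usage (assumes IEEE-754 binary32): 1.0f has the bit pattern 0x3f800000.
inline void example_bit_cast() {
    assert(float_as_uint(1.0f) == 0x3f800000u);
    assert(uint_as_float(0x3f800000u) == 1.0f);
}
```

Because the helper holds no shared state, it is re-entrant by construction, and mainstream compilers typically lower the fixed-size `memcpy` to a register move.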
Alex Voicu
153878e368
This fixes some outright quaint choices made when implementing HIP's bitwise conversion functions, by using simple reinterpret_casts, as is idiomatic. These functions are supposed to be re-entrant, correct and efficient. Sadly, they were none of these: they hid a massive race condition against a value stored in global memory, which means that they were also unreasonably slow if they ever managed to be correct, and relied on union based type punning which is in a grey area of the standard. It is difficult to ascertain what may have been the reason for coming up with this quirky solution.
2017-11-17 16:00:28 +00:00
Alex Voicu
f93859cdc2
Merge remote-tracking branch 'origin/master' into feature_use_module_based_dispatch_instead_of_pfe
2017-11-16 23:20:15 +00:00
Rahul Garg
9af0f9cbc1
Fixed test case for GFX900
2017-11-16 09:34:52 +05:30
Rahul Garg
fef496d4f1
Fixed half2 issue on gfx900
2017-11-15 18:52:59 +05:30
Rahul Garg
ae1eb7a03a
Removed redundant desc variable
2017-11-15 18:28:27 +05:30
Rahul Garg
4b19c2aa0c
-Fixed texture driver API sample
...
-Added hipTexRefSetAddress and hipTexRefSetAddress2D APIs
2017-11-15 18:23:28 +05:30
Phaneendr-kumar Lanka
18f6e31d1d
[nvccTests]Enabled inline_asm_vadd on nvcc
2017-11-14 16:37:59 +05:30
Rahul Garg
63680edd30
Texture code reorganized
2017-11-14 11:09:35 +05:30
Pierre
6baaed8e48
Fix missing MARKER_END
...
Logging status of hipCtxSynchronize was missing
Test if hip profiling is active for MARKER_END in ihipPostLaunchKernel
Add MARKER_END after the completion of a kernel launched through
the "grid launch"
2017-11-13 16:13:19 -05:00
Chris Kitching
ab3debb2f9
Add an explicit check for proper rewriting of CUDA includes
2017-11-13 21:02:42 +00:00
Chris Kitching
2344ca89f3
Add a preprocessor conditional to one of the tests
...
Hurrah, we can cope with ifdefs now (except for kernel launches)
2017-11-13 20:58:55 +00:00
Chris Kitching
6b767a59ba
Use proper clang diagnostics for printing warnings
...
Much pretty. Very wow
This gives users all the usual power when it comes to manipulating
clang diagnostics. People can pass -Werror to have hipify fail if
it doesn't completely translate a file, for example. Much nicer
than reinventing the wheel.
2017-11-13 20:58:55 +00:00
Chris Kitching
24cdc5e1d3
Use a custom FrontendAction to simplify identifier translation
...
Most of what hipify does is really just replacing CUDA identifiers
with HIP ones. CUDA function calls, preprocessor macro calls,
enum references, types, etc.
This is problematic: calls/types/enum-refs require name resolution
for the AST matcher to work. This fails in the presence of code
deleted by the preprocessor, and in two-pass template compilation.
Instead, we can simply hook the lexer and have it rewrite the
identifiers for us.
This approach means identifier transformations will work correctly
regardless of where they appear (and we get to delete lots of code)
- Fixes #260
- Helps a bit with #207 - it will still fail to translate kernel
calls in preprocessor-ignored code, but everything except kernel
launches should translate correctly now, even in
preprocessor-deleted code.
2017-11-13 20:58:54 +00:00
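The idea in the commit above, renaming identifiers at the token level rather than through the AST, can be sketched without clang as follows. This is a simplified stand-alone illustration, not hipify's implementation: `rewrite_identifiers` and `example_rewrite` are hypothetical names, the scanner is a toy one that does not skip string literals or comments, and kernel launch syntax is not handled.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Scan `src` token by token and replace any identifier found in `renames`.
// No name resolution is needed, so code inside inactive #if/#ifdef branches
// or uninstantiated templates is rewritten like any other text.
inline std::string rewrite_identifiers(const std::string& src,
                                       const std::map<std::string, std::string>& renames) {
    auto is_ident_char = [](char ch) {
        return std::isalnum(static_cast<unsigned char>(ch)) != 0 || ch == '_';
    };
    std::string out;
    std::size_t i = 0;
    while (i < src.size()) {
        const char c = src[i];
        if (std::isalpha(static_cast<unsigned char>(c)) != 0 || c == '_') {
            // Identifier: consume it whole, then look it up.
            std::size_t j = i + 1;
            while (j < src.size() && is_ident_char(src[j])) ++j;
            const std::string ident = src.substr(i, j - i);
            const auto it = renames.find(ident);
            if (it != renames.end()) {
                out += it->second;
            } else {
                out += ident;
            }
            i = j;
        } else if (std::isdigit(static_cast<unsigned char>(c)) != 0) {
            // Number: copy it together with any suffix so that the suffix
            // is never mistaken for an identifier.
            std::size_t j = i + 1;
            while (j < src.size() && (is_ident_char(src[j]) || src[j] == '.')) ++j;
            out.append(src, i, j - i);
            i = j;
        } else {
            out += c;
            ++i;
        }
    }
    return out;
}

// Usage: the call inside the never-taken #ifdef branch is still translated,
// while an unrelated identifier that merely contains "cudaMalloc" is not.
inline void example_rewrite() {
    const std::map<std::string, std::string> renames = {
        {"cudaMalloc", "hipMalloc"}, {"cudaFree", "hipFree"}};
    const std::string in = "#ifdef NEVER\ncudaMalloc(&p, n);\n#endif\nmy_cudaMalloc(p);";
    const std::string expected = "#ifdef NEVER\nhipMalloc(&p, n);\n#endif\nmy_cudaMalloc(p);";
    assert(rewrite_identifiers(in, renames) == expected);
}
```

Matching whole identifiers rather than substrings is what keeps `my_cudaMalloc` untouched in the usage example.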
Chris Kitching
23b5d26582
Add hipify mappings for all CUDA headers that have HIP equivalents
...
I'm particularly running into issues with `device_types.h` in real
CUDA code...
2017-11-13 17:20:07 +00:00
Chris Kitching
9165df3848
Add a test that exposes #260
2017-11-13 16:18:15 +00:00
Chris Kitching
4ab091ce1e
Add a couple of missing CHECK directives to concurrentKernels.cu
2017-11-13 16:17:19 +00:00
Jenkins
65cea04ba2
Merge 'master' into 'amd-master'
...
Change-Id: I2d2b621a937a8a645072fe76c371ccb36f059a6e
2017-11-13 04:11:00 -06:00
Maneesh Gupta
85975e719d
Merge pull request #261 from gargrahul/fix_module_api_sample
...
Fix module_api sample
2017-11-13 11:55:54 +05:30
Rahul Garg
83adf6525e
Fix module_api sample
2017-11-13 08:56:39 +05:30
Jenkins
e2d81a6038
Merge 'master' into 'amd-master'
...
Change-Id: If635fe33b97998b22c4a00c0e9a5e041ef332d82
2017-11-10 04:48:30 -06:00
Alex Voicu
819e72fba6
Add omitted changes in CMakeLists.txt.
2017-11-10 01:20:50 +00:00