Commit graph

2003 Commits

Author SHA1 Message Date
Chris Kitching 51df93cf25 Remove CUDA_EXCLUDES
An artefact from a now-defunct hack to avoid corrupting programs
2017-10-27 20:12:32 +01:00
Chris Kitching 417fc92482 Make unsupported actually be a bool... 2017-10-27 20:12:31 +01:00
Evgeny Mankov ab6abe150d Merge pull request #234 from ChrisKitching/warningSpam
[HIPIFY] Do not process __fetch_builtin_* in cudaCall()
2017-10-27 21:30:42 +03:00
Evgeny Mankov d1600b4a85 Merge pull request #235 from ChrisKitching/preprocessorEnhancements
[HIPIFY] Handle unconditional preprocessor directives far better
2017-10-27 21:21:20 +03:00
Ben Sander 8a64feef61 Merge pull request #198 from AlexVlx/feature_support_globals_for_module_api
Feature support globals for module api
2017-10-27 01:53:34 +02:00
Ben Sander 9713dbb6f6 Merge pull request #218 from ChrisKitching/nodiscard
Add [[nodiscard]] attribute to hipError_t in C++17 mode
2017-10-26 22:48:54 +02:00
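A minimal sketch of what a C++17-only `[[nodiscard]]` on an error enum could look like. The names `SKETCH_NODISCARD`, `sketchError_t`, `sketchSetDevice` and `expectSuccess` are hypothetical stand-ins, not the actual HIP header contents; only the general technique (guarding the attribute on the language version) is taken from the commit title.

```cpp
#include <cassert>

// Only emit the attribute when compiling as C++17 or later (assumed guard).
#if __cplusplus >= 201703L
#define SKETCH_NODISCARD [[nodiscard]]
#else
#define SKETCH_NODISCARD
#endif

// Hypothetical stand-in for hipError_t. Marking the enum type itself means
// every function returning it warns when the result is ignored.
enum SKETCH_NODISCARD sketchError_t {
    sketchSuccess = 0,
    sketchErrorInvalidValue = 1
};

// Hypothetical API call returning the error type.
sketchError_t sketchSetDevice(int deviceId) {
    return deviceId >= 0 ? sketchSuccess : sketchErrorInvalidValue;
}

// Consuming the result explicitly keeps the compiler quiet.
void expectSuccess(sketchError_t status) {
    assert(status == sketchSuccess && "call did not succeed");
}
```

With this in place, a bare `sketchSetDevice(0);` statement would typically draw a compiler warning in C++17 mode, while `expectSuccess(sketchSetDevice(0));` would not.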
Ben Sander f76bcf4045 Merge pull request #223 from bensander/2x_bidir
Use 2X for bidir memory bandwidth calc
2017-10-26 21:49:06 +02:00
Chris Kitching 6491c2c3eb Greatly enhance handling of macros in kernel launches
All but the most contrived use of macros is now properly handled -
have a look at the new testcases this commit adds. You can have
macros in kernel calls, macros spanning chunks of your arguments;
the call, call parameters, or callee can all be macros or
partially macros.
2017-10-26 17:28:46 +01:00
Chris Kitching a35d30e0b7 Simplify how kernel launch expressions get translated
It seems like there was a lot of machinery here that is no longer
needed now we have hipLaunchKernelGGL (which doesn't require us
to insert an extra argument into kernel functions). We no longer
need to waste cycles scanning the AST for callees.

We can literally just do "Take the callee expression, and dump
it into the first argument of hipLaunchKernelGGL()".
2017-10-26 17:28:30 +01:00
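The "take the callee expression and dump it into the first argument" idea above can be sketched at string level. This is an illustration only: the real tool operates on clang AST nodes and source ranges, and the `KernelLaunch` struct and `toHipLaunch` function are hypothetical names. The two `0, 0` arguments assume default dynamic shared memory and stream.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, string-level description of a CUDA kernel launch.
struct KernelLaunch {
    std::string callee;             // text of the callee expression, e.g. "axpy"
    std::string gridDim;            // first launch-configuration argument
    std::string blockDim;           // second launch-configuration argument
    std::vector<std::string> args;  // kernel arguments as written in the source
};

// Builds the hipLaunchKernelGGL call text. The callee text is copied through
// unchanged, so no scan for the callee declaration is needed.
std::string toHipLaunch(const KernelLaunch& launch) {
    assert(!launch.callee.empty() && "a launch needs a callee expression");
    std::string out = "hipLaunchKernelGGL(" + launch.callee;
    out += ", dim3(" + launch.gridDim + ")";
    out += ", dim3(" + launch.blockDim + ")";
    out += ", 0, 0";  // shared memory bytes and stream, assumed defaulted
    for (const std::string& arg : launch.args) {
        out += ", " + arg;
    }
    out += ")";
    return out;
}
```

Because the callee is treated as opaque text, the same path would handle a plain function name, a template instantiation, or a macro that expands to either.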
Chris Kitching 1ef68090ae Deduplicate preprocessor code
There are three functions here that all do the same thing...

There was also logic that looks for numeric literals and works
backwards to find the macro name from which they are expanded.
I previously introduced code that rewrites macro references at
expand-time in the `MacroExpands` callback, so that code is no
longer doing anything useful.
2017-10-26 17:28:30 +01:00
Chris Kitching 30e7e7d919 Rewrite _all_ CUDA macro identifiers in the preprocessor
Calls to macros that were themselves CUDA API calls were often
being missed - this applies the identifier transform to macro
names at the callsites, too.
2017-10-26 17:27:56 +01:00
Chris Kitching f7e65c5334 Don't special-case source locations for calls in macros
The source location for a call that's inside a macro body will,
by default, point into the macro definition itself. The original
logic was causing macro invocations to be overwritten, as I
explain here:
https://github.com/ROCm-Developer-Tools/HIP/issues/207#issuecomment-337521851

The existing PPCallbacks code is correctly rewriting macro
definitions, so the practical effect of this change is that AST
rewrites on code that's expanded from macros are no-ops.

It might be a performance optimisation to put a short-circuit at
the top of the AST callbacks to abort when faced with code that
was expanded from macros.

It might yet prove wise to do absolutely everything at lex-time...
2017-10-26 17:26:37 +01:00
Chris Kitching c1f4612176 Prefer early-return to deep nesting
A chain of 7 closing braces is never a great sign :D

In the process it became apparent that the unsupported flag
was being silently ignored, causing users to be left with CUDA
API calls in their programs with no warning given. This has been
rectified for consistency.
2017-10-26 17:26:37 +01:00
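A small sketch of the early-return style described above, including an explicit branch for the unsupported flag so that it cannot be skipped silently. `Mapping`, `Outcome` and `rewriteName` are hypothetical names chosen for illustration; the actual function in the tool works on AST match results.

```cpp
#include <cassert>
#include <string>

// Hypothetical record describing how one CUDA name maps to HIP.
struct Mapping {
    std::string hipName;
    bool unsupported;
};

enum class Outcome { NotCuda, WarnUnsupported, Rewritten };

// Each precondition exits immediately, so the main path stays at one
// indentation level instead of sitting inside a pyramid of nested ifs.
Outcome rewriteName(const Mapping* found, std::string& text) {
    if (found == nullptr) {
        return Outcome::NotCuda;  // nothing to translate
    }
    if (found->unsupported) {
        return Outcome::WarnUnsupported;  // caller is expected to warn the user
    }
    assert(!found->hipName.empty() && "supported mappings need a HIP name");
    text = found->hipName;
    return Outcome::Rewritten;
}
```

Giving the unsupported case its own return value makes it hard for a caller to drop the warning by accident, which is the kind of bug the refactor surfaced.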
Chris Kitching d5fd5e55c5 Do not process __fetch_builtin_* in cudaCall()
Fixes #205
2017-10-26 17:23:55 +01:00
Evgeny Mankov fbdedd9196 Merge pull request #213 from ChrisKitching/simplify
[HIPIFY] Simplify (and accelerate) hipification of CUDA type identifiers
2017-10-26 19:17:01 +03:00
Ben Sander cf9eb832a2 Merge pull request #232 from kknox/update-cuda-version
Update container to newer cuda driver and sdk 9.0
2017-10-25 23:45:24 +02:00
Kent Knox f81d6d5bb3 Update container to newer cuda driver and sdk 9.0 2017-10-25 16:14:32 -05:00
Chris Kitching 8d91579dcf Don't use now-defunct cmake variable in lit test config 2017-10-24 20:52:51 +01:00
Chris Kitching b24b33ee2e Refactor cudaCall to prefer early return to deep nesting
Sorry for the invasive refactor, but this was making reasoning
about this function more difficult.
2017-10-24 20:38:49 +01:00
Chris Kitching 9da456b315 Split the giant lookup table into 3 smaller ones
Instead of having a single, enormous LUT for all CUDA names, let's
have separate ones for different types of entity. We often know
that we're looking at a typename, or a function name, or a macro
name - so we can be more efficient (and resilient to name
collisions) by having smaller lookup tables for each of those
classes of entity.

Here we start that off by having three LUTs:
- Header names
- Type names
- Everything else

Future work could usefully split "everything else" into:
- enum values
- macro names
- function names
- everything else

It's worth noting that the "needs new matcher" todos I delete here
were actually resolved with the previous commit. It no longer
naively searches for things that start with "cu*" - it will find
exactly those things that are present in our lookup tables.
2017-10-24 20:38:49 +01:00
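The three-table split described above might look roughly like the following. The table names, the `EntityKind` enum and the `translate` helper are assumptions made for this sketch, and the entries are a handful of illustrative CUDA-to-HIP pairs rather than the real tables.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical classification of the name being looked up.
enum class EntityKind { Header, Type, Other };

using NameTable = std::map<std::string, std::string>;

// Illustrative entries only; the real tables are far larger.
const NameTable kHeaderNames = {{"cuda_runtime.h", "hip/hip_runtime.h"}};
const NameTable kTypeNames = {{"cudaError_t", "hipError_t"},
                              {"cudaStream_t", "hipStream_t"}};
const NameTable kOtherNames = {{"cudaMalloc", "hipMalloc"},
                               {"cudaFree", "hipFree"}};

// Picks the table that matches what the caller already knows about the name.
const NameTable& tableFor(EntityKind kind) {
    switch (kind) {
        case EntityKind::Header: return kHeaderNames;
        case EntityKind::Type:   return kTypeNames;
        case EntityKind::Other:  return kOtherNames;
    }
    assert(false && "unknown entity kind");
    return kOtherNames;
}

// Returns the HIP name, or the input unchanged when the table for that kind
// has no entry. A type lookup can never hit a function entry by accident.
std::string translate(EntityKind kind, const std::string& cudaName) {
    const NameTable& table = tableFor(kind);
    auto it = table.find(cudaName);
    return it == table.end() ? cudaName : it->second;
}
```

Splitting "everything else" further, as the commit suggests, would only mean adding enumerators and tables; the lookup path stays the same.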
Chris Kitching 93c9b3ca34 One matcher for type expressions to rule them all
Previously, there were different AST matchers for each
language construct that contains a type reference, and custom
logic to perform the transformation within each of those
structures.

Since the transformation in all such cases was only replacing
CUDA types with HIP ones, we can instead use an AST matcher
that finds and updates the type references directly.
This simplifies the program considerably, and it won't fail
when it finds a language feature (or complicated type expression)
that nobody wrote custom logic for yet.
2017-10-24 20:38:49 +01:00
Chris Kitching 3b2d8029fb Make control flow less insane
`while(false)` is certainly a bold choice.
2017-10-24 20:38:49 +01:00
Chris Kitching 9a4778f435 Use a cmake glob for collecting hipify sources
Should make breaking this monstrosity into multiple files a bit
easier...
2017-10-24 20:38:48 +01:00
Chris Kitching 803d3ffd9c Move giant lookup table into another translation unit
Also, rewrote it as a constant variable instead of a function
that imperatively fills a map. It's shorter, faster to compile,
and (depending on how badly the compiler screws it up) maybe
faster to run.

And, of course, it starts breaking up that giant .cpp file.
2017-10-24 20:38:25 +01:00
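To illustrate the difference between the two styles mentioned above, here is a sketch with both forms side by side. The names `buildTableImperatively`, `kCudaToHip` and `hipNameFor` are hypothetical, and the three entries stand in for a much larger table.

```cpp
#include <cassert>
#include <map>
#include <string>

using NameTable = std::map<std::string, std::string>;

// Before (sketch): a function that fills the map one statement at a time.
NameTable buildTableImperatively() {
    NameTable table;
    table["cudaMalloc"] = "hipMalloc";
    table["cudaFree"] = "hipFree";
    table["cudaMemcpy"] = "hipMemcpy";
    return table;
}

// After (sketch): a constant initialised from a brace list. The data reads as
// a table, and nothing can modify it after construction.
const NameTable kCudaToHip = {
    {"cudaMalloc", "hipMalloc"},
    {"cudaFree", "hipFree"},
    {"cudaMemcpy", "hipMemcpy"},
};

// Lookup helper for names that are known to be present.
const std::string& hipNameFor(const std::string& cudaName) {
    auto it = kCudaToHip.find(cudaName);
    assert(it != kCudaToHip.end() && "name is not in the table");
    return it->second;
}
```

When the table lives in its own translation unit, other files would normally see only an `extern const` declaration of it in a shared header, so editing an entry recompiles one file rather than the whole tool.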
Evgeny Mankov 4e78738267 Merge pull request #230 from emankov/master
[HIPIFY][fix] cmake: NO_DEFAULT_PATH is strongly needed in find_package for LLVM
2017-10-24 20:44:56 +03:00
Evgeny Mankov cbb772c3ab Merge pull request #224 from ChrisKitching/tests
[HIPIFY] Make the automated tests more useful
2017-10-24 20:12:36 +03:00
Evgeny Mankov 712a1da073 Merge branch 'master' into tests 2017-10-24 20:03:51 +03:00
Evgeny Mankov 7b307b21d1 Merge pull request #227 from ChrisKitching/clang-silly
Tweak some version numbers in clang version compatibility checks
2017-10-24 17:16:07 +03:00
Evgeny Mankov 4a0228cedb [HIPIFY][fix] cmake: NO_DEFAULT_PATH is strongly needed in find_package for LLVM
Otherwise LLVM will be searched in system folders.
2017-10-24 16:35:10 +03:00
Rahul Garg 10b1b58505 Example showing globals use with module APIs 2017-10-24 18:12:25 +05:30
Maneesh Gupta e324e77184 Merge pull request #225 from emankov/master
[HIPIFY] cmake: fix standalone build
2017-10-24 17:06:51 +05:30
Evgeny Mankov 7c51376bbf [HIPIFY][fix] cmake: do not build hipify-clang if not asked
+ warn "hipify-clang will not be built" if HIPIFY_CLANG_LLVM_DIR is not specified.
+ fix typo in previous commit.
2017-10-24 14:16:05 +03:00
emankov 559d9a68aa [HIPIFY] cmake: simplify build 2017-10-24 10:51:11 +03:00
Chris Kitching aa288fcf1f Tweak some version numbers in clang version compatibility checks
Apparently a couple of those APIs changed in clang 5, not 4.

Drat.
2017-10-24 01:45:23 +01:00
Evgeny Mankov cdd849541f [HIPIFY] cmake: fix standalone build 2017-10-23 21:16:13 +03:00
Rahul Garg 4090c82936 Use 2X for bidir p2p memory bandwidth calc 2017-10-23 21:57:20 +05:30
Chris Kitching 2c65d0da37 Add concurentKernels.cu to the testsuite 2017-10-23 13:39:37 +01:00
Chris Kitching 64d5f07050 Add the CUDA samples include dir to the path for tests
Means we get to easily steal CUDA examples for tests
2017-10-23 13:39:37 +01:00
Chris Kitching 2437b31939 Add cudaRegister.cu lit test 2017-10-23 13:39:37 +01:00
Chris Kitching ead79e5bf4 Add square.cu to lit testsuite 2017-10-23 13:39:37 +01:00
Chris Kitching c99dcbba8d Introduce a test runner script to simplify invocation
... And to use a standard, highly amusing trick for making
coloured output work.
2017-10-23 13:39:37 +01:00
Chris Kitching 5912f465bd Adapt lit test for the hipLaunchKernelGGL changes from before... 2017-10-23 13:39:37 +01:00
Chris Kitching 74fd64d5c1 Migrate lit test to using FileCheck, so failures are readable
It seems the test is already broken, but look how awesome the
error message is now:

/home/chris/HIP/tests/hipify-clang/axpy.cu:31:12: error: expected string not found in input
 // CHECK: hipLaunchKernel(HIP_KERNEL_NAME(axpy), dim3(1), dim3(kDataLen), 0, 0, a, device_x, device_y);
           ^
<stdin>:31:2: note: scanning from here
 //
 ^
<stdin>:33:2: note: possible intended match here
 hipLaunchKernelGGL(axpy, dim3(1), dim3(kDataLen), 0, 0, a, device_x, device_y);
 ^
2017-10-23 13:39:37 +01:00
Chris Kitching 3868036ea7 Look for FileCheck for running lit tests, too
Use of grep in `lit` RUN lines is deprecated:
https://llvm.org/docs/TestingGuide.html#writing-new-regression-tests

Using grep leads to really unhelpful failure output (it literally
just says "the test failed"). FileCheck is much more helpful, and
distributed with LLVM on most distros anyway, so this extra
dependency shouldn't prove problematic.
2017-10-23 13:39:36 +01:00
Chris Kitching 9747578d09 Propagate the CUDA toolkit directory into the lit tests
Allows the tests to actually run... :D
2017-10-23 13:39:36 +01:00
Chris Kitching e9d259699c Remove dependency on nonexistent lit target
This just... isn't a target. Maybe you were trying to do a file
dependency? But that's handled implicitly by the find_program
call anyway, so all this is doing is generating a cmake warning
2017-10-23 13:39:36 +01:00
Maneesh Gupta 281c543bc3 Make elfio headers private
Change-Id: I3ba174bb46e84a75380207d93a0da6fe3703689e
2017-10-23 10:24:36 +05:30
Ben Sander 21689e8710 Use 2X for bidir memory bandwidth calc 2017-10-21 07:47:32 -05:00
Evgeny Mankov 9dec691dbf Merge pull request #219 from ChrisKitching/newClang
[HIPIFY] Migrate hipify-clang to a newer version of clang
2017-10-21 11:59:46 +03:00
Ben Sander 326651875d Merge pull request #214 from scchan/reinit_printf_buffer
hipDeviceReset(): make sure to reinitialize the printf buffer in hcc RT
2017-10-20 11:04:37 -05:00