The original implementation had the statistics system woven very
tightly into things like PPCallbacks, with counters duplicated
in two places, and all the output code duplicated. This made it
very difficult to alter the structure of the program without
breaking the statistics system.
Since the planned approach for solving the remaining preprocessor
bugs needs the introduction of a custom FrontendAction, and such
a restructure was incompatible with the way the statistics system
was set up, this rewrite was required.
'tis rather simpler now, mind you :D
This commit also fixes an issue where some stats were counted
twice, and allows `-print-stats` to operate independently of
`-stat-output`, allowing you to print stats to a file without
printing them to a terminal (or vice-versa).
[ROCm/clr commit: dd5a60054a]
operator[] value-initialises the mapped value if no entry exists
for that key. Value-initialisation of an int yields zero. So all
the manual faffing around is just unnecessary.
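
As a minimal sketch of the pattern described above (the helper name
`countNames` is hypothetical and not taken from the codebase), a
counter map can be incremented directly, with no lookup first:

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: tally how many times each name occurs.
std::map<std::string, int> countNames(const std::vector<std::string>& names) {
    std::map<std::string, int> counts;
    for (const std::string& name : names) {
        // operator[] inserts a value-initialised int (0) when the key is
        // missing, so no find()/insert() step is needed before incrementing.
        ++counts[name];
    }
    return counts;
}
```

Note that reading a missing key through operator[] also inserts it,
so use find() or count() where that side effect is unwanted.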
[ROCm/clr commit: cecc0782ef]
All but the most contrived uses of macros are now properly
handled - have a look at the new testcases this commit adds. You
can have macros in kernel calls and macros spanning chunks of
your arguments; the call, the call parameters, and the callee can
each be a macro or partially a macro.
[ROCm/clr commit: 6491c2c3eb]
It seems like there was a lot of machinery here that is no longer
needed now we have hipLaunchKernelGGL (which doesn't require us
to insert an extra argument into kernel functions). We no longer
need to waste cycles scanning the AST for callees.
We can literally just do "Take the callee expression, and dump
it into the first argument of hipLaunchKernelGGL()".
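
A rough sketch of that idea, working on plain strings rather than
the Clang AST. The helper `buildLaunch` and its parameters are
hypothetical; it assumes the launch configuration is passed as
already-formatted text, and that an omitted shared-memory size or
stream falls back to 0, as it does in a CUDA `<<<...>>>` launch:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not hipify-clang's actual code): assemble the text of
// a hipLaunchKernelGGL call from the pieces of a CUDA <<<...>>> launch.
std::string buildLaunch(const std::string& callee,
                        const std::vector<std::string>& config,
                        const std::vector<std::string>& args) {
    // The callee expression goes in verbatim as the first argument.
    std::string out = "hipLaunchKernelGGL(" + callee;
    // Launch configuration: grid, block, shared memory size, stream.
    // Entries the caller did not supply are emitted as 0.
    for (std::size_t i = 0; i < 4; ++i) {
        out += ", ";
        out += i < config.size() ? config[i] : std::string("0");
    }
    // Kernel arguments follow unchanged.
    for (const std::string& arg : args) {
        out += ", " + arg;
    }
    return out + ");";
}
```

Because the callee is copied through as text, it makes no difference
whether it is a plain name, a template instantiation, or a macro.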
[ROCm/clr commit: a35d30e0b7]
There are three functions here that all do the same thing...
There was also logic that looks for numeric literals and works
backwards to find the macro name from which they are expanded.
I previously introduced code that rewrites macro references at
expand-time in the `MacroExpands` callback, so that code is no
longer doing anything useful.
[ROCm/clr commit: 1ef68090ae]
Calls to macros that were themselves CUDA API calls were often
being missed - this applies the identifier transform to macro
names at the callsites, too.
[ROCm/clr commit: 30e7e7d919]
The source location for a call that's inside a macro body will,
by default, point into the macro definition itself. The original
logic was causing macro invocations to be overwritten, as I
explain here:
https://github.com/ROCm-Developer-Tools/HIP/issues/207#issuecomment-337521851
The existing PPCallbacks code is correctly rewriting macro
definitions, so the practical effect of this change is that AST
rewrites on code that's expanded from macros are no-ops.
It might be a performance optimisation to put a short-circuit at
the top of the AST callbacks to abort when faced with code that
was expanded from macros.
It might yet prove wise to do absolutely everything at lex-time...
[ROCm/clr commit: f7e65c5334]
A chain of 7 closing braces is never a great sign :D
In the process it became apparent that the unsupported flag
was being silently ignored, causing users to be left with CUDA
API calls in their programs with no warning given. This has been
rectified for consistency.
[ROCm/clr commit: c1f4612176]
Instead of having a single, enormous LUT for all CUDA names, let's
have separate ones for different types of entity. We often know
that we're looking at a typename, or a function name, or a macro
name - so we can be more efficient (and resilient to name
collisions) by having smaller lookup tables for each of those
classes of entity.
Here we start that off by having three LUTs:
- Header names
- Type names
- Everything else
Future work could usefully split "everything else" into:
- enum values
- macro names
- function names
- everything else
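
To illustrate the three-table split listed first, here is a small
sketch. The names `NameKind`, `kHeaderNames`, `kTypeNames`,
`kOtherNames` and `renameCuda` are made up for this example, and the
handful of entries shown stand in for the real, much larger tables:

```cpp
#include <map>
#include <string>

// Kinds of entity the caller may already know it is looking at.
enum class NameKind { Header, Type, Other };

// Illustrative entries only; the real tables are far larger.
const std::map<std::string, std::string> kHeaderNames = {
    {"cuda_runtime.h", "hip/hip_runtime.h"},
};
const std::map<std::string, std::string> kTypeNames = {
    {"cudaError_t", "hipError_t"},
    {"cudaStream_t", "hipStream_t"},
};
const std::map<std::string, std::string> kOtherNames = {
    {"cudaMalloc", "hipMalloc"},
    {"cudaFree", "hipFree"},
};

// Hypothetical lookup: returns the HIP name, or the input unchanged when the
// name is not present in the table for that kind.
std::string renameCuda(NameKind kind, const std::string& name) {
    const auto& table = kind == NameKind::Header ? kHeaderNames
                      : kind == NameKind::Type   ? kTypeNames
                                                 : kOtherNames;
    auto it = table.find(name);
    return it == table.end() ? name : it->second;
}
```

A lookup only consults the table for the kind it was asked about, so
a function name that happens to collide with a type name cannot be
rewritten by mistake.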
It's worth noting that the "needs new matcher" todos I delete here
were actually resolved with the previous commit. It no longer
naively searches for things that start with "cu*" - it will find
exactly those things that are present in our lookup tables.
[ROCm/clr commit: 9da456b315]
Previously, there were different AST matchers for each
language construct that contains a type reference, and custom
logic to perform the transformation within each of those
structures.
Since the transformation in all such cases was only replacing
CUDA types with HIP ones, we can instead use an AST matcher
that finds and updates the type references directly.
This simplifies the program considerably, and it won't fail
when it finds a language feature (or complicated type expression)
that nobody wrote custom logic for yet.
[ROCm/clr commit: 93c9b3ca34]
Also, rewrote it as a constant variable instead of a function
that imperatively fills a map. It's shorter, faster to compile,
and (depending on how badly the compiler screws it up) maybe
faster to run.
And, of course, it starts breaking up that giant .cpp file.
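
For comparison, a sketch of the two styles side by side. Both
`makeRenamesImperatively` and `kRenames` are hypothetical names, and
the two entries are placeholders for the real contents:

```cpp
#include <map>
#include <string>

// Before (sketch): a function that imperatively fills the map at run time.
std::map<std::string, std::string> makeRenamesImperatively() {
    std::map<std::string, std::string> m;
    m["cudaMalloc"] = "hipMalloc";
    m["cudaFree"] = "hipFree";
    return m;
}

// After (sketch): a constant initialised once from a braced list.
const std::map<std::string, std::string> kRenames = {
    {"cudaMalloc", "hipMalloc"},
    {"cudaFree", "hipFree"},
};
```

Both produce the same contents; the constant form is declarative and
cannot be modified by accident after start-up.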
[ROCm/clr commit: 803d3ffd9c]
It seems the test is already broken, but look how awesome the
error message is now:
/home/chris/HIP/tests/hipify-clang/axpy.cu:31:12: error: expected string not found in input
// CHECK: hipLaunchKernel(HIP_KERNEL_NAME(axpy), dim3(1), dim3(kDataLen), 0, 0, a, device_x, device_y);
^
<stdin>:31:2: note: scanning from here
//
^
<stdin>:33:2: note: possible intended match here
hipLaunchKernelGGL(axpy, dim3(1), dim3(kDataLen), 0, 0, a, device_x, device_y);
^
[ROCm/clr commit: 74fd64d5c1]
Use of grep in `lit` RUN lines is deprecated:
https://llvm.org/docs/TestingGuide.html#writing-new-regression-tests
Using grep leads to really unhelpful failure output (it literally
just says "the test failed"). FileCheck is much more helpful, and
distributed with LLVM on most distros anyway, so this extra
dependency shouldn't prove problematic.
[ROCm/clr commit: 3868036ea7]