Instead of having a single, enormous LUT for all CUDA names, let's
have separate ones for different types of entity. We often know
that we're looking at a typename, or a function name, or a macro
name - so we can be more efficient (and resilient to name
collisions) by having smaller lookup tables for each of those
classes of entity).
Here we start that off by having three LUTs:
- Header names
- Type names
- Everything else
Future work could usefully split "everything else" into:
- enum values
- macro names
- function names
- everything else
It's worth noting that the "needs new matcher" todos I delete here
were actually resolved with the previous commit. It no longer
naively searches for things that start with "cu*" - it will find
exactly those things that are present in our lookup tables.
Previously, there were different AST matchers for each
language construct that contains a type reference, and custom
logic to perform the transformation within each of those
structures.
Since the transformation in all such cases was only replacing
CUDA types with hip ones, we can instead use an AST matcher
that finds and updates the type references directly.
This simplifies the program considerably, and it won't fail
when it finds a language feature (or complicated type expression)
that nobody wrote custom logic for yet.
Also, rewrote it as a constant variable instead of a function
that imperatively fills a map. It's shorter, faster the compile,
and (depending on how badly the compiler screws it up) maybe
faster to run.
And, of course, it starts breaking up that giant .cpp file.
It seems the test is already broken, but look how awesome the
error message is now:
/home/chris/HIP/tests/hipify-clang/axpy.cu:31:12: error: expected string not found in input
// CHECK: hipLaunchKernel(HIP_KERNEL_NAME(axpy), dim3(1), dim3(kDataLen), 0, 0, a, device_x, device_y);
^
<stdin>:31:2: note: scanning from here
//
^
<stdin>:33:2: note: possible intended match here
hipLaunchKernelGGL(axpy, dim3(1), dim3(kDataLen), 0, 0, a, device_x, device_y);
^
Use of grep in `lit` RUN lines is deprecated:
https://llvm.org/docs/TestingGuide.html#writing-new-regression-tests
Using grep leads to really unhelpful failure output (it literally
just says "the test failed"). FileCheck is much more helpful, and
distributed with LLVM on most distros anyway, so this extra
dependency shouldn't prove problematic.
This just... isn't a target. Maybe you were trying to do a file
dependency? But that's handled implicitly by the find_program
call anyway, so all this is doing is generating a cmake warning
This bug was present all along, but something changed in the order
of de-initialisation performed by llvm that makes it actually
crash now.
The constructor of HipifyPPCallbacks gives:
```
std::unique_ptr<HipifyPPCallbacks>(this)
```
to the LLVM Preprocessor instance. The Preprocessor instance
subsequently frees the HipifyPPCallbacks, which is then freed
again when we leave the stack frame at line 4340.
So: let's leak the HipifyPPCallbacks onto the heap, and leave
the LLVM Preprocessor object responsible for tidying it up.
The Preprocessor smart pointer is held by the CompilerInstance,
and therefore its reference count cannot reach zero until the
CompilerInstance itself is destroyed.
If the CompilerInstance is destroyed, you have more to worry about
than just the preprocessor being deallocated!
Newer versions of the LLVM/Clang API migrated to using
std::shared_ptr, so there is no `Retain()` function (by that
name, anyway). Eliminating this redundant use is a neat and
backward-compatible way to become compatible with newer versions
of the LLVM/Clang API.
Newer versions of llvm/clang mean there is both an
llvm::StringLiteral and a clang::StringLiteral. Since we're
dumping both namespaces wholesale into the global namespace with
`using` declarations, this creates a name collision, which must be
resolved.
This change is backwards-compatible, and fixes a problem you
encounter when using newer versions of the llvm/clang API.
What we actually want to do here is use the StringRef version in
versions newer than 3.8, and the void one in 3.8 and older.
Checking "major-version >= 3 && minor-version >= 9" does not do
what we want. Consider what this will do for version 4.0, for
which minor-version is zero...