Commit Graph

1927 Commits

Author SHA1 Message Date
Chris Kitching 4e6ca773fa Decouple the statistics system from the code translation
The original implementation had the statistics system woven very
tightly into things like PPCallbacks, with counters duplicated
in two places, and all the output code duplicated. This made it
very difficult to alter the structure of the program without
breaking the statistics system.

Since the planned approach for solving the remaining preprocessor
bugs needs the introduction of a custom FrontendAction, and such
a restructure was incompatible with the way the statistics system
was set up, this rewrite was required.

'tis rather simpler now, mind you :D

This commit also fixes an issue where some stats were counted
twice, and allows `-print-stats` to operate independently of
`-stat-output`, allowing you to print stats to a file without
printing them to a terminal (or vice-versa).


[ROCm/clr commit: dd5a60054a]
2017-10-27 20:12:33 +01:00
Chris Kitching c6bc4f5249 Copy-paste less in the statistics printing code
[ROCm/clr commit: 5365a8638f]
2017-10-27 20:12:33 +01:00
Chris Kitching e9b8afaaeb Inline updateCountersExt
[ROCm/clr commit: 69d2555f17]
2017-10-27 20:12:32 +01:00
Chris Kitching 0b45e8d905 Update counter maps sanely
operator[] default-constructs the map value if no value exists
for that key. Default-construction of int yields a zero. So all
the manual faffing around is just unnecessary.
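
To illustrate the point above, here is a minimal sketch of the two styles. The `CounterMap` alias and both function names are hypothetical and are not taken from the hipify-clang sources; the sketch only assumes the counters are integers keyed by name.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical counter type: name of a CUDA entity -> number of times seen.
using CounterMap = std::map<std::string, int>;

// The "manual" style: look the key up first, then insert or increment.
void bumpVerbose(CounterMap& counters, const std::string& name) {
    assert(!name.empty());  // a lexed identifier is never empty
    auto it = counters.find(name);
    if (it == counters.end()) {
        counters.insert({name, 1});
    } else {
        ++it->second;
    }
}

// The short style: operator[] inserts a value-initialised int (zero)
// when the key is missing, so a plain increment is enough.
void bump(CounterMap& counters, const std::string& name) {
    assert(!name.empty());
    ++counters[name];
}
```

Both functions leave the map in the same state, so the shorter one can replace the longer one wherever the value type is an arithmetic type.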


[ROCm/clr commit: cecc0782ef]
2017-10-27 20:12:32 +01:00
Chris Kitching b89133c2d4 Prefer references to pointers in updateCountersExt()
[ROCm/clr commit: 828552decb]
2017-10-27 20:12:32 +01:00
Chris Kitching f659ad0eeb Move string utility functions into their own translation unit
[ROCm/clr commit: ab824ebd47]
2017-10-27 20:12:32 +01:00
Chris Kitching dcd770f25a Extract LLVM compatibility code into its own translation unit
[ROCm/clr commit: 9478be8198]
2017-10-27 20:12:32 +01:00
Chris Kitching 43b9ce8b56 Remove unused field
[ROCm/clr commit: 9eeac22359]
2017-10-27 20:12:32 +01:00
Chris Kitching 3d0f71e803 Remove CUDA_EXCLUDES
An artefact from a now-defunct hack to avoid corrupting programs


[ROCm/clr commit: 51df93cf25]
2017-10-27 20:12:32 +01:00
Chris Kitching cf14905f11 Make unsupported actually be a bool...
[ROCm/clr commit: 417fc92482]
2017-10-27 20:12:31 +01:00
Evgeny Mankov bfa8ad1880 Merge pull request #234 from ChrisKitching/warningSpam
[HIPIFY] Do not process __fetch_builtin_* in cudaCall()

[ROCm/clr commit: ab6abe150d]
2017-10-27 21:30:42 +03:00
Evgeny Mankov 29c12e7927 Merge pull request #235 from ChrisKitching/preprocessorEnhancements
[HIPIFY] Handle unconditional preprocessor directives far better

[ROCm/clr commit: d1600b4a85]
2017-10-27 21:21:20 +03:00
Ben Sander b0aa15ee5f Merge pull request #198 from AlexVlx/feature_support_globals_for_module_api
Feature support globals for module api

[ROCm/clr commit: 8a64feef61]
2017-10-27 01:53:34 +02:00
Ben Sander 1058f074f3 Merge pull request #218 from ChrisKitching/nodiscard
Add [[nodiscard]] attribute to hipError_t in C++17 mode

[ROCm/clr commit: 9713dbb6f6]
2017-10-26 22:48:54 +02:00
Ben Sander 7ce81b1573 Merge pull request #223 from bensander/2x_bidir
Use 2X for bidir memory bandwidth calc

[ROCm/clr commit: f76bcf4045]
2017-10-26 21:49:06 +02:00
Chris Kitching 7360326705 Greatly enhance handling of macros in kernel launches
All but the most contrived use of macros is now properly handled -
have a look at the new testcases this commit adds. You can have
macros in kernel calls, macros spanning chunks of your arguments,
the call, call parameters, or callee can all be macros or
partially macros.


[ROCm/clr commit: 6491c2c3eb]
2017-10-26 17:28:46 +01:00
Chris Kitching 58428d739c Simplify how kernel launch expressions get translated
It seems like there was a lot of machinery here that is no longer
needed now we have hipLaunchKernelGGL (which doesn't require us
to insert an extra argument into kernel functions). We no longer
need to waste cycles scanning the AST for callees.

We can literally just do "Take the callee expression, and dump
it into the first argument of hipLaunchKernelGGL()".
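
As a rough, string-level sketch of that idea (not the actual AST rewriter), the helper below assembles a `hipLaunchKernelGGL` call from the parts of a CUDA launch. The function name `makeHipLaunch` and its parameters are hypothetical. The `dim3(...)` wrapping and the zero defaults for shared memory and stream are assumptions modelled on the `axpy` test output quoted elsewhere in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: builds the text of a hipLaunchKernelGGL call.
//   callee - the kernel expression, copied verbatim into the first argument
//   config - the <<<...>>> launch configuration (2 to 4 entries)
//   args   - the kernel's own arguments
std::string makeHipLaunch(const std::string& callee,
                          const std::vector<std::string>& config,
                          const std::vector<std::string>& args) {
    // A CUDA launch always supplies at least grid and block dimensions.
    assert(config.size() >= 2 && config.size() <= 4);

    std::string out = "hipLaunchKernelGGL(" + callee;
    out += ", dim3(" + config[0] + ")";
    out += ", dim3(" + config[1] + ")";
    // Assumed defaults: shared-memory size and stream are 0 when omitted.
    out += ", " + (config.size() > 2 ? config[2] : std::string("0"));
    out += ", " + (config.size() > 3 ? config[3] : std::string("0"));
    for (const std::string& arg : args) {
        out += ", " + arg;
    }
    return out + ")";
}
```

Because the callee is passed through untouched, nothing in this sketch needs to know what the kernel is or where it is declared.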


[ROCm/clr commit: a35d30e0b7]
2017-10-26 17:28:30 +01:00
Chris Kitching 74af29d66a Deduplicate preprocessor code
There's three functions here that all do the same thing...

There was also logic that looks for numeric literals and works
backwards to find the macro name from which they are expanded.
I previously introduced code that rewrites macro references at
expand-time in the `MacroExpands` callback, so that code is no
longer doing anything useful.


[ROCm/clr commit: 1ef68090ae]
2017-10-26 17:28:30 +01:00
Chris Kitching b25b17b6b3 Rewrite _all_ CUDA macro identifiers in the preprocessor
Calls to macros that were themselves CUDA API calls were often
being missed - this applies the identifier transform to macro
names at the callsites, too.


[ROCm/clr commit: 30e7e7d919]
2017-10-26 17:27:56 +01:00
Chris Kitching cf50b4f97a Don't special-case source locations for calls in macros
The source location for a call that's inside a macro body will,
by default, point into the macro definition itself. The original
logic was causing macro invocations to be overwritten, as I
explain here:
https://github.com/ROCm-Developer-Tools/HIP/issues/207#issuecomment-337521851

The existing PPCallbacks code is correctly rewriting macro
definitions, so the practical effect of this change is that AST
rewrites on code that's expanded from macros are no-ops.

It might be a performance optimisation to put a short-circuit at
the top of the AST callbacks to abort when faced with code that
was expanded from macros.

It might yet prove wise to do absolutely everything at lex-time...


[ROCm/clr commit: f7e65c5334]
2017-10-26 17:26:37 +01:00
Chris Kitching af81909cab Prefer early-return to deep nesting
A chain of 7 closing braces is never a great sign :D

In the process it became apparent that the unsupported flag
was being silently ignored, causing users to be left with CUDA
API calls in their programs with no warning given. This has been
rectified for consistency.
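
A small sketch of the early-return shape, including a warning on the unsupported path, might look like this. Everything here is hypothetical (the `Mapping` struct, the `translate` function, the warning text and the made-up identifier `cudaHypotheticalCall` in the usage note); it only assumes a table entry carries a boolean unsupported flag, and it does not reproduce the real `cudaCall()` logic.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical table entry: the HIP replacement plus an "unsupported" flag.
struct Mapping {
    std::string hipName;
    bool unsupported;
};

// Returns the text that should replace `name`. Each special case exits
// early, so the main path is not buried inside nested if-blocks.
std::string translate(const std::map<std::string, Mapping>& table,
                      const std::string& name,
                      std::vector<std::string>& warnings) {
    assert(!name.empty());  // a lexed identifier is never empty

    auto it = table.find(name);
    if (it == table.end()) {
        return name;  // not a known CUDA identifier: leave it alone
    }
    if (it->second.unsupported) {
        // Tell the user rather than silently leaving the CUDA name in place.
        warnings.push_back("unsupported CUDA identifier: " + name);
        return name;
    }
    return it->second.hipName;
}
```

With a table holding `cudaMalloc` (supported) and `cudaHypotheticalCall` (unsupported), the first is rewritten to `hipMalloc` and the second is returned unchanged with one warning recorded.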


[ROCm/clr commit: c1f4612176]
2017-10-26 17:26:37 +01:00
Chris Kitching 07fb675fe7 Do not process __fetch_builtin_* in cudaCall()
Fixes #205


[ROCm/clr commit: d5fd5e55c5]
2017-10-26 17:23:55 +01:00
Evgeny Mankov 2371dda65e Merge pull request #213 from ChrisKitching/simplify
[HIPIFY] Simplify (and accelerate) hipification of CUDA type identifiers

[ROCm/clr commit: fbdedd9196]
2017-10-26 19:17:01 +03:00
Kent Knox f32e20abc7 Update container to newer cuda driver and sdk 9.0
[ROCm/clr commit: f81d6d5bb3]
2017-10-25 16:14:32 -05:00
Chris Kitching 31920efae1 Don't use now-defunct cmake variable in lit test config
[ROCm/clr commit: 8d91579dcf]
2017-10-24 20:52:51 +01:00
Chris Kitching 5ee04d16c5 Refactor cudaCall to prefer early return to deep nesting
Sorry for the invasive refactor, but this was making reasoning
about this function more difficult.


[ROCm/clr commit: b24b33ee2e]
2017-10-24 20:38:49 +01:00
Chris Kitching 4d23fb75e9 Split the giant lookup table into 3 smaller ones
Instead of having a single, enormous LUT for all CUDA names, let's
have separate ones for different types of entity. We often know
that we're looking at a typename, or a function name, or a macro
name - so we can be more efficient (and resilient to name
collisions) by having smaller lookup tables for each of those
classes of entity.

Here we start that off by having three LUTs:
- Header names
- Type names
- Everything else

Future work could usefully split "everything else" into:
- enum values
- macro names
- function names
- everything else

It's worth noting that the "needs new matcher" todos I delete here
were actually resolved with the previous commit. It no longer
naively searches for things that start with "cu*" - it will find
exactly those things that are present in our lookup tables.
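
The three-table layout could be sketched roughly as follows. The names `EntityKind`, `NameMap`, `tableFor` and `lookup` are hypothetical, and each table holds a single sample entry for illustration; the real tables are far larger and may be typed differently.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical tag for the kind of entity the caller already knows it has.
enum class EntityKind { Header, Type, Other };

using NameMap = std::map<std::string, std::string>;

// One small constant table per kind instead of one enormous table.
const NameMap kHeaderNames{{"cuda_runtime.h", "hip/hip_runtime.h"}};
const NameMap kTypeNames{{"cudaError_t", "hipError_t"}};
const NameMap kOtherNames{{"cudaMalloc", "hipMalloc"}};

const NameMap& tableFor(EntityKind kind) {
    switch (kind) {
        case EntityKind::Header: return kHeaderNames;
        case EntityKind::Type:   return kTypeNames;
        case EntityKind::Other:  return kOtherNames;
    }
    assert(false && "unknown EntityKind");
    return kOtherNames;
}

// Returns the HIP name, or nullptr if `name` is not in the table for `kind`.
const std::string* lookup(EntityKind kind, const std::string& name) {
    const NameMap& table = tableFor(kind);
    auto it = table.find(name);
    return it == table.end() ? nullptr : &it->second;
}
```

Looking a function name up in the type table simply misses, which is the resilience to name collisions mentioned above.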


[ROCm/clr commit: 9da456b315]
2017-10-24 20:38:49 +01:00
Chris Kitching a00a5e3101 One matcher for type expressions to rule them all
Previously, there were different AST matchers for each
language construct that contains a type reference, and custom
logic to perform the transformation within each of those
structures.

Since the transformation in all such cases was only replacing
CUDA types with HIP ones, we can instead use an AST matcher
that finds and updates the type references directly.
This simplifies the program considerably, and it won't fail
when it finds a language feature (or complicated type expression)
that nobody wrote custom logic for yet.


[ROCm/clr commit: 93c9b3ca34]
2017-10-24 20:38:49 +01:00
Chris Kitching 1a53e259b3 Make control flow less insane
`while(false)` is certainly a bold choice.


[ROCm/clr commit: 3b2d8029fb]
2017-10-24 20:38:49 +01:00
Chris Kitching 02d3a34917 Use a cmake glob for collecting hipify sources
Should make breaking this monstrosity into multiple files a bit
easier...


[ROCm/clr commit: 9a4778f435]
2017-10-24 20:38:48 +01:00
Chris Kitching bbac7c8b82 Move giant lookup table into another translation unit
Also, rewrote it as a constant variable instead of a function
that imperatively fills a map. It's shorter, faster to compile,
and (depending on how badly the compiler screws it up) maybe
faster to run.

And, of course, it starts breaking up that giant .cpp file.


[ROCm/clr commit: 803d3ffd9c]
2017-10-24 20:38:25 +01:00
Evgeny Mankov 5a2bdb98f0 Merge pull request #230 from emankov/master
[HIPIFY][fix] cmake: NO_DEFAULT_PATH is strongly needed in find_package for LLVM

[ROCm/clr commit: 4e78738267]
2017-10-24 20:44:56 +03:00
Evgeny Mankov a12402fc9a Merge branch 'master' into tests
[ROCm/clr commit: 712a1da073]
2017-10-24 20:03:51 +03:00
Evgeny Mankov a6f0e70cea Merge pull request #227 from ChrisKitching/clang-silly
Tweak some version numbers in clang version compatibility checks

[ROCm/clr commit: 7b307b21d1]
2017-10-24 17:16:07 +03:00
Evgeny Mankov 59c0a69751 [HIPIFY][fix] cmake: NO_DEFAULT_PATH is strongly needed in find_package for LLVM
Otherwise LLVM will be searched in system folders.


[ROCm/clr commit: 4a0228cedb]
2017-10-24 16:35:10 +03:00
Rahul Garg 28c72504b6 Example showing globals use with module APIs
[ROCm/clr commit: 10b1b58505]
2017-10-24 18:12:25 +05:30
Evgeny Mankov 10cd9f9eb9 [HIPIFY][fix] cmake: do not build hipify-clang if not asked
+ warn "hipify-clang will not be built" if HIPIFY_CLANG_LLVM_DIR is not specified.
+ fix typo in previous commit.


[ROCm/clr commit: 7c51376bbf]
2017-10-24 14:16:05 +03:00
emankov d9a550d09c [HIPIFY] cmake: simplify build
[ROCm/clr commit: 559d9a68aa]
2017-10-24 10:51:11 +03:00
Chris Kitching 482541bb8c Tweak some version numbers in clang version compatibility checks
Apparently a couple of those APIs changed in clang 5, not 4.

Drat.


[ROCm/clr commit: aa288fcf1f]
2017-10-24 01:45:23 +01:00
Evgeny Mankov 27df5ecc82 [HIPIFY] cmake: fix standalone build
[ROCm/clr commit: cdd849541f]
2017-10-23 21:16:13 +03:00
Rahul Garg dc6c43772c Use 2X for bidir p2p memory bandwidth calc
[ROCm/clr commit: 4090c82936]
2017-10-23 21:57:20 +05:30
Chris Kitching 26be50b082 Add concurentKernels.cu to the testsuite
[ROCm/clr commit: 2c65d0da37]
2017-10-23 13:39:37 +01:00
Chris Kitching 8fffd03350 Add the CUDA samples include dir to the path for tests
Means we get to easily steal CUDA examples for tests


[ROCm/clr commit: 64d5f07050]
2017-10-23 13:39:37 +01:00
Chris Kitching 7547f796d0 Add cudaRegister.cu lit test
[ROCm/clr commit: 2437b31939]
2017-10-23 13:39:37 +01:00
Chris Kitching 8fd3b3b1cd Add square.cu to lit testsuite
[ROCm/clr commit: ead79e5bf4]
2017-10-23 13:39:37 +01:00
Chris Kitching 8b311424a6 Introduce a test runner script to simplify invocation
... And to use a standard, highly amusing trick for making
coloured output work.


[ROCm/clr commit: c99dcbba8d]
2017-10-23 13:39:37 +01:00
Chris Kitching 4e80782cda Adapt lit test for the hipLaunchKernelGGL changes from before...
[ROCm/clr commit: 5912f465bd]
2017-10-23 13:39:37 +01:00
Chris Kitching 0f6c153774 Migrate lit test to using FileCheck, so failures are readable
It seems the test is already broken, but look how awesome the
error message is now:

/home/chris/HIP/tests/hipify-clang/axpy.cu:31:12: error: expected string not found in input
 // CHECK: hipLaunchKernel(HIP_KERNEL_NAME(axpy), dim3(1), dim3(kDataLen), 0, 0, a, device_x, device_y);
           ^
<stdin>:31:2: note: scanning from here
 //
 ^
<stdin>:33:2: note: possible intended match here
 hipLaunchKernelGGL(axpy, dim3(1), dim3(kDataLen), 0, 0, a, device_x, device_y);
 ^


[ROCm/clr commit: 74fd64d5c1]
2017-10-23 13:39:37 +01:00
Chris Kitching d92c43bd21 Look for FileCheck for running lit tests, too
Use of grep in `lit` RUN lines is deprecated:
https://llvm.org/docs/TestingGuide.html#writing-new-regression-tests

Using grep leads to really unhelpful failure output (it literally
just says "the test failed"). FileCheck is much more helpful, and
distributed with LLVM on most distros anyway, so this extra
dependency shouldn't prove problematic.


[ROCm/clr commit: 3868036ea7]
2017-10-23 13:39:36 +01:00
Chris Kitching baabd2755e Propagate the CUDA toolkit directory into the lit tests
Allows the tests to actually run... :D


[ROCm/clr commit: 9747578d09]
2017-10-23 13:39:36 +01:00