rocm-systems

Author	SHA1	Message	Date
Chris Kitching	4e6ca773fa	Decouple the statistics system from the code translation The original implementation had the statistics system woven very tightly into things like PPCallbacks, with counters duplicated in two places, and all the output code duplicated. This made it very difficult to alter the structure of the program without breaking the statistics system. Since the planned approach for solving the remaining preprocessor bugs needs the introduction of a custom FrontendAction, and such a restructure was incompatible with the way the statistics system was set up, this rewrite was required. 'tis rather simpler now, mind you :D This commit also fixes an issue where some stats were counted twice, and allows `-print-stats` to operate independently of `-stat-output`, allowing you to print stats to a file without printing them to a terminal (or vice-versa). [ROCm/clr commit: `dd5a60054a`]	2017-10-27 20:12:33 +01:00
Chris Kitching	c6bc4f5249	Copy-paste less in the statistics printing code [ROCm/clr commit: `5365a8638f`]	2017-10-27 20:12:33 +01:00
Chris Kitching	e9b8afaaeb	Inline updateCountersExt [ROCm/clr commit: `69d2555f17`]	2017-10-27 20:12:32 +01:00
Chris Kitching	0b45e8d905	Update counter maps sanely operator[] default-constructs the map value if no value exists for that key. Default-construction of int yields a zero. So all the manual faffing around is just unnecessary. [ROCm/clr commit: `cecc0782ef`]	2017-10-27 20:12:32 +01:00
Chris Kitching	b89133c2d4	Prefer references to pointers in updateCountersExt() [ROCm/clr commit: `828552decb`]	2017-10-27 20:12:32 +01:00
Chris Kitching	f659ad0eeb	Move string utility functions into their own translation unit [ROCm/clr commit: `ab824ebd47`]	2017-10-27 20:12:32 +01:00
Chris Kitching	dcd770f25a	Extract LLVM compatibility code into its own translation unit [ROCm/clr commit: `9478be8198`]	2017-10-27 20:12:32 +01:00
Chris Kitching	43b9ce8b56	Remove unused field [ROCm/clr commit: `9eeac22359`]	2017-10-27 20:12:32 +01:00
Chris Kitching	3d0f71e803	Remove CUDA_EXCLUDES An artefact from a now-defunct hack to avoid corrupting programs [ROCm/clr commit: `51df93cf25`]	2017-10-27 20:12:32 +01:00
Chris Kitching	cf14905f11	Make `unsupported` actually be a bool... [ROCm/clr commit: `417fc92482`]	2017-10-27 20:12:31 +01:00
Evgeny Mankov	bfa8ad1880	Merge pull request #234 from ChrisKitching/warningSpam [HIPIFY] Do not process __fetch_builtin_* in cudaCall() [ROCm/clr commit: `ab6abe150d`]	2017-10-27 21:30:42 +03:00
Evgeny Mankov	29c12e7927	Merge pull request #235 from ChrisKitching/preprocessorEnhancements [HIPIFY] Handle unconditional preprocessor directives far better [ROCm/clr commit: `d1600b4a85`]	2017-10-27 21:21:20 +03:00
Ben Sander	b0aa15ee5f	Merge pull request #198 from AlexVlx/feature_support_globals_for_module_api Feature support globals for module api [ROCm/clr commit: `8a64feef61`]	2017-10-27 01:53:34 +02:00
Ben Sander	1058f074f3	Merge pull request #218 from ChrisKitching/nodiscard Add [[nodiscard]] attribute to hipError_t in C++17 mode [ROCm/clr commit: `9713dbb6f6`]	2017-10-26 22:48:54 +02:00
Ben Sander	7ce81b1573	Merge pull request #223 from bensander/2x_bidir Use 2X for bidir memory bandwidth calc [ROCm/clr commit: `f76bcf4045`]	2017-10-26 21:49:06 +02:00
Chris Kitching	7360326705	Greatly enhance handling of macros in kernel launches All but the most contrived use of macros is now properly handled - have a look at the new testcases this commit adds. You can have macros in kernel calls, macros spanning chunks of your arguments, the call, call parameters, or callee can all be macros or partially macros. [ROCm/clr commit: `6491c2c3eb`]	2017-10-26 17:28:46 +01:00
Chris Kitching	58428d739c	Simplify how kernel launch expressions get translated It seems like there was a lot of machinery here that is no longer needed now we have hipLaunchKernelGGL (which doesn't require us to insert an extra argument into kernel functions). We no longer need to waste cycles scanning the AST for callees. We can literally just do "Take the callee expression, and dump it into the first argument of hipLaunchKernelGGL()". [ROCm/clr commit: `a35d30e0b7`]	2017-10-26 17:28:30 +01:00
Chris Kitching	74af29d66a	Deduplicate preprocessor code There's three functions here that all do the same thing... There was also logic that looks for numeric literals and works backwards to find the macro name from which they are expanded. I previously introduced code that rewrites macro references at expand-time in the `MacroExpands` callback, so that code is no longer doing anything useful. [ROCm/clr commit: `1ef68090ae`]	2017-10-26 17:28:30 +01:00
Chris Kitching	b25b17b6b3	Rewrite _all_ CUDA macro identifiers in the preprocessor Calls to macros that were themselves CUDA API calls were often being missed - this applies the identifier transform to macro names at the callsites, too. [ROCm/clr commit: `30e7e7d919`]	2017-10-26 17:27:56 +01:00
Chris Kitching	cf50b4f97a	Don't special-case source locations for calls in macros The source location for a call that's inside a macro body will, by default, point into the macro definition itself. The original logic was causing macro invocations to be overwritten, as I explain here: https://github.com/ROCm-Developer-Tools/HIP/issues/207#issuecomment-337521851 The existing PPCallbacks code is correctly rewriting macro definitions, so the practical effect of this change is that AST rewrites on code that's expanded from macros are no-ops. It might be a performance optimisation to put a short-circuit at the top of the AST callbacks to abort when faced with code that was expanded from macros. It might yet prove wise to do absolutely everything at lex-time... [ROCm/clr commit: `f7e65c5334`]	2017-10-26 17:26:37 +01:00
Chris Kitching	af81909cab	Prefer early-return to deep nesting A chain of 7 closing braces is never a great sign :D In the process it became apparent that the unsupported flag was being silently ignored, causing users to be left with CUDA API calls in their programs with no warning given. This has been rectified for consistency. [ROCm/clr commit: `c1f4612176`]	2017-10-26 17:26:37 +01:00
Chris Kitching	07fb675fe7	Do not process __fetch_builtin_* in cudaCall() Fixes #205 [ROCm/clr commit: `d5fd5e55c5`]	2017-10-26 17:23:55 +01:00
Evgeny Mankov	2371dda65e	Merge pull request #213 from ChrisKitching/simplify [HIPIFY] Simplify (and accelerate) hipification of CUDA type identifiers [ROCm/clr commit: `fbdedd9196`]	2017-10-26 19:17:01 +03:00
Kent Knox	f32e20abc7	Update container to newer cuda driver and sdk 9.0 [ROCm/clr commit: `f81d6d5bb3`]	2017-10-25 16:14:32 -05:00
Chris Kitching	31920efae1	Don't use now-defunct cmake variable in lit test config [ROCm/clr commit: `8d91579dcf`]	2017-10-24 20:52:51 +01:00
Chris Kitching	5ee04d16c5	Refactor cudaCall to prefer early return to deep nesting Sorry for the invasive refactor, but this was making reasoning about this function more difficult. [ROCm/clr commit: `b24b33ee2e`]	2017-10-24 20:38:49 +01:00
Chris Kitching	4d23fb75e9	Split the giant lookup table into 3 smaller ones Instead of having a single, enormous LUT for all CUDA names, let's have separate ones for different types of entity. We often know that we're looking at a typename, or a function name, or a macro name - so we can be more efficient (and resilient to name collisions) by having smaller lookup tables for each of those classes of entity. Here we start that off by having three LUTs: - Header names - Type names - Everything else Future work could usefully split "everything else" into: - enum values - macro names - function names - everything else It's worth noting that the "needs new matcher" todos I delete here were actually resolved with the previous commit. It no longer naively searches for things that start with "cu*" - it will find exactly those things that are present in our lookup tables. [ROCm/clr commit: `9da456b315`]	2017-10-24 20:38:49 +01:00
Chris Kitching	a00a5e3101	One matcher for type expressions to rule them all Previously, there were different AST matchers for each language construct that contains a type reference, and custom logic to perform the transformation within each of those structures. Since the transformation in all such cases was only replacing CUDA types with hip ones, we can instead use an AST matcher that finds and updates the type references directly. This simplifies the program considerably, and it won't fail when it finds a language feature (or complicated type expression) that nobody wrote custom logic for yet. [ROCm/clr commit: `93c9b3ca34`]	2017-10-24 20:38:49 +01:00
Chris Kitching	1a53e259b3	Make control flow less insane `while(false)` is certainly a bold choice. [ROCm/clr commit: `3b2d8029fb`]	2017-10-24 20:38:49 +01:00
Chris Kitching	02d3a34917	Use a cmake glob for collecting hipify sources Should make breaking this monstrosity into multiple files a bit easier... [ROCm/clr commit: `9a4778f435`]	2017-10-24 20:38:48 +01:00
Chris Kitching	bbac7c8b82	Move giant lookup table into another translation unit Also, rewrote it as a constant variable instead of a function that imperatively fills a map. It's shorter, faster to compile, and (depending on how badly the compiler screws it up) maybe faster to run. And, of course, it starts breaking up that giant .cpp file. [ROCm/clr commit: `803d3ffd9c`]	2017-10-24 20:38:25 +01:00
Evgeny Mankov	5a2bdb98f0	Merge pull request #230 from emankov/master [HIPIFY][fix] cmake: NO_DEFAULT_PATH is strongly needed in find_package for LLVM [ROCm/clr commit: `4e78738267`]	2017-10-24 20:44:56 +03:00
Evgeny Mankov	a12402fc9a	Merge branch 'master' into tests [ROCm/clr commit: `712a1da073`]	2017-10-24 20:03:51 +03:00
Evgeny Mankov	a6f0e70cea	Merge pull request #227 from ChrisKitching/clang-silly Tweak some version numbers in clang version compatibility checks [ROCm/clr commit: `7b307b21d1`]	2017-10-24 17:16:07 +03:00
Evgeny Mankov	59c0a69751	[HIPIFY][fix] cmake: NO_DEFAULT_PATH is strongly needed in find_package for LLVM Otherwise LLVM will be searched in system folders. [ROCm/clr commit: `4a0228cedb`]	2017-10-24 16:35:10 +03:00
Rahul Garg	28c72504b6	Example showing globals use with module APIs [ROCm/clr commit: `10b1b58505`]	2017-10-24 18:12:25 +05:30
Evgeny Mankov	10cd9f9eb9	[HIPIFY][fix] cmake: do not build hipify-clang if not asked + warn "hipify-clang will not be built" if HIPIFY_CLANG_LLVM_DIR is not specified. + fix typo in previous commit. [ROCm/clr commit: `7c51376bbf`]	2017-10-24 14:16:05 +03:00
emankov	d9a550d09c	[HIPIFY] cmake: simplify build [ROCm/clr commit: `559d9a68aa`]	2017-10-24 10:51:11 +03:00
Chris Kitching	482541bb8c	Tweak some version numbers in clang version compatibility checks Apparently a couple of those APIs changed in clang 5, not 4. Drat. [ROCm/clr commit: `aa288fcf1f`]	2017-10-24 01:45:23 +01:00
Evgeny Mankov	27df5ecc82	[HIPIFY] cmake: fix standalone build [ROCm/clr commit: `cdd849541f`]	2017-10-23 21:16:13 +03:00
Rahul Garg	dc6c43772c	Use 2X for bidir p2p memory bandwidth calc [ROCm/clr commit: `4090c82936`]	2017-10-23 21:57:20 +05:30
Chris Kitching	26be50b082	Add concurentKernels.cu to the testsuite [ROCm/clr commit: `2c65d0da37`]	2017-10-23 13:39:37 +01:00
Chris Kitching	8fffd03350	Add the CUDA samples include dir to the path for tests Means we get to easily steal CUDA examples for tests [ROCm/clr commit: `64d5f07050`]	2017-10-23 13:39:37 +01:00
Chris Kitching	7547f796d0	Add cudaRegister.cu lit test [ROCm/clr commit: `2437b31939`]	2017-10-23 13:39:37 +01:00
Chris Kitching	8fd3b3b1cd	Add square.cu to lit testsuite [ROCm/clr commit: `ead79e5bf4`]	2017-10-23 13:39:37 +01:00
Chris Kitching	8b311424a6	Introduce a test runner script to simplify invocation ... And to use a standard, highly amusing trick for making coloured output work. [ROCm/clr commit: `c99dcbba8d`]	2017-10-23 13:39:37 +01:00
Chris Kitching	4e80782cda	Adapt `lit` test for the hipLaunchKernelGGL changes from before... [ROCm/clr commit: `5912f465bd`]	2017-10-23 13:39:37 +01:00
Chris Kitching	0f6c153774	Migrate lit test to using FileCheck, so failures are readable It seems the test is already broken, but look how awesome the error message is now: /home/chris/HIP/tests/hipify-clang/axpy.cu:31:12: error: expected string not found in input // CHECK: hipLaunchKernel(HIP_KERNEL_NAME(axpy), dim3(1), dim3(kDataLen), 0, 0, a, device_x, device_y); ^ <stdin>:31:2: note: scanning from here // ^ <stdin>:33:2: note: possible intended match here hipLaunchKernelGGL(axpy, dim3(1), dim3(kDataLen), 0, 0, a, device_x, device_y); ^ [ROCm/clr commit: `74fd64d5c1`]	2017-10-23 13:39:37 +01:00
Chris Kitching	d92c43bd21	Look for FileCheck for running lit tests, too Use of grep in `lit` RUN lines is deprecated: https://llvm.org/docs/TestingGuide.html#writing-new-regression-tests Using grep leads to really unhelpful failure output (it literally just says "the test failed"). FileCheck is much more helpful, and distributed with LLVM on most distros anyway, so this extra dependency shouldn't prove problematic. [ROCm/clr commit: `3868036ea7`]	2017-10-23 13:39:36 +01:00
Chris Kitching	baabd2755e	Propagate the CUDA toolkit directory into the lit tests Allows the tests to actually run... :D [ROCm/clr commit: `9747578d09`]	2017-10-23 13:39:36 +01:00


1927 Commits