Commit graph

1960 commits

Author SHA1 Message Date
Alex Voicu dab971370e Correctly deal with functions from shared objects, wherein the program-visible VA == so_base_va + st_value(function_symbol). Remove quaint usage of pfe for hipMemset (which is actually fill_n).
[ROCm/hip commit: 2cacda91bb]
2017-11-01 22:33:13 +00:00
Alex Voicu 70a41e7dac This switches HIP from its currently convoluted macro + pfe based dispatch mechanism to a more natural one partially based on the existing module API. The basic idea is that HCC will always correctly emit __global__ functions: as empty-bodied stubs, on host, and as kernels, on device. It then becomes trivial to obtain the mangled name on host, at dispatch, from the function's address, and then to use the mangled name to retrieve the kernel. This should address all problems stemming from serialisation, dubious mismatches due to the manufactured functor, macro-isms et al. It also immediately enables support for generalised globals as a consequence of that being available in the module API. Finally, it will make debug much easier, since the actual names of the __global__ functions will automatically be used in traces etc. One detail is that due to how dispatch works now (hipLaunchKernel and hipLaunchKernelGGL are themselves variadic function templates which deduce the function type of the callee), in certain cases it may be necessary to insert explicit casts to ensure that the variadic argument list selects a viable overload - this can be observed in some unit tests. Eventually we may be able to remove this limitation, but for now it does not appear terribly onerous. The code is not extremely HIPpie, nor is it fully optimised, but rather is intended as a starting point for the HIP team to make its own.
[ROCm/hip commit: c2482d1255]
2017-11-01 15:09:59 +00:00
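The lookup chain described in the commit above (host stub address, then mangled name, then kernel) can be modelled in plain C++. This is only a toy sketch under stated assumptions: `Registry`, `Kernel`, `scale_stub` and the two maps are hypothetical names invented for illustration, a `std::function` stands in for a device kernel, and nothing here reflects how HCC or the HIP module API actually store this information.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a device kernel handle retrieved by name.
using Kernel = std::function<int(int)>;

// Toy model of the two-step lookup: stub address -> mangled name -> kernel.
struct Registry {
    std::unordered_map<std::uintptr_t, std::string> name_of_stub;
    std::unordered_map<std::string, Kernel> kernel_of_name;

    // Record a host stub together with its (assumed) mangled name and kernel.
    template <typename F>
    void add(F* stub, std::string mangled, Kernel kernel) {
        name_of_stub[reinterpret_cast<std::uintptr_t>(stub)] = mangled;
        kernel_of_name[std::move(mangled)] = std::move(kernel);
    }

    // Resolve a stub address to its kernel, or nullptr if unknown.
    template <typename F>
    const Kernel* find(F* stub) const {
        auto name = name_of_stub.find(reinterpret_cast<std::uintptr_t>(stub));
        if (name == name_of_stub.end()) return nullptr;
        auto kernel = kernel_of_name.find(name->second);
        return kernel == kernel_of_name.end() ? nullptr : &kernel->second;
    }

    // "Dispatch": look the kernel up from the stub's address and call it.
    template <typename F>
    int launch(F* stub, int arg) const {
        const Kernel* kernel = find(stub);
        assert(kernel != nullptr && "stub was never registered");
        return (*kernel)(arg);
    }
};

// Empty-bodied host function, standing in for the stub that the commit says
// the compiler emits on host for a __global__ function.
inline void scale_stub(int) {}
```

Because the caller only ever passes the function itself, the name used for lookup is the function's real name, which is the property the commit message relies on for more readable traces.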
Maneesh Gupta e6e90e9cfa Merge pull request #197 from bensander/update_coherency_tests
Update coherency tests

[ROCm/hip commit: f27c2c1715]
2017-10-31 17:26:50 +05:30
Maneesh Gupta 07b52ec043 Merge pull request #241 from ROCm-Developer-Tools/multi_host
Initial code to remove x86_64 dependency in HIP source build

[ROCm/hip commit: 4d85b6ab29]
2017-10-31 16:35:12 +05:30
Ben Sander e88ef63bc8 Add ns-level timer for HIP API routines
Refactor some misuses of ihipLogStatus; these should only be in top-level
HIP APIs and should be paired with HIP_API_INIT calls.


[ROCm/hip commit: 7e908bdec8]
2017-10-30 20:20:51 +00:00
Wen-Heng (Jack) Chung afb75b0b2c Initial code to remove x86_64 dependency in HIP source build
[ROCm/hip commit: 92fb244841]
2017-10-30 15:19:23 -05:00
Ben Sander 51ee7807db Merge pull request #222 from bensander/fix_device_prop
Fix device prop

[ROCm/hip commit: 2e8ec71e40]
2017-10-30 17:58:48 +01:00
Ben Sander 8242c45eda Merge pull request #226 from scchan/add_printf3
add printf to HIP device functions

[ROCm/hip commit: f8843ae415]
2017-10-30 17:08:18 +01:00
Evgeny Mankov 27c5e94c81 [HIPIFY] fix typo - missing )
[ROCm/hip commit: 44c74b6511]
2017-10-27 23:31:43 +03:00
Evgeny Mankov b9a52020b2 Merge pull request #238 from ChrisKitching/statistics
[HIPIFY] Decouple the statistics system from the code rewriter

[ROCm/hip commit: b28a69785b]
2017-10-27 23:17:20 +03:00
Evgeny Mankov 193235bd08 Merge pull request #236 from ChrisKitching/friendlyCmake
[HIPIFY] Make the cmake build system more friendly

[ROCm/hip commit: c9b7c43e1c]
2017-10-27 22:35:13 +03:00
Chris Kitching 54f786583b Remove commented else-block
A warning statement for _string literals_ seems a bit unhelpful.
There's no value in this being here.


[ROCm/hip commit: 20871a3a07]
2017-10-27 20:12:33 +01:00
Chris Kitching 05699e3779 Decouple the statistics system from the code translation
The original implementation had the statistics system woven very
tightly into things like PPCallbacks, with counters duplicated
in two places, and all the output code duplicated. This made it
very difficult to alter the structure of the program without
breaking the statistics system.

Since the planned approach for solving the remaining preprocessor
bugs needs the introduction of a custom FrontendAction, and such
a restructure was incompatible with the way the statistics system
was set up, this rewrite was required.

'tis rather simpler now, mind you :D

This commit also fixes an issue where some stats were counted
twice, and allows `-print-stats` to operate independently of
`-stat-output`, allowing you to print stats to a file without
printing them to a terminal (or vice-versa).


[ROCm/hip commit: b303ffe53e]
2017-10-27 20:12:33 +01:00
Chris Kitching 1454bf9651 Copy-paste less in the statistics printing code
[ROCm/hip commit: 5699c18adc]
2017-10-27 20:12:33 +01:00
Chris Kitching e4569bc84e Inline updateCountersExt
[ROCm/hip commit: 50448aec3b]
2017-10-27 20:12:32 +01:00
Chris Kitching a3c1d30745 Update counter maps sanely
operator[] default-constructs the map value if no value exists
for that key. Default-construction of int yields a zero. So all
the manual faffing around is just unnecessary.


[ROCm/hip commit: d8beee8918]
2017-10-27 20:12:32 +01:00
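The point made in the "Update counter maps sanely" commit can be shown with a small comparison. This is a minimal sketch, not the hipify-clang source: `countManual`, `countSimple` and `checkEquivalence` are hypothetical names, and the map type is assumed to be `std::map<std::string, int>`.

```cpp
#include <cassert>
#include <map>
#include <string>

// The "manual" style: check for the key first, then insert or increment.
inline void countManual(std::map<std::string, int>& counters, const std::string& key) {
    auto it = counters.find(key);
    if (it == counters.end()) {
        counters.insert({key, 1});
    } else {
        ++it->second;
    }
}

// The simpler style: operator[] value-initialises a missing int to 0,
// so a single increment covers both the "new key" and "existing key" cases.
inline void countSimple(std::map<std::string, int>& counters, const std::string& key) {
    ++counters[key];
}

// Usage: both variants end up with identical counts.
inline void checkEquivalence() {
    const char* names[] = {"cudaMalloc", "cudaFree", "cudaMalloc"};
    std::map<std::string, int> manual, simple;
    for (const char* name : names) {
        countManual(manual, name);
        countSimple(simple, name);
    }
    assert(manual == simple);
    assert(simple["cudaMalloc"] == 2);
}
```

One thing to keep in mind with this style is that merely reading `counters[key]` also inserts the key, so lookups that must not modify the map should use `find` or `count` instead.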
Chris Kitching 51df5b20c9 Prefer references to pointers in updateCountersExt()
[ROCm/hip commit: 00bb447e55]
2017-10-27 20:12:32 +01:00
Chris Kitching 25517e41fd Move string utility functions into their own translation unit
[ROCm/hip commit: ee8e11a720]
2017-10-27 20:12:32 +01:00
Chris Kitching 85fd2e6f51 Extract LLVM compatibility code into its own translation unit
[ROCm/hip commit: 1bd837b4b1]
2017-10-27 20:12:32 +01:00
Chris Kitching 389fa2e68e Remove unused field
[ROCm/hip commit: 0c09bdf523]
2017-10-27 20:12:32 +01:00
Chris Kitching 48e7403762 Remove CUDA_EXCLUDES
An artefact from a now-defunct hack to avoid corrupting programs


[ROCm/hip commit: c6707ef33c]
2017-10-27 20:12:32 +01:00
Chris Kitching 199d75adc0 Make unsupported actually be a bool...
[ROCm/hip commit: 2f376c9b25]
2017-10-27 20:12:31 +01:00
Chris Kitching 2285482075 Describe the LLVM we found
[ROCm/hip commit: 69e67fe25a]
2017-10-27 19:39:41 +01:00
Chris Kitching 7a0eedabbe Update hipify-clang readme for simplified build process
[ROCm/hip commit: 82d05ee6f4]
2017-10-27 19:39:41 +01:00
Chris Kitching e3338fe5a4 hipify does not add the hipLaunchParm option any more
This was removed a while ago - seems like it uses a different
variant of the launch kernel function now, so this is redundant.


[ROCm/hip commit: b412802c66]
2017-10-27 19:39:41 +01:00
Chris Kitching fd656a0fbb Use cmake's builtin mechanism for handling library locations
See [the documentation](https://cmake.org/cmake/help/v3.0/command/find_package.html)
for exactly how the search procedure works. If you want to use an
LLVM from a specific location, use CMAKE_PREFIX_PATH as normal.

No longer do we have a nonstandard HIPIFY_CLANG_LLVM_DIR variable
for people to learn about.


[ROCm/hip commit: 8fefc6a2b7]
2017-10-27 19:39:40 +01:00
Chris Kitching 1461c69757 Move the "LLVM found" print adjacent to the find_package call
Very surprising that LLVM's finder module doesn't print this
itself like _literally every other finder module_. Blarg.


[ROCm/hip commit: 92c90a7068]
2017-10-27 19:39:40 +01:00
Chris Kitching cc1bc495a0 We no longer rely on HIPIFY_CLANG_LLVM_DIR to disable hipify-clang
Since there's now an option for toggling hipify-clang, omitting the
path is no longer something we need to check for. We'll still
abort if LLVM isn't found, due to `REQUIRED`.


[ROCm/hip commit: c60c8d417e]
2017-10-27 19:39:40 +01:00
Chris Kitching d92f3c8d76 Don't attempt to find test dependencies if tests are disabled
And while we're at it, introduce a handy program-finder macro


[ROCm/hip commit: 921ff4c8a3]
2017-10-27 19:39:40 +01:00
Chris Kitching b61c4a241f Use add_dependencies to avoid duplication of pkg_hip_base
[ROCm/hip commit: 56b4222043]
2017-10-27 19:39:40 +01:00
Chris Kitching 7947e406c7 Make BUILD_HIPIFY_CLANG a cmake option
Instead of deciding whether to build hipify-clang based on
the presence of an LLVM path on the command line, have an
explicit option.

Do we want this default-on or default-off? I've defaulted it to
on for now, but maybe we want the opposite?


[ROCm/hip commit: a4ecd4eb31]
2017-10-27 19:39:39 +01:00
Evgeny Mankov a9c3228cb1 Merge pull request #234 from ChrisKitching/warningSpam
[HIPIFY] Do not process __fetch_builtin_* in cudaCall()

[ROCm/hip commit: 9151a355c6]
2017-10-27 21:30:42 +03:00
Evgeny Mankov aead033e41 Merge pull request #235 from ChrisKitching/preprocessorEnhancements
[HIPIFY] Handle unconditional preprocessor directives far better

[ROCm/hip commit: a865ebfe10]
2017-10-27 21:21:20 +03:00
Siu Chi Chan 6cc7f10e84 Merge remote-tracking branch 'origin/master' into HEAD
[ROCm/hip commit: a9789ddcda]
2017-10-27 01:18:28 -04:00
Ben Sander ad9a636b90 Merge pull request #198 from AlexVlx/feature_support_globals_for_module_api
Feature support globals for module api

[ROCm/hip commit: f288f24e95]
2017-10-27 01:53:34 +02:00
Ben Sander 37bd264cd4 Merge pull request #218 from ChrisKitching/nodiscard
Add [[nodiscard]] attribute to hipError_t in C++17 mode

[ROCm/hip commit: e97f675397]
2017-10-26 22:48:54 +02:00
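For readers unfamiliar with the attribute named in pull request #218, the following sketch shows how `[[nodiscard]]` can be applied to an error enum only when the compiler is in C++17 mode. All names here (`SKETCH_NODISCARD`, `sketchError_t`, `sketchCopy`) are hypothetical stand-ins; the actual HIP headers may guard and spell this differently.

```cpp
#include <cassert>

// Hypothetical macro: expands to [[nodiscard]] only when the compiler reports
// C++17 or later, and to nothing otherwise.
#if __cplusplus >= 201703L
#define SKETCH_NODISCARD [[nodiscard]]
#else
#define SKETCH_NODISCARD
#endif

// Hypothetical stand-in for an error enum such as hipError_t.
enum SKETCH_NODISCARD sketchError_t {
    sketchSuccess = 0,
    sketchErrorInvalidValue = 1
};

// Every function returning the enum by value picks up the attribute, so in
// C++17 mode the compiler is encouraged to warn when a call's result is ignored.
inline sketchError_t sketchCopy(int* dst, const int* src, int count) {
    if (dst == nullptr || src == nullptr || count < 0) return sketchErrorInvalidValue;
    for (int i = 0; i < count; ++i) dst[i] = src[i];
    return sketchSuccess;
}

// Usage: the return value is checked, so no warning is expected here.
inline void checkedCopy(int* dst, const int* src, int count) {
    sketchError_t status = sketchCopy(dst, src, count);
    assert(status == sketchSuccess);
    (void)status;  // keeps builds with NDEBUG free of unused-variable warnings
}
```

Marking the type rather than each function means new API functions are covered without further annotation.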
Ben Sander c8fc8122b9 Merge pull request #223 from bensander/2x_bidir
Use 2X for bidir memory bandwidth calc

[ROCm/hip commit: 772fe865fc]
2017-10-26 21:49:06 +02:00
Chris Kitching c15f6bdf5c Greatly enhance handling of macros in kernel launches
All but the most contrived use of macros is now properly handled -
have a look at the new testcases this commit adds. You can have
macros in kernel calls and macros spanning chunks of your arguments;
the call, call parameters, or callee can all be macros or
partially macros.


[ROCm/hip commit: 094b2b9b05]
2017-10-26 17:28:46 +01:00
Chris Kitching d0acfd5bde Simplify how kernel launch expressions get translated
It seems like there was a lot of machinery here that is no longer
needed now we have hipLaunchKernelGGL (which doesn't require us
to insert an extra argument into kernel functions). We no longer
need to waste cycles scanning the AST for callees.

We can literally just do "Take the callee expression, and dump
it into the first argument of hipLaunchKernelGGL()".


[ROCm/hip commit: eff86d975b]
2017-10-26 17:28:30 +01:00
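The simplified translation described in the commit above ("take the callee expression, and dump it into the first argument of hipLaunchKernelGGL()") can be sketched as plain string assembly. This is an illustration only: `toHipLaunch` and its parameters are hypothetical, the real tool works on Clang source ranges rather than strings, and any extra wrapping it applies to the launch configuration is not modelled here.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Builds the text of a hipLaunchKernelGGL call from the pieces of a
// CUDA-style launch `callee<<<grid, block, sharedMem, stream>>>(kernelArgs...)`.
// The callee text is passed through untouched as the first argument.
inline std::string toHipLaunch(const std::string& callee,
                               const std::string& grid,
                               const std::string& block,
                               const std::string& sharedMem,
                               const std::string& stream,
                               const std::vector<std::string>& kernelArgs) {
    assert(!callee.empty());
    std::string out = "hipLaunchKernelGGL(" + callee + ", " + grid + ", " +
                      block + ", " + sharedMem + ", " + stream;
    for (const std::string& arg : kernelArgs) {
        out += ", " + arg;  // kernel arguments follow the launch configuration
    }
    out += ")";
    return out;
}
```

Since the callee is copied verbatim, it does not matter whether it is a plain identifier, a template instantiation, or text that came from a macro.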
Chris Kitching 59713c2459 Deduplicate preprocessor code
There are three functions here that all do the same thing...

There was also logic that looks for numeric literals and works
backwards to find the macro name from which they are expanded.
I previously introduced code that rewrites macro references at
expand-time in the `MacroExpands` callback, so that code is no
longer doing anything useful.


[ROCm/hip commit: fd911e1839]
2017-10-26 17:28:30 +01:00
Chris Kitching ca913bb196 Rewrite _all_ CUDA macro identifiers in the preprocessor
Calls to macros that were themselves CUDA API calls were often
being missed - this applies the identifier transform to macro
names at the callsites, too.


[ROCm/hip commit: d1e26b2e7e]
2017-10-26 17:27:56 +01:00
Chris Kitching cbf786a8fd Don't special-case source locations for calls in macros
The source location for a call that's inside a macro body will,
by default, point into the macro definition itself. The original
logic was causing macro invocations to be overwritten, as I
explain here:
https://github.com/ROCm-Developer-Tools/HIP/issues/207#issuecomment-337521851

The existing PPCallbacks code is correctly rewriting macro
definitions, so the practical effect of this change is that AST
rewrites on code that's expanded from macros are no-ops.

It might be a performance optimisation to put a short-circuit at
the top of the AST callbacks to abort when faced with code that
was expanded from macros.

It might yet prove wise to do absolutely everything at lex-time...


[ROCm/hip commit: 4a794ed8c0]
2017-10-26 17:26:37 +01:00
Chris Kitching 32cbe68a93 Prefer early-return to deep nesting
A chain of 7 closing braces is never a great sign :D

In the process it became apparent that the unsupported flag
was being silently ignored, causing users to be left with CUDA
API calls in their programs with no warning given. This has been
rectified for consistency.


[ROCm/hip commit: 35a892bc77]
2017-10-26 17:26:37 +01:00
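The early-return style from the commit above, including the point about no longer ignoring the unsupported flag, might look roughly like this. It is a hedged sketch rather than the hipify-clang code: `Mapping`, `Outcome`, `translate` and the unsupported entry `cudaHypotheticalCall` are invented for illustration, and only the `cudaMalloc` to `hipMalloc` pairing is a real API mapping.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical table entry: the HIP replacement plus an "unsupported" flag.
struct Mapping {
    std::string hipName;
    bool unsupported;
};

enum class Outcome { NotCuda, Unsupported, Rewritten };

// Each failure case returns immediately, so the success path stays at the
// top indentation level instead of sitting inside nested if-blocks.
inline Outcome translate(const std::map<std::string, Mapping>& table,
                         const std::string& name,
                         std::string& rewritten) {
    auto it = table.find(name);
    if (it == table.end()) return Outcome::NotCuda;

    // Reported explicitly so the caller can warn instead of silently
    // leaving the CUDA call in place.
    if (it->second.unsupported) return Outcome::Unsupported;

    rewritten = it->second.hipName;
    return Outcome::Rewritten;
}

// Usage with one real mapping and one made-up unsupported entry.
inline void translateExample() {
    const std::map<std::string, Mapping> table = {
        {"cudaMalloc", {"hipMalloc", false}},
        {"cudaHypotheticalCall", {"", true}},
    };
    std::string out;
    assert(translate(table, "cudaMalloc", out) == Outcome::Rewritten);
    assert(out == "hipMalloc");
    assert(translate(table, "cudaHypotheticalCall", out) == Outcome::Unsupported);
    assert(translate(table, "printf", out) == Outcome::NotCuda);
}
```

Returning a distinct outcome for the unsupported case makes it hard to drop that branch by accident, which is the kind of bug the commit says the flattening exposed.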
Chris Kitching 78826f5512 Do not process __fetch_builtin_* in cudaCall()
Fixes #205


[ROCm/hip commit: 2a5acac80e]
2017-10-26 17:23:55 +01:00
Evgeny Mankov 99d43e1e6a Merge pull request #213 from ChrisKitching/simplify
[HIPIFY] Simplify (and accelerate) hipification of CUDA type identifiers

[ROCm/hip commit: 3bacb69e20]
2017-10-26 19:17:01 +03:00
Kent Knox f6b4db7dd5 Update container to newer cuda driver and sdk 9.0
[ROCm/hip commit: 160d92af20]
2017-10-25 16:14:32 -05:00
Ben Sander 68e571aa82 Clean up test to address review feedback.
[ROCm/hip commit: ca1230300a]
2017-10-25 16:08:16 -05:00
Siu Chi Chan c595b8f660 add HC_FEATURE_PRINTF around the printf buffer definition
[ROCm/hip commit: d91a4f5bd6]
2017-10-25 12:00:02 -04:00
Chris Kitching 641f7ed8f1 Don't use now-defunct cmake variable in lit test config
[ROCm/hip commit: 59071b895e]
2017-10-24 20:52:51 +01:00
Chris Kitching 5092f3f184 Refactor cudaCall to prefer early return to deep nesting
Sorry for the invasive refactor, but this was making reasoning
about this function more difficult.


[ROCm/hip commit: 778d6827f9]
2017-10-24 20:38:49 +01:00