+ Fix the error in cmake: clang 8.0.0 supports CUDA 10.0, not 10.1; only trunk clang 9.0.0 supports CUDA 10.1.
+ Update Readme:
1. clang's bug 36384 is gone, so latest stable release for Linux is clang 8.0.0, which supports CUDA up to 10.0;
2. update dependencies, testing and building notes.
Due to a HCC bug in rocm-head, the test fails to compile. Disabling the test until the issue is resolved.
Change-Id: Ib4e7baf0b9c2cb5f5dbe38e6dd2bab894d28886a
__hipRegisterFunction is called during by .init functions during program initialization.
It calls hipModuleGetFunction to locate kernel symbol in code objects. hipModuleGetFunction
assumes current device when locating kernel symbols. This works for HCC but not for hip-clang,
since hip-clang needs to locate kernel symbols for different devices without switching
between devices.
This patch introduces a new hsa agent parameter to ihipModuleGetFunction, which allows
__hipRegisterFunction to choose the correct hsa agent when locating kernel symbols. By
default it uses this_agent(), therefore this patch has no impact on HCC.
* Make hipModuleGetGlobal be in HIP runtime so it can be discovered at runtime
In HIP PR #929, quite a few HIP public APIs were made as inline functions with
hidden visibility. It was necessary to support applications with shared
libraries with GPU kernels launched via hipLaunchKernelGGL(), after HIP runtime
is initialized.
In empirical tests, the implementation has been proved to be a bit too
excessive, especially for hipModuleGetGlobal(). The function is used by another
type of client applications which relies on the existence of this function
within HIP runtime so global symbols from HSA code objects loaded dynamically
at runtime can be retrieved programmtically.
This commit moves hipModuleGetGlobal() back to src/hip_module.cpp, and makes it
visible and not inline, to fulfill requirements for applications
aforementioned. It does not change the behavior of applications depending on
hipLaunchKernelGGL().
* Add HIP_INIT_API into the implementation of hipModuleGetGlobal
Address review comments.
* Fix failing HIP unit tests
Disambiguate calling many varibles "agent".
More detail in exception message.
Create and discard map placeholders; no need to call std::vector::clear() on map value.
For code objects with global symbols of length 0, ROCR runtime would
ignore them even though they exist in the symbol table. Therefore the
result from read_agent_globals() can't be trusted entirely.
As a workaround to tame applications which depend on the existence of
global symbols with length 0, always return hipSuccess here.
This behavior shall be reverted once ROCR runtime has been fixed to
address SWDEV-173477
* adding prof primitives generator
* minor change, renaming
* minor cosmetic changes, comments correcting and dead code removing
* minor changes and renaming
* minor chane, fixing comments
* reimplement HIP_INIT as a function, expose it as hip_impl::hip_init()
so that it could be called from hipLaunchKernelGGL and other inlined
HIP functions
* Don't call hip_init from ihipPreLaunchKernel