* adding prof primitives generator
* minor change, renaming
* minor cosmetic changes: correct comments and remove dead code
* minor changes and renaming
* minor change, fixing comments
* reimplement HIP_INIT as a function, expose it as hip_impl::hip_init()
so that it can be called from hipLaunchKernelGGL and other inlined
HIP functions
* Don't call hip_init from ihipPreLaunchKernel
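A minimal sketch of the idea behind the two hip_init entries above, assuming a run-once initializer that inlined launch helpers call before doing any work. The namespace `hip_init_sketch`, the counter, and `launch_kernel_sketch` are hypothetical names used only for illustration; this is not the actual HIP implementation.

```cpp
#include <cassert>

namespace hip_init_sketch {
namespace detail {
// Counts how often the real initialization body has run (illustration only).
inline int& init_count() {
    static int count = 0;
    return count;
}

// Stand-in for the work the HIP_INIT macro used to perform.
inline void do_init() { ++init_count(); }
}  // namespace detail

// Function form of the initializer: a function-local static is initialized
// exactly once (and thread-safely since C++11), so repeated calls are cheap.
inline void hip_init() {
    static const bool done = (detail::do_init(), true);
    (void)done;
    assert(detail::init_count() == 1);
}

// An inlined launch helper can call hip_init() itself, so a separate
// pre-launch hook no longer needs to trigger initialization.
template <typename Kernel>
inline void launch_kernel_sketch(Kernel kernel) {
    hip_init();
    kernel();
}
}  // namespace hip_init_sketch
```

Calling `launch_kernel_sketch` any number of times runs the initialization body only once, which is the property a function-based initializer gives inlined callers.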
Clarify what this list details. In particular, the list is intended to indicate which items from each CUDA release are supported and which are not.
Issue: Header uses std::vector<Agent_global> agent_globals which is created by hip_module.cpp
- Move iterator fails to copy Agent_global from library source into header version
- Due to different versions of std::string name in struct Agent_global
Fix: Change Agent_global to use char* name instead of std::string name
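A minimal sketch of why the fix above helps, assuming a struct shaped roughly like Agent_global. Only the `name` member comes from the entry above; `address`, `byte_cnt`, and the helper `name_matches` are assumptions added for illustration. With a `char*` member the struct layout no longer depends on which std::string implementation the library and the user code were built against.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <type_traits>

// Hypothetical stand-in for Agent_global after the fix.
struct Agent_global_sketch {
    char* name;              // plain C string instead of std::string
    void* address;           // assumed member, for illustration only
    std::uint32_t byte_cnt;  // assumed member, for illustration only
};

// The layout is now independent of the C++ standard library in use, so the
// library-side and header-side views of an element agree.
static_assert(std::is_standard_layout<Agent_global_sketch>::value,
              "layout must not depend on library types");
static_assert(std::is_trivially_copyable<Agent_global_sketch>::value,
              "elements can be copied or moved bytewise");

// Compares a stored name with a lookup key without constructing a std::string.
inline bool name_matches(const Agent_global_sketch& g, const char* key) {
    assert(g.name != nullptr && key != nullptr);
    return std::strcmp(g.name, key) == 0;
}
```

The sketch leaves ownership of `name` open; whoever allocates the string must also keep it alive for as long as the element is used.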
Issue: mismatched undefined symbols in different user environments
- Binary expects modified return value std::string&
- Fails to match libhip_hcc.so: return value is std::string& but doesn't match modified C++ env
Fix: Change return value to char*, create new key std::string in header from char*
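The second fix can be sketched as follows, assuming a library-side function that returns a C string and header-side code that builds its own std::string key from it. All names here (`kernel_name_sketch`, `name_index_sketch`, `register_kernel_sketch`) are hypothetical and not part of HIP.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// "Library side": returns a plain C string, so the exported symbol's
// signature does not mention std::string at all.
inline const char* kernel_name_sketch(int id) {
    static const char* const names[] = {"vector_add", "saxpy"};
    assert(id >= 0 && id < 2);
    return names[id];
}

// "Header side": the std::string key is constructed in the caller's own
// C++ environment from the returned char*.
inline std::unordered_map<std::string, int>& name_index_sketch() {
    static std::unordered_map<std::string, int> index;
    return index;
}

inline void register_kernel_sketch(int id) {
    name_index_sketch()[std::string(kernel_name_sketch(id))] = id;
}
```

Because the std::string only ever exists on the header side, the binary and the shared library no longer have to agree on its mangled name or layout.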
+ Set -D__LP64__ in case of 64-bit hipify-clang binary
[partial workaround for clang's bug https://bugs.llvm.org/show_bug.cgi?id=38811]
C:/GIT/LLVM/trunk/llvm-64-release-vs2017/dist/lib/clang/9.0.0\include\__clang_cuda_device_functions.h(1609,45): error GEF7559A7: no matching function for call to 'roundf'
__DEVICE__ long lroundf(float __a) { return roundf(__a); }
#if defined(__LP64__)
__DEVICE__ long lround(double __a) { return llround(__a); }
__DEVICE__ long lroundf(float __a) { return llroundf(__a); } // ok: llroundf should be used when 64-bit
#else
__DEVICE__ long lround(double __a) { return round(__a); }
__DEVICE__ long lroundf(float __a) { return roundf(__a); } // error
#endif
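The header excerpt above selects the rounding call with `__LP64__`. A host-side sketch of the same preprocessor switch, using the standard `<cmath>` functions rather than the device functions from the clang header (the names `lround_sketch` and `rounding_path_sketch` are hypothetical):

```cpp
#include <cassert>
#include <cmath>

#if defined(__LP64__)
// On LP64 targets long is 64 bits wide, so the long long variant fits.
static_assert(sizeof(long) == 8, "LP64 implies a 64-bit long");
#endif

// Mirrors the #if/#else structure of the header excerpt on the host.
inline long lround_sketch(double x) {
    assert(std::isfinite(x));
#if defined(__LP64__)
    return static_cast<long>(std::llround(x));  // branch taken with -D__LP64__
#else
    return static_cast<long>(std::round(x));    // branch taken otherwise
#endif
}

// Reports which branch this translation unit was compiled with.
inline const char* rounding_path_sketch() {
#if defined(__LP64__)
    return "llround";
#else
    return "round";
#endif
}
```

Passing `-D__LP64__` on the command line forces the first branch regardless of the platform's default, which is what the workaround relies on.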
+ Print more system info while testing in the following form:
========================================
CUDA 9.0 - will be used for testing
LLVM 9.0.0svn - will be used for testing
AMD64 - Platform architecture
Windows 10 - Platform OS
64 - hipify-clang binary bitness
32 - python 3.7.2 binary bitness
========================================
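A small sketch of how one line of such a report could be produced, assuming the "value - description" layout shown above. The test harness's real implementation is not shown in this log; `binary_bitness_sketch` and `format_info_line_sketch` are hypothetical helpers.

```cpp
#include <cassert>
#include <climits>
#include <string>

// Bitness of the binary this code is compiled into, derived from pointer size.
inline int binary_bitness_sketch() {
    return static_cast<int>(sizeof(void*) * CHAR_BIT);
}

// Formats one report line as "<value> - <description>".
inline std::string format_info_line_sketch(const std::string& value,
                                           const std::string& description) {
    assert(!value.empty());
    return value + " - " + description;
}
```

For example, `format_info_line_sketch(std::to_string(binary_bitness_sketch()), "hipify-clang binary bitness")` yields a line in the same shape as the bitness entries above.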
* Initial attempt to switch over to internally linked state.
* Add missing CMake update.
* hipLaunchKernelGGLImpl must be inline as well. Ensure internal linkage.
* Ensure global retrieval uses internally linked state.
* Hide HC in the implementation. Minimise ADL woes.
* Strange software exists, and must be catered to.
* Use a less spammy mechanism for ensuring internal linkage / non-export.
* Remove leftover internal detail.
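One common way to get internally linked state in a header is an unnamed namespace, sketched below. This only illustrates the general technique; the log does not say which mechanism the commits actually use, and every name here is hypothetical.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

namespace internal_state_sketch {
namespace {  // unnamed namespace: everything inside has internal linkage

// Each translation unit that includes this header gets its own table, and
// no symbol for it is exported from the resulting binary.
inline std::unordered_map<std::string, void*>& globals() {
    static std::unordered_map<std::string, void*> table;
    return table;
}

// Records the address of a named global in this translation unit's table.
inline void register_global(const char* name, void* address) {
    assert(name != nullptr);
    globals()[name] = address;
}

// Looks a global up by name; returns nullptr when it was never registered.
inline void* find_global(const char* name) {
    assert(name != nullptr);
    const auto it = globals().find(name);
    return it == globals().end() ? nullptr : it->second;
}

}  // namespace
}  // namespace internal_state_sketch
```

The trade-off is that state is duplicated per translation unit, so lookups only see what the same translation unit registered.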