+ Makes hip_Memcpy2D struct compatible with CUDA_MEMCPY2D struct
+ Add hipMemcpyParam2D support in nvcc fallback path
+ Update hipify-clang, tests and docs accordingly
* Add ersatz for NVRTC.
* Fix extraneous paren and use correct namespace.
* Use lowerCamelCase (yuck, yuck) consistently.
* Link against FS when building hiprtc lib.
* Correctly mark Manipulators. Fix dual compile.
* Add unit tests. Extend HIT to accept linker options.
* Make sure the HIPRTC library is installed.
* Better logging. Try to auto-detect the target.
* Stop specifying the target explicitly.
* Add missing flavour of `hipModuleLaunchKernel`.
* Program was already destroyed.
* Don't use `--genco`. Fix mangled name trimming.
* Fix HIPRTC breakage due to upstream noise.
* [dtests] Replace RUN -> TEST in hiprtc tests
Change-Id: Ie499e92dfe4e5c94634b1c2b76cf52d241bcfea3
* [hit] Set HIP_PATH to HIP_ROOT_DIR for all tests
Change-Id: Ib0ad1f99bc71c03e363e055dd508a7a4a210680a
* Revert "Revert "Use COMgr to read Kernel Args Metadata (#1006)""
This reverts commit a3d118eaa8.
* Revert "Use COMgr to read Kernel Args Metadata (#1006)"
This reverts commit 8a548bf40b.
* Revert "improve program state commentary"
This reverts commit 7aada87cbd.
* Revert "load program state once per agent"
This reverts commit c9117de8eb.
* start moving function_names() into the hip shared lib
* start moving code_object_blobs to a new "state" object
* Consolidate various program state related static objects into a
single program_state object
* minor clean up
* move more stuff from functional_grid_launch into program_state
* debug make_kernarg
* moving lookup for kernargs size_align into program_state
* clean up old code for kernarg size and alignment
* update hip_module to use newer api in program_state
* Create public member functions for program_state
* move most program state functions into shared library
* Pass the data buffer size to load_executable
Otherwise, it can't figure out the data size
just from the char* (since the data is not really a string)
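A minimal sketch of why the size has to travel with the pointer. `loadBlob` and `guessedSize` are hypothetical stand-ins, not the real `load_executable` signature; the point is only that binary data can contain NUL bytes, so a string-style length stops early.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical loader stand-in: consumes exactly `size` bytes of raw data
// and returns how many bytes it captured.
std::size_t loadBlob(const char* data, std::size_t size) {
    assert(data != nullptr);
    // std::string(ptr, len) copies len bytes, embedded NULs included.
    std::string blob(data, size);
    return blob.size();
}

// What could be inferred from the pointer alone: the bytes before the
// first NUL, which is wrong for binary data such as a code object.
std::size_t guessedSize(const char* data) {
    assert(data != nullptr);
    return std::strlen(data);
}
```

With an 8-byte buffer whose fifth byte is NUL, `loadBlob` sees all 8 bytes while `guessedSize` reports only 4.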
* turning free functions in program state into members of program_state_impl
* change the free function globals() into a member of program_state_impl
* replace the static mutex used for populating globals
* moving associate_code_object_symbols_with_host_allocation into
program_state_impl
* move load_code_object_and_freeze_executable into program_state_impl
* moving executables and functions_names into program_state_impl
* moving kernels() into program_state_impl
* moving functions() into program_state_impl
* move get_kernargs into program_state_impl
* moving kernel_descriptor into program_state_impl
* moving kernargs_size_align calculation into program_state_impl
* Changing the handle to program_state_impl to a pointer
* moving program_state_impl into a separate inline source file
* fixing/cleaning up some header file includes
* moving member function for kernargs_size_align into program_state.cpp
* moving Kernel_descriptor into program_state.inl
* add a new class to manage agent globals
* moving all agent globals processing functions into agent_globals_impl
* load program state once per agent
re-merging PR991 against other program state changes
* fix per-agent program state member initialization
* cache executables based on elf name, isa, and agent.
This avoids program state reloading executables after a shared library is dlopened.
re-merging PR1057 against other program state changes
* protect executables cache by a global mutex
* return ref to executables cache
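The three cache commits above can be sketched roughly as follows. This is an illustration under assumptions, not the code in program_state: `CacheKey`, `Executables`, and `cachedExecutables` are hypothetical names, and the executable handles are plain ints.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <mutex>
#include <string>
#include <tuple>
#include <vector>

// Hypothetical key: (ELF name, ISA name, agent handle).
using CacheKey = std::tuple<std::string, std::string, unsigned long long>;
// Hypothetical value: opaque executable handles.
using Executables = std::vector<int>;

// Returns a reference to the cached entry for `key`, calling `load`
// only on a cache miss. One global mutex guards the whole cache.
const Executables& cachedExecutables(const CacheKey& key,
                                     const std::function<Executables()>& load) {
    assert(static_cast<bool>(load));
    static std::mutex mtx;
    static std::map<CacheKey, Executables> cache;

    std::lock_guard<std::mutex> lock(mtx);
    auto it = cache.find(key);
    if (it == cache.end()) {
        it = cache.emplace(key, load()).first;
    }
    // References into std::map stay valid when other keys are inserted later.
    return it->second;
}
```

Looking up the same key twice runs the loader once and yields the same object, which is the property that avoids reloading after a dlopen.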
* adapt PR#981 Make hipModuleGetGlobal be in HIP runtime
- It's a common mistake to assume that 1 << shamt is promoted to
64-bit if shamt is a 64-bit integer. That's not the case. Replace
that left shift with a 64-bit one so it can't fall into undefined
behavior.
- Fix the host-side implementation as well for device function testing.
- Separate the definitions of `__HCC_OR_HIP_CLANG__`, `__HCC_ONLY__`, and
`__HIP_CLANG_ONLY__` into hip_common.h so that they can be included in
hip_fp16.h, which an application may include on its own.
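The shift fix in the first item can be illustrated with a small sketch. `bitAt` is a hypothetical helper, not the function touched by the commit; it only shows that the result type of a shift follows the left operand, so the literal has to be widened.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// The result type of a shift is the (promoted) type of the LEFT operand.
// Even with a 64-bit shift amount, `1 << shamt` is still an int.
static_assert(std::is_same<decltype(1 << std::uint64_t{3}), int>::value,
              "1 << shamt stays int regardless of shamt's width");

// Hypothetical helper: returns a 64-bit value with bit `shamt` set.
std::uint64_t bitAt(std::uint64_t shamt) {
    assert(shamt < 64);  // shifting by >= the operand width is undefined
    // Widen the left operand first so the shift is done in 64 bits.
    return std::uint64_t{1} << shamt;
}
```

Writing `1ULL << shamt` has the same effect on common platforms; `std::uint64_t{1}` just makes the intended width explicit.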
* Update hip_runtime_api.h
When I try to use mpicc or gcc to compile C code that calls some HIP runtime APIs, an error occurs:
> /path/to/hcc_detail/hip_runtime_api.h:2268:33: error: unknown type name ‘hipFuncAttributes’;
> hipFuncGetAttributes(hipFuncAttributes* attr, const void* func);
Adding 'struct' before the first parameter type of hipFuncGetAttributes gets rid of this problem.
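A small sketch of the pattern, assuming the type is declared as a plain struct tag without a typedef, as the error message suggests. `FuncAttributes` and `getFuncAttributes` are hypothetical stand-ins for the real HIP names, and the snippet is written so that it is valid as both C and C++.

```cpp
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in: a struct declared without a typedef. */
struct FuncAttributes {
    int maxThreadsPerBlock;
};

/* With `struct` spelled out, the parameter type parses in C and in C++.
   In C, writing `FuncAttributes* attr` here would be an unknown type name,
   because a struct tag is not a type name on its own. */
int getFuncAttributes(struct FuncAttributes* attr, const void* func) {
    assert(attr != NULL);
    (void)func;                       /* unused in this sketch */
    attr->maxThreadsPerBlock = 1024;  /* placeholder value, not a real query */
    return 0;
}
```

C++ accepts the parameter with or without the `struct` keyword, which is why the problem only shows up when the header is consumed by a C compiler.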
* Add CMAKE dep to amd_comgr
* Use COMGR for read_kernarg_metadata in COV2
* Do not assume kernargs exist
* Add proper metadata destroy cleanup
* Use a process function for easier destroy
* Remove old read_kernarg_metadata
* Clean up HCC, prints, names
* Use COMGR in CMAKE by default
* Move metadata lookup for keyword values into helper
* Remove C string usage for lookup_keyword_value
* Guard COMGR for non-NVCC path
* Add hip_hcc dependency on comgr package
* Add lifetime to metadata nodes
* Find COMGR config file for amd_comgr target
* Move set_active data earlier
* Make hipModuleGetGlobal be in HIP runtime so it can be discovered at runtime
In HIP PR #929, quite a few HIP public APIs were made inline functions with
hidden visibility. This was necessary to support applications with shared
libraries whose GPU kernels are launched via hipLaunchKernelGGL() after the HIP
runtime is initialized.
Empirical tests have shown that change to be excessive, especially for
hipModuleGetGlobal(). The function is used by another type of client
application, which relies on this function existing within the HIP runtime so
that global symbols from HSA code objects loaded dynamically at runtime can be
retrieved programmatically.
This commit moves hipModuleGetGlobal() back to src/hip_module.cpp, and makes it
visible and not inline, to meet the requirements of the aforementioned
applications. It does not change the behavior of applications depending on
hipLaunchKernelGGL().
* Add HIP_INIT_API into the implementation of hipModuleGetGlobal
Address review comments.
* Fix failing HIP unit tests
Disambiguate the many variables named "agent".
More detail in exception message.
Create and discard map placeholders; no need to call std::vector::clear() on map value.
For code objects with global symbols of length 0, the ROCR runtime
ignores them even though they exist in the symbol table. Therefore the
result from read_agent_globals() can't be trusted entirely.
As a workaround for applications which depend on the existence of
global symbols with length 0, always return hipSuccess here.
This behavior shall be reverted once the ROCR runtime has been fixed to
address SWDEV-173477.