Wykres commitów

927 Commity

Autor SHA1 Wiadomość Data
Maneesh Gupta 693bd556d4 Merge pull request #1083 from gargrahul/fix_hip_impl_visible_agents
Maintain HIP_VISIBLE_DEVICES for kernel launch
2019-05-13 14:20:18 +05:30
Rahul Garg aeeab1b23f Add fine grained host memory lock support (#1095)
* Add fine grained host memory lock support

* Fix default flag check
2019-05-13 11:48:26 +05:30
Siu Chi Chan f5eb91d53d migrate program_state logic from header into shared library (phase I) (#1077)
* Revert "Revert "Use COMgr to read Kernel Args Metadata (#1006)""

This reverts commit a3d118eaa8.

* Revert "Use COMgr to read Kernel Args Metadata (#1006)"

This reverts commit 8a548bf40b.

* Revert "improve program state commentary"

This reverts commit 7aada87cbd.

* Revert "load program state once per agent"

This reverts commit c9117de8eb.

* start moving function_names() into the hip shared lib

* start moving code_object_blobs to a new "state" object

* Consolidate various program state related static objects into a
single program_state object

* minor clean up

* move more stuffs from functional_grid_launch into program_state

* debug make_kernarg

* moving lookup for kernargs size_align into program_state

* clean up old code for kernarg size and alignment

* update hip_module to use newer api in program_state

* Create public member functions for program_state

* move most program state functions into shared library

* Pass the data buffer size to load_executable
Otherwise, it can't figure what the data size is
just from the char* (since the data is not really a string)

* turning free functions in program state into members of program_state_impl

* change the free function globals() into a member of program_state_impl

* replace the static mutex used for populating globals

* moving associate_code_object_symbols_with_host_allocation into
program_state_impl

* move load_code_object_and_freeze_executable into program_state_impl

* moving executables and functions_names into program_state_impl

* moving kernels() into program_state_impl

* moving functions() into program_state_impl

* move get_kernargs into program_state_impl

* moving kernel_descriptor into program_state_impl

* moving kernargs_size_align calculation into program_state_impl

* Changing the handle to program_state_impl to a pointer

* moving program_state_impl into a separate inline source file

* fixing/cleaning up some header file includes

* moving member function for kernargs_size_align into program_state.cpp

* moving Kernel_descriptor into program_state.inl

* add a new class to manage agent globals

* moving all agent globals processing functions into agent_globals_impl

* load program state once per agent

re-merging PR991 against other program state changes

* fix per-agent program state member initialization

* cache executables based on elf name, isa, and agent.

This avoids program state reloading executables after a shared library is dlopened.

re-merging PR1057 against other program state changes

* protect executables cache by a global mutex

* return ref to executables cache

* adapt PR#981 Make hipModuleGetGlobal be in HIP runtime
2019-05-12 19:24:03 +05:30
Maneesh Gupta d5abe65668 Merge pull request #1081 from mangupta/swdev-181624
Implement hipExtGetLinkTypeAndHopCount for ROCm devices
2019-05-07 16:15:41 +05:30
wkwchau 29b3b46b42 Return hipErrorInsufficientDriver status when CPU device not found (#1064)
* Return hipErrorInsufficientDriver status when CPU device not found - no exception thrown

* Return hipErrorInsufficientDriver status when CPU device not found
2019-05-07 15:58:25 +05:30
Rahul Garg 620a07102d Maintain HIP_VISIBLE_DEVICES for kernel launch 2019-05-07 05:09:02 +05:30
Maneesh Gupta 035ef04e1f Implement hipExtGetLinkTypeAndHopCount for ROCm devices
Change-Id: Ie5bb4f640ac6d189c7fceeab22627a7494fd10bd
2019-05-06 15:54:31 +05:30
Sameer Sahasrabuddhe abb9375707 minor cleanup: eliminate repetition 2019-04-25 20:41:16 +05:30
Rahul Garg 2bc2c46d4d Add hipMallocManaged default functional support (#1036)
* Add hipMallocManaged default functional support

* Fix build error

* Add dtest
2019-04-24 16:50:03 +05:30
Yaxun (Sam) Liu bb5c620b13 Fix missing arg in HIP_INIT_API 2019-04-18 16:18:31 -04:00
Maneesh Gupta d789aef46e Merge pull request #1019 from scchan/lazy_binding
minor workaround for lazy binding
2019-04-16 08:36:10 +05:30
Jeff Daily 2b3037a6ea In hipFree, synchronize owner of memory (#1018)
* In hipFree, if memory is associated with a device, synchronize that device's streams.

This changes the behavior from synchronizing the currently set TLS device.

* All devices sync in hipFree for _appId=-1 case.

* Revert "All devices sync in hipFree for _appId=-1 case."

This reverts commit 1efb34d6a8426661e45bc5f763422a1147aeac10.

* add HIP_SYNC_FREE env var
2019-04-16 08:35:55 +05:30
Yaxun (Sam) Liu 271fdc4e4d Fix regression on multi-gpu due to PR#997 2019-04-05 22:54:41 -04:00
Siu Chi Chan e126f7254c minor workaround for lazy binding 2019-04-02 17:28:06 -04:00
Yaxun Sam Liu 98b9e92908 hip-clang: fix kernel not found on multi-gpu
__hipRegisterFunction is called during by .init functions during program initialization.
It calls hipModuleGetFunction to locate kernel symbol in code objects. hipModuleGetFunction
assumes current device when locating kernel symbols. This works for HCC but not for hip-clang,
since hip-clang needs to locate kernel symbols for different devices without switching
between devices.

This patch introduces a new hsa agent parameter to ihipModuleGetFunction, which allows
__hipRegisterFunction to choose the correct hsa agent when locating kernel symbols. By
default it uses this_agent(), therefore this patch has no impact on HCC.
2019-03-31 10:08:20 -04:00
Wen-Heng (Jack) Chung 4b7177ac42 Make hipModuleGetGlobal be in HIP runtime so it can be discovered at runtime (#981)
* Make hipModuleGetGlobal be in HIP runtime so it can be discovered at runtime

In HIP PR #929, quite a few HIP public APIs were made as inline functions with
hidden visibility. It was necessary to support applications with shared
libraries with GPU kernels launched via hipLaunchKernelGGL(), after HIP runtime
is initialized.

In empirical tests, the implementation has been proved to be a bit too
excessive, especially for hipModuleGetGlobal(). The function is used by another
type of client applications which relies on the existence of this function
within HIP runtime so global symbols from HSA code objects loaded dynamically
at runtime can be retrieved programmtically.

This commit moves hipModuleGetGlobal() back to src/hip_module.cpp, and makes it
visible and not inline, to fulfill requirements for applications
aforementioned. It does not change the behavior of applications depending on
hipLaunchKernelGGL().

* Add HIP_INIT_API into the implementation of hipModuleGetGlobal

Address review comments.

* Fix failing HIP unit tests
2019-03-29 03:45:04 +00:00
Maneesh Gupta d0e5fbeb72 Merge pull request #992 from gargrahul/handle_d2d_memcpy2d
Handle D2D in memcpy2D
2019-03-28 04:41:36 +00:00
Rahul Garg 0c55db8552 Handle D2D in memcpy2D 2019-03-28 02:21:45 +05:30
Jeff Daily c9117de8eb load program state once per agent 2019-03-27 18:19:10 +00:00
Rahul Garg f0af073793 Let hipHostMalloc always share/map pinned host ptr 2019-03-26 10:19:13 +05:30
Rahul Garg 5e917d70f3 Avoid double mapping of devices to hostMalloc buffer 2019-03-25 23:07:05 +05:30
Maneesh Gupta 30b5c02ec4 Merge pull request #970 from mangupta/swdev-172995
hipExtMallocWithFlags implementation
2019-03-25 07:46:53 +00:00
Maneesh Gupta eb03d50de9 Merge pull request #962 from gargrahul/add_2d_copy_fallback
Add 2D fallback to use copy kernel
2019-03-25 07:46:43 +00:00
Rahul Garg 9bbfbceb64 2D Fallback needs hcc workweek 19101 or higher 2019-03-25 12:07:28 +05:30
Maneesh Gupta cab119c8b2 hipExtMallocWithFlags needs hcc workweek 19115 or higher 2019-03-25 11:41:20 +05:30
Siu Chi Chan 24d08beef8 reimplement HIP_INIT as hip_impl::hip_init(), add hip_init() to some of the inlined API (#966)
* reimplement HIP_INIT as a function, expose it as hip_impl::hip_init()
so that it could be called from hipLaunchKernelGGL and other inlined
HIP functions

* Don't call hip_init from ihipPreLaunchKernel
2019-03-20 05:11:15 +00:00
Maneesh Gupta 73ec5d54b5 hipExtMallocWithFlags implementation
Change-Id: Iee9e119796472200b2933d5e23be60813f33bc75
2019-03-19 11:59:22 +05:30
Evgeny 31475c5ac8 tracing callback layer update 2019-03-14 22:43:52 -05:00
Rahul Garg 918d7e3a40 Add 2D fallback to use copy kernel 2019-03-14 13:03:06 +05:30
Siu Chi Chan 824ee1aa72 move triple_to_hsa_isa into the header 2019-03-11 19:51:44 +00:00
Siu Chi Chan bf1d48bf78 Fix memory leak introduced by previous change to Agent_global.
Make Agent_global manage the lifetime of the name string
2019-03-11 19:51:32 +00:00
Aaron Enye Shi 00d24d254d Fix Agent_global variables failing hipTestDeviceSymbol
Issue: Header uses std::vector<Agent_global> agent_globals which is created by hip_module.cpp
  - Move iterator fails to copy Agent_global from library source into header version
  - Due to different versions of std::string name in struct Agent_global
Fix: Change Agent_global to use char* name instead of std::string name
2019-03-11 19:51:25 +00:00
Aaron Enye Shi 23e9968752 Fix hash_for undefined reference in hipTestConstant test
Issue: mismatch undefined symbols in different user env
  - Binary expects modified return value std::string&
  - Fails to match libhip_hcc.so: return value is std::string& but doesn't match modified C++ env
Fix: Change return value to char*, create new key std::string in header from char*
2019-03-11 19:51:18 +00:00
Maneesh Gupta 352b17346c Merge pull request #949 from gargrahul/single_stream_concurrent_kernels
Add extension for kernel concurrency on same stream
2019-03-06 17:34:54 +05:30
Alex Voicu ea0fcf3e61 dlopen() fixes (#929)
* Initial attempt to switch over to internally linked state.

* Add missing CMake update.

* hipLaunchKernelGGLImpl must be inline as well. Ensure internal linkage.

* Ensure global retrieval uses internally linked state.

* Hide HC in the implementation. Minimise ADL woes.

* Strange software exists, and must be catered to.

* Use a less spammy mechanism for ensuring internal linkage / non-export.

* Remove leftover internal detail.
2019-03-06 17:31:44 +05:30
Rahul Garg 59081c69fc Add extension for kernel concurrency on same stream 2019-03-06 12:55:39 +05:30
Wen-Heng (Jack) Chung 5cbd28f29b Address code review comments to use hipDeviceptr_t 2019-03-05 05:51:05 +00:00
Wen-Heng (Jack) Chung 7ebbbd3525 Add hipMemsetD32 and hipMemsetD32Async
Add 2 extra memset functions which fills memory with integer-typed data

Also change the parameters of ihipMemset to better explain the semantic
2019-03-04 17:00:33 +00:00
Wilkin Chau 8d92d1ebd7 Fix hipMemset3D test
Calculate the allocated size based on the width, height and depth.
2019-02-28 22:42:46 +00:00
Yaxun Sam Liu c5e813f64c Add __gnu_h2f_ieee and __gnu_f2h_ieee
The implementation is copied from HCC runtime.

For hcc it has no effect since apps can find them in either hcc runtime or HIP
runtime.

hip-clang needs it in HIP/HCC runtime so that HIP/HCC and HIP/VDI runtime are
swappable.
2019-02-21 12:48:28 -05:00
lmoriche 005e09f5b9 Merge pull request #899 from lmoriche/hip_clang
Add code-object-v3 support for the HIP-Clang path.
2019-02-06 19:50:51 -08:00
Laurent Morichetti 6ab8711f02 Add code-object-v3 support for the HIP-Clang path. 2019-02-05 14:43:26 -08:00
Rahul Garg 3120db15e4 Revert "Fixed issue of GPU device losing access to host pinned memory" 2019-02-04 22:45:13 -08:00
Rahul Garg bab3a94b33 Fixed issue of GPU device losing access to host pinned memory 2019-01-22 06:25:43 +05:30
Evgeny 0164464bcc fixing HSA_INIT_API cid args 2019-01-16 23:45:44 -06:00
Alex Voicu 0b57f50855 Ensure that static structures are populated 2019-01-09 17:21:53 +00:00
Wen-Heng (Jack) Chung b4d658a48f Introduce hash key to HIP module implementation
A hash calculated via FNV-1a algorithm is introduced in ihipModule_t, the
internal of hipModule_t. The hash is used by HIP module APIs such as

- read_agent_global_from_module

to determine whether the agent-scope globals for a module have been iterated.

This commit fixes one issue that applications which load / unload modules
frequently would occasionally fail. After deep investigation of the issue it
turns out the old implementation in read_agent_global_from_module uses
hipModule_t as the key, which is not robust enough, as hipModule_t instances
are allocated dynamically so there are cases that one memory address may be
used by multiple hipModule_t instances. The real solution is to introduce a
uniquely identifiable hash for the code object associated with the HIP module.
And that's the rationale behind this commit.
2019-01-08 17:33:40 +00:00
Alex Voicu ec14daa7ce Hook into the creaky lazy-reinit machinery. Try to minimise race-risk. 2019-01-01 11:01:16 +05:30
Alex Voicu 340674ceb6 More blobs, more problems. 2019-01-01 11:01:07 +05:30
Alex Voicu 25c7e5d609 Start re-working 731 for 2.0. 2019-01-01 11:00:57 +05:30