Commit Graph

3131 Commits

Author SHA1 Message Date
Evgeny Mankov 9eb10c5b4d [HIPIFY][doc] Fix typos and minor text inaccuracies in HIP FAQ
[ROCm/clr commit: 4990a1003b]
2019-04-05 19:19:38 +03:00
Maneesh Gupta bedff2406a Merge pull request #998 from yxsamliu/doc
hip-clang: update installation guide.

[ROCm/clr commit: 2d9e5615b7]
2019-04-02 05:08:18 +00:00
Maneesh Gupta 9d2c0c8e5d Merge pull request #997 from yxsamliu/mgpu
hip-clang: fix kernel not found on multi-gpu

[ROCm/clr commit: a6e7dfefe3]
2019-04-02 05:07:31 +00:00
Evgeny Mankov 17ac0d3e19 [HIPIFY][tests] Fix typo in test for CUDA 10.x
[ROCm/clr commit: 5b59f87305]
2019-04-01 19:52:08 +03:00
Evgeny Mankov 236624a229 [HIPIFY][cmake] Update CMakeLists and Readme because CUDA 10.1 and clang 8.0.0 are released
[ROCm/clr commit: 799a6f5512]
2019-04-01 19:44:52 +03:00
Yaxun Sam Liu 12ac74bad1 hip-clang: fix kernel not found on multi-gpu
__hipRegisterFunction is called by .init functions during program initialization.
It calls hipModuleGetFunction to locate kernel symbol in code objects. hipModuleGetFunction
assumes current device when locating kernel symbols. This works for HCC but not for hip-clang,
since hip-clang needs to locate kernel symbols for different devices without switching
between devices.

This patch introduces a new hsa agent parameter to ihipModuleGetFunction, which allows
__hipRegisterFunction to choose the correct hsa agent when locating kernel symbols. By
default it uses this_agent(), therefore this patch has no impact on HCC.


[ROCm/clr commit: 8f5c812a68]
2019-03-31 10:08:20 -04:00
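The agent parameter described in commit 12ac74bad1 can be illustrated with a minimal sketch. It is not the HIP implementation: `AgentId`, `KernelHandle`, `KernelTable`, `current_agent` and `find_kernel` are hypothetical names, and a plain map stands in for the loaded code objects. The point is only that a defaulted agent argument keeps existing callers unchanged while letting a registration routine ask for a specific device.

```cpp
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-ins; these are not the real HIP or HSA types.
using AgentId = int;
using KernelHandle = int;

// Stand-in for the "current device" agent (the commit calls it this_agent()).
inline AgentId current_agent() { return 0; }

// Kernel symbols for all agents, keyed by (agent, kernel name).
using KernelTable = std::map<std::pair<AgentId, std::string>, KernelHandle>;

// Look up a kernel for an explicit agent. The default argument preserves the
// old "current device" behaviour for callers that do not pass an agent.
// Returns nullptr when the agent has no kernel with that name.
inline const KernelHandle* find_kernel(const KernelTable& table,
                                       const std::string& name,
                                       AgentId agent = current_agent()) {
    auto it = table.find(std::make_pair(agent, name));
    return it == table.end() ? nullptr : &it->second;
}
```

A caller that never passes an agent still resolves against the current device, which mirrors the commit's claim that HCC is unaffected.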
Yaxun (Sam) Liu 54c2b79351 Update INSTALL.md
[ROCm/clr commit: 76a9fdd924]
2019-03-30 08:29:08 -04:00
Yaxun Sam Liu 79ac0097dc hip-clang: update installation guide.
[ROCm/clr commit: 598604aa7f]
2019-03-30 08:24:49 -04:00
Wen-Heng (Jack) Chung cfe930f9d6 Make hipModuleGetGlobal be in HIP runtime so it can be discovered at runtime (#981)
* Make hipModuleGetGlobal be in HIP runtime so it can be discovered at runtime

In HIP PR #929, quite a few HIP public APIs were made as inline functions with
hidden visibility. It was necessary to support applications with shared
libraries with GPU kernels launched via hipLaunchKernelGGL(), after HIP runtime
is initialized.

In empirical tests, the implementation has proved to be somewhat
excessive, especially for hipModuleGetGlobal(). The function is used by another
type of client application, which relies on the existence of this function
within the HIP runtime so that global symbols from HSA code objects loaded
dynamically at runtime can be retrieved programmatically.

This commit moves hipModuleGetGlobal() back to src/hip_module.cpp, and makes it
visible and not inline, to meet the requirements of the aforementioned
applications. It does not change the behavior of applications depending on
hipLaunchKernelGGL().

* Add HIP_INIT_API into the implementation of hipModuleGetGlobal

Address review comments.

* Fix failing HIP unit tests


[ROCm/clr commit: 04915cea2f]
2019-03-29 03:45:04 +00:00
Maneesh Gupta d99bc4c540 Merge pull request #992 from gargrahul/handle_d2d_memcpy2d
Handle D2D in memcpy2D

[ROCm/clr commit: f9f4cee347]
2019-03-28 04:41:36 +00:00
Jeff Daily 5233d41c6c improve program state commentary
Disambiguate calling many variables "agent".
More detail in exception message.
Create and discard map placeholders; no need to call std::vector::clear() on map value.


[ROCm/clr commit: f5e4fff6cc]
2019-03-27 21:40:27 +00:00
Rahul Garg 73bb9a74bb Handle D2D in memcpy2D
[ROCm/clr commit: 50d623981e]
2019-03-28 02:21:45 +05:30
Jeff Daily 9cee2c5311 load program state once per agent
[ROCm/clr commit: 2845b4c4b8]
2019-03-27 18:19:10 +00:00
Maneesh Gupta a0b29c8ed0 Merge pull request #987 from gargrahul/fix_hostmalloc_double_device_map
Avoid double mapping of devices to hostMalloc buffer

[ROCm/clr commit: 93906a072c]
2019-03-27 05:23:47 +00:00
Maneesh Gupta 6fb7f626ba Merge pull request #990 from mhbliao/hliao/master/sw
SWDEV-184380 Fix hcc compilation

[ROCm/clr commit: 178e3ecdca]
2019-03-27 05:23:26 +00:00
Michael LIAO c5717a37d7 SWDEV-184380 Fix hcc compilation
- `hcc` has no builtin. Need to invoke LLVM intrinsic directly.


[ROCm/clr commit: d355122bf9]
2019-03-26 15:20:17 -04:00
Rahul Garg 0d47ae4203 Let hipHostMalloc always share/map pinned host ptr
[ROCm/clr commit: 9b38380c03]
2019-03-26 10:19:13 +05:30
Rahul Garg 21d7bbab11 Avoid double mapping of devices to hostMalloc buffer
[ROCm/clr commit: ad11972f47]
2019-03-25 23:07:05 +05:30
Michael LIAO 5482fa8102 [hip] Fix typo in macro hipLaunchKernel
[ROCm/clr commit: 13655df76e]
2019-03-25 12:06:46 -04:00
Maneesh Gupta 817e064745 Merge pull request #970 from mangupta/swdev-172995
hipExtMallocWithFlags implementation

[ROCm/clr commit: c20d233585]
2019-03-25 07:46:53 +00:00
Maneesh Gupta 9e2774e81e Merge pull request #962 from gargrahul/add_2d_copy_fallback
Add 2D fallback to use copy kernel

[ROCm/clr commit: 9de28dfa5a]
2019-03-25 07:46:43 +00:00
Rahul Garg 66ce9921d5 2D Fallback needs hcc workweek 19101 or higher
[ROCm/clr commit: bec3995700]
2019-03-25 12:07:28 +05:30
Maneesh Gupta 505fc1e98c hipExtMallocWithFlags needs hcc workweek 19115 or higher
[ROCm/clr commit: 45255ab492]
2019-03-25 11:41:20 +05:30
Maneesh Gupta 888b43cc6f Merge pull request #982 from ROCm-Developer-Tools/hack_swdev-173477
HACK for SWDEV-173477

[ROCm/clr commit: 158eac9374]
2019-03-22 09:14:38 +00:00
Wen-Heng (Jack) Chung 86379d694f HACK for SWDEV-173477
For code objects with global symbols of length 0, ROCR runtime would
ignore them even though they exist in the symbol table. Therefore the
result from read_agent_globals() can't be trusted entirely.

As a workaround to tame applications which depend on the existence of
global symbols with length 0, always return hipSuccess here.

This behavior shall be reverted once ROCR runtime has been fixed to
address SWDEV-173477


[ROCm/clr commit: cf7ad0f184]
2019-03-21 17:18:16 +00:00
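The workaround in commit 86379d694f can be pictured with the following sketch. It is an illustration under assumptions, not the code from the commit: `Status`, `Global`, `lookup_strict` and `lookup_lenient` are hypothetical names, and a map stands in for whatever read_agent_globals() returns. The lenient variant reports success even when the symbol is absent, which is the behavior the commit message describes as temporary.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-ins for hipError_t values and a symbol-table entry.
enum class Status { Success, NotFound };
struct Global {
    void* address;
    std::size_t size;
};
using GlobalTable = std::map<std::string, Global>;

// Strict lookup: reports NotFound when the table has no entry for `name`.
inline Status lookup_strict(const GlobalTable& globals, const std::string& name,
                            Global* out) {
    auto it = globals.find(name);
    if (it == globals.end()) return Status::NotFound;
    *out = it->second;
    return Status::Success;
}

// Lenient lookup: a missing entry may be a zero-length symbol that the
// runtime dropped, so report Success with an empty placeholder instead.
inline Status lookup_lenient(const GlobalTable& globals, const std::string& name,
                             Global* out) {
    if (lookup_strict(globals, name, out) == Status::NotFound) {
        out->address = nullptr;
        out->size = 0;
    }
    return Status::Success;
}
```

The cost of the lenient form is that a genuinely missing symbol is no longer reported, which is why the commit calls for reverting it once the runtime is fixed.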
Nico Trost 0b3f8dce2b fixed loss of accuracy in hipCfma()
[ROCm/clr commit: 725486fb11]
2019-03-21 10:30:10 +01:00
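The commit message for 0b3f8dce2b does not say how the accuracy loss was fixed. As a general illustration only, one common way to limit rounding error in a complex multiply-add p*q + r is to accumulate each component with fused multiply-add, so that every product is rounded once together with its addend. `ComplexF` and `cfma_sketch` below are hypothetical names, not HIP types or functions.

```cpp
#include <cmath>

// Hypothetical single-precision complex value (x = real part, y = imaginary part).
struct ComplexF {
    float x;
    float y;
};

// Computes p * q + r, accumulating each component with std::fma.
//   real = p.x*q.x - p.y*q.y + r.x
//   imag = p.x*q.y + p.y*q.x + r.y
inline ComplexF cfma_sketch(ComplexF p, ComplexF q, ComplexF r) {
    float real = std::fma(p.x, q.x, r.x);
    real = std::fma(-p.y, q.y, real);
    float imag = std::fma(p.x, q.y, r.y);
    imag = std::fma(p.y, q.x, imag);
    return ComplexF{real, imag};
}
```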
eshcherb 0cf8b184a5 adding hip_prof_gen verbose log (#977)
* adding hip_prof_gen verbose log

* adding stderr fatal error

* adding no error exit by default

* adding hip_prof_str regeneration dependencies

* adding more informative messages

* fixing error message


[ROCm/clr commit: 045c6afa2c]
2019-03-21 05:28:18 +00:00
Maneesh Gupta 19bba906a2 Merge pull request #972 from yxsamliu/global
Add declaration of symbol related API for VDI

[ROCm/clr commit: ce72890dcf]
2019-03-20 05:12:21 +00:00
Maneesh Gupta aac0de849c Merge pull request #973 from mhbliao/hliao/master/build
[Device Function] Fix typos.

[ROCm/clr commit: 54091b5273]
2019-03-20 05:12:14 +00:00
Maneesh Gupta 0bae7dac36 Merge pull request #974 from yxsamliu/name2
Change HIP dll name to amdhip64.dll on Windows

[ROCm/clr commit: 48d790e205]
2019-03-20 05:11:58 +00:00
eshcherb 05b9ae6a09 adding prof primitives generator (#967)
* adding prof primitives generator

* minor change, renaming

* minor cosmetic changes, comment corrections and dead code removal

* minor changes and renaming

* minor change, fixing comments


[ROCm/clr commit: 1229750546]
2019-03-20 05:11:40 +00:00
Siu Chi Chan 597c06b6be reimplement HIP_INIT as hip_impl::hip_init(), add hip_init() to some of the inlined API (#966)
* reimplement HIP_INIT as a function, expose it as hip_impl::hip_init()
so that it could be called from hipLaunchKernelGGL and other inlined
HIP functions

* Don't call hip_init from ihipPreLaunchKernel


[ROCm/clr commit: fa9495841b]
2019-03-20 05:11:15 +00:00
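The idea in commit 597c06b6be, turning an initialization macro into a function that inlined entry points can call, can be sketched as below. This is an assumption-laden illustration, not the HIP source: the `sketch` namespace, `init_count` and `launch_stub` are hypothetical, and a function-local static is just one way to get run-once behavior (its initialization is thread-safe since C++11).

```cpp
namespace sketch {

// Counts how many times the one-time setup actually ran (for demonstration).
inline int& init_count() {
    static int count = 0;
    return count;
}

// Idempotent initializer: the lambda runs only on the first call.
inline void hip_init() {
    static const bool done = [] {
        ++init_count();  // real runtime setup would go here
        return true;
    }();
    (void)done;
}

// Stand-in for an inlined API entry point that must ensure the runtime is up
// before doing its own work.
inline int launch_stub(int x) {
    hip_init();
    return x * 2;
}

}  // namespace sketch
```

Because the initializer is an ordinary function, any inlined API can call it first without relying on a macro being expanded in the right translation unit.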
Yaxun Sam Liu d52780d6f9 Change HIP dll name to amdhip64.dll on Windows
[ROCm/clr commit: 55f4c416a0]
2019-03-19 16:27:18 -04:00
Michael LIAO 43afb85ca4 [Device Function] Fix typos.
[ROCm/clr commit: f42e84cef7]
2019-03-19 15:32:19 -04:00
Yaxun Sam Liu 24bd42fb57 Add declaration of symbol related API for VDI
[ROCm/clr commit: fb3241a000]
2019-03-19 11:11:49 -04:00
Maneesh Gupta 5b1ce07700 Merge pull request #969 from nicholasmalaya/patch-1
Update hip_faq.md

[ROCm/clr commit: 6ced14e71c]
2019-03-19 18:42:05 +05:30
Maneesh Gupta 4366c618d5 Merge pull request #965 from mhbliao/hliao/master/immarg
[Device Function] Support immediate argument.

[ROCm/clr commit: e7453483e2]
2019-03-19 18:41:31 +05:30
Maneesh Gupta 64089c1d87 Merge pull request #954 from mhbliao/master
[hip] Re-implement hipLaunchKernelGGL as macros.

[ROCm/clr commit: 1500eec5f7]
2019-03-19 18:39:27 +05:30
Maneesh Gupta f1d064562d hipExtMallocWithFlags implementation
Change-Id: Iee9e119796472200b2933d5e23be60813f33bc75


[ROCm/clr commit: e44de376f7]
2019-03-19 11:59:22 +05:30
Nicholas Malaya bc0eab04fc Update hip_faq.md
Make it clearer what this list details. In particular, the list is intended to indicate which items from each CUDA release are supported and which are not.

[ROCm/clr commit: b1ec4e0b5f]
2019-03-18 14:51:18 -05:00
Michael LIAO 360e5b366d [Device Function] Support immediate argument.
- `immarg`, immediate argument, is enabled on all AMDGPU intrinsics.
  Revise device functions using these intrinsics with immediate
  arguments.


[ROCm/clr commit: b74b4500c4]
2019-03-15 12:38:04 -04:00
Evgeny a0c8ef2e96 tracing callback layer update
[ROCm/clr commit: 2aa88a4505]
2019-03-14 22:43:52 -05:00
Maneesh Gupta 694bbbc366 Merge pull request #963 from gargrahul/add_module_get_global_test
Test hipModuleGetGlobal

[ROCm/clr commit: e3726bbf90]
2019-03-15 06:17:50 +05:30
Maneesh Gupta fc835d6a43 Merge pull request #958 from aaronenyeshi/cxxabi-mismatch-workaround
CXX11 ABI Mismatch Workaround

[ROCm/clr commit: 23170f6af8]
2019-03-15 06:15:46 +05:30
Rahul Garg 19d2ff51c8 Test hipModuleGetGlobal
[ROCm/clr commit: 46346343af]
2019-03-15 04:08:03 +05:30
Rahul Garg da6653482d Add 2D fallback to use copy kernel
[ROCm/clr commit: af72cde0a1]
2019-03-14 13:03:06 +05:30
Siu Chi Chan b2a51c6cdb remove visibility hidden attribute
[ROCm/clr commit: 739d43c5d8]
2019-03-13 11:58:32 -04:00
Evgeny 0b4f2151a2 adding memset32d
[ROCm/clr commit: 0586924ae6]
2019-03-11 21:28:27 -05:00
Siu Chi Chan 2d6ebcaffb minor cleanup
[ROCm/clr commit: 5044c9ba49]
2019-03-11 19:51:57 +00:00
Siu Chi Chan 7955348b8e remove old style triple name
[ROCm/clr commit: f54da9358b]
2019-03-11 19:51:51 +00:00