Граф коммитов

993 Коммитов

Автор SHA1 Сообщение Дата
Maneesh Gupta 7264f6b64e Merge pull request #1082 from gargrahul/fix_hipmemcpy_symbol_nvcc
Fix symbol address issue on NVCC path
2019-05-07 16:17:01 +05:30
Maneesh Gupta d5abe65668 Merge pull request #1081 from mangupta/swdev-181624
Implement hipExtGetLinkTypeAndHopCount for ROCm devices
2019-05-07 16:15:41 +05:30
Maneesh Gupta d82d6b499e Merge pull request #1068 from mhbliao/hliao/master/dev_vec_func
[devfunc] Add necessary `__device__` and `__host__` attributes.
2019-05-07 16:01:48 +05:30
Rahul Garg e906bd69ce Fix symbol address issue on NVCC path 2019-05-07 03:59:43 +05:30
Maneesh Gupta 035ef04e1f Implement hipExtGetLinkTypeAndHopCount for ROCm devices
Change-Id: Ie5bb4f640ac6d189c7fceeab22627a7494fd10bd
2019-05-06 15:54:31 +05:30
Maneesh Gupta 117bdd8774 Merge pull request #1062 from mhbliao/hliao/master/icmp
[hip] Re-implement ballot using AMDGCN builtins
2019-05-03 17:48:19 +05:30
Michael LIAO 44692978fe [devfunc] Add necessary __device__ and __host__ attributes.
- Minor clean up to keep consistent function declaration.
2019-05-01 22:26:35 -04:00
Michael LIAO 9bd2d5746d [Device Function] Fix implementation of __bitinsert_u64
- It's a common mistake by assuming 1 << shamt would be promoted to
  64-bit, if shamt is a 64-bit integer. That's not the case. Replace
  that left shift to a 64-bit one to ensure it won't fall into undefined
  behavior.
- Fix the host-side implementation as well for device function testing.
2019-04-30 08:59:13 -04:00
Michael LIAO a64637da2c [devfunc] Re-implement ballot using AMDGCN builtins
- As the signature of `amdgcn.icmp` is changed for next-gen chip, using
  clang builtins is portable way to hide that details.
2019-04-29 17:21:25 -04:00
Aaron Enye Shi a3d118eaa8 Revert "Use COMgr to read Kernel Args Metadata (#1006)"
This reverts commit 8a548bf40b.
2019-04-26 16:04:56 -04:00
Maneesh Gupta ffe9f86fe8 Merge pull request #1043 from mhbliao/hliao/master/fp16
[hip] Fix including of hip_fp16.h
2019-04-24 16:50:46 +05:30
Maneesh Gupta de6c680767 Merge pull request #1042 from mhbliao/hliao/master/ldg
[hip] Fix use of `__HIP_CLANG_ONLY__` in `hip_ldg.h`.
2019-04-24 16:50:37 +05:30
Maneesh Gupta e489f7579a Merge pull request #1040 from eshcherb/roctracer-hip-frontend-190422
hip_prof_api.h include under __cplusplus
2019-04-24 16:50:27 +05:30
Maneesh Gupta 2975221560 Merge pull request #1039 from gargrahul/fix_ptrgetattr_nvcc
Fix hipPointerGetAttributes for NVCC
2019-04-24 16:50:18 +05:30
Rahul Garg 2bc2c46d4d Add hipMallocManaged default functional support (#1036)
* Add hipMallocManaged default functional support

* Fix build error

* Add dtest
2019-04-24 16:50:03 +05:30
Michael LIAO dc0d7bd5ce [hip] Fix including of hip_fp16.h
- Separate the definition of `__HCC_OR_HIP_CLANG__`, `__HCC_ONLY__`, and
  `__HIP_CLANG_ONLY__` into hip_common.h so that it could be included in
  hip_fp16.h, which may be included separately in app.
2019-04-23 09:16:00 -04:00
Michael LIAO 6fb07acc8c [hip] Fix use of __HIP_CLANG_ONLY__ in hip_ldg.h.
- Check its value instead of whether it's defined or not.
2019-04-22 23:22:32 -04:00
Evgeny af3f3ccb2b hip_prof_api.h include under __cplusplus 2019-04-22 21:14:18 -05:00
Rahul Garg 69a3d6b72a Fix hipPointerGetAttributes for NVCC 2019-04-23 03:22:25 +05:30
Konstantin Pyzhov beadaab661 Fix for __popcll() device function implementation. 2019-04-19 08:53:22 -04:00
Konstantin Pyzhov b7bd29924a Fix for __ffsll() device functions. 2019-04-18 13:07:24 -04:00
David Salinas 5843530a06 Revert "append the ELF flags for sram-ecc and xnack to the target triple per code object"
This reverts commit c61f265657.
2019-04-18 11:49:40 -04:00
Maneesh Gupta 8309632e2d Merge pull request #995 from david-salinas/add_sram-ecc_and_xnack_flags_to_triple
Append the ELF flags for sram-ecc and xnack to the target triple per code object
2019-04-16 09:10:04 +05:30
Maneesh Gupta d789aef46e Merge pull request #1019 from scchan/lazy_binding
minor workaround for lazy binding
2019-04-16 08:36:10 +05:30
Mr-LiuSw dc07b1b06c add little changes in hip_runtime_api.h to work with c language (#1017)
* Update hip_runtime_api.h

when i try to use mpicc or gcc to compile a c language code which call some hip runtime api , error occured as
> /path/to/hcc_detail/hip_runtime_api.h:2268:33: error: unknown type name ‘hipFuncAttributes’; 
> hipFuncGetAttributes(hipFuncAttributes* attr, const void* func);
 
add ' struct ' for the first parameter of hipFuncGetAttributes will get ride of this problem.
2019-04-16 08:35:36 +05:30
Aaron Enye Shi 8a548bf40b Use COMgr to read Kernel Args Metadata (#1006)
* Add CMAKE dep to amd_comgr

* Use COMGR for read_kernarg_metadata in COV2

* Do not assume kernargs exist

* Add proper metadata destroy cleanup

* Use a process function for easier destroy

* Remove old read_kernarg_metadata

* Clean up HCC, prints, names

* Use COMGR in CMAKE by default

* Move metadata lookup for keyword values into helper

* Remove C string usage for lookup_keyword_value

* Guard COMGR for non-NVCC path

* Add hip_hcc dependency on comgr package

* Add lifetime to metadata nodes

* Find COMGR config file for amd_comgr target

* Move set_active data earlier
2019-04-16 08:34:39 +05:30
Yaxun (Sam) Liu b9f8f977fb hip-clang: Add __align__
CUDA has __align__. Define eqivalent for hip-clang.
2019-04-10 14:17:18 -04:00
David Salinas c61f265657 append the ELF flags for sram-ecc and xnack to the target triple per code object 2019-04-05 13:17:11 -04:00
Siu Chi Chan e126f7254c minor workaround for lazy binding 2019-04-02 17:28:06 -04:00
Wen-Heng (Jack) Chung 4b7177ac42 Make hipModuleGetGlobal be in HIP runtime so it can be discovered at runtime (#981)
* Make hipModuleGetGlobal be in HIP runtime so it can be discovered at runtime

In HIP PR #929, quite a few HIP public APIs were made as inline functions with
hidden visibility. It was necessary to support applications with shared
libraries with GPU kernels launched via hipLaunchKernelGGL(), after HIP runtime
is initialized.

In empirical tests, the implementation has been proved to be a bit too
excessive, especially for hipModuleGetGlobal(). The function is used by another
type of client applications which relies on the existence of this function
within HIP runtime so global symbols from HSA code objects loaded dynamically
at runtime can be retrieved programmtically.

This commit moves hipModuleGetGlobal() back to src/hip_module.cpp, and makes it
visible and not inline, to fulfill requirements for applications
aforementioned. It does not change the behavior of applications depending on
hipLaunchKernelGGL().

* Add HIP_INIT_API into the implementation of hipModuleGetGlobal

Address review comments.

* Fix failing HIP unit tests
2019-03-29 03:45:04 +00:00
Jeff Daily 7aada87cbd improve program state commentary
Disambiguate calling many varibles "agent".
More detail in exception message.
Create and discard map placeholders; no need to call std::vector::clear() on map value.
2019-03-27 21:40:27 +00:00
Jeff Daily c9117de8eb load program state once per agent 2019-03-27 18:19:10 +00:00
Maneesh Gupta e66dfa6ea3 Merge pull request #990 from mhbliao/hliao/master/sw
SWDEV-184380 Fix hcc compilation
2019-03-27 05:23:26 +00:00
Michael LIAO 939d153c1a SWDEV-184380 Fix hcc compilation
- `hcc` has no builtin. Need to invoke LLVM intrinsic directly.
2019-03-26 15:20:17 -04:00
Michael LIAO b3497824ec [hip] Fix typo in macro hipLaunchKernel 2019-03-25 12:06:46 -04:00
Maneesh Gupta 30b5c02ec4 Merge pull request #970 from mangupta/swdev-172995
hipExtMallocWithFlags implementation
2019-03-25 07:46:53 +00:00
Maneesh Gupta a75f375788 Merge pull request #982 from ROCm-Developer-Tools/hack_swdev-173477
HACK for SWDEV-173477
2019-03-22 09:14:38 +00:00
Wen-Heng (Jack) Chung d941f19399 HACK for SWDEV-173477
For code objects with global symbols of length 0, ROCR runtime would
ignore them even though they exist in the symbol table. Therefore the
result from read_agent_globals() can't be trusted entirely.

As a workaround to tame applications which depend on the existence of
global symbols with length 0, always return hipSuccess here.

This behavior shall be reverted once ROCR runtime has been fixed to
address SWDEV-173477
2019-03-21 17:18:16 +00:00
Nico Trost 7bb5a14a4e fixed loss of accuracy in hipCfma() 2019-03-21 10:30:10 +01:00
Maneesh Gupta 6ade928535 Merge pull request #972 from yxsamliu/global
Add declaration of symbol related API for VDI
2019-03-20 05:12:21 +00:00
Maneesh Gupta 87551e29f6 Merge pull request #973 from mhbliao/hliao/master/build
[Device Function] Fix typos.
2019-03-20 05:12:14 +00:00
eshcherb 673cdeea55 adding prof primitives generator (#967)
* adding prof primitives generator

* minor change, renaming

* minor cosmetic changes, comments correcting and dead code removing

* minor changes and renaming

* minor chane, fixing comments
2019-03-20 05:11:40 +00:00
Siu Chi Chan 24d08beef8 reimplement HIP_INIT as hip_impl::hip_init(), add hip_init() to some of the inlined API (#966)
* reimplement HIP_INIT as a function, expose it as hip_impl::hip_init()
so that it could be called from hipLaunchKernelGGL and other inlined
HIP functions

* Don't call hip_init from ihipPreLaunchKernel
2019-03-20 05:11:15 +00:00
Michael LIAO e51fa174c8 [Device Function] Fix typos. 2019-03-19 15:32:19 -04:00
Yaxun Sam Liu e9bc4034d8 Add declaration of symbol related API for VDI 2019-03-19 11:11:49 -04:00
Maneesh Gupta 4094824b9e Merge pull request #965 from mhbliao/hliao/master/immarg
[Device Function] Support immediate argument.
2019-03-19 18:41:31 +05:30
Maneesh Gupta 91ebfadabf Merge pull request #954 from mhbliao/master
[hip] Re-implement hipLaunchKernelGGL as macros.
2019-03-19 18:39:27 +05:30
Maneesh Gupta 73ec5d54b5 hipExtMallocWithFlags implementation
Change-Id: Iee9e119796472200b2933d5e23be60813f33bc75
2019-03-19 11:59:22 +05:30
Michael LIAO 950b6efe72 [Device Function] Support immediate argument.
- `immarg`, immediate argument, is enabled on all AMDGPU intrinsics.
  Revise device functions using these intrinsics with immediate
  arguments.
2019-03-15 12:38:04 -04:00
Evgeny 31475c5ac8 tracing callback layer update 2019-03-14 22:43:52 -05:00