Commit Graph

1010 Commits

Author SHA1 Message Date
Maneesh Gupta ef87f7eaef Merge pull request #1152 from asalmanp/hip_as_b
Header change for new hip API hipExtLaunchMultiKernelMultiDevice
2019-06-04 13:21:13 +05:30
Maneesh Gupta 98aa6cf895 Merge pull request #1149 from zuhaib27/SWDEV-185448
Structured hipFloatComplex as typedef of float2, and hipDoubleComplex as typedef of double2.
2019-06-04 13:21:02 +05:30
Aryan Salmanpour d8e94fd5b5 Header change for new hip API hipExtLaunchMultiKernelMultiDevice 2019-05-30 18:04:05 -04:00
Siu Chi Chan b2ffd6afc2 fix compilation error when host compiler is clang (#1147)
* fix compilation error when host compiler is clang

* use a macro specifically for hcc && hip-clang
2019-05-29 12:34:48 +05:30
Zuhaib Khan 6aa704e7b9 Structured hipFloatComplex as typedef of float2, and hipDoubleComplex as typedef of double2. 2019-05-28 16:57:51 -04:00
Maneesh Gupta 4af81134ba Header changes for cooperative groups
Change-Id: I5f3acca94275d74adc97adcb168aed9f74951189
2019-05-28 16:58:55 +05:30
Maneesh Gupta f03a8cc1b0 Merge pull request #1128 from aaronenyeshi/fix-smid-func
Fix bug in __smid not setting correct size
2019-05-24 14:16:12 +05:30
Aaron Enye Shi 2b11a8bf0c Fix bug in __smid not setting correct size
The SZ field should minus by 1 since SIZE range is 1..32. Also add comments that results may vary.
2019-05-22 19:20:09 +00:00
Evgeny Mankov 49b9df7a9e [HIP] fix typo in #1127 2019-05-22 20:48:18 +03:00
Evgeny Mankov 6806ab6745 [HIP] fix nvcc path break in #1127 2019-05-22 20:04:45 +03:00
Evgeny Mankov 9cb3e9aa5e [HIP][HIPIFY] Make hipMemcpyParam2D coherent with cuMemcpy2D
+ Makes hip_Memcpy2D struct compatible with CUDA_MEMCPY2D struct
+ Add hipMemcpyParam2D support in nvcc fallback path
+ Update hipify-clang, tests and docs accordingly
2019-05-22 18:31:39 +03:00
Alex Voicu a538eb705a Add HIPRTC, glorious ersatz for NVRTC (#1097)
* Add ersatz for NVRTC.

* Fix extraneous paren and use correct namespace.

* Use lowerCamelCase (yuck, yuck) consistently.

* Link against FS when building hiprtc lib.

* Correctly mark Manipulators. Fix dual compile.

* Add unit tests. Extend HIT to accept linker options.

* Make sure the HIPRTC library is installed.

* Better logging. Try to auto-detect the target.

* Stop specifying the target explicitly.

* Add missing flavour of `hipModuleLaunchKernel`.

* Program was already destroyed.

* Don't use `--genco`. Fix mangled name trimming.

* Fix HIPRTC breakage due to upstream noise.

* [dtests] Replace RUN -> TEST in hiprtc tests

Change-Id: Ie499e92dfe4e5c94634b1c2b76cf52d241bcfea3

* [hit] Set HIP_PATH to HIP_ROOT_DIR for all tests

Change-Id: Ib0ad1f99bc71c03e363e055dd508a7a4a210680a
2019-05-16 18:28:54 +05:30
Wen-Heng (Jack) Chung a4db991cbf Revert "HACK for SWDEV-173477" (#1004)
* Revert "HACK for SWDEV-173477"

This reverts commit cf7ad0f184.
2019-05-13 14:42:05 +05:30
Rahul Garg e1f3dc0c80 Add fine grained host memory lock support (#1095)
* Add fine grained host memory lock support

* Fix default flag check
2019-05-13 11:48:26 +05:30
Siu Chi Chan 05a1b696da migrate program_state logic from header into shared library (phase I) (#1077)
* Revert "Revert "Use COMgr to read Kernel Args Metadata (#1006)""

This reverts commit 235c6877c8.

* Revert "Use COMgr to read Kernel Args Metadata (#1006)"

This reverts commit 2c80975e9c.

* Revert "improve program state commentary"

This reverts commit f5e4fff6cc.

* Revert "load program state once per agent"

This reverts commit 2845b4c4b8.

* start moving function_names() into the hip shared lib

* start moving code_object_blobs to a new "state" object

* Consolidate various program state related static objects into a
single program_state object

* minor clean up

* move more stuffs from functional_grid_launch into program_state

* debug make_kernarg

* moving lookup for kernargs size_align into program_state

* clean up old code for kernarg size and alignment

* update hip_module to use newer api in program_state

* Create public member functions for program_state

* move most program state functions into shared library

* Pass the data buffer size to load_executable
Otherwise, it can't figure what the data size is
just from the char* (since the data is not really a string)

* turning free functions in program state into members of program_state_impl

* change the free function globals() into a member of program_state_impl

* replace the static mutex used for populating globals

* moving associate_code_object_symbols_with_host_allocation into
program_state_impl

* move load_code_object_and_freeze_executable into program_state_impl

* moving executables and functions_names into program_state_impl

* moving kernels() into program_state_impl

* moving functions() into program_state_impl

* move get_kernargs into program_state_impl

* moving kernel_descriptor into program_state_impl

* moving kernargs_size_align calculation into program_state_impl

* Changing the handle to program_state_impl to a pointer

* moving program_state_impl into a separate inline source file

* fixing/cleaning up some header file includes

* moving member function for kernargs_size_align into program_state.cpp

* moving Kernel_descriptor into program_state.inl

* add a new class to manage agent globals

* moving all agent globals processing functions into agent_globals_impl

* load program state once per agent

re-merging PR991 against other program state changes

* fix per-agent program state member initialization

* cache executables based on elf name, isa, and agent.

This avoids program state reloading executables after a shared library is dlopened.

re-merging PR1057 against other program state changes

* protect executables cache by a global mutex

* return ref to executables cache

* adapt PR#981 Make hipModuleGetGlobal be in HIP runtime
2019-05-12 19:24:03 +05:30
Maneesh Gupta e78a09c041 Merge pull request #1084 from mhbliao/hliao/master/api_ext
[hip] Add API `hipExtModuleLaunchKernel` in HIP runtime
2019-05-09 18:26:31 +05:30
Maneesh Gupta c6cf2a9e26 Merge pull request #1082 from gargrahul/fix_hipmemcpy_symbol_nvcc
Fix symbol address issue on NVCC path
2019-05-07 16:17:01 +05:30
Maneesh Gupta c6c5e4cee8 Merge pull request #1081 from mangupta/swdev-181624
Implement hipExtGetLinkTypeAndHopCount for ROCm devices
2019-05-07 16:15:41 +05:30
Maneesh Gupta 11972049c6 Merge pull request #1068 from mhbliao/hliao/master/dev_vec_func
[devfunc] Add necessary `__device__` and `__host__` attributes.
2019-05-07 16:01:48 +05:30
Michael LIAO de768c22ae [hip] Add API hipExtModuleLaunchKernel in HIP runtime 2019-05-06 21:20:28 -04:00
Rahul Garg 6cbc70d238 Fix symbol address issue on NVCC path 2019-05-07 03:59:43 +05:30
Maneesh Gupta 2f43f110d9 Implement hipExtGetLinkTypeAndHopCount for ROCm devices
Change-Id: Ie5bb4f640ac6d189c7fceeab22627a7494fd10bd
2019-05-06 15:54:31 +05:30
Maneesh Gupta 2eafa5dcf9 Merge pull request #1062 from mhbliao/hliao/master/icmp
[hip] Re-implement ballot using AMDGCN builtins
2019-05-03 17:48:19 +05:30
Michael LIAO a9f90713f3 [devfunc] Add necessary __device__ and __host__ attributes.
- Minor clean up to keep consistent function declaration.
2019-05-01 22:26:35 -04:00
Michael LIAO 2380eb8ecc [Device Function] Fix implementation of __bitinsert_u64
- It's a common mistake by assuming 1 << shamt would be promoted to
  64-bit, if shamt is a 64-bit integer. That's not the case. Replace
  that left shift to a 64-bit one to ensure it won't fall into undefined
  behavior.
- Fix the host-side implementation as well for device function testing.
2019-04-30 08:59:13 -04:00
Michael LIAO a7a4d80f54 [devfunc] Re-implement ballot using AMDGCN builtins
- As the signature of `amdgcn.icmp` is changed for next-gen chip, using
  clang builtins is portable way to hide that details.
2019-04-29 17:21:25 -04:00
Aaron Enye Shi 235c6877c8 Revert "Use COMgr to read Kernel Args Metadata (#1006)"
This reverts commit 2c80975e9c.
2019-04-26 16:04:56 -04:00
Maneesh Gupta 7f81c72f1c Merge pull request #1043 from mhbliao/hliao/master/fp16
[hip] Fix including of hip_fp16.h
2019-04-24 16:50:46 +05:30
Maneesh Gupta 63ab2ea945 Merge pull request #1042 from mhbliao/hliao/master/ldg
[hip] Fix use of `__HIP_CLANG_ONLY__` in `hip_ldg.h`.
2019-04-24 16:50:37 +05:30
Maneesh Gupta 54cdeabe6e Merge pull request #1040 from eshcherb/roctracer-hip-frontend-190422
hip_prof_api.h include under __cplusplus
2019-04-24 16:50:27 +05:30
Maneesh Gupta 7edb43bc83 Merge pull request #1039 from gargrahul/fix_ptrgetattr_nvcc
Fix hipPointerGetAttributes for NVCC
2019-04-24 16:50:18 +05:30
Rahul Garg 94769fc8dd Add hipMallocManaged default functional support (#1036)
* Add hipMallocManaged default functional support

* Fix build error

* Add dtest
2019-04-24 16:50:03 +05:30
Michael LIAO d086dbd0e5 [hip] Fix including of hip_fp16.h
- Separate the definition of `__HCC_OR_HIP_CLANG__`, `__HCC_ONLY__`, and
  `__HIP_CLANG_ONLY__` into hip_common.h so that it could be included in
  hip_fp16.h, which may be included separately in app.
2019-04-23 09:16:00 -04:00
Michael LIAO ca6a5c07eb [hip] Fix use of __HIP_CLANG_ONLY__ in hip_ldg.h.
- Check its value instead of whether it's defined or not.
2019-04-22 23:22:32 -04:00
Evgeny 165c42483b hip_prof_api.h include under __cplusplus 2019-04-22 21:14:18 -05:00
Rahul Garg c0e0f0b7fd Fix hipPointerGetAttributes for NVCC 2019-04-23 03:22:25 +05:30
Konstantin Pyzhov f6fbf8751d Fix for __popcll() device function implementation. 2019-04-19 08:53:22 -04:00
Konstantin Pyzhov 5664ed3206 Fix for __ffsll() device functions. 2019-04-18 13:07:24 -04:00
David Salinas 1237a0b691 Revert "append the ELF flags for sram-ecc and xnack to the target triple per code object"
This reverts commit 4d0dc45078.
2019-04-18 11:49:40 -04:00
Maneesh Gupta 715a500b97 Merge pull request #995 from david-salinas/add_sram-ecc_and_xnack_flags_to_triple
Append the ELF flags for sram-ecc and xnack to the target triple per code object
2019-04-16 09:10:04 +05:30
Maneesh Gupta 22660bed74 Merge pull request #1019 from scchan/lazy_binding
minor workaround for lazy binding
2019-04-16 08:36:10 +05:30
Mr-LiuSw 64bdf82265 add little changes in hip_runtime_api.h to work with c language (#1017)
* Update hip_runtime_api.h

when i try to use mpicc or gcc to compile a c language code which call some hip runtime api , error occured as
> /path/to/hcc_detail/hip_runtime_api.h:2268:33: error: unknown type name ‘hipFuncAttributes’; 
> hipFuncGetAttributes(hipFuncAttributes* attr, const void* func);
 
add ' struct ' for the first parameter of hipFuncGetAttributes will get ride of this problem.
2019-04-16 08:35:36 +05:30
Aaron Enye Shi 2c80975e9c Use COMgr to read Kernel Args Metadata (#1006)
* Add CMAKE dep to amd_comgr

* Use COMGR for read_kernarg_metadata in COV2

* Do not assume kernargs exist

* Add proper metadata destroy cleanup

* Use a process function for easier destroy

* Remove old read_kernarg_metadata

* Clean up HCC, prints, names

* Use COMGR in CMAKE by default

* Move metadata lookup for keyword values into helper

* Remove C string usage for lookup_keyword_value

* Guard COMGR for non-NVCC path

* Add hip_hcc dependency on comgr package

* Add lifetime to metadata nodes

* Find COMGR config file for amd_comgr target

* Move set_active data earlier
2019-04-16 08:34:39 +05:30
Yaxun (Sam) Liu e200ece4da hip-clang: Add __align__
CUDA has __align__. Define eqivalent for hip-clang.
2019-04-10 14:17:18 -04:00
David Salinas 4d0dc45078 append the ELF flags for sram-ecc and xnack to the target triple per code object 2019-04-05 13:17:11 -04:00
Siu Chi Chan b5045af7e9 minor workaround for lazy binding 2019-04-02 17:28:06 -04:00
Wen-Heng (Jack) Chung 04915cea2f Make hipModuleGetGlobal be in HIP runtime so it can be discovered at runtime (#981)
* Make hipModuleGetGlobal be in HIP runtime so it can be discovered at runtime

In HIP PR #929, quite a few HIP public APIs were made as inline functions with
hidden visibility. It was necessary to support applications with shared
libraries with GPU kernels launched via hipLaunchKernelGGL(), after HIP runtime
is initialized.

In empirical tests, the implementation has been proved to be a bit too
excessive, especially for hipModuleGetGlobal(). The function is used by another
type of client applications which relies on the existence of this function
within HIP runtime so global symbols from HSA code objects loaded dynamically
at runtime can be retrieved programmtically.

This commit moves hipModuleGetGlobal() back to src/hip_module.cpp, and makes it
visible and not inline, to fulfill requirements for applications
aforementioned. It does not change the behavior of applications depending on
hipLaunchKernelGGL().

* Add HIP_INIT_API into the implementation of hipModuleGetGlobal

Address review comments.

* Fix failing HIP unit tests
2019-03-29 03:45:04 +00:00
Jeff Daily f5e4fff6cc improve program state commentary
Disambiguate calling many varibles "agent".
More detail in exception message.
Create and discard map placeholders; no need to call std::vector::clear() on map value.
2019-03-27 21:40:27 +00:00
Jeff Daily 2845b4c4b8 load program state once per agent 2019-03-27 18:19:10 +00:00
Maneesh Gupta 178e3ecdca Merge pull request #990 from mhbliao/hliao/master/sw
SWDEV-184380 Fix hcc compilation
2019-03-27 05:23:26 +00:00