Commit Graph

1027 Commits

Author SHA1 Message Date
Alex Voicu e474469213 Put 3-wide vector types on a ketogenic diet. (#1180)
* Put 3-wide vector types on a ketogenic diet.

* Remove needless include.

* Do not be narrow-minded.

* Do not be narrow-minded.

* Put the C people on a diet too.


[ROCm/hip commit: 67abac1365]
2019-06-25 06:32:09 +05:30
Aryan Salmanpour 45fa752888 [hip] implement the hipExtLaunchMultiKernelMultiDevice API (#1165)
* [hip] implement the hipExtLaunchMultiKernelMultiDevice API

* add a guard to check the HCC version for acquire_locked_hsa_queue() API which was introduced in HCC for ROCm 2.5

* modified code based on the requested changes

* changes to lock all streams before launching kernels for each device and unlock them after the dispatches

* check each stream to be valid before starting to lock all the streams


[ROCm/hip commit: 96dc74897d]
2019-06-20 05:59:05 +05:30
wkwchau 40bd111519 Implement the hipOccupancyMaxPotentialBlockSize function (#1162)
* Implement the hipOccupancyMaxPotentialBlockSize function

* Replaced hipGetDeviceProperties() call by ihipGetDeviceProperties() in ihipOccupancyMaxPotentialBlockSize()

* Add test for hipOccupancyMaxPotentialBlockSize in Module API

* Added extern declaration for ihipGetDeviceProperties() to be accessed inside ihipOccupancyMaxPotentialBlockSize()

* fixed hipOccupancyMaxPotentialBlockSize test build issue

* Fix hipOccupancyMaxPotentialBlockSize dtest

* Add BUILD_CMD in hipOccupancyMaxPotentialBlockSize dtest

* Revert "Add BUILD_CMD in hipOccupancyMaxPotentialBlockSize dtest"

This reverts commit 0480ff56f1441fc515d2c26ce33783e303423938.

* Disable hipOccupancyMaxPotentialBlockSize dtest on NVCC

* move extern declaration of ihipGetDeviceProperties to hip_module.cpp

* Update the limitation of 32 wavefronts per CU and 800/512 SGPRs for VI/pre-VI chips to calculate the occupancy


[ROCm/hip commit: d492f1fd6b]
2019-06-20 05:58:29 +05:30
Evgeny 5a004eb676 prof layer includes refactoring
[ROCm/hip commit: 19ca7a2a08]
2019-06-11 20:13:29 -05:00
Michael LIAO 96cc54ab79 [hip] Replace implicit conversions with explicit ones.
[ROCm/hip commit: 52a8f2fed4]
2019-06-05 10:00:35 -04:00
Maneesh Gupta 58caf3c615 Merge pull request #1140 from scchan/program_state_stage_2-rebase-20190524
migrate more program_state logic from header into shared library (phase II)

[ROCm/hip commit: 7013f87885]
2019-06-05 16:09:01 +05:30
Maneesh Gupta 95368b8b28 Merge pull request #1159 from mhbliao/hliao/master/explicit
[hip] Make vector type's scalar conversion explicit.

[ROCm/hip commit: 0982bb52f2]
2019-06-05 12:13:56 +05:30
Michael LIAO e5d347a679 [hip] Make vector type's scalar conversion explicit.
[ROCm/hip commit: 9bb4ecfcfe]
2019-06-04 15:15:26 -04:00
Michael LIAO de162fa300 [hip] Minor fix to silence compilation warnings.
- Add parentheses to silence repetitive compilation warnings across
  projects built against hip.


[ROCm/hip commit: 498fe92734]
2019-06-04 10:39:31 -04:00
Maneesh Gupta 1d4edf5379 Merge pull request #1154 from yxsamliu/tex
Add device_builtin_texture_type attribute to texture type for hip-clang

[ROCm/hip commit: 0564d8af57]
2019-06-04 13:21:29 +05:30
Maneesh Gupta 67145c20c2 Merge pull request #1152 from asalmanp/hip_as_b
Header change for new hip API hipExtLaunchMultiKernelMultiDevice

[ROCm/hip commit: 49e254f5e8]
2019-06-04 13:21:13 +05:30
Maneesh Gupta 0ba8db8f7e Merge pull request #1149 from zuhaib27/SWDEV-185448
Structured hipFloatComplex as typedef of float2, and hipDoubleComplex as typedef of double2.

[ROCm/hip commit: b0d90ac2ed]
2019-06-04 13:21:02 +05:30
Yaxun Sam Liu 474e9b8847 Add device_builtin_texture_type attribute to texture type for hip-clang
This is required to support texture type for hip-clang.


[ROCm/hip commit: 80bcf0785a]
2019-05-31 12:07:58 -04:00
Aryan Salmanpour 510af08a36 Header change for new hip API hipExtLaunchMultiKernelMultiDevice
[ROCm/hip commit: f34654f8aa]
2019-05-30 18:04:05 -04:00
Siu Chi Chan b2dff85550 fix compilation error when host compiler is clang (#1147)
* fix compilation error when host compiler is clang

* use a macro specifically for hcc && hip-clang


[ROCm/hip commit: 72e51f3ad0]
2019-05-29 12:34:48 +05:30
Zuhaib Khan 2446ed8f62 Structured hipFloatComplex as typedef of float2, and hipDoubleComplex as typedef of double2.
[ROCm/hip commit: cc0f3445bb]
2019-05-28 16:57:51 -04:00
Maneesh Gupta d5e847f2a3 Header changes for cooperative groups
Change-Id: I5f3acca94275d74adc97adcb168aed9f74951189


[ROCm/hip commit: 21a9e05867]
2019-05-28 16:58:55 +05:30
Siu Chi Chan e02fb5a835 remove code_object_bundle.hpp, clean up the old Agent_global
[ROCm/hip commit: 54f94ed02f]
2019-05-24 18:13:05 -04:00
Siu Chi Chan 6e2efe563a fix breakage due to compiling in C++17
[ROCm/hip commit: cec0926924]
2019-05-24 17:27:58 -04:00
Siu Chi Chan e92b422b80 replace std::vector for kernarg
[ROCm/hip commit: fc08f29735]
2019-05-24 17:27:43 -04:00
Siu Chi Chan 12d457cb4d move executable_cache into program_state.cpp
[ROCm/hip commit: 00824be34c]
2019-05-24 17:27:25 -04:00
Siu Chi Chan 44943f5cd9 remove executables() from program_state
[ROCm/hip commit: 80fec2b477]
2019-05-24 17:27:01 -04:00
Siu Chi Chan b9b076a958 moving agent_globals_impl into hip_module
[ROCm/hip commit: 4239cfcf02]
2019-05-24 16:43:38 -04:00
Maneesh Gupta 77010ad8e8 Merge pull request #1128 from aaronenyeshi/fix-smid-func
Fix bug in __smid not setting correct size

[ROCm/hip commit: 68ae4a0a3e]
2019-05-24 14:16:12 +05:30
Aaron Enye Shi b08cbff718 Fix bug in __smid not setting correct size
The SZ field should be decremented by 1, since the SIZE range is 1..32. Also add comments noting that results may vary.
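
The off-by-one described above can be illustrated with a minimal sketch of packing a hardware-register field descriptor. The helper name `encode_hwreg` and the exact bit layout (register id in bits 5:0, bit offset in bits 10:6, size minus one in bits 15:11) are assumptions for illustration and are not taken from the commit; the only point the commit itself states is that a SIZE of 1..32 has to be stored as SIZE - 1.

```cpp
#include <cstdint>

// Hypothetical helper (not from the HIP sources): pack a 16-bit hardware
// register field descriptor. The layout below is an assumption:
//   bits  5:0  register id
//   bits 10:6  bit offset of the field
//   bits 15:11 field size minus one
// A 5-bit field can only hold 0..31, so a size of 1..32 is stored as size - 1.
constexpr std::uint16_t encode_hwreg(unsigned id, unsigned offset, unsigned size) {
    return static_cast<std::uint16_t>(
        (id & 0x3Fu) |
        ((offset & 0x1Fu) << 6) |
        (((size - 1u) & 0x1Fu) << 11));
}
```

Storing `size` directly instead of `size - 1` would read one bit too many for every width and would wrap to zero for a full 32-bit field.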


[ROCm/hip commit: ff74babedf]
2019-05-22 19:20:09 +00:00
Evgeny Mankov c24d90afb3 [HIP] fix typo in #1127
[ROCm/hip commit: eede980a39]
2019-05-22 20:48:18 +03:00
Evgeny Mankov 9ebc775254 [HIP] fix nvcc path break in #1127
[ROCm/hip commit: 47bbb9c605]
2019-05-22 20:04:45 +03:00
Evgeny Mankov cd309b6638 [HIP][HIPIFY] Make hipMemcpyParam2D coherent with cuMemcpy2D
+ Makes hip_Memcpy2D struct compatible with CUDA_MEMCPY2D struct
+ Add hipMemcpyParam2D support in nvcc fallback path
+ Update hipify-clang, tests and docs accordingly


[ROCm/hip commit: 8f059b0ee9]
2019-05-22 18:31:39 +03:00
Alex Voicu d5a3acfd69 Add HIPRTC, glorious ersatz for NVRTC (#1097)
* Add ersatz for NVRTC.

* Fix extraneous paren and use correct namespace.

* Use lowerCamelCase (yuck, yuck) consistently.

* Link against FS when building hiprtc lib.

* Correctly mark Manipulators. Fix dual compile.

* Add unit tests. Extend HIT to accept linker options.

* Make sure the HIPRTC library is installed.

* Better logging. Try to auto-detect the target.

* Stop specifying the target explicitly.

* Add missing flavour of `hipModuleLaunchKernel`.

* Program was already destroyed.

* Don't use `--genco`. Fix mangled name trimming.

* Fix HIPRTC breakage due to upstream noise.

* [dtests] Replace RUN -> TEST in hiprtc tests

Change-Id: Ie499e92dfe4e5c94634b1c2b76cf52d241bcfea3

* [hit] Set HIP_PATH to HIP_ROOT_DIR for all tests

Change-Id: Ib0ad1f99bc71c03e363e055dd508a7a4a210680a


[ROCm/hip commit: ccfb764a59]
2019-05-16 18:28:54 +05:30
Wen-Heng (Jack) Chung 3264991e58 Revert "HACK for SWDEV-173477" (#1004)
* Revert "HACK for SWDEV-173477"

This reverts commit 3eb009913c.

[ROCm/hip commit: 9b9257f9b0]
2019-05-13 14:42:05 +05:30
Rahul Garg c4567ad01a Add fine grained host memory lock support (#1095)
* Add fine grained host memory lock support

* Fix default flag check


[ROCm/hip commit: aeeab1b23f]
2019-05-13 11:48:26 +05:30
Siu Chi Chan d0252dfa79 migrate program_state logic from header into shared library (phase I) (#1077)
* Revert "Revert "Use COMgr to read Kernel Args Metadata (#1006)""

This reverts commit 62e96cb4cf.

* Revert "Use COMgr to read Kernel Args Metadata (#1006)"

This reverts commit 882006555b.

* Revert "improve program state commentary"

This reverts commit fb2beb0c88.

* Revert "load program state once per agent"

This reverts commit 21f5e142f5.

* start moving function_names() into the hip shared lib

* start moving code_object_blobs to a new "state" object

* Consolidate various program state related static objects into a
single program_state object

* minor clean up

* move more stuff from functional_grid_launch into program_state

* debug make_kernarg

* moving lookup for kernargs size_align into program_state

* clean up old code for kernarg size and alignment

* update hip_module to use newer api in program_state

* Create public member functions for program_state

* move most program state functions into shared library

* Pass the data buffer size to load_executable
Otherwise, it can't figure out what the data size is
just from the char* (since the data is not really a string)

* turning free functions in program state into members of program_state_impl

* change the free function globals() into a member of program_state_impl

* replace the static mutex used for populating globals

* moving associate_code_object_symbols_with_host_allocation into
program_state_impl

* move load_code_object_and_freeze_executable into program_state_impl

* moving executables and functions_names into program_state_impl

* moving kernels() into program_state_impl

* moving functions() into program_state_impl

* move get_kernargs into program_state_impl

* moving kernel_descriptor into program_state_impl

* moving kernargs_size_align calculation into program_state_impl

* Changing the handle to program_state_impl to a pointer

* moving program_state_impl into a separate inline source file

* fixing/cleaning up some header file includes

* moving member function for kernargs_size_align into program_state.cpp

* moving Kernel_descriptor into program_state.inl

* add a new class to manage agent globals

* moving all agent globals processing functions into agent_globals_impl

* load program state once per agent

re-merging PR991 against other program state changes

* fix per-agent program state member initialization

* cache executables based on elf name, isa, and agent.

This avoids program state reloading executables after a shared library is dlopened.

re-merging PR1057 against other program state changes

* protect executables cache by a global mutex

* return ref to executables cache

* adapt PR#981 Make hipModuleGetGlobal be in HIP runtime


[ROCm/hip commit: f5eb91d53d]
2019-05-12 19:24:03 +05:30
Maneesh Gupta 1f94348f2e Merge pull request #1084 from mhbliao/hliao/master/api_ext
[hip] Add API `hipExtModuleLaunchKernel` in HIP runtime

[ROCm/hip commit: 5b607e14a6]
2019-05-09 18:26:31 +05:30
Maneesh Gupta 049c0d84bb Merge pull request #1082 from gargrahul/fix_hipmemcpy_symbol_nvcc
Fix symbol address issue on NVCC path

[ROCm/hip commit: 7264f6b64e]
2019-05-07 16:17:01 +05:30
Maneesh Gupta 36d37296d5 Merge pull request #1081 from mangupta/swdev-181624
Implement hipExtGetLinkTypeAndHopCount for ROCm devices

[ROCm/hip commit: d5abe65668]
2019-05-07 16:15:41 +05:30
Maneesh Gupta 0527ac9038 Merge pull request #1068 from mhbliao/hliao/master/dev_vec_func
[devfunc] Add necessary `__device__` and `__host__` attributes.

[ROCm/hip commit: d82d6b499e]
2019-05-07 16:01:48 +05:30
Michael LIAO 0c41f2298d [hip] Add API hipExtModuleLaunchKernel in HIP runtime
[ROCm/hip commit: 5150f1297a]
2019-05-06 21:20:28 -04:00
Rahul Garg d700f67b79 Fix symbol address issue on NVCC path
[ROCm/hip commit: e906bd69ce]
2019-05-07 03:59:43 +05:30
Maneesh Gupta f7eaff474b Implement hipExtGetLinkTypeAndHopCount for ROCm devices
Change-Id: Ie5bb4f640ac6d189c7fceeab22627a7494fd10bd


[ROCm/hip commit: 035ef04e1f]
2019-05-06 15:54:31 +05:30
Maneesh Gupta f49a4c6d89 Merge pull request #1062 from mhbliao/hliao/master/icmp
[hip] Re-implement ballot using AMDGCN builtins

[ROCm/hip commit: 117bdd8774]
2019-05-03 17:48:19 +05:30
Michael LIAO 2c2b897902 [devfunc] Add necessary __device__ and __host__ attributes.
- Minor clean up to keep consistent function declaration.


[ROCm/hip commit: 44692978fe]
2019-05-01 22:26:35 -04:00
Michael LIAO e637e72364 [Device Function] Fix implementation of __bitinsert_u64
- It's a common mistake to assume that 1 << shamt would be promoted to
  64-bit if shamt is a 64-bit integer. That's not the case. Replace
  that left shift with a 64-bit one to ensure it won't fall into undefined
  behavior.
- Fix the host-side implementation as well for device function testing.
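
The pitfall described above can be sketched in a few lines of C++. The helper names `low_mask` and `bit_insert_u64` are hypothetical and this is not the actual `__bitinsert_u64` implementation; it only shows why the literal `1` has to be widened before the shift. In C++ the result type of a shift is the promoted type of the left operand, so `1 << shamt` stays an `int` no matter how wide `shamt` is.

```cpp
#include <cstdint>

// Buggy form, shown only as a comment: the literal 1 is an int, so the shift
// is performed at int width and is undefined behavior once width reaches the
// bit width of int (typically 32).
//   std::uint64_t mask = (1 << width) - 1;

// Fixed form: widen the left operand first so the shift is done in 64 bits.
// Precondition (assumed for this sketch): width <= 63.
inline std::uint64_t low_mask(std::uint64_t width) {
    return (std::uint64_t{1} << width) - 1;
}

// Hypothetical illustration: replace `width` bits of `dst`, starting at bit
// `offset`, with the low `width` bits of `src`.
// Preconditions (assumed): offset <= 63, width <= 63, offset + width <= 64.
inline std::uint64_t bit_insert_u64(std::uint64_t dst, std::uint64_t src,
                                    std::uint64_t offset, std::uint64_t width) {
    const std::uint64_t mask = low_mask(width) << offset;
    return (dst & ~mask) | ((src << offset) & mask);
}
```

With the widened literal, inserting a field above bit 31 behaves as expected; with the `int` literal the same call would depend on undefined behavior.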


[ROCm/hip commit: 9bd2d5746d]
2019-04-30 08:59:13 -04:00
Michael LIAO cc4de2bc28 [devfunc] Re-implement ballot using AMDGCN builtins
- As the signature of `amdgcn.icmp` has changed for the next-gen chip, using
  clang builtins is a portable way to hide those details.


[ROCm/hip commit: a64637da2c]
2019-04-29 17:21:25 -04:00
Aaron Enye Shi 62e96cb4cf Revert "Use COMgr to read Kernel Args Metadata (#1006)"
This reverts commit 882006555b.


[ROCm/hip commit: a3d118eaa8]
2019-04-26 16:04:56 -04:00
Maneesh Gupta c36aa3a056 Merge pull request #1043 from mhbliao/hliao/master/fp16
[hip] Fix including of hip_fp16.h

[ROCm/hip commit: ffe9f86fe8]
2019-04-24 16:50:46 +05:30
Maneesh Gupta 7938012378 Merge pull request #1042 from mhbliao/hliao/master/ldg
[hip] Fix use of `__HIP_CLANG_ONLY__` in `hip_ldg.h`.

[ROCm/hip commit: de6c680767]
2019-04-24 16:50:37 +05:30
Maneesh Gupta 7b5ed45968 Merge pull request #1040 from eshcherb/roctracer-hip-frontend-190422
hip_prof_api.h include under __cplusplus

[ROCm/hip commit: e489f7579a]
2019-04-24 16:50:27 +05:30
Maneesh Gupta 5235da375e Merge pull request #1039 from gargrahul/fix_ptrgetattr_nvcc
Fix hipPointerGetAttributes for NVCC

[ROCm/hip commit: 2975221560]
2019-04-24 16:50:18 +05:30
Rahul Garg c01236f679 Add hipMallocManaged default functional support (#1036)
* Add hipMallocManaged default functional support

* Fix build error

* Add dtest


[ROCm/hip commit: 2bc2c46d4d]
2019-04-24 16:50:03 +05:30
Michael LIAO 6eec9d4ac8 [hip] Fix including of hip_fp16.h
- Separate the definitions of `__HCC_OR_HIP_CLANG__`, `__HCC_ONLY__`, and
  `__HIP_CLANG_ONLY__` into hip_common.h so that they can be included in
  hip_fp16.h, which may be included separately in an app.


[ROCm/hip commit: dc0d7bd5ce]
2019-04-23 09:16:00 -04:00