Commit graph

1031 commits

Author SHA1 Message Date
searlmc1 f3bbad0716 Fix typo ( T in Vec3_cmp = needs to be int ). (#1203)
Typo introduced here:
commit 87eac86298
Author: Alex Voicu <alexandru.voicu@amd.com>
Date:   Mon Jun 24 20:02:09 2019 -0500

    Put 3-wide vector types on a ketogenic diet. (#1180)
2019-07-04 00:44:26 +00:00
kpyzhov 43ffe25979 Change the type of hipTextureObject_t to pointer to a struct. (#1199)
This is necessary to allow Runtime to perform required texture buffer handling.
2019-07-04 00:44:06 +00:00
Jannik a401997b8e Fix call to cuda library (#1197)
* Remove flags parameter from hipOccupancyMaxPotentialBlockSize

This commit makes the hipOccupancyMaxPotentialBlockSize method consistent with hcc path and the CUDA API.
2019-07-04 00:43:22 +00:00
Yaxun (Sam) Liu 3169f3a881 Add __hip_pinned_shadow__ for hip-clang and fix texture reference tests (#1200) 2019-07-03 08:51:59 +00:00
Alex Voicu 87eac86298 Put 3-wide vector types on a ketogenic diet. (#1180)
* Put 3-wide vector types on a ketogenic diet.

* Remove needless include.

* Do not be narrow-minded.

* Do not be narrow-minded.

* Put the C people on a diet too.
2019-06-25 06:32:09 +05:30
Aryan Salmanpour d6ad690cb6 [hip] implement the hipExtLaunchMultiKernelMultiDevice API (#1165)
* [hip] implement the hipExtLaunchMultiKernelMultiDevice API

* add a guard to check the HCC version for acquire_locked_hsa_queue() API which was introduced in HCC for ROCm 2.5

* modified code based on the requested changes

* changes to lock all streams before launching kernels for each device and unlock them after the dispatches

* check each stream to be valid before starting to lock all the streams
2019-06-20 05:59:05 +05:30
wkwchau 28c34ead70 Implement the hipOccupancyMaxPotentialBlockSize function (#1162)
* Implement the hipOccupancyMaxPotentialBlockSize function

* Replaced hipGetDeviceProperties() call by ihipGetDeviceProperties() in ihipOccupancyMaxPotentialBlockSize()

* Add test for hipOccupancyMaxPotentialBlockSize in Module API

* Added extern declaration for ihipGetDeviceProperties() to be accessed inside ihipOccupancyMaxPotentialBlockSize()

* fixed hipOccupancyMaxPotentialBlockSize test build issue

* Fix hipOccupancyMaxPotentialBlockSize dtest

* Add BUILD_CMD in hipOccupancyMaxPotentialBlockSize dtest

* Revert "Add BUILD_CMD in hipOccupancyMaxPotentialBlockSize dtest"

This reverts commit 0480ff56f1441fc515d2c26ce33783e303423938.

* Disable hipOccupancyMaxPotentialBlockSize dtest on NVCC

* move extern declaration of ihipGetDeviceProperties to hip_module.cpp

* Update the limitation of 32 wavefronts per CU and 800/512 SGPRs for VI/pre-VI chips to calculate the occupancy
2019-06-20 05:58:29 +05:30
Evgeny c6600ba26b prof layer includes refactoring 2019-06-11 20:13:29 -05:00
Michael LIAO 4d5cf71c48 [hip] Replace implicit conversions with explicit ones. 2019-06-05 10:00:35 -04:00
Maneesh Gupta 1d5d923d36 Merge pull request #1140 from scchan/program_state_stage_2-rebase-20190524
migrate more program_state logic from header into shared library (phase II)
2019-06-05 16:09:01 +05:30
Maneesh Gupta b2c6817df0 Merge pull request #1159 from mhbliao/hliao/master/explicit
[hip] Make vector type's scalar conversion explicit.
2019-06-05 12:13:56 +05:30
Michael LIAO 62d70cb2c8 [hip] Make vector type's scalar conversion explicit. 2019-06-04 15:15:26 -04:00
Michael LIAO 2655003fde [hip] Minor fix to silence compilation warnings.
- Add parentheses to silence repetitive compilation warnings across
  projects built against hip.
2019-06-04 10:39:31 -04:00
Maneesh Gupta 8a859f8020 Merge pull request #1154 from yxsamliu/tex
Add device_builtin_texture_type attribute to texture type for hip-clang
2019-06-04 13:21:29 +05:30
Maneesh Gupta ef87f7eaef Merge pull request #1152 from asalmanp/hip_as_b
Header change for new hip API hipExtLaunchMultiKernelMultiDevice
2019-06-04 13:21:13 +05:30
Maneesh Gupta 98aa6cf895 Merge pull request #1149 from zuhaib27/SWDEV-185448
Structured hipFloatComplex as typedef of float2, and hipDoubleComplex as typedef of double2.
2019-06-04 13:21:02 +05:30
Yaxun Sam Liu 7c20081f8d Add device_builtin_texture_type attribute to texture type for hip-clang
This is required to support texture type for hip-clang.
2019-05-31 12:07:58 -04:00
Aryan Salmanpour d8e94fd5b5 Header change for new hip API hipExtLaunchMultiKernelMultiDevice 2019-05-30 18:04:05 -04:00
Siu Chi Chan b2ffd6afc2 fix compilation error when host compiler is clang (#1147)
* fix compilation error when host compiler is clang

* use a macro specifically for hcc && hip-clang
2019-05-29 12:34:48 +05:30
Zuhaib Khan 6aa704e7b9 Structured hipFloatComplex as typedef of float2, and hipDoubleComplex as typedef of double2. 2019-05-28 16:57:51 -04:00
Maneesh Gupta 4af81134ba Header changes for cooperative groups
Change-Id: I5f3acca94275d74adc97adcb168aed9f74951189
2019-05-28 16:58:55 +05:30
Siu Chi Chan d40a4627c3 remove code_object_bundle.hpp, clean up the old Agent_global 2019-05-24 18:13:05 -04:00
Siu Chi Chan 2bc9455928 fix breakage due to compiling in C++17 2019-05-24 17:27:58 -04:00
Siu Chi Chan 1fb9ab2d44 replace std::vector for kernarg 2019-05-24 17:27:43 -04:00
Siu Chi Chan 1a2d332e76 move executable_cache into program_state.cpp 2019-05-24 17:27:25 -04:00
Siu Chi Chan e2c0122892 remove executables() from program_state 2019-05-24 17:27:01 -04:00
Siu Chi Chan 6852be819f moving agent_globals_impl into hip_module 2019-05-24 16:43:38 -04:00
Maneesh Gupta f03a8cc1b0 Merge pull request #1128 from aaronenyeshi/fix-smid-func
Fix bug in __smid not setting correct size
2019-05-24 14:16:12 +05:30
Aaron Enye Shi 2b11a8bf0c Fix bug in __smid not setting correct size
The SZ field should be reduced by 1 since the SIZE range is 1..32. Also add comments that results may vary.
2019-05-22 19:20:09 +00:00
Evgeny Mankov 49b9df7a9e [HIP] fix typo in #1127 2019-05-22 20:48:18 +03:00
Evgeny Mankov 6806ab6745 [HIP] fix nvcc path break in #1127 2019-05-22 20:04:45 +03:00
Evgeny Mankov 9cb3e9aa5e [HIP][HIPIFY] Make hipMemcpyParam2D coherent with cuMemcpy2D
+ Makes hip_Memcpy2D struct compatible with CUDA_MEMCPY2D struct
+ Add hipMemcpyParam2D support in nvcc fallback path
+ Update hipify-clang, tests and docs accordingly
2019-05-22 18:31:39 +03:00
Alex Voicu a538eb705a Add HIPRTC, glorious ersatz for NVRTC (#1097)
* Add ersatz for NVRTC.

* Fix extraneous paren and use correct namespace.

* Use lowerCamelCase (yuck, yuck) consistently.

* Link against FS when building hiprtc lib.

* Correctly mark Manipulators. Fix dual compile.

* Add unit tests. Extend HIT to accept linker options.

* Make sure the HIPRTC library is installed.

* Better logging. Try to auto-detect the target.

* Stop specifying the target explicitly.

* Add missing flavour of `hipModuleLaunchKernel`.

* Program was already destroyed.

* Don't use `--genco`. Fix mangled name trimming.

* Fix HIPRTC breakage due to upstream noise.

* [dtests] Replace RUN -> TEST in hiprtc tests

Change-Id: Ie499e92dfe4e5c94634b1c2b76cf52d241bcfea3

* [hit] Set HIP_PATH to HIP_ROOT_DIR for all tests

Change-Id: Ib0ad1f99bc71c03e363e055dd508a7a4a210680a
2019-05-16 18:28:54 +05:30
Wen-Heng (Jack) Chung a4db991cbf Revert "HACK for SWDEV-173477" (#1004)
* Revert "HACK for SWDEV-173477"

This reverts commit cf7ad0f184.
2019-05-13 14:42:05 +05:30
Rahul Garg e1f3dc0c80 Add fine grained host memory lock support (#1095)
* Add fine grained host memory lock support

* Fix default flag check
2019-05-13 11:48:26 +05:30
Siu Chi Chan 05a1b696da migrate program_state logic from header into shared library (phase I) (#1077)
* Revert "Revert "Use COMgr to read Kernel Args Metadata (#1006)""

This reverts commit 235c6877c8.

* Revert "Use COMgr to read Kernel Args Metadata (#1006)"

This reverts commit 2c80975e9c.

* Revert "improve program state commentary"

This reverts commit f5e4fff6cc.

* Revert "load program state once per agent"

This reverts commit 2845b4c4b8.

* start moving function_names() into the hip shared lib

* start moving code_object_blobs to a new "state" object

* Consolidate various program state related static objects into a
single program_state object

* minor clean up

* move more stuffs from functional_grid_launch into program_state

* debug make_kernarg

* moving lookup for kernargs size_align into program_state

* clean up old code for kernarg size and alignment

* update hip_module to use newer api in program_state

* Create public member functions for program_state

* move most program state functions into shared library

* Pass the data buffer size to load_executable
Otherwise, it can't figure what the data size is
just from the char* (since the data is not really a string)

* turning free functions in program state into members of program_state_impl

* change the free function globals() into a member of program_state_impl

* replace the static mutex used for populating globals

* moving associate_code_object_symbols_with_host_allocation into
program_state_impl

* move load_code_object_and_freeze_executable into program_state_impl

* moving executables and functions_names into program_state_impl

* moving kernels() into program_state_impl

* moving functions() into program_state_impl

* move get_kernargs into program_state_impl

* moving kernel_descriptor into program_state_impl

* moving kernargs_size_align calculation into program_state_impl

* Changing the handle to program_state_impl to a pointer

* moving program_state_impl into a separate inline source file

* fixing/cleaning up some header file includes

* moving member function for kernargs_size_align into program_state.cpp

* moving Kernel_descriptor into program_state.inl

* add a new class to manage agent globals

* moving all agent globals processing functions into agent_globals_impl

* load program state once per agent

re-merging PR991 against other program state changes

* fix per-agent program state member initialization

* cache executables based on elf name, isa, and agent.

This avoids program state reloading executables after a shared library is dlopened.

re-merging PR1057 against other program state changes

* protect executables cache by a global mutex

* return ref to executables cache

* adapt PR#981 Make hipModuleGetGlobal be in HIP runtime
2019-05-12 19:24:03 +05:30
Maneesh Gupta e78a09c041 Merge pull request #1084 from mhbliao/hliao/master/api_ext
[hip] Add API `hipExtModuleLaunchKernel` in HIP runtime
2019-05-09 18:26:31 +05:30
Maneesh Gupta c6cf2a9e26 Merge pull request #1082 from gargrahul/fix_hipmemcpy_symbol_nvcc
Fix symbol address issue on NVCC path
2019-05-07 16:17:01 +05:30
Maneesh Gupta c6c5e4cee8 Merge pull request #1081 from mangupta/swdev-181624
Implement hipExtGetLinkTypeAndHopCount for ROCm devices
2019-05-07 16:15:41 +05:30
Maneesh Gupta 11972049c6 Merge pull request #1068 from mhbliao/hliao/master/dev_vec_func
[devfunc] Add necessary `__device__` and `__host__` attributes.
2019-05-07 16:01:48 +05:30
Michael LIAO de768c22ae [hip] Add API hipExtModuleLaunchKernel in HIP runtime 2019-05-06 21:20:28 -04:00
Rahul Garg 6cbc70d238 Fix symbol address issue on NVCC path 2019-05-07 03:59:43 +05:30
Maneesh Gupta 2f43f110d9 Implement hipExtGetLinkTypeAndHopCount for ROCm devices
Change-Id: Ie5bb4f640ac6d189c7fceeab22627a7494fd10bd
2019-05-06 15:54:31 +05:30
Maneesh Gupta 2eafa5dcf9 Merge pull request #1062 from mhbliao/hliao/master/icmp
[hip] Re-implement ballot using AMDGCN builtins
2019-05-03 17:48:19 +05:30
Michael LIAO a9f90713f3 [devfunc] Add necessary __device__ and __host__ attributes.
- Minor clean up to keep consistent function declaration.
2019-05-01 22:26:35 -04:00
Michael LIAO 2380eb8ecc [Device Function] Fix implementation of __bitinsert_u64
- It's a common mistake to assume that 1 << shamt is promoted to
  64-bit when shamt is a 64-bit integer. That's not the case. Replace
  that left shift with a 64-bit one to ensure it won't fall into undefined
  behavior.
- Fix the host-side implementation as well for device function testing.
2019-04-30 08:59:13 -04:00
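The shift pitfall described in the `__bitinsert_u64` commit above can be illustrated with a minimal sketch. In C and C++ the type of a shift result is the promoted type of the left operand, so `1 << shamt` is computed as a 32-bit `int` on typical targets no matter how wide `shamt` is, and shifting by 32 or more is undefined behavior. The helper names `low_mask_u64` and `bitinsert_u64_sketch` and their signatures are hypothetical and used only for illustration; this is not the HIP source.

```cpp
#include <cstdint>

// Hypothetical helper for illustration; not the HIP implementation.
// Builds a mask with the low `width` bits set, for width in [0, 63].
inline std::uint64_t low_mask_u64(std::uint64_t width) {
    // The literal 1 is forced to 64 bits, so the shift is performed on a
    // 64-bit value. Writing `1 << width` would shift a 32-bit int instead,
    // which is undefined once width reaches 32.
    return (std::uint64_t{1} << (width & 63)) - 1;
}

// Hypothetical sketch of a 64-bit bit insert: copies the low `width` bits
// of `src` into `dst`, starting at bit `offset`.
inline std::uint64_t bitinsert_u64_sketch(std::uint64_t dst, std::uint64_t src,
                                          std::uint64_t offset, std::uint64_t width) {
    offset &= 63;
    const std::uint64_t mask = low_mask_u64(width);
    // Clear the destination field, then OR in the masked source bits.
    return (dst & ~(mask << offset)) | ((src & mask) << offset);
}
```

The key design choice is making the left operand of the shift 64 bits wide (`std::uint64_t{1}` or `1ull`) rather than relying on the width of the shift amount.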
Michael LIAO a7a4d80f54 [devfunc] Re-implement ballot using AMDGCN builtins
- As the signature of `amdgcn.icmp` is changed for the next-gen chip, using
  clang builtins is a portable way to hide those details.
2019-04-29 17:21:25 -04:00
Aaron Enye Shi 235c6877c8 Revert "Use COMgr to read Kernel Args Metadata (#1006)"
This reverts commit 2c80975e9c.
2019-04-26 16:04:56 -04:00
Maneesh Gupta 7f81c72f1c Merge pull request #1043 from mhbliao/hliao/master/fp16
[hip] Fix including of hip_fp16.h
2019-04-24 16:50:46 +05:30
Maneesh Gupta 63ab2ea945 Merge pull request #1042 from mhbliao/hliao/master/ldg
[hip] Fix use of `__HIP_CLANG_ONLY__` in `hip_ldg.h`.
2019-04-24 16:50:37 +05:30