Commit graph

175 commits

Author SHA1 Message Date
Tao Sang b58355b065 SWDEV-294596 - Make hipModuleGetGlobal match cuda
Make hipModuleGetGlobal match cuModuleGetGlobal behavior.
That is, if one of the first two parameters is nullptr, ignore it.

Change-Id: I3fe6dbc35a7b14aa9119df297b7885df83d28048
2021-07-23 23:06:56 -04:00
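The "ignore a null output parameter" behavior described in the commit above can be sketched in plain C++. This is a minimal illustration, not the hipamd implementation: `Status`, `GlobalSymbol`, `SymbolTable`, and `get_global` are hypothetical names, and the symbol table stands in for a loaded module.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical status codes for this sketch only.
enum class Status { Success, NotFound };

// Hypothetical record for a module-scope global: device address and size.
struct GlobalSymbol {
    std::uintptr_t address;
    std::size_t size;
};

// Hypothetical stand-in for the symbol table of a loaded module.
using SymbolTable = std::map<std::string, GlobalSymbol>;

// Looks up `name` and writes the address and size to the output pointers.
// Each output pointer is optional: a null `dptr` or `bytes` is skipped
// instead of being treated as an error.
Status get_global(const SymbolTable& table, const std::string& name,
                  std::uintptr_t* dptr, std::size_t* bytes) {
    const auto it = table.find(name);
    if (it == table.end()) return Status::NotFound;
    if (dptr != nullptr) *dptr = it->second.address;
    if (bytes != nullptr) *bytes = it->second.size;
    return Status::Success;
}
```

A caller that only needs the size can pass `nullptr` for `dptr`, and a caller that only needs the address can pass `nullptr` for `bytes`. What the real API does when both are null is not stated in the commit message, so the sketch does not model that case specially.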
agunashe d9d9e81acb SWDEV-293742 - Update copyrights end year for hipamd
Change-Id: I08f620f84563a9214b59f1b943ed091b67229eab
2021-07-09 12:08:39 -04:00
Rahul Garg 19c84bc604 ROCMOPS-1956 - Push restructured code to hipamd
hipamd will have AMD's ROCCLR based HIP backend implementation

Change-Id: Id7de9634519b4ce46fca71a1b61f3d5b1e3fc459
2021-06-07 21:42:44 +00:00
Tao Sang 1cba7ec965 Remove hip-hcc codes: Part one
Remove hip-hcc codes from hip code base
Simplify hip CMakeLists.txt to exclude hip-hcc
Simplify cmake cmd for hip-rocclr building
Some minor fixes

Change-Id: I1ae357ecfd638d6c25bca293c1724b026be21ecd
2020-12-09 15:49:47 -05:00
Todd tiantuo Li a243a69e98 SWDEV-240803 - add hipFuncSetSharedMemConfig
Change-Id: I160b04677b3e7b99b3981ae7ecc84a0e3811d5e8
2020-08-20 18:18:24 -04:00
Todd tiantuo Li fb43f21044 SWDEV-240803 - add hipFuncSetAttribute and hipFuncAttribute
Change-Id: I3f4d67b19d89fd348fa5b884af4a2542ee4aba60
2020-08-14 17:39:29 -04:00
Jatin 126573df4c Adding changes for hipExtLaunchKernel for rocCLR
Change-Id: Iba52bc3bde7c37f3fb375a55ba0947e87b3cdc9b
2020-06-02 14:16:41 -04:00
Maneesh Gupta f2e1118d7a Merge in the rocclr based hip runtime (#2032)
* Merge master-next changes in master (include vdi development in master branch)
2020-04-23 09:12:06 -07:00
Aryan Salmanpour 4d05b4dce7 [HIP] add support for NoPreSync/NoPostSync flags for Cooperative MultiDevice launch API (#1990) 2020-04-13 14:02:52 +05:30
Rahul Garg 69e09a0b1b Rename hipDrvOccupancy to hipModuleOccupancy and match CUDA syntax (#1943) 2020-04-07 14:02:52 +05:30
Rahul Garg f7751db2ee use hsa_executable_get_symbol_by_name in find_kernel_by_name (#1994) 2020-04-06 15:39:30 +05:30
Rahul Garg c09c4cd239 Bump version to 3.5 (#1993)
* Switch CI testing from rocm-3.1.x to rocm-3.3.x
* Update hcc workweek for cooperative view
* bump version to 3.5
2020-04-06 15:39:10 +05:30
Jatin Chaudhary eab81ca91b Removing header size from formula (#1988)
Fixed a bug in the elf file size computation.
2020-04-06 15:37:07 +05:30
Siu Chi Chan 6ab1e864b6 don't expose symbols from code_object_bundle (#1971)
Change-Id: I56479485aad42c3d517fe6d9055be1cd846eeb00
2020-03-27 14:09:07 +05:30
Sarbojit2019 4a68ab5a8c Fix few memory leaks in HIP (#1969) 2020-03-27 14:08:30 +05:30
Aryan Salmanpour 1a1cdee6ff [hip] fix a build error when building hip with latest hcc (#1977)
There is a build error when building HIP with the latest HCC from GitHub after PR #1935 was merged into the HIP master branch. That PR changed blockDimX to blockDim, and the two lines that missed this change are fixed in the current PR.
2020-03-26 17:10:42 +05:30
Joseph Greathouse 341ef7fdca Fix cooperative launch APIs to set hipGetLastError (#1935)
* Fix cooperative launch APIs to set hipGetLastError

Previously, the cooperative launch APIs did not properly log their
errors in the global hipGetLastError variable before returning back
to the user. As such, the APIs would leave hipSuccess in the
last error, which would break some use cases.

This fixes that problem by making a trampoline function that does
the HIP_INIT_API and ihipLogStatus.

* Add missing flag to the log of multi-GPU launch
2020-03-25 14:39:24 -07:00
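The trampoline idea in the commit above (a thin outer function that records the status before returning) can be sketched as follows. This is an illustration under stated assumptions, not the HIP source: `Status`, `g_last_error`, `with_status_logging`, `launch_impl`, `launch`, and `get_last_error` are hypothetical names, and the real code uses HIP_INIT_API and ihipLogStatus rather than this wrapper.

```cpp
#include <cassert>
#include <utility>

// Hypothetical status codes for this sketch only.
enum class Status { Success, InvalidValue };

// Hypothetical per-thread "last error" slot.
thread_local Status g_last_error = Status::Success;

// Trampoline: call the real implementation, record its status, return it.
template <typename Fn, typename... Args>
Status with_status_logging(Fn&& impl, Args&&... args) {
    const Status s = std::forward<Fn>(impl)(std::forward<Args>(args)...);
    g_last_error = s;  // without this step the slot would keep a stale Success
    return s;
}

// Hypothetical inner implementation that can fail.
Status launch_impl(int grid_size) {
    return grid_size > 0 ? Status::Success : Status::InvalidValue;
}

// Public entry point: every return path goes through the trampoline.
Status launch(int grid_size) {
    return with_status_logging(launch_impl, grid_size);
}

// Returns the recorded status and resets the slot (an assumption of this sketch).
Status get_last_error() {
    const Status s = g_last_error;
    g_last_error = Status::Success;
    return s;
}
```

Routing the call through one wrapper means an implementation with several early returns cannot forget to record its error.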
Aryan Salmanpour 66735bff13 [HIP] use markers to sync cooperative and normal queues (#1948) 2020-03-18 11:20:43 +05:30
jglaser ea28d64297 Implement accurate max block size in hipFuncGetAttributes() (#1676)
This PR ensures that the maxThreadsPerBlock returned by hipFuncGetAttributes is a multiple of the warp size and that the register usage of the maximum block does not exceed the number of available registers.

Fixes #1662
2020-03-18 11:20:06 +05:30
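The two constraints named in the commit above can be illustrated with a small calculation. This is a simplified sketch, not the code from the PR: `max_block_size` and its parameters are hypothetical, and it assumes registers are budgeted per thread with no allocation granularity.

```cpp
#include <cassert>
#include <cstddef>

// Largest block size that (a) does not exceed the device limit,
// (b) fits in the register budget, and (c) is a multiple of the warp size.
// All names and the per-thread register model are assumptions of this sketch.
std::size_t max_block_size(std::size_t device_max_threads,
                           std::size_t warp_size,
                           std::size_t regs_per_thread,
                           std::size_t regs_available) {
    assert(warp_size > 0);
    std::size_t limit = device_max_threads;
    if (regs_per_thread > 0) {  // a kernel using no registers is not register-limited
        const std::size_t by_regs = regs_available / regs_per_thread;
        if (by_regs < limit) limit = by_regs;
    }
    return (limit / warp_size) * warp_size;  // round down to a warp multiple
}
```

For example, with a 1024-thread device limit, a warp size of 64, 100 registers per thread, and 65536 registers available, the register budget allows 655 threads, which rounds down to 640.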
Joseph Greathouse 6ae1b1a321 Fix errors in occupancy calculation function (#1926)
Fix two errors in hipOccupancyMaxActiveBlocksPerMultiprocessor.
1) Fix a possible segfault if the user passed in a null pointer for
   the numBlocks value.
2) Handle the situation when the user is asking for a block size
   that is larger than what the target device can hold within a
   single block.
2020-03-17 14:00:38 +05:30
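The two guards listed in the commit above can be sketched as input checks in front of an occupancy calculation. This is a hypothetical illustration, not the HIP implementation: `max_active_blocks`, its parameters, and the thread-count-only occupancy model are assumptions. In particular, reporting zero blocks for an oversized block is an assumption of this sketch, since the commit message does not say which result is returned.

```cpp
#include <cassert>

// Hypothetical status codes for this sketch only.
enum class Status { Success, InvalidValue };

// Writes the number of blocks of `block_size` threads that fit on one
// compute unit, using only a thread-count limit (a deliberate simplification).
Status max_active_blocks(int* num_blocks, int block_size,
                         int max_threads_per_block, int max_threads_per_cu) {
    // Guard 1: never dereference a null output pointer.
    if (num_blocks == nullptr || block_size <= 0) return Status::InvalidValue;

    // Guard 2: a block larger than the device can hold can never be resident.
    if (block_size > max_threads_per_block) {
        *num_blocks = 0;  // assumption: report zero rather than an error
        return Status::Success;
    }

    *num_blocks = max_threads_per_cu / block_size;
    return Status::Success;
}
```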
Aryan Salmanpour 70ef268add [HIP] return an error if blockDim exceeds maxThreadsPerBlock 2020-03-10 15:26:53 -04:00
Aryan Salmanpour 7009901fe0 [HIP] fix formatting/code clean up and fix a bug 2020-03-09 16:03:59 -04:00
Aryan Salmanpour 97b24eba45 [HIP] Refactor cooperative APIs 2020-03-06 18:30:12 -05:00
Rahul Garg 1c794045e0 Add hipDrvOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags] (#1854)
Equivalent to cuOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags].
2020-02-28 16:46:55 +05:30
Sarbojit2019 a03628335c [hip] Fix for bug introduced in #1770 when blockSize is non-power of 2 (#1864)
Fixes SWDEV-222161
2020-02-13 14:22:46 +05:30
Maneesh Gupta d032637934 Revert "Match Occupancy APIs syntax with CUDA (#1625)" (#1857)
Reverting this for now till we figure out how to avoid the build
breakage.

This reverts commit 262ad13dd5.
2020-02-10 10:45:28 +05:30
Siu Chi Chan 14e235378f Fix C-style hipLaunchKernel (#1835)
* Fix bug in LaunchKernel test
Instead of passing the address of the gpu buffer, pass the address
of the pointer that holds the address of the gpu buffer

* Fix hipLaunchKernel's kernarg buffer construction.
The hipLaunchKernel implementation should rely on ihipModuleLaunchKernel
to construct the kernarg buffer correctly based on kernel metadata.

* Fix a bug in get_functions where the Kernel_descriptor wasn't constructed with the correct kernarg layout information.

* Fix a bug in kernarg layout parsing dealing with kernel without any arg

* teach ihipModuleLaunchKernel to handle kernel without any arg

* Add a more interesting test
2020-02-04 19:37:16 +05:30
Sarbojit2019 6e62ea5ee3 Added overflow check in kernel launch (#1770) 2020-02-04 09:02:16 +05:30
satyanveshd 262ad13dd5 Match Occupancy APIs syntax with CUDA (#1625)
* Match Occupancy APIs syntax with CUDA and fix tests using these APIs
2020-01-29 13:05:53 -08:00
Siu Chi Chan 26b50e1e1b Detect when an explicit printf buffer flush is required (#1766)
* Detect when an explicit printf buffer flush is required
in a device/stream synchronization function.

* hip_module.cpp: add missing hc_am.hpp header
2020-01-07 09:06:38 -08:00
Aryan Salmanpour 857052be1e [hip] refactoring cooperative kernel launch APIs (#1737)
This PR is a follow-up to PR #1698. It makes two more APIs (hipLaunchCooperativeKernel and hipLaunchCooperativeKernelMultiDevice) inline so that they work correctly with lazy binding.
2019-12-30 12:42:17 +05:30
Alex Voicu 150e690a3a Fix late-coming issues. (#1724)
Implementation for hipMemcpyWithStream.
2019-12-23 19:11:24 +05:30
Aryan Salmanpour 8eaea4d114 [hip] refactoring hipExtLaunchMultiKernelMultiDevice API (#1698)
[Background] It was found that if lazy linking is used for a library that calls the hipExtLaunchMultiKernelMultiDevice API, the API can get the wrong program_state object when looking up device kernels, leading to a "No device code available" error.

To fix this issue, the API was refactored to be inline, so that it obtains the correct program_state and passes it to an internal HIP API that requests the multi-device kernel launch.
2019-12-04 11:50:51 +05:30
Rahul Garg dfee3ae279 Rename hip/hip_hcc.h to hip/hip_ext.h (#1341)
* Rename hip/hip_hcc.h to hip/hip_ext.h

* Deprecate hip_hcc.h
2019-11-07 13:17:10 +05:30
Rahul Garg 27221bc823 Revert "Fix occupancy APIs (#1560)"
This reverts commit 6c5fbf9b4a.
2019-10-29 11:41:08 -07:00
satyanveshd 6c5fbf9b4a Fix occupancy APIs (#1560)
Addresses SWDEV-205006
2019-10-24 17:44:47 +05:30
searlmc1 15a699688e Improve performance of v2 arg handling (#1539)
* Improve performance of v2 arg handling

* Missing change to `std::string`
2019-10-24 17:44:05 +05:30
Aryan Salmanpour 93c688a0c9 [hip] add support for implicit kernel argument for multi-grid sync (#1456)
* [hip] add support for implicit kernel argument for multi-grid sync

* modified code for calculating the prev_sum

* change the impCoopArg type to size_t

* add memory clean up

* launch init_gws and main kernels into two separate loops
2019-10-24 17:43:30 +05:30
Nick Curtis d16963c9d5 Guard against division by zero for no VGPR usage (e.g., in an empty kernel) (#1528)
* guard against division by zero for no VGPR usage (e.g., in an empty kernel)

* fix bracket format

* clean up parenthesis
2019-10-16 10:49:56 +05:30
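The division-by-zero guard described in the commit above can be sketched as follows. This is a minimal illustration, not the code from the PR: `waves_limited_by_vgprs` and its parameters are hypothetical, and the model ignores register allocation granularity.

```cpp
#include <cassert>
#include <cstddef>

// Number of waves that can be resident given per-wave VGPR usage.
// A kernel that uses no VGPRs (for example, an empty kernel) is not limited
// by VGPRs at all, so the hardware maximum is returned without dividing.
std::size_t waves_limited_by_vgprs(std::size_t vgprs_per_wave,
                                   std::size_t vgprs_available,
                                   std::size_t max_waves) {
    if (vgprs_per_wave == 0) return max_waves;  // guard: avoid dividing by zero
    const std::size_t by_vgprs = vgprs_available / vgprs_per_wave;
    return by_vgprs < max_waves ? by_vgprs : max_waves;
}
```

With 256 VGPRs available and a hardware maximum of 10 waves, a kernel using 64 VGPRs per wave is limited to 4 waves, while a kernel using none gets the full 10.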
Siu Chi Chan d8e09c4b70 fix kernel descriptor bug with code object v3
Change-Id: I9306b2baf36d338e36c5ab1226f74373a61a5ae0
2019-10-03 10:56:35 -04:00
Jeff Daily 2a53299f07 hipModuleUnload should remove global variables from memtracker (#1464) 2019-09-30 10:41:20 +05:30
Aryan Salmanpour 6c7da60e28 [hip] add initial support for hipLaunchCooperativeKernelMultiDevice API (#1368)
* [hip] add initial support for hipLaunchCooperativeKernelMultiDevice API

* fix formatting
2019-09-16 08:31:17 +00:00
Sarbojit2019 1ae43cbeba [HIP] Reclaiming hipLaunchKernel API (#1353)
* [HIP] Reclaiming hipLaunchKernel API

* Reclaiming hipLaunchKernel : Incorporated review comments

* Incorporated review comments

* Removed hipLaunchKernel Macro from nvcc path
2019-08-29 01:02:41 +00:00
Aryan Salmanpour 32ce882d6e [hip] add initial implementation for hipLaunchCooperativeKernel API (#1339)
* [hip] add initial implementation for hipLaunchCooperativeKernel API

* [hip] use total number of work groups to initialize the GWS resource

* [hip] use only one argument for init_gws kernel

* [hip] use the device associated with the stream for checking the device properties
2019-08-23 09:19:35 +00:00
Rahul Garg 7f9de881cb Fix undefined identifier issue for hipExtModuleLaunchKernel 2019-08-14 16:46:32 -04:00
Rahul Garg 8b6317d041 Add support for hipFuncGetAttribute (#1279)
* Add support for hipFuncGetAttribute

* Support NVCC path

* Test using sample module_api_global

* Try fixing CI build failure due to hip_prof_gen scan

* Fix for CI build issue

* Resolve conflict

* Rebase and resolve conflicts with master

* Fix build error

* Fix NVCC path build error
2019-08-08 08:27:41 +00:00
Jeff Daily f337ae1edb consolidate thread local storage (#915)
* all thread local access now through single struct

* clean up old commented-out code, more use of GET_TLS()

* fewer calls to GET_TLS by passing tls as a function argument

* revert unnecessary change to printf

* fix failing tests due to TLS change

* fix merge conflicts in ihipOccupancyMaxActiveBlocksPerMultiprocessor
2019-08-05 09:51:02 +00:00
Rahul Garg 20e9aba94e Fix missing logstatus in hipFuncGetAttributes 2019-08-02 11:51:34 +05:30
wkwchau 7b9801fe9a Added support of hipOccupancyMaxActiveBlocksPerMultiprocessor & hipOc… (#1240)
* Added support of hipOccupancyMaxActiveBlocksPerMultiprocessor & hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags APIs

* Take SGPR usage into account when determining the max active blocks in hipOccupancyMaxActiveBlocksPerMultiprocessor()
2019-08-01 08:58:48 +00:00
Rahul Garg 8df47255c5 Add hip init in hipExtLaunchMultiKernelMultiDevice (#1263)
* Add hip init in hipExtLaunchMultiKernelMultiDevice

* Add more logstatus for multiple return paths

* Fix missing i in function name
2019-07-31 15:42:29 +00:00