Commit graph

168 Commits

Author SHA1 Message Date
Maneesh Gupta 8e137a6ec1 Merge in the rocclr based hip runtime (#2032)
* Merge master-next changes in master (include vdi development in master branch)



[ROCm/hip commit: a0b5dfd625]
2020-04-23 09:12:06 -07:00
Aryan Salmanpour 462b9245ea [HIP] add support for NoPreSync/NoPostSync flags for Cooperative MultiDevice launch API (#1990)
[ROCm/hip commit: cf8589b8c8]
2020-04-13 14:02:52 +05:30
Rahul Garg 003d776e72 Rename hipDrvOccupancy to hipModuleOccupancy and match CUDA syntax (#1943)
[ROCm/hip commit: ba8a556ea9]
2020-04-07 14:02:52 +05:30
Rahul Garg dc71d4b121 use hsa_executable_get_symbol_by_name in find_kernel_by_name (#1994)
[ROCm/hip commit: a12cc8b031]
2020-04-06 15:39:30 +05:30
Rahul Garg 6fe5c52518 Bump version to 3.5 (#1993)
* Switch CI testing from rocm-3.1.x to rocm-3.3.x
* Update hcc workweek for cooperative view
* bump version to 3.5

[ROCm/hip commit: 59afcb1091]
2020-04-06 15:39:10 +05:30
Jatin Chaudhary 57bab41a1c Removing header size from formula (#1988)
Fixed a bug in the elf file size computation.

[ROCm/hip commit: 6358e40a76]
2020-04-06 15:37:07 +05:30
Siu Chi Chan e58a0d06f7 don't expose symbols from code_object_bundle (#1971)
Change-Id: I56479485aad42c3d517fe6d9055be1cd846eeb00

[ROCm/hip commit: 43abf84f54]
2020-03-27 14:09:07 +05:30
Sarbojit2019 1909e436cf Fix few memory leaks in HIP (#1969)
[ROCm/hip commit: f1b028b93e]
2020-03-27 14:08:30 +05:30
Aryan Salmanpour b5f7402c62 [hip] fix a build error when building hip with latest hcc (#1977)
There is a build error when building HIP with the latest HCC from GitHub after PR #1935 was merged into the HIP master branch. That PR changed blockDimX to blockDim, and two lines that missed this change are fixed in the current PR.

[ROCm/hip commit: c8ca2355ae]
2020-03-26 17:10:42 +05:30
Joseph Greathouse cb69c6037c Fix cooperative launch APIs to set hipGetLastError (#1935)
* Fix cooperative launch APIs to set hipGetLastError

Previously, the cooperative launch APIs did not properly log their
errors in the global hipGetLastError variable before returning back
to the user. As such, the APIs would leave hipSuccess in the
last error, which would break some use cases.

This fixes that problem by making a trampoline function that does
the HIP_INIT_API and ihipLogStatus.

* Add missing flag to the log of multi-GPU launch

[ROCm/hip commit: f61b79d9a3]
2020-03-25 14:39:24 -07:00
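
The trampoline approach described in commit cb69c6037c above can be illustrated with a minimal, self-contained sketch. This is not the HIP source: `Status`, `g_last_error`, `launch_impl`, `launch_api`, and `get_last_error` are hypothetical stand-ins for `hipError_t`, the state behind `hipGetLastError`, the internal launch routine, and the public API. The reset-on-read behavior of `get_last_error` is an assumption modeled on the CUDA-style convention.

```cpp
#include <cassert>

// Hypothetical status type standing in for hipError_t.
enum class Status { Success, InvalidValue };

// Per-thread "last error" slot (assumed model of what hipGetLastError reads).
thread_local Status g_last_error = Status::Success;

// Internal implementation: it may return early on several error paths and
// does not touch the last-error slot itself.
Status launch_impl(int block_dim, int max_threads_per_block) {
    if (block_dim <= 0) return Status::InvalidValue;
    if (block_dim > max_threads_per_block) return Status::InvalidValue;
    return Status::Success;
}

// Trampoline: the single public entry point. Whatever path launch_impl took,
// the resulting status is recorded exactly once before returning to the user.
Status launch_api(int block_dim, int max_threads_per_block) {
    Status status = launch_impl(block_dim, max_threads_per_block);
    g_last_error = status;
    return status;
}

// Returns the recorded status and resets the slot (assumed semantics).
Status get_last_error() {
    Status status = g_last_error;
    g_last_error = Status::Success;
    return status;
}
```

Routing every return path through one wrapper is what prevents a failed launch from leaving a success value in the last-error slot.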
Aryan Salmanpour 799e2380a7 [HIP] use markers to sync cooperative and normal queues (#1948)
[ROCm/hip commit: 4acb0ea038]
2020-03-18 11:20:43 +05:30
jglaser d783cc6650 Implement accurate max block size in hipFuncGetAttributes() (#1676)
This PR ensures that the maxThreadsPerBlock returned by hipFuncGetAttributes is a multiple of the warp size and that the register usage of the maximum block does not exceed the number of available registers.

Fixes #1662

[ROCm/hip commit: b5e683a35d]
2020-03-18 11:20:06 +05:30
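
The two constraints named in commit d783cc6650 above (a warp-size multiple, and register usage within the available registers) can be sketched as simple integer arithmetic. This is a simplified model under stated assumptions, not the HIP implementation: the function name and all four parameters are hypothetical, and real hardware allocates registers with a granularity this sketch ignores.

```cpp
#include <algorithm>
#include <cassert>

// Largest block size that (a) does not exceed the device limit, (b) keeps the
// block's total register demand within the register file, and (c) is a
// multiple of the warp size. All names and parameters are illustrative.
int max_threads_per_block(int device_max_threads, int warp_size,
                          int regs_per_thread, int regs_available) {
    assert(warp_size > 0 && device_max_threads >= 0);
    int limit = device_max_threads;
    if (regs_per_thread > 0) {
        // Each thread needs regs_per_thread registers, so at most
        // regs_available / regs_per_thread threads fit.
        limit = std::min(limit, regs_available / regs_per_thread);
    }
    return (limit / warp_size) * warp_size;  // round down to a warp multiple
}
```

For example, with a device limit of 1000 threads, a warp size of 64, 100 registers per thread, and 65536 registers, the register bound is 655 threads, which rounds down to 640.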
Joseph Greathouse 753763e163 Fix errors in occupancy calculation function (#1926)
Fix two errors in hipOccupancyMaxActiveBlocksPerMultiprocessor.
1) Fix a possible segfault if the user passed in a null pointer for
   the numBlocks value.
2) Handle the situation when the user is asking for a block size
   that is larger than what the target device can hold within a
   single block.

[ROCm/hip commit: bf04d7380a]
2020-03-17 14:00:38 +05:30
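
The two guards listed in commit 753763e163 above can be sketched as follows. The names, parameters, and the one-line occupancy formula are hypothetical. Reporting zero active blocks for an oversized block is an assumption of this sketch: the commit message only says the case is handled, not how.

```cpp
#include <cassert>

// Hypothetical status type standing in for hipError_t.
enum class Status { Success, InvalidValue };

// Simplified occupancy query illustrating both guards.
Status max_active_blocks(int* num_blocks, int block_size,
                         int max_threads_per_block, int max_threads_per_cu) {
    // Guard 1: never dereference a null output pointer.
    if (num_blocks == nullptr) return Status::InvalidValue;
    if (block_size <= 0) return Status::InvalidValue;

    // Guard 2: a block larger than the device can hold cannot be resident,
    // so this sketch reports zero active blocks (assumed behavior).
    if (block_size > max_threads_per_block) {
        *num_blocks = 0;
        return Status::Success;
    }

    // Placeholder occupancy model: thread count is the only limit.
    *num_blocks = max_threads_per_cu / block_size;
    return Status::Success;
}
```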
Aryan Salmanpour 7327f6a756 [HIP] return an error if blockDim exceeds maxThreadsPerBlock
[ROCm/hip commit: b663fccf0b]
2020-03-10 15:26:53 -04:00
Aryan Salmanpour c39d9f8f7b [HIP] fix formatting/code clean up and fix a bug
[ROCm/hip commit: 5494f5b247]
2020-03-09 16:03:59 -04:00
Aryan Salmanpour c25dd0ca3d [HIP] Refactor cooperative APIs
[ROCm/hip commit: 4844fbdf0a]
2020-03-06 18:30:12 -05:00
Rahul Garg 5229ffff99 Add hipDrvOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags] (#1854)
Equivalent to cuOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags].

[ROCm/hip commit: edc97f3073]
2020-02-28 16:46:55 +05:30
Sarbojit2019 624cba30a6 [hip] Fix for bug introduced in #1770 when blockSize is non-power of 2 (#1864)
Fixes SWDEV-222161

[ROCm/hip commit: 1109cbff83]
2020-02-13 14:22:46 +05:30
Maneesh Gupta 7753b3e827 Revert "Match Occupancy APIs syntax with CUDA (#1625)" (#1857)
Reverting this for now till we figure out how to avoid the build
breakage.

This reverts commit e38db9fb6f.

[ROCm/hip commit: f8e1c01900]
2020-02-10 10:45:28 +05:30
Siu Chi Chan ec45ec16e3 Fix C-style hipLaunchKernel (#1835)
* Fix bug in LaunchKernel test
Instead of passing the address of the gpu buffer, pass the address
of the pointer that holds the address of the gpu buffer

* Fix hipLaunchKernel's kernarg buffer construction.
The hipLaunchKernel implementation should rely on ihipModuleLaunchKernel
to construct the kernarg buffer correctly based on kernel metadata.

* Fix a bug in get_functions where the Kernel_descriptor wasn't constructed with the correct kernarg layout information.

* Fix a bug in kernarg layout parsing dealing with kernel without any arg

* teach ihipModuleLaunchKernel to handle kernel without any arg

* Add a more interesting test

[ROCm/hip commit: bff8e15e13]
2020-02-04 19:37:16 +05:30
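
The test fix in the first bullet of commit ec45ec16e3 above follows from the C-style launch convention, where the argument array holds one pointer per kernel argument and each entry points at that argument's value. The sketch below mimics this on the host only: `fill_kernel_mock` is a hypothetical stand-in for a launched kernel, and an ordinary host array plays the role of the GPU buffer.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical host-side mock of a kernel taking (int* out, int n).
// As with a C-style launch API, args[i] points AT the i-th argument value,
// so the pointer argument is read from the location args[0] refers to.
void fill_kernel_mock(void** args) {
    int* out = nullptr;
    int n = 0;
    std::memcpy(&out, args[0], sizeof(out));  // read the pointer value itself
    std::memcpy(&n, args[1], sizeof(n));
    for (int i = 0; i < n; ++i) out[i] = i;
}
```

A caller therefore passes the address of the pointer variable, not the buffer address:

```cpp
int storage[4] = {-1, -1, -1, -1};
int* buffer = storage;            // stands in for a device pointer
int n = 4;
void* args[] = {&buffer, &n};     // correct: &buffer, not buffer
fill_kernel_mock(args);
```

Passing `buffer` directly would make the mock interpret the buffer's contents as a pointer, which is the kind of mistake the test fix describes.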
Sarbojit2019 91d9cfd64d Added overflow check in kernel launch (#1770)
[ROCm/hip commit: 13316f724f]
2020-02-04 09:02:16 +05:30
satyanveshd e38db9fb6f Match Occupancy APIs syntax with CUDA (#1625)
* Match Occupancy APIs syntax with CUDA and fix tests using these APIs


[ROCm/hip commit: fa98798b63]
2020-01-29 13:05:53 -08:00
Siu Chi Chan fcf07e0b04 Detect when an explicit printf buffer flush is required (#1766)
* Detect when an explicit printf buffer flush is required
in a device/stream synchronization function.

* hip_module.cpp: add missing hc_am.hpp header


[ROCm/hip commit: f4555c835a]
2020-01-07 09:06:38 -08:00
Aryan Salmanpour ffea90f865 [hip] refactoring cooperative kernel launch APIs (#1737)
This PR is a follow-up to PR #1698. It makes two more APIs (hipLaunchCooperativeKernel/hipLaunchCooperativeKernelMultiDevice) inline so that they work correctly with lazy binding.

[ROCm/hip commit: 6968aeb841]
2019-12-30 12:42:17 +05:30
Alex Voicu 1f5ecc0f6a Fix late-coming issues. (#1724)
Implementation for hipMemcpyWithStream.


[ROCm/hip commit: 75a11330aa]
2019-12-23 19:11:24 +05:30
Aryan Salmanpour abe7531676 [hip] refactoring hipExtLaunchMultiKernelMultiDevice API (#1698)
[Background] It was found that if lazy linking is used for a library that calls the hipExtLaunchMultiKernelMultiDevice API, the API can get the wrong program_state object when looking up device kernels, leading to a "No device code available" error.

To fix this issue, the API was refactored to be inline so that it obtains the correct program_state and passes it to an internal HIP API that requests the multi-device kernel launch.

[ROCm/hip commit: 68cc787781]
2019-12-04 11:50:51 +05:30
Rahul Garg 6968362d99 Rename hip/hip_hcc.h to hip/hip_ext.h (#1341)
* Rename hip/hip_hcc.h to hip/hip_ext.h

* Deprecate hip_hcc.h


[ROCm/hip commit: 579a4f36fa]
2019-11-07 13:17:10 +05:30
Rahul Garg 70449cfa92 Revert "Fix occupany APIs (#1560)"
This reverts commit 4f23f9cb18.


[ROCm/hip commit: e4a1e44162]
2019-10-29 11:41:08 -07:00
satyanveshd 4f23f9cb18 Fix occupany APIs (#1560)
Addresses SWDEV-205006 

[ROCm/hip commit: af351d7e1b]
2019-10-24 17:44:47 +05:30
searlmc1 4d668d5a52 Improve performance of v2 arg handling (#1539)
* Improve performance of v2 arg handling

* Missing change to `std::string`


[ROCm/hip commit: c4a51f3679]
2019-10-24 17:44:05 +05:30
Aryan Salmanpour 9ab561dd66 [hip] add support for implicit kernel argument for multi-grid sync (#1456)
* [hip] add support for implicit kernel argument for multi-grid sync

* modified code for calculating the prev_sum

* change the impCoopArg type to size_t

* add memory clean up

* launch init_gws and main kernels into two separate loops


[ROCm/hip commit: 359dc79101]
2019-10-24 17:43:30 +05:30
Nick Curtis d2e9718d23 Guard against division by zero for no VGPR usage (e.g., in an empty kernel) (#1528)
* guard against division by zero for no VGPR usage (e.g., in an empty kernel)

* fix bracket format

* clean up parenthesis


[ROCm/hip commit: 73ca2b0083]
2019-10-16 10:49:56 +05:30
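
The guard described in commit d2e9718d23 above can be sketched as follows. The function and parameter names are hypothetical, and the formula is a simplified stand-in for the real occupancy calculation: the point is only that a kernel using zero VGPRs must skip the division.

```cpp
#include <algorithm>
#include <cassert>

// Number of waves the VGPR budget allows, capped at max_waves.
// A kernel that uses no VGPRs (for example, an empty kernel) imposes no VGPR
// limit, so the division is skipped instead of dividing by zero.
int waves_limited_by_vgprs(int vgprs_per_thread, int vgprs_available,
                           int max_waves) {
    assert(vgprs_per_thread >= 0 && vgprs_available >= 0);
    if (vgprs_per_thread == 0) return max_waves;
    return std::min(max_waves, vgprs_available / vgprs_per_thread);
}
```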
Siu Chi Chan 0f9074b568 fix kernel descriptor bug with code object v3
Change-Id: I9306b2baf36d338e36c5ab1226f74373a61a5ae0


[ROCm/hip commit: dcf70ff9a2]
2019-10-03 10:56:35 -04:00
Jeff Daily dcd73a1a87 hipModuleUnload should remove global variables from memtracker (#1464)
[ROCm/hip commit: 56f67e5e36]
2019-09-30 10:41:20 +05:30
Aryan Salmanpour 9e9a505b39 [hip] add initial support for hipLaunchCooperativeKernelMultiDevice API (#1368)
* [hip] add initial support for hipLaunchCooperativeKernelMultiDevice API

* fix formatting


[ROCm/hip commit: bac52d3729]
2019-09-16 08:31:17 +00:00
Sarbojit2019 74a3171c6b [HIP] Reclaiming hipLaunchKernel API (#1353)
* [HIP] Reclaiming hipLaunchKernel API

* Reclaiming hipLaunchKernel : Incorporated review comments

* Incorporated review comments

* Removed hipLaunchKernel Macro from nvcc path


[ROCm/hip commit: 5c4f78bac3]
2019-08-29 01:02:41 +00:00
Aryan Salmanpour 0fc745b3a6 [hip] add initial implementation for hipLaunchCooperativeKernel API (#1339)
* [hip] add initial implementation for hipLaunchCooperativeKernel API

* [hip] use total number of work groups to initialize the GWS resource

* [hip] use only one argument for init_gws kernel

* [hip] use the device associated with the stream for checking the device properties


[ROCm/hip commit: 5066700ace]
2019-08-23 09:19:35 +00:00
Rahul Garg 3c8f84a5c3 Fix undefined identifier issue for hipExtModuleLaunchKernel
[ROCm/hip commit: 3dd0e988b1]
2019-08-14 16:46:32 -04:00
Rahul Garg d429ba57e1 Add support for hipFuncGetAttribute (#1279)
* Add support for hipFuncGetAttribute

* Support NVCC path

* Test using sample module_api_global

* Try fixing CI build failure due to hip_prof_gen scan

* Fix for CI build issue

* Resolve conflict

* Rebase and resolve conflicts with master

* Fix build error

* Fix NVCC path build error


[ROCm/hip commit: 6ce86f409d]
2019-08-08 08:27:41 +00:00
Jeff Daily 9b44993343 consolidate thread local storage (#915)
* all thread local access now through single struct

* clean up old commented-out code, more use of GET_TLS()

* fewer calls to GET_TLS by passing tls as a function argument

* revert unnecessary change to printf

* fix failing tests due to TLS change

* fix merge conflicts in ihipOccupancyMaxActiveBlocksPerMultiprocessor


[ROCm/hip commit: 1eb3dbf065]
2019-08-05 09:51:02 +00:00
Rahul Garg 8b597565c4 Fix missing logstatus in hipFuncGetAttributes
[ROCm/hip commit: 474bf0effc]
2019-08-02 11:51:34 +05:30
wkwchau 7676b86f12 Added support of hipOccupancyMaxActiveBlocksPerMultiprocessor & hipOc… (#1240)
* Added support of hipOccupancyMaxActiveBlocksPerMultiprocessor & hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags APIs

* Take SGPR usage into account when determining the max active blocks in hipOccupancyMaxActiveBlocksPerMultiprocessor()


[ROCm/hip commit: 4b18b321f7]
2019-08-01 08:58:48 +00:00
Rahul Garg 617a6d43dc Add hip init in hipExtLaunchMultiKernelMultiDevice (#1263)
* Add hip init in hipExtLaunchMultiKernelMultiDevice

* Add more logstatus for multiple return paths

* Fix missing i in function name


[ROCm/hip commit: b9e6d72ee6]
2019-07-31 15:42:29 +00:00
Rahul Garg d7973153ca Add HIP init in hipFuncGetAttributes (#1262)
* Add HIP init in hipFuncGetAttributes

* [dtest]Remove explicit hip init call in hipFuncGetAttributes dtest


[ROCm/hip commit: 0517c30507]
2019-07-31 15:42:08 +00:00
cdevadas a02f3a3655 Increased the number of implicit-kernarg bytes to 56 (#1217)
[ROCm/hip commit: d5dba47804]
2019-07-19 04:45:34 +00:00
wkwchau e61b8cec28 Fixed bug in determining max block size in hipOccupancyMaxPotentialBlockSize (#1235)
[ROCm/hip commit: 38254caf7a]
2019-07-18 03:19:29 +00:00
wkwchau 3c963cc0e1 Fixed bug in hipOccupancyMaxPotentialBlockSize for the SGPRs limitation of gfx8 devices (#1176)
[ROCm/hip commit: 47f16264ed]
2019-06-26 15:18:00 +05:30
Aryan Salmanpour 45fa752888 [hip] implement the hipExtLaunchMultiKernelMultiDevice API (#1165)
* [hip] implement the hipExtLaunchMultiKernelMultiDevice API

* add a guard to check the HCC version for the acquire_locked_hsa_queue() API, which was introduced in HCC for ROCm 2.5

* modified code based on the requested changes

* changes to lock all streams before launching kernels for each device and unlock them after the dispatches

* check each stream to be valid before starting to lock all the streams


[ROCm/hip commit: 96dc74897d]
2019-06-20 05:59:05 +05:30
wkwchau 40bd111519 Implement the hipOccupancyMaxPotentialBlockSize function (#1162)
* Implement the hipOccupancyMaxPotentialBlockSize function

* Replaced hipGetDeviceProperties() call by ihipGetDeviceProperties() in ihipOccupancyMaxPotentialBlockSize()

* Add test for hipOccupancyMaxPotentialBlockSize in Module API

* Added extern declaration for ihipGetDeviceProperties() to be accessed inside ihipOccupancyMaxPotentialBlockSize()

* fixed hipOccupancyMaxPotentialBlockSize test build issue

* Fix hipOccupancyMaxPotentialBlockSize dtest

* Add BUILD_CMD in hipOccupancyMaxPotentialBlockSize dtest

* Revert "Add BUILD_CMD in hipOccupancyMaxPotentialBlockSize dtest"

This reverts commit 0480ff56f1441fc515d2c26ce33783e303423938.

* Disable hipOccupancyMaxPotentialBlockSize dtest on NVCC

* move extern declaration of ihipGetDeviceProperties to hip_module.cpp

* Update the limitation of 32 wavefronts per CU and 800/512 SGPRs for VI/pre-VI chips to calculate the occupancy


[ROCm/hip commit: d492f1fd6b]
2019-06-20 05:58:29 +05:30
Rahul Garg 884d0fef76 HACK for SWDEV-173477/SWDEV-190701
[ROCm/hip commit: bc528b1e8b]
2019-06-13 18:15:31 -07:00