Commit Graph

159 Commits

Author SHA1 Message Date
Tao Sang 8218ede5e8 Support hipFuncGetAttributes with hip-clang+Hcc RT
Fix missing kernel function symbols and missing argument lists by
using the __hipRegister* functions.
The following tests now pass:
directed_tests/runtimeApi/module/hipFuncGetAttributes
directed_tests/runtimeApi/module/hipExtLaunchMultiKernelMultiDevice
directed_tests/gcc/LaunchKernel

Change-Id: I52135b61e8283eb4f9f10f77895151e4e55418d9


[ROCm/clr commit: 4a113bb669]
2020-03-26 23:28:35 -04:00
Tao Sang 7a171f28c7 Fix failure to get global variables
Implement _ihipGetGlobalVar() and ihipGetGlobalVar() to
get global variables.

Change-Id: I442ab6712e12306c3316f114f5dc42f6daefaad9


[ROCm/clr commit: 8e9e6a44a4]
2020-03-17 16:14:16 -04:00
Lad, Aditya eca6c4f001 Merge branch 'master' into amd-master-next
Conflicts:
	CMakeLists.txt
	tests/src/texture/simpleTexture2DLayered.cpp
	tests/src/texture/simpleTexture3D.cpp

Change-Id: I4aa4754d391b5f37ddf15fa0bcfc84d9da020119


[ROCm/clr commit: e7fdb3d796]
2020-03-06 14:10:44 -05:00
agodavar 1c828b7575 Fix hipExtLaunchMultiKernelMultiDevice compilation issue
Fix compilation error on hip-hcc+clang and hip-vdi+clang
Enable hipExtLaunchMultiKernelMultiDevice test on hip-vdi path
Use a common hipExtLaunchMultiKernelMultiDevice declaration for all paths

Change-Id: I76031840614fce8e12a8e845548fa43a389a741a


[ROCm/clr commit: 5a6c605730]
2020-03-04 15:38:14 -05:00
Rahul Garg ff421fb2d2 Add hipDrvOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags] (#1854)
Equivalent to cuOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags].

[ROCm/clr commit: 1c794045e0]
2020-02-28 16:46:55 +05:30
Tao Sang 21abc838ef Fix bug of hip/samples/2_Cookbook/7_streams
Initialize Kernel_descriptor with matched function name.

Change-Id: I26911d6bc9b2beae186a9e6f9441ce408521bce9


[ROCm/clr commit: 47d3276177]
2020-02-18 13:53:21 -05:00
Tao Sang 690db691bb Fix bug of hip/samples/0_Intro/hcc_dialects/vadd_hip
For hipLaunchKernelGGL(), hmod->kernargs is empty, so we need to
insert an empty hmod->kernargs[name_str] entry.

Change-Id: I95f818d0525da84452e66c5778f0648a643843c7


[ROCm/clr commit: 05785772e3]
2020-02-16 17:58:04 -05:00
Tao Sang 53e62f704d Support defaultDriver test with Clang+Hcc Runtime
Fix the following issues:
1. Ignore hidden arguments of kernel functions.
2. Look up both the original function name and the function name with the .kd postfix
when arguments are retrieved from the module.
3. In addition, fix a compile issue in the LaunchKernel test app.

Change-Id: I9400943f2f02433cb4409b19c0cac3626c2bc454


[ROCm/clr commit: eb8c4d1ce2]
2020-02-15 19:29:13 -05:00
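Item 2 of commit 53e62f704d above describes a two-step name lookup. A minimal C++ sketch of that idea follows, assuming a plain hash map from symbol name to argument layout. ArgLayout and find_kernargs are hypothetical names chosen for illustration and are not taken from the HIP sources.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical stand-in for per-kernel argument metadata:
// one (size, alignment) pair per explicit argument.
using ArgLayout = std::vector<std::pair<std::size_t, std::size_t>>;

// Try the plain function name first, then the same name with a ".kd" postfix.
// Returns nullptr when neither spelling is present in the table.
inline const ArgLayout* find_kernargs(
    const std::unordered_map<std::string, ArgLayout>& table,
    const std::string& name) {
    auto it = table.find(name);
    if (it == table.end()) {
        it = table.find(name + ".kd");
    }
    return it == table.end() ? nullptr : &it->second;
}
```

With a table that only holds "vadd.kd", a call such as find_kernargs(table, "vadd") still resolves, which is the behaviour the commit message describes.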
Sarbojit2019 194b05efe2 [hip] Fix for bug introduced in #1770 when blockSize is non-power of 2 (#1864)
Fixes SWDEV-222161

[ROCm/clr commit: a03628335c]
2020-02-13 14:22:46 +05:30
Maneesh Gupta a21b1f84cb Revert "Match Occupancy APIs syntax with CUDA (#1625)" (#1857)
Reverting this for now till we figure out how to avoid the build
breakage.

This reverts commit 2c6e880e05.

[ROCm/clr commit: d032637934]
2020-02-10 10:45:28 +05:30
Siu Chi Chan 30f09a1435 Fix C-style hipLaunchKernel (#1835)
* Fix bug in LaunchKernel test
Instead of passing the address of the gpu buffer, pass the address
of the pointer that holds the address of the gpu buffer

* Fix hipLaunchKernel's kernarg buffer construction.
The hipLaunchKernel implementation should rely on ihipModuleLaunchKernel
to construct the kernarg buffer correctly based on kernel metadata.

* Fix a bug in get_functions where the Kernel_descriptor wasn't constructed with the correct kernarg layout information.

* Fix a bug in kernarg layout parsing dealing with kernel without any arg

* teach ihipModuleLaunchKernel to handle kernel without any arg

* Add a more interesting test

[ROCm/clr commit: 14e235378f]
2020-02-04 19:37:16 +05:30
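The first bullet of commit 30f09a1435 above turns on the C-style launch convention, where each entry of the argument array points at the storage holding the argument value rather than being the value itself. The host-only sketch below models that convention without a GPU. pack_pointer_arg and unpack_pointer_arg are hypothetical helpers, and the "device pointer" in the usage note is an ordinary host pointer standing in for one.

```cpp
#include <cstring>
#include <vector>

// Hypothetical model of packing a single pointer-typed kernel argument.
// args[0] must point at the variable that holds the buffer address,
// so the packed bytes are the buffer address itself.
inline std::vector<unsigned char> pack_pointer_arg(void** args) {
    std::vector<unsigned char> kernarg(sizeof(void*));
    std::memcpy(kernarg.data(), args[0], sizeof(void*));
    return kernarg;
}

// Read the pointer value back out of the packed bytes,
// as a kernel would see it in its first argument.
inline void* unpack_pointer_arg(const std::vector<unsigned char>& kernarg) {
    void* value = nullptr;
    std::memcpy(&value, kernarg.data(), sizeof(void*));
    return value;
}
```

Given int* d_buf pointing at a buffer, the correct argument array is void* args[] = {&d_buf}. Passing d_buf itself would make the packer copy the first bytes of the buffer contents instead of its address, which is the bug class the commit fixes in the test.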
Sarbojit2019 dd5c4d41a5 Added overflow check in kernel launch (#1770)
[ROCm/clr commit: 6e62ea5ee3]
2020-02-04 09:02:16 +05:30
satyanveshd 2c6e880e05 Match Occupancy APIs syntax with CUDA (#1625)
* Match Occupancy APIs syntax with CUDA and fix tests using these APIs


[ROCm/clr commit: 262ad13dd5]
2020-01-29 13:05:53 -08:00
Siu Chi Chan 94ddf110dd Detect when an explicit printf buffer flush is required (#1766)
* Detect when an explicit printf buffer flush is required
in a device/stream synchronization function.

* hip_module.cpp: add missing hc_am.hpp header


[ROCm/clr commit: 26b50e1e1b]
2020-01-07 09:06:38 -08:00
Aryan Salmanpour ed2609b880 [hip] refactoring cooperative kernel launch APIs (#1737)
This PR is a follow-up to PR #1698. It makes two more APIs (hipLaunchCooperativeKernel/hipLaunchCooperativeKernelMultiDevice) inline so that they work correctly with lazy binding.

[ROCm/clr commit: 857052be1e]
2019-12-30 12:42:17 +05:30
Alex Voicu 1f762e3970 Fix late-coming issues. (#1724)
Implementation for hipMemcpyWithStream.


[ROCm/clr commit: 150e690a3a]
2019-12-23 19:11:24 +05:30
Aryan Salmanpour 2471c1950c [hip] refactoring hipExtLaunchMultiKernelMultiDevice API (#1698)
[Background] It was found that if lazy linking is used for a library that calls the hipExtLaunchMultiKernelMultiDevice API, the API can get the wrong program_state object when looking up device kernels, leading to a "No device code available" error.

To fix this issue, the API was refactored to be inline and get and pass the correct program_state to an internal hip API to request a multi-device kernel launch.

[ROCm/clr commit: 8eaea4d114]
2019-12-04 11:50:51 +05:30
Rahul Garg cf648f7dbe Rename hip/hip_hcc.h to hip/hip_ext.h (#1341)
* Rename hip/hip_hcc.h to hip/hip_ext.h

* Deprecate hip_hcc.h


[ROCm/clr commit: dfee3ae279]
2019-11-07 13:17:10 +05:30
Rahul Garg 72c686ed67 Revert "Fix occupany APIs (#1560)"
This reverts commit ad1e409a24.


[ROCm/clr commit: 27221bc823]
2019-10-29 11:41:08 -07:00
satyanveshd ad1e409a24 Fix occupany APIs (#1560)
Addresses SWDEV-205006 

[ROCm/clr commit: 6c5fbf9b4a]
2019-10-24 17:44:47 +05:30
searlmc1 510be4b5dc Improve performance of v2 arg handling (#1539)
* Improve performance of v2 arg handling

* Missing change to `std::string`


[ROCm/clr commit: 15a699688e]
2019-10-24 17:44:05 +05:30
Aryan Salmanpour 9e0eaef846 [hip] add support for implicit kernel argument for multi-grid sync (#1456)
* [hip] add support for implicit kernel argument for multi-grid sync

* modified code for calculating the prev_sum

* change the impCoopArg type to size_t

* add memory clean up

* launch init_gws and main kernels into two separate loops


[ROCm/clr commit: 93c688a0c9]
2019-10-24 17:43:30 +05:30
Nick Curtis a7d6c03e17 Guard against division by zero for no VGPR usage (e.g., in an empty kernel) (#1528)
* guard against division by zero for no VGPR usage (e.g., in an empty kernel)

* fix bracket format

* clean up parenthesis


[ROCm/clr commit: d16963c9d5]
2019-10-16 10:49:56 +05:30
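The guard described in commit a7d6c03e17 above can be illustrated with a simplified occupancy term. This is a sketch under stated assumptions, not the HIP implementation: waves_limited_by_vgprs is a hypothetical function, the register budget and wave cap are supplied by the caller, and allocation granularity is ignored.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical helper: how many waves fit given per-wave VGPR usage.
// A kernel that uses no VGPRs (for example an empty kernel) would make
// the division below divide by zero, so that case is handled first.
inline std::size_t waves_limited_by_vgprs(std::size_t vgprs_used,
                                          std::size_t vgprs_available,
                                          std::size_t max_waves) {
    if (vgprs_used == 0) {
        return max_waves;  // registers impose no limit
    }
    return std::min(max_waves, vgprs_available / vgprs_used);
}
```

For example, with an assumed budget of 256 registers and a cap of 10 waves, a kernel using 64 registers is limited to 4 waves, while a kernel using none simply gets the cap.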
Siu Chi Chan 1a7fc4e5c1 fix kernel descriptor bug with code object v3
Change-Id: I9306b2baf36d338e36c5ab1226f74373a61a5ae0


[ROCm/clr commit: d8e09c4b70]
2019-10-03 10:56:35 -04:00
Jeff Daily 280212353c hipModuleUnload should remove global variables from memtracker (#1464)
[ROCm/clr commit: 2a53299f07]
2019-09-30 10:41:20 +05:30
Aryan Salmanpour f68652cc5f [hip] add initial support for hipLaunchCooperativeKernelMultiDevice API (#1368)
* [hip] add initial support for hipLaunchCooperativeKernelMultiDevice API

* fix formatting


[ROCm/clr commit: 6c7da60e28]
2019-09-16 08:31:17 +00:00
Sarbojit2019 4d23e35532 [HIP] Reclaiming hipLaunchKernel API (#1353)
* [HIP] Reclaiming hipLaunchKernel API

* Reclaiming hipLaunchKernel : Incorporated review comments

* Incorporated review comments

* Removed hipLaunchKernel Macro from nvcc path


[ROCm/clr commit: 1ae43cbeba]
2019-08-29 01:02:41 +00:00
Aryan Salmanpour e8f69a8c36 [hip] add initial implementation for hipLaunchCooperativeKernel API (#1339)
* [hip] add initial implementation for hipLaunchCooperativeKernel API

* [hip] use total number of work groups to initialize the GWS resource

* [hip] use only one argument for init_gws kernel

* [hip] use the device associated with the stream for checking the device properties


[ROCm/clr commit: 32ce882d6e]
2019-08-23 09:19:35 +00:00
Rahul Garg 1ca562e14c Fix undefined identifier issue for hipExtModuleLaunchKernel
[ROCm/clr commit: 7f9de881cb]
2019-08-14 16:46:32 -04:00
Rahul Garg 6641c0dc94 Add support for hipFuncGetAttribute (#1279)
* Add support for hipFuncGetAttribute

* Support NVCC path

* Test using sample module_api_global

* Try fixing CI build failure due to hip_prof_gen scan

* Fix for CI build issue

* Resolve conflict

* Rebase and resolve conflicts with master

* Fix build error

* Fix NVCC path build error


[ROCm/clr commit: 8b6317d041]
2019-08-08 08:27:41 +00:00
Jeff Daily 68f674205e consolidate thread local storage (#915)
* all thread local access now through single struct

* clean up old commented-out code, more use of GET_TLS()

* fewer calls to GET_TLS by passing tls as a function argument

* revert unnecessary change to printf

* fix failing tests due to TLS change

* fix merge conflicts in ihipOccupancyMaxActiveBlocksPerMultiprocessor


[ROCm/clr commit: f337ae1edb]
2019-08-05 09:51:02 +00:00
Rahul Garg b064d7cab2 Fix missing logstatus in hipFuncGetAttributes
[ROCm/clr commit: 20e9aba94e]
2019-08-02 11:51:34 +05:30
wkwchau a19b4fbd8b Added support of hipOccupancyMaxActiveBlocksPerMultiprocessor & hipOc… (#1240)
* Added support of hipOccupancyMaxActiveBlocksPerMultiprocessor & hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags APIs

* Take SGPR usage into account when determining the max active blocks in hipOccupancyMaxActiveBlocksPerMultiprocessor()


[ROCm/clr commit: 7b9801fe9a]
2019-08-01 08:58:48 +00:00
Rahul Garg f9eaac9561 Add hip init in hipExtLaunchMultiKernelMultiDevice (#1263)
* Add hip init in hipExtLaunchMultiKernelMultiDevice

* Add more logstatus for multiple return paths

* Fix missing i in function name


[ROCm/clr commit: 8df47255c5]
2019-07-31 15:42:29 +00:00
Rahul Garg 3b5dac1d9d Add HIP init in hipFuncGetAttributes (#1262)
* Add HIP init in hipFuncGetAttributes

* [dtest]Remove explicit hip init call in hipFuncGetAttributes dtest


[ROCm/clr commit: c610159b85]
2019-07-31 15:42:08 +00:00
cdevadas 35f0fb2916 Increased the number of implicit-kernarg bytes to 56 (#1217)
[ROCm/clr commit: fc0aca2a7d]
2019-07-19 04:45:34 +00:00
wkwchau d20537e595 Fixed bug in determining max block size in hipOccupancyMaxPotentialBlockSize (#1235)
[ROCm/clr commit: 6ec476e50a]
2019-07-18 03:19:29 +00:00
wkwchau 7662c1a650 Fixed bug in hipOccupancyMaxPotentialBlockSize for the SGPRs limitation of gfx8 devices (#1176)
[ROCm/clr commit: 3742f24477]
2019-06-26 15:18:00 +05:30
Aryan Salmanpour 362445220a [hip] implement the hipExtLaunchMultiKernelMultiDevice API (#1165)
* [hip] implement the hipExtLaunchMultiKernelMultiDevice API

* add a guard to check the HCC version for the acquire_locked_hsa_queue() API, which was introduced in HCC for ROCm 2.5

* modified code based on the requested changes

* changes to lock all streams before launching kernels for each device and unlock them after the dispatches

* check each stream to be valid before starting to lock all the streams


[ROCm/clr commit: d6ad690cb6]
2019-06-20 05:59:05 +05:30
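The last two bullets of commit 362445220a above describe an ordering: validate every stream, lock all of them, dispatch, then unlock. The sketch below shows that ordering with ordinary mutexes. FakeStream and launch_on_all are hypothetical names, the per-device launch is an arbitrary callback, and the sketch assumes the streams are distinct and the callback does not throw.

```cpp
#include <cstddef>
#include <functional>
#include <mutex>
#include <vector>

// Hypothetical stand-in for a stream that owns a lock.
struct FakeStream {
    std::mutex m;
};

// Returns false without locking or launching anything if any stream is null.
// Otherwise locks every stream, runs one launch per stream, then unlocks all.
inline bool launch_on_all(const std::vector<FakeStream*>& streams,
                          const std::function<void(std::size_t)>& launch) {
    for (FakeStream* s : streams) {
        if (s == nullptr) return false;  // validate before taking any lock
    }
    for (FakeStream* s : streams) s->m.lock();
    for (std::size_t i = 0; i < streams.size(); ++i) launch(i);
    for (FakeStream* s : streams) s->m.unlock();
    return true;
}
```

Validating up front means a bad stream can never leave earlier streams locked, which is one plausible reason for the ordering the commit message describes.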
wkwchau 81b5ea1c4a Implement the hipOccupancyMaxPotentialBlockSize function (#1162)
* Implement the hipOccupancyMaxPotentialBlockSize function

* Replaced hipGetDeviceProperties() call by ihipGetDeviceProperties() in ihipOccupancyMaxPotentialBlockSize()

* Add test for hipOccupancyMaxPotentialBlockSize in Module API

* Added extern declaration for ihipGetDeviceProperties() to be accessed inside ihipOccupancyMaxPotentialBlockSize()

* fixed hipOccupancyMaxPotentialBlockSize test build issue

* Fix hipOccupancyMaxPotentialBlockSize dtest

* Add BUILD_CMD in hipOccupancyMaxPotentialBlockSize dtest

* Revert "Add BUILD_CMD in hipOccupancyMaxPotentialBlockSize dtest"

This reverts commit 0480ff56f1441fc515d2c26ce33783e303423938.

* Disable hipOccupancyMaxPotentialBlockSize dtest on NVCC

* move extern declaration of ihipGetDeviceProperties to hip_module.cpp

* Update the limits used to calculate occupancy: 32 wavefronts per CU and 800/512 SGPRs for VI/pre-VI chips


[ROCm/clr commit: 28c34ead70]
2019-06-20 05:58:29 +05:30
Rahul Garg effbc8b212 HACK for SWDEV-173477/SWDEV-190701
[ROCm/clr commit: 107734f7ad]
2019-06-13 18:15:31 -07:00
Maneesh Gupta b4fb2b0ab4 Merge pull request #1140 from scchan/program_state_stage_2-rebase-20190524
migrate more program_state logic from header into shared library (phase II)

[ROCm/clr commit: 1d5d923d36]
2019-06-05 16:09:01 +05:30
Maneesh Gupta 3d6944e0db Merge branch 'master' into implicit-kernarg
[ROCm/clr commit: d4fa74ff09]
2019-06-04 13:24:19 +05:30
Maneesh Gupta 1a9326b2dd Merge pull request #1155 from gargrahul/fix_kernel_lp_dim_trace
Fix wrong grid dim shown in trace

[ROCm/clr commit: 40a09318e4]
2019-06-04 13:21:39 +05:30
cdevadas 8de283ef77 Runtime changes to append implicit kernel arguments.
Appended 48 empty bytes to the kernarg area at runtime. The implicit arguments are enabled primarily for the hostcall services
and are completely abstracted from user code. Enabled for both hip-clang and hip-hcc.


[ROCm/clr commit: 214ec53da3]
2019-06-04 10:45:49 +05:30
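Commit 8de283ef77 above appends a block of empty bytes after the user's kernel arguments. A minimal sketch of that step follows, assuming the kernarg area is modelled as a byte vector. append_implicit_args is a hypothetical helper, and any alignment padding the real runtime may apply before the implicit block is not modelled here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: extend the packed explicit arguments with a
// zero-filled tail reserved for implicit arguments.
inline std::vector<unsigned char> append_implicit_args(
    std::vector<unsigned char> kernarg, std::size_t implicit_bytes) {
    kernarg.resize(kernarg.size() + implicit_bytes, 0);
    return kernarg;
}
```

Calling append_implicit_args(user_args, 48) mirrors the size named in the commit message; the explicit argument bytes are left untouched and only the tail is added.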
Rahul Garg a8de3fafba Fix wrong grid dim shown in trace
[ROCm/clr commit: 7a2e3b6a1c]
2019-05-31 22:30:24 +05:30
Siu Chi Chan 305eb4239e remove executables() from program_state
[ROCm/clr commit: e2c0122892]
2019-05-24 17:27:01 -04:00
Siu Chi Chan 0cae3e06c1 moving agent_globals_impl into hip_module
[ROCm/clr commit: 6852be819f]
2019-05-24 16:43:38 -04:00
Laurent Morichetti 4c402ccfaf Add support for code object v3
Use the code object manager library to parse the code object metadata. Both
code object v2 and v3 formats are now supported for HCC generated binaries.


[ROCm/clr commit: de89102528]
2019-05-23 18:03:32 -07:00
Alex Voicu a4a3132c64 Add HIPRTC, glorious ersatz for NVRTC (#1097)
* Add ersatz for NVRTC.

* Fix extraneous paren and use correct namespace.

* Use lowerCamelCase (yuck, yuck) consistently.

* Link against FS when building hiprtc lib.

* Correctly mark Manipulators. Fix dual compile.

* Add unit tests. Extend HIT to accept linker options.

* Make sure the HIPRTC library is installed.

* Better logging. Try to auto-detect the target.

* Stop specifying the target explicitly.

* Add missing flavour of `hipModuleLaunchKernel`.

* Program was already destroyed.

* Don't use `--genco`. Fix mangled name trimming.

* Fix HIPRTC breakage due to upstream noise.

* [dtests] Replace RUN -> TEST in hiprtc tests

Change-Id: Ie499e92dfe4e5c94634b1c2b76cf52d241bcfea3

* [hit] Set HIP_PATH to HIP_ROOT_DIR for all tests

Change-Id: Ib0ad1f99bc71c03e363e055dd508a7a4a210680a


[ROCm/clr commit: a538eb705a]
2019-05-16 18:28:54 +05:30