Commit graph

1272 commits

Author SHA1 Message Date
Yaxun (Sam) Liu 13316e2919 Add pow(float/double/_Float16,int)
Change-Id: Ie65d15cd3df9853a3bbd613d8c7188ae39c327c7
2020-07-06 07:38:57 -04:00
Ronak Chauhan affe9ab9b5 Support passing macros to hipLaunchKernelGGL
This makes hipLaunchKernelGGL take a variable argument list, that will be
expanded before being fed to hipLaunchKernelGGLInternal.

This is different from 961717879d.

We try to accommodate the case where a kernel template has multiple
type parameters.

Change-Id: I87577d402c92b0f3b51e298f8293f4065e1f6de8
2020-06-30 10:44:55 -04:00
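
The forwarding idea in the commit above can be illustrated in plain C++ without a HIP toolchain. This is only a minimal sketch of the preprocessor behavior, not the actual HIP header: `LAUNCH`, `LAUNCH_INTERNAL`, `GRID_CONFIG`, `fake_launch`, and `add_kernel` are hypothetical names invented for the illustration. The outer variadic macro lets its arguments be macro-expanded before the inner macro is rescanned, so a macro that expands to several launch arguments is split correctly.

```cpp
#include <cassert>

// Hypothetical stand-in for a kernel launch; it returns a value so the
// result can be checked on the host.
inline int fake_launch(int (*kernel)(int, int), int blocks, int threads,
                       int a, int b) {
    return kernel(a, b) * blocks * threads;
}

inline int add_kernel(int a, int b) { return a + b; }

// Inner macro with fixed leading parameters (hypothetical name).
#define LAUNCH_INTERNAL(kernel, blocks, threads, ...) \
    fake_launch((kernel), (blocks), (threads), __VA_ARGS__)

// Outer variadic macro: __VA_ARGS__ is fully macro-expanded before the
// replacement list is rescanned, so LAUNCH_INTERNAL sees separate arguments.
#define LAUNCH(kernel, ...) LAUNCH_INTERNAL(kernel, __VA_ARGS__)

// A user macro that expands to two launch arguments (blocks, threads).
#define GRID_CONFIG 2, 3

inline int run_example() {
    // Expands to fake_launch((add_kernel), (2), (3), 4, 5).
    return LAUNCH(add_kernel, GRID_CONFIG, 4, 5);
}
```

Calling `LAUNCH_INTERNAL(add_kernel, GRID_CONFIG, 4, 5)` directly would instead bind `GRID_CONFIG` to `blocks` as a single argument, which is the situation the wrapper is meant to avoid.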
Daniil Fukalov 63e44d16a3 Add __attribute__((const)) to grid related functions declarations
This is a cherry-pick of Daniil Fukalov's PR https://github.com/ROCm-Developer-Tools/HIP/pull/2110
which has been committed to the master branch.

Make declarations consistent with https://github.com/RadeonOpenCompute/ROCm-Device-Libs/blob/amd-stg-open/ockl/src/workitem.cl
Without the attribute, these functions lack the "readnone" LLVM IR attribute, and some optimizations fail as a result, e.g. Loop Invariant Code Motion doesn't hoist these calls out of a loop.

Change-Id: Idb599570d142152cc4f6a3c8986384ad7f0c4729
2020-06-29 13:33:18 -04:00
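
As a rough host-side illustration of the attribute discussed in the commit above, the sketch below marks a function with the GCC/Clang `__attribute__((const))`. The names `fake_block_dim` and `sum_scaled` are hypothetical and are not HIP functions. The attribute promises that the result depends only on the arguments and that the function reads no memory, which is what permits an optimizer to hoist the call out of the loop; whether it actually does so depends on the compiler and optimization level.

```cpp
#include <cassert>

// Hypothetical stand-in for a grid query such as a block-size getter.
// __attribute__((const)) is a GCC/Clang extension: the function has no side
// effects and its result depends only on its (here: zero) arguments.
__attribute__((const)) int fake_block_dim();

int fake_block_dim() { return 64; }

// The call inside the loop is loop-invariant. With the const attribute the
// compiler is allowed to evaluate it once before the loop.
inline int sum_scaled(const int* data, int n) {
    int total = 0;
    for (int i = 0; i < n; ++i) {
        total += data[i] * fake_block_dim();
    }
    return total;
}
```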
Ronak Nilesh Chauhan b7101af203 Revert "Support passing macros to hipLaunchKernelGGL"
This reverts commit 961717879d.

Reason for revert: This patch breaks ROCPrim tests

Change-Id: Ib2235f719861c9f4317c33e86b6c1f8bc669cfd4
2020-06-24 04:28:46 -04:00
Michael LIAO cea6b99a28 [hip] Disable assert workaround for HCC on HIP-Clang.
- HIP-Clang follows the standard assert definition by providing
  `__assert_fail`, but the `assert` macro was added as an HCC-specific
  workaround for a missing implementation. Only enable it for HCC
  compilation to avoid unexpected behavior under HIP-Clang
  compilation.

Change-Id: I1c9a707baff9b85c30faef58c52ebfe07e3fc3fc
2020-06-22 10:33:56 -04:00
Ronak Chauhan 961717879d Support passing macros to hipLaunchKernelGGL
This makes hipLaunchKernelGGL take a variable argument list, that will be
expanded before being fed to hipLaunchKernelGGLInternal.

Change-Id: Id76e2bf91acd5d68f56a24fc39f219f2eeb06d33
2020-06-22 04:35:29 -04:00
Tao Sang 63051ca2e1 Support numa policy set by user
Add the hipHostMallocNumaUser flag to hipHostMalloc() in order to support
a NUMA policy set by the user.

Change-Id: I6d70ed539a5f97f27187f2242b68849c0e27e4d6
2020-06-19 21:23:58 -04:00
Yaxun (Sam) Liu b907505d55 Fix missing ldexp(float,int)
Change-Id: I2c1553407dfc26948d3ab7aa532eef42a0f6b204
2020-06-18 15:16:59 -04:00
Jason Tang 38cd2b96c7 Add asicRevision
Change-Id: I59f3ad20b9bdadf77bd1e0725f7a401d7ad423a3
2020-06-16 17:54:20 -04:00
German Andryeyev f4211c3905 Initial support for HIP managed memory
- Call the new ROCclr interfaces for HMM

Change-Id: I2cd1bf438f712a9e9e328340e7d0c025257ca6c1
2020-06-15 18:10:41 -04:00
Rahul Garg 00301b1665 Add back __mbcnt_lo and __mbcnt_hi
Change-Id: Ic3facba2e2245461515799f6a17842da0f5d9933
2020-06-11 21:21:36 -04:00
Dittakavi Satyanvesh 6ed1868203 SWDEV-236670 Address Eigen unit test failure by adding __host__ attribute to half2 functions
Change-Id: Ifdc852c30a1b3704871e0ee58cb7a55d3d37fc6e
2020-06-10 03:01:42 -04:00
Yaxun (Sam) Liu 087c579625 Fix include path and wrapper header
Currently std::complex and some other std functions require users to
include hip_runtime.h before any other headers to work, which is not
reliable.

Changes were made in Clang to fix this issue:
https://reviews.llvm.org/D81176

which requires hipcc and HIP headers to make corresponding changes.

This patch will make sure the clang change will not break
HIP/ROCclr during this transition.

After the transition is done, we can remove explicitly setting
include path for HIP-Clang and HIP header in hipcc and hip config
cmake files and rely on clang driver to set it automatically.

Change-Id: I5d226861c2560ffa6c5ab17343a43cc378048061
2020-06-09 17:37:20 -04:00
Jason Tang 1c0d737e1f SWDEV-227909 - Add gcnArchName
Change-Id: Iea6d16b5d693dd0d900fa424d7a321c39315430e
2020-06-05 15:33:55 -04:00
Siu Chi Chan 784ca6f43c add constexpr constructor for vector types
Change-Id: I45bb0537d6a24ee50b548c2fd8b4f20518764813
2020-06-04 01:57:03 -04:00
Evgeny cad3f805c0 adding hipGetStreamDeviceId() profiling API
Change-Id: I5ccf88ddac123260d7c17defefcf20ff3b2504e2
2020-06-03 18:57:49 -04:00
Jatin 2d517fdcc6 Adding changes for hipExtLaunchKernel for rocCLR
Change-Id: Iba52bc3bde7c37f3fb375a55ba0947e87b3cdc9b
2020-06-02 14:16:41 -04:00
Evgeny ef7ff69ff0 adding hipKernelNameRefByPtr function
Change-Id: Iefc18967b10394b85a207ffdb5bbfe5e3601474d
2020-05-28 10:59:48 -04:00
Michael LIAO f6addba699 [hip] Those texture interfaces are C interfaces and should always be exposed.
Change-Id: Ie34f1420839b17486346149b1672e70ec0088b54
2020-05-27 15:03:59 -04:00
Sarbojit Sarkar 83b11f9a61 [doc]shfl*sync update
1. Updated the FAQ in hip_faq.md to note that shfl*sync is not supported
2. Corrected some input parameter descriptions in hcc_details/hip_runtime_api.h
3. Redirected shfl*() to shfl_*_sync() for the nvcc path where CUDA > 9.0

Change-Id: I3d8184db5fcc622852c9bad96b706348e8dfc16c
2020-05-27 02:17:40 -04:00
Mahesha Shivamallappa 01dae52d64 Add support for cooperative group type - thread_block
Change-Id: If3770b6d6718a638b70f527ae2533d9ef3267ff4
2020-05-22 23:08:42 -04:00
Aryan Salmanpour 7dd5b19290 Add support for hipExtStreamCreateWithCUMask API
Change-Id: I369d0eaca493821c4badc6b18ac02daa2fddc95f
2020-05-22 11:34:06 -04:00
Evgeny 5abb8e1a68 API tracing instrumentation
Change-Id: I257409b9fe299b009ded3e3a43287322d5f93a70
2020-05-14 11:03:09 -05:00
Matt Arsenault d2dd307c7d Remove some asm declarations for intrinsics
This technique should never be used; intrinsics should only be accessed
through __builtins.

There's currently no builtin for groupstaticsize. I left ds_swizzle
since for some reason it switches to the builtin based on __HCC__ or
not.

Change-Id: If1e1394221dba83ea4add6db5e94d6b715552044
2020-05-11 15:20:58 -04:00
Michael LIAO a2dbcc075c [hip] Fix -Wduplicate-decl-specifier warning. NFC.
Change-Id: Iae48bbb7805c39f1005c920df8e76504426f2d3b
2020-05-11 10:12:33 -04:00
Sarbojit Sarkar 3612851809 Enabling hipGetDeviceFlags required in [SWDEV-229170]
Change-Id: I998d37e5847f9651345554bada86df6fce86d1eb
2020-05-08 01:37:23 -04:00
Payam c5f76c3de3 name change vdi to rocclr
Change-Id: I06d198bbb4a499e153b290b73a92afed3553b252
2020-05-06 09:14:30 -04:00
Rahul Garg 60c34fbd4d Make HIP C compliant
Change-Id: Ic2fa650675e68200c841ce3db622da836b169f33
2020-05-05 12:49:40 -04:00
Vlad Sytchenko bfad8d2833 Fix even more typos from 5429b40afe
Change-Id: I4f44261547b321a214348943ff5117eb5bd55b06
2020-05-04 15:26:56 -04:00
Alex Xie d890d77da4 SWDEV-221166 - Detect support for large bar access through HIP runtime API
Change-Id: Iaa9756c1b5e40c1ab5afb38e44a6699fa5f6c13f
2020-05-01 20:39:52 -04:00
Michael LIAO 64507de694 Fix more typos from 5429b40afe.
Change-Id: I75ed28a5862daffc0778910d7ba3b97f51a87949
2020-05-01 12:19:30 -04:00
root 2689246de6 Merge master into amd-master-next
Change-Id: I3fc1dc0c860d627053537581e75561e8a7efe327
2020-04-26 22:19:37 +00:00
Yaxun (Sam) Liu 808dae6813 Enable template max and min for HIP-Clang (#2028)
It was for HCC only. HIP-Clang also needs it for __fp16 since AMDMIGraphX uses it.

Change-Id: Id49322b7b89ef799accdf6b47627a6fce51d1ab5
2020-04-24 12:30:28 -07:00
Yaxun (Sam) Liu 4143d81618 Enable template max and min for HIP-Clang
This change is required by AMDMIGraphX.

It was for HCC only. HIP-Clang also needs it for __fp16 since AMDMIGraphX uses it.

Change-Id: Id49322b7b89ef799accdf6b47627a6fce51d1ab5
2020-04-24 09:51:17 -04:00
Vlad Sytchenko 8d6347c6b8 Make sure to zero out all the unset texture fields
These might contain garbage causing the runtime to incorrectly parse the state of the texture references.

Change-Id: I93c726fa30b580b3e14c50ac939f3c71b0d1c8d9
2020-04-23 16:38:52 -04:00
Maneesh Gupta a0b5dfd625 Merge in the rocclr based hip runtime (#2032)
* Merge master-next changes in master (include vdi development in master branch)
2020-04-23 09:12:06 -07:00
Michael LIAO 218044577e [hip] Fix typos.
Change-Id: I9d85d0e70033d144dbd4d61cb434ffbe023af8c0
2020-04-22 16:44:54 -04:00
Michael LIAO 19f793f1cd [hip] Generate assertion message in assertion.
Change-Id: Ie66f6563e8728fd0e21cf22dcc6619e4a0e5c28d
2020-04-21 16:44:40 -04:00
Michael LIAO 16d9fe5e37 [vdi] Refactor texture/surface reference support.
Change-Id: I8014d82aae7139ef5f95e4b50c4fc6da200dbc9d
2020-04-21 11:56:48 -04:00
Aryan Salmanpour 386a0e0123 disable printf on hip-clang on Windows (#2021)
2020-04-17 10:33:24 +05:30
Jeff Daily ef596cd088 add IPC event support (#1996)
2020-04-17 10:31:22 +05:30
Yaxun (Sam) Liu 8d83e95457 Disable device side malloc (#2009)
* Disable device side malloc

Currently device side malloc is not working and takes excessive
device memory.

Disable it for now until a working malloc is implemented.

Change-Id: I1ad908c1c53a83752383b4be96688a848642c699
2020-04-14 16:07:14 +05:30
Yaxun (Sam) Liu 88304c15e6 Fix MIOpen build failure
This is a cherry-pick of 9ead991784
and https://github.com/ROCm-Developer-Tools/HIP/pull/2009

Fix cmake config file

Removed cmake target files under packaging directory.

Merged cmake config .in files for HIP-Clang and HCC as one.

Use cmake generated target files in both install and packaging.

This makes cmake config file consistent for make install and
make package.

Let device side malloc/free return nullptr and trap

Change-Id: I448f3ea2d4934648089bad371debc203f895cba6
2020-04-13 23:01:31 -04:00
Vlad Sytchenko f311b0062f Fix Windows build
Change-Id: I8c46c8ee82a6e47483d4c0430b483eead3772e5b
2020-04-10 22:25:04 -04:00
Maneesh Gupta 2af31479e2 Merge branch 'amd-master' into amd-master-next
Change-Id: I3094c15008093f2072bcd38aca4ea90aeae2d97b
2020-04-09 06:31:00 -04:00
Michael LIAO a48b312aa9 [hip] Fix volatile-qualified member function declaration.
- It should be a volatile-qualified member function instead of returning
  volatile type.

Change-Id: Id7aaa1953d56151b59e469ef22b9f4280f63bebb
2020-04-07 12:49:26 -04:00
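
The distinction drawn in the commit above, between a volatile-qualified member function and a function that merely returns a volatile type, can be sketched in standard C++. The `Counter` type and `read_volatile` helper are hypothetical and unrelated to the actual HIP vector types touched by the commit.

```cpp
#include <cassert>

struct Counter {
    int value;

    // Volatile-qualified member function: the qualifier follows the parameter
    // list, so the function can be called on volatile (and const) objects.
    int get() const volatile { return value; }

    // By contrast, a declaration such as `volatile int get();` only qualifies
    // the return type and could not be called on a volatile Counter.
};

// Accepts a volatile reference; this compiles only because get() is
// volatile-qualified.
inline int read_volatile(volatile Counter& c) { return c.get(); }
```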
Rahul Garg ba8a556ea9 Rename hipDrvOccupancy to hipModuleOccupancy and match CUDA syntax (#1943)
2020-04-07 14:02:52 +05:30
German Andryeyev 5fe91ccb1b SWDEV-184710
Support hipLaunchCooperativeKernelMultiDevice()

- Add validation logic for MGPU launches to pass a CUDA test

Change-Id: Iccca7fde43493fc3bc6685512d39202271ae3e92
2020-04-06 16:38:27 -04:00
lmoriche 9de5e90ab5 Don't duplicate embedded code objects (#1991)
If the code object is embedded in an already mapped file, and the
lifetime of the mapped file exceeds the lifetime of the executable,
we do not need to make a copy of the binary.

This allows the ROCR to present the code object URI as
file:///path/to/file#offset=X&size=Y.
2020-04-06 15:37:35 +05:30
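
The URI shape mentioned in the commit above, `file:///path/to/file#offset=X&size=Y`, can be sketched as a small string builder. This is an assumption-laden illustration, not the ROCR or HIP implementation: `make_code_object_uri` is a hypothetical helper, and it performs no percent-encoding of the path.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: formats a code object location inside a mapped file.
// Assumes absolute_path starts with '/' and needs no percent-encoding.
inline std::string make_code_object_uri(const std::string& absolute_path,
                                        std::size_t offset,
                                        std::size_t size) {
    return "file://" + absolute_path +
           "#offset=" + std::to_string(offset) +
           "&size=" + std::to_string(size);
}
```

For example, a code object of 1024 bytes at offset 4096 in `/opt/app/libfoo.so` (an invented path) would be described as `file:///opt/app/libfoo.so#offset=4096&size=1024`.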
ansurya 770e76e752 Initial support for bfloat16 (#1980)
2020-04-06 15:35:43 +05:30