Maneesh Gupta
cb642f14ab
Merge pull request #482 from ROCm-Developer-Tools/feature_clean_up_hip_math
...
Switch to using ROCDL directly, as opposed to via HC. Add missing bits.
2018-06-06 16:07:22 +05:30
Maneesh Gupta
53037472ff
Merge pull request #497 from gargrahul/fix_memcpy3d_fastpath
...
Fix hipMemcpy3D for fast path
2018-06-06 14:44:02 +05:30
Maneesh Gupta
28c5b15d88
Merge pull request #492 from gargrahul/fix_depth_3d_alloc
...
Fix depth value for 3D allocations
2018-06-06 14:41:23 +05:30
Maneesh Gupta
f7c49dde38
Merge pull request #491 from scchan/fix_wait
...
callback handling: don't need to wait for the thread to become ready
2018-06-06 14:38:25 +05:30
Rahul Garg
163d4a5b03
Fix hipMemcpy3D for fast path
2018-06-05 18:54:33 +05:30
Siu Chi Chan
0d719c514f
remove the _ready flag in ihipStreamCallback_t and the mutex that protects it.
2018-06-04 17:29:04 -04:00
Rahul Garg
fa6ce7a724
Fix depth value for 3D allocations
2018-06-04 18:00:22 +05:30
Siu Chi Chan
e21e6ed3a0
callback handler: don't need to wait for the thread to become ready
2018-06-02 17:55:37 -04:00
Alex Voicu
68a0dd826d
Remove vestigial implementations.
2018-06-02 11:37:08 +01:00
Rahul Garg
94f086e9cd
Add integrated device property
2018-06-02 13:11:16 +05:30
Alex Voicu
f2d7f112ab
Re-sync with upstream.
2018-06-01 15:49:05 +01:00
Maneesh Gupta
095d4dd91e
Merge pull request #484 from gargrahul/fix_malloc_hiphostreg
...
Fix memcpy2D for malloc + hostRegister
2018-06-01 16:53:25 +05:30
Maneesh Gupta
8ecb3eeb55
Merge pull request #466 from ROCm-Developer-Tools/feature_use_Float16
...
Feature use _Float16 and match CUDA __half behaviour.
2018-06-01 13:50:12 +05:30
Alex Voicu
e03ca1a72e
Re-sync with upstream. Add integer abs.
2018-05-31 16:38:00 +01:00
Rahul Garg
a3609eaf61
Fix memcpy2D for malloc + hostRegister
2018-05-31 13:14:27 +05:30
Alex Voicu
14e6a04387
Switch to using ROCDL directly, as opposed to via HC. Add missing bits.
2018-05-31 03:17:26 +01:00
Maneesh Gupta
3327a9a2de
Merge pull request #472 from Jorghi12/patch-3
...
Adding double/long int signatures for abs
2018-05-30 08:32:14 +05:30
Jorghi12
61ff40a1cf
Update math_functions.cpp
...
CUDA also has a function named labs.
2018-05-26 16:21:14 -04:00
Jorghi12
13f37d550f
Adding double/long int signatures for abs
...
Adding overloads for abs that are found in cuda's math_functions.
2018-05-26 00:40:14 -04:00
Rahul Garg
27e0566c3a
Use 64x4 grid dims
2018-05-24 23:51:52 +05:30
Rahul Garg
aed2653857
Clean up and fix remaining bytes copy
2018-05-24 23:30:27 +05:30
Alex Voicu
f2a86f3e1c
Remove vestigial inline LLVMIR.
2018-05-24 12:46:14 +01:00
Rahul Garg
53a7c61e9b
Fix memcpy2d kernel dims
2018-05-24 17:00:12 +05:30
Rahul Garg
96b4618d26
Correct remaining bytes in copy 2d kernel
2018-05-24 08:27:24 +05:30
Alex Voicu
ecefdd6541
Missing commit.
2018-05-23 17:57:47 +01:00
Rahul Garg
2c3d1498d4
Optimize memcpy2D kernel use
2018-05-23 14:43:47 +05:30
Maneesh Gupta
7042fe6067
Merge pull request #464 from gargrahul/fix_memcpy2d_pinned_mem_case
...
Fixed memcpy2D for pinned memory case using 2D kernel
2018-05-22 10:42:28 +05:30
Maneesh Gupta
85342d73b5
Merge pull request #445 from ROCm-Developer-Tools/feature_func_attributes
...
Add support for the hipFuncGetAttributes interface.
2018-05-22 09:37:41 +05:30
Rahul Garg
40fb44dbe6
Fixed memcpy2D for pinned memory case using 2D kernel
2018-05-21 22:14:45 +05:30
Maneesh Gupta
66d05e6fc3
hipMemcpy returns success if sizeBytes is 0.
...
Fixes SWDEV-153754 & SWDEV-154178.
2018-05-21 15:38:44 +05:30
Maneesh Gupta
cfb7be414b
Merge pull request #455 from ROCm-Developer-Tools/magic
...
Change HIP fat binary magic number
2018-05-21 09:52:03 +05:30
Alex Voicu
43fca684c8
Update hip_module.cpp
...
Typo.
2018-05-18 17:50:45 +01:00
Rahul Garg
4f5bdb071c
Fix for memcpy2DAsync for pinned host memory case
2018-05-18 21:09:50 +05:30
Maneesh Gupta
1c93e11cdf
Merge pull request #433 from gargrahul/add_hipmemset3d
...
Added hipMemset3D
2018-05-18 14:54:15 +05:30
Maneesh Gupta
bb1f53ac44
Merge pull request #448 from 949f45ac/master
...
Provide correct __mul64hi and __umul64hi builtins, using code from ROCm-Device-Libs
2018-05-18 13:18:16 +05:30
Yaxun (Sam) Liu
2f9bce3652
Change HIP fat binary magic number
2018-05-17 17:04:51 -04:00
949f45ac
7bf8402d1d
Reinstate accidentally deleted uchar2Holder
2018-05-17 10:55:45 +02:00
Rahul Garg
8413fb51e1
Fixed hipMemcpy2D to handle 1D memcpy case
2018-05-16 11:07:10 +05:30
Alex Voicu
40a22d235e
Update hip_module.cpp
2018-05-14 17:15:36 +01:00
949f45ac
9210263727
Provide correct __mul64hi and __umul64hi builtins, using code from ROCm-Device-Libs
2018-05-14 08:34:56 +02:00
Alex Voicu
eded014abc
Don't use magic constants, they're evil.
...
Also clarify that the register count cannot be queried at the moment.
2018-05-11 11:31:46 +01:00
Alex Voicu
bf9529aaa8
Add support for the hipFuncGetAttributes interface.
2018-05-11 03:35:10 +01:00
Rahul Garg
78568435da
Added hipMemset3D
2018-05-07 10:24:30 +05:30
Lakhan Singh
12d8a47c0c
Null checks added for hipMallocPitch and hipMemcpy APIs
2018-05-03 09:27:50 +05:30
Rahul Garg
1d76c48e3d
Fix texture 3D for HIP/NVCC
2018-05-02 11:56:37 +05:30
Maneesh Gupta
af0c227df4
Merge pull request #415 from deven-amd/master
...
Checkin to fix bugs in math functions.
2018-05-01 12:29:03 +05:30
Deven Desai
65a90c55e7
Checkin to fix bugs in math functions.
...
This change fixes the following bugs, which were discovered while debugging TF unit test failures (cwise_ops_test):
1. __hisinf and __hisnan routines
Both had incorrect implementations.
2. abs
A "long long" (64-bit int) version was missing, so the 32-bit version was used for 64-bit ints, which gave incorrect results when the value passed in was outside the 32-bit int range.
3. lgamma
We seemed to have a custom version for the 'double' datatype, which was giving incorrect results. Replaced it with a call to the 'double' version of the underlying 'hc::precision_math::lgamma'.
2018-04-24 18:10:07 +00:00
Lakhan Singh
74faa61d52
SWDEV-141024
2018-04-20 17:40:00 +05:30
Rahul Garg
fcc0866681
Added hipMemset2DAsync support
2018-04-17 18:27:27 +05:30
Maneesh Gupta
72ff4c5cc4
Merge pull request #400 from gargrahul/hipModule_cleanup
...
hip_module code cleanup
2018-04-17 09:00:15 +05:30