Commit Graph

769 Commits

Author SHA1 Message Date
Maneesh Gupta cb642f14ab Merge pull request #482 from ROCm-Developer-Tools/feature_clean_up_hip_math
Switch to using ROCDL directly, as opposed to via HC. Add missing bits.
2018-06-06 16:07:22 +05:30
Maneesh Gupta 53037472ff Merge pull request #497 from gargrahul/fix_memcpy3d_fastpath
Fix hipMemcpy3D for fast path
2018-06-06 14:44:02 +05:30
Maneesh Gupta 28c5b15d88 Merge pull request #492 from gargrahul/fix_depth_3d_alloc
Fix depth value for 3D allocations
2018-06-06 14:41:23 +05:30
Maneesh Gupta f7c49dde38 Merge pull request #491 from scchan/fix_wait
callback handling: don't need to wait for the thread to become ready
2018-06-06 14:38:25 +05:30
Rahul Garg 163d4a5b03 Fix hipMemcpy3D for fast path 2018-06-05 18:54:33 +05:30
Siu Chi Chan 0d719c514f remove the _ready flag in ihipStreamCallback_t and the mutex that protects it. 2018-06-04 17:29:04 -04:00
Rahul Garg fa6ce7a724 Fix depth value for 3D allocations 2018-06-04 18:00:22 +05:30
Siu Chi Chan e21e6ed3a0 callback handler: don't need to wait for the thread to become ready 2018-06-02 17:55:37 -04:00
Alex Voicu 68a0dd826d Remove vestigial implementations. 2018-06-02 11:37:08 +01:00
Rahul Garg 94f086e9cd Add integrated device property 2018-06-02 13:11:16 +05:30
Alex Voicu f2d7f112ab Re-sync with upstream. 2018-06-01 15:49:05 +01:00
Maneesh Gupta 095d4dd91e Merge pull request #484 from gargrahul/fix_malloc_hiphostreg
Fix memcpy2D for malloc+ hostRegister
2018-06-01 16:53:25 +05:30
Maneesh Gupta 8ecb3eeb55 Merge pull request #466 from ROCm-Developer-Tools/feature_use_Float16
Feature use _Float16 and match CUDA __half behaviour.
2018-06-01 13:50:12 +05:30
Alex Voicu e03ca1a72e Re-sync with upstream. Add integer abs. 2018-05-31 16:38:00 +01:00
Rahul Garg a3609eaf61 Fix memcpy2D for malloc+ hostRegister 2018-05-31 13:14:27 +05:30
Alex Voicu 14e6a04387 Switch to using ROCDL directly, as opposed to via HC. Add missing bits. 2018-05-31 03:17:26 +01:00
Maneesh Gupta 3327a9a2de Merge pull request #472 from Jorghi12/patch-3
Adding double/long int signatures for abs
2018-05-30 08:32:14 +05:30
Jorghi12 61ff40a1cf Update math_functions.cpp
CUDA also has a function named labs.
2018-05-26 16:21:14 -04:00
Jorghi12 13f37d550f Adding double/long int signatures for abs
Adding overloads for abs that are found in cuda's math_functions.
2018-05-26 00:40:14 -04:00
Rahul Garg 27e0566c3a Use 64x4 grid dims 2018-05-24 23:51:52 +05:30
Rahul Garg aed2653857 Clean up and fix remaining bytes copy 2018-05-24 23:30:27 +05:30
Alex Voicu f2a86f3e1c Remove vestigial inline LLVMIR. 2018-05-24 12:46:14 +01:00
Rahul Garg 53a7c61e9b Fix memcpy2d kernel dims 2018-05-24 17:00:12 +05:30
Rahul Garg 96b4618d26 Correct remaining bytes in copy 2d kernel 2018-05-24 08:27:24 +05:30
Alex Voicu ecefdd6541 Missing commit. 2018-05-23 17:57:47 +01:00
Rahul Garg 2c3d1498d4 Optimize memcpy2D kernel use 2018-05-23 14:43:47 +05:30
Maneesh Gupta 7042fe6067 Merge pull request #464 from gargrahul/fix_memcpy2d_pinned_mem_case
Fixed memcpy2D for pinned memory case using 2D kernel
2018-05-22 10:42:28 +05:30
Maneesh Gupta 85342d73b5 Merge pull request #445 from ROCm-Developer-Tools/feature_func_attributes
Add support for the hipFuncGetAttributes interface.
2018-05-22 09:37:41 +05:30
Rahul Garg 40fb44dbe6 Fixed memcpy2D for pinned memory case using 2D kernel 2018-05-21 22:14:45 +05:30
Maneesh Gupta 66d05e6fc3 hipMemcpy returns success if sizeBytes is 0.
Fixes SWDEV-153754 & SWDEV-154178.
2018-05-21 15:38:44 +05:30
Maneesh Gupta cfb7be414b Merge pull request #455 from ROCm-Developer-Tools/magic
Change HIP fat binary magic number
2018-05-21 09:52:03 +05:30
Alex Voicu 43fca684c8 Update hip_module.cpp
Typo.
2018-05-18 17:50:45 +01:00
Rahul Garg 4f5bdb071c Fix for memcpy2DAsync for pinned host memory case 2018-05-18 21:09:50 +05:30
Maneesh Gupta 1c93e11cdf Merge pull request #433 from gargrahul/add_hipmemset3d
Added hipMemset3D
2018-05-18 14:54:15 +05:30
Maneesh Gupta bb1f53ac44 Merge pull request #448 from 949f45ac/master
Provide correct __mul64hi and __umul64hi builtins, using code from ROCm-Device-Libs
2018-05-18 13:18:16 +05:30
Yaxun (Sam) Liu 2f9bce3652 Change HIP fat binary magic number 2018-05-17 17:04:51 -04:00
949f45ac 7bf8402d1d Reinstate accidentally deleted uchar2Holder 2018-05-17 10:55:45 +02:00
Rahul Garg 8413fb51e1 Fixed hipMemcpy2D to handle 1D memcpy case 2018-05-16 11:07:10 +05:30
Alex Voicu 40a22d235e Update hip_module.cpp 2018-05-14 17:15:36 +01:00
949f45ac 9210263727 Provide correct __mul64hi and __umul64hi builtins, using code from ROCm-Device-Libs 2018-05-14 08:34:56 +02:00
Alex Voicu eded014abc Don't use magic constants, they're evil.
Also clarify that the register count cannot be queried at the moment.
2018-05-11 11:31:46 +01:00
Alex Voicu bf9529aaa8 Add support for the hipFuncGetAttributes interface. 2018-05-11 03:35:10 +01:00
Rahul Garg 78568435da Added hipMemset3D 2018-05-07 10:24:30 +05:30
Lakhan Singh 12d8a47c0c Null checks added for hipmallocpitch and hipmemcpy apis 2018-05-03 09:27:50 +05:30
Rahul Garg 1d76c48e3d Fix texture 3D for HIP/NVCC 2018-05-02 11:56:37 +05:30
Maneesh Gupta af0c227df4 Merge pull request #415 from deven-amd/master
Checkin to fix bugs in math functions.
2018-05-01 12:29:03 +05:30
Deven Desai 65a90c55e7 Checkin to fix bugs in math functions.
This change fixes the following bugs that were discovered while debugging TF unit test failures (cwise_ops_test):

1. __hisinf and __hisnan routines
   Both had incorrect implementations.

2. abs
   A "long long" (64-bit int) version was missing, so the 32-bit version was used for 64-bit ints, giving incorrect results when the value passed in was outside the 32-bit int range.

3. lgamma
   We seemed to have a custom version for the 'double' datatype (which was giving incorrect results). Replaced it with a call to the 'double' version of the underlying 'hc::precision_math::lgamma'
2018-04-24 18:10:07 +00:00
Lakhan Singh 74faa61d52 SWDEV-141024 2018-04-20 17:40:00 +05:30
Rahul Garg fcc0866681 Added hipMemset2DAsync support 2018-04-17 18:27:27 +05:30
Maneesh Gupta 72ff4c5cc4 Merge pull request #400 from gargrahul/hipModule_cleanup
hip_module code cleanup
2018-04-17 09:00:15 +05:30