Commit Graph

230 Commits

Author SHA1 Message Date
Ben Sander bb005d1755 Remove faulty assert for kernelCnt==0
Change-Id: I8a925c95f48e857c0a31f44561499e90dc6df552
2016-08-01 13:38:47 -05:00
Aditya Atluri 9c7ee12822 Signal Fix: The signals in a stream are re-used
1. Before, the signal pool grew depending on usage
2. Now, a static number of signals is allocated to the pool.
Only these are used by HIP in a stream
3. If more signals are required than the pool holds, the
stream has to wait until all the signals are available
4. Once they are available, the stream can use them
5. Removed HIP_NUM_SIGNALS_PER_STREAM because of redundancy with HIP_STREAM_SIGNALS
6. Increased signal count from 2 to 32.
Future work: dynamically increase the pool size depending on the number of
streams allocated by the application. Also, the null stream should have more signals.

Change-Id: I6be36e084f26bb04766fabf776c7210aee0f9e91
2016-07-28 23:01:35 -05:00
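
The pooling behaviour described in commit 9c7ee12822 above (a fixed number of signals per stream, with the stream waiting when none are free) can be sketched in plain C++. This is a minimal illustration, not the HIP implementation: `SignalPool`, `acquire`, `tryAcquire`, and `release` are hypothetical names, and integer ids stand in for HSA signal handles.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical fixed-size pool; ints stand in for real signal handles.
class SignalPool {
public:
    explicit SignalPool(std::size_t size) {
        for (std::size_t i = 0; i < size; ++i) free_.push_back(static_cast<int>(i));
    }

    // Blocks the caller until a signal is free, then hands it out.
    int acquire() {
        std::unique_lock<std::mutex> lock(mutex_);
        freed_.wait(lock, [this] { return !free_.empty(); });
        int id = free_.back();
        free_.pop_back();
        return id;
    }

    // Non-blocking variant: returns false when the pool is exhausted.
    bool tryAcquire(int& id) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (free_.empty()) return false;
        id = free_.back();
        free_.pop_back();
        return true;
    }

    // Returns a signal to the pool and wakes one waiting caller.
    void release(int id) {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            free_.push_back(id);
        }
        freed_.notify_one();
    }

private:
    std::mutex mutex_;
    std::condition_variable freed_;
    std::vector<int> free_;  // ids of signals not currently in use
};
```

A pool built as `SignalPool pool(32)` would mirror the signal count the commit mentions. The pool never grows, so a caller that needs more signals than exist simply waits in `acquire` until another command releases one.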
Ben Sander f7ab82cb39 Remove dead enqueueBarrier function.
Change-Id: Ib18fe6bd96ce24dbeb342961ddb5721f7d03f2b2
2016-07-28 22:48:22 -05:00
Ben Sander ef61aae878 Cleanup sync code.
Remove dead depFutures, enqueueBarrier call.
Rename some parms to reflect usage.
Add comments to better explain tricky parts of sync code.

Change-Id: I763296421d9c2b3b58fc8cef5f010b12ab49553c
2016-07-27 18:31:11 -05:00
Ben Sander f5118ce3cd Fix API string message for hipDeviceGetAttribute
Change-Id: I30f54627630c8ee835506be8c9921742bb68a43a
2016-07-27 16:18:14 -05:00
Aditya Atluri 1b2a24d0b8 Signal Fix: Added signal limit to allocSignal
1. Did not change the logic in allocSignal
2. Added guard to wait on signal limit

Change-Id: I78f29097e6a584b3c3d78319dac19869067bd1fe
2016-07-27 13:48:49 -05:00
Aditya Atluri 7be196de48 Signal Fix: Moved kernel count to critical stream
1. Added environment variable HIP_NUM_KERNELS_INFLIGHT
2. Moved kernelcount variable inside stream critical section

Change-Id: I51d24d0a2a109467209170de117a6d02ba4e308e
2016-07-26 17:09:27 -05:00
Aditya Atluri 2e754d27dc Signal Fix: Changed global signal count to per stream signal count
1. The number of kernels that can use signals is increased to 128
2. The kernel count is now specific to the stream

Change-Id: Ie6d1aa3f437aad8f08c3333fe48bd3f46e551e60
2016-07-26 14:03:51 -05:00
Aditya Atluri 524127b4a4 removed redundant signal destroy
Change-Id: Icf0cd76b2620d34c87cfb6c7a83049087c0a0bc4
2016-07-26 13:35:35 -05:00
Aditya Atluri 0232e6bbb4 Added re-fix for memcpy kernel sync
1. The patch uses HIP signal pools to sync between copy and kernel commands
2. The hsa_signal_create is removed
3. Left the redundant enqueueBarrier method just in case

Change-Id: I3dff3e8ee57fff3cd49bec802ff735ed128e5ca1
2016-07-26 09:22:59 -05:00
Rahul Garg d11d65d401 D2H and H2D unpinned memory transfer support
Change-Id: If6d6c970f435e5d917d5cc6cddc2ee2918cd1c37

Conflicts:
	src/hip_hcc.cpp
2016-07-25 14:36:07 +05:30
Aditya Atluri 1704006bed Partial fix async after kernel launch signal issue
Change-Id: Ib48d6564379160035bded9493b93663fba361710
2016-07-23 14:54:20 -05:00
pensun 6db08e5135 Add empty stubs for threadfence family routines, changes include:
- stubs and documentation in include/hcc_details/hip_runtime.h
- stubs with "no-op" in src/hip_memory.cpp
- document update in hip_kernel_language.md, add suggestions to
  disable L1 and L2 caches when using the threadfence routines.

Change-Id: Ic0753170f802003055bca9d7476d7f48817b98b7
2016-07-22 10:40:58 -05:00
Maneesh Gupta b485470819 Replace calls to ihipInit with use of HIP_INIT_API macro
Change-Id: Iabf7df79f0238a8ddffea4607fe945df36642850
2016-07-22 15:46:55 +05:30
Maneesh Gupta dffed956fb Fix using ATP markers
Change-Id: If2d04f80b580237426c569737551e2001a8cd35a
2016-07-21 16:02:51 +05:30
Maneesh Gupta 7d5cffdc17 Merge branch 'hiparray' into amd-develop
Change-Id: I63ca7b1db7b593ac5cfb3fd7cd5d08d6e4075a4c
2016-07-21 12:29:56 +05:30
Aditya Atluri 77d7134619 added fix for signal overflow in kernels
Change-Id: Ie0b1f97f69b7d7b34e445f6f120472819be03a0e
2016-07-19 13:51:44 -05:00
Maneesh Gupta 2577b6158f Merge branch 'amd-develop' into amd-master
Change-Id: I04f85b207e15e66c1a546675dc0937726ee08362
2016-06-30 18:36:07 +05:30
Fan Cao dc0a787984 Replace GPU agent with CPU agent properly for memory async copy API
ihipStream_t::copySync uses the GPU agent in the memory async copy API even
if the src/dst memory does not belong to the GPU, which causes the HSA
runtime to choose a slower copy engine.

SWDEV-95191

Change-Id: If3cab3d493c0c96ed63721cdcf28247a1193887c
2016-06-30 18:23:29 +05:30
Aditya Atluri 38720f8a4e moved half support to a source file
Change-Id: I7c09b41877e22c1b743dea25a585e5307427dafd
2016-06-30 18:23:29 +05:30
7SK 54034e5048 NVCC_COMPAT
Add support for both the CUDA-compatible implementation and the (faster) hcc
implementation, with a test

Change-Id: I79a22344f458391d7dffac5f147619a542e97e4e
2016-06-28 09:36:06 +05:30
Maneesh Gupta dca8fca8eb Merge branch 'amd-master' into amd-develop 2016-06-24 21:13:11 +05:30
Rahul Garg 226aa917e7 Included code to calculate value of maxThreadsPerMultiprocessor property
Change-Id: Ie7cad7442f36a7163e715048de5a309febc28664
2016-06-24 15:10:11 +05:30
Ben Sander e27b5cc927 Grid-launch updates to 2.0 and cleanup of old.
- Use fields from GRID_LAUNCH_20 structure
  (See USE_GRID_LAUNCH_20 define, currently set to 0)
  "1" will require HCC support.
- Remove old DISABLE_GRID_LAUNCH support.

Change-Id: I584ce648d217251789a6283cf27feb24cb7dc8d1
2016-06-21 23:24:38 -05:00
Maneesh Gupta 2d50e4b9e0 default value of uninitialized dim3 elements should be 1
Change-Id: Idff38fac8dfca68f38f1714f8fdec64df2890a6a
2016-06-20 10:13:46 +05:30
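
The behaviour named in commit 2d50e4b9e0 above (unspecified `dim3` elements default to 1) can be illustrated with a small stand-alone type. `Dim3` and `count` are hypothetical names used only for this sketch; it is not the header from the repository.

```cpp
// Hypothetical stand-in for a dim3-like launch dimension type.
struct Dim3 {
    unsigned x, y, z;

    // Any dimension the caller omits defaults to 1, so the total
    // number of threads or blocks is never accidentally zero.
    constexpr Dim3(unsigned x_ = 1, unsigned y_ = 1, unsigned z_ = 1)
        : x(x_), y(y_), z(z_) {}

    // Total number of elements across all three dimensions.
    constexpr unsigned long long count() const {
        return 1ULL * x * y * z;
    }
};
```

With this default, `Dim3(64)` describes a 64 x 1 x 1 launch. If the omitted elements defaulted to 0 instead, the product of the dimensions would be 0.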
Aditya Atluri ffcfc95360 able to pass non-dim launch parm to kernel launch
Change-Id: I0411849a27efcba597a1a9aa08be179635e04988
2016-06-18 11:28:20 -05:00
Ben Sander 44d117ba63 Clean up old work-week and USE_* refs
Change-Id: I929c979fa085f8e5205194cbccca46e9b5516aa9
2016-06-17 15:18:57 -05:00
Aditya Atluri ba262ea855 added tests for host math functions
Change-Id: I66a5c574a27190e32054586f07ecf20e1ff71292
2016-06-17 15:05:33 -05:00
Aditya Atluri 75fc024308 added bessel nth order function
Change-Id: I18a64d894dda9330b39638535dfafd7ce31bb968
2016-06-17 09:22:23 +05:30
Ben Sander 6a2a140f34 NVCC improvements.
- Complete translation tables for cudaError <-> hipError_t.
- Remove some odd errors that were not correctly translated or not used.
- Add HIPCHECK_API to test infrastructure.  Used for negative testing
  an API ; if a mismatch occurs it shows the expected return error
  code.  Can also print a warning rather than error.
- Enable hipMemoryAllocate on NV system, and review error codes.
- Add hipErrorName to nvcc.

Change-Id: I680427dcf32a5796d5913cf9e7f3b4c6f6b91599

Conflicts:
	tests/src/CMakeLists.txt

Bug fixes and improved docs for hipFree and hipHostFree.

    - Passing a NULL pointer initializes the runtime and returns hipSuccess
      (not an error like before).
    - add negative test for this. (hipMemoryAllocate, improved)
    - Match NVCC errors for invalid pointers, add to test.
    - Update hipFree and hipHostFree docs.
    - hipGetDevicePointer always sets *devicePointer=NULL, even for
      invalid flags.
    - Gate shared memory usage on specific HCC work-week.

Change-Id: I533b4fd3280a3d6cdbf05eb768976f0c7506c012
2016-06-16 06:13:51 +05:30
Aditya Atluri c4e667cf90 added more host functions and tests
Change-Id: I9904e65e14c5479ba33d836c5c0b763cb5af71e3
2016-06-15 11:45:19 -05:00
Aditya Atluri bb02880a12 added host device functions
Change-Id: I8f299752fb8dd8e8947da62e4ad88842c1c19f62
2016-06-14 18:14:44 -05:00
Aditya Atluri ce52a8f70c added bessel zero and one order functions
Change-Id: I57039d54eae7207db00415bc7ba09bbf9cb6425a
2016-06-14 11:50:48 +05:30
Aditya Atluri 720fa16355 added erfinv software implementation
Change-Id: Ib1a5584f6c81ab3afa70f7bcbfd7780e156454e3
2016-06-14 00:09:41 -04:00
Aditya Atluri ae96fe4d12 added more device functions
Change-Id: I191919060b393772ee442cc19d83479217c5a4ce
2016-06-13 11:55:12 -05:00
Aditya Atluri 20b991e99a added normcdf support
Change-Id: I4887bc588589ed067eaa339d5eccd988c1c5d649
2016-06-13 10:09:37 +05:30
Aditya Atluri c7462bd524 Added more device functions
1. Added copyright for device float test
2. Added device double functions support
3. Added device double functions test
4. Corrected device function signatures in headers

Change-Id: I13c8829682c925992f5cad84062bc9f702fe4048
2016-06-10 09:46:31 -05:00
Aditya Atluri 681c5fda12 added more float device functions
Change-Id: I106ce6de9ed8806b3699dcf0add9efc9e8583615
2016-06-10 06:22:00 -04:00
7SK fda049fa5f fix_ldg
Change-Id: I53de5fa91b4f57d496ffe46787d197ae84dde4a4
2016-06-09 16:56:05 -04:00
Maneesh Gupta cce19ad99a Merge "Squashed commit of the following:" into amd-master 2016-06-07 12:52:21 -04:00
Maneesh Gupta b35fa83c47 Use cpu agent when using staging buffer
Change-Id: I195a8137e86f2752681d6ba4dc7ba1b6f654e264
2016-06-06 12:42:44 +05:30
Jack Chung 65448e74ed Squashed commit of the following:
commit 9548493fa754b3bf5c31cbdc2211db1e73e8c07c
Author: Jack Chung <whchung@gmail.com>
Date:   Mon May 23 11:57:23 2016 +0800

    Rename hipExternShared test to hipDynamicShared

    Change-Id: I180d9d539420fb69cfc121eceaa7db9da03483b2

commit 827081f8244a38f010789d556db0c4ff7b6422d8
Author: Jack Chung <whchung@gmail.com>
Date:   Mon May 23 11:56:27 2016 +0800

    Rename HIP_DECLARE_EXTERN_SHARED to HIP_DYNAMIC_SHARED

    Change-Id: I22362d179812ac547e0f11ba4e2bb999050e08ae

commit 4c277228ed41af187739610fa17eab1fb144c947
Author: Jack Chung <whchung@gmail.com>
Date:   Thu May 19 17:49:52 2016 +0800

    Adopt new interface to get dynamic LDS in hc.hpp

    Change-Id: I47b433b714633a4c97df87c40a0b1d3386429a00

commit 5a36117d777064113a528dc47b42e8c8413baa97
Author: Jack Chung <whchung@gmail.com>
Date:   Thu May 19 11:29:24 2016 +0800

    Add test patterns for regular expression to match "extern __shared__"

    These test patterns should better be saved as an individual test case, but I'm
    not familiar with HIP test structures so I leave them as comments in hipify as
    of now.

    Change-Id: I7fee89c89b9e73de2133357a226ec0c769733531

commit 1b26284168c7f5339f63338fd0149bed5d994656
Author: Jack Chung <whchung@gmail.com>
Date:   Thu May 19 11:25:23 2016 +0800

    Add one HIP unit test to use HIP_DECLARE_EXTERN_SHARED

    Change-Id: I4d9907815920693a74ea9d575fe26e7c67636109

commit 77b816ee5972b13d829d5bbcf06fbfd07acea2af
Author: Jack Chung <whchung@gmail.com>
Date:   Wed May 18 19:18:59 2016 +0800

    Adopt HIP_ prefix for DECLARE_EXTERN_SHARED macro

    Change-Id: I555ded16b449b67d2e20904013d86fe1ded6a2be

commit ef0997939c3578a9ae11621bf21c0416f04d2622
Author: Jack Chung <whchung@gmail.com>
Date:   Wed May 18 17:42:04 2016 +0800

    Modify hipify to support converting extern __shared__ to DECLARE_EXTERN_SHARED macro

    Added regular expression to search & replace extern __shared__ declarations to
    DECLARE_EXTERN_SHARED macro.

    Limitation:
    - Won't work if "extern __shared__" is declared at global scope

    Sample Usages:
    extern __shared__ double foo[];
    extern __shared__ unsigned int foo[];
    extern volatile __shared__ double foo[];
    extern volatile __shared__ unsigned int sdata[];
    extern __shared__ volatile unsigned int sdata[];
    extern __shared__ T s[];
    extern __shared__ T::type s[];
    extern __shared__ blah<T>::type s[];
    extern __shared__ typename mapper<Float>::type s_data[];
    extern __attribute__((used)) __shared__ typename mapper<Float>::type s_data[];

    Change-Id: I2be0b7039adeddb789f5a2b067d403a43fdc3e26

commit 93ff268724493aedfacdcd5a5aa9a100f4ebaed0
Author: Jack Chung <whchung@gmail.com>
Date:   Wed May 18 15:13:09 2016 +0800

    Introduce DECLARE_EXTERN_SHARED macro to encapsulate "extern __shared__" decls

    Change-Id: I93b2d37c763195b0ca9fd0afee78605a1e3272db

commit cff9c95412de343cc6405158b5acc4f1029267ff
Author: Jack Chung <whchung@gmail.com>
Date:   Wed May 18 12:53:54 2016 +0800

    Add __get_dynamic_groupbaseptr() to point to dynamic LDS

    Change-Id: I97b548d8a691488057617c551a8f331cad7afc77

Change-Id: I84e7875b76fa1f59e860e19c93bd4209cdd1fd2c
2016-06-05 06:20:44 -04:00
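
The search-and-replace idea from the squashed commit above (ef0997939c, later renamed to `HIP_DYNAMIC_SHARED`) can be sketched with `std::regex`. This is a rough illustration under stated assumptions, not the expression hipify uses: the function name `rewriteExternShared` is hypothetical, the `(type, name)` argument order and the missing trailing semicolon are assumed, and the pattern only covers declarations where `__shared__` directly follows `extern`.

```cpp
#include <regex>
#include <string>

// Hypothetical helper: rewrites `extern __shared__ <type> <name>[];`
// into `HIP_DYNAMIC_SHARED(<type>, <name>)`.
// Qualifiers or attributes placed between `extern` and `__shared__`
// are not handled by this simplified pattern.
std::string rewriteExternShared(const std::string& src) {
    static const std::regex pattern(
        R"(extern\s+__shared__\s+(.+?)\s+(\w+)\s*\[\s*\]\s*;)");
    // $1 is the element type (may contain spaces, templates, ::),
    // $2 is the array name.
    return std::regex_replace(src, pattern, "HIP_DYNAMIC_SHARED($1, $2)");
}
```

Several of the sample usages listed in the commit, such as `extern __shared__ unsigned int foo[];` or `extern __shared__ typename mapper<Float>::type s_data[];`, fit this pattern. The forms with `volatile` or `__attribute__((used))` before `__shared__` would need a wider expression.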
Rahul Garg 02a6c1fbe0 Update in clock function
Change-Id: I5819aa62693dc3b9b5d7e39944d1e58aadc72027
2016-05-20 11:12:32 +05:30
Maneesh Gupta 429c26ea93 Add missing unsigned keyword to atomicInc and atomicDec
Change-Id: I658479c4c7c409dba117152165229880aeb5ab9f
2016-05-16 10:42:13 +05:30
Rahul Garg 8c11c333e2 Support for Atomic inc and dec in HIP
Change-Id: I783e4917cece5cc379894f0d293382315fbfa8b0
2016-05-12 11:10:48 +05:30
Aditya Atluri 67e2ee1efe Added copyright for device functions file
Change-Id: I689345ae7428928b4d2d7cd37fbc561309db3256
2016-05-06 10:51:06 -05:00
Ben Sander 07b5785da2 comment change 2016-05-03 08:33:35 -05:00
Ben Sander 20043d602e Merge branch 'privatestaging' into grid_launch 2016-05-02 18:38:20 -05:00
Ben Sander 79983a1f4b Merge branch 'privatestaging' into p2p 2016-05-02 11:10:10 -05:00
Aditya Atluri c46e03de96 removed warnings 2016-04-30 12:11:04 -05:00