Граф коммитов

169 Коммитов

Автор SHA1 Сообщение Дата
Aditya Atluri 1859c6e515 Signal Fix: Added signal limit to allocSignal
1. Did not change the logic in allocSignal
2. Added guard to wait on signal limit

Change-Id: I78f29097e6a584b3c3d78319dac19869067bd1fe
2016-07-27 13:48:49 -05:00
Aditya Atluri 0a31b47e2e Signal Fix: Moved kernel count to critical stream
1. Added environment variable HIP_NUM_KERNELS_INFLIGHT
2. Moved kernelcount variable inside stream critical section

Change-Id: I51d24d0a2a109467209170de117a6d02ba4e308e
2016-07-26 17:09:27 -05:00
Aditya Atluri 53d7629a85 Signal Fix: Changed global signal count to per stream signal count
1. The number of kernels that can use signals are increased to 128
2. The kernel count is now specific to the stream

Change-Id: Ie6d1aa3f437aad8f08c3333fe48bd3f46e551e60
2016-07-26 14:03:51 -05:00
Aditya Atluri fa7933eb91 removed redundant signal destroy
Change-Id: Icf0cd76b2620d34c87cfb6c7a83049087c0a0bc4
2016-07-26 13:35:35 -05:00
Aditya Atluri 4bdf26a82e Added re-fix for memcpy kernel sync
1. The patch uses HIP signal pools to sync between copy and kernel commands
2. The hsa_signal_create is removed
3. Left the redundant enqueueBarrier method just in case

Change-Id: I3dff3e8ee57fff3cd49bec802ff735ed128e5ca1
2016-07-26 09:22:59 -05:00
Rahul Garg 42a3ed544c D2H and H2D unpinned memory transfer support
Change-Id: If6d6c970f435e5d917d5cc6cddc2ee2918cd1c37

Conflicts:
	src/hip_hcc.cpp
2016-07-25 14:36:07 +05:30
Aditya Atluri c756bb3398 Partial fix async after kernel launch signal issue
Change-Id: Ib48d6564379160035bded9493b93663fba361710
2016-07-23 14:54:20 -05:00
Maneesh Gupta 71d51170ef Replace calls to ihipInit with use of HIP_INIT_API macro
Change-Id: Iabf7df79f0238a8ddffea4607fe945df36642850
2016-07-22 15:46:55 +05:30
Maneesh Gupta b23fad53cc Fix using ATP markers
Change-Id: If2d04f80b580237426c569737551e2001a8cd35a
2016-07-21 16:02:51 +05:30
Aditya Atluri c8542007b5 added fix for signal overflow in kernels
Change-Id: Ie0b1f97f69b7d7b34e445f6f120472819be03a0e
2016-07-19 13:51:44 -05:00
Fan Cao 6a2bbbcb75 Replace GPU agent with CPU agent properly for memory async copy API
ihipStream_t::copySync use GPU agent in memory async copy API, even
if the src/dst memory does not belong to GPU, which cause the hsa
runtime to choose a slower copy engine.

SWDEV-95191

Change-Id: If3cab3d493c0c96ed63721cdcf28247a1193887c
2016-06-30 18:23:29 +05:30
Maneesh Gupta f4cc90472d Merge branch 'amd-master' into amd-develop 2016-06-24 21:13:11 +05:30
Rahul Garg 3029be78b8 Included code to calculate value of maxThreadsPerMultiprocessor property
Change-Id: Ie7cad7442f36a7163e715048de5a309febc28664
2016-06-24 15:10:11 +05:30
Ben Sander 9f29cc6989 Grid-launch updates to 2.0 and cleanup of old.
_ Use fields from GRID_LAUNCH_20 structure
  (See USE_GRID_LAUNCH_20 define, currently set to 0)
  "1" will require HCC support.
- Remove old DISABLE_GRID_LAUNCH support.

Change-Id: I584ce648d217251789a6283cf27feb24cb7dc8d1
2016-06-21 23:24:38 -05:00
Maneesh Gupta 1a2db6e6e4 default value of uninitialized dim3 elements should be 1
Change-Id: Idff38fac8dfca68f38f1714f8fdec64df2890a6a
2016-06-20 10:13:46 +05:30
Aditya Atluri 90cd67e0b5 able to pass non-dim launch parm to kernel launch
Change-Id: I0411849a27efcba597a1a9aa08be179635e04988
2016-06-18 11:28:20 -05:00
Ben Sander 2ab19ca505 NVCC improvements.
- Complete translation tables for cudaError <-> hipError_t.
- Remove some odd errors that were not correctly translated or not used.
- Add HIPCHECK_API to test infrastructure.  Used for negative testing
  an API ; if a mismatch occurs it shows the expected return error
  code.  Can also print a warning rather than error.
- Enable hipMemoryAllocate on NV system, and review error coded.
- Add hipErrorName to nvcc.

Change-Id: I680427dcf32a5796d5913cf9e7f3b4c6f6b91599

Conflicts:
	tests/src/CMakeLists.txt

Bug fixes and improved docs for hipFree and hipHostFree.

    - Passing NULL pointer initialized runtime and return hipSuccess
      (not an error like before).
    - add negative test for this. (hipMemoryAllocate, improved)
    - Match NVCC errors for invalid pointers, add to test.
    - Update hipFree and hipHostFree docs.
    - hipGetDevicePointer always set *devicePointer=NULL, even for
      invalid flags.
    - Gate shared memory usage on specific HCC work-week.

Change-Id: I533b4fd3280a3d6cdbf05eb768976f0c7506c012
2016-06-16 06:13:51 +05:30
Ben Sander 89df2f4e2f Merge branch 'privatestaging' into grid_launch 2016-05-02 18:38:20 -05:00
Aditya Atluri cac8110a4f changed to guard from hc.hpp 2016-04-27 17:46:27 -05:00
bwicakso f0974e5867 Merge remote-tracking branch 'refs/remotes/origin/privatestaging' into kernel_synchronization 2016-04-25 13:57:28 -05:00
Maneesh Gupta ffdf6ab23b Merge branch 'release_0.84.00' into privatestaging
Conflicts:
	include/hcc_detail/hip_runtime.h
	src/hip_hcc.cpp
2016-04-22 10:55:58 +05:30
bwicakso df98fd8531 Fix for kernel synchronization
The completion future of a particular kernel is lost if there are
multiple kernels in the stream. This can cause a racing condition where
the signal associated with the unreferenced completion_future might get
released by hcc runtime.
2016-04-20 15:51:39 -05:00
Aditya Atluri f74b7a3636 added support pinned dma memcpy between host and device 2016-04-20 14:21:22 -05:00
Aditya Atluri 805b268ad4 added support for __ldg 2016-04-20 12:25:40 -05:00
Ben Sander 453615ed57 Fix hipDeviceReset synchronization 2016-04-19 11:56:12 -05:00
Ben Sander 8c97a258de Set chicken bits to 0. 2016-04-19 11:56:12 -05:00
Ben Sander 22d5738f82 Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging 2016-04-18 21:51:13 -05:00
Ben Sander e020d68309 Fixes for P2P and hipDeviceReset
- devicereset would lose track of default stream and thus subsequent
  synchronization calls might not actually sychronize.
- Also deviceReset now correctly frees streams.
- fix waits in P2P staging copy - first phase (Device0-to-Staging) must
  wait for second phase (Staging to Device1) to finish draining the
  buffer.
2016-04-18 20:49:33 -05:00
Aditya Atluri ea647727df Update hip_hcc.cpp 2016-04-18 11:36:51 -05:00
Ben Sander 65abde6626 Move HIP_HCC define to CMake 2016-04-17 07:40:04 -05:00
Ben Sander 49cc5aec91 Merge branch 'privatestaging' into p2p
Conflicts:
	include/hcc_detail/hip_hcc.h
	src/hip_hcc.cpp
2016-04-17 06:46:52 -05:00
Aditya Atluri 8dc1bdcbe6 Corrected Memcpydefault 2016-04-16 17:10:13 -05:00
Ben Sander dcabc9dbf7 P2P Update.
- add P2P staging buffer copy.
- If copy device does not have sufficient access permissions, fall back
  to staging buffer.
- improve docs for which copy device is used.
2016-04-16 10:18:56 -05:00
Ben Sander 31dc13d2ec Merge branch 'p2p' of https://github.com/AMDComputeLibraries/HIP-privatestaging into p2p
Conflicts:
	RELEASE.md
	include/hcc_detail/hip_hcc.h
	samples/1_Utils/hipInfo/hipInfo.cpp
	src/hip_hcc.cpp
	src/hip_peer.cpp
2016-04-11 09:17:27 -05:00
Ben Sander 1f53c55d3e P2p checkpoint.
- set USE_PEER_TO_PEER=3 (requires HCC "am_memtracker_update_peers")
- when enabling peer, turn it on for previously allocated memory.
- hipDeviceCanAccessPeer is no longer self-ware (self does not qualify
  as a peer)
- device peerlist always includes self, so when we call allow_access
  we never remove self access.
- hipDeviceReset() removes old peer mappings.
2016-04-11 12:52:18 -05:00
Ben Sander b0529e04f1 Clean up disable.
Add USE_HCC_LOCK (disabled)
Disable USE_PEER_TO_PEER.
2016-04-11 09:09:36 -05:00
Ben Sander 83f0de7806 P2p checkpoint.
- set USE_PEER_TO_PEER=3 (requires HCC "am_memtracker_update_peers")
- when enabling peer, turn it on for previously allocated memory.
- hipDeviceCanAccessPeer is no longer self-ware (self does not qualify
  as a peer)
- device peerlist always includes self, so when we call allow_access
  we never remove self access.
- hipDeviceReset() removes old peer mappings.
2016-04-11 07:58:59 -05:00
Ben Sander fb31eaf07b fix bugs in P2P implementation
- addPeers polarity reversed, would never add.
- check allow_access return value, pipe error to hipMalloc.
2016-04-11 07:58:58 -05:00
Ben Sander 813b063888 For P2P, use the peer list when allocating Device memory or pinned host.
Each new allocation is automatically mapped into the address space of
all enabled peers.
2016-04-11 07:58:58 -05:00
Ben Sander f2aa470f7f P2P checkpoint.
Maintain enabled peer tables for each device.
2016-04-11 07:58:58 -05:00
Ben Sander 69f2469cbb Checkpoint initial peer2peer implementation. 2016-04-11 07:58:58 -05:00
Ben Sander 7886c9e3d9 fix bugs in P2P implementation
- addPeers polarity reversed, would never add.
- check allow_access return value, pipe error to hipMalloc.
2016-04-09 04:11:31 -05:00
pensun 4b2c5976ce clean up unused comments 2016-04-07 09:46:00 -05:00
Ben Sander 41f7317fb5 For P2P, use the peer list when allocating Device memory or pinned host.
Each new allocation is automatically mapped into the address space of
all enabled peers.
2016-04-06 16:44:31 -05:00
Ben Sander 36926e6233 P2P checkpoint.
Maintain enabled peer tables for each device.
2016-04-06 15:50:47 -05:00
Ben Sander b02e9163ab Checkpoint initial peer2peer implementation. 2016-04-06 15:50:47 -05:00
Ben Sander 740b730cac Add runtime switch to control HIP_ATP_MARKER
Only generate the function strings if requested at
compile-time && runtime.
2016-03-29 17:27:30 -05:00
Ben Sander 8635863724 Tweak thread-safe implementation.
introduce LockedAccessor option so destructor does not unlock.
Allows locks to exist across function boundaries, required
for hipLaunchKernel macro which has several unusual requirements.
(including C comppatibility, must use variadic macro, more).
2016-03-28 21:41:47 -05:00
Ben Sander 3f18bab2c7 Stream thread-safe checkpoint.
Moving data structures to critical / protected section.
2016-03-28 09:46:40 -05:00
Ben Sander ec049e7f0b Stream thread-safe checkpoint. 2016-03-28 04:22:20 -05:00