Commit graph

291 commits

Author SHA1 Message Date
Aditya Atluri 1cead6a4cd added new api hipHccModuleLaunchKernel
1. hipHccModuleLaunchKernel is same as hipModuleLaunchKernel with OpenCL workitem model
2. Added copyright
3. Fixed header naming

Change-Id: I6a7c35a3566e2f8d3f5056613e34193775d4b236
2017-03-31 12:11:34 -05:00
Rahul Garg dfa516f804 Fix for hipMemcpyFromSymbolAsync
Change-Id: I449c669c8f0ef041deaf0a1bc812a71b2f0cc5a6
2017-03-24 10:30:33 +05:30
Rahul Garg e22044de36 Fix for hipMemcpyFromSymbol (sync)
Change-Id: I66afec5443ce904a63ced1fafece5144ca59393e
2017-03-21 23:48:04 +05:30
pensun ad882222b0 Initial integration with Alex' Generic Grid Launch
Change-Id: I559afb80e9e39ec0d119bb3bf3b85ef9e448caf6
2017-03-17 14:59:34 -05:00
Ben Sander e43592721e Update hipHostRegister debug and pointerTracker debug and notes 2017-03-11 09:18:27 -06:00
Aditya Atluri 9f575721aa added new field to hipDeviceProp_t structure gcnArch.
1. It is an integer containing gfx values 701, 801, 802, 803
2. On NV path, it is zero

Change-Id: I2b4c7f48981d0214d8c6b1905d2cc85b16203419
2017-03-07 11:24:32 -06:00
Rahul Garg 158cb58c36 Removed hsakmt headers
Change-Id: I4ffc95d5823489195ebc5638226b49ea2995f603
2017-03-06 22:37:05 +05:30
Rahul Garg c837b8d713 Context management related changes in HIP.
- Contexts across threads are listed under device
- Device reset cleans up all contexts and re-initializes _primaryCtx

Change-Id: Ie1cfbb26d43a8dc6869be3e6ebaf7344ce374643
2017-02-27 15:24:17 +05:30
Aditya Atluri 7ac5017cb9 Added initial support for hipMemcpyFromSymbol. But not working!
Change-Id: I48d8c7de4ec9f85c6c942be995fb488a3931f5d7
2017-02-23 11:29:06 -06:00
Aditya Atluri 3d348b2d81 added runtime api hipMemcpyFromSymbolAsync
Change-Id: Ibaf925faf0ba464dd0ed6c5ea74c224c2ce38889
2017-02-22 19:16:35 -06:00
Aditya Atluri 0d4e6ae60a fixed symbol memcpy issue
Change-Id: I89d7401be51d194bcbf771020ba66e3d3b6a18f8
2017-02-01 17:54:59 -06:00
Ben Sander 1a24178c78 Add HIP_FAIL_SOC.
Fail sub-optimal-copies rather than perform them slowly.
SOCs occur on async copies of unpinned memory, or P2P copies between GPUs
that are not peers.
2017-01-25 21:53:17 -06:00
Ben Sander 2f7a8ec39c Read HCC_OPT_FLUSH and optimize dispatch accordingly.
If HCC is in this mode, we can use less aggressive flushes in some
cases.
2017-01-25 21:50:52 -06:00
Ben Sander bc809460f5 Move core env var processing to env.cpp 2017-01-23 22:34:41 -06:00
Ben Sander 4de3df746c Add debug tips to docs 2017-01-23 22:34:41 -06:00
Ben Sander db3f4889ca Add HIP_SYNC_HOST_ALLOC, HipReadEnv 2017-01-19 23:55:24 -06:00
Ben Sander ca1cef4e06 Doc update - describe debug techniques
Also tweak sample to remove unneeded HIP_KERNEL_NAME.
Comment update
2017-01-19 12:40:45 -06:00
Aditya Atluri e9ff23e5f9 changed copyright year from 2016 to 2017 in src directory
Change-Id: Idb97db509b2b4b1656b2df7a14a62ade38c9d574
2017-01-11 18:05:41 -06:00
Ben Sander b29fbf736d Add HIP_MAX_QUEUES feature.
Includes some tricky manipulation of the locks for contexts and streams.
The issue is that stealing a stream requires that we lock the context to
walk the streams to find a victim.  To avoid deadlock, we can't
have a stream locked when we lock the context.  This implementation
releases the stream lock, then acquires the context and selects the
victim.
A more stable implementation might be to copy the stream list
from a context so that a lock is not required to walk all streams.
Smart shared_ptr could be used to prevent the streams from being
deallocated during the walk.
2017-01-09 21:02:56 -06:00
Ben Sander c9f5fe34e6 First pass at virtualized queue support.
Also updated stream debug messages to consistently use trace_helper.
2017-01-09 21:02:53 -06:00
Ben Sander 49d1477b9d tolerate spaces in hip args 2017-01-09 20:57:13 -06:00
Rahul Garg 090eadd0bd Added state for hipDevice.
Change-Id: Idbc3c04cd054a01b634856a1e0a23ff172e991aa
2017-01-09 23:54:01 +05:30
Ben Sander fd5b0c68b1 Support size_t in memset kernel.
Add disable for HSA_AMD_AGENT_INFO_MAX_WAVES_PER_CU
Remove one copy of completion_future in memset.
2016-12-22 12:25:09 -06:00
Ben Sander cf338d716b Increment API sequence number.
Change name to tls_tidInfo
2016-12-21 15:30:36 -06:00
Rahul Garg 4988975b59 Fix for HCSWAP-67
Change-Id: I0b2ce5ab933237947fb41d89769db3da16e5be6a

Conflicts:
	src/hip_hcc.cpp
2016-12-19 16:19:51 +05:30
Ben Sander 06d382bc6d Remove USE_DISPATCH_HSA_KERNEL=0 path. 2016-12-17 07:22:56 -06:00
pensun 4cb1579d4a HIP resource leaks fix from Jack
Change-Id: I93f3ad7cb94ff1cba1577bd8acc90e826693d12e
2016-12-05 20:21:33 -06:00
Maneesh Gupta ac93376c26 Revert "Enable USE_DISPATCH_HSA_KERNEL."
This reverts commit f8bcbe8680.
2016-12-05 16:55:26 +05:30
Ben Sander f8bcbe8680 Enable USE_DISPATCH_HSA_KERNEL.
Optimize hipLaunchModule dispatch latency.
2016-12-04 00:13:19 -06:00
Ben Sander 783ac156ce Add additional controls for forcing serialization and blocking.
Move HIP_COHERENT_HOST_ALLOC so it is read once at init time.
Add HIP_LAUNCH_BLOCKING_KERNELS, HIP_API_BLOCKING.
Update docs on debug and chicken bits.

Conflicts:
	src/hip_hcc.cpp
2016-12-02 18:03:59 -06:00
pensun 504fcaf786 Change to use product device name by default
Change-Id: Ie2cee2a2e94a08b5874a2f5abee5d1ab6c9fdf47
2016-11-29 11:34:06 -06:00
Ben Sander a504df955e Add more debug info 2016-11-26 08:56:02 -06:00
Ben Sander 9db93a1b96 Improve docs in some places
Change-Id: If31e84fbf0c8595ca72edb842dce7ce47783579b
2016-11-23 08:16:18 -06:00
Ben Sander 111b57ddd0 Improve debug capabilities.
Print TID mapping at init when HIP_TRACE_API=1.
Print base host/dev info from tracker during copy.

Change-Id: I84e26d7b801567e5a91baad36126fb590920ec87
2016-11-23 08:16:18 -06:00
Rahul Garg afbd278804 Removed hsaKmtReleaseSystemProperties call
Change-Id: I7cb992cccf587c333f0ca0cb518409f3944bdb06
2016-11-22 06:15:35 +05:30
Maneesh Gupta 2195e3c37d Refactor for building HIP as dynamic library
Change-Id: I65a3d9d589c4fdbbdcf1611e5427224253be8260
2016-11-18 14:33:20 +05:30
Aditya Atluri 603bb321ec Added i8 packed math intrinsics
1. Added add, sub, mul packed math i8 intrinsics
2. Removed c++ packed data structures included from HCC

Change-Id: I1d109c5ce10c48b7cd3ea059478b88fc1de78499
TODO: Add better packed data structures support
2016-11-17 01:09:12 -06:00
Maneesh Gupta 0696d4417f Enable USE_COPY_EXT_V2 by default
Change-Id: I2c0dc80f85a0ccb5744715b5418a604e38b249ed
2016-11-15 10:42:27 +05:30
Aditya Atluri 6dcdf08e0d fixed multi-dim module kernel launch
Change-Id: Id1d81f2375d058979ab526433f905cf0ea3d23d6
2016-11-11 12:25:23 -06:00
Ben Sander d666fbaafe Add option to deny peer access.
Also fix test.

Change-Id: I1b247f6c4271442b008e560669bca4daf8eb94c7
2016-11-10 23:12:48 -06:00
Ben Sander 6e54a600b6 Use forceUnpinnedCopy to resolve P2P corner cases.
Change-Id: I2aebb419881246cebb696bec87798635bc71acc2
2016-11-10 23:12:48 -06:00
Ben Sander 0eeaa3bcd5 Enable async copy again.
Also add HIP_FORCE_SYNC_COPY chicken bit.

Change-Id: I76a385410494b99bf27305d3c08f55dd81987565
2016-11-10 23:12:48 -06:00
Ben Sander e767e0032e Refactor copy and P2P logic.
Prefer use of source-engine for DMA copies, even if user submits copy
in a stream attached to a different device.
The stream is now used only for synchronization, and HIP
decides which engine is best suited to perform the
copy - typically the source copy engine.

HIP now makes decision on which engine should perform the copy
and passes this to HCC using new apis.
HIP has additional information about peer
visibility and will decide which agent should perform
the copy.

Change-Id: I0cf4cfebeae256e6ca795f08a7ed7130f4857d1f
2016-11-10 23:12:48 -06:00
Ben Sander f3d38c2615 Improve Peer support and testing.
Change-Id: Icadc65988aaf145a265587ab0357c5bf4d26f3eb
2016-11-06 03:22:36 -06:00
Ben Sander 06ecfa3975 Set forceHostCopyEngine for other copy dirs. Support HIP_FORCE_P2P_HOST
Also: more debug for copy and P2p.

Change-Id: I87030c525410e041b2a00baaf6c68e6c0977ff42
2016-11-04 19:53:23 -05:00
Ben Sander 926e63c655 Refactor resolve-mem step1
Change-Id: I7b8b2bbb56d7b31a97b48ebd42002641cd07a460
2016-11-04 09:37:56 -05:00
Ben Sander 00276d141e Add debug for Peer APIs. Enable PeerMemcpy APIs by default.
Change-Id: I46e39a9e7b07686a78484c1f3b5495b08e052fbb
2016-11-04 08:51:16 -05:00
Aditya Atluri f097b6ef81 added inter thread data movement intrinsics
Change-Id: I2a8a8ed49429cb7f96439bd28c4b83b5142737df
2016-11-01 16:37:33 -05:00
Ben Sander 3d0fa30183 Print short hipLaunchKernel correctly.
Change-Id: I6ca03d7c707cd03d6982199830213953d5855f17
2016-10-27 23:09:32 -05:00
Ben Sander 18dbafe6e8 Add initial hipProfileStart/Stop
And modify sample to show how to use.
Still needs some work to understand interaction with CXL.

Change-Id: I2579824d2dd7863ea23874d34f0dabb3cb305d3e
2016-10-27 23:09:32 -05:00