Commit graph

218 commits

Author SHA1 Message Date
Ben Sander dee364cb08 Add DISABLE_COPY_EXT option. 2016-10-05 12:18:42 -05:00
Ben Sander 821080487a Add HIP_BLOCKING_SYNC environment var to control stream sync behavior. 2016-10-05 12:18:42 -05:00
Maneesh Gupta b951cc99ed Move include/* to include/hip/*
Change-Id: I7a7b2839b4df59c7a4c503550f99fdc9e45c0f54
2016-10-04 22:17:18 +05:30
Ben Sander 4ff6dc8f38 Refactor asyncCopy and syncCopy to fix deadlock case.
- Minimize time that locks are held.
- Eliminate copy code that locked stream and ctx at same time.
    - Stream was locked to ensure thread-safe enqueue to the queue.
    - Devices were locked to query peer-lists.

Change-Id: Ibe8880bb7fb995a3da8f90ff911f212d81525018
2016-09-27 15:45:40 -05:00
Ben Sander 6de9136002 Add debug option to print ThreadID with each message.
Also print messages with a single fprintf to prevent threads from
interleaving.

Change-Id: Ib3999fe6b1e67b4a16cd7dcde82f3dfc99dd48ff
2016-09-27 15:45:40 -05:00
Ben Sander 225e37fdc9 Fix signal resource issue.
Remove memory leak with new hc::completion_future.
Implement HIP_LAUNCH_BLOCKING with queue-level wait.

Change-Id: I45975f81c4d239fdeed7776970988d28449865dc
2016-09-26 16:47:32 -05:00
Ben Sander c769abcbeb Peer-to-Peer improvements.
- Bug fix for peer visibility.  Now contexts correctly detect when they can use SDMA for P2P vs staging buffers.
- Interface to new HCC copy_ext function.
- Improve context and peer print /debug options.
- Add comments and usage to hipPeerToPeer_simple test.
2016-09-22 14:21:19 -05:00
Ben Sander c645e53fdd Remove unpinned_copy code. Other cleanup.
Change-Id: Ie3f71439cf1ba729ef223d078917c403d3de879a
2016-09-22 14:21:19 -05:00
Ben Sander e0ce1d3954 Cleanup. Remove cfs, ihipSignal_t, staging buffer calls.
Change-Id: I8bb67c484e3a65be06a03665f059217930da2bed
2016-09-22 14:21:19 -05:00
Ben Sander 12cb1d88aa Cleanup: Remove HIP signal pool.
Change-Id: Icebfd0509d12396cc5933d5556d68b53e1be36e0
2016-09-22 14:21:19 -05:00
Ben Sander 7530fa6dbe Remove HIP command dependency tracking.
Change-Id: I991c13bc5108193959ba70f9f6f9c692c9ad3a5b
2016-09-22 14:21:19 -05:00
Ben Sander 8c4cecf367 Cleanup, remove preCopyCommand.
Change-Id: I3768d3789a99be8136b43179d4152fa1875665cb
2016-09-22 14:21:19 -05:00
Ben Sander 9c9b0ab555 Change HIP async copy to call av::copy_async.
Change-Id: I4274b63ced3940d5249c32bd9d156296529c70e8
2016-09-22 14:21:19 -05:00
Ben Sander da44f3f907 Use HCC's synchronous accelerator_view::copy
Replace large block of HIP code with a call to HCC av::copy().

Change-Id: Ic32e1801cf8d4cd116ac02b72c41b1a1e4b6065c
2016-09-22 14:21:19 -05:00
Ben Sander e843d8cb51 Remove USE_AV_COPY, USE_PEER_TO_PEER fallback paths.
Change-Id: I9c20173e62029c4caebabc98784c6d7697758e4f
2016-09-22 14:21:19 -05:00
Ben Sander ccc1bbe6b1 Remove HIP_STAGING_BUFFER
Code simplification/cleanup:
Remove stale fallback paths that use something besides the unpinned engine.
Remove HIP_STAGING_BUFFER env var - now is const 2, 0 no longer has
special meaning.

Change-Id: I7d24cdd1067dd0c244e87b6a83897cb135d307e7
2016-09-22 14:21:18 -05:00
Ben Sander 442d74f027 Move isLargeBar to UnpinnedCopyEngine constructor.
Change-Id: I7a7d3a40b1d4e0c6ec856658a6a70e5e70d287ce
2016-09-22 14:21:18 -05:00
Ben Sander e300cb4405 Refactor Staging Buffer CopyDeviceToHost
Use copyMode.  Embed algorithm selection inside the unpinned class.

Change-Id: Ic75fd5931717a3160904402794bbed3ccd445112
2016-09-22 14:21:18 -05:00
Ben Sander c532de9f5a Refactor staging buffer CopyHostToDevice.
- Move algorithm selection inside Unpinned class.
- Refactor function names.
- Use size_t for size thresholds.

Change-Id: Iac4de652ac9d49acbf527aa0849e388b8ecd8486
2016-09-22 14:21:18 -05:00
Ben Sander 83140f8423 Updates docs for hipHcc* functions, move to header 2016-09-22 13:05:47 -05:00
Aditya Atluri 7407cb2600 added more error codes to hipErrorGetString
Change-Id: I80c675905d94813502040fd0caa07985fa8c7dcc
2016-09-15 11:28:18 -05:00
Aditya Atluri 8110f562ab added new error reporting case
Change-Id: I5f0a37dbe396412f5602d04df19d538e451c2696
2016-09-15 10:50:26 -05:00
Aditya Atluri f03570d8cc Added signal management which passes stress tests
Change-Id: I7e1660a8ca2c5ee580a91f76eae9a58ca49f0457
2016-09-08 14:52:51 -05:00
Ben Sander 4e994a3025 Add hipStreamQuery
Change-Id: Ib0813b1065feba4fe9ae861d24cfc6f9c5f580be
2016-09-07 15:18:34 -05:00
Ben Sander 48b1f7a6ea refactor ihipPreLaunchKernel phase#1
- Fix calls to HIP_INIT_API to pass all function arguments.
- Change ihipFunction to follow coding convention:
    - leading underscore for member fields,
    - camelCase for member fields.
- move kernel print function inside ihipPreLaunchKernel.
- add HIP_TRACE_API_COLOR, control color of messages.
- add ihipLogStatus wrapper to hipDeviceSynchronize()

Change-Id: I20bbb644da213f821404648945197254e3648fc9
2016-09-07 15:18:34 -05:00
Aditya Atluri 2c2f6ab078 Fixed group and private memory size to AQL
Change-Id: I6e721f63fe5697b7b90a7d25add9aa024d9dc429
2016-09-07 12:57:18 -05:00
Ben Sander cdba60a566 Fix double-lock of stream on hipModuleLaunchKernel
Change-Id: I4ca164971c25f4eb8fbcca11d6258367bb3d2ab4
2016-09-02 12:47:49 -05:00
Ben Sander db9fe9f494 Only use ihipLogStatus from top-level HIP functions.
Change-Id: I07e9c088d5c16a79ed52cb008a798889a656016c
2016-09-02 09:46:59 -05:00
Ben Sander aa823871db Use create_blocking_marker for WaitEvent implementation
Change-Id: Ib3113f69a14e48b9fe0558d7b455148e478d8eed
2016-09-02 09:46:59 -05:00
Ben Sander e76a272d48 Refactor for stream->_av.
- Move _av into stream critical section. (HCC accelerator_view is not
  thread-safe but HIP stream is.)
- Refactored many places in code that need to acquire critical section.
  Some were previously thread races, e.g. enqueueing marker.

- Remove support for GRID_LAUNCH_VERSION < 20.
- Enable USE_AV_COPY based on HCC work-week.
- Review hipModule docs, some clarity/editing.

Change-Id: I3ce7c25ece048c3504f55ecd4683e506bb1fc8b6
2016-09-02 09:46:59 -05:00
Aditya Atluri 1769c4b4b2 remove HIP_INIT_API from ihipSynchronize
Change-Id: Ibe0739efe55573c023212d9c28ba847c777e434c
2016-08-29 21:42:22 -05:00
Ben Sander 21e5c25225 Refactor trace code for hipLaunchKernel.
- Use standard print functions for streams.
- Add HIP_INIT macro, for cases where we want to initialize HIP but not
  log an API (ihipPreKernelLaunch).

Change-Id: If43cf8a363d918bcd3722a2e6a965d4cfa2e03e7
2016-08-29 18:37:57 -05:00
Aditya Atluri 4152746e26 Added explicit memory copy direction apis
- Fixed stale printf in context api
- Added 4 sync memcpy apis
  1. hipMemcpyHtoD
  2. hipMemcpyDtoH
  3. hipMemcpyDtoD
  4. hipMemcpyHtoH
- Added test for added apis

Change-Id: I4a9c382445b62631f8d0bcbb9a670322288b72b1
2016-08-26 13:11:01 -05:00
Aditya Atluri 842553a6e1 Changed how hipEvent_t is typedefed internally
- Mapped hipEvent_t directly to ihipEvent_t* instead of a handle

Change-Id: I5a8bcca0ef962932e0738c03eb1fc914d23022ae
2016-08-25 14:34:41 -05:00
Aditya Atluri 8f0f97f8f9 Added stream synchronisation for hipLaunchModuleKernel
- The module kernel launch is now in sync with commands in its stream
- Moved launch kernel inside ihipStream

Change-Id: Ic00cfcf4882bf81b6203c36881a52575ea68b529
2016-08-22 14:17:55 -05:00
Rahul Garg a498753041 Added support for hipCtxSynchronize and hipCtxGetFlags,modified hipDeviceSynchronize
Change-Id: If7bac667a262fa8c0cb3dc93e97f2534855acd07
2016-08-22 16:15:27 +05:30
Ben Sander 89164259ab Context update.
- Remove tls_deviceID.
- Add first passing test.

Change-Id: If3e2f254abf589028cfe4f9e6369745f04160de0
2016-08-10 08:59:47 -05:00
Rahul Garg 2ac93c340d Changed StagingBuffer class to UnpinnedCopyEngine
Change-Id: I1e212bfc8030dcf225ecf78fd7b23fda9b1de92f
2016-08-09 21:29:42 +05:30
Rahul Garg 023b1ecf33 Moved sync copy decision logic to staging buffer class
Change-Id: I5c398772375fcc1f174a7597eea1215ce7bf80b4
2016-08-09 09:28:18 +05:30
Ben Sander 8f402132ba Add initial context implementation.
APIs: hipInit, hipCtxCreate.
Track TLS default ctx.  Set deviceID now changes the ctx.
Add first context test.

Change-Id: If1cb9989b5a04a36147e25e84904336c7b6f3d88
2016-08-08 17:49:02 -05:00
Ben Sander ed0a2c02fe Code cleanup, use camelCase where appropriate.
Change-Id: I5a7ec50df8bbb3e7a3b313c0b12e2dd55ae4a09c
2016-08-08 14:54:38 -05:00
Ben Sander cfdacab32f Split ihipCtx_t into ihipCtx_t and ihipDevice_t .
Major change to existing code base.
    Ctx holds streams, enables peers, and flags.
    Device holds accelerator, hsa-agent, device props.

Add hipCtx_t.

Add peer APIs that accept hipCtx_t (in addition to deviceId)

Compiles and passes directed tests.

Change-Id: Iddab1eb9edbf90caad2ef5959c6b811d658197f1
2016-08-08 11:55:57 -05:00
Ben Sander 2dc3d3238b Change Device->Ctx
Change ihipDevice_t -> ihipCtx_t (new)
Change ihipGetTlsDefaultDevice->ihipGetTlsDefaultCtx
Some other changes from device->ctx where appropriate.

Change-Id: I5c4ae93b2fd42c6303aa23d748eb166b7431925d
2016-08-07 21:47:12 -05:00
Ben Sander e7d7c5cbe8 Remove ihipStream_r::_device_index
Replace with direct pointer to device.  Cleaner, and prep
for transition to contexts.

Change-Id: I0e550f34412923d46c541c0a14bb7d29c3fd4b11
2016-08-07 20:47:06 -05:00
Rahul Garg fcb2fcce1e Region based apis to pool based api changes
Change-Id: If53019eebafe051ab4e811863995f78315297080
2016-08-05 15:05:57 +05:30
Ben Sander f43d02027e Remove faulty assert for kernelCnt==0
Change-Id: I8a925c95f48e857c0a31f44561499e90dc6df552
2016-08-01 13:38:47 -05:00
Aditya Atluri 9062ebcf3a Signal Fix: The signals in a stream are re-used
1. Before, the signal pool is increased depending on the usage
2. After, a static number of signals are allocated to the pool
Only these are used by hip in a stream
3. If the signals required are more than the pool size, the
stream has to wait to make sure all the signals are available
4. Once they are available, the stream can use them
5. Removed HIP_NUM_SIGNALS_PER_STREAM because of redundancy with HIP_STREAM_SIGNALS
6. Increased signal count from 2 to 32.
Future Work: Dynamically increase the pool size depending on the number of
streams allocated by the application. And, null stream should have more signals

Change-Id: I6be36e084f26bb04766fabf776c7210aee0f9e91
2016-07-28 23:01:35 -05:00
Ben Sander 666c227c7d Remove dead enqueueBarrier function.
Change-Id: Ib18fe6bd96ce24dbeb342961ddb5721f7d03f2b2
2016-07-28 22:48:22 -05:00
Ben Sander 02dd7a7399 Cleanup sync code.
Remove dead depFutures, enqueueBarrier call.
Rename some parms to reflect usage.
Add comments to better explain tricky parts of sync code.

Change-Id: I763296421d9c2b3b58fc8cef5f010b12ab49553c
2016-07-27 18:31:11 -05:00
Aditya Atluri 1859c6e515 Signal Fix: Added signal limit to allocSignal
1. Did not change the logic in allocSignal
2. Added guard to wait on signal limit

Change-Id: I78f29097e6a584b3c3d78319dac19869067bd1fe
2016-07-27 13:48:49 -05:00