İşleme Grafiği

38 İşleme

Yazar SHA1 Mesaj Tarih
Sean Keely 8323b2e1d7 Add pooling for Signal ABI blocks (SharedSignal).
Makes better use of memory and greatly reduces mmap count.

Change-Id: Ib444cd1ccd144986adbcc7cec297a966e2c08bc7
2018-11-12 22:37:28 -06:00
Sean Keely e0839ab27e Implement SDMA copy rect for gfx9.
Fix pitch overflow due to small element detection.
Add wide pitch 2D copy handling.
Cleanup code duplication.

Change-Id: I93b1584aba8e5964957eb7ab3544df806ca3e2f9
2018-08-29 19:13:07 -04:00
Sean Keely 6c47780620 Experimental flag to swap copy agent for async copy APIs.
Adds env flag HSA_REV_COPY_DIR.  If set to 1 async copy will
copy from dst device to src device rather than from src to dst.

Change-Id: I3095642066fa026dc112c2eac06db9393341cd7e
2018-08-09 10:58:14 -04:00
Evgeny 0e0be791ec Tool load failure report changing to unconditional print bcos it's already is controlled with the env var
Change-Id: I91b400ba94575a32005e825e6b41bda05c55b357
2018-05-03 22:31:17 -05:00
Hari Thangirala 3e0cd85d69 Allow HSA_ENABLE_SDMA to override runtime defaults.
Change-Id: I2305304228010157bfb589c365f4a998577231cd
2018-04-10 12:56:48 -04:00
Sean Keely 7caf9633f6 Support large scratch allocations and reclaim.
Also improve small_heap used for scratch region allocation.

Change-Id: Ib7311b663b38968d88ebc355b81e12c0863dc541
2018-04-05 21:51:56 -04:00
Ramesh Errabolu 987f3f97aa Enable sDMA packet HDP Flush on Gfx9 and later devices
Change-Id: I85922e5266883ef7e9eed3565e2c3b209009d294
2018-04-02 11:47:59 -04:00
Sean Keely bd5dd47ca1 Defer creation of internal queues and blits until first needed.
Change-Id: I2e61d7e102f38389d806d9eb24beda910573157b
2018-02-09 14:02:07 -05:00
Sean Keely ca4c884306 Report library load errors in debug builds.
Change-Id: I24e63b15ad74fb86ecfe839f543800c2140c09d9
2017-12-05 18:49:33 -05:00
Sean Keely f312a7386e Exception support for Queue.
Remove "zombie" queue state and report queue creation failure via
exceptions.  Make Shared object a final container and support array
objects with Shared.  Add message printing to hsa_exception in
debug builds.

Change-Id: I459f38c80846018acbf45538874e95f91dd6b195
2017-11-08 15:50:02 -05:00
Sean Keely 0c7dde2d1f Add queue intercept support to the runtime.
Queue intercept is exposed as two tools-only APIs via the API
intercept table.

Change-Id: Iac9602ed3143974d85c3569e9092295ad18037f8
2017-11-08 15:50:01 -05:00
Sean Keely 476c8e36bf Fix assert in simple_heap.
Also add comments to clarify pointer info constraints.

Change-Id: I8d07831a0e953d667c84c96fe53ed07c18ba115c
2017-09-21 00:47:18 -04:00
Sean Keely 30fce248c6 Enable use of CLOCK_MONOTONIC_RAW for post 4.4 kernels.
Change-Id: I3c1f27c7e639df5128c36d81f715fa16e6c1cf13
2017-09-20 14:28:23 -04:00
Sean Keely 9dfdce5b3c Improve branch elimination in ScopedAcquire.
GCC can't reasonably be told that the lock ptr isn't null.  Adding a private bool
allows the branch to be eliminated, along with the bool.

Change-Id: I0605d69474d6a6e6951be93c0af1d8caf3f77124
2017-09-19 06:08:36 -04:00
Sean Keely 4275661682 Add env key to disable 2MB suballocation.
Change-Id: Icca3041c3578aa180a656c01aae62f2ad6e8b583
2017-09-19 06:08:36 -04:00
Sean Keely 117be0b55a Add suballocator for ordinary VRAM allocations smaller than 2MB.
Track pointer info for sub 2MB fragment allocations in allocation_map_.

Add fragment support to IPC.

Change-Id: I00cfc2e2fa289aac90a4718c392f9bb056a61a87
2017-09-19 06:08:36 -04:00
hthangir 87d2df3da3 Use non-RAW version in clock_getres to workaround bug in older kernels
Change-Id: Ice0606a42cd7054f0804baf4af3521ffae3b7d50
2017-09-14 13:56:15 -05:00
Sean Keely f1a661dedb Report tools library load failures in debug builds.
Change-Id: Ie1ff313e929fc46134e58730a1d370c5d7ace8db
2017-08-31 21:32:48 -04:00
Laurent Morichetti 0f05ef73ac Include <errno.h> for EBUSY
Change-Id: I9fa3417445866f3ce37af2169f623afa8e92e873
2017-08-31 07:32:51 -07:00
hthangir 9ee0108e58 Fix compilation issue reported with GLIBC 2.12 (RHEL 6.9)
Change-Id: I770b72ba1d61475a76aa72d0c52ebfb380db6019
2017-07-28 11:11:01 -04:00
Sean Keely 29b5b5c029 Correct handling of slow clocks under linux.
Change-Id: I9a1b08d5457caa6739220603bbd37b00febc64d7
2017-07-12 12:49:49 -04:00
Ramesh Errabolu 08e0bca567 Support Perf Cntrs (PMC) and Thread Trace (SQTT) over AQL queues
Change-Id: I716b722895d90b46914c31377e791ad602acecc1
2017-06-15 12:58:31 -04:00
Kenny Ho 5b4df54b10 Revert "Implement memory fault analysis through context save area"
This reverts commit 75c9506f9d.

Change-Id: Ibf11b764b383b9be291f3009a30550e1a1e2d115
2017-06-14 14:21:53 -04:00
Jay Cornwall 75c9506f9d Implement memory fault analysis through context save area
When a fatal memory fault occurs the scheduler context-saves all queues
in the process and notifies the runtime through the memory event. The
saved state contains all GPR/LDS data at the moment of the fault.

Retrieve this state and present it to the user if HSA_DEBUG_FAULT is set
to "analyze" and the wavefront caused the fault. If amdgcn-capable objdump
is in the PATH invoke this to disassemble code around the PC.

Queue lifetime is now managed by the runtime to allow querying the
context save state for all active queues.

Change-Id: I6fee662fad1c4f9aa125bf5c53d7d0ea1ab32f95
2017-06-13 23:12:28 -04:00
hthangir a0957bc679 The fallback path covers not just ARM64, need this for Power as well.
Change-Id: I7bbf76f77bd3ac47a0a0987c1e880e23834588e2
2017-06-07 14:45:29 -05:00
Sean Keely 426d41e27c Adjust signal sleep to reflect null kernel latency. Performance tested on Gromacs.
Change-Id: I3851148ee8544b15d840f2c26ca73a83f8d0df2e
2017-03-09 15:20:53 -05:00
Sean Keely 505d722b7d Fix Api table copy operation and tools version checking.
Change-Id: Ia76d16f3ea6d0abb931813f90bc3bc2119da5999
2017-02-07 14:26:20 -05:00
Jay Cornwall 9e575ea96a Add support for ARM
- Build system fixes
- No user-mode high-precision timer by default, use clock_gettime
- Use C11 aligned_alloc pending C++17 std::aligned_alloc

Change-Id: I268365bdfd11d1e817a89584b9e086ee5b86e1dc
2017-01-31 16:43:49 -08:00
Sean Keely 0e17cc2887 Allow reducing max occupancy (max scratch waves) when applications request large amounts of scratch.
Also emit error messages to stderr if no async queue error callback was registered and queue fault messages are enabled (on by default).
Queue fault messages are controlled with env key HSA_ENABLE_QUEUE_FAULT_MESSAGE.

Change-Id: I496487b8d048b83aa95b9784e92928211f167b17
2016-12-20 16:52:59 -06:00
Ramesh Errabolu eb2efb83d1 Initial set of changes for ThreadTrace
Change-Id: I07ce31f9b4f508cef0fc9ca6dadcf26b6c90361e
2016-11-21 23:40:56 -06:00
Jay Cornwall c30c25bd30 Fix miscellaneous warnings flagged by Clang
Change-Id: I85a45cb3b44e4379b31bcc56af061fd1571f2af5
2016-10-26 19:26:16 -05:00
Ramesh Errabolu c54304fe38 Print Debug Mesg if private segment memory request is illegal
Change-Id: I46351651b6b2bf14e26645440a4321bc941900b2
2016-09-08 11:16:09 -04:00
Sean Keely 1bc15bbf79 Switch atomic_helpers.h from C11 atomics to GCC __atomic builtins.
C11 atomics are not statically guaranteed to be lock free and so
may not be atomic with respect to atomic operations originating
outside the standard library, such as platform atomics.

C11 macros to statically discover always lock free operations
(ATOMIC_*_LOCK_FREE) do not cover uint64_t in GCC and
std::atomic<uint64_t> is not a type alias of any covered type.

All use of __atomic by atomic_helpers.h is statically checked to be
always lock free.

GCC builtin fencing does not appear to be strong enough for WC memory.
Added an option (enabled) to enforce consistency for WC memory on x64.

__sync builtin's were not used as they were declared legacy by GCC.

Added a strongly conservative option (ALWAYS_CONSERVATIVE) to enable
use of full memory fences in place of partial fences and compiler
driven processor specific optimization.

Change-Id: Id7aaaca626144070f58759f6a348cbee4612bbc0
2016-09-03 06:22:42 -04:00
Jay Cornwall acc5f15e4c Enable VM fault message by default on Linux
This option was disabled by default to address issues writing to stderr
in Windows applications. The lack of an error message for memory access
faults is confusing to users, however.

Enable the error message by default on Linux only.

Change-Id: I1f44ba42362f8874abdc7c8e63ddd54a855b5394
2016-07-30 10:10:14 -04:00
Besar Wicaksono (xN/A) TX [TEXT] c95f96a9e4 Add environment flag to enable sdma workaround that will wait for the sdma queue to be idle before updating the write pointer. Add class to manage environment flags.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1254004]
2016-04-01 17:13:45 -05:00
James Edwards (xN/A) TX 7d2bc9d113 Separate open source core runtime code from DK makefiles.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1250152]
2016-03-22 18:10:13 -05:00
James Edwards (xN/A) TX 7d1e6c3a57 Remove opensrc test files.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1249961]
2016-03-22 13:39:51 -05:00
James Edwards (xN/A) TX c9ffe0004e Check open source core runtime code into perforce. This includes license and README files.
[git-p4: depot-paths = "//depot/stg/hsa/drivers/hsa/runtime/": change = 1249136]
2016-03-20 15:39:40 -05:00