Commit graph

70 commits

Author SHA1 Message Date
Sean Keely 299874f17d Initial support for deallocation callbacks.
Adds hsa_amd_register_deallocation_callback and hsa_amd_deregister_deallocation_callback
to notify when HSA memory has been released.

Change-Id: I1f33cee250ca890e5c2e7fddfa4479aa5874651d
2019-06-26 04:12:17 -05:00
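The register/deregister pair described in commit 299874f17d can be pictured with a small sketch. This is not the ROCr implementation: the class `DeallocationNotifier`, its method names, and the use of integer addresses are all hypothetical, chosen only to show callbacks being attached to an allocation and fired once when it is released.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, List

Callback = Callable[[int], None]


class DeallocationNotifier:
    """Hypothetical bookkeeping for per-allocation release callbacks."""

    def __init__(self) -> None:
        # Maps an allocation's base address to the callbacks registered on it.
        self._callbacks: DefaultDict[int, List[Callback]] = defaultdict(list)

    def register(self, ptr: int, callback: Callback) -> None:
        self._callbacks[ptr].append(callback)

    def deregister(self, ptr: int, callback: Callback) -> bool:
        # Returns False when the callback was never registered for ptr.
        callbacks = self._callbacks.get(ptr, [])
        if callback in callbacks:
            callbacks.remove(callback)
            return True
        return False

    def free(self, ptr: int) -> None:
        # Notify every remaining listener once, then forget the allocation.
        for callback in self._callbacks.pop(ptr, []):
            callback(ptr)


released: List[int] = []


def record(ptr: int) -> None:
    released.append(ptr)


notifier = DeallocationNotifier()
notifier.register(0x1000, record)
notifier.register(0x2000, record)
notifier.deregister(0x2000, record)  # this listener must not fire
notifier.free(0x1000)
notifier.free(0x2000)
```

Only the allocation at `0x1000` still had a listener when it was freed, so only that address ends up in `released`.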
Sean Keely 4b22d24346 Revert to SystemClockCounter for HSA system time.
CPUClockCounter is not NTP adjusted (CLOCK_MONOTONIC_RAW) so should be 
better for measurements.  However, it is implemented with syscall while
CLOCK_MONOTONIC is implemented via vDSO.  The latency increase becomes
significant when language layers make corresponding clock measurements.
Reverting to CLOCK_MONOTONIC will reduce latency and allow small
duration events to be measured at the cost of incorporating NTP
frequency skew errors.  NTP may adjust frequency by up to 500ppm, which
limits us to ~3 decimals in elapsed time.

Change-Id: I920b9f707f47109d80d6c256c475638c03fb8d76
2019-06-17 21:07:26 -04:00
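The "~3 decimals" figure in commit 4b22d24346 follows directly from the 500 ppm bound. The sketch below only redoes that arithmetic; the function names are made up for illustration and nothing here reads an actual clock.

```python
import math

NTP_MAX_SLEW_PPM = 500  # figure quoted in the commit message


def worst_case_skew_seconds(elapsed_seconds: float,
                            slew_ppm: float = NTP_MAX_SLEW_PPM) -> float:
    """Largest error NTP frequency adjustment can add to an elapsed time."""
    return elapsed_seconds * slew_ppm * 1e-6


def trustworthy_decimals(slew_ppm: float = NTP_MAX_SLEW_PPM) -> int:
    """Number of leading decimal digits unaffected by the relative error."""
    relative_error = slew_ppm * 1e-6
    return math.floor(-math.log10(relative_error))


skew_for_1ms = worst_case_skew_seconds(1e-3)  # error on a 1 ms interval
decimals = trustworthy_decimals()
```

A relative error of 5e-4 leaves three leading decimal digits intact, and a 1 ms interval can be off by at most about half a microsecond.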
Sean Keely 6e2a056e1b Correlate errors for time stamps which predate process start.
Small times may be given to time conversion if GPU clocks are used to
accumulate elapsed time.  Because HSA APIs deal in absolute time this
leads to large conversion offsets of order system uptime.  Variation
in relative clock ratio estimation may be amplified in this case,
destroying elapsed time measurements.

This patch fixes the relative clock ratio used for times which predate
the call to hsa_init.  This correlates errors in such times allowing
the elapsed time to be correctly computed.

The effective maximum system uptime before elapsed time conversion becomes
inaccurate is ~3.5 months.  GPU event timestamps are good for process uptime
of ~3.5 months.  These are limited by double's mantissa precision.

Change-Id: I48752ff354920439d91016d6f2b0c8ddfa60b445
2019-05-14 17:35:06 -04:00
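The ~3.5 month limit in commit 6e2a056e1b is consistent with a double's 53-bit mantissa if timestamps are counted in nanoseconds. The nanosecond unit is an assumption made for this sketch; the commit message does not state the tick rate.

```python
MANTISSA_BITS = 53                      # IEEE 754 double precision
NANOSECONDS_PER_DAY = 86_400 * 10**9    # assumes 1 tick == 1 ns

# Largest tick count up to which every integer is exactly representable.
exact_limit_ns = 2 ** MANTISSA_BITS
exact_limit_days = exact_limit_ns / NANOSECONDS_PER_DAY


def survives_double_round_trip(timestamp_ns: int) -> bool:
    """True if converting to double and back loses no ticks."""
    return int(float(timestamp_ns)) == timestamp_ns


at_limit = survives_double_round_trip(exact_limit_ns)
past_limit = survives_double_round_trip(exact_limit_ns + 1)
```

Under that assumption the exact range is a little over 104 days, and one tick beyond it the round trip through a double already loses a nanosecond.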
Sean Keely 67376e06ab Report SRAM ECC errors through the system event handler.
Modify the system event handler to support multiple users.
Name memory fault reason codes.

Change-Id: I1b5979b36ab15637eb2be59a61e2d57e76d0a70e
2019-02-27 18:08:07 -05:00
Sean Keely 4e8597681b Cache KFD Events used by user allocated InterruptSignals.
Change-Id: I7f102f880fea9c78febe28cd262f93ee77f03184
2018-11-12 22:37:42 -06:00
Sean Keely 8323b2e1d7 Add pooling for Signal ABI blocks (SharedSignal).
Makes better use of memory and greatly reduces mmap count.

Change-Id: Ib444cd1ccd144986adbcc7cec297a966e2c08bc7
2018-11-12 22:37:28 -06:00
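Why pooling reduces the mmap count (commit 8323b2e1d7) can be shown with a toy pool that carves fixed-size blocks out of larger anonymous mappings. The class name, block size, and chunk size below are hypothetical and are not taken from the SharedSignal code.

```python
import mmap
from typing import List, Tuple

Block = Tuple[int, int]  # (chunk index, byte offset inside the chunk)


class BlockPool:
    """Hands out fixed-size blocks, mapping memory one chunk at a time."""

    def __init__(self, block_size: int = 64, blocks_per_chunk: int = 64) -> None:
        self.block_size = block_size
        self.blocks_per_chunk = blocks_per_chunk
        self.chunks: List[mmap.mmap] = []   # one anonymous mapping per chunk
        self.free_blocks: List[Block] = []

    def _grow(self) -> None:
        # A single mmap call serves blocks_per_chunk future allocations.
        chunk_index = len(self.chunks)
        self.chunks.append(mmap.mmap(-1, self.block_size * self.blocks_per_chunk))
        for i in range(self.blocks_per_chunk):
            self.free_blocks.append((chunk_index, i * self.block_size))

    def alloc(self) -> Block:
        if not self.free_blocks:
            self._grow()
        return self.free_blocks.pop()

    def free(self, block: Block) -> None:
        # Returned blocks are reused before any new chunk is mapped.
        self.free_blocks.append(block)


pool = BlockPool()
blocks = [pool.alloc() for _ in range(100)]
mmap_calls = len(pool.chunks)
```

One hundred blocks are served from two mappings here, where one mapping per block would have needed one hundred.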
Sean Keely 936ecd1885 Remove legacy SVM region concept.
Also rename blit_agent to region_gpu and add comments to clarify
its role in deprecated region API support rather than to do blits.

Change-Id: I80b1043db2e1c5d40a58fc801eef70a688ea9169
2018-11-09 06:27:53 -06:00
Sean Keely dda9c17b45 Move VM fault handler init to after all devices are registered.
During registration we must not call any function that depends on registered
data as the lists are not yet complete.  This includes signal allocation since
allocating shared GPU mapped memory depends on the list of GPUs.

Change-Id: I94d59e847802c546c2a5a0d9f55fe5ac3fd1d878
2018-11-09 03:10:08 -06:00
Sean Keely 9ec37b5103 Ensure runtime cleanup when hsa_init ref count reaches 0.
Delete the runtime object when the last hsa_shut_down occurs.

Change-Id: I2005d52d06702eaef166714fd5e471cc277924db
2018-10-22 19:32:00 -05:00
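The reference-counting behaviour from commit 9ec37b5103, together with the failed-init rollback from commit c9f0427cb0 further down this log, can be sketched as follows. `RuntimeHandle` and its members are hypothetical names; the real runtime is C++ and differs in detail.

```python
import threading
from typing import Any, Callable, Optional


class RuntimeHandle:
    """Creates the runtime on the first init, deletes it on the last shut_down."""

    def __init__(self, initialize: Callable[[], Any] = object) -> None:
        self._lock = threading.Lock()
        self._initialize = initialize
        self._runtime: Optional[Any] = None
        self.ref_count = 0

    def init(self) -> None:
        with self._lock:
            self.ref_count += 1
            if self.ref_count == 1:
                try:
                    self._runtime = self._initialize()
                except Exception:
                    # Roll the count back so a later init can retry cleanly.
                    self.ref_count -= 1
                    raise

    def shut_down(self) -> None:
        with self._lock:
            if self.ref_count == 0:
                raise RuntimeError("runtime is not initialized")
            self.ref_count -= 1
            if self.ref_count == 0:
                self._runtime = None  # last user gone: drop the runtime object

    @property
    def alive(self) -> bool:
        return self._runtime is not None


def failing_initialize() -> None:
    raise OSError("device not available")


handle = RuntimeHandle()
handle.init()
handle.init()
handle.shut_down()
alive_after_first_shutdown = handle.alive
handle.shut_down()
alive_after_last_shutdown = handle.alive

broken = RuntimeHandle(failing_initialize)
try:
    broken.init()
except OSError:
    pass
count_after_failed_init = broken.ref_count
```

The runtime object stays alive while at least one init is outstanding, and a failed init leaves the counter at zero.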
Sean Keely 757502ccd6 Report internal queue creation to tools.
Debug agent requires handles to internal queues for single step debugging.
Added tools only API hsa_amd_runtime_queue_create_register for reporting.

hsa_amd_runtime_queue_create_register sets a callback which is invoked
when internal queues are created.

Change-Id: Ia5190ae724fadba686c15f25b2cd085350eeff0e
2018-10-20 23:12:27 -04:00
Sean Keely 5975c465ad Fully initialize GPU agents before loading tools.
The debug agent requires the copy API and trap handler to be initialized
prior to loading.  Existing tools do not make use of internal queue or scratch
memory intercept which is what PostToolsInit allows.

PostToolsInit() will be removed in a following cleanup change.

Change-Id: If43377843808e3eff0defd9204910a67a852902f
2018-10-20 23:12:14 -04:00
Sean Keely 6852282a07 Refactor of Runtime::CopyMemory()
Change-Id: I32a7cb24d00660ff4471d121ef7b3c2eec8fced2
2018-10-20 14:38:50 -04:00
Sean Keely 1e0d690948 Use ptrinfo rather than apertures in hsa_memory_copy
Apertures now overlap with the change to 48bit addressing which
precludes using aperture checks to discover buffer ownership.
Switches to ptrinfo to decide which device owns a buffer.

This corrects faults in the legacy hsa_memory_copy api.

Change-Id: I5c7ce0216e1cdc96f836fc6fec9c3defdf4b9d90
2018-10-11 13:34:53 -04:00
Ramesh Errabolu 01eea21d6c Capture number of Numa Nodes present on system
Change-Id: Ic789a6b9da8e316cb483e50b0fe9faa03798f97c
2018-09-18 16:27:30 -05:00
Ramesh Errabolu f007870792 ROCr changes to enable small BAR P2P over xGMI
Change-Id: I6aaa3fe2565cdf7e15d58a7484d6bd5916ffff64
2018-09-17 22:54:40 -04:00
Sean Keely 2843988dd7 Remove redundant initialization.
LinkInfo is already initialized to zero in its default constructor.

Change-Id: Ifa4fb886cce9b474c6879c9c82744044ab394082
2018-08-29 19:36:07 -04:00
Sean Keely cd8e5c1da8 Expose ROCr build ID.
Adds HSA_AMD_SYSTEM_INFO_BUILD_VERSION=0x200 to hsa_system_info_t.
This returns a const char* pointing at the build string (git describe).

Change-Id: I73e6612482bf6ffc4037fd365808eb9211a650ad
2018-08-20 20:44:32 -05:00
Sean Keely 6c47780620 Experimental flag to swap copy agent for async copy APIs.
Adds env flag HSA_REV_COPY_DIR.  If set to 1 async copy will
copy from dst device to src device rather than from src to dst.

Change-Id: I3095642066fa026dc112c2eac06db9393341cd7e
2018-08-09 10:58:14 -04:00
Evgeny 0e0be791ec Change the tool load failure report to an unconditional print because it is already controlled by the env var
Change-Id: I91b400ba94575a32005e825e6b41bda05c55b357
2018-05-03 22:31:17 -05:00
Qingchuan Shi 49d2175c74 Debug support for queue errors.
1/ Revised debug event handler to handle different events.
2/ Added queue error handler using the callback in queue create, which prints wave info when a queue is in the error state.
3/ Preempt the queue instead of destroying it when the queue is in the error state.

Change-Id: Ib727d208de9caf1c72c76d42268483b24aaebde8
2018-04-20 14:25:16 -04:00
Sean Keely 6df9ba97ce Sequence queue error callbacks with queue destroy.
HSA v1.2 update.

Change-Id: I13975e71b2c1ea5b7738236f5d02df84312ad00c
2018-04-04 08:12:58 -04:00
Sean Keely 31c05d2fc7 Add exception safety to Runtime::Acquire.
Change-Id: Ia2a9baf08bb56971412f1ac3914592612de5f134
2018-02-28 05:21:07 -06:00
Sean Keely 95c926059d Improve fragment map reporting format.
Change-Id: I85d09d085b08de46271ec902c766a8609a4b921a
2018-02-09 14:03:03 -05:00
Sean Keely 9212e7a09f Emit fragment map and thunk ptr info with VM faults.
Change-Id: If1302f674df7a636529c64bf66dfdda755a70c32
2018-02-09 14:02:26 -05:00
Sean Keely 4b603e803d Improve loop variables.
Derived from github pull request by folklore1984.

Change-Id: I70cd3da131691543fed8bf913d6245d41c49280d
2017-11-28 20:36:22 -05:00
Evgeny 6e1b9288f6 aqlprofile API: removing from HSA hsa_api_trace/hsa_ext_interface
Change-Id: I12fac55ea9ccfdb119899bf9d000e3c8b0bf4bbb
2017-11-11 10:01:12 -06:00
Sean Keely a6d8a48cbc Add callback exception forwarding.
Modified callbacks for intercept queue, queue error, iterate agent and
iterate region.

Change-Id: I8bdd67f2312510ea7eb9caec93babca244938b40
2017-11-08 15:50:02 -05:00
Sean Keely 0c7dde2d1f Add queue intercept support to the runtime.
Queue intercept is exposed as two tools-only APIs via the API
intercept table.

Change-Id: Iac9602ed3143974d85c3569e9092295ad18037f8
2017-11-08 15:50:01 -05:00
Qingchuan Shi ce6aee01ed Add APIs to support debugging vm fault
1. Add hsa ext api hsa_amd_register_vmfault_handler for debugger to register callback in case of VM fault.
2. Extend hsa_ven_amd_loader API to:
   (1) iterate loaded code objects in executable:
       hsa_ven_amd_loader_executable_iterate_loaded_code_objects
   (2) get loaded code object info:
       hsa_ven_amd_loader_loaded_code_object_get_info
3. Make the id of hsa_queue the same as the one used in communication with thunk (for amd_aql_queue)

Change-Id: I68910809e59e24297350d262606f00e96c14bcbd
2017-10-28 21:48:26 -04:00
Sean Keely e9a6f2c3e6 Support hsa_amd_agents_allow_access on page fragments.
Since access may only be manipulated on whole pages, suballocator fragments must cooperate to set the page's access.
Since the KFD does not migrate memory on access changes this implementation makes agent access sticky across the requests in a fragmented page.

Change-Id: I88479ed45fb40e9782b704526a7b8ffb22e7bd76
2017-09-27 19:04:04 -05:00
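One way to read the "sticky" access described in commit e9a6f2c3e6 is that a page accumulates the union of every agent any of its fragments has asked for. The sketch below illustrates that reading only; `FragmentedPage`, the fragment labels, and the agent names are hypothetical, and the real runtime may track access differently.

```python
from typing import FrozenSet, Iterable, Set


class FragmentedPage:
    """A page shared by several suballocator fragments."""

    def __init__(self) -> None:
        # Agents granted access by any request so far.
        self._agents: Set[str] = set()

    def allow_access(self, fragment: str, agents: Iterable[str]) -> FrozenSet[str]:
        """Apply a per-fragment request and return the page's resulting access.

        Access is page-granular, so the request is merged into the page's set
        instead of replacing it; earlier grants are never revoked here.
        """
        self._agents |= set(agents)
        return frozenset(self._agents)


page = FragmentedPage()
first = page.allow_access("fragment_a", {"gpu0"})
second = page.allow_access("fragment_b", {"gpu1"})
```

After the second request both `gpu0` and `gpu1` can reach the page, even though `fragment_b` only asked for `gpu1`.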
Sean Keely 476c8e36bf Fix assert in simple_heap.
Also add comments to clarify pointer info constraints.

Change-Id: I8d07831a0e953d667c84c96fe53ed07c18ba115c
2017-09-21 00:47:18 -04:00
Sean Keely 117be0b55a Add suballocator for ordinary VRAM allocations smaller than 2MB.
Track pointer info for sub 2MB fragment allocations in allocation_map_.

Add fragment support to IPC.

Change-Id: I00cfc2e2fa289aac90a4718c392f9bb056a61a87
2017-09-19 06:08:36 -04:00
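The idea in commit 117be0b55a, serving small requests as fragments of a 2MB block and recording each fragment in an allocation map, can be sketched with a simple bump allocator. Everything named here (`FragmentAllocator`, the 256-byte alignment, the fake address space starting at 2MB) is an assumption for illustration, and the sketch omits freeing.

```python
from typing import Dict, Optional, Tuple

BLOCK_SIZE = 2 * 1024 * 1024  # 2MB threshold from the commit message


class FragmentAllocator:
    """Bump-allocates sub-2MB fragments out of whole 2MB blocks."""

    def __init__(self, alignment: int = 256) -> None:
        self.alignment = alignment
        self.blocks_reserved = 0
        self._next_base = BLOCK_SIZE          # pretend address space
        self._current_base: Optional[int] = None
        self._used = 0
        # fragment address -> (owning block base, requested size)
        self.allocation_map: Dict[int, Tuple[int, int]] = {}

    def _reserve(self, block_count: int) -> int:
        base = self._next_base
        self._next_base += block_count * BLOCK_SIZE
        self.blocks_reserved += block_count
        return base

    def alloc(self, size: int) -> int:
        if size <= 0:
            raise ValueError("size must be positive")
        if size >= BLOCK_SIZE:
            # Large requests bypass the suballocator and get whole blocks.
            base = self._reserve(-(-size // BLOCK_SIZE))
            self.allocation_map[base] = (base, size)
            return base
        rounded = -(-size // self.alignment) * self.alignment
        if self._current_base is None or self._used + rounded > BLOCK_SIZE:
            self._current_base = self._reserve(1)
            self._used = 0
        address = self._current_base + self._used
        self._used += rounded
        self.allocation_map[address] = (self._current_base, size)
        return address


allocator = FragmentAllocator()
a = allocator.alloc(4096)
b = allocator.alloc(100)
big = allocator.alloc(3 * 1024 * 1024)
```

The two small requests share one 2MB block, while the 3MB request reserves two more blocks of its own; the allocation map lets a later pointer lookup recover which block a fragment belongs to.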
Sean Keely f1a661dedb Report tools library load failures in debug builds.
Change-Id: Ie1ff313e929fc46134e58730a1d370c5d7ace8db
2017-08-31 21:32:48 -04:00
Sean Keely 0cb1e8cb35 Correct vm_fault signal cleanup.
Change-Id: Id2f14b911e3991a76771425bc09f38a613280e6b
2017-08-18 22:12:38 -04:00
Sean Keely dec5c52e07 Simplify pointer info version check.
Change-Id: I0ed363f1261ffc041547f313970ca67298ace56c
2017-08-12 03:14:39 -04:00
Sean Keely c9642cf7af Initial IPC signal support.
Added an API for creating signals with attributes.
Added two APIs for IPC operations on signals.
Initial use of exceptions for error handling.

Add ref counting to signals.
Removed spin loops from signal destructors.
Signals are no longer to be destroyed with delete, use DeleteSignal instead.
Added delete safety to doorbells.
Added secondary hsa_signal_t -> Signal* translation path for IPC enabled signals.

Change-Id: Id59065d002f0c2566b0a9425694da2ed27cb7d7f
2017-08-11 18:41:34 -05:00
Evgeny 287afd3a52 adding aqlprofile member to HsaApiTable
Change-Id: Id674186dfa2e83295a51f770ccc0400f1cb51a98
2017-08-09 16:09:39 -05:00
Sean Keely a0a3587345 Remove use of anonymous member in C builds.
Tools/CodeXL will retain older versions of structs if they need them.

Change-Id: I568d7b445778dd575ef71000b4b839300572288e
2017-07-12 16:40:00 -04:00
Sean Keely c9f0427cb0 Decrement hsa_init ref counter when init fails.
Change-Id: If9376344d4b559e601932d070731132c8450104e
2017-07-07 21:21:03 -05:00
Evgeny 4174f07fd1 hsa-runtime integration
Change-Id: I48968966ffe164218ebff88d0e3a1268e96bf1dd
2017-07-05 10:55:30 -04:00
Kenny Ho 5b4df54b10 Revert "Implement memory fault analysis through context save area"
This reverts commit 75c9506f9d.

Change-Id: Ibf11b764b383b9be291f3009a30550e1a1e2d115
2017-06-14 14:21:53 -04:00
Jay Cornwall 75c9506f9d Implement memory fault analysis through context save area
When a fatal memory fault occurs the scheduler context-saves all queues
in the process and notifies the runtime through the memory event. The
saved state contains all GPR/LDS data at the moment of the fault.

Retrieve this state and present it to the user if HSA_DEBUG_FAULT is set
to "analyze" and the wavefront caused the fault. If amdgcn-capable objdump
is in the PATH invoke this to disassemble code around the PC.

Queue lifetime is now managed by the runtime to allow querying the
context save state for all active queues.

Change-Id: I6fee662fad1c4f9aa125bf5c53d7d0ea1ab32f95
2017-06-13 23:12:28 -04:00
Sean Keely c3e2a88ade Add preferred agent info to pointer info struct.
Lookup blit agent via pointer info in memory_fill.

Change-Id: I02feaf68bb9726858e8cb0ede6bc5f2b3707f5af
2017-05-31 05:16:05 -04:00
hthangir 8aa19388a9 On GFX9+ amd_queue_t.scratch_backing_memory_location must store the queue's scratch backing store VA, not the offset.
Also fix permissions in a couple of files.

Change-Id: I4203f8e5a36406b20562d8943ea5c341847f039a
2017-04-18 22:37:56 -05:00
Sean Keely 8a5ff78be6 Remove comments, no functional change.
Change-Id: I923c037803a847352c2c50d9d47460cb0f01f22c
2017-03-28 18:22:49 -05:00
Sean Keely 7dfeee5074 Support async. queue errors and dynamic scratch without KFD events.
Change-Id: I4e9e7a37aa7b9c96b28ce79f562760283e02b1e0
2017-03-28 19:18:18 -04:00
hthangir ba3f1cb476 We should be using the "used" gcc attribute.
Change-Id: I1589273740ae66e8d7d8186a88e2c411a2e0425c
See: https://gcc.gnu.org/onlinedocs/gcc/Common-Variable-Attributes.html#Common-Variable-Attributes
2017-03-20 11:57:39 -04:00
Sean Keely 505d722b7d Fix Api table copy operation and tools version checking.
Change-Id: Ia76d16f3ea6d0abb931813f90bc3bc2119da5999
2017-02-07 14:26:20 -05:00
Chris Freehill 160f8c5880 HSA Enabled IPC support
Uncommented HSA IPC code.
Changed hsa_amd_ipc_memory_t to be 8 uint32_t's instead of 9 to
match spec

Change-Id: Id1523125e9b876a23c3743df1be29c98b47f6725
2016-12-15 19:16:29 -05:00
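The size change in commit 160f8c5880 is easy to check: eight `uint32_t` values occupy 32 bytes, nine would occupy 36. The `ctypes` layout below is a sketch; the field name `handle` is an assumption and is not confirmed by the commit message.

```python
import ctypes


class IpcMemoryHandle(ctypes.Structure):
    """Stand-in for hsa_amd_ipc_memory_t: eight 32-bit words."""
    _fields_ = [("handle", ctypes.c_uint32 * 8)]  # field name is assumed


class OldIpcMemoryHandle(ctypes.Structure):
    """The pre-fix layout with nine 32-bit words."""
    _fields_ = [("handle", ctypes.c_uint32 * 9)]


new_size = ctypes.sizeof(IpcMemoryHandle)
old_size = ctypes.sizeof(OldIpcMemoryHandle)
```

A handle of the wrong size would break any code that copies it between processes as an opaque fixed-size blob, which is why matching the spec matters here.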
Sean Keely 8081758a55 Add InterProcess memory sharing support.
Support is disabled pending KFD / Thunk readiness.

Change-Id: I55def748e3d56cbfcfa6e24983a0ab78567aa81d
2016-11-15 18:58:29 -06:00