Commit graph

2959 Commits

Author SHA1 Message Date
James Edwards a1353acd85 Update the ROCt CMakeList.txt files to build both runtime and devel packages.
Change-Id: I01b6e4e5db91dd5f56ffea54c548e10f1f4aae5d
2017-07-19 01:16:13 -04:00
Felix Kuehling c48ff6b482 Make HSA_DISABLE_CACHE work on gfx900
Change-Id: I624390bfa70b2ff4cefd1bbdf8d960b7121f22bb
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-07-17 15:13:52 -04:00
Felix Kuehling c7bd7733e5 Align large buffers to BigK or huge-page boundary
This should allow us to take advantage of BigK fragments and huge pages
and improve TLB efficiency for VRAM allocations. Huge pages only work
with 4-level page tables (gfx900 and up). BigK fragments work on older
GPUs.

Change-Id: I02e1fbf74de554e16fdaf44e44d03b47df45c3b0
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-07-17 12:04:05 -04:00
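The alignment described in the commit above can be sketched roughly as follows. This is an illustration only, not the Thunk's actual code: `align_up` and `pick_alignment` are hypothetical helpers, and the 64 KiB fragment and 2 MiB huge-page sizes are assumed values chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed sizes for illustration only; real values depend on the GPU. */
#define PAGE_SIZE_4K  ((uint64_t)4 << 10)
#define BIGK_FRAGMENT ((uint64_t)64 << 10)
#define HUGE_PAGE_2M  ((uint64_t)2 << 20)

/* Round `value` up to the next multiple of `alignment`.
 * `alignment` must be a non-zero power of two. */
static uint64_t align_up(uint64_t value, uint64_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical policy: use the largest boundary that the buffer can fill. */
static uint64_t pick_alignment(uint64_t size)
{
    if (size >= HUGE_PAGE_2M)
        return HUGE_PAGE_2M;
    if (size >= BIGK_FRAGMENT)
        return BIGK_FRAGMENT;
    return PAGE_SIZE_4K;
}
```

A caller would compute `align_up(address, pick_alignment(size))` before mapping the buffer, so that a large VRAM allocation starts on a boundary the page tables can cover with a single large entry.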
Chris Freehill a12c5628ea Added dispatch time, async copy and test template rocrtst tests
Change-Id: I57a844ee65c36bd61616ee6d60d358303f51db56
2017-07-17 10:30:26 -05:00
Felix Kuehling dc2c52be78 Align imported graphics buffers
Imported graphics buffers are most likely images. Align them for
tiled image access. 64KB seems to do the trick.

This fixes VM faults with OpenCL graphics interop.

Change-Id: I7f60e205d93fff9407e0d00d3dbb02cc4990b863
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-07-17 11:19:08 -04:00
Evgeny 08d5efe29d moving hsa-amd-aqlprofile to ssh://gerritgit/hsa/ec/aqlprofile
Change-Id: Ic42752ca41f877db02aa5a5d8d617cd67cce8956
2017-07-14 14:59:42 -05:00
Evgeny ab67b8511b hsa_ven_amd_aqlprofile.h: include <hsa.h> fix
Change-Id: Idfd2fdde112d502d4b4a3365512ec601f7e56a5b
2017-07-12 15:43:58 -05:00
Sean Keely a0a3587345 Remove use of anonymous member in C builds.
Tools/CodeXL will retain older versions of structs if they need them.

Change-Id: I568d7b445778dd575ef71000b4b839300572288e
2017-07-12 16:40:00 -04:00
Sean Keely bc0bd00746 Fix queue interception in tools.
1. Correct amd::AqlQueue::ExecutePM4 to support interception.
2. Minor fixes to AqlPacket and SoftCP.
3. Minimal change to disable interception of runtime internal queues.

Change-Id: I103fece2ebf9a188d27f01e61221c737405d7253
2017-07-12 16:39:43 -04:00
Sean Keely 29b5b5c029 Correct handling of slow clocks under Linux.
Change-Id: I9a1b08d5457caa6739220603bbd37b00febc64d7
2017-07-12 12:49:49 -04:00
Sean Keely 3e50adc7ce Properly order signal copy agent tagging with copy operation.
Change-Id: Ic428c958551279fbea1b2449afba930b82804ede
2017-07-11 13:10:00 -04:00
Sean Keely c9f0427cb0 Decrement hsa_init ref counter when init fails.
Change-Id: If9376344d4b559e601932d070731132c8450104e
2017-07-07 21:21:03 -05:00
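The fix named in the commit above follows a common reference-counting pattern, sketched here under assumptions. `runtime_acquire`, `do_init`, and `init_refcount` are hypothetical names, not the runtime's real symbols, and the sketch ignores thread safety for brevity.

```c
#include <assert.h>
#include <stdbool.h>

static int init_refcount = 0;

/* Hypothetical stand-in for the real initialization work. */
static bool do_init(bool should_succeed)
{
    return should_succeed;
}

/* Take a reference; the first caller performs initialization.
 * Returns 0 on success, -1 if initialization failed. */
static int runtime_acquire(bool should_succeed)
{
    assert(init_refcount >= 0);
    if (init_refcount++ == 0) {
        if (!do_init(should_succeed)) {
            /* Roll the count back so a later call retries initialization
             * instead of assuming the runtime is already up. */
            init_refcount--;
            return -1;
        }
    }
    return 0;
}
```

Without the rollback, a failed first call would leave the counter at 1, and every later call would skip initialization while reporting success.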
Amber Lin ac468f676c Replace lock file with shared memory
Performance counters have limited slots for concurrent profiling. We
need a mechanism to synchronize slot access across different
processes. A lock file was first used for this access control. It
exposed a Red Hat bug: /var/lock, a symbolic link to /run/lock, is
not writable by others. To avoid this bug and to simplify the code,
POSIX shared memory is created to replace the lock file usage. Access
to the shared memory is controlled by semaphores.

Change-Id: I1e13c17f0e042fdfe6657afe8b3c88db7e84d292
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-07-06 13:48:34 -04:00
Evgeny 4174f07fd1 hsa-runtime integration
Change-Id: I48968966ffe164218ebff88d0e3a1268e96bf1dd
2017-07-05 10:55:30 -04:00
Jay Cornwall 4fbffcdd9c Always allocate space for control stack at beginning of save area
Hardware block testing is done with the workgroup state offset
initialized to the control stack size on all ASICs. MEC microcode
assumes this space is available when the workgroup state offset is
reset after a context restore event.

Fixes context save area overrun when the full save area is used.

Change-Id: I8eeb62f97140c6fe409fe78b4497d833584feea8
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
2017-07-04 10:12:02 -04:00
Harish Kasiviswanathan dc6ece67fd Fix fd leak if application forks
If the application forks, close the fd inherited from the parent.

Change-Id: I48e4157d5f0d6f04d07ecb23b719a23934687cdb
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
2017-06-30 15:59:41 -04:00
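One way to detect an inherited descriptor after a fork is to remember which process opened it. The sketch below assumes that approach; the commit does not say how the Thunk detects the fork, and `device_open`, `drop_inherited_fd`, `dev_fd`, and `owner_pid` are hypothetical names.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

static int dev_fd = -1;      /* hypothetical device file descriptor */
static pid_t owner_pid = -1; /* process that opened dev_fd */

/* Open the device and record the opening process. Returns 0 on success. */
static int device_open(const char *path)
{
    assert(dev_fd == -1);
    dev_fd = open(path, O_RDONLY);
    if (dev_fd < 0) {
        dev_fd = -1;
        return -1;
    }
    owner_pid = getpid();
    return 0;
}

/* If `current_pid` is not the process that opened the descriptor, the
 * descriptor was inherited across fork(): close it and reset the state.
 * Returns 1 if a descriptor was closed, 0 otherwise. */
static int drop_inherited_fd(pid_t current_pid)
{
    if (dev_fd >= 0 && current_pid != owner_pid) {
        close(dev_fd);
        dev_fd = -1;
        owner_pid = -1;
        return 1;
    }
    return 0;
}
```

A library entry point would call `drop_inherited_fd(getpid())` before touching the device, so a forked child reopens the device instead of sharing the parent's descriptor.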
Harish Kasiviswanathan 4d0697bf65 Honor ReadOnly bit in HsaMemFlags
Change-Id: I456cde81384bf0f4bf055711d94b731179706d28
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
2017-06-28 15:02:21 -04:00
Amber Lin 897c4e2fff Replace printf/fprintf with pr_xxx
Libraries normally don't print messages. We use pr_err, pr_warn,
pr_info, and pr_debug to print messages to stderr when prints are
enabled for debugging.

Change-Id: I9caf719343aa618c88e7b500f9737a46702e424a
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-06-28 10:47:35 -04:00
Amber Lin ccfe739929 Introduce debug level to Thunk
Existing Thunk has printf/fprintf in the code while normally libraries
don't print any message. This patch introduces a print mechanism similar to
how the Linux kernel prints to console based on the log level. The default
is not to print any message, but setting HSAKMT_DEBUG_LEVEL will enable the
prints.

Change-Id: Ic071e122d35a82260218e9914cde4815e69df742
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-06-26 16:33:17 -04:00
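A level-gated print mechanism of the kind the commit above describes could look like the sketch below. The level numbers are borrowed from the Linux kernel's log levels and are an assumption here, as are the helper names `parse_debug_level`, `debug_level_init`, and `should_print`. Only the environment variable HSAKMT_DEBUG_LEVEL and the pr_err/pr_warn/pr_info/pr_debug names come from the commit messages.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed levels, modeled on the Linux kernel's numbering. */
enum { LVL_NONE = -1, LVL_ERR = 3, LVL_WARN = 4, LVL_INFO = 6, LVL_DEBUG = 7 };

static int debug_level = LVL_NONE; /* default: print nothing */

/* Convert the variable's text to a level; anything invalid disables prints. */
static int parse_debug_level(const char *text)
{
    int value;

    if (text == NULL)
        return LVL_NONE;
    value = atoi(text);
    return (value >= LVL_ERR && value <= LVL_DEBUG) ? value : LVL_NONE;
}

/* Read HSAKMT_DEBUG_LEVEL once, for example when the library is opened. */
static void debug_level_init(void)
{
    debug_level = parse_debug_level(getenv("HSAKMT_DEBUG_LEVEL"));
}

static int should_print(int level)
{
    assert(level >= LVL_ERR && level <= LVL_DEBUG);
    return debug_level >= level;
}

#define pr_at(level, ...) \
    do { if (should_print(level)) fprintf(stderr, __VA_ARGS__); } while (0)
#define pr_err(...)   pr_at(LVL_ERR, __VA_ARGS__)
#define pr_warn(...)  pr_at(LVL_WARN, __VA_ARGS__)
#define pr_info(...)  pr_at(LVL_INFO, __VA_ARGS__)
#define pr_debug(...) pr_at(LVL_DEBUG, __VA_ARGS__)
```

With this sketch, running a program with `HSAKMT_DEBUG_LEVEL=4` would emit `pr_err` and `pr_warn` messages to stderr and suppress `pr_info` and `pr_debug`.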
Evgeny c533229bc1 Block list extending
Change-Id: Id17efde25fce287296e80f2b37c77b15aa59b561
2017-06-23 16:37:02 -04:00
Amber Lin 13aadde56e Use environment variable to force gfx version
For experimental purposes, we need an option to change compute capability
by forcing the GfxIp version. This patch allows the environment
variable HSA_OVERRIDE_GFX_VERSION=major.minor.stepping to replace the
default version. For example:
export HSA_OVERRIDE_GFX_VERSION=9.0.1

Change-Id: I90cfbd43619d9d3aebf53321d4e058f01bcd7088
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-06-22 18:15:17 -04:00
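Parsing a major.minor.stepping string such as the one in the commit above can be sketched as follows. `struct gfx_version` and `parse_gfx_version` are hypothetical names, and the strictness of the check (rejecting trailing characters) is a design choice for this example rather than documented Thunk behavior.

```c
#include <assert.h>
#include <stdio.h>

struct gfx_version {
    unsigned major;
    unsigned minor;
    unsigned stepping;
};

/* Parse "major.minor.stepping" into `out`.
 * Returns 0 on success, -1 if the text is missing or malformed. */
static int parse_gfx_version(const char *text, struct gfx_version *out)
{
    unsigned major, minor, stepping;
    char trailing;

    assert(out != NULL);
    if (text == NULL)
        return -1;
    /* The trailing %c only matches if extra characters follow the version,
     * so exactly three conversions means a clean "a.b.c" string. */
    if (sscanf(text, "%u.%u.%u%c", &major, &minor, &stepping, &trailing) != 3)
        return -1;
    out->major = major;
    out->minor = minor;
    out->stepping = stepping;
    return 0;
}
```

A caller would pass `getenv("HSA_OVERRIDE_GFX_VERSION")` and keep the default version whenever the function returns -1.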
Evgeny 8618bf7e2c minor fixes, debug output, comments, using env vars, dead code
Change-Id: I08ad73b561709c1818d78a9191c96d6ad141a609
2017-06-22 18:04:26 -04:00
Evan Quan 5b3c9f0b31 Revert "Change gfx900 compute capability to 9.0.1"
This reverts commit 5114a9368b.


Change-Id: Id9c4f43462820bf09f25674fa30e6eb04098808e
Signed-off-by: Evan Quan <evan.quan@amd.com>
2017-06-20 15:36:09 +08:00
Amber Lin 6e113e2634 Free control stack correctly
ctl_stack_copy is allocated from malloc. It should be freed by free.

Change-Id: Ib924da20200d91f52f106fe173464d47862759a8
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-06-19 09:01:35 -04:00
Ramesh Errabolu 08e0bca567 Support Perf Cntrs (PMC) and Thread Trace (SQTT) over AQL queues
Change-Id: I716b722895d90b46914c31377e791ad602acecc1
2017-06-15 12:58:31 -04:00
Kenny Ho 5b4df54b10 Revert "Implement memory fault analysis through context save area"
This reverts commit 75c9506f9d.

Change-Id: Ibf11b764b383b9be291f3009a30550e1a1e2d115
2017-06-14 14:21:53 -04:00
Evgeny 35b376e2ee GFX8 API
Change-Id: I9d0c430e4199f043226c8897f3320a7973cbdeda
2017-06-14 12:24:28 -04:00
Jay Cornwall 75c9506f9d Implement memory fault analysis through context save area
When a fatal memory fault occurs the scheduler context-saves all queues
in the process and notifies the runtime through the memory event. The
saved state contains all GPR/LDS data at the moment of the fault.

Retrieve this state and present it to the user if HSA_DEBUG_FAULT is set
to "analyze" and the wavefront caused the fault. If amdgcn-capable objdump
is in the PATH invoke this to disassemble code around the PC.

Queue lifetime is now managed by the runtime to allow querying the
context save state for all active queues.

Change-Id: I6fee662fad1c4f9aa125bf5c53d7d0ea1ab32f95
2017-06-13 23:12:28 -04:00
Amber Lin 5114a9368b Change gfx900 compute capability to 9.0.1
9.0.1 is the XNACK-enabled gfx900 compile target. The compiler must generate
XNACK-enabled ISA.



Change-Id: Ic4987132ef9f8d06d9e2bcdb8f7eeb875cdd2b44
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-06-13 17:04:42 -04:00
Evgeny 25035b8d09 Adding HSA extension AMD AQL profile library, see Readme.txt
Change-Id: Icbc1e0fb0185642eabbab411a2138ea030d22be8
2017-06-13 16:18:06 -04:00
Evgeny da831502ab Adding GFXIP and kernel code object
Change-Id: Ieb2dfea8d9e909efac583f541730d77b7d0c9679
2017-06-13 14:58:29 -05:00
Harish Kasiviswanathan 5e26827d05 Support deb package build for other architectures
Use build machine architecture to build debian package. Useful for
building on Power8 and ARM64 machines.

Change-Id: I97fc80a6723b139e753019a355f11ced0bba0dd4
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
2017-06-13 12:12:37 -04:00
Konstantin Zhuravlyov d98e99949a Update hsa_isa_t entries
- Add 7.0.2 (consumer hawaii)
- Add 9.0.1 (gfx900 with xnack)
- Add 9.0.2
- Add 9.0.3

Change-Id: I6a07797027c4eaf47038837c5ae51e05b2aba0e4
2017-06-12 14:34:11 -04:00
hthangir a0957bc679 The fallback path covers not just ARM64; it is needed for Power as well.
Change-Id: I7bbf76f77bd3ac47a0a0987c1e880e23834588e2
2017-06-07 14:45:29 -05:00
Qingchuan Shi cd35fb280a Patch target name in code object for future-proofing
Change-Id: I6f12b5e5791bd1745ec3ab76d382fad50282e733
2017-06-05 19:08:27 -04:00
Chris Freehill 801bf4398c Added async. mem. copy sample.
Change-Id: I4fbb009181056c5f293d17720214b70588d44bdf
2017-06-05 17:20:51 -04:00
Amber Lin ceaaa1a57c A missing block in PMC
DB block was missing in the UUID look-up.

Change-Id: Ife5c25859bab6ec7fd99d0cd4d098ab044a08142
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-06-05 12:21:56 -04:00
Jay Cornwall 5db53ceda1 Enable SDMA on gfx9, disable on gfx8
gfx9 has passed qualification. gfx8 stability is under investigation.

Change-Id: Ia72211d47756399ecdfceafeb67c2ab34ebda834
2017-06-02 15:14:14 -05:00
Felix Kuehling 374bd89d8c Remove deprecated implementation of hsaKmtMapGraphicHandle
The KFD implementation has been removed and will not be upstreamed.
This API has been superseded by hsaKmtRegisterGraphicsHandleToNodes.

Change-Id: I5f2d8da3260974618cdb6ea3fdcd77d37b82c9cb
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-06-02 13:52:19 -04:00
Amber Lin 683fc96325 Implement hsaKmtGetQueueInfo interface
For items in HsaQueueInfo, control stack information comes from KFD, CU
mask information is maintained in Thunk, and others (queue detail error
and queue type extended) are ignored (value = 0) at this point.

Change-Id: Ib21370b0f52b2bb4ebe6a9b4b6ec6139cccb25ca
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2017-06-01 14:15:54 -04:00
Kent Russell b78e0e152a Clean up thunk code
Use checkpatch.pl to fix the majority of errors. Some that remain and
will be excluded:
Use of typedefs/externs/volatile/sscanf
Lines over 80 characters

Remaining errors are due to checkpatch.pl misinterpreting the * symbol with typedefs

Also use this opportunity to spell manageable properly

Change-Id: I0b335e9cb3e1eea38bee27eaa1f582b2c9b09b38
2017-05-31 14:38:59 -04:00
Chris Freehill 1170244ae2 Added IPC sample
Change-Id: I980c430d6e091eb1abbc0df89ca74c96348bcd37
2017-05-31 09:47:16 -04:00
Chris Freehill adf201d6a5 Added rocrinfo sample
Corrected a few formatting issues with binary_search.cpp

Change-Id: I9dcc0a231c6b8c424b44f4ab17032ff51b81a1ba
2017-05-31 09:46:06 -04:00
Sean Keely c3e2a88ade Add preferred agent info to pointer info struct.
Look up blit agent via pointer info in memory_fill.

Change-Id: I02feaf68bb9726858e8cb0ede6bc5f2b3707f5af
2017-05-31 05:16:05 -04:00
Sean Keely 59cc20d3cb Check mmap return address for allocation, not requested address.
Change-Id: Ifeb7b17976fc791e3256c70d57cb4d1324a8b960
2017-05-30 21:26:55 -05:00
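The point of the commit above is that mmap treats the requested address as a hint unless MAP_FIXED is used, so only the return value says where the mapping actually landed. A minimal sketch, with `reserve_range` as a hypothetical helper:

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict compiler modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Reserve `size` bytes of address space, preferably at `hint`.
 * Returns the address the kernel actually chose, or NULL on failure. */
static void *reserve_range(void *hint, size_t size)
{
    void *addr;

    assert(size > 0);
    addr = mmap(hint, size, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (addr == MAP_FAILED)
        return NULL;
    /* Use `addr`, not `hint`, from here on: the two may differ. */
    return addr;
}
```

Code that records `hint` as the allocation address would later access or unmap memory it does not own whenever the kernel placed the mapping elsewhere.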
Qingchuan Shi 77e5b30c41 remove finalizer usage from image ext
Change-Id: I282f02cedce790bf42f07c588fd50e346b9ba665
2017-05-29 20:44:52 -04:00
Sean Keely e38ff18990 Unmap GPUs when allow_access removes them from system pools.
Change-Id: Ib9eb88622fded43ebd9eddbf78ad6771a5b91e77
2017-05-17 20:58:05 -04:00
Felix Kuehling 8aeb933426 Add some additional gfx900 PCI IDs
Change-Id: I5f00f3b30a27285d75c606c1308abfe032ce1d02
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-05-11 16:39:19 -04:00
Chris Freehill 8161ebb915 Refactored performance test code
Commented and flattened binary search sample.

Change-Id: Ib783292207c956d16003195924a3bcfbbde5039f
2017-05-11 14:45:45 -04:00
Felix Kuehling ea58703ece Fix uninitialized memory bug in hsaKmtWaitOnMultipleEvents
Use calloc to allocate event data. Otherwise random data may be filled
in for events that haven't actually signalled. This could trigger the
VM fault handler in the Runtime when no VM fault actually happened and
lead to intermittent HSA conformance test failures.

Change-Id: Icf702970e73a485b50633703c1b164f87fbb8606
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2017-05-10 18:16:31 -04:00
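The difference between malloc and calloc that the commit above relies on can be shown with a small sketch. `struct event_data` and its fields are hypothetical and do not reflect the Thunk's real event layout.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical per-event record, filled in only for events that fired. */
struct event_data {
    uint32_t signaled;
    uint64_t fault_address;
};

/* Allocate zero-initialized records so that events which never fired read
 * as "not signaled" instead of whatever bytes malloc happened to return. */
static struct event_data *alloc_event_data(size_t count)
{
    assert(count > 0);
    return calloc(count, sizeof(struct event_data));
}

static int event_signaled(const struct event_data *event)
{
    return event->signaled != 0;
}
```

With malloc, `event_signaled` could return true for an untouched record, which matches the spurious VM fault handling the commit message reports.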