Commit Graph

2930 Commits

Author SHA1 Message Date
Chris Freehill 3cd4461a7d gfx908 loader/isa related changes
Change-Id: I638d4b2b300ac5a99d4d31d4fadcfe9e1e3c7748


[ROCm/ROCR-Runtime commit: 6588165de1]
2019-07-23 03:41:27 -04:00
Chris Freehill 123dea7733 Add ISAREG entry for gfx908 for ECC not supported
* Also, re-enable rocrtst

Change-Id: I70106c5a1788818387e46f240d577cbe59bc89f4


[ROCm/ROCR-Runtime commit: 2c15bcac9d]
2019-07-22 21:50:09 -04:00
Chris Freehill a87ff82cad Initial gfx908 updates
Change-Id: I3d6307d6613a38861a95561b9ac68abaa5964b48


[ROCm/ROCR-Runtime commit: 447a30e985]
2019-07-22 17:25:06 -04:00
shaoyunl 8e30a50ffd KFDTest: Added gfx1010 SDMA fence packet support
Change-Id: I33d824353d77317363b73ddc52cd182f86b8bc66
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 5b44be1907]
2019-07-22 16:37:02 -04:00
shaoyunl a619145290 KFDTest : Add family ID when building SDMA packet
Some SDMA packet formats may differ among ASIC versions.

Change-Id: Ic7eda7554c23e3972e168480874ca67a92677346
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>


[ROCm/ROCR-Runtime commit: 395750264d]
2019-07-22 16:36:49 -04:00
Yong Zhao 22da7a83ab kfdtest: Submit to SDMA ring when using libdrm command submission
Because not all ASICs (like gfx908) have GFX rings, we should use SDMA
rings instead of GFX rings.

Change-Id: Ibcc9f9e555302ba4ce25ac76c2ca73b8c3962a58
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: 4baeef356f]
2019-07-19 16:58:22 -04:00
Sean Keely e2375b9328 Update README build instructions.
Change-Id: I595e629117adfb44afb2e829d1f975782238277e


[ROCm/ROCR-Runtime commit: 0721dfd2e7]
2019-07-19 14:17:47 -04:00
Sean Keely 3959e99131 Add deallocation callback test to rocrtst.
Change-Id: Ia20abd8f1f64213eea0c3c1c771cc229cf38fd5d


[ROCm/ROCR-Runtime commit: 4fafdcb00c]
2019-07-19 14:17:21 -04:00
Sean Keely b66fecd12f Adjust agentOwner in pointer info queries for locked memory.
agentOwner from thunk reflects the GPU which holds the device alias.
We need to return a CPU to better reflect that the memory is system memory.

Change-Id: I9233f8779a4bfd471f68dbbbce07ae4528412e18


[ROCm/ROCR-Runtime commit: 6e07bc8dc4]
2019-07-19 14:17:13 -04:00
shaoyunl db09beaa08 KFDTest: remove the usage of global g_TestGPUFamilyId
Adjust the KFDTest for multi-GPU support

Change-Id: Ib3491e3f645d35fdba6ab702d65fcc86f48d3958
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>


[ROCm/ROCR-Runtime commit: b4e834ab61]
2019-07-19 13:26:49 -04:00
shaoyunl 6f29801c64 KFDTest : Add gfx1xxx release_mem and acquire_mem packet support
Use the family ID as a parameter when constructing the packets

Change-Id: I6c1706954ab7b8cbb8bef2aab16edf21f5e1abf0
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>


[ROCm/ROCR-Runtime commit: e9882daf11]
2019-07-18 10:43:48 -04:00
shaoyunl c2d5d06c43 kfdtest: Add Gfx10 pm4 packet format
Add release_mem and acquire_mem pm4 packet format for nv

Change-Id: I172407c3418005922c17937e1e43f57d153ea732
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>


[ROCm/ROCR-Runtime commit: ff82d3a354]
2019-07-17 10:42:32 -04:00
Sean Keely 49e70a3ef5 PR from github user DiamondLovesYou.
Allow user specified profiles if the HSAIL note is not found.

Konstantin reviewed and approved.  HSAIL note is not generated by LLVM.

Change-Id: I40fbfbaedd6787b6a716507918f698d02007afe1


[ROCm/ROCR-Runtime commit: 465a8eb40b]
2019-07-16 13:55:38 -05:00
Ramesh Errabolu 9364c7ac0e Allocate fine-grained regions for Gpu devices that are members of Hives
Change-Id: Ibbed393aeac691793845d16d2f3fe2c3e5a7ec40


[ROCm/ROCR-Runtime commit: 4daee0c8a1]
2019-07-13 01:12:53 -04:00
Felix Kuehling 30e96da4f1 kfdtest: MMBench: Test a more useful range of buffer sizes
Currently the test only covers relatively small buffer sizes. It's
useful to test buffer sizes up to 1GB to see the impact of features
that target the efficiency of large buffer allocations and mappings.

Change-Id: I2e8d5afd482894dbe2166f32d38091199b9c15e6
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 4e9ff4393d]
2019-07-11 17:15:39 -04:00
shaoyunl 050b676533 Thunk: Add gfx1010 initial support
Add gfx1010 basic support on Thunk

Change-Id: Ie4c0922158c7f5e2951f8694f4b204f371f1aa23
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>


[ROCm/ROCR-Runtime commit: 02ccb9eb57]
2019-07-11 17:08:11 -04:00
Felix Kuehling f53e199ce1 kfdtest: Disable CheckZeroInitializationVram test
KFD will soon stop initializing VRAM allocations.

Change-Id: I901c736886bb3bd3b1b54a21d383ccd7907928fd
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 67c4fe230b]
2019-07-10 17:05:57 -04:00
Felix Kuehling 2ffc094890 kfdtest: Add multi-process oversubscription test
This test is designed to reproduce soft-hangs caused by HWS running
with oversubscription.


Change-Id: I49861522b3ff5ba50df5ddc968545c35ccb25353
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 5475e618e5]
2019-07-10 17:05:57 -04:00
Felix Kuehling d9f3f826dc kfdtest: Factor out multi-process test into a base class
Create KFDMultiProcessTest base class for tests forking multiple
child processes. Derive KFDEvictTest from that class.

Change-Id: Ie5f3362c45be2b807bf7a83839ab3820352a67f9
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 6704b051d2]
2019-07-10 17:05:57 -04:00
Philip Yang 9b451c41e8 fix mbind on NUMA system
mbind walks through pages to set up the vma memory policy, so we need to
call mmap to create the vma mappings first, then call mbind. mbind will do
nothing if the vma does not exist.

Also add a NUMA availability check before executing mbind, and return NULL
to hsaKmtAllocMemory if mbind fails.



Change-Id: I28ab661885d807ca51ef90e87230669dc80f10ec
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 67f366243d]
2019-07-09 17:53:30 -04:00
shaoyunl 92b73c10c5 Add gfx IsaGenerator
Change-Id: I93ccb889b4bb7f0f5921a90cebbc0550d1eb3f7d
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>


[ROCm/ROCR-Runtime commit: c6ed539b44]
2019-07-09 11:39:38 -04:00
shaoyunl 8232817dd9 Added family ID for gfx1010
Change-Id: I1b9a2b5270e70d12f066906f4e6cfea2cbfc2110
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>


[ROCm/ROCR-Runtime commit: 6cad92de6f]
2019-07-09 11:38:57 -04:00
Oak Zeng c214576bd3 Device HDP flush test
Change-Id: I1c19e44caeee4a6e59200dceb718896fcff9bf82
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 3b014adccc]
2019-07-07 21:59:37 -04:00
Chris Freehill 290dfd785f Make build_rocrtst.sh build all target kernels by default
This will allow the default target list to be branch
specific.

Change-Id: If8ecc14e2b7fb5ed2eb25ab447480308d539b248


[ROCm/ROCR-Runtime commit: d699039284]
2019-07-05 19:30:07 -04:00
shaoyunl 4a9ffdd56d Added SP3 assembler support for gfx10
Change-Id: I31c1df0f6d5243089e2ec3db381a19362be18d6c
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>


[ROCm/ROCR-Runtime commit: 664c6617ad]
2019-07-05 10:40:54 -04:00
Yong Zhao 7330d49568 kfdtest: Add core test category
This will facilitate ASIC bringup, including in simulation environments.

Change-Id: Ie027a77a2498cba739fea51f404d9843ce8dbeae
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: c27704ded9]
2019-07-02 22:28:23 -04:00
Jay Cornwall 60da601be4 Handle traps, illegal instruction, memory violations through queue signal
Report traps and fatal exceptions through a wavefront's
amd_queue_t.queue_inactive_signal. Previously, only traps were
reported and required the compiler to pass in the signal pointer
in s[0:1].

The signal is obtained through a mapping from doorbell index to
amd_queue_t*. The doorbell is fetched within a wavefront through
the gfx9+ S_SENDMSG(MSG_GET_DOORBELL) instruction.

Change-Id: I319b45f2e15dfcfe4db8f4065da1136e9539a42b


[ROCm/ROCR-Runtime commit: ff8f439112]
2019-07-01 22:59:41 -04:00
Jay Cornwall 822d838eae Replace gfx9 SP3 trap handler with LLVM, fix IB_STS restore
Assembler toolchains are moving from SP3 to LLVM. Replace trap handler
source code with LLVM equivalent.

Fix a trap issue with SQ_WAVE_IB_STS restore. Mostly harmless as all
traps are currently considered fatal to the wavefront.

Change-Id: Iacecd9dd31a1d96a083c8b8327f442f33c861f9f


[ROCm/ROCR-Runtime commit: 6ed686ee29]
2019-07-01 22:59:27 -04:00
Chris Freehill 970cca3731 Temporarily disable Debug test
Change-Id: Iabb238fcd78b9c2eb0c085b19ab93b8c9e538140


[ROCm/ROCR-Runtime commit: 8caa6c0b01]
2019-06-29 04:55:35 -04:00
Yong Zhao 7fb7eab2d4 kfdtest: Use SDMA engine information directly from the node
Change-Id: Icd391c8e821fb0ff5a1094f21b880a97e6d417a3
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: b507911ccd]
2019-06-28 00:47:15 -04:00
Kent Russell 9e301177f6 Remove failing tests due to gfx1010 kernel merge
BasicAddressWatch causes issues where KFDEvictTest and
KFDQMTest.OverSubscribeCpQueues fail, and results in a GPU hang/reset.
PM4EventInterrupt just hangs indefinitely. Remove them for now to allow
the kernel merges to resume, and figure out what happened in the nv10
merge to cause it.

Change-Id: I418f9561ecb3e71bc52ac48ea363fcbde82a8e2b


[ROCm/ROCR-Runtime commit: be6ff2cdff]
2019-06-27 10:19:46 -04:00
Sean Keely 872c359ba2 Initial support for deallocation callbacks.
Adds hsa_amd_register_deallocation_callback and hsa_amd_deregister_deallocation_callback
to notify when HSA memory has been released.

Change-Id: I1f33cee250ca890e5c2e7fddfa4479aa5874651d


[ROCm/ROCR-Runtime commit: 299874f17d]
2019-06-26 04:12:17 -05:00
Chris Freehill 9d70b6a420 rocrtst fixes for hsa_signal cleanup and aql packet dispatch
In several places AQL packets were written to the queue all at once
instead of writing the header atomically. These cases have been
fixed.

A few hsa_signal leaks have been addressed.

There was some duplication of code that has been addressed.

Addresses ROCMOPS-456

Change-Id: Ia1869bc370f92e49ac560301df47741d5f76978e


[ROCm/ROCR-Runtime commit: 081a2cc875]
2019-06-21 17:34:10 -05:00
Felix Kuehling 121ad3f820 Restore SDMA blacklist
The SDMA blacklist should contain all tests that use SDMA. It will
be applied to all ASICs that are known to have SDMA stability issues.

Change-Id: I53e723382c12f99bddf9c535000e27737a7ea1f6
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 62ee7b4112]
2019-06-21 16:08:22 -04:00
Oak Zeng 4b48c71c38 Re-enable HostHdpFlush test
The bus error bug was fixed in the KFD driver and Thunk.

Change-Id: Id02617fdc26f1c49307f90a0a939e05f22d739e7
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: be9ac578ef]
2019-06-21 11:52:07 -04:00
Oak Zeng 54388cabdc Fix HostHdpFlush shader
1. Use s_mov_b32 to move 0xcafe to s18. s_movk_i32 is a sign-extension move
instruction. 0xcafe will be extended to 0xffffcafe, which is not desired.
2. Add a wait to the s_load_dword instruction to make sure the memory read
finishes before the next store instruction.

Change-Id: I665d1d471019edfaba5693e07cdc567d4103573f
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 5d163cd821]
2019-06-21 11:51:51 -04:00
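
The sign-extension problem described in the HostHdpFlush shader fix above (54388cabdc) can be illustrated with a small model. This is a sketch only, not the shader code: the Python functions below are hypothetical stand-ins named after the two instructions, and they assume that s_movk_i32 sign-extends a 16-bit immediate to 32 bits (as the commit message states) while s_mov_b32 copies a 32-bit value unchanged.

```python
def s_movk_i32(imm16: int) -> int:
    """Model (assumed): sign-extend a 16-bit immediate to a 32-bit value."""
    imm16 &= 0xFFFF
    if imm16 & 0x8000:              # bit 15 set: the value is negative as int16
        return imm16 | 0xFFFF0000   # fill the upper 16 bits with ones
    return imm16


def s_mov_b32(value32: int) -> int:
    """Model (assumed): copy a 32-bit value without modification."""
    return value32 & 0xFFFFFFFF


if __name__ == "__main__":
    print(hex(s_movk_i32(0xCAFE)))  # upper half filled with ones
    print(hex(s_mov_b32(0xCAFE)))   # value preserved
```

Because bit 15 of 0xcafe is set, the sign-extending move fills the upper half of the register with ones, so a later comparison against 0xcafe would not match.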
Evgeny 87cdf00d09 aqlprofile api fix
Change-Id: I2a710040422c7853ece5472ea776442b25d69dcb


[ROCm/ROCR-Runtime commit: 6c0aaa2773]
2019-06-19 23:14:27 -04:00
Sean Keely 904723af7c Fix IPC related hangs/faults in rocrtst.
IPC was failing due to calling fork when HSA was open.  The fix
was correcting incomplete cleanup in several other tests.

TestBase::Close (via CommonCleanUp) now checks that HSA is properly
closed between tests.

rocrtstPerf.Memory_Async_Copy uses hwloc which uses OpenCL which
has no shutdown routine.  Consequently this test cannot clean up
properly.  I added a hack to force HSA refcount to the value
it should have if OpenCL were cleaning up but this leaks resources
and potentially puts hwloc & OpenCL in a bad state.

OpenCL loads LLVM which installs some exit handlers.  Those handlers
can't execute in a child process and can't be removed since OpenCL
doesn't clean up.  IPC hacks around this by aborting rather than exiting
in the child process.

Change-Id: I92326a73d7b11632208717d99728e6dafdc7d3ca


[ROCm/ROCR-Runtime commit: bb980462e7]
2019-06-19 01:03:52 -04:00
Philip Yang ae92e8fdff kfdtest: increase BigBufStressTest timeout and avoid VM fault
If TTM eviction and restore happen, retries may take a very long time;
the longest observed during my test was 5 minutes. There is a chance that
a packet is submitted to the queue during eviction, so we have to increase
the Wait4PacketConsumption timeout.

The queue will continue to execute after eviction and restore. If we
unmap the memory from the GPU while the queue is evicted, this will cause
a VM fault. Change to unmap memory after the queue is destroyed.



Change-Id: I1b44e2274ea7b83398b2e3293578dad6947cb5af
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 4066dcd542]
2019-06-18 09:28:43 -04:00
Philip Yang 3edf77b5c9 kfdtest: avoid BigBufStressTest run on NUMA node 0
Because the dma32 zone is on node 0, using all system memory on node 0
will cause TTM eviction to free the dma32 zone for other devices that only
work with 32-bit physical addresses. The TTM eviction and restore may take
too long and cause a queue timeout.

When running on other NUMA nodes, the default NUMA memory policy is
MPOL_PREFERRED, which means TTM will get pages from the local node first
and then get the remaining pages from other nodes. Checking /proc/buddyinfo
can confirm this.

Reset NUMA bind to all after the test.



Change-Id: I39b373c07a2d5aa396f5c7602bffabab0481930f
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 36776e9917]
2019-06-18 09:28:20 -04:00
Sean Keely 5d5d40fcf9 PTHREAD_STACK_MIN may differ from system parameters.
Restrict stack adjustment to non-default stack requests and allow
stack growth within reason (20MB cutoff).

Change-Id: I320280c711402ac29683e94c7246b7c32c797611


[ROCm/ROCR-Runtime commit: 0c0e634458]
2019-06-17 21:04:17 -05:00
Sean Keely ca44cbb3d9 Revert to SystemClockCounter for HSA system time.
CPUClockCounter is not NTP adjusted (CLOCK_MONOTONIC_RAW) so should be
better for measurements.  However, it is implemented with a syscall while
CLOCK_MONOTONIC is implemented via vDSO.  The latency increase becomes
significant when language layers make corresponding clock measurements.
Reverting to CLOCK_MONOTONIC will reduce latency and allow small
duration events to be measured at the cost of incorporating NTP
frequency skew errors.  NTP may adjust frequency by 500ppm, which limits
us to ~3 decimals in elapsed time.

Change-Id: I920b9f707f47109d80d6c256c475638c03fb8d76


[ROCm/ROCR-Runtime commit: 4b22d24346]
2019-06-17 21:07:26 -04:00
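
The "~3 decimals" figure in the SystemClockCounter revert above (ca44cbb3d9) follows from simple arithmetic on the 500 ppm bound. The sketch below works through it. The function names are hypothetical, and it assumes that 500 ppm is the worst-case relative frequency error over the measured interval.

```python
import math


def worst_case_skew_seconds(elapsed_seconds: float, skew_ppm: float = 500.0) -> float:
    """Upper bound on the absolute error of an interval measured with a skewed clock."""
    return elapsed_seconds * skew_ppm / 1_000_000


def trustworthy_digits(skew_ppm: float = 500.0) -> int:
    """Number of significant decimal digits unaffected by the relative skew."""
    relative_error = skew_ppm / 1_000_000
    return math.floor(-math.log10(relative_error))


if __name__ == "__main__":
    print(worst_case_skew_seconds(1.0))  # error bound for a 1 s interval
    print(trustworthy_digits())          # digits that can be trusted
```

A relative error of 500 ppm is 5e-4, so a one-second interval can be off by up to half a millisecond and only about three significant digits of an elapsed time are reliable.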
Cole Nelson 41b06f1eee kfdtest: Blacklist multiple tests on gfx900/20
PSDB and other jenkins jobs are currently failing on several kfd tests.
This is blocking user throughput for screening patches by PSDB.
Blacklist multiple tests and submit JIRAs.

KFDIPCTest.BasicTest (ROCMOPS-459) .CMABasicTest (ROCMOPS-460) .CrossMemoryAttachTest (ROCMOPS-461)
KFDMemoryTest.BigBufferStressTest (ROCMOPS-462)
KFDQMTest.MultipleSdmaQueues (ROCMOPS-463) (ROCMOPS-416)
KFDEvictTest.BurstyTest (ROCMOPS-464)

Change-Id: I2c7cdeabc26654f39823201ce86d4113b3a98a0e
Signed-off-by: Cole Nelson <cole.nelson@amd.com>


[ROCm/ROCR-Runtime commit: 3f2d2e67c9]
2019-06-16 19:24:22 -04:00
Chris Freehill 74eb2440c3 Temporarily disable some failing tests
Change-Id: Iee713bb963db812c36ce2568aee2a4f8409c52e5


[ROCm/ROCR-Runtime commit: 259a1bac18]
2019-06-14 08:36:11 -05:00
Ori Messinger 95ccc6f000 Remove passing blacklisted kfd tests
This relates to the following commits:

1. commit 931dd817fa
2. commit 34e6346848
3. commit 880119d3a3

Change-Id: I3d0d3214baba403b4709b358132b6756a15f42d7
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>


[ROCm/ROCR-Runtime commit: fe4db33875]
2019-06-12 06:14:46 -04:00
Sean Keely ba3ec88220 Fix description of HSA_AMD_MEMORY_POOL_INFO_ACCESSIBLE_BY_ALL.
Description was inconsistent with itself and code.  Existing behavior
returns HSA_AMD_MEMORY_POOL_INFO_ACCESSIBLE_BY_ALL == true for system
memory pools only and system memory pools do require hsa_amd_agents_allow_access.

Change-Id: I64b287bff9fdb21688aa169296e410edf1b209b5


[ROCm/ROCR-Runtime commit: bbb90bdfc9]
2019-06-11 01:45:22 -04:00
Oak Zeng 2f9c7afcfe Use kfd fd to mmap mmio
Change-Id: Iadd2e1ea46d0951aaa5a6cefbc7d42d1b2c1f653
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 888e1a7ae7]
2019-06-10 21:07:45 -05:00
Oak Zeng ef47fe0e1e Thunk API to allocate queue GWS
Change-Id: I6c5b109e2567cb71aed9245923cfcbeee6295ab2
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 65d554f5e4]
2019-06-10 21:07:45 -05:00
Oak Zeng b17d287432 Add node property to report number of GWS
Change-Id: I81263ca7ebfa3c0f9f1be78acfa0920e47d551b1
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 45d717d860]
2019-06-10 21:07:45 -05:00
Felix Kuehling 82670ee7fc kfdtest: Allocate PM4 queue and dispatch earlier in KFDEvictTest.QueueTest
Allocating these before the big memory allocations minimizes the chances
of spurious out of memory errors.

Change-Id: I94aff9ec7ea34d4dc98ae08ac4cf9dc335b3df7f
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 396a85e97b]
2019-06-07 16:54:28 -04:00