rocm-systems

Author	SHA1	Message	Date
shaoyunl	6cad92de6f	Added family ID for gfx1010 Change-Id: I1b9a2b5270e70d12f066906f4e6cfea2cbfc2110 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: shaoyunl <shaoyun.liu@amd.com>	2019-07-09 11:38:57 -04:00
Oak Zeng	3b014adccc	Device HDP flush test Change-Id: I1c19e44caeee4a6e59200dceb718896fcff9bf82 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>	2019-07-07 21:59:37 -04:00
Chris Freehill	d699039284	Make build_rocrtst.sh build all target kernels by default This will allow the default target list to be branch specific. Change-Id: If8ecc14e2b7fb5ed2eb25ab447480308d539b248	2019-07-05 19:30:07 -04:00
shaoyunl	664c6617ad	Added SP3 assembler support for gfx10 Change-Id: I31c1df0f6d5243089e2ec3db381a19362be18d6c Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> Signed-off-by: shaoyunl <shaoyun.liu@amd.com>	2019-07-05 10:40:54 -04:00
Yong Zhao	c27704ded9	kfdtest: Add core test category This will facilitate ASIC bringup, including under a simulation environment. Change-Id: Ie027a77a2498cba739fea51f404d9843ce8dbeae Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>	2019-07-02 22:28:23 -04:00
Jay Cornwall	ff8f439112	Handle traps, illegal instruction, memory violations through queue signal Report traps and fatal exceptions through a wavefront's amd_queue_t.queue_inactive_signal. Previously, only traps were reported and required the compiler to pass in the signal pointer in s[0:1]. The signal is obtained through a mapping from doorbell index to amd_queue_t*. The doorbell is fetched within a wavefront through the gfx9+ S_SENDMSG(MSG_GET_DOORBELL) instruction. Change-Id: I319b45f2e15dfcfe4db8f4065da1136e9539a42b	2019-07-01 22:59:41 -04:00
Jay Cornwall	6ed686ee29	Replace gfx9 SP3 trap handler with LLVM, fix IB_STS restore Assembler toolchains are moving from SP3 to LLVM. Replace trap handler source code with LLVM equivalent. Fix a trap issue with SQ_WAVE_IB_STS restore. Mostly harmless as all traps are currently considered fatal to the wavefront. Change-Id: Iacecd9dd31a1d96a083c8b8327f442f33c861f9f	2019-07-01 22:59:27 -04:00
Chris Freehill	8caa6c0b01	Temporarily disable Debug test Change-Id: Iabb238fcd78b9c2eb0c085b19ab93b8c9e538140	2019-06-29 04:55:35 -04:00
Yong Zhao	b507911ccd	kfdtest: Use SDMA engine information directly from the node Change-Id: Icd391c8e821fb0ff5a1094f21b880a97e6d417a3 Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>	2019-06-28 00:47:15 -04:00
Kent Russell	be6ff2cdff	Remove failing tests due to gfx1010 kernel merge BasicAddressWatch causes issues where KFDEvictTest and KFDQMTest.OverSubscribeCpQueues fail, and results in a GPU hang/reset. PM4EventInterrupt just hangs indefinitely. Remove them for now to allow the kernel merges to resume, and figure out what happened in the nv10 merge to cause it. Change-Id: I418f9561ecb3e71bc52ac48ea363fcbde82a8e2b	2019-06-27 10:19:46 -04:00
Sean Keely	299874f17d	Initial support for deallocation callbacks. Adds hsa_amd_register_deallocation_callback and hsa_amd_deregister_deallocation_callback to notify when HSA memory has been released. Change-Id: I1f33cee250ca890e5c2e7fddfa4479aa5874651d	2019-06-26 04:12:17 -05:00
Chris Freehill	081a2cc875	rocrtst fixes for hsa_signal cleanup and aql packet dispatch In several places aql packets were written to the queue all at once instead of writing the header atomically. These cases have been fixed. A few leaked hsa_signals have been addressed. Some duplicated code has been addressed. Addresses ROCMOPS-456 Change-Id: Ia1869bc370f92e49ac560301df47741d5f76978e	2019-06-21 17:34:10 -05:00
Felix Kuehling	62ee7b4112	Restore SDMA blacklist The SDMA blacklist should contain all tests that use SDMA. It will be applied to all ASICs that are known to have SDMA stability issues. Change-Id: I53e723382c12f99bddf9c535000e27737a7ea1f6 Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>	2019-06-21 16:08:22 -04:00
Oak Zeng	be9ac578ef	Re-enable HostHdpFlush test The bus error bug was fixed from kfd driver and Thunk Change-Id: Id02617fdc26f1c49307f90a0a939e05f22d739e7 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>	2019-06-21 11:52:07 -04:00
Oak Zeng	5d163cd821	Fix HostHdpFlush shader 1. Use s_mov_b32 to move 0xcafe to s18. s_movk_i32 is a sign-extension move instruction. 0xcafe will be extended to 0xffffcafe, which is not desired. 2. Add a wait after the s_load_dword instruction to make sure the memory read finishes before the next store instruction. Change-Id: I665d1d471019edfaba5693e07cdc567d4103573f Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>	2019-06-21 11:51:51 -04:00
Evgeny	6c0aaa2773	aqlprofile api fix Change-Id: I2a710040422c7853ece5472ea776442b25d69dcb	2019-06-19 23:14:27 -04:00
Sean Keely	bb980462e7	Fix IPC related hangs/faults in rocrtst. IPC was failing due to calling fork when HSA was open. The fix was correcting incomplete cleanup in several other tests. TestBase::Close (via CommonCleanUp) now checks that HSA is properly closed between tests. rocrtstPerf.Memory_Async_Copy uses hwloc which uses OpenCL which has no shutdown routine. Consequently this test cannot clean up properly. I added a hack to force HSA refcount to the value it should have if OpenCL were cleaning up but this leaks resources and potentially puts hwloc & OpenCL in a bad state. OpenCL loads LLVM which installs some exit handlers. Those handlers can't execute in a child process and can't be removed since OpenCL doesn't clean up. IPC hacks around this by aborting rather than exiting in the child process. Change-Id: I92326a73d7b11632208717d99728e6dafdc7d3ca	2019-06-19 01:03:52 -04:00
Philip Yang	4066dcd542	kfdtest: increase BigBufStressTest timeout and avoid VM fault If TTM eviction and restore happen, they may take a very long time with retries; the longest time was 5 minutes during my test. A packet may be submitted to the queue during eviction, so we have to increase the Wait4PacketConsumption timeout. The queue will continue to execute after eviction and restore. If we unmap the memory from the GPU while the queue is evicted, this will cause a VM fault. Change to unmap memory after the queue is destroyed. Change-Id: I1b44e2274ea7b83398b2e3293578dad6947cb5af Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2019-06-18 09:28:43 -04:00
Philip Yang	36776e9917	kfdtest: avoid BigBufStressTest run on NUMA node 0 Because the dma32 zone is on node 0, using all system memory on node 0 will cause TTM eviction to free the dma32 zone for other devices which only work with 32-bit physical addresses. The TTM eviction and restore may take too long and cause a queue timeout. When running on other NUMA nodes, the default NUMA memory policy is MPOL_PREFERRED, which means TTM will get pages from the local node first and then get the remaining pages from other nodes. Checking /proc/buddyinfo can confirm this. Reset NUMA bind to all after the test. Change-Id: I39b373c07a2d5aa396f5c7602bffabab0481930f Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2019-06-18 09:28:20 -04:00
Sean Keely	0c0e634458	PTHREAD_STACK_MIN may differ from system parameters. Restrict stack adjustment to non-default stack requests and allow stack growth within reason (20MB cutoff). Change-Id: I320280c711402ac29683e94c7246b7c32c797611	2019-06-17 21:04:17 -05:00
Sean Keely	4b22d24346	Revert to SystemClockCounter for HSA system time. CPUClockCounter is not NTP adjusted (CLOCK_MONOTONIC_RAW) so should be better for measurements. However, it is implemented with syscall while CLOCK_MONOTONIC is implemented via vDSO. The latency increase becomes significant when language layers make corresponding clock measurements. Reverting to CLOCK_MONOTONIC will reduce latency and allow small duration events to be measured at the cost of incorporating NTP frequency skew errors. NTP may adjust frequency by 500ppm so limits us to ~3 decimals in elapsed time. Change-Id: I920b9f707f47109d80d6c256c475638c03fb8d76	2019-06-17 21:07:26 -04:00
Cole Nelson	3f2d2e67c9	kfdtest: Blacklist multiple tests on gfx900/20 PSDB and other Jenkins jobs are currently failing on several kfd tests. This is blocking user throughput for screening patches by PSDB. Blacklist multiple tests and submit JIRAs. KFDIPCTest.BasicTest (ROCMOPS-459) .CMABasicTest (ROCMOPS-460) .CrossMemoryAttachTest (ROCMOPS-461) KFDMemoryTest.BigBufferStressTest (ROCMOPS-462) KFDQMTest.MultipleSdmaQueues (ROCMOPS-463) (ROCMOPS-416) KFDEvictTest.BurstyTest (ROCMOPS-464) Change-Id: I2c7cdeabc26654f39823201ce86d4113b3a98a0e Signed-off-by: Cole Nelson <cole.nelson@amd.com>	2019-06-16 19:24:22 -04:00
Chris Freehill	259a1bac18	Temporarily disable some failing tests Change-Id: Iee713bb963db812c36ce2568aee2a4f8409c52e5	2019-06-14 08:36:11 -05:00
Ori Messinger	fe4db33875	Remove passing blacklisted kfd tests This relates to the following commits: 1. commit `aa7c13264a` 2. commit `54807526b9` 3. commit `6df62c78b8` Change-Id: I3d0d3214baba403b4709b358132b6756a15f42d7 Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>	2019-06-12 06:14:46 -04:00
Sean Keely	bbb90bdfc9	Fix description of HSA_AMD_MEMORY_POOL_INFO_ACCESSIBLE_BY_ALL. Description was inconsistent with itself and code. Existing behavior returns HSA_AMD_MEMORY_POOL_INFO_ACCESSIBLE_BY_ALL == true for system memory pools only and system memory pools do require hsa_amd_agents_allow_access. Change-Id: I64b287bff9fdb21688aa169296e410edf1b209b5	2019-06-11 01:45:22 -04:00
Oak Zeng	888e1a7ae7	Use kfd fd to mmap mmio Change-Id: Iadd2e1ea46d0951aaa5a6cefbc7d42d1b2c1f653 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>	2019-06-10 21:07:45 -05:00
Oak Zeng	65d554f5e4	Thunk API to allocate queue GWS Change-Id: I6c5b109e2567cb71aed9245923cfcbeee6295ab2 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>	2019-06-10 21:07:45 -05:00
Oak Zeng	45d717d860	Add node property to report number of GWS Change-Id: I81263ca7ebfa3c0f9f1be78acfa0920e47d551b1 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>	2019-06-10 21:07:45 -05:00
Felix Kuehling	396a85e97b	kfdtest: Allocate PM4 queue and dispatch earlier KFDEvictTest.QueueTest Allocating these before the big memory allocations minimizes the chances of spurious out of memory errors. Change-Id: I94aff9ec7ea34d4dc98ae08ac4cf9dc335b3df7f Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>	2019-06-07 16:54:28 -04:00
Felix Kuehling	f474cf21cd	kfdtest: Reduce libdrm VRAM usage in eviction tests This reduces thrashing due to graphics submissions only and significantly speeds up the BasicTest when keeping idle compute processes evicted. In the BasicTest compute is always idle, so only one compute eviction and no restore is triggered. Then graphics submissions complete quickly without thrashing each other. Change-Id: Iae6da98903b20424a5097f235e1d09cf13e4b41b Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>	2019-06-07 16:54:28 -04:00
Felix Kuehling	6984f3e3b4	kfdtest: Add KFDEvictionTest.BurstyTest Change-Id: I748603b0b204ffc3ea33399ecbc022233a7447d3 Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>	2019-06-07 16:54:28 -04:00
Felix Kuehling	6f5379d315	kfdtest: Pass timeout parameter to BaseQueue::Wait4PacketConsumption Change-Id: I0e88db5ca8e6712e9efc419a10eb4c49cedb6f62 Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>	2019-06-07 16:54:28 -04:00
Evgeny	a06d96cef8	aqlprofile API: sdma blocks Change-Id: I619af8adc17706f808644180cdd5a5c785e052ec	2019-06-05 18:54:08 -05:00
Felix Kuehling	f5a094bc96	libhsakmt: Update kfd_ioctl.h Change-Id: Ibf165023b98787fdf295f50324e19aa062f2421d Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>	2019-06-03 19:15:49 -04:00
Evgeny	1be9298f72	adding new trace API Change-Id: I6c83b5789f5a6cdbb574d041c40d5a47229c7f1a	2019-06-01 14:33:59 -04:00
Eric Huang	47d1c17592	kfdtest: fix error injection failure in RAS test 1. umc error injection only accepts parameter "0 0". 2. flush output to file in order to make writing happen immediately. Change-Id: I8d3bde287caee6b90b6eec56c760f5a228be7595 Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>	2019-05-30 16:38:15 -04:00
Eric Huang	d278b2579e	kfdtest: fix debugfs path bug in RAS test The path was wrong based on assumption that GPU dri render node starts from 0, because if there is a VGA device on board, node 0 will be VGA and node 1 will be GPU. So the fix will look at the name of GPU minor node and find the correct primary node on which RAS debugfs entry exists. Change-Id: Icc5e63ce48698d5d29105c0417e3bec8afa0a7c8 Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>	2019-05-29 11:14:22 -04:00
Matt Arsenault	0016c6ce5b	Don't check VERSION_BUILD is defined Check if it is true or not. The string() call would define this to an empty string, which would pass. This would then leave a trailing - in the version string, which dpkg would error on during package installation. Change-Id: Ifb5fc15f5dde506e96bff7881a5d3f22d983406e	2019-05-29 11:09:31 -04:00
Sean Keely	22de0e7fb9	Allow hsa_status_string when HSA is closed. API is a stateless lookup of RO data and needed to interpret hsa_init error codes. Change-Id: If80cba2f697843d08e529da0f790acf3c37127a7	2019-05-24 22:40:03 -04:00
Sean Keely	9f81bdfbe1	Add exception and error safety for CreateThread. Change-Id: I82aaf64e039ca9614b4948deec1f87147f56279a	2019-05-24 22:39:55 -04:00
Matt Arsenault	22d29b55a4	Change include flag order Search the local src directories first. If using a system installed hsakmt, this would pick the installed hsa headers. Change-Id: I9746d6e9db1749a130e4d93e024556754a537083	2019-05-22 16:43:18 -07:00
Felix Kuehling	64b90261d9	libhsakmt: Enable invisible debug VRAM mappings by default Remove the HSA_DEBUG environment variable that controlled the creation of these mappings. This should allow the debugger to attach to a running process and access VRAM buffers through ptrace without having to do anything special. On processes that create many small VRAM mappings, this may cause regressions due to the per-process mmap limit. However, the sub-allocator in ROCr should consolidate most small allocations into 2MB blocks nowadays, for good TLB efficiency. So this is unlikely to cause problems. Change-Id: I929da1be0f6cb51ec00a02f3f241d16083e4d95f Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>	2019-05-17 18:28:14 -04:00
Sean Keely	a913549190	Correct pthread join/detach handling. Joined threads can not be joined more than once nor can they be detached. Thread library wait and close allows multiple waits and separate close so this fixes the pthread implementation. Change-Id: I0019271a438f11ed4c6c11854011f5c4f6e16b65	2019-05-16 12:14:06 -05:00
Philip Cox	608bc7c3a0	Fix type mismatch passed to queue suspend/resume The queue IDs passed over to the kernel via kfd_ioctl_dbg_trap_args->ptr should be a list of uint32_t's. Need to convert from the passed in 64 bit HSA_QUEUEID to 32 bit uint32_t's. Change-Id: I8718566d9f9ffc90ce0b2ecc129b10c49d73186a Signed-off-by: Philip Cox <Philip.Cox@amd.com>	2019-05-15 07:33:47 -04:00
Sean Keely	6e2a056e1b	Correlate errors for time stamps which predate process start. Small times may be given to time conversion if GPU clocks are used to accumulate elapsed time. Because HSA APIs deal in absolute time this leads to large conversion offsets of order system uptime. Variation in relative clock ratio estimation may be amplified in this case, destroying elapsed time measurements. This patch fixes the relative clock ratio used for times which predate the call to hsa_init. This correlates errors in such times allowing the elapsed time to be correctly computed. The effective maximum system uptime before elapsed time conversion becomes inaccurate is ~3.5 months. GPU event timestamps are good for process uptime of ~3.5 months. These are limited by double's mantissa precision. Change-Id: I48752ff354920439d91016d6f2b0c8ddfa60b445	2019-05-14 17:35:06 -04:00
Kent Russell	54e042eee1	Add missing gfx803 ID Change-Id: I9eca81f0f149ea924c3b81bd80680d7fd1ad7a6c	2019-05-13 09:03:06 -04:00
Sean Keely	06376e726b	Expose HDP flush registers. Exposed via agent info query. Only valid if fine grain PCIe memory is enabled. Change-Id: Ib4770901592ec047276458926a947737f9b93bb5	2019-05-11 00:04:47 -04:00
Oak Zeng	78e4ef17c2	Temporarily disable HostHdpFlush test Change-Id: I070cb3523a33b4efbfa7041fa2623059e1ff37bb Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>	2019-05-10 09:34:40 -04:00
Felix Kuehling	8f10c9375d	libhsakmt: Disable -Werror by default This can cause build failures on unknown or future compiler versions. Only enable it if explicitly enabled by an environment variable. This allows us to continue building with -Werror in internal builds with known compiler versions. Change-Id: Ic1cd9d223218cc4e4cddba49df93bb357c1cbd40 Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>	2019-05-07 16:06:51 -04:00
Philip Cox	b0d23aee16	fix suspend/resume logic in debug_trap code There was a mistake and RESUME was used when it should have been suspend in two places in the suspend resume code. This fixes that error. Change-Id: I69be733d7ae7c14ce5ee8af57a307976e4212d62	2019-05-07 06:56:00 -04:00
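
The HostHdpFlush shader fix listed above (commit 5d163cd821) turns on the difference between a 16-bit sign-extending move and a plain 32-bit move. The sketch below models that difference in Python. It is only an illustration of the arithmetic described in the commit message, not the assembler or hardware implementation; the helper names `s_movk_i32` and `s_mov_b32` are borrowed from the instruction mnemonics and are hypothetical Python functions.

```python
def sign_extend_16(value: int) -> int:
    """Sign-extend a 16-bit immediate to an unsigned 32-bit register value."""
    value &= 0xFFFF              # keep only the low 16 bits
    if value & 0x8000:           # bit 15 set: the immediate is negative
        value |= 0xFFFF0000      # replicate the sign bit into bits 16..31
    return value


def s_movk_i32(imm16: int) -> int:
    """Hypothetical model of a move that sign-extends its 16-bit immediate."""
    return sign_extend_16(imm16)


def s_mov_b32(imm32: int) -> int:
    """Hypothetical model of a move that copies a 32-bit value unchanged."""
    return imm32 & 0xFFFFFFFF


if __name__ == "__main__":
    # 0xcafe has bit 15 set, so the sign-extending move fills the upper half.
    print(hex(s_movk_i32(0xCAFE)))
    # The plain 32-bit move leaves the upper half zero.
    print(hex(s_mov_b32(0xCAFE)))
```

Under this model, any immediate with bit 15 set (0x8000 and above) comes out different from the two moves, which is consistent with the commit message's point that 0xcafe becomes 0xffffcafe.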


2959 commits