Commit Graph

520 Commits

Author SHA1 Message Date
Philip Yang ae92e8fdff kfdtest: increase BigBufStressTest timeout and avoid VM fault
If TTM eviction and restore happen, retries can take a very long time;
the longest I saw during testing was 5 minutes. A packet may be
submitted to the queue during eviction, so we have to increase the
Wait4PacketConsumption timeout.

The queue will continue to execute after eviction and restore. If we
unmap the memory from the GPU while the queue is evicted, this will cause
a VM fault. Change to unmap the memory after the queue is destroyed.



Change-Id: I1b44e2274ea7b83398b2e3293578dad6947cb5af
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 4066dcd542]
2019-06-18 09:28:43 -04:00
Philip Yang 3edf77b5c9 kfdtest: avoid BigBufStressTest run on NUMA node 0
Because the dma32 zone is on node 0, using all system memory on node 0
will cause TTM eviction to free the dma32 zone for other devices that only
work with 32-bit physical addresses. The TTM eviction and restore may take
too long and cause a queue timeout.

When running on other NUMA nodes, the default NUMA memory policy is
MPOL_PREFERRED, which means TTM will get pages from the local node first
and then get the remaining pages from other nodes. Checking
/proc/buddyinfo can confirm this.
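
To make the /proc/buddyinfo check concrete, here is a minimal sketch in C that sums the free pages reported on one line of that file. It is not part of the commit; the function name `buddyinfo_free_pages` is hypothetical, and the sketch assumes the usual line layout ("Node N, zone NAME" followed by one free-block count per order).

```c
#include <assert.h>
#include <stdio.h>

/* Parse one /proc/buddyinfo line such as
 *   "Node 0, zone    DMA32      3      1      2"
 * Column i holds the number of free blocks of 2^i pages.
 * Returns the total number of free pages, or -1 if the line does not parse.
 * (Hypothetical helper for illustration only.) */
long buddyinfo_free_pages(const char *line, int *node, char zone[32])
{
    int consumed = 0;
    long total = 0;
    unsigned long blocks;
    int n;
    const char *p;

    assert(line != NULL && node != NULL && zone != NULL);

    /* %n records how many characters the header consumed. */
    if (sscanf(line, "Node %d, zone %31s%n", node, zone, &consumed) != 2)
        return -1;

    p = line + consumed;
    for (int order = 0; sscanf(p, "%lu%n", &blocks, &n) == 1; order++) {
        total += (long)(blocks << order); /* blocks of 2^order pages */
        p += n;
    }
    return total;
}
```

Comparing the per-node totals before and after a large allocation would show which node the pages came from.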

Reset NUMA bind to all after the test.



Change-Id: I39b373c07a2d5aa396f5c7602bffabab0481930f
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 36776e9917]
2019-06-18 09:28:20 -04:00
Cole Nelson 41b06f1eee kfdtest: Blacklist multiple tests on gfx900/20
PSDB and other Jenkins jobs are currently failing on several kfd tests.
This is blocking user throughput for screening patches by PSDB.
Blacklist multiple tests and file JIRA tickets.

KFDIPCTest.BasicTest (ROCMOPS-459) .CMABasicTest (ROCMOPS-460) .CrossMemoryAttachTest (ROCMOPS-461)
KFDMemoryTest.BigBufferStressTest (ROCMOPS-462)
KFDQMTest.MultipleSdmaQueues (ROCMOPS-463) (ROCMOPS-416)
KFDEvictTest.BurstyTest (ROCMOPS-464)

Change-Id: I2c7cdeabc26654f39823201ce86d4113b3a98a0e
Signed-off-by: Cole Nelson <cole.nelson@amd.com>


[ROCm/ROCR-Runtime commit: 3f2d2e67c9]
2019-06-16 19:24:22 -04:00
Ori Messinger 95ccc6f000 Remove passing blacklisted kfd tests
This relates to the following commits:

1. commit 931dd817fa
2. commit 34e6346848
3. commit 880119d3a3

Change-Id: I3d0d3214baba403b4709b358132b6756a15f42d7
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>


[ROCm/ROCR-Runtime commit: fe4db33875]
2019-06-12 06:14:46 -04:00
Oak Zeng 2f9c7afcfe Use kfd fd to mmap mmio
Change-Id: Iadd2e1ea46d0951aaa5a6cefbc7d42d1b2c1f653
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 888e1a7ae7]
2019-06-10 21:07:45 -05:00
Oak Zeng ef47fe0e1e Thunk API to allocate queue GWS
Change-Id: I6c5b109e2567cb71aed9245923cfcbeee6295ab2
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 65d554f5e4]
2019-06-10 21:07:45 -05:00
Oak Zeng b17d287432 Add node property to report number of GWS
Change-Id: I81263ca7ebfa3c0f9f1be78acfa0920e47d551b1
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 45d717d860]
2019-06-10 21:07:45 -05:00
Felix Kuehling 82670ee7fc kfdtest: Allocate PM4 queue and dispatch earlier in KFDEvictTest.QueueTest
Allocating these before the big memory allocations minimizes the chances
of spurious out of memory errors.

Change-Id: I94aff9ec7ea34d4dc98ae08ac4cf9dc335b3df7f
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 396a85e97b]
2019-06-07 16:54:28 -04:00
Felix Kuehling f4af9cef34 kfdtest: Reduce libdrm VRAM usage in eviction tests
This reduces thrashing due to graphics submissions only and
significantly speeds up the BasicTest when keeping idle compute
processes evicted. In the BasicTest, compute is always idle, so
only one compute eviction and no restore is triggered. Then
graphics submissions complete quickly without thrashing each other.

Change-Id: Iae6da98903b20424a5097f235e1d09cf13e4b41b
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: f474cf21cd]
2019-06-07 16:54:28 -04:00
Felix Kuehling b3aa83930f kfdtest: Add KFDEvictionTest.BurstyTest
Change-Id: I748603b0b204ffc3ea33399ecbc022233a7447d3
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 6984f3e3b4]
2019-06-07 16:54:28 -04:00
Felix Kuehling ceba63cbe2 kfdtest: Pass timeout parameter to BaseQueue::Wait4PacketConsumption
Change-Id: I0e88db5ca8e6712e9efc419a10eb4c49cedb6f62
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 6f5379d315]
2019-06-07 16:54:28 -04:00
Felix Kuehling f013b274aa libhsakmt: Update kfd_ioctl.h
Change-Id: Ibf165023b98787fdf295f50324e19aa062f2421d
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: f5a094bc96]
2019-06-03 19:15:49 -04:00
Eric Huang ed1099161b kfdtest: fix error injection failure in RAS test
1. umc error injection only accepts parameter "0 0".
2. flush output to file in order to make writing happen
   immediately.

Change-Id: I8d3bde287caee6b90b6eec56c760f5a228be7595
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>


[ROCm/ROCR-Runtime commit: 47d1c17592]
2019-05-30 16:38:15 -04:00
Eric Huang c7c4d6d59b kfdtest: fix debugfs path bug in RAS test
The path was wrong because it assumed that GPU DRI render node
numbering starts from 0. If there is a VGA device on board, node 0
will be the VGA device and node 1 will be the GPU. The fix looks at
the name of the GPU minor node and finds the correct primary node
on which the RAS debugfs entry exists.

Change-Id: Icc5e63ce48698d5d29105c0417e3bec8afa0a7c8
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>


[ROCm/ROCR-Runtime commit: d278b2579e]
2019-05-29 11:14:22 -04:00
Felix Kuehling f6e1bfb83b libhsakmt: Enable invisible debug VRAM mappings by default
Remove the HSA_DEBUG environment variable that controlled the
creation of these mappings.

This should allow the debugger to attach to a running process and
access VRAM buffers through ptrace without having to do anything
special.

On processes that create many small VRAM mappings, this may cause
regressions due to the per-process mmap limit. However, the
sub-allocator in ROCr should consolidate most small allocations
into 2MB blocks nowadays, for good TLB efficiency. So this is
unlikely to cause problems.

Change-Id: I929da1be0f6cb51ec00a02f3f241d16083e4d95f
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 64b90261d9]
2019-05-17 18:28:14 -04:00
Philip Cox 3d13159fd6 Fix type mismatch passed to queue suspend/resume
The queue IDs passed over to the kernel via kfd_ioctl_dbg_trap_args->ptr
should be a list of uint32_t's.  Need to convert from the passed in
64 bit HSA_QUEUEID to 32 bit uint32_t's.
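
The mismatch matters because a kernel reading an array of 64-bit values as uint32_t's would see the high half of each ID as a separate entry. A minimal sketch of the narrowing step follows. The type `queue_id64_t` and the function `narrow_queue_ids` are hypothetical stand-ins, and how the real thunk derives each 32-bit ID is not shown in this log.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for a 64-bit queue ID type (assumption for this sketch). */
typedef uint64_t queue_id64_t;

/* Copy 64-bit queue IDs into a caller-provided uint32_t array.
 * Returns 0 on success, -1 if any ID does not fit in 32 bits. */
int narrow_queue_ids(const queue_id64_t *in, size_t count, uint32_t *out)
{
    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        if (in[i] > UINT32_MAX)
            return -1;            /* would be truncated, so refuse */
        out[i] = (uint32_t)in[i];
    }
    return 0;
}
```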

Change-Id: I8718566d9f9ffc90ce0b2ecc129b10c49d73186a
Signed-off-by: Philip Cox <Philip.Cox@amd.com>


[ROCm/ROCR-Runtime commit: 608bc7c3a0]
2019-05-15 07:33:47 -04:00
Kent Russell 84cc063225 Add missing gfx803 ID
Change-Id: I9eca81f0f149ea924c3b81bd80680d7fd1ad7a6c


[ROCm/ROCR-Runtime commit: 54e042eee1]
2019-05-13 09:03:06 -04:00
Oak Zeng 58d3a9f92a Temporarily disable HostHdpFlush test
Change-Id: I070cb3523a33b4efbfa7041fa2623059e1ff37bb
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 78e4ef17c2]
2019-05-10 09:34:40 -04:00
Felix Kuehling 1629c543db libhsakmt: Disable -Werror by default
This can cause build failures on unknown or future compiler versions.
Only enable it if explicitly enabled by an environment variable. This
allows us to continue building with -Werror in internal builds with
known compiler versions.

Change-Id: Ic1cd9d223218cc4e4cddba49df93bb357c1cbd40
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 8f10c9375d]
2019-05-07 16:06:51 -04:00
Philip Cox 00ec9fac6e fix suspend/resume logic in debug_trap code
There was a mistake and RESUME was used when it should
have been suspend in two places in the suspend resume
code.  This fixes that error.


Change-Id: I69be733d7ae7c14ce5ee8af57a307976e4212d62


[ROCm/ROCR-Runtime commit: b0d23aee16]
2019-05-07 06:56:00 -04:00
Philip Cox 596a2491c7 libhsakmt: Update wave suspend/resume API
This is updating to the new suspend and resume API for the
KFD and the thunk.  We now support passing in a list of queues
to suspend, and not just all of the queues for the process.

The kfdtest testcase was also updated so it still compiles.

Change-Id: I71d1b178476bd9df0c311bdedaa6a891528cebcf
Signed-off-by: Philip Cox <Philip.Cox@amd.com>


[ROCm/ROCR-Runtime commit: c2c1385e29]
2019-05-03 10:32:47 -04:00
Philip Cox 5e56c9a609 libhsakmt: Update HsaQueueInfo for GetQueueInfo
hsaKmtGetQueueInfo needs to return the control stack size, and the
wave state size for the debugger.  These changes are needed to support
returning the new values.

Change-Id: Ib4c60e0ea34446c06aef4a86996250989f348a69
Signed-off-by: Philip Cox <Philip.Cox@amd.com>


[ROCm/ROCR-Runtime commit: d21e9d5bbd]
2019-05-03 10:32:47 -04:00
Oak Zeng 3914fb9424 Host HDP flush test
Change-Id: I396ac021d15da972f4841d6d8f90d4b175e64ecd
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: b26580788b]
2019-05-03 09:31:41 -04:00
Felix Kuehling 4b8a5ead52 kfdtest: Make eviction tests more robust
- Run more graphics command submissions with shorter delay between
  them
- Synchronize after every graphics command submission
- Include the big VRAM BO in the BOList of the command submission
  to trigger more evictions
- In QueueTest, run AMDGPU command submissions concurrently with
  compute shader on the user mode queue
- Submit AMDGPU commands to GFX queue instead of compute queue to
  avoid deadlocks between user-mode and kernel-mode queues on the
  same pipe
- Allocate slightly less memory from KFD to avoid allocation errors
  due to fragmentation or memory leaks in previous tests
- Running only two processes maximizes the number of KFD evictions
  (probably because of lower chances of evicting non-KFD BOs)

Change-Id: If05d53f5fcf690b6488998a3f933f120ddaa71ee
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: c8d823eb10]
2019-05-02 17:34:11 -04:00
Oak Zeng dbfa65a604 Add MMIO_REMAP heap type
Add a MMIO_REMAP heap type and expose mmio virtual address
through HsaKmtGetNodeMemoryProperties



Change-Id: I1e585e6dfbec8fa7c85f1dda7b89b763a8e2c439
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 804aa90a22]
2019-04-30 15:40:50 -05:00
Oak Zeng 5acd835452 Fix return value of fmm_get_aperture_base_and_limit
Only return success when the aperture is valid

Change-Id: I63b97fd0450e1ff277cf45abc7a1be9f7a0c0d50


[ROCm/ROCR-Runtime commit: e4a6a01389]
2019-04-30 09:38:46 -05:00
Oak Zeng 8b70424b2f Map remapped mmio page to process space
HDP coherence registers are remapped at driver level
to an empty page in mmio space (the remapped mmio page).
This change allocates and maps the remapped mmio page to
process space.



Change-Id: I89c405c41870a79c5b58eea0d8e564aa35f55182
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: ae111689f0]
2019-04-30 09:00:16 -05:00
Yong Zhao c14a254f2e kfdtest: Add CreateDestroyCpQueue and CreateDestroySdmaQueue test
These two tests cover basic queue creation and destruction
without submitting packets to CP and SDMA user queues. During bringup,
they are valuable for untangling issues that arise in queue creation
from those that arise in packet execution, which are two very different kinds.

Because of those two tests, we also rename some existing tests as
follows:
CreateCpQueue             ->   SubmitPacketCpQueue
CreateSdmaQueue           ->   SubmitPacketSdmaQueue
CreateMultipleCpQueues    ->   MultipleCpQueues
CreateMultipleSdmaQueues  ->   MultipleSdmaQueues

Lastly, move MultipleCpQueues test closer to the CP queue section
rather than leaving it behind the SDMA queue section.

Change-Id: I110fb3f3fb21878339045dd1d1c8c9d61b8988b7
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: 5b18614eaf]
2019-04-23 15:08:17 -04:00
Kent Russell 56bbd862e6 Fix test if Render Node can't be found
If a Render Node can't be found, we should finish off all child
processes immediately, then return.

Trying to do the check before forking the children results in the test
failing as well, regardless of the status of finding the render node,
which is likely why the forking occurred first in the test's initial
creation. This way we ensure that things finish cleanly before moving to
the next test


Change-Id: I2e1b62fed25c30ff1f179612127c23960da4ee5e


[ROCm/ROCR-Runtime commit: e109ce541c]
2019-04-23 07:14:35 -04:00
Alex Voicu 87d0f4246b Macros are devious, wrap argument to prevent surprises.
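
The log does not show which macro was changed, so the following is only a generic sketch of the pitfall the title describes. `DOUBLE_UNSAFE`, `DOUBLE_SAFE`, and `macros_disagree` are hypothetical names invented for illustration.

```c
#include <assert.h>

/* Hypothetical macros; not the ones touched by the commit. */
#define DOUBLE_UNSAFE(x) (x * 2)   /* argument not wrapped */
#define DOUBLE_SAFE(x)   ((x) * 2) /* argument wrapped in parentheses */

/* Returns 1 when the two macros give different results for a + b. */
int macros_disagree(int a, int b)
{
    int unsafe = DOUBLE_UNSAFE(a + b); /* expands to (a + b * 2) */
    int safe = DOUBLE_SAFE(a + b);     /* expands to ((a + b) * 2) */

    /* The wrapped form always doubles the whole expression. */
    assert(safe == 2 * (a + b));
    return unsafe != safe;
}
```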
Change-Id: Ib99e7d2ec1e7a2802f4ae7946ba1fa92c9940a85


[ROCm/ROCR-Runtime commit: ee9831779c]
2019-04-19 07:06:28 -04:00
Alex Voicu 605a038c57 Fix Clang build
Change-Id: I0b51699c0a1368cf5813bd9d3cd4479139d23d6a


[ROCm/ROCR-Runtime commit: fdadae6745]
2019-04-19 07:06:28 -04:00
Andreas Schneider f3838c57bc cmake: Do not mess with CMAKE flags
Specifically, don't mess with CMAKE_SHARED_LINKER_FLAGS or
CMAKE_C_FLAGS

Change-Id: I73e287df5b80d440079c6b3abe8c401d492d11dd


[ROCm/ROCR-Runtime commit: 0045974858]
2019-04-19 07:06:28 -04:00
Andreas Schneider d698471b8a cmake: Do not strip targets in the release build
Distributions want to generate debuginfo packages, do not strip them! If
you want to do it during installation use 'make install/strip'!

Change-Id: I3983af24ce4f4ddb189ede0ed0820dfee83b6280


[ROCm/ROCR-Runtime commit: 8ccfa4c75c]
2019-04-19 07:06:28 -04:00
Andreas Schneider 4006e1f112 cmake: Create cmake config file
Another cmake project like hsa-runtime could just use:

find_package(hsakmt REQUIRED 1.9.0)

Change-Id: Ia1c9a80ef287facdd607382d69649b0718d687b4


[ROCm/ROCR-Runtime commit: b8a1331763]
2019-04-03 08:36:09 -04:00
Tom Stellard 540fa908f6 Separate build version from library version
This patch separates the build version (i.e. ROCm version) from the
library version used to set the SONAME of the shared objects.  This
prevents the SONAME from getting bumped each time there is a new ROCm
release without any change to the libhsakmt ABI.

1.0.6 was chosen as the library version since this was the
last library version used prior to switching to the ROCm version
numbers.

Change-Id: I7c29ae84d8a362a831e804569d8147ca65155cad


[ROCm/ROCR-Runtime commit: 006c2c248d]
2019-02-28 21:32:45 -08:00
Eric Huang cec44d33c9 kfdtest: fix and change in RAS test
1. The RAS error injection debugfs interface has changed; it now
uses ras_ctrl instead of *_err_inject.

2. Remove ASSERT_SUCCESS for fwrite, because fwrite returns
the number of items written, not an error number.

3. Throw an exception instead of returning to avoid a segmentation fault.
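
Item 2 can be illustrated with a minimal sketch in C. The helper `write_all` is hypothetical and is not the test's actual code. It treats fwrite as successful only when the returned item count equals the requested count, then flushes so the data reaches the file immediately.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write a string to an open stream and flush it.
 * fwrite() returns the number of items written, so success means
 * "count written equals count requested", not "returned 0".
 * Returns 0 on success, -1 on a short write or a flush failure. */
int write_all(FILE *fp, const char *text)
{
    size_t len;

    assert(fp != NULL && text != NULL);
    len = strlen(text);
    if (fwrite(text, 1, len, fp) != len)
        return -1;
    return fflush(fp) == 0 ? 0 : -1;
}
```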

Change-Id: I6c4d9c2f7e66719faec99abd1552105a08c238a4
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>


[ROCm/ROCR-Runtime commit: e5b215570b]
2019-03-29 11:00:01 -04:00
Eric Huang 6e93266f63 libhsakmt: update kfd_ioctl.h regarding RAS interface
It is aligned with RAS changes in KFD.

Change-Id: I52816da01a4001158a40a1207d1fbe6ec3271343
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>


[ROCm/ROCR-Runtime commit: d6cde5bf08]
2019-03-28 21:38:44 -04:00
Sean Keely 80ffb43205 Add hsaKmtRegisterMemoryWithFlags.
API follows hsaKmtRegisterMemory but allows passing HsaMemFlags.

Change-Id: I66a230a87c8b085f27c769bdf2cb4d0d96a5d6dd


[ROCm/ROCR-Runtime commit: c7f1277013]
2019-03-28 17:17:40 -04:00
Kent Russell 5de58e1ab2 libhsakmt: Add Vega M support
While this may not be supported in the runtime, the kernel/firmware
support it

Change-Id: I7fe4536a6b3055f39e25f453060e899938645d91
Signed-off-by: Kent Russell <kent.russell@amd.com>


[ROCm/ROCR-Runtime commit: 1304a92e17]
2019-03-21 08:05:47 -04:00
Kent Russell 14676959fb libhsakmt: Add another gfx902 GPU ID
Change-Id: I967f16bf548171df73d2e721f16c1aac52e99852
Signed-off-by: Kent Russell <kent.russell@amd.com>


[ROCm/ROCR-Runtime commit: c7439a0039]
2019-03-21 08:05:47 -04:00
Philip Yang c140f95b1b Revert "kfdtest.exclude: Temporarily blacklist IPC on gfx900"
This reverts commit a349805264.

Fixes for HMM change corner cases are merged in from drm-next.
Tests are passed on gfx900 with the latest amd-kfd-staging.

Change-Id: I6c00d1eacf6b3f1ce715e085ae622b4e9ff1b7ff
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 0bd9f35563]
2019-03-18 10:10:00 -04:00
Amber Lin bc0bc878f4 libhsakmt: Fix missing apicid in topology
While adding x2APIC support, the apicid for non-x2APIC systems was
left out by mistake.



Change-Id: I25eed362c035c0e9fb9ea948899c49f70311f269
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: cfa47ac1f9]
2019-03-11 13:05:37 -04:00
Oak Zeng 1efba15c34 Introduce XGMI SDMA queue type
Change-Id: I8c6ff04f92c2bbea0bab94ddb8cc4cceb5d74d02
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 1046a1fd72]
2019-03-07 19:42:20 -05:00
Oak Zeng f77c754281 Thunk interface to get SDMA engine info
Add SDMA engine info fields to node properties and
modify get node properties API to read SDMA engine
info from sysfs

Change-Id: Iea877b5bc008cc9df9405daf564a359535f1bc9f
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 414a3508d6]
2019-03-07 14:09:43 -05:00
Oak Zeng a75addec83 Use latest kfd_ioctl.h
A new SDMA queue type for XGMI was added

Change-Id: Iad065c1a7c053a58e0d86becfb374215e316a611
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: e4109de26d]
2019-03-07 13:25:33 -05:00
Amber Lin ee54c2ec5a libhsakmt: Support x2APIC in topology
The current processor/cache topology code implements the xAPIC architecture,
which has 8-bit addressability. This is not enough for a system with more
than 255 processors. x2APIC is an extension of the xAPIC architecture that
supports 32-bit addressability of processors. This patch detects whether
x2APIC is enabled and, if so, uses the extension leaf to get the apicid.

Change-Id: I0826585d02f696a46cd5efb9a6630c60af01e2d8
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: f8028a40fd]
2019-03-01 16:42:30 -05:00
Kent Russell a349805264 kfdtest.exclude: Temporarily blacklist IPC on gfx900
Due to the recent HMM changes, the KFDIPCTest can intermittently fail,
combined with CrossMemoryAttach consistently failing. Remove it for now
while Philip Yang investigates

Change-Id: Icf272100bb7882eff4202ad6f4ced63b569f4e7d


[ROCm/ROCR-Runtime commit: d00ec779ce]
2019-02-28 07:29:47 -05:00
Kent Russell 16d110f9c1 Temporarily remove CMATest from gfx900
Per Philip Yang:
For a forked child process, a userptr allocated on the heap (through malloc)
can span two vmas. If the child process mallocs a smaller buffer and frees
it, that buffer is on a vma cloned from the parent process. If it then
mallocs a larger buffer, the kernel puts some pages in the previously freed
space of the cloned vma and creates a new vma for the rest of the pages.
This is what IPCTest does.

Change-Id: I054771e20880f975d7cc774225f19aad5363843f


[ROCm/ROCR-Runtime commit: a0b8dd8462]
2019-02-27 07:05:42 -05:00
Yong Zhao c326949ac7 kfdtest: Add a result check in CreateCpQueue test
With the original code, CreateCpQueue will not report failure if
WaitOnValue returns false. Add EXPECT_TRUE() so that in that case
the failure is reported.

Change-Id: I043d013958b452d7ccb9538dc296d99d024abf01
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: 1d478f3cf2]
2019-02-21 17:42:45 -05:00
Eric Huang 658a1d8f41 kfdtest: add RAS tests
They are disabled for now.

Change-Id: I9c936130cbaf8c773f4b8e94bccf4af1f45eda65
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>


[ROCm/ROCR-Runtime commit: 7349276860]
2019-02-15 15:03:32 -05:00