Update to the new suspend and resume API for the KFD and the thunk.
We now support passing in a list of queues to suspend, rather than
only all of the queues for the process.
The kfdtest testcase was also updated so it still compiles.
Change-Id: I71d1b178476bd9df0c311bdedaa6a891528cebcf
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
hsaKmtGetQueueInfo needs to return the control stack size, and the
wave state size for the debugger. These changes are needed to support
returning the new values.
Change-Id: Ib4c60e0ea34446c06aef4a86996250989f348a69
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
- Run more graphics command submissions with shorter delay between
them
- Synchronize after every graphics command submission
- Include the big VRAM BO in the BOList of the command submission
to trigger more evictions
- In QueueTest, run AMDGPU command submissions concurrently with
compute shader on the user mode queue
- Submit AMDGPU commands to GFX queue instead of compute queue to
avoid deadlocks between user-mode and kernel-mode queues on the
same pipe
- Allocate slightly less memory from KFD to avoid allocation errors
due to fragmentation or memory leaks in previous tests
- Running only two processes maximizes the number of KFD evictions
(probably because of lower chances of evicting non-KFD BOs)
Change-Id: If05d53f5fcf690b6488998a3f933f120ddaa71ee
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Add an MMIO_REMAP heap type and expose the MMIO virtual address
through HsaKmtGetNodeMemoryProperties.
Change-Id: I1e585e6dfbec8fa7c85f1dda7b89b763a8e2c439
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
HDP coherence registers are remapped at driver level
to an empty page in MMIO space (the remapped MMIO page).
This change allocates the remapped MMIO page and maps it into
process space.
Change-Id: I89c405c41870a79c5b58eea0d8e564aa35f55182
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
At the moment it is not possible to build ROCr with Clang. This is
a spurious limitation. The present PR addresses it by guarding GCC-only
flags and by fixing some additional warnings that Clang triggers;
one of said warnings did outline a rather interesting issue with math
being done on void*s. - AlexVlx
Void ptr arithmetic had already been fixed in amd-master branch.
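
For illustration only, a minimal sketch of the kind of fix this refers
to. Arithmetic on void* is a GNU extension that Clang warns about, so
the usual remedy is to cast to a byte pointer first. The helper name
offsetPtr is hypothetical and is not taken from ROCr.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from ROCr): advance a void* by a byte count.
// `base + bytes` on a void* relies on a GNU extension; casting to char*
// first makes the arithmetic well defined in standard C++.
void* offsetPtr(void* base, std::size_t bytes) {
    return static_cast<char*>(base) + bytes;
}
```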
Change-Id: I5ee97e20b5c40b10dd73facecabe75f02ba46462
Non-paged memory can be IPC-shared even when HSA_USERPTR_FOR_PAGED_MEM
is enabled.
Change-Id: I8b1fa6d7a4a9327c78a77b3679697fbf55397093
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
These two tests cover basic queue creation and destruction
without submitting packets to CP and SDMA user queues. During bringup,
they are valuable for separating issues that arise in queue creation
from those that arise in packet execution, which are two very
different kinds of problem.
Because of those two tests, we also rename some existing tests as
follows:
CreateCpQueue -> SubmitPacketCpQueue
CreateSdmaQueue -> SubmitPacketSdmaQueue
CreateMultipleCpQueues -> MultipleCpQueues
CreateMultipleSdmaQueues -> MultipleSdmaQueues
Lastly, move MultipleCpQueues test closer to the CP queue section
rather than leaving it behind the SDMA queue section.
Change-Id: I110fb3f3fb21878339045dd1d1c8c9d61b8988b7
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
If a Render Node can't be found, we should finish off all child
processes immediately, then return.
Trying to do the check before forking the children results in the test
failing as well, regardless of the status of finding the render node,
which is likely why the forking occurred first in the test's initial
creation. This way we ensure that things finish cleanly before moving to
the next test.
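
A rough sketch of the fork-first, then clean-up ordering described
above, assuming POSIX fork/kill/waitpid. The helper names and the use
of SIGKILL are assumptions for illustration; the actual test may finish
its children differently.

```cpp
#include <cassert>
#include <csignal>
#include <cstddef>
#include <vector>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical helper: fork `count` children that idle until signalled.
std::vector<pid_t> spawnIdleChildren(int count) {
    std::vector<pid_t> children;
    for (int i = 0; i < count; ++i) {
        pid_t pid = fork();
        if (pid < 0) break;        // fork failed; keep what we have so far
        if (pid == 0) {
            for (;;) pause();      // child: wait until it is terminated
        }
        children.push_back(pid);   // parent: remember the child
    }
    return children;
}

// Hypothetical helper: terminate and reap every child so that nothing
// is left running when the test returns early. Returns the number reaped.
std::size_t finishChildren(const std::vector<pid_t>& children) {
    std::size_t reaped = 0;
    for (pid_t pid : children) {
        kill(pid, SIGKILL);
        int status = 0;
        if (waitpid(pid, &status, 0) == pid) ++reaped;
    }
    return reaped;
}
```

In the failure path the parent would call finishChildren() as soon as
the render node lookup fails, and only then return.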
Change-Id: I2e1b62fed25c30ff1f179612127c23960da4ee5e
Distributions want to generate debuginfo packages, do not strip them! If
you want to do it during installation use 'make install/strip'!
Change-Id: I3983af24ce4f4ddb189ede0ed0820dfee83b6280
KFD no longer reports MemoryAccessFault.Failure with retry fault
implementation. ROCr ignores the memory event when Failure = 0.
Use the Flags field instead, which will be non-zero when the
event is triggered.
Change-Id: Ie90799a303b0b2f1b476b20ffafdde79ae137182
Makes malloc memory accessible to GPUs so that the memory has the
capabilities of the pool it is locked to.
This admits fine-grained locked memory and reserves API space for any
future special CPU pools.
Change-Id: If8c3dd8582a43f19d3d36b3763c1a688cc419ef0
This patch separates the build version (i.e. ROCm version) from the
library version used to set the SONAME of the shared objects. This
prevents the SONAME from getting bumped each time there is a new ROCm
release without any change to the libhsakmt ABI.
1.0.6 was chosen as the library version since this was the
last library version used prior to switching to the ROCm version
numbers.
Change-Id: I7c29ae84d8a362a831e804569d8147ca65155cad
1. The RAS error injection debugfs interface has changed and now
uses ras_ctrl instead of *_err_inject.
2. Remove ASSERT_SUCCESS for fwrite, because fwrite returns
the number of items written, not an error number.
3. Throw an exception instead of returning to avoid a segmentation fault.
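
To illustrate item 2: fwrite returns the count of items written, so a
check that presumably expects a zero status would misread a successful
write. A minimal sketch of a count-based check follows; the helper name
writeAll is hypothetical and is not part of kfdtest.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not from kfdtest): write `len` bytes and report
// success only if every byte was written.
bool writeAll(std::FILE* f, const char* data, std::size_t len) {
    // fwrite returns the number of items written, not an error code,
    // so compare it against the requested item count.
    return std::fwrite(data, 1, len, f) == len;
}
```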
Change-Id: I6c4d9c2f7e66719faec99abd1552105a08c238a4
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
While this may not be supported in the runtime, the kernel/firmware
support it
Change-Id: I7fe4536a6b3055f39e25f453060e899938645d91
Signed-off-by: Kent Russell <kent.russell@amd.com>
This reverts commit d00ec779ce.
Fixes for HMM change corner cases are merged in from drm-next.
Tests pass on gfx900 with the latest amd-kfd-staging.
Change-Id: I6c00d1eacf6b3f1ce715e085ae622b4e9ff1b7ff
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
While adding x2APIC support, the apicid for the non-x2APIC case was
left out by mistake.
Change-Id: I25eed362c035c0e9fb9ea948899c49f70311f269
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Add SDMA engine info fields to node properties and
modify get node properties API to read SDMA engine
info from sysfs
Change-Id: Iea877b5bc008cc9df9405daf564a359535f1bc9f
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
The current processor/cache topology code implements the xAPIC
architecture, which has 8-bit addressability. This is not enough for a
system with more than 255 processors. x2APIC is an extension of the
xAPIC architecture that supports 32-bit addressability of processors.
This patch detects x2APIC enablement and uses the extension leaf to get
the apicid when it is detected.
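
A sketch of the selection logic, not the patch itself. It assumes the
check is made via the x2APIC feature bit (CPUID.01H:ECX bit 21) and
that the "extension leaf" is CPUID leaf 0BH, whose EDX holds the 32-bit
x2APIC ID; the function names are hypothetical. The raw register values
are passed in so that the logic stays independent of how CPUID is
executed.

```cpp
#include <cassert>
#include <cstdint>

// CPUID.01H:ECX bit 21 reports x2APIC support (assumed detection method).
bool supportsX2Apic(std::uint32_t leaf1Ecx) {
    return ((leaf1Ecx >> 21) & 1u) != 0;
}

// xAPIC: the 8-bit initial APIC ID lives in CPUID.01H:EBX[31:24].
std::uint32_t xapicId(std::uint32_t leaf1Ebx) {
    return leaf1Ebx >> 24;
}

// Use the 32-bit ID from leaf 0BH EDX when x2APIC is available,
// otherwise fall back to the 8-bit xAPIC ID.
std::uint32_t pickApicId(std::uint32_t leaf1Ebx, std::uint32_t leaf1Ecx,
                         std::uint32_t leaf0bEdx) {
    return supportsX2Apic(leaf1Ecx) ? leaf0bEdx : xapicId(leaf1Ebx);
}
```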
Change-Id: I0826585d02f696a46cd5efb9a6630c60af01e2d8
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Due to the recent HMM changes, the KFDIPCTest can fail intermittently,
and CrossMemoryAttach fails consistently. Remove it for now
while Philip Yang investigates.
Change-Id: Icf272100bb7882eff4202ad6f4ced63b569f4e7d
Per Philip Yang:
For a forked child process, a userptr allocated on the heap (through
malloc) can span two VMAs. If the child process mallocs a smaller
buffer and frees it, that space is in a VMA cloned from the parent
process. When it then mallocs a larger buffer, the kernel puts some
pages in the previously freed space in the cloned VMA and creates a
new VMA for the rest of the pages. This is what IPCTest does.
Change-Id: I054771e20880f975d7cc774225f19aad5363843f
With the original code, CreateCpQueue will not report a failure if
WaitOnValue returns false. Add EXPECT_TRUE() so that in that case
the failure is reported.
Change-Id: I043d013958b452d7ccb9538dc296d99d024abf01
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
We need to blacklist this testcase temporarily because
it is failing intermittently. The failure tends to happen only
when a certain build machine is used to build it.
This issue is being tracked by Jira ticket
ROCMOPS-389.
Change-Id: Ic4682c9da389ed731cbc034dff57e6646bba0e9d
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
A basic sanity test that exercises the debugger suspend and
resume code path.
Change-Id: If4c64f7bd6a1ef45068a33965b829725a78ce492
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
Version is now a fixed string that matches previous internal builds.
This also matches released DEB/RPM builds (but not github versions).
Change-Id: Id4819b9de8c855250aadf1a1cebb187b5c031721