For userptr, after taking aperture lock, decrease registration_count and
ensure object registration_count equal to 0 to release KFD and thunk
object.
Move decrementing of registration count from fmm_deregister_memory into
__fmm_release to avoid a race condition when dropping the aperture lock
in fmm_deregister_memory.
Change-Id: I5381fa6b8a77a1516af2554e5174e91969c338c4
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
This test requires KFD patch "drm/amdkfd: SVM map to gpus check vma
boundary" to pass, the patch is on staging-dkms branch, not land on
mainline branch. Temporary blacklist this to unblock QA, as QA reports
kfdtest failure.
Change-Id: I00515cd5d5d1c5612f4f8d48605d86f4a7e62ce2
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Because of sharing ports with other engines, the
hardware design team has advised that SDMA0 on gfx90a
should only be used for host-to-device data transfers.
The recommendation is to use SDMA1 for any device-to-device
or device-to-host data transfers.
A driver change will ensure that, for each gfx90a
device, only the first PCIe SDMA queue a process
requests will possibly be from SDMA0. This patch ensures
that the first PCIe queue requested (which may be from
SDMA0) is always set up for host-to-device.
Change-Id: I6793ca95596dedaed9d5be1dbd9469ceef2a5c33
After adjusting the memory usage, re-enable KFDEvictTest.BasicTest on
gfx906 and KFDEvictTest.QueueTest on GFX10
Change-Id: I401e679e447f3150241078154635f0b30692513d
Signed-off-by: Kent Russell <kent.russell@amd.com>
We were hitting memory map errors and segfaults when trying to use 7/8
VRAM on certain cards (dmesg showing "Failed to map to gpu 0/1"), as the
original check didn't see if this would exceed the GTT size.
Allocate the smaller of 1/3 of SRAM or 7/16 of VRAM (7/8 / 2) for the
tests
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: Ic11a5cee058535418eef903a28846e00e1839969
SDMA firmware of gfx902 has a regression which causes cp hang
in kfdtest, blacklist related test cases.
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: Ie75b72a952a29d6b1394c5fcb67ff9a5143b3b07
Bumps cmake minimum version to 3.7 for version comparison operator.
Previously the Clang cmake project version strings were used. These
are not defined if the clang cmake project has not been loaded.
We should use CMAKE_CXX_COMPILER_VERSION to check the version when
only the compiler binary is redirected and the project files are
not available.
Also adjust device libs lookup logic to handle multiple paths in
CMAKE_PREFIX_PATH.
Change-Id: I67b6958d8241685cd6c3a0af68507c9fdc6331ef
This is required for Marketing Name, but Marketing Name isn't a hard
requirement for ROCT, so make it a Suggested package
Change-Id: Ibafcce2c59dc8bdba90c171e766122bebf548a48
Signed-off-by: Kent Russell <kent.russell@amd.com>
This was never validated on GFX10, so remove it for now until it's
either validated and confirmed, or move the check to the test itself
Change-Id: Ie4d8b31885fbe6e5ed84b7b174c0bfed60879741
to faciliate debugging, print errno when queue destroy fail
current log give very little information when fail:
[ RUN ] KFDQMTest.AllSdmaQueues
[ ] Regular SDMA engines number: 1 SDMA queues per engine: 2
[ OK ] KFDQMTest.AllSdmaQueues (11 ms)
[ RUN ] KFDQMTest.AllXgmiSdmaQueues
[ ] XGMI SDMA engines number: 0 SDMA queues per engine: 2
[ OK ] KFDQMTest.AllXgmiSdmaQueues (6 ms)
[ RUN ] KFDQMTest.AllQueues
/home/foreman/build/hsakmt-roct-amdgpu-1.0.9.40500/sources/libhsakmt/tests/kfdtest/src/KFDQMTest.cpp:381: Failure
Value of: (cpQueues[i].Destroy())
Actual: 1
Expected: HSAKMT_STATUS_SUCCESS
Which is: 0
Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>
Change-Id: I5b1b5b616a5fd7ff198360c893a7aeed685022bd
For minimal latency we should place command queues and blit code
in the nearest numa node to each GPU. Add an allocator matching
the current runtime default allocator interface to each GpuAgent
that allocates on the closest numa node as represented by kfd
topology. Use this allocator for queue ring buffers and blit
objects.
Change-Id: I181127f9c27bafe68976312963146616e3f58369
After calling ioctl to create userptr obj, take aperture lock, check if
there is same userptr obj created after finding object, to catch the
race that multiple threads register same userptr to multiple GPUs.
If same userptr obj exist, then increase userptr registeration_count,
and free the newly create obj.
Change-Id: I63ae3a4f54da8aedd11c124d8d53ebe727b8203a
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Map readonly range to GPU, test GPU can read range but write to
range trigger GPU vm fault.
Change-Id: Id2ce106055a353f21f4c34fa7f562d546523bc49
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
To test if SVM support address mmap on file backed memory.
Change-Id: I227cc0b44d167c4fa2e63257e13a481aefa4f750
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
g_props is intialized in topology_take_snapshot which needs to call
hsaKmtAcquireSystemProperties. In hsaKmtOpenKFD, it doesn't call hsaKmtAcquireSystemProperties.
So it needs to change parameter of topology_is_svm_needed from node_id
to EngineId to avoid Segmentation fault on gfx902.
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
Change-Id: Iba8d20a9142510c70927454d26bdcaf579ad5574
Also make failure to handle queue errors fatal.
Motivation is to improve detection of queue error conditions
that currently appear as application hangs.
Change-Id: I655643616dc0bd303d7df3ce8aca2c099bec3d46
Sets package found and component lists. ROCr does not have components
so this is mostly cosmetic. It's part of maintaining a compliant
cmake project config file though.
Change-Id: Ida2ef746375143babd3a6f938727a47135606f01
Per clang 13 option -Wno-error=unused-but-set-variable is not
recoginized nor is the diagnostic emitted. Set this option
conditional to the clang compiler version.
Change-Id: I3c0958dffa985d53b641f9eff4e702988dffd033
Passing 0 into num_cu_mask_count used to be an implicit error.
This has been repurposed as a short hand for enabling all CUs.
Enabling all CUs when HSA_CU_MASK is set will cause the CU mask to
reset to whatever was set by HSA_CU_MASK which may then be queried.
Change-Id: I1d6bb2034595a78ee48fa72aa05563e8ea6c0fff
Delay parsing until after GPU discovery. Use the surfaced
GPU count and maximum phyiscal CU count to limit parsed bit masks.
This prevents pathological input such as
HSA_CU_MASK=0-8000000:0-8000000 from attempting to consume 7TiB.
Change-Id: I3773d2db3740c2023b0f6275d1818b69119b0495
Use the database provided by libdrm-amdgpu,as this is maintained by AMD
Signed-off-by: Mike Li <Tianxinmike.Li@amd.com>
Change-Id: I319a46d833cb23173a77fdff1deae69ea58588b8
Consolidate all thunk components into a single development package
Also declare build dependencies via find_package/library and report the
location of said dependencies during build time
Change-Id: I8524ad1a2d9a664accd93471abacc4bac82b9a7c
Signed-off-by: Kent Russell <kent.russell@amd.com>
Update KFDCWSRTest.BasicTest to run CWSR test with legacy HWS,
MES, and software scheduler.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: I09af904507109c2bbbace6ff616a57fa2b1b8aa9
This is reported as failing on pretty much every ASIC, so move it to the
temporary exclude list for all ASICs until the test can be fixed.
Signed-off-by: Kent Russell <kent.russell@amd.com>
Change-Id: I2973feb5705c178db32009e4603376d1f5e94a2f
If lookup is successful, device info from find_hsa_gfxip_device() is
used to populate props->EngineId.ui32 gfx version fields and node CAL
name (props->AMDName).
If unsuccessful, uses gfx_target_version sysfs field if present
(otherwise results in error), and sets the node CAL name to
"GFX<GFX_VERSION>" (e.g. 9.0.8 -> GFX090008).
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Id39c521b769aebbe40c934f19e03150a66884cce
Some g_system and Node ID checks already performed by validate_nodeid().
Fixes bug where hsaKmtGetNodeProperties() would return on
validate_nodeid() error without releasing lock. Switches some
conditionals to return INVALID_NODE_UNIT instead of INVALID_PARAMETER if
NodeId >= NumNodes. Removes some g_system assertions to return error
code rather than trigger a hard fault.
Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Id1604b20c2cef8808b98cdad61bd47aa7ea3d229