Commit graph

2959 Commits

Autor SHA1 Nachricht Datum
Laurent Morichetti 5774d9162b Fix a typo in INSN_S_ENDPGM_OPCODE encoding.
The correct s_endpgm instruction encoding is 0xBF810000.

Change-Id: I03f304762dcaced5bf3fa4f069da7a0b287d1cd2
2019-11-12 11:54:17 -08:00
Yong Zhao 4b36a1e728 kfdtest: Rename KFDQMTest.MultipleSdmaQueues to AllSdmaQueues
The test actually tested all available SDMA queues, so change the name
to reflect the fact.

Change-Id: Ia23df3e5ac79b692b0b60194b05603ba8dd897a4
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-11-08 19:00:37 -05:00
Yong Zhao 174484aac3 kfdtest: Use GetSystemTickCountInMicroSec() as much as possible
GetSystemTickCountInMicroSec() wraps the function gettimeofday().

Change-Id: I7b767a6efdd1db491fc8113313945b578ac69382
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-11-07 18:21:04 -05:00
Philip Yang 2fa7d23a82 kfdtest: use flag NoNUMABind for more test cases
If NUMA system no available memory on node 0, mbind will fail on node
0, so set flag NoNUMABind=1 to bypass mbind for all test cases which use
node 0 and allocate system memory.

Change-Id: I7962938ad2bed5a293ca5e6a8500c7f7e15ff453
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2019-11-06 18:33:46 -05:00
Philip Yang 59c857476f libhsakmt: use the closest NUMA node to allocate queue ctx area
On NUMA system, allocate queue ctx save restore area on the closest NUMA
node to the GPU which the queue is going to run. This will improve
performance on NUMA system generally by reducing schedule latency and
fix the multi-node rccl-tests unstable performance issue.

If the closest NUMA node has no memory available, set flags NoNUMABind=1
to bypass mbind, to use default NUMA memory policy to allocate system
memory.



Change-Id: Ic62bfa5bb2efbf4f6ae79ff403e9610ddf18d45c
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2019-11-06 17:33:26 -05:00
Ramesh Errabolu 086b4b686c Fix Aql Header initialization
Change-Id: If23c12b28d22fb62b673038890f40bb0f3c3948b
2019-11-06 16:13:58 -05:00
Ori Messinger e7f45fae8a Add non-priv PMC blocks to GFX10
This patch adds the non-privileged PMC blocks for GFX10/gfx1010.

Change-Id: I4b98cb2159d71113c12920ca7fd10e45096b4e2c
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>
2019-11-05 13:07:13 -05:00
Felix Kuehling 07b8c30ce8 kfdtest: Return address of packet from IndirectBuffer::AddPacket
This can be used for allocating space in the IB for write-back data
in a NOP-packet with a payload.

Change-Id: I6202b89a455a65326366a15291789004dfdcc0b9
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2019-11-01 18:31:28 -04:00
Felix Kuehling 664bb13bc4 kfdtest: Add PM4 NOP packet payload capability
Useful for allocating space for write-back data in a queue or IB.

Change-Id: I47bd2dcb8537a91410e9a91413979a8a2c1c5f21
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2019-11-01 18:30:52 -04:00
Felix Kuehling ea7619c929 kfdtest: Re-factor dynamic packet allocation into BasePacket
Avoids unnecessary and error-prone duplication of dynamic packet
allocation in many packet classes.

Change-Id: Icec981ae2cd7a3d88cdf9213298940d85627f853
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2019-11-01 18:30:52 -04:00
Felix Kuehling 12e4ea94f1 kfdtest: Fix SDMA NOP packet size
packetSize is used as the size in bytes.

Change-Id: I900f9a4f37b840cbb957184705db04a4c64d1654
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2019-11-01 18:30:52 -04:00
Oak Zeng fa0cb9ebeb Handle IOCTL failure in fmm_release
FREE_MEMORY_OF_GPU ioctl could fail, e.g., if memory is still mapped
to GPU. Handle this failure by return error in fmm_release/HsaKmtFreeMemory

Change-Id: I5461db39964f733cf97376d50e44906a9b4c0f13
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-11-01 08:59:05 -04:00
Oak Zeng 7593b41575 Unmap memory from GPU before free
Change-Id: Ic33b17cbaee5de7908d37527254f4f146e6b71e3
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-11-01 08:58:55 -04:00
Qingchuan Shi 16a20cfb8c Adding code object list in loader.
Change-Id: Iab3541287bd56276fd32615ee59fcd590de84ca0
2019-10-30 20:31:51 -04:00
Jay Cornwall 78e754935c Merge debugger trap handler into ROCr trap handler
Debugger path is taken for (trap_id >= 3) and single step exceptions.
Other traps/exceptions behave as before.

Change-Id: I276c0eb69953709968353a57717ee017d22348a2
2019-10-30 13:56:06 -04:00
Sean Keely 851ee799c4 Correct strip command.
Strip should only apply to the output target library.  Symlinks
with .so endings which will be relocated during install will cause
strip to fail, aborting the build.

Change-Id: Ieb598c2cec5277d9d14c8afa88b91ca2c7f4412d
2019-10-30 01:24:43 -05:00
Sean Keely 6c3acda664 Adapt to new versioning.
Using branch point for count since last change since we don't
have questions answered on tags yet.

Removed unused CMake files.
Restructured CMake to use the cache rather than only commandline
and be ccmake & cmake-gui friendly.
Dependency search paths are added for the Repo tree layout.

Search paths still needed for install paths.

Simplified packaging.  hsa-ext-rocr-dev package and contents now
build from the package CMake rather than being 3 separate projects.

Not applying new version number or new install paths!

Change-Id: Ibea50dc8a6ab091e91857f78833f5379a4511547
2019-10-29 04:21:17 -05:00
Jeremy Newton cb4d4107ed libhsakmt: default packaging prefix to install
If CPACK_PACKAGING_INSTALL_PREFIX is not set, it's safer to assume it's the
same as CMAKE_INSTALL_PREFIX, incase only the latter is set.

Change-Id: I70727ebbc50f21f8d6e3add10d7f9f2d5e747dee
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>
2019-10-25 15:40:03 -04:00
Yong Zhao da9a404ac3 kfdtest: Add a function to wait indefinitely until user input
This will facilitate the need of halting the execution during debugging.

Change-Id: I058c81bbc9f655bbc477b2b542d6b43aa423324c
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-10-23 11:07:43 -04:00
Yong Zhao ac5c433420 kfdtest: Add a Nop packet submission test for CP and SDMA queue
The tests are useful to triage the fundamental queue submission
functionality by excluding the packet format variable from the equation.

Change-Id: I2c7fcda811f93bdefc1b62396233559416be44e7
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-10-22 13:42:33 -04:00
Yong Zhao 75c654cfea kfdtest: Add SDMANopPacket class
The class is very useful for triaging complex SDMA issues.

Change-Id: Ib5de729f7fc62f41e894ef98d3967e7e1745d454
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-10-22 13:41:44 -04:00
Yong Zhao 98b0652917 kfdtest: Add a core test filter for software scheduler mode
The new filter can be used by "./run_kfdtest.sh -p core_sws".

Change-Id: I1c43669cfc07c09ccafb9fa2e2851932ac59307d
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-10-22 11:23:33 -04:00
Yong Zhao ffbdb726ac kfdtest: Temporarily disable performance counter tests for gfx1010
We are still working on those tests for gfx1010, so disable them
temporarily.

Change-Id: I5d51b4b02bc753137014684859cc033f759b2899
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-10-22 10:39:58 -04:00
Felix Kuehling bee47ab013 libhsakmt: Fix installation path for pkg-config file
CMAKE_INSTALL_PREFIX would give incorrect result for packages
build by CPack.

Change-Id: Iaeb4d30eb44080b7924ecf892de011ed6e365f5f
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2019-10-21 14:53:54 -04:00
Yong Zhao ab2daf6538 libhsakmt: Add a message when a device is not supported
This helps to quickly triage problems.

Change-Id: Iad2b4b74209ab972be0c2f6311eeb3aaf098d29f
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-10-20 12:44:13 -04:00
Yong Zhao 1c7755d2da libhsakmt: Add gfx1012 device IDs
Now the gfx1012 device IDs are okay to reveal.

Change-Id: I9da2a036b74ec7b6b8b1fb7587597a5847f02205
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-10-17 12:35:22 -04:00
Yong Zhao 16fa78b134 libhsakmt: Print an error message when map_mmio failes
Without this change, the failure was hard to notice when it happened.

Change-Id: I99c3e8cea0d0cbd3bcfe79069410e6e870e225bf
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-10-17 12:32:32 -04:00
Amber Lin 23541e0289 libhsakmt: handle CPU cache info on non-NUMA sys
When CONFIG_NUMA is not enabled in the kernel config, only one CPU node
presents on the system and /sys/devices/system/node/nodeX directories
don't exist. Read CPU cache information from /sys/devices/system/cpu in
this situation.

Change-Id: I017ff17dd72678a0551edcc77446664501aa42ca
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2019-10-17 11:53:19 -04:00
Eric Huang 0174377351 kfdtest: add xgmi path for p2p tests
When large bar is not available, we can use
xgmi to do p2p tests.

Change-Id: Ib7b59fb8a4d41f605739a0428973f6b2f1a3450f
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
2019-10-17 10:21:10 -04:00
Chris Freehill 53228ad819 Add some gfx1xxx targets
This is to fix 

Change-Id: I69a87884d8174733905e4c007cf0f19b5103482a
2019-10-16 09:01:22 -04:00
Philip Cox 6933540c81 Remove debugger data reg accesses
The debug trap accesses the data0/data1 registers, so we do not
want the userspace to write values to it.  We remove the calls to
set the data0/data1 register values.

Change-Id: Iaba842a4c445f339f16a39fe1994526ff78a2f3c
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
2019-10-10 14:32:54 -04:00
Philip Cox efe3769835 kfdtest: Check kfd debugger version in tests
Need to check the kfd debugger version of the kernel before
calling kfd debugger tests in kfdtest.  If they are out of sync,
the tests may fail.

Change-Id: I1df5e89fb1199304e6fbe8973c60b76062514c03
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
2019-10-10 14:32:54 -04:00
Philip Cox dbbd189b33 Add functions to get the kfd debugger version info
To support adding new features to the kfd debugger, and not break
functionality, we need to be able to check the kfd debugger support
version info from the kernel.

Change-Id: Icd88e4edab8430c35eaed588e62d892c1b5c62ec
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
2019-10-10 14:32:54 -04:00
Philip Cox 35d56297d3 Add kfd debugger version support
To check the KFD debugger API support, we need to be able to check
the major/minor version of the kfd debugger version, so we need to
expose this function from the kernel.

Change-Id: I8a3dc617607e2efa9e65306d08b8583b8b1a2172
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
2019-10-10 14:32:54 -04:00
Philip Cox b48f7d6ea3 kfdtest: Disable kfd debugger tests on gfx10
The KFD debugger is only supported on gfx9 platforms, so we need to
restrict it from running on gfx10 platforms until it is supported.

Change-Id: I500f0e20fda71021f2cce70a67fc8d9d042209fe
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
2019-10-10 12:50:12 -04:00
Oak Zeng da789a2584 Fix memory map issue in KFDMemoryTest.CacheInvalidationOnRemoteWrite
The memory need to be mapped for both local and remote GPU access

Change-Id: I4aeaffc0851b6107fc91e9eaa6150764b06f5ca9
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-10-08 16:33:51 -05:00
Oak Zeng dadbbbb03c Disable KFDMemoryTest.CacheInvalidateOnRemoteWrite temporarily
This is some data fabric/vbios issue that causing system hard hang
while running this test. Will enable it after the HW/vibos fix.

Change-Id: Ic0753c2d92e9e4863c310da9a595b2af302f17f8
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-10-04 11:09:19 -04:00
Oak Zeng d7c53bb1fa Test new RW mtype for gfx908
Change-Id: Ia859c8f2e3c486f119772231a2d887f6783caf36
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-10-04 10:49:15 -04:00
Oak Zeng 51388973a1 Add gfx908 to asic family
Change-Id: I838aa4be45ddfb34c5d36c519e28b4218fc32ba4
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-10-04 10:45:02 -04:00
Cole Nelson 1ac8b5538c kfdtest.exclude: blacklist KFDEvictTest.BurstyTest for gfx908 (gfx908)
Change-Id: I5e1ac25c066ee20e34e102043d27eeab73313c6f
Signed-off-by: Cole Nelson <cole.nelson@amd.com>
2019-10-02 14:06:50 -07:00
Oak Zeng 63eb3948be More parameter check in HsaMemoryBuffer constructor
if parameter "zero" is set, check buffer host access.

Change-Id: I9893062726fc240777405167a638cbea18fdf559
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-10-01 14:44:29 -04:00
Amber Lin 5a09880620 libhsakmt: fix typo in error message
When fail to get CPU dirs from //sys/devices/system/node/nodeX directory,
the error message should print node_dir, not path.

Change-Id: If76a51918c8dd55fa6605a62f3d29f9efc6fadb3
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
2019-09-30 14:29:39 -04:00
shaoyunl a1e399a3ff Thunk : Add gfx1011 support from thunk side
Change-Id: I6b202b75fc1ad0e69576a35a6a3e499818137e04
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
2019-09-25 11:02:33 -04:00
Philip Yang 71cf3cf5d3 libhsakmt: correct number of NUMA nodes calculation
numa_max_node() return the highest node number available on the current
system, number of NUMA nodes should be numa_max_node() + 1.

Change-Id: I20a6c17af071e73e853cb5ea6d0304c8aca52681
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2019-09-16 16:25:57 -04:00
Philip Yang 69d8f2d734 kfdtest: use flag NoNUMABind to allocate system memory
Allocate system memory from node id 0 will fail on NUMA system which has
no memory on node 0. Change to use new flag NoNUMABind to allocate
system memory from NUMA nodes which have free memory.

Change-Id: I8ef9ca28fc2ab5dd31d07a2d3eaf1d5886e798a0
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2019-09-16 12:25:01 -04:00
Philip Yang 42392f093f libhsakmt: handle NUMA system with no memory on node 0
on NUMA system, node 0 may have no memory, application pass node id
0 to hsaKmtAllocMemory will fail because mbind to specify the allocation
from node 0 return EINVAL.

Add new flag NoNUMABind for application to pass it to hsaKmtAllocMemory
to skip mbind.

hsaKmtCreateEvent and hsaKmtCreateQueue specify the new flag NoNUMABind
to allocate system memory for event page and CWSR area, don't bind the
system memory to a specific NUMA node.

Change-Id: I854e5a57502c7807c4c5ff2e441d499ae515c309
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2019-09-16 11:30:24 -04:00
Jay Cornwall 906cd84186 Disable SDMA HDP flush on gfx10
Not currently functional, triggering SRBM write protection.

Change-Id: Ib0b832357e3df5a6a0d0b46648515ec9bd70f017
2019-09-14 14:08:47 -04:00
Jay Cornwall e0358d7dc2 Set MTYPE field in SDMA fence command on gfx10
This is the only SDMA command with an MTYPE field.

Change-Id: Ice146ace9c3e8e7aff038e1e004be73c070f48fe
2019-09-14 14:07:57 -04:00
Jay Cornwall 32a9a5dbb0 Add gfx1010, gfx1011, gfx1012 ELF types to loader
Change-Id: I23a1159fb10f60881ea6830ba13ee73bd373bfc9
2019-09-14 14:07:16 -04:00
Jay Cornwall 5b64fbd0e5 Implement code cache (SQC I$, SQC K$, TCP, GL1, GL2) invalidation for gfx10
Change-Id: I8b2a59118094fbb55e3f575fa9f79959d3725d7d
2019-09-14 14:06:31 -04:00