Commit graph

2925 commits

Author SHA1 Message Date
Eric Huang d49fbe2475 kfdtest: change baseline in BasicCuMaskingEven
To keep normal performance, we have to use symmetrical cu masks
based on shader engines. So change baseline from 1 cu to 1 cu
per SE.

Change-Id: I9e83a87fb670bb406f7983714fa0d8ab673609eb
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>


[ROCm/ROCR-Runtime commit: caa744944e]
2019-12-11 14:48:55 -05:00
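
Illustrative note on d49fbe2475 (a sketch, not code from the commit): the message argues for CU masks that are symmetrical across shader engines. The Python below shows one way to build and check such a mask. It assumes that bit i of the mask maps to shader engine (i % num_se), which the commit message does not state; the function names are hypothetical.

```python
def cu_mask_per_se(num_se, cus_per_se):
    """Build a mask enabling `cus_per_se` CUs in each of `num_se` shader engines.

    ASSUMPTION: mask bits are distributed round-robin over the SEs,
    i.e. bit i belongs to SE (i % num_se).
    """
    return (1 << (num_se * cus_per_se)) - 1


def cus_enabled_per_se(mask, num_se):
    """Count the enabled CUs per shader engine under the same assumed layout."""
    counts = [0] * num_se
    bit = 0
    while mask:
        if mask & 1:
            counts[bit % num_se] += 1
        mask >>= 1
        bit += 1
    return counts


# Old baseline (a single CU) versus the new one (one CU per SE), for 4 SEs.
print(cus_enabled_per_se(0x1, 4))
print(cus_enabled_per_se(cu_mask_per_se(4, 1), 4))
```

Under that assumption, the single-CU mask leaves three of four shader engines idle, while the per-SE baseline enables the same number of CUs in each.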
Sean Keely 3bcde37b58 Initial GWS queue support.
Queues should transition to ref counting for all queues eventually.
That cleanup will be part of shared queue pooling support.

Change-Id: I217ff5d573156678b9559da6fb81baa8cd31c617


[ROCm/ROCR-Runtime commit: 0a43a107b1]
2019-12-09 21:21:17 -05:00
Ramesh Errabolu a161850fcf Add package dependencies for Images and Tools
Change-Id: Id77ba0e81d3b3e872153cdd7680338dd70319026


[ROCm/ROCR-Runtime commit: 144017e148]
2019-12-06 16:38:21 -06:00
Yong Zhao ee8ab015dd kfdtest: Add some logs to Atomics test
This will help us triage the unexpected hangs on the farm systems.
Meanwhile, simplify the logic.

Change-Id: Ie50b97a34cb86891720dce11f2d178bff9aa2cd5
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: 8f140fc03d]
2019-12-02 12:56:03 -05:00
Jonathan Kim efef21f4ff add queue snapshot test
Adds an API and a test to get a snapshot of newly created queues per ptraced process.

Change-Id: Ife97123a5b930e837ccaa386801145ef23c2cc2c
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>


[ROCm/ROCR-Runtime commit: 8b01a1c4c5]
2019-12-02 11:56:04 -05:00
Yong Zhao 8feaf7585f kfdtest: Declare SetUpTestCase() to be public
SetUpTestCase() and TearDownTestCase() are declared as protected,
preventing us from using TEST_P().

Change-Id: I1d049a475a1b3bd44b5f96305a48751b90d572ce
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: ad76ffd544]
2019-12-02 11:38:18 -05:00
Oak Zeng 6cb65e8c37 Support XGMI optimized SDMA queue
Change-Id: I6ad62fc94a9838df505879b1ddaccfeb9881a6e8
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 0b71bbb787]
2019-11-26 21:42:47 -05:00
Yong Zhao c8eb45a5fa kfdtest: Merge the two largest buffer test helper functions
This is cleaner.

Change-Id: I7740f3e0f93a63b35fefc3cb69712dfad68df552
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: 4daa25fceb]
2019-11-25 18:45:23 -05:00
Yong Zhao 53c9d60035 kfdtest: Split BigBufferStressTest into two smaller tests
The previous BigBufferStressTest has too much stuff and takes a long
time to run. By separating largest*BufferTest out into other
tests, we dramatically reduce the time to run BigBufferStressTest and
therefore make reproducing issues much easier.

Meanwhile, rename the test to BigSysBufferStressTest to express more
information.

Change-Id: I5911f113c0bd50627ee6d84bbb4f2972cbed8886
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: b6cefa7bda]
2019-11-25 18:28:17 -05:00
Yong Zhao 44d016bc49 kfdtest: Remove the queue submission in BigBufferStressTest
In order to accommodate the flaky queue submission under memory
shortage scenarios, BigBufferStressTest has become very much a hack,
undermining its purpose of testing the basic memory related operations.

Therefore, remove the queue submission part. The EvictTest should serve
the purpose of testing the memory and queue submission functionalities
when memory eviction happens.

Change-Id: I3c3603a0e834267eccb72f46efeabe1e053c8fc5
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: cbe21fa261]
2019-11-25 18:25:23 -05:00
Sean Keely 511b86a55e Allow disabling scratch reclaim.
Debug and RCCL NPI assist feature.

Change-Id: I2cb76f0a086fa341465df3ede26965ab713bc3b4


[ROCm/ROCR-Runtime commit: d2a50a0048]
2019-11-20 02:41:58 -05:00
Sean Keely c6592f7757 Raise large scratch allocation limit for RCCL.
Temporary workaround for 2.10 release.  RCCL, compiler, or firmware
must be corrected and this code reverted before another ASIC release.

Change-Id: I27851353289b93df9acb72d28b8c6ccb9f7f7d7a


[ROCm/ROCR-Runtime commit: 35c1ffa863]
2019-11-20 02:41:27 -05:00
Sean Keely 7af8b533c2 Correct clang paths for incremental builds.
We should be using bin/clang, not the build/lightining/bin/clang since
build/ is the project's internal build directory.  This patch corrects
this where possible.

However, lightning does not install all its end user files under out/
so some headers cannot be found anywhere in out/ in an incremental build.
This header (opencl-c.h) is fetched out of the lightning source tree if
necessary.

Change-Id: I083d8b27bb39dd615fba3bb0711a789318f95e77


[ROCm/ROCR-Runtime commit: f62017f4a5]
2019-11-16 03:10:28 -06:00
Felix Kuehling d89c9c1f64 kfdtest: Change PM4 dispatch to workaround GWS-related FW problem
v2: Remove useless AcquireMem packet after the dispatch
v3: Minimum poll_interval is 4

Change-Id: I352eb21c781ed9e03d62c0febd532da6a9854afa
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 8afdf52001]
2019-11-15 15:27:21 -05:00
Felix Kuehling 1ff1d4418d kfdtest: Add PM4WaitRegMemPacket
v2: optimize_ace_offload_mode=1, recommended by firmware team

Change-Id: Ia54e37242b4eaaf631c35e61a59f03ee0f85ca35
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 0b7fcadc63]
2019-11-15 15:27:21 -05:00
Yong Zhao 471f9cf127 kfdtest: Expand KFDQMTest.MultipleCpQueues to cover all CP queues
Because of that, rename the test to AllCpQueues.

Change-Id: I57105f863db2558e850c703d151ffebcce2c7a17
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: a4d570fa2b]
2019-11-15 08:25:14 -05:00
Huang Rui d0c78bf4d5 libhsakmt: add gfx90c support for thunk
This patch adds support for the gfx90c APU. For now we treat it as a "dgpu" and as
gfx900. The HSA gfxip table will be updated once ISA/LLVM support for gfx90c is implemented.

Change-Id: I6ef164bf3e751fe6dd6287cac212a500dce84b1a
Signed-off-by: Huang Rui <ray.huang@amd.com>


[ROCm/ROCR-Runtime commit: fdba74c2fb]
2019-11-14 20:02:53 -05:00
Philip Cox 5b008cefc8 kfdtest: Enable kfd debugger tests on gfx10
Change-Id: If3f86624c76805a6bc0e154d31b957eb63120418
Signed-off-by: Philip Cox <Philip.Cox@amd.com>


[ROCm/ROCR-Runtime commit: 6742e585be]
2019-11-13 08:29:46 -05:00
Laurent Morichetti 6e3347ed25 Fix a typo in INSN_S_ENDPGM_OPCODE encoding.
The correct s_endpgm instruction encoding is 0xBF810000.

Change-Id: I03f304762dcaced5bf3fa4f069da7a0b287d1cd2


[ROCm/ROCR-Runtime commit: 5774d9162b]
2019-11-12 11:54:17 -08:00
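
Illustrative note on 6e3347ed25 (a sketch, not code from the commit): the value 0xBF810000 can be sanity-checked by splitting it into fields. The Python below assumes the SOPP scalar instruction layout, with a fixed prefix in bits [31:23], the opcode in bits [22:16], and a 16-bit immediate in bits [15:0]. The helper name is hypothetical.

```python
# ASSUMPTION: SOPP layout is prefix [31:23], opcode [22:16], SIMM16 [15:0].
SOPP_PREFIX = 0b101111111  # 0x17F

INSN_S_ENDPGM = 0xBF810000  # value stated in the commit message


def decode_sopp(word):
    """Split a 32-bit instruction word into (is_sopp, opcode, simm16)."""
    prefix = (word >> 23) & 0x1FF
    opcode = (word >> 16) & 0x7F
    simm16 = word & 0xFFFF
    return prefix == SOPP_PREFIX, opcode, simm16


print(decode_sopp(INSN_S_ENDPGM))  # prints (True, 1, 0)
```

With that layout the word decodes as an SOPP instruction with opcode 1 and a zero immediate.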
Yong Zhao 82c5ab47a6 kfdtest: Rename KFDQMTest.MultipleSdmaQueues to AllSdmaQueues
The test actually tested all available SDMA queues, so change the name
to reflect the fact.

Change-Id: Ia23df3e5ac79b692b0b60194b05603ba8dd897a4
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: 4b36a1e728]
2019-11-08 19:00:37 -05:00
Yong Zhao d98400f647 kfdtest: Use GetSystemTickCountInMicroSec() as much as possible
GetSystemTickCountInMicroSec() wraps the function gettimeofday().

Change-Id: I7b767a6efdd1db491fc8113313945b578ac69382
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: 174484aac3]
2019-11-07 18:21:04 -05:00
Philip Yang 44c020f2a3 kfdtest: use flag NoNUMABind for more test cases
If a NUMA system has no available memory on node 0, mbind will fail on node
0, so set flag NoNUMABind=1 to bypass mbind for all test cases which use
node 0 and allocate system memory.

Change-Id: I7962938ad2bed5a293ca5e6a8500c7f7e15ff453
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 2fa7d23a82]
2019-11-06 18:33:46 -05:00
Philip Yang bdb519bd35 libhsakmt: use the closest NUMA node to allocate queue ctx area
On a NUMA system, allocate the queue ctx save restore area on the NUMA
node closest to the GPU on which the queue is going to run. This generally
improves performance on NUMA systems by reducing scheduling latency and
fixes the unstable performance of the multi-node rccl-tests.

If the closest NUMA node has no memory available, set flags NoNUMABind=1
to bypass mbind, to use default NUMA memory policy to allocate system
memory.



Change-Id: Ic62bfa5bb2efbf4f6ae79ff403e9610ddf18d45c
Signed-off-by: Philip Yang <Philip.Yang@amd.com>


[ROCm/ROCR-Runtime commit: 59c857476f]
2019-11-06 17:33:26 -05:00
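
Illustrative note on bdb519bd35 (a sketch, not code from the commit): the policy in the message is to prefer the NUMA node closest to the GPU and to fall back to the default memory policy when that node has no memory. The Python below models only that decision. The function name, the dictionaries, and the byte threshold are all hypothetical.

```python
def pick_ctx_node(distances, free_bytes, needed):
    """Return (node, no_numa_bind) for a queue ctx save restore area.

    distances:  {node_id: distance from the GPU}   (hypothetical input)
    free_bytes: {node_id: free system memory}      (hypothetical input)
    needed:     allocation size in bytes
    """
    closest = min(distances, key=distances.get)
    if free_bytes.get(closest, 0) >= needed:
        # Bind the allocation to the closest node.
        return closest, False
    # Closest node cannot satisfy the request: skip mbind entirely and
    # let the default NUMA memory policy place the allocation.
    return None, True


print(pick_ctx_node({0: 20, 1: 10}, {0: 1 << 30, 1: 1 << 30}, 4096))
print(pick_ctx_node({0: 20, 1: 10}, {0: 1 << 30, 1: 0}, 4096))
```

The second call corresponds to the NoNUMABind=1 case described in the commit message.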
Ramesh Errabolu 1798d1386b Fix Aql Header initialization
Change-Id: If23c12b28d22fb62b673038890f40bb0f3c3948b


[ROCm/ROCR-Runtime commit: 086b4b686c]
2019-11-06 16:13:58 -05:00
Ori Messinger 7fad396d95 Add non-priv PMC blocks to GFX10
This patch adds the non-privileged PMC blocks for GFX10/gfx1010.

Change-Id: I4b98cb2159d71113c12920ca7fd10e45096b4e2c
Signed-off-by: Ori Messinger <Ori.Messinger@amd.com>


[ROCm/ROCR-Runtime commit: e7f45fae8a]
2019-11-05 13:07:13 -05:00
Felix Kuehling 0e79756932 kfdtest: Return address of packet from IndirectBuffer::AddPacket
This can be used for allocating space in the IB for write-back data
in a NOP-packet with a payload.

Change-Id: I6202b89a455a65326366a15291789004dfdcc0b9
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 07b8c30ce8]
2019-11-01 18:31:28 -04:00
Felix Kuehling 7b3b7de7b6 kfdtest: Add PM4 NOP packet payload capability
Useful for allocating space for write-back data in a queue or IB.

Change-Id: I47bd2dcb8537a91410e9a91413979a8a2c1c5f21
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 664bb13bc4]
2019-11-01 18:30:52 -04:00
Felix Kuehling 1e6ec31611 kfdtest: Re-factor dynamic packet allocation into BasePacket
Avoids unnecessary and error-prone duplication of dynamic packet
allocation in many packet classes.

Change-Id: Icec981ae2cd7a3d88cdf9213298940d85627f853
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: ea7619c929]
2019-11-01 18:30:52 -04:00
Felix Kuehling b6487541a6 kfdtest: Fix SDMA NOP packet size
packetSize is used as the size in bytes.

Change-Id: I900f9a4f37b840cbb957184705db04a4c64d1654
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: 12e4ea94f1]
2019-11-01 18:30:52 -04:00
Oak Zeng d5d66661f6 Handle IOCTL failure in fmm_release
The FREE_MEMORY_OF_GPU ioctl could fail, e.g., if memory is still mapped
to the GPU. Handle this failure by returning an error from fmm_release/HsaKmtFreeMemory.

Change-Id: I5461db39964f733cf97376d50e44906a9b4c0f13
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: fa0cb9ebeb]
2019-11-01 08:59:05 -04:00
Oak Zeng 6b094ef4d8 Unmap memory from GPU before free
Change-Id: Ic33b17cbaee5de7908d37527254f4f146e6b71e3
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>


[ROCm/ROCR-Runtime commit: 7593b41575]
2019-11-01 08:58:55 -04:00
Qingchuan Shi 2ab9ce6d5c Adding code object list in loader.
Change-Id: Iab3541287bd56276fd32615ee59fcd590de84ca0


[ROCm/ROCR-Runtime commit: 16a20cfb8c]
2019-10-30 20:31:51 -04:00
Jay Cornwall 698ddfed09 Merge debugger trap handler into ROCr trap handler
Debugger path is taken for (trap_id >= 3) and single step exceptions.
Other traps/exceptions behave as before.

Change-Id: I276c0eb69953709968353a57717ee017d22348a2


[ROCm/ROCR-Runtime commit: 78e754935c]
2019-10-30 13:56:06 -04:00
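
Illustrative note on 698ddfed09 (a sketch, not code from the commit): the routing rule in the message can be written as a single predicate. The real trap handler is not written in Python; the function name and return strings below are hypothetical.

```python
DEBUGGER_TRAP_ID_MIN = 3  # threshold quoted in the commit message


def trap_route(trap_id, single_step=False):
    """Return which handler path a trap takes under the stated rule."""
    if single_step or trap_id >= DEBUGGER_TRAP_ID_MIN:
        return "debugger"
    # All other traps/exceptions keep their previous behaviour.
    return "default"


for trap_id in (1, 2, 3):
    print(trap_id, trap_route(trap_id))
```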
Sean Keely 37f96146f5 Correct strip command.
Strip should only apply to the output target library.  Symlinks
with .so endings which will be relocated during install will cause
strip to fail, aborting the build.

Change-Id: Ieb598c2cec5277d9d14c8afa88b91ca2c7f4412d


[ROCm/ROCR-Runtime commit: 851ee799c4]
2019-10-30 01:24:43 -05:00
Sean Keely afc463c817 Adapt to new versioning.
Using branch point for count since last change since we don't
have questions answered on tags yet.

Removed unused CMake files.
Restructured CMake to use the cache rather than only commandline
and be ccmake & cmake-gui friendly.
Dependency search paths are added for the Repo tree layout.

Search paths still needed for install paths.

Simplified packaging.  hsa-ext-rocr-dev package and contents now
build from the package CMake rather than being 3 separate projects.

Not applying new version number or new install paths!

Change-Id: Ibea50dc8a6ab091e91857f78833f5379a4511547


[ROCm/ROCR-Runtime commit: 6c3acda664]
2019-10-29 04:21:17 -05:00
Jeremy Newton e9dd9ef156 libhsakmt: default packaging prefix to install
If CPACK_PACKAGING_INSTALL_PREFIX is not set, it's safer to assume it's the
same as CMAKE_INSTALL_PREFIX, in case only the latter is set.

Change-Id: I70727ebbc50f21f8d6e3add10d7f9f2d5e747dee
Signed-off-by: Jeremy Newton <Jeremy.Newton@amd.com>


[ROCm/ROCR-Runtime commit: cb4d4107ed]
2019-10-25 15:40:03 -04:00
Yong Zhao 4c3a63755c kfdtest: Add a function to wait indefinitely until user input
This makes it easier to halt execution during debugging.

Change-Id: I058c81bbc9f655bbc477b2b542d6b43aa423324c
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: da9a404ac3]
2019-10-23 11:07:43 -04:00
Yong Zhao 35067d85fd kfdtest: Add a Nop packet submission test for CP and SDMA queue
The tests are useful to triage the fundamental queue submission
functionality by excluding the packet format variable from the equation.

Change-Id: I2c7fcda811f93bdefc1b62396233559416be44e7
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: ac5c433420]
2019-10-22 13:42:33 -04:00
Yong Zhao 54cc49ef1a kfdtest: Add SDMANopPacket class
The class is very useful for triaging complex SDMA issues.

Change-Id: Ib5de729f7fc62f41e894ef98d3967e7e1745d454
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: 75c654cfea]
2019-10-22 13:41:44 -04:00
Yong Zhao 9bb602199c kfdtest: Add a core test filter for software scheduler mode
The new filter can be used by "./run_kfdtest.sh -p core_sws".

Change-Id: I1c43669cfc07c09ccafb9fa2e2851932ac59307d
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: 98b0652917]
2019-10-22 11:23:33 -04:00
Yong Zhao e667ac8648 kfdtest: Temporarily disable performance counter tests for gfx1010
We are still working on those tests for gfx1010, so disable them
temporarily.

Change-Id: I5d51b4b02bc753137014684859cc033f759b2899
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: ffbdb726ac]
2019-10-22 10:39:58 -04:00
Felix Kuehling e5c7385d3d libhsakmt: Fix installation path for pkg-config file
CMAKE_INSTALL_PREFIX would give an incorrect result for packages
built by CPack.

Change-Id: Iaeb4d30eb44080b7924ecf892de011ed6e365f5f
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>


[ROCm/ROCR-Runtime commit: bee47ab013]
2019-10-21 14:53:54 -04:00
Yong Zhao c1b8cb1464 libhsakmt: Add a message when a device is not supported
This helps to quickly triage problems.

Change-Id: Iad2b4b74209ab972be0c2f6311eeb3aaf098d29f
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: ab2daf6538]
2019-10-20 12:44:13 -04:00
Yong Zhao 8b8f758dab libhsakmt: Add gfx1012 device IDs
Now the gfx1012 device IDs are okay to reveal.

Change-Id: I9da2a036b74ec7b6b8b1fb7587597a5847f02205
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: 1c7755d2da]
2019-10-17 12:35:22 -04:00
Yong Zhao 3872b5ce32 libhsakmt: Print an error message when map_mmio fails
Without this change, the failure was hard to notice when it happened.

Change-Id: I99c3e8cea0d0cbd3bcfe79069410e6e870e225bf
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>


[ROCm/ROCR-Runtime commit: 16fa78b134]
2019-10-17 12:32:32 -04:00
Amber Lin 211742a1f7 libhsakmt: handle CPU cache info on non-NUMA sys
When CONFIG_NUMA is not enabled in the kernel config, only one CPU node
is present on the system and the /sys/devices/system/node/nodeX directories
don't exist. Read CPU cache information from /sys/devices/system/cpu in
this situation.

Change-Id: I017ff17dd72678a0551edcc77446664501aa42ca
Signed-off-by: Amber Lin <Amber.Lin@amd.com>


[ROCm/ROCR-Runtime commit: 23541e0289]
2019-10-17 11:53:19 -04:00
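
Illustrative note on 211742a1f7 (a sketch, not code from the commit): the fallback in the message amounts to checking whether any nodeX directories exist and using the cpu directory otherwise. The Python below models that choice against a configurable sysfs root so it can be tried without a real non-NUMA kernel. The function name and the root parameter are hypothetical.

```python
import os


def cpu_cache_dirs(sysfs_root="/sys/devices/system"):
    """Return the directories to scan for CPU cache information.

    Uses node/nodeX when such directories exist; otherwise falls back
    to the cpu directory, as on a kernel built without CONFIG_NUMA.
    """
    node_root = os.path.join(sysfs_root, "node")
    if os.path.isdir(node_root):
        nodes = sorted(
            d for d in os.listdir(node_root)
            if d.startswith("node") and d[4:].isdigit()
        )
        if nodes:
            return [os.path.join(node_root, d) for d in nodes]
    return [os.path.join(sysfs_root, "cpu")]


print(cpu_cache_dirs())
```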
Eric Huang 0cd804f8a0 kfdtest: add xgmi path for p2p tests
When large bar is not available, we can use
xgmi to do p2p tests.

Change-Id: Ib7b59fb8a4d41f605739a0428973f6b2f1a3450f
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>


[ROCm/ROCR-Runtime commit: 0174377351]
2019-10-17 10:21:10 -04:00
Chris Freehill 6687f73732 Add some gfx1xxx targets
This is to fix 

Change-Id: I69a87884d8174733905e4c007cf0f19b5103482a


[ROCm/ROCR-Runtime commit: 53228ad819]
2019-10-16 09:01:22 -04:00
Philip Cox 74fe695127 Remove debugger data reg accesses
The debug trap accesses the data0/data1 registers, so we do not
want the userspace to write values to it.  We remove the calls to
set the data0/data1 register values.

Change-Id: Iaba842a4c445f339f16a39fe1994526ff78a2f3c
Signed-off-by: Philip Cox <Philip.Cox@amd.com>


[ROCm/ROCR-Runtime commit: 6933540c81]
2019-10-10 14:32:54 -04:00
Philip Cox 00c5838996 kfdtest: Check kfd debugger version in tests
Need to check the kfd debugger version of the kernel before
calling kfd debugger tests in kfdtest.  If they are out of sync,
the tests may fail.

Change-Id: I1df5e89fb1199304e6fbe8973c60b76062514c03
Signed-off-by: Philip Cox <Philip.Cox@amd.com>


[ROCm/ROCR-Runtime commit: efe3769835]
2019-10-10 14:32:54 -04:00