Commit Graph

2959 Commits

Author SHA1 Message Date
Sean Keely 6957202df8 Update pointer info to include IPC memory.
IPC memory was previously returned as HSA_POINTER_ALLOCATED and
had garbage in the node_id field.  Due to ROCR_VISIBLE_DEVICES
we need to be able to distinguish between imported memory and
regular memory because imported memory may not be owned by an
agent that is visible in the process.  Differentiating these flags
allows the users to expect null agent for the owning agent.

Fixes 

Change-Id: Ide3489cec1ee2072dc9697fa5cb71ddb17999d14
2020-02-05 01:55:39 -06:00
Kent Russell f2b8965d7b Fix upgrade for new package name in RPM
The package name changed, so an upgrade (unlike a plain install) will
end up throwing errors.

Change-Id: Ibc7876a66a414034a00f924cdd750e6a08d6c9cc
2020-01-31 10:41:35 -05:00
Yong Zhao 4f2ff25a3d kfdtest: Enable some tests on gfx1xxx series
Those tests are currently all passing.

Change-Id: I233afe33e8275d482bab5b5590b856fce49af76d
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2020-01-29 21:23:05 -05:00
Philip Yang 5858aa17a9 libhsakmt: Ignore mbind failure if flag NoSubstitute = 0
Per the Thunk spec, when flag NoSubstitute = 0 and the specific memory
type is not available on a node, the allocation may fall back to other
memory that can replace it on that node. mbind returns failure if no memory
is available on the specific node, so we should ignore the mbind failure
in this case.

Change-Id: I651a1bedf1852330604e56965cc17862403ebf87
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2020-01-27 13:27:37 -05:00
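The decision described in 5858aa17a9 can be sketched as a small pure function. This is an illustration only, not the libhsakmt code: the name `check_bind_result` and its return convention are hypothetical, and treating a failed bind as an error when NoSubstitute = 1 is an assumption inferred from the commit message.

```c
#include <stdbool.h>

/* Hypothetical helper: decide whether a failed bind-to-node attempt
 * should fail the whole allocation.
 *   bind_ret      - result of the bind call (0 on success, -1 on failure,
 *                   the convention mbind uses)
 *   no_substitute - value of the NoSubstitute flag on the request
 * Returns 0 to continue with the allocation, -1 to report an error. */
int check_bind_result(int bind_ret, bool no_substitute)
{
    if (bind_ret == 0)
        return 0;

    /* NoSubstitute = 0: fallback memory is acceptable, so the bind
     * failure is ignored. */
    if (!no_substitute)
        return 0;

    /* Assumption: with NoSubstitute = 1 the failure is still an error. */
    return -1;
}
```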
Eric Huang 634b30119a kfdtest: fix infinite loop in sdma multicopy
When the variable interval becomes zero, the loop while (timeConsumption >= interval)
becomes an infinite loop. The fix is to avoid that case.

Change-Id: I8fd07296925300bace5ab7d3da86482b6d8b0d03
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
2020-01-23 14:54:23 -05:00
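A minimal sketch of the failure mode in 634b30119a, assuming unsigned values and a loop body that subtracts the interval; the actual kfdtest loop body and fix are not shown in the log. The helper name `count_intervals` and the clamp to a minimum interval of 1 are hypothetical.

```c
#include <stdint.h>

/* Hypothetical helper: counts how many whole intervals fit into
 * time_consumption by repeated subtraction. */
uint64_t count_intervals(uint64_t time_consumption, uint64_t interval)
{
    /* Guard: with interval == 0 the loop condition below stays true
     * forever, because subtracting zero never makes progress. */
    if (interval == 0)
        interval = 1;

    uint64_t n = 0;
    while (time_consumption >= interval) {
        time_consumption -= interval;
        n++;
    }
    return n;
}
```

Clamping is only one way to avoid the zero case; returning early or rejecting the input would work equally well.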
Sean Keely 3e9aca0f34 Support stripped binaries and remove unneeded attributes.
Attribute optimize(0) doesn't appear to be helpful.  This
prevents optimization in the function but not at call sites to the
function.  The function may still be inlined since it has no side
effect (in some cases that we currently don't support).

Having a side effect prevents a call site optimization that allows
removal of a noinline function call with no side effect.  Call site
optimization should only happen (in GCC at least) when using whole
program optimization so this may be stronger than we strictly need.

Also added _amdgpu_r_debug to the exported symbol list (global) and
switched to the standard macro for an exported symbol (HSA_API).
Without being in the global list the debugger will not find this
symbol if the binary has been stripped.

Change-Id: Ieb00175ccc55fda4491deee44711cd55b3f24aeb
2020-01-21 20:08:02 -05:00
Felix Kuehling 87e10cd0b4 libhsakmt: Improve error handling in child process
Check for errno == EBADF in kmtIoctl to detect misuse of the kfd_fd
in a forked child process.

Detect being in a forked child process pro-actively by implementing
a pthread_atfork callback.

Make sure all mutexes get reinitialized in the child process to avoid
deadlocks.

Check for being in a forked child process in CHECK_KFD_OPENED so that
all hsaKmt functions will return the appropriate status
HSAKMT_STATUS_KERNEL_IO_CHANNEL_NOT_OPENED.

Update InvalidKFDHandleTest to expect that error code.

Change-Id: I0238e5fba344dcaa454e97a35db2e2dcc8d1f607
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2020-01-20 18:01:21 -05:00
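The fork detection described in 87e10cd0b4 can be sketched with the POSIX `pthread_atfork` call. All names below (`lib_mutex`, `is_forked_child`, `check_usable`, and so on) are hypothetical stand-ins, not the libhsakmt identifiers, and the sketch covers a single mutex rather than every lock in the library.

```c
#include <pthread.h>
#include <stdbool.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical library state; the real names are not shown in the log. */
static pthread_mutex_t lib_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool is_forked_child = false;

/* Runs in the child right after fork(): record that we are a child and
 * reset the mutex, which may have been held by a thread that does not
 * exist in the child. */
static void child_after_fork(void)
{
    is_forked_child = true;
    pthread_mutex_init(&lib_mutex, NULL);
}

/* Register the handler once, for example when the library is opened.
 * Returns 0 on success. */
int install_fork_handler(void)
{
    return pthread_atfork(NULL, NULL, child_after_fork);
}

/* Stand-in for a CHECK_KFD_OPENED-style guard:
 * 0 if the library is usable, -1 in a forked child. */
int check_usable(void)
{
    return is_forked_child ? -1 : 0;
}

/* Demo: fork and report what check_usable() saw in the child
 * (1 = rejected, 0 = usable), or -1 if fork/waitpid failed. */
int child_is_rejected(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(check_usable() == -1 ? 1 : 0);

    int status = 0;
    if (waitpid(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

After `install_fork_handler()` succeeds, `check_usable()` keeps returning 0 in the parent, while any forked child sees -1 without having to issue an ioctl first.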
Freddy Paul 4b2256ac21 ADD libhsakmt.so RUNPATH for SLES
For SLES libhsakmt.so is located in <ROCm install>/lib64

Change-Id: I038dd80b65b4a493ac37981649b02f1b35caea88
2020-01-16 22:20:14 -05:00
Laurent Morichetti 19e1fb3a4e Fix a build error when compiling with clang
Check __clang__ before __GNUC__ as clang defines both.

Change-Id: I9963f8e0665efb4cb08bd3886fb38fee42dd9861
2020-01-15 18:52:53 -08:00
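The ordering issue in 19e1fb3a4e can be illustrated with a small preprocessor chain. This is a sketch, not the code from the commit: the macro names `COMPILER_NAME` and `NO_OPT` and the function `compiler_name` are hypothetical, and pairing clang's `optnone` with GCC's `optimize(0)` is an assumption about how such a check might be used.

```c
/* clang defines both __clang__ and __GNUC__, so it has to be tested
 * first; otherwise the GCC branch would be taken for clang as well. */
#if defined(__clang__)
#define COMPILER_NAME "clang"
#define NO_OPT __attribute__((optnone))
#elif defined(__GNUC__)
#define COMPILER_NAME "gcc"
#define NO_OPT __attribute__((optimize(0)))
#else
#define COMPILER_NAME "unknown"
#define NO_OPT
#endif

/* Hypothetical function that must not be optimized. */
NO_OPT int unoptimized_add(int a, int b)
{
    return a + b;
}

/* Reports which branch of the chain was selected. */
const char *compiler_name(void)
{
    return COMPILER_NAME;
}
```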
Srinivasan Subramanian 54d94d02bd Avoid shared library conflicts across multiple ROCM versions
Adding patch number based on ROCM build/release to have unique
file name for libraries across multiple versions of ROCM.

Signed-off-by: Pruthvi Madugundu <pruthvi.madugundu@amd.com>

Change-Id: I58d665b0e7d577b5bd7a6000d1202a0242672727
2020-01-15 01:08:23 -05:00
Qingchuan Shi d63886190f fix optimize(0) for clang.
Change-Id: I83bc57d42815f37445ae97bf6950147e3358ac45
2020-01-13 20:53:40 -05:00
Yong Zhao fe97612800 kfdtest: Add basic tests for XGMI SDMA queues
After XGMI SDMA queues were separated from regular SDMA queues, they
were not covered in the current tests. Add tests for them now.

Change-Id: I036e3ca5d583ab7f022a9dc6cda3ef867f4773a0
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2020-01-10 15:32:12 -05:00
Srinivasan Subramanian 5c2fc61d95 Use runpath instead of rpath for libraries
The --enable-new-dtags linker option is added.

Change-Id: I1f406b3f30ddc6491aad3ef7a84dfd415917b1aa
Signed-off-by: Freddy Paul <Freddy.paul@amd.com>
2020-01-09 18:05:57 -05:00
Pruthvi Madugundu 3f8a07e460 Support installation of multiple ROCM versions
Changes to resolve
1) Multiple rocm release installation support
2) Multiple rocm shared lib conflicts

Change-Id: I792feb36cdf4516d108f1ef71abe0c87522f018a
Signed-off-by: Pruthvi Madugundu <pruthvi.madugundu@amd.com>
Signed-off-by: Freddy Paul <Freddy.paul@amd.com>
2020-01-09 18:05:24 -05:00
Yong Zhao 2b70d73f68 kfdtest: Improve the stability of SignalHandling test
On gfx1012, allocating 1/4 of the system memory on a 32G RAM machine
could fail, causing this test to fail. Limit the maximum buffer
to allocate to be smaller than 3G to accommodate this situation.

Change-Id: I38b0a0f7da1f0b9ca851e04d2d0a51767858c801
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2020-01-07 17:28:57 -05:00
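The sizing rule in 2b70d73f68 amounts to taking the smaller of two values. The sketch below assumes "3G" means 3 GiB and that "smaller than" means strictly below that limit; the helper name `max_test_buffer_size` and the `GIB` macro are hypothetical.

```c
#include <stdint.h>

#define GIB (1024ULL * 1024ULL * 1024ULL)

/* Hypothetical helper: a quarter of system memory, kept strictly
 * below 3 GiB. */
uint64_t max_test_buffer_size(uint64_t system_mem_bytes)
{
    const uint64_t cap = 3 * GIB - 1;   /* "smaller than 3G" */
    uint64_t quarter = system_mem_bytes / 4;
    return quarter < cap ? quarter : cap;
}
```

On a 32 GiB machine a quarter of memory is 8 GiB, so the cap applies; on an 8 GiB machine the quarter (2 GiB) is used unchanged.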
Kent Russell 1589cf5dc5 Fix package naming without using newer variables
CMake 3.6.0 and older don't have CPACK_RPM_FILE_NAME support, so make
the packaging simpler by checking the OS. If it's Ubuntu, call the dev
package hsakmt-roct-dev, otherwise call it hsakmt-roct-devel, since
Ubuntu is the only officially-ROCm-supported deb-based OS, and
deb-based dev packages end with -dev (while rpm-based use -devel)

Change-Id: If75324f82f507c3b312bb6176c06643d521ccb68
2020-01-06 09:21:58 -05:00
Oak Zeng 38b429fd52 Fix GWS test function call parameter issue
Change-Id: I3f4fde9ec8268362cffecb2d95177913b583486f
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2020-01-06 08:11:49 -05:00
Huang Rui 06464b917d libhsakmt: add NumCpQueues and NumSdmaQueuesPerEngine data field (v3)
NumCpQueues and NumSdmaQueuesPerEngine should be obtained from the kfd driver, not hardcoded.
So add two data fields to HsaNodeProperties so that the thunk can read them from the sysfs
entries exposed by kfd.

v2: change NumCpQueues/NumSdmaQueuesPerEngine to one byte.
v3: merge two commits into one to avoid updating the ABI twice.

Change-Id: Ie386e4685f13493e22db6e207a399db6a4c5b9dc
Signed-off-by: Huang Rui <ray.huang@amd.com>
2020-01-03 23:27:42 -05:00
Oak Zeng af7feb93db Add KFD GWS test
Change-Id: Ie90d9119da6cee41ddab10deb427d4ae9fd9a16b
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2020-01-03 15:19:56 -05:00
Yong Zhao facb6c056d kfdtest: Rework the shader used for CWSR test
The shader supports multiple work items. It also eliminates one input buffer.

Change-Id: If0596b306065980b74fb92613e95610defc00164
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2020-01-02 16:05:01 -05:00
Ramesh Errabolu be5782e198 Cpack rule encoding dependencies was broken
Change-Id: I2b8df7d255cdffd6b42713f0b59df2aeef83a607
2019-12-23 10:51:55 -06:00
Sean Keely 22a601292d Disable SDMA on gfx10.
Lack of cache controls only allows operating SDMA at
agent scope.  All copy APIs are defined at system scope, so using SDMA may
result in data errors.

Change-Id: I9cd10007defddcbf8feb14a2e3daa1ba17c0489f
2019-12-20 17:25:47 -06:00
Yong Zhao 22e9ef7303 libhsakmt: Add the perf counter support for gfx1012
Change-Id: I55d68a77928617edaabd33ae0807bf23f739c8de
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-12-18 20:49:36 -05:00
Ramesh Errabolu d015d78de3 Support the building and packaging of legacy ROCr tools by itself
Change-Id: I2247bf7a46ee93495340f7b2603b09dc6b667443
2019-12-18 19:20:03 -06:00
Yong Zhao 44db5cb011 kfdtest: Enable KFDExceptionTest on gfx906 and gfx1xxx series
KFDExceptionTest on those platforms is passing.

Change-Id: I328ee4fd4ff5b339e560f2f79e754fd34459210a
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-12-18 17:48:16 -05:00
Kent Russell 8b14ea2e83 kfdtest.exclude: Remove BasicCuMaskingEven from gfx908
The initial baseline measurements are proving inconsistent, which
results in the test failing more often with different variant rates

Change-Id: I1f4e04bf7d615cf39de9605bd5141a997b22cdfc
2019-12-18 14:24:53 -05:00
Kent Russell 57b42045d0 Set package name explicitly
Instead of letting CPack generate the filename according to
${CPACK_PACKAGE_NAME}-${CPACK_PACKAGE_VERSION}-${CPACK_SYSTEM_NAME},
set the package name ourselves. Also take the chance to rename
hsakmt-dev to hsakmt-devel for RPM, since that is the RPM convention.

Previous packaging format:
hsakmt-roct-1.0.9-229-g2144854-Linux.deb
hsakmt-roct-1.0.9-229-g2144854-Linux.rpm
hsakmt-roct-dev-1.0.9-229-g2144854-Linux.deb
hsakmt-roct-dev-1.0.9-229-g2144854-Linux.rpm

New format:
hsakmt-roct-1.0.9-292-gc66f8cf.x86_64.deb
hsakmt-roct-1.0.9-292-gc66f8cf.x86_64.rpm
hsakmt-roct-dev-1.0.9-292-gc66f8cf.x86_64.deb
hsakmt-roct-devel-1.0.9-292-gc66f8cf.x86_64.rpm

Change-Id: I4fc4e0fd2eafd25669c1cfffb39860e25a0b645c
Signed-off-by: Kent Russell <kent.russell@amd.com>
2019-12-18 12:21:33 -05:00
Konstantin Zhuravlyov 096e715629 Switch to llvm monorepo
Change-Id: Ibfe045afd811d36521486573168aecd06279ccb6
2019-12-17 22:55:20 -05:00
Yong Zhao f7c0172385 kfdtest: Rename two exception test cases
The old names are not accurate enough and we rename them according to
their corresponding fault types.

Change-Id: Icf4d52ba0ab9d49af5d912a0feb82665b1e8d344
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-12-17 19:46:33 -05:00
Yong Zhao 1c2c5a7b9a kfdtest: Delete two useless exception tests
The InvalidPPR* tests are only useful for gfx801 right now, on which
they won't trigger exceptions. So they are not relevant in the
KFDExceptionTest category. In addition, given AccessPPRMem already tests
the PPR memory functionality, we can just delete those two tests.

Change-Id: Id5c6e23c4c0ce47a4f04e9e1f0fa9083e0a9d0e0
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-12-17 19:37:14 -05:00
Yong Zhao 10ffc63d7b kfdtest: Add AllQueues test
This puts all CP and SDMA queues in a single test, which is
currently missing.

Change-Id: I98bf58df1be65fe9daf6311c016a48569a8ab674
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-12-17 19:25:14 -05:00
Chris Freehill 4c22962024 Import PAL version of addrlib with initial gfx1xxx support
Change-Id: I439930a5cbf5b13a359ec164e75c6828af8c668d
2019-12-12 21:38:01 -06:00
Eric Huang caa744944e kfdtest: change baseline in BasicCuMaskingEven
To keep normal performance, we have to use symmetrical cu masks
based on shader engines. So change baseline from 1 cu to 1 cu
per SE.

Change-Id: I9e83a87fb670bb406f7983714fa0d8ab673609eb
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
2019-12-11 14:48:55 -05:00
Sean Keely 0a43a107b1 Initial GWS queue support.
Queues should transition to ref counting for all queues eventually.
That cleanup will be part of shared queue pooling support.

Change-Id: I217ff5d573156678b9559da6fb81baa8cd31c617
2019-12-09 21:21:17 -05:00
Ramesh Errabolu 144017e148 Add package dependencies for Images and Tools
Change-Id: Id77ba0e81d3b3e872153cdd7680338dd70319026
2019-12-06 16:38:21 -06:00
Yong Zhao 8f140fc03d kfdtest: Add some logs to Atomics test
This will help us triage the unexpected hangs on the farm systems.
Meanwhile, simplify the logic.

Change-Id: Ie50b97a34cb86891720dce11f2d178bff9aa2cd5
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-12-02 12:56:03 -05:00
Jonathan Kim 8b01a1c4c5 add queue snapshot test
Adds an API and a test to get a snapshot of newly created queues per ptraced process.

Change-Id: Ife97123a5b930e837ccaa386801145ef23c2cc2c
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
2019-12-02 11:56:04 -05:00
Yong Zhao ad76ffd544 kfdtest: Declare SetUpTestCase() to be public
SetUpTestCase() and TearDownTestCase() are declared as protected,
preventing us from using TEST_P().

Change-Id: I1d049a475a1b3bd44b5f96305a48751b90d572ce
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-12-02 11:38:18 -05:00
Oak Zeng 0b71bbb787 Support XGMI optimized SDMA queue
Change-Id: I6ad62fc94a9838df505879b1ddaccfeb9881a6e8
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-11-26 21:42:47 -05:00
Yong Zhao 4daa25fceb kfdtest: Merge the two largest buffer test helper functions
This is cleaner.

Change-Id: I7740f3e0f93a63b35fefc3cb69712dfad68df552
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-11-25 18:45:23 -05:00
Yong Zhao b6cefa7bda kfdtest: Split BigBufferStressTest into two smaller tests
The previous BigBufferStressTest has too much stuff and takes a long
time to run. By separating largest*BufferTest out into other
tests, we dramatically reduce the time to run BigBufferStressTest and
therefore make reproducing issues much easier.

Meanwhile, rename the test to BigSysBufferStressTest to express more
information.

Change-Id: I5911f113c0bd50627ee6d84bbb4f2972cbed8886
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-11-25 18:28:17 -05:00
Yong Zhao cbe21fa261 kfdtest: Remove the queue submission in BigBufferStressTest
In order to accommodate the flaky queue submission under memory
shortage scenarios, BigBufferStressTest has become very much a hack,
undermining its purpose of testing the basic memory related operations.

Therefore, remove the queue submission part. The EvictTest should serve
the purpose of testing the memory and queue submission functionalities
when memory eviction happens.

Change-Id: I3c3603a0e834267eccb72f46efeabe1e053c8fc5
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-11-25 18:25:23 -05:00
Sean Keely d2a50a0048 Allow disabling scratch reclaim.
Debug and RCCL NPI assist feature.

Change-Id: I2cb76f0a086fa341465df3ede26965ab713bc3b4
2019-11-20 02:41:58 -05:00
Sean Keely 35c1ffa863 Raise large scratch allocation limit for RCCL.
Temporary workaround for 2.10 release.  RCCL, compiler, or firmware
must be corrected and this code reverted before another ASIC release.

Change-Id: I27851353289b93df9acb72d28b8c6ccb9f7f7d7a
2019-11-20 02:41:27 -05:00
Sean Keely f62017f4a5 Correct clang paths for incremental builds.
We should be using bin/clang, not the build/lightining/bin/clang since
build/ is the project's internal build directory.  This patch corrects
this where possible.

However, lightning does not install all its end user files under out/
so some headers cannot be found anywhere in out/ in an incremental build.
This header (opencl-c.h) is fetched out of the lightning source tree if
necessary.

Change-Id: I083d8b27bb39dd615fba3bb0711a789318f95e77
2019-11-16 03:10:28 -06:00
Felix Kuehling 8afdf52001 kfdtest: Change PM4 dispatch to work around GWS-related FW problem
v2: Remove useless AcquireMem packet after the dispatch
v3: Minimum poll_interval is 4

Change-Id: I352eb21c781ed9e03d62c0febd532da6a9854afa
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2019-11-15 15:27:21 -05:00
Felix Kuehling 0b7fcadc63 kfdtest: Add PM4WaitRegMemPacket
v2: optimize_ace_offload_mode=1, recommended by firmware team

Change-Id: Ia54e37242b4eaaf631c35e61a59f03ee0f85ca35
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2019-11-15 15:27:21 -05:00
Yong Zhao a4d570fa2b kfdtest: Expand KFDQMTest.MultipleCpQueues to cover all CP queues
Because of that, rename the test to AllCpQueues.

Change-Id: I57105f863db2558e850c703d151ffebcce2c7a17
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-11-15 08:25:14 -05:00
Huang Rui fdba74c2fb libhsakmt: add gfx90c support for thunk
This patch adds support for the gfx90c APU. For now we treat it as a "dgpu" and
as gfx900. The HSA gfxip table will be updated once ISA/LLVM support is implemented for gfx90c.

Change-Id: I6ef164bf3e751fe6dd6287cac212a500dce84b1a
Signed-off-by: Huang Rui <ray.huang@amd.com>
2019-11-14 20:02:53 -05:00
Philip Cox 6742e585be kfdtest: Enable kfd debugger tests on gfx10
Change-Id: If3f86624c76805a6bc0e154d31b957eb63120418
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
2019-11-13 08:29:46 -05:00