Commit Graph

2959 Commits

Author SHA1 Message Date
Ramesh Errabolu 3fbf03af76 Allows users, via env ROCR_VISIBLE_DEVICES, to surface a subset of Gpu devices
Change-Id: I5662639d5d70f054831969669f9d30dec356dd5a

Update per review comments

Change-Id: I18c7d7cb00b261493b61c2cf5454d486166f40d8
2019-02-06 02:02:29 -06:00
Chris Freehill 014945310a Fix boolean semantic error
Change-Id: Ic927370d5874af3f33105fca6ee0b581ebc6fa08
2019-01-31 14:03:48 -06:00
Yong Zhao 776077fe65 libhsakmt: Use a better name doorbell_mmap_offset
The previous name doorbell_offset is used too extensively throughout
the code and did not reflect the true usage.

Change-Id: I50d33f5c00e82c46cdf4264a78b8f925705bed6a
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-01-31 14:50:04 -05:00
Yong Zhao 7cd7830182 kfdtest: Use SDMA queue info from class KFDBaseComponentTest
Change-Id: Iacdb75006ef5f9ce6c4fbb0525d2c3a8535fdd23
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-01-31 14:00:57 -05:00
Yong Zhao 6857602cbc kfdtest: Include SDMA queue info in class KFDBaseComponentTest
This lets us avoid using the family ID to differentiate the
SDMA engines and SDMA queues.

Change-Id: I8d6203cc5d330e9130a1b2624997c86ba53e8ae4
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-01-31 14:00:57 -05:00
Chris Freehill 626f13a88b Fix async memory test; temporarily disable NUMA memory test
Change-Id: I1c0618f5dba513c1cf8fafb5fc64e5c811df8454
2019-01-31 09:14:54 -05:00
Yong Zhao 51ee5c324a libhsakmt: Introduce HSA_ZFB environment variable
This variable is 0 by default. When set to 1, it means there is no frame
buffer, so all memory allocation is routed to system memory. This mode
is mainly used during emulation.

Use CoarseGrain for VRAM under ZFB mode

Change-Id: I29e8e98be56935e3ceb94782d70771cc45700749
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2019-01-29 19:46:26 -05:00
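
The HSA_ZFB behaviour described in the entry above can be sketched as follows. This is an illustration only, not the libhsakmt code: `MemoryTarget`, `ZfbEnabled`, `ZfbEnabledFromEnv`, and `ChooseTarget` are hypothetical names, and treating only the exact string "1" as "enabled" is an assumption.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical allocation targets; the real library has its own types.
enum class MemoryTarget { Vram, System };

// Unset (or any value other than "1") is treated as the default, 0.
bool ZfbEnabled(const char* value) {
  return value != nullptr && std::strcmp(value, "1") == 0;
}

// Reads the HSA_ZFB environment variable named in the commit message.
bool ZfbEnabledFromEnv() {
  return ZfbEnabled(std::getenv("HSA_ZFB"));
}

// With no frame buffer, every allocation is routed to system memory.
MemoryTarget ChooseTarget(MemoryTarget requested, bool zfb) {
  return zfb ? MemoryTarget::System : requested;
}
```

A caller would evaluate `ZfbEnabledFromEnv()` once at start-up and pass the result to `ChooseTarget` for each allocation request.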
xinhui pan 41bf449e99 thunk: fix size overflow
Some test cases allocate more than 4 GB of memory.
Use HSAuint64 for sizes in bytes and HSAuint32 for sizes in pages.

Change-Id: I0d5e6c299903b5898cfea024178a7a26b9ba3c90
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2019-01-28 10:49:45 -05:00
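
A minimal sketch of why the entry above keeps byte counts in 64 bits while page counts fit in 32 bits. The standard fixed-width types stand in for HSAuint64 and HSAuint32, and the 4 KiB page size and the helper names are assumptions made for illustration.

```cpp
#include <cstdint>

// Assumed page size; the real value depends on the platform.
constexpr std::uint64_t kPageSize = 4096;

// Byte sizes need 64 bits: a 5 GiB request does not fit in 32 bits.
// Page counts for such sizes still fit comfortably in 32 bits.
std::uint32_t BytesToPages(std::uint64_t bytes) {
  return static_cast<std::uint32_t>((bytes + kPageSize - 1) / kPageSize);
}

// Widen before multiplying so the product is computed in 64 bits.
std::uint64_t PagesToBytes(std::uint32_t pages) {
  return static_cast<std::uint64_t>(pages) * kPageSize;
}

// Shows the bug class: storing a byte size in 32 bits silently truncates.
std::uint32_t TruncatedBytes(std::uint64_t bytes) {
  return static_cast<std::uint32_t>(bytes);
}
```

For a 5 GiB allocation, `TruncatedBytes` keeps only the low 32 bits (1 GiB), while the page count round-trips through `PagesToBytes` without loss.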
Sean Keely 65d39cc476 Unify APU and dGPU initial queue scratch allocation.
Both support dynamic scratch allocation so there is no reason
to preemptively allocate on APUs.

Change-Id: I22eaec01a83a091ee9dc1f594a1a9106e8dd81fc
2019-01-25 02:11:39 -05:00
Jay Cornwall b764991982 Set EOP buffer TC policy to non-coherent
Fixes a regression in dispatch latency.

Change-Id: I17869d3d515d8c1fa055a57afec2531903b88b16
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
2019-01-23 12:21:01 -05:00
Jay Cornwall 079eadd71b Remove legacy microcode version check in GpuAgent::InvalidateCodeCaches
Fixes instruction cache invalidation when using microcode branches.

Change-Id: I932676e683983145f5c807204e592fb5e530c8af
2019-01-22 16:39:52 -06:00
Oak Zeng faba8950d4 Delete a few SDMA queue types
The design changed, so these are no longer needed.

Change-Id: Ibb1230d1c34d6ac5153275f9334af45c73805f37
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-01-22 14:49:05 -06:00
Oak Zeng dd6c6e7bc6 Revert "Add more SDMA queue type"
This reverts commit 5173e71810.

Change-Id: I0a52a44a5d141b398d0898bea52e4dcf41dc950f
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-01-21 10:37:56 -06:00
Oak Zeng 124f77775c Revert "Create SDMA queue on specific engine"
This reverts commit 58b95e0a9d.

Change-Id: Idc0decd86364ed3441e9037b83be8be9953f0b3e
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-01-21 10:37:46 -06:00
Oak Zeng 1923d2e335 Revert "Create SDMA queue on specific engine"
This reverts commit acb80d7583.

Change-Id: Ia3e9db5fcba1fef80745c72c78b7c568b5c7315e
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-01-21 10:37:32 -06:00
Oak Zeng 742fa5d871 Revert "Add test to allocate SDMA queue on specific engine"
This reverts commit af5b320c47.

Change-Id: I262d91afc60ba2618bf4a857f162ea5236d54131
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2019-01-21 10:36:54 -06:00
Konstantin Zhuravlyov 8bee6e4976 Loader: update symbol processing for v2+
- Skip symbols that are STB_LOCAL and not STT_AMDGPU_HSA_KERNEL

Change-Id: I68567f58de9bf3f07dbd8020ef63f47667c86367
2019-01-18 15:42:28 -05:00
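
The symbol filter named in the entry above can be sketched like this. The binding and type are unpacked from the ELF `st_info` byte in the standard way (binding in the high four bits, type in the low four). The numeric value used for STT_AMDGPU_HSA_KERNEL is an assumption, and `ShouldSkipSymbol` is a hypothetical helper, not the loader's actual function.

```cpp
#include <cstdint>

// STB_LOCAL is 0 in the ELF specification.
constexpr unsigned kStbLocal = 0;
// Assumed value for STT_AMDGPU_HSA_KERNEL (an OS-specific symbol type).
constexpr unsigned kSttAmdgpuHsaKernel = 10;

// Returns true for symbols that are local and are not HSA kernels.
bool ShouldSkipSymbol(std::uint8_t st_info) {
  const unsigned binding = st_info >> 4;   // ELF64_ST_BIND
  const unsigned type = st_info & 0x0F;    // ELF64_ST_TYPE
  return binding == kStbLocal && type != kSttAmdgpuHsaKernel;
}
```

Under these assumptions a local function symbol is skipped, while a local kernel symbol and any global symbol are still processed.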
Konstantin Zhuravlyov c1ad82a6b7 Loader updates for code object v3
- Fix loading in some cases
  - Fix symbol kind

Change-Id: I721b4a35972b6d2a6d0ac733ab770b096cc74e17
2019-01-18 15:41:01 -05:00
Chris Freehill 6bca866e6c Decrease test size for emulation runs
Decrease number of iterations and array sizes in some cases.

Change-Id: I1a0a43faa907b28662ff3a44c172950ed7b1500e
2019-01-14 21:23:04 -05:00
Philip Cox 37858f2311 Initial gfx9 debugger node suspend/resume
Change-Id: I2a5dac3d02265c11f5b6985ab457e2d1caa0a033
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
2019-01-11 09:00:54 -05:00
Philip Yang b2e026fce3 kfdtest: increase KFDPerformanceTest.P2PBandWidthTest timeout value
KFDPerformanceTest.P2PBandWidthTest[push, push] takes about 3 seconds
on 4 gfx906 GPUs, so the default g_TestTimeout of 2 seconds is not enough
to wait for the sDMA queue rptr to be consumed. With the kfdtest command
line option --timeout=6000, the test finishes and the result is reasonable:
twice that of P2PBandWidthTest[push, none]. Change the P2PBandWidthTest
wait timeout to 6 seconds.

Add a timeout argument to the functions WaitOnValue,
BaseQueue.Wait4PacketConsumption, SDMAQueue.Wait4PacketConsumption, and
PM4Queue.Wait4PacketConsumption, with a default value of g_TestTimeOut.

Change-Id: I0aa04d644339feaeea695e41647ae66568beab9e
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2019-01-04 12:53:55 -05:00
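
The pattern in the entry above, a wait helper whose timeout defaults to a global value but can be overridden per call, might look roughly like this. `WaitForValue` and `g_default_timeout_ms` are hypothetical stand-ins for WaitOnValue and g_TestTimeOut; the real signatures are not shown in the commit message.

```cpp
#include <atomic>
#include <chrono>
#include <thread>

// Hypothetical stand-in for the global default (2 seconds in the message).
unsigned g_default_timeout_ms = 2000;

// Polls `location` until it holds `expected` or `timeout_ms` elapses.
// Returns true if the value was observed, false on timeout.
bool WaitForValue(const std::atomic<unsigned>& location, unsigned expected,
                  unsigned timeout_ms = g_default_timeout_ms) {
  using Clock = std::chrono::steady_clock;
  const auto deadline = Clock::now() + std::chrono::milliseconds(timeout_ms);
  while (location.load() != expected) {
    if (Clock::now() >= deadline) return false;
    std::this_thread::sleep_for(std::chrono::milliseconds(1));
  }
  return true;
}
```

Existing callers keep the default, while a slow test can pass a longer value such as `WaitForValue(rptr, target, 6000)`.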
Kent Russell 666f90440a Add lib requirement in CMake file
Adding it to the DEBIAN/control won't work, since we use CMake to build
it. Add all required packages to the CMakeLists file

Change-Id: Iaf62f42e0f998d66038338fb2cf793d29c790205
2019-01-02 07:50:12 -05:00
Yong Zhao 81b8815e1a Add -fPIC flag when building sp3 library
This allows an sp3 library built with one gcc version to be
compatible with another gcc version.

Change-Id: If67714bd63376dc781c56ed025be335fe54b2ba5
Signed-off-by: Yong Zhao <Yong.Zhao@amd.com>
2018-12-13 18:32:23 -05:00
Eric Huang 8ee93b3187 libhsakmt: add RAS support v2
The RAS feature enable bit and error reporting are implemented in
the existing topology and event mechanisms.

v2: change library interface.

Change-Id: I75807c080b5b26e8115240b05b3d7016cb05a31a
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
2018-12-13 10:17:12 -05:00
Kent Russell bcc348e3b9 kfdtest: Add gfx900/gfx906 IDs to run_kfdtest.sh
Change-Id: Ib6ee418a432d1de79e2306b54d702132de3d06c5
2018-12-12 08:38:01 -05:00
Kent Russell 53439669d9 libhsakmt: Add new gfx900 and gfx906 GPU IDs
Change-Id: I93b2b845c3edb2da55235a56516a851145745988
2018-12-12 08:36:40 -05:00
Eric Huang 29d11d02e8 Revert "libhsakmt: add RAS support"
This reverts commit 1fbe010354.

Change-Id: I739b17e057f2a8a0f4375741955209d2477c704a
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
2018-12-08 19:42:33 -05:00
Kent Russell 54807526b9 Add more SDMA-related tests to SDMA_BLACKLIST
These tests all make use of an SDMAQueue in one way or another, so add
them to the SDMA_BLACKLIST to be 100% certain

Change-Id: Ic29e073c2f46249f3e5918145b13d276aec7bb33
2018-12-06 14:07:50 -05:00
Kent Russell aa7c13264a Add ZeroInitializationVram test to SDMA blacklist
This test uses SDMA, so add it to the SDMA list

Change-Id: I2dc2b0c4328e38e593d455de2103ebe1ef0adbc2
2018-12-06 11:14:26 -05:00
Kent Russell 3a2ec0111e Temporarily remove SDMA tests from gfx906
SDMA is being flaky, so remove SDMA tests from it for now

Change-Id: Ia3612566813f925804ab90d6235520da7cc65926
2018-12-05 08:41:16 -05:00
Kent Russell 381dba3932 Remove SDMAConcurrentCopies from gfx906 execution
This is intermittently causing VM faults and excessive evictions, which
causes the rest of the tests to fail. Take it out for now until someone
can investigate

Change-Id: I9c43890bc9f03a4a31efbc18df0df5e40a232c58
2018-11-28 10:01:35 -05:00
Eric Huang 1fbe010354 libhsakmt: add RAS support
The RAS feature enable bit and error reporting are implemented in
the existing topology and event mechanisms.

Change-Id: I9b018bba80cf4a6998e42a7bff64318c689b1d2a
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
2018-11-23 11:42:34 -05:00
Ramesh Errabolu 28c3f9a269 Initialize queue buffer with Invalid Pkt Headers
Change-Id: I4166f1359746ee6829b730bac2db358af72ab16e
2018-11-21 19:09:10 -05:00
Mark Searles 8ea836017a Force object code v2 until v3 is supported
Change-Id: I4c2a64bf9bd515686d1f1d90aece2a9ac40e5685
2018-11-21 10:06:08 -08:00
changzhu c15cf2e9c3 kfdtest: fix SDMACopyParams build error on redhat 7.2 in KFDTestUtilQueue.cpp
In file included from /usr/include/c++/4.8.2/algorithm:62:0,
                 from /home/jenkins/libhsakmt/tests/kfdtest/src/KFDTestUtilQueue.cpp:24:
/usr/include/c++/4.8.2/bits/stl_algo.h: In instantiation of ‘_RandomAccessIterator std::__unguarded_partition(_RandomAccessIterator, _RandomAccessIterator, const _Tp&, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<SDMACopyParams*, std::vector<SDMACopyParams> >; _Tp = SDMACopyParams; _Compare = bool (*)(SDMACopyParams&, SDMACopyParams&)]’:
/usr/include/c++/4.8.2/bits/stl_algo.h:2296:78:   required from ‘_RandomAccessIterator std::__unguarded_partition_pivot(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<SDMACopyParams*, std::vector<SDMACopyParams> >; _Compare = bool (*)(SDMACopyParams&, SDMACopyParams&)]’
/usr/include/c++/4.8.2/bits/stl_algo.h:2337:62:   required from ‘void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<SDMACopyParams*, std::vector<SDMACopyParams> >; _Size = long int; _Compare = bool (*)(SDMACopyParams&, SDMACopyParams&)]’
/usr/include/c++/4.8.2/bits/stl_algo.h:5499:44:   required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<SDMACopyParams*, std::vector<SDMACopyParams> >; _Compare = bool (*)(SDMACopyParams&, SDMACopyParams&)]’
/home/jenkins/libhsakmt/tests/kfdtest/src/KFDTestUtilQueue.cpp:351:66:   required from here
/usr/include/c++/4.8.2/bits/stl_algo.h:2263:35: error: invalid initialization of reference of type ‘SDMACopyParams&’ from expression of type ‘const SDMACopyParams’
    while (__comp(*__first, __pivot))
                                   ^
/usr/include/c++/4.8.2/bits/stl_algo.h:2266:34: error: invalid initialization of reference of type ‘SDMACopyParams&’ from expression of type ‘const SDMACopyParams’
    while (__comp(__pivot, *__last))
                                  ^

Change-Id: I0fce0c7e6d0a0ce93b1e6522ee8f216615765568
Signed-off-by: changzhu <Changfeng.Zhu@amd.com>
2018-11-21 17:23:03 +08:00
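
The errors quoted in the entry above come from `std::sort` in that libstdc++ version passing a const pivot to a comparator declared with non-const reference parameters. A likely shape of the fix, shown here as an assumption because the message does not include the patch, is a comparator that takes const references. `CopyParams` and `BySize` are hypothetical stand-ins for the real types.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for SDMACopyParams.
struct CopyParams {
  int node;
  unsigned long long size;
};

// Taking const references lets the comparator accept const arguments,
// which is what std::sort may pass (for example, a const pivot).
bool BySize(const CopyParams& a, const CopyParams& b) {
  return a.size < b.size;
}

// Returns a copy of `params` sorted by ascending size.
std::vector<CopyParams> SortedBySize(std::vector<CopyParams> params) {
  std::sort(params.begin(), params.end(), BySize);
  return params;
}
```

Declaring the comparator as `bool (*)(CopyParams&, CopyParams&)` instead would reproduce the "invalid initialization of reference" error on that toolchain.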
Oak Zeng af5b320c47 Add test to allocate SDMA queue on specific engine
Change-Id: I5b5140e4119fc01db250d63cca7389cf80ec0d16
Signed-off-by: Oak Zeng <ozeng@amd.com>
2018-11-20 11:17:43 -05:00
shaoyunl d8009b4fd3 KFDTest: fix failure when run KFDTest on multi-GPU small bar system
On a small-BAR multi-GPU system, hsaKmtMemoryMapToGPU fails due to the
latest kernel P2P sanity check. Switch to hsaKmtMemoryMapToGPUNodes to fix
the failure.

Change-Id: Id8b6329d1243df0e908cc9a171b5c7f9156f4a8b
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
2018-11-19 16:09:31 -05:00
shaoyunl 29b45b8c0a Thunk: make scratch memory only map to its own GPU
Map scratch memory only to the GPU that was specified when the memory was allocated.

Change-Id: I788f9ef0dccb63b894a75e75cac5f94a60d7ec48
Signed-off-by: shaoyunl <shaoyun.liu@amd.com>
2018-11-19 10:26:31 -05:00
Sean Keely 8e4177382a Check max wave scratch limits.
HW has limited bits for wave scratch base address stride.  Enforcement
prevents programs with larger than supported scratch allocations from
running and clobbering neighboring scratch space.

Change-Id: I574da888e9d1d5e290a9c0025ba13b5ef9f1e5c0
2018-11-16 20:59:20 -05:00
Sean Keely 269be0be2e Disable forced explicit selection of public vs internal HSA interfaces.
Temporary to reenable OCL builds on TC.

Change-Id: Ia81f2f9a9dd10ae8ce9627313247a586a8711584
2018-11-16 15:26:26 -06:00
Konstantin Zhuravlyov a447d79430 Fix dynamic relocations:
- Process dynamic relocation even if there is
    no symbol associated to it.

Change-Id: Iaefee682ee52f5acda8280e5764e6d5fd992774a
2018-11-14 15:25:41 -05:00
Oak Zeng acb80d7583 Create SDMA queue on specific engine
Change-Id: Iece03795510d66b03324174203faa0ac9eb4fb7d
Signed-off-by: Oak Zeng <ozeng@amd.com>
2018-11-13 14:52:57 -05:00
Oak Zeng 8d65e72045 Move m_Type to a local variable
BaseQueue class has a member function GetQueueType so m_Type
is duplicated.  m_Type is only used in one function. Move it to
a local variable.

Change-Id: Ice144cf723178dd628cb49261c23d10605f9ee7d
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2018-11-13 14:52:17 -05:00
Oak Zeng 58b95e0a9d Create SDMA queue on specific engine
Change-Id: Id651ececda55b81b45e991bd8e6616674be48d8e
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2018-11-13 14:52:17 -05:00
Oak Zeng 5173e71810 Add more SDMA queue type
Those new types are used to create SDMA queue on specific engine

Change-Id: I91c3bcc14fef7404cf42b256a18651432e171091
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2018-11-13 14:52:01 -05:00
Oak Zeng 055f7c9c2c Use latest kfd_ioctl.h file
Change-Id: Icd7da4a305581c6857e17d59fbd0c3bd5101df3b
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
2018-11-13 14:51:46 -05:00
Sean Keely 4e8597681b Cache KFD Events used by user allocated InterruptSignals.
Change-Id: I7f102f880fea9c78febe28cd262f93ee77f03184
2018-11-12 22:37:42 -06:00
Sean Keely 8323b2e1d7 Add pooling for Signal ABI blocks (SharedSignal).
Makes better use of memory and greatly reduces mmap count.

Change-Id: Ib444cd1ccd144986adbcc7cec297a966e2c08bc7
2018-11-12 22:37:28 -06:00
Felix Kuehling 5e4e19d47b libhsakmt: Distinguish EPERM and EACCES
EPERM means "operation not permitted" and is returned when CGroup
access checks fail. EACCES means "permission denied" and is returned
when the device file permission bits or access control list don't
allow access.

EPERM can fail silently, since we assume the administrator disabled
a device on purpose in the CGroup. EACCES should produce an error
message and an info message to check the device file permissions.

Change-Id: Iee4c5584c5fdc4e113c3d760dede6661097b4341
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
2018-11-12 17:06:18 -05:00
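
The policy described in the entry above can be sketched as a small classifier over `errno` values. `OpenFailureAction` and `ClassifyOpenFailure` are hypothetical names, and the handling of other error codes is an assumption; the commit message only covers EPERM and EACCES.

```cpp
#include <cerrno>

// Hypothetical outcomes for a failed attempt to open a device file.
enum class OpenFailureAction {
  SkipSilently,          // device disabled on purpose (CGroup check failed)
  ReportPermissionHint,  // print an error and a hint to check file permissions
  ReportGenericError     // assumed fallback for any other error code
};

OpenFailureAction ClassifyOpenFailure(int err) {
  switch (err) {
    case EPERM:
      return OpenFailureAction::SkipSilently;
    case EACCES:
      return OpenFailureAction::ReportPermissionHint;
    default:
      return OpenFailureAction::ReportGenericError;
  }
}
```

A caller would pass the `errno` value captured immediately after a failed `open` call and branch on the result.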
Sean Keely 936ecd1885 Remove legacy SVM region concept.
Also rename blit_agent to region_gpu and add comments to clarify
its role in deprecated region API support rather than to do blits.

Change-Id: I80b1043db2e1c5d40a58fc801eef70a688ea9169
2018-11-09 06:27:53 -06:00