rocm-systems

Author	SHA1	Message	Date
Oak Zeng	6da54291bf	Revert "Create SDMA queue on specific engine" This reverts commit `6087fd7bca`. Change-Id: Ia3e9db5fcba1fef80745c72c78b7c568b5c7315e Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> [ROCm/ROCR-Runtime commit: `1923d2e335`]	2019-01-21 10:37:32 -06:00
Oak Zeng	c622a2d220	Revert "Add test to allocate SDMA queue on specific engine" This reverts commit `5d15953efb`. Change-Id: I262d91afc60ba2618bf4a857f162ea5236d54131 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> [ROCm/ROCR-Runtime commit: `742fa5d871`]	2019-01-21 10:36:54 -06:00
Philip Cox	fa45791c1a	Initial gfx9 debugger node suspend/resume Change-Id: I2a5dac3d02265c11f5b6985ab457e2d1caa0a033 Signed-off-by: Philip Cox <Philip.Cox@amd.com> [ROCm/ROCR-Runtime commit: `37858f2311`]	2019-01-11 09:00:54 -05:00
Philip Yang	4d5fb9f80e	kfdtest: increase KFDPerformanceTest.P2PBandWidthTest timeout value KFDPerformanceTest.P2PBandWidthTest[push, push] takes about 3 seconds on 4 gfx906 GPUs, so the default g_TestTimeout of 2 seconds is not long enough to wait for the SDMA queue rptr to be consumed. With the kfdtest command line option --timeout=6000, the test finishes and the result is reasonable, about twice that of P2PBandWidthTest[push, none]. Change the P2PBandWidthTest wait timeout to 6 seconds. Add a timeout argument, defaulting to g_TestTimeOut, to the functions WaitOnValue, BaseQueue.Wait4PacketConsumption, SDMAQueue.Wait4PacketConsumption, and PM4Queue.Wait4PacketConsumption. Change-Id: I0aa04d644339feaeea695e41647ae66568beab9e Signed-off-by: Philip Yang <Philip.Yang@amd.com> [ROCm/ROCR-Runtime commit: `b2e026fce3`]	2019-01-04 12:53:55 -05:00
Kent Russell	0cf61d242b	Add lib requirement in CMake file Adding it to the DEBIAN/control won't work, since we use CMake to build it. Add all required packages to the CMakeLists file Change-Id: Iaf62f42e0f998d66038338fb2cf793d29c790205 [ROCm/ROCR-Runtime commit: `666f90440a`]	2019-01-02 07:50:12 -05:00
Yong Zhao	5f38525112	Add -fPIC flag when building sp3 library This will support the sp3 library built on one gcc version to be compatible with another gcc version. Change-Id: If67714bd63376dc781c56ed025be335fe54b2ba5 Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> [ROCm/ROCR-Runtime commit: `81b8815e1a`]	2018-12-13 18:32:23 -05:00
Eric Huang	616392b642	libhsakmt: add RAS support v2 The RAS feature-enable bit and error return are implemented in the existing topology and event mechanisms. v2: change library interface. Change-Id: I75807c080b5b26e8115240b05b3d7016cb05a31a Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com> [ROCm/ROCR-Runtime commit: `8ee93b3187`]	2018-12-13 10:17:12 -05:00
Kent Russell	c0fa8baec2	kfdtest: Add gfx900/gfx906 IDs to run_kfdtest.sh Change-Id: Ib6ee418a432d1de79e2306b54d702132de3d06c5 [ROCm/ROCR-Runtime commit: `bcc348e3b9`]	2018-12-12 08:38:01 -05:00
Kent Russell	1e7469c682	libhsakmt: Add new gfx900 and gfx906 GPU IDs Change-Id: I93b2b845c3edb2da55235a56516a851145745988 [ROCm/ROCR-Runtime commit: `53439669d9`]	2018-12-12 08:36:40 -05:00
Eric Huang	58c2f26d25	Revert "libhsakmt: add RAS support" This reverts commit `56b9bb17a7`. Change-Id: I739b17e057f2a8a0f4375741955209d2477c704a Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com> [ROCm/ROCR-Runtime commit: `29d11d02e8`]	2018-12-08 19:42:33 -05:00
Kent Russell	34e6346848	Add more SDMA-related tests to SDMA_BLACKLIST These tests all make use of an SDMAQueue in one way or another, so add them to the SDMA_BLACKLIST to be 100% certain Change-Id: Ic29e073c2f46249f3e5918145b13d276aec7bb33 [ROCm/ROCR-Runtime commit: `54807526b9`]	2018-12-06 14:07:50 -05:00
Kent Russell	931dd817fa	Add ZeroInitializationVram test to SDMA blacklist This test uses SDMA, so add it to the SDMA list Change-Id: I2dc2b0c4328e38e593d455de2103ebe1ef0adbc2 [ROCm/ROCR-Runtime commit: `aa7c13264a`]	2018-12-06 11:14:26 -05:00
Kent Russell	4a068e18dd	Temporarily remove SDMA tests from gfx906 SDMA is being flaky, so remove SDMA tests from it for now Change-Id: Ia3612566813f925804ab90d6235520da7cc65926 [ROCm/ROCR-Runtime commit: `3a2ec0111e`]	2018-12-05 08:41:16 -05:00
Kent Russell	18f1cc0e5b	Remove SDMAConcurrentCopies from gfx906 execution This is intermittently causing VM faults and excessive evictions, which causes the rest of the tests to fail. Take it out for now until someone can investigate Change-Id: I9c43890bc9f03a4a31efbc18df0df5e40a232c58 [ROCm/ROCR-Runtime commit: `381dba3932`]	2018-11-28 10:01:35 -05:00
Eric Huang	56b9bb17a7	libhsakmt: add RAS support The RAS feature-enable bit and error return are implemented in the existing topology and event mechanisms. Change-Id: I9b018bba80cf4a6998e42a7bff64318c689b1d2a Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com> [ROCm/ROCR-Runtime commit: `1fbe010354`]	2018-11-23 11:42:34 -05:00
changzhu	6ab4bbe2a8	kfdtest: fix SDMACopyParams build error on redhat 7.2 in KFDTestUtilQueue.cpp In file included from /usr/include/c++/4.8.2/algorithm:62:0, from /home/jenkins/libhsakmt/tests/kfdtest/src/KFDTestUtilQueue.cpp:24: /usr/include/c++/4.8.2/bits/stl_algo.h: In instantiation of ‘_RandomAccessIterator std::__unguarded_partition(_RandomAccessIterator, _RandomAccessIterator, const _Tp&, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<SDMACopyParams, std::vector<SDMACopyParams> >; _Tp = SDMACopyParams; _Compare = bool ()(SDMACopyParams&, SDMACopyParams&)]’: /usr/include/c++/4.8.2/bits/stl_algo.h:2296:78: required from ‘_RandomAccessIterator std::__unguarded_partition_pivot(_RandomAccessIterator, _RandomAccessIterator, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<SDMACopyParams, std::vector<SDMACopyParams> >; _Compare = bool ()(SDMACopyParams&, SDMACopyParams&)]’ /usr/include/c++/4.8.2/bits/stl_algo.h:2337:62: required from ‘void std::__introsort_loop(_RandomAccessIterator, _RandomAccessIterator, _Size, _Compare) [with _RandomAccessIterator = __gnu_cxx::__normal_iterator<SDMACopyParams, std::vector<SDMACopyParams> >; _Size = long int; _Compare = bool ()(SDMACopyParams&, SDMACopyParams&)]’ /usr/include/c++/4.8.2/bits/stl_algo.h:5499:44: required from ‘void std::sort(_RAIter, _RAIter, _Compare) [with _RAIter = __gnu_cxx::__normal_iterator<SDMACopyParams, std::vector<SDMACopyParams> >; _Compare = bool ()(SDMACopyParams&, SDMACopyParams&)]’ /home/jenkins/libhsakmt/tests/kfdtest/src/KFDTestUtilQueue.cpp:351:66: required from here /usr/include/c++/4.8.2/bits/stl_algo.h:2263:35: error: invalid initialization of reference of type ‘SDMACopyParams&’ from expression of type ‘const SDMACopyParams’ while (__comp(__first, __pivot)) ^ /usr/include/c++/4.8.2/bits/stl_algo.h:2266:34: error: invalid initialization of reference of type ‘SDMACopyParams&’ from expression of type ‘const SDMACopyParams’ while (__comp(__pivot, __last)) ^ 
Change-Id: I0fce0c7e6d0a0ce93b1e6522ee8f216615765568 Signed-off-by: changzhu <Changfeng.Zhu@amd.com> [ROCm/ROCR-Runtime commit: `c15cf2e9c3`]	2018-11-21 17:23:03 +08:00
Oak Zeng	5d15953efb	Add test to allocate SDMA queue on specific engine Change-Id: I5b5140e4119fc01db250d63cca7389cf80ec0d16 Signed-off-by: Oak Zeng <ozeng@amd.com> [ROCm/ROCR-Runtime commit: `af5b320c47`]	2018-11-20 11:17:43 -05:00
shaoyunl	0ad77ef647	KFDTest: fix failure when running KFDTest on multi-GPU small-BAR systems On small-BAR multi-GPU systems, hsaKmtMemoryMapToGPU fails due to the latest kernel P2P sanity check. Switch to hsaKmtMemoryMapToGPUNodes to fix the failure. Change-Id: Id8b6329d1243df0e908cc9a171b5c7f9156f4a8b Signed-off-by: shaoyunl <shaoyun.liu@amd.com> [ROCm/ROCR-Runtime commit: `d8009b4fd3`]	2018-11-19 16:09:31 -05:00
shaoyunl	a8af1e5e56	Thunk: make scratch memory only map to its own GPU Map scratch memory only to the GPU that was specified when the memory was allocated. Change-Id: I788f9ef0dccb63b894a75e75cac5f94a60d7ec48 Signed-off-by: shaoyunl <shaoyun.liu@amd.com> [ROCm/ROCR-Runtime commit: `29b45b8c0a`]	2018-11-19 10:26:31 -05:00
Oak Zeng	6087fd7bca	Create SDMA queue on specific engine Change-Id: Iece03795510d66b03324174203faa0ac9eb4fb7d Signed-off-by: Oak Zeng <ozeng@amd.com> [ROCm/ROCR-Runtime commit: `acb80d7583`]	2018-11-13 14:52:57 -05:00
Oak Zeng	6c760dcb74	Move m_Type to a local variable BaseQueue class has a member function GetQueueType so m_Type is duplicated. m_Type is only used in one function. Move it to a local variable. Change-Id: Ice144cf723178dd628cb49261c23d10605f9ee7d Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> [ROCm/ROCR-Runtime commit: `8d65e72045`]	2018-11-13 14:52:17 -05:00
Oak Zeng	f0eb1573e6	Create SDMA queue on specific engine Change-Id: Id651ececda55b81b45e991bd8e6616674be48d8e Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> [ROCm/ROCR-Runtime commit: `58b95e0a9d`]	2018-11-13 14:52:17 -05:00
Oak Zeng	b87f8459f4	Add more SDMA queue type Those new types are used to create SDMA queue on specific engine Change-Id: I91c3bcc14fef7404cf42b256a18651432e171091 Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> [ROCm/ROCR-Runtime commit: `5173e71810`]	2018-11-13 14:52:01 -05:00
Oak Zeng	49dbd130f5	Use latest kfd_ioctl.h file Change-Id: Icd7da4a305581c6857e17d59fbd0c3bd5101df3b Signed-off-by: Oak Zeng <Oak.Zeng@amd.com> [ROCm/ROCR-Runtime commit: `055f7c9c2c`]	2018-11-13 14:51:46 -05:00
Felix Kuehling	6819730ea3	libhsakmt: Distinguish EPERM and EACCES EPERM means "operation not permitted" and is returned when CGroup access checks fail. EACCES means "permission denied" and is returned when the device file permission bits or access control list don't allow access. EPERM can fail silently, since we assume the administrator disabled a device on purpose in the CGroup. EACCES should produce an error message and an info message to check the device file permissions. Change-Id: Iee4c5584c5fdc4e113c3d760dede6661097b4341 Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> [ROCm/ROCR-Runtime commit: `5e4e19d47b`]	2018-11-12 17:06:18 -05:00
Mike Li	b3fdcfe3b9	Changed scripts to include running kfdtest in docker container Change-Id: I822ff4869610df6abad846542d7c290b7a5aae79 [ROCm/ROCR-Runtime commit: `3afce42b57`]	2018-11-07 16:09:12 -05:00
Gang Ba	9147adc1d5	Add code to support packet capture and replay in the Thunk This feature only supports dgpu for now. Change-Id: Ic766ec06892c955dd605ecc335a776335edc0df2 Signed-off-by: Gang Ba <gaba@amd.com> [ROCm/ROCR-Runtime commit: `c54c1dbdcb`]	2018-10-31 16:53:46 -04:00
Harish Kasiviswanathan	278287f045	libhsakmt: Support device controller cgroup The device whitelist controller cgroup makes it possible to track and enforce open and mknod restrictions on device files. Tasks should work with the /dev/dri/renderN devices that are whitelisted for their cgroup. It is not an error condition if a certain node is not whitelisted. Change-Id: I0b997423ccdc00aee98df5b6f04ed6794549604e Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> [ROCm/ROCR-Runtime commit: `c1994e28f0`]	2018-10-30 11:31:53 -04:00
Kent Russell	c3aacd8463	Specify requirement of NUMA libs for Thunk Add the numa libs to the thunk specs for DEB/RPM, so we can remove the manual installation requirement Change-Id: I5aadcf581b64e9a20aee9c1e1204af4715d1e990 [ROCm/ROCR-Runtime commit: `10edccb912`]	2018-10-25 07:37:07 -04:00
Philip Cox	84b9ffbbbd	Fix Debug Thunk spec mismatch Move debug trap support capabilities to their own structure to fix thunk spec vs header mismatch. Change-Id: I6694601bfa36097502c8ab932e082d7a4645d5b2 Signed-off-by: Philip Cox <Philip.Cox@amd.com> [ROCm/ROCR-Runtime commit: `105edd4bb4`]	2018-10-24 11:32:12 -04:00
xinhui pan	11106ed72f	kfdtest: blacklist KFDQMTest.SdmaEventInterrupt On gfx900+, the test sometimes times out due to a CP firmware bug. Blacklist it until we address the root cause and have a fix. Change-Id: Iff600a6f6dbd86c56e034f530484205520bced32 Signed-off-by: xinhui pan <xinhui.pan@amd.com> [ROCm/ROCR-Runtime commit: `7a13bb4d66`]	2018-10-19 15:29:54 -04:00
xinhui pan	4bf0f9f43c	kfdtest: Add more debug information of sdma event interrupt test We observe this test fails on gfx900+. Looks like the sdma packets are not executed at all after we submit sometimes. Run it with timeout 2s on gfx900. [ RUN ] KFDQMTest.SdmaEventInterrupt [----------] SDMACopyData FAIL! 1485262707170 VS 1485262747814 [----------] Event On Queue 1:0 Timeout, try to resubmit packets! [----------] The timeout event is signaled! [ ] Time Consumption (ns) [ ] 1: 1859427148 [ ] 2: 680148 [ ] 3: 6370 [ ] 4: 5481 /home/pp/code/compute/libhsakmt/tests/kfdtest/src/KFDQMTest.cpp:1670: Failure Value of: (ret) Actual: 31 Expected: HSAKMT_STATUS_SUCCESS Which is: 0 [----------] SDMACopyData FAIL! 1485367669958 VS 1485367750022 [----------] Event On Queue 2:1 Timeout, try to resubmit packets! [----------] The timeout event is signaled! [ ] Time Consumption (ns) [ ] 1: 1881615148 [ ] 2: 673629 [ ] 3: 6074 [ ] 4: 5481 /home/pp/code/compute/libhsakmt/tests/kfdtest/src/KFDQMTest.cpp:1670: Failure Value of: (ret) Actual: 31 Expected: HSAKMT_STATUS_SUCCESS Which is: 0 [----------] SDMACopyData FAIL! 1485427671250 VS 1485427751238 [----------] Event On Queue 2:1 Timeout, try to resubmit packets! [----------] The timeout event is signaled! [ ] Time Consumption (ns) [ ] 1: 1881508777 [ ] 2: 741629 [ ] 3: 6074 [ ] 4: 5481 /home/pp/code/compute/libhsakmt/tests/kfdtest/src/KFDQMTest.cpp:1670: Failure Value of: (ret) Actual: 31 Expected: HSAKMT_STATUS_SUCCESS Which is: 0 [ FAILED ] KFDQMTest.SdmaEventInterrupt (23675 ms) Change-Id: I7c1b752537d89782570df20838bf976578614f75 Signed-off-by: xinhui pan <xinhui.pan@amd.com> [ROCm/ROCR-Runtime commit: `ab4610cff7`]	2018-10-19 15:29:54 -04:00
Yong Zhao	e3f00a21ad	kfdtest: Clean up the indentations in PM4ReleaseMemoryPacket::InitPacket() Change-Id: I7f6b08697f6a68bf8c4a388c9f1cf3c3c8e6c81f Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> [ROCm/ROCR-Runtime commit: `d7e6d4706c`]	2018-10-17 14:28:15 -04:00
Yong Zhao	569bdf3c84	kfdtest: Improve the SignalEvent test Create an extra event so that the event id to test is non zero. That way we can be sure the context id received in kernel ISR is non zero, which is different from the default value 0 when context id is not set at all. Change-Id: I7e261d1bbb783d5afd15558c7ac00493b1218cef Signed-off-by: Yong Zhao <Yong.Zhao@amd.com> [ROCm/ROCR-Runtime commit: `77bab8596f`]	2018-10-17 14:27:54 -04:00
Gang Ba	197f731fbc	drm/amdkfd: Added gfx904 and gfx803 for KFD. Change-Id: I4406dc70c776926feaecca3f2146d65259a80517 Signed-off-by: Gang Ba <gaba@amd.com> [ROCm/ROCR-Runtime commit: `52ec7f805e`]	2018-09-25 08:17:44 -04:00
Mike Li	7cd87a5590	all_gpu_id_array: Handle GPU resource management GPU resource management can disable some of the GPU nodes, and the kernel driver may not be aware of this. Get information about all the nodes from the kernel driver and then filter it. Change-Id: I4eeb126a5efce2192c35f5d2b72be1811e9ded32 Signed-off-by: Mike Li <Tianxinmike.Li@amd.com> [ROCm/ROCR-Runtime commit: `3144a84b9a`]	2018-09-24 11:38:11 -04:00
Mike Li	150eaea0af	kfdtest: Handle GPU resource management Currently the FindDRMRenderNode function accesses sysfs directly to find the render node, which does not work with the GPU management changes. Change the code to call hsaKmtGetNodeProperties instead. Change-Id: I3bb537a323bc1e8c49f38d8aabc60c13e268aecd Signed-off-by: Mike Li <Tianxinmike.Li@amd.com> [ROCm/ROCR-Runtime commit: `c3b47c0959`]	2018-09-24 11:38:11 -04:00
Mike Li	3feaa41dd7	Output an error message only when open_drm_render_device fails unexpectedly. Change-Id: I5b9587a8d5c7a900e9ab8611a25d0c49d34b4cef Signed-off-by: Mike Li <Tianxinmike.Li@amd.com> [ROCm/ROCR-Runtime commit: `f9bd960344`]	2018-09-24 11:36:11 -04:00
xinhui pan	5b7d3a16c5	kfdtest: add P2POverheadTest This measures the latency and overhead of SDMA packet consumption over P2P. It is similar to the QueueLatency test. In addition, the queue overhead under different workloads shows more detail. Test results on two gfx900 GPUs: [ RUN ] KFDPerformanceTest.P2POverheadTest [ ] Test (avg. ns) | Size 4 8 16 64 256 1024 [ ] ----------------------------------------------------------------------- [ ] [push] [1 -> 0] 333 148 185 111 148 148 [ ] [push] [1 -> 1] 370 222 333 74 148 111 [ ] [push] [1 -> 2] 333 148 148 148 148 148 [ ] [push] [2 -> 0] 111 333 259 148 148 148 [ ] [push] [2 -> 1] 222 148 185 148 148 148 [ ] [push] [2 -> 2] 222 111 370 111 74 148 [ ] [pull] [1 -> 0] 370 296 296 148 185 148 [ ] [pull] [1 -> 1] 185 333 222 148 222 148 [ ] [pull] [1 -> 2] 222 444 259 148 185 111 [ ] [pull] [2 -> 0] 148 148 148 148 148 148 [ ] [pull] [2 -> 1] 148 148 148 148 148 148 [ ] [pull] [2 -> 2] 185 148 148 74 222 296 [ ] [push|pull][1 -> 0] 1259 1222 1259 1074 1037 962 [ ] [push|pull][1 -> 1] 1037 1037 1037 740 740 1000 [ ] [push|pull][1 -> 2] 1259 1259 1296 1037 1000 1074 [ ] [push|pull][2 -> 0] 1037 1037 1037 1074 1037 1148 [ ] [push|pull][2 -> 1] 1037 1037 1037 1037 925 1074 [ ] [push|pull][2 -> 2] 666 666 740 740 703 925 [ OK ] KFDPerformanceTest.P2POverheadTest (459 ms) Change-Id: I422263cb52f7ce184f6f1ff4466d04c239fbe9c9 Signed-off-by: xinhui pan <xinhui.pan@amd.com> [ROCm/ROCR-Runtime commit: `918a45a430`]	2018-09-24 09:28:00 -04:00
Harish Kasiviswanathan	f709e5f94d	Topology: Use processors available to the process The existing call sysconf (_SC_NPROCESSORS_ONLN) provides the number of processors available to the scheduler. When a KFD process is run under a container environment, only a subset (cpuset) of processors are available to the current process. For getting CPU cache information use sched_getaffinity() to get the number of processors available to the current process. Change-Id: Ieac02f1f61c17e24ac34ba502968c69d3bc631cb Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> [ROCm/ROCR-Runtime commit: `fb79a0efe2`]	2018-09-21 10:31:59 -04:00
xinhui pan	c61fffa876	kfdtest: Add P2P bandwidth test The test measures the bandwidth between GPUs. Currently we do not take NUMA topology into account, as some products do support P2P across PCIe root complexes. Test results on a two-gfx900 system: [ RUN ] KFDPerformanceTest.P2PBandWidthTest [ ] Copy from node to node by [push, NONE] [ ] [1 -> 0] 6.13477 - 6.12695 GB/s [ ] [1 -> 2] 3.77734 - 3.76855 GB/s [ ] [2 -> 0] 6.67676 - 6.6543 GB/s [ ] [2 -> 1] 6.14453 - 6.12793 GB/s [ ] Copy from node to node by [pull, NONE] [ ] [1 -> 0] 6.10547 - 6.08105 GB/s [ ] [1 -> 2] 9.65527 - 9.65039 GB/s [ ] [2 -> 0] 6.49805 - 6.4873 GB/s [ ] [2 -> 1] 8.95508 - 8.85254 GB/s [ ] Full duplex copy from node to node by [push|pull, NONE] [ ] [1 -> 0] 11.0986 - 11.0986 GB/s [ ] [1 -> 2] 7.54297 - 7.54297 GB/s [ ] [2 -> 0] 12.0264 - 11.9639 GB/s [ ] [2 -> 1] 12.0469 - 12.0371 GB/s [ ] Full duplex copy from node to node by [push, push] [ ] [1 <-> 2] 11.7324 - 11.4541 GB/s [ ] Full duplex copy from node to node by [pull, pull] [ ] [1 <-> 2] 11.4824 - 11.0508 GB/s [ ] Copy from node to multiple nodes by [push, NONE] [ ] [1 -> [0...2]] 5.625 - 5.73633 GB/s [ ] [2 -> [0...2]] 6.45801 - 6.4707 GB/s [ ] Copy from multiple nodes to node by [push, NONE] [ ] [[1...2] -> 0] 12.8379 - 12.2578 GB/s Now we can get more timestamp info like below.
Copy from node to node by [push, NONE] [1 -> 0] [1 : 0] #-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-##-#-#-#-#-############################### [1 : 1] #################################################################################################### [1 -> 2] [1 : 0] #--#-#-#-#-#--#-#-#-#-#--#-#-#-#-#--#-#-#-#-#-#--#-#-#-#-#--#-#-#-#-#--#-#-#-#-#-#--#-#-#-#-#--#-#-###################################### [1 : 1] ##################################################################################################-# [2 -> 0] [2 : 0] ##-###-##-###-###-##-###-##-###-###-##-###-###-##-###-###-##-###-##-###-###-##-###-###-##-###-###-################# [2 : 1] ###############################################################################-#############-###-## [2 -> 1] [2 : 0] ##-##-##-##-##-###-##-##-##-##-##-###-##-##-##-##-###-##-##-##-##-##-###-##-##-##-##-###-##-##-##-#################### [2 : 1] ################################################################################-###-############-## [snip] Full duplex copy from node to node by [push, push] [1 <-> 2] [1 : 0] #-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-#################################### [1 : 1] ################-###################################################-############-####-############# [2 : 2] #-##-##-##-#-##-##-##-##-#-##-##-##-##-#-##-##-##-##-#-##-##-##-##-#-##-##-##-##-##-#-##-##-##-##-#-##-##-##-##-##-#-################## [2 : 3] #####-######-#####-######-#####-######-#####-######-#####-######-#####-######-#####-######-#####-######-#####-#####-## Full duplex copy from node to node by [pull, pull] [1 <-> 2] [1 : 0] ######################################################################-##-#-###############-####-### [1 : 1] #-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-##-#-#-############################ [2 : 2] 
##-##-##-##-###-##-##-##-##-###-##-##-##-###-##-##-##-##-###-##-##-##-###-##-##-##-##-###-##-##-##-##-###-##-##-##-###-##-##-############ [2 : 3] #-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#########-############# Copy from node to multiple nodes by [push, NONE] [1 -> [0...2]] [1 : 0] #-#--#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-#-############################### [1 : 1] ########################################################################################-###-###-### [2 -> [0...2]] [2 : 0] ##-##-##-###-##-###-##-##-###-##-###-##-##-###-##-###-##-###-##-##-###-##-###-##-##-###-##-###-##-################## [2 : 1] -################################################################################################-## Copy from multiple nodes to node by [push, NONE] [[1...2] -> 0] [1 : 0] #-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-#-#-#-#-#-##-#-#-#-############################### [1 : 1] ################################################################################################-#-# [2 : 2] ##-##-##-###-##-##-###-##-##-##-###-##-##-###-##-##-###-##-##-###-##-##-##-###-##-##-###-##-##-###-##-################## [2 : 3] #########################-#########################-#########################-######################### [ OK ] KFDPerformanceTest.P2PBandWidthTest (15982 ms) Change-Id: Ia90044191d51650ccb220476d31fb317aa3ad6ce Signed-off-by: xinhui pan <xinhui.pan@amd.com> [ROCm/ROCR-Runtime commit: `e5a541eaf2`]	2018-09-19 12:03:05 +08:00
xinhui pan	6b357be502	kfdtest: add KFDTestUtilQueue Add the following infrastructure: Implement SdmaTimePacket, which records the global GPU timestamp. Introduce the classes AsyncMPSQ and AsyncMPMQ. AsyncMPSQ stands for async multiple packet single queue; it takes a set of packets at creation and submits them to a GPU to run. AsyncMPMQ stands for async multiple packet multiple queue; it manages a set of AsyncMPSQ objects and uses a for loop to operate on them. Implement sdma_multicopy helper functions. Change-Id: I47e1d2ca9630113b2a1d85a0055f3f8ee629fb5f Signed-off-by: xinhui pan <xinhui.pan@amd.com> [ROCm/ROCR-Runtime commit: `f618b3f075`]	2018-09-19 12:03:05 +08:00
Xiaojie Yuan	ca0873a234	Use 'RecordProperty' to record performance scores For following test cases: - KFDQMTest.QueueLatency - KFDQMTest.BasicCuMaskingLinear - KFDQMTest.BasicCuMaskingEven - KFDMemoryTest.MMBandWidth - KFDMemoryTest.MMapLarge - KFDMemoryTest.MMBench v2: xml element cannot start with a number, so change the key name of MMBandWidth and MMBench accordingly xml element cannot contain whitespaces, so trim whitespaces in "VRAM " v3: introduce KFDLog-like way to use KFDRecord Change-Id: Ifc3ed5657621252a7b39dccf1ef4f50a92593f77 Signed-off-by: Xiaojie Yuan <xiaojie.yuan@amd.com> [ROCm/ROCR-Runtime commit: `247fa9f1e0`]	2018-09-18 17:41:14 +08:00
xinhui pan	175bd1ed3d	kfdtest: Do not set GTEST_FLAG throw_on_failure This change comes from commit a505c9bb("kfdtest: Do not set GTEST_FLAG throw_on_failure"), but it was unexpectedly reverted by commit b86f1456("kfdtest: Clean up comments"), so add it back. Fix: `b86f1456` Change-Id: Ia9e99c9ca17b99aab62b4db55017018ddae43dfb Signed-off-by: xinhui pan <xinhui.pan@amd.com> [ROCm/ROCR-Runtime commit: `a6287ba919`]	2018-09-11 10:25:56 +08:00
xinhui pan	501d3878ae	kfdtest: Fix queuelatency fail issue The timestamp written by releaseMemory packet might still not be visible when we fetch it. To fix this bug, use event-based wait. Change-Id: If2324eb3b3a632c711ee4dff4d03a93d5306c289 Signed-off-by: xinhui pan <xinhui.pan@amd.com> [ROCm/ROCR-Runtime commit: `07bd97a864`]	2018-09-10 21:17:29 -04:00
Felix Kuehling	c08dca02d7	libhsakmt: Fix segfault on gfx801 Handle the case that svm.dgpu_aperture does not exist in vm_find_object. Change-Id: Ic0983d4f321f1b6248514f2fa25162976e90bd75 Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> [ROCm/ROCR-Runtime commit: `be574169c1`]	2018-09-10 14:39:05 -04:00
Harish Kasiviswanathan	af0eadcee6	kfdtest: GetNodeIoLinkProperties: Display NodeFrom Use the NodeFrom returned by hsaKmtGetNodeIoLinkProperties() to check its correctness. Change-Id: I6ce436dc7c5d5b192bee21156292bd3eff77f916 Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> [ROCm/ROCR-Runtime commit: `1fda429726`]	2018-09-10 09:44:24 -04:00
Harish Kasiviswanathan	a0cee77f82	Add cgroup support Some nodes are unavailable based on the task's cgroup hierarchy. Handle this situation by ignoring those nodes Change-Id: I72f9e822d2ec8cf15732df95e427d5549a75b55d Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> [ROCm/ROCR-Runtime commit: `7876bb70a9`]	2018-09-06 16:56:32 -04:00
Harish Kasiviswanathan	ef0dad6679	iolinks: Handle GPU resource management With GPU resource management, some nodes are unavailable based on the cgroup hierarchy of the task. Kernel via sysfs specifies all the iolinks. Skip the links which are not accessible. Also iolinks specified by the kernel refer to sysfs Node IDs. Map it to relevant user Node IDs v2: NodeFrom mapped from sysfs Node to User Node Change-Id: I95312ee6ca51b89fe9e6ca2a9185c2ea1e94afc4 Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> [ROCm/ROCR-Runtime commit: `866ef20054`]	2018-09-06 16:56:07 -04:00
Harish Kasiviswanathan	b3329ec72d	Replace global variable _system with g_system Change-Id: I452090473a5b46b32204f7f916bdcfdd3e8a47bd Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> [ROCm/ROCR-Runtime commit: `f84a99e953`]	2018-09-06 16:56:07 -04:00
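
Note on commit 6ab4bbe2a8: the build error quoted there arises because the comparator passed to std::sort has the type bool(SDMACopyParams&, SDMACopyParams&), while the GCC 4.8.2 implementation calls it with a const pivot. One common way to avoid this class of error is to take both comparator arguments by const reference. The sketch below illustrates that idea only; the struct `CopyParams`, its fields, and the function names are hypothetical stand-ins and are not taken from kfdtest.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a copy descriptor; not the real SDMACopyParams.
struct CopyParams {
    int node;          // assumed field: node the copy is issued from
    std::size_t size;  // assumed field: bytes to copy
};

// Both arguments are const references, so the comparator can be called
// with const objects (such as the pivot inside std::sort) without error.
bool ByNode(const CopyParams& a, const CopyParams& b) {
    return a.node < b.node;
}

// Sorts the descriptors in ascending node order.
void SortByNode(std::vector<CopyParams>& params) {
    std::sort(params.begin(), params.end(), ByNode);
}
```

A comparator with non-const reference parameters may still compile with other standard library versions, which is why an error like this can show up on only one distribution.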


459 Commits
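
Note on commit f709e5f94d: the message contrasts sysconf(_SC_NPROCESSORS_ONLN) with sched_getaffinity(), which reports the processors the current process may run on. The sketch below shows the two calls side by side, assuming Linux with glibc. The function names `CpusAvailableToProcess` and `CpusOnline` are hypothetical, and this is not the libhsakmt implementation.

```cpp
#include <sched.h>
#include <unistd.h>

// Number of CPUs the calling process is allowed to run on. This reflects
// a cpuset restriction such as one imposed by a container.
// Returns -1 if the affinity mask cannot be read.
int CpusAvailableToProcess() {
    cpu_set_t mask;
    CPU_ZERO(&mask);
    // pid 0 means "the calling thread".
    if (sched_getaffinity(0, sizeof(mask), &mask) != 0) {
        return -1;
    }
    return CPU_COUNT(&mask);
}

// Number of CPUs currently online system-wide, regardless of cpuset.
long CpusOnline() {
    return sysconf(_SC_NPROCESSORS_ONLN);
}
```

In a process restricted to a cpuset, the first value can be smaller than the second. The fixed-size cpu_set_t covers 1024 CPUs, so larger systems would need the dynamically sized CPU_ALLOC variants.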