rocm-systems

Author	SHA1	Message	Date
Shane Xiao	5d6f900353	kfdtest: DeviceHdpFlush need set target ASIC with different Gfx versions If Dev0 and Dev1 are not the same gfx, we should temporarily set the target ASIC for compiling Shader code. Signed-off-by: Shane Xiao <shane.xiao@amd.com> Signed-off-by: Shikai Guo <shikai.guo@amd.com> Change-Id: I5836beb16ade519f5a148d3d2b9c2875554f0c35	2023-05-09 09:50:07 -04:00
Sam Wu	57b3fcde51	add sphinx configurations Change-Id: I1a66a02b18fb699415a87a6473eb72c097a13b5f	2023-05-08 15:58:01 -06:00
Graham Sider	54136f60a0	kfdtest: Add Assembler::RunAssembleBuf overload Overload Assembler::RunAssembleBuf to take in an extra Gfxv parameter. Using this overload will temporarily set the target ASIC to Gfxv before calling RunAssemble, and copy back the original MCPU literal upon completion. The copy to reset the original MCPU in this case is safe as the MCPU length is always known. This will be useful in multi-device test cases whereby the devices are not necessarily the same gfx version. The overload is explicitly for the RunAssembleBuf wrapper rather than RunAssemble to ensure the default MCPU is always reset independent of errors in RunAssemble. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I7fe5a962876314b6df32e4b7160174949d98f9e3	2023-05-08 11:35:32 -04:00
Graham Sider	7a4c9273d7	rocrtst: Move kernel object loading outside of loops Negative queue validation tests were doing many redundant from-file kernel object loads in a loop. This was creating many simultaneous open file handles within many dynamically allocated CodeObject objects. While the CodeObject class implements RAII on the file handles to clean up on destruction, clear_code_object() only gets called on the destruction of the TestBase-derived test objects (these being a suite abstraction). Due to this we were hitting file open() EMFILE errors (too many open files) in gfx94x CPX mode. Move LoadKernelFromObjFile outside of the test loops and clear_code_object() for each test on each agent. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I6f9d23fd122720c49a58c22698f097906d2fc97c	2023-04-27 16:16:12 -04:00
David Yat Sin	a180c9ee78	Add env var to override SRAM ECC Add HSA_ENABLE_SRAMECC environment variable that can be used to override SRAM ECC mode reported by KFD Change-Id: I2b95511820a2d3d146a76b03070659c0695b61fd	2023-04-27 16:16:05 -04:00
David Yat Sin	f024d21e3d	Add query for number of XCCs per agent Change-Id: I4b694b4904ba0326c998356388a62c19a972a7ff	2023-04-27 16:15:59 -04:00
Mike Li	46b667e530	Return failure with any IMAGE attribute for gfx940 The gfx940 does not support IMAGE instructions. Any get_info with IMAGE attributes should return failure. Signed-off-by: Mike Li <Tianxinmike.Li@amd.com> Change-Id: I12005628f92780f551ab6f8b41526c66b54c6a59	2023-04-27 16:15:51 -04:00
Mike Li	d7fa654338	Do not use the function part of the location_id The function IDs used to be 0 on previous ASICs but on gfx94x and newer ASICs, these bits are set. These bits are used by user applications to uniquely identify the locations of GPU nodes. These extra bits break hwloc and are not needed for rocrtst. Signed-off-by: Mike Li <Tianxinmike.Li@amd.com> Signed-off-by: David Yat Sin <David.YatSin@amd.com> Change-Id: I1202f504645b0662d009b9c0926eebb7ddc08d73	2023-04-27 16:15:43 -04:00
Mike Li	9554e95de0	Scratch memory changes to support multi-xcc Change-Id: I115ba4cfe250c59cb7421217cfe0fad6302f25b3	2023-04-27 16:15:30 -04:00
Laurent Morichetti	f31b312611	Update the trap handler for gfx940 gfx940 uses ttmp11 to hold the queue packet index so the first level trap handler uses ttmp13 instead to save ib_sts. Repurpose ttmp11[31] to mean that the ttmps are initialized. The issue was that the debugger could not tell whether ttmp6 was written by the trap handler when determining the stop reason. If ttmp11[31]=0, then the trap handler has not been executed and ttmp6 should be assumed to be 0. If ttmp11[31]=1, then ttmp6 holds the trap_id, if an s_trap instruction caused the exception. Signed-off-by: Laurent Morichetti <laurent.morichetti@amd.com> Signed-off-by: Lancelot Six <lancelot.six@amd.com> Change-Id: I9af903abae044b9ec530306229caf3b883f3ee46	2023-04-27 16:15:14 -04:00
Mike Li	de4d1ce424	Add gfx940 to AmdHsaCode Signed-off-by: Mike Li <Tianxinmike.Li@amd.com> Change-Id: Ib4f7c801c3d3bac9a04c880c5bf86b72bfa3404f	2023-04-27 16:09:26 -04:00
Mike Li	bd98a1e5bf	Added gfx940 ISA Signed-off-by: Mike Li <Tianxinmike.Li@amd.com> Change-Id: Icb1830fe186abc69fe7ee709b7f12b882cab9e87	2023-04-27 16:08:58 -04:00
Graham Sider	92f3d4a458	kfdtest: Fix new shader directives LLVM MC does not seem to accept multi-line conditionals. This may be fixable in the future with macros. The Aqua Vanjaram shader spec states that while buffer_invl2 has been replaced by buffer_inv, the former may still be used for compatibility. However, this does not seem to be implemented. For now, fix conditional. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I7f8b64c96055371d7e0090b758d2cfd2a37ecd3c	2023-04-27 10:48:44 -04:00
xinhui pan	77761836ae	thunk: Fix and optimise for pointer range search Previous code might fail to get the correct ln node. And trigger extra walk through of the tree. Fix it. While walking through the tree, better to search from right to left as the node->start likely close to address. Change-Id: If86ddf73e59a1eb88225d1ea90797818e8165488 Signed-off-by: xinhui pan <xinhui.pan@amd.com>	2023-04-20 19:36:29 -04:00
David Francis	eed5518e4c	kfdtest: Enable gfx90a coherency tests on Aqua Vanjaram These tests should also pass on Aqua Vanjaram, so enable them Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: Ibbb9cd43d653c63b08c39efd1d7326cfac1f8411	2023-04-19 10:28:05 -04:00
David Francis	30b1f23f7a	kfdtest: Add coherency tests for Aqua Vanjaram Aqua Vanjaram is intended to have fine-grained coherency from anywhere to anywhere else using read-acquire and write-release primitives. Add a test that writes to memory covered by five different cache lines, then write-releases, while another thread read-acquires, then reads those five locations in memory. There are nine variations of the test to cover CPU-GPU, same-GPU and across-GPU, vector instructions and scalar instructions, and data local to the acquirer or receiver. Signed-off-by: David Francis <David.Francis@amd.com> Change-Id: I20d2db5c53bd280e971479aad7e61df6ed5d3623	2023-04-19 10:28:05 -04:00
Philip Yang	0696f06c16	kfdtest: fix KFDSVMRangeTest.MultiGPU tests vector iterator For vector iterator loop access current node directly, don't need gpuNodesAll.at(i), which also causes out of range access. Change vector index loop to iterator loop to simplify the code. Change-Id: I2627ef8d13b5d2c9cd8c51cf4dacc3e8a97fcfb0 Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2023-04-18 17:58:06 -04:00
Alex Sierra	e82025bffa	use mkstemp instead tempnam for temp file tempnam has been marked as obsolete. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Change-Id: Ie64d9a351bf386da00a96ceff059f685e11f2cca	2023-04-17 15:38:59 -04:00
Philip Yang	21abaef3f8	kfdtest: AppAPU Skip KFDEvictTest, KFDSVMEvictTest, HMMProfilingEvent AppAPU VRAM is part of system memory managed by the Linux kernel, so no VRAM eviction and restore is needed between VRAM and system memory. Those Evict tests currently fail on AppAPU, so skip them on AppAPU. There is no page migration between VRAM and system memory on AppAPU, and HMMProfilingEvent depends on migration events, so skip it on AppAPU. Change-Id: I4c809b97c947e809d136c1f88db2278cf74f5b47 Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2023-04-17 13:28:23 -04:00
Philip Yang	e2df2c21af	kfdtest: Add helper to check if IsAppAPU system If there is connection between GPU and CPU with weight 13, KFD_CRAT_INTRA_SOCKET_WEIGHT, then this is AppAPU. This will be used to skip tests not suitable for AppAPU. Change-Id: If6fad81528b52afd4ac4cefa508d787b0f6637ca Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2023-04-17 13:28:23 -04:00
Lancelot SIX	183f5d90aa	linux os_thread: improve error handling On Linux, the os_thread abstraction is built on top of pthread. Many of the pthread calls might fail and return error codes. The error conditions are only checked via assertions (if ever checked) which means that when doing a release build, no error condition is checked. The same goes for dlsym/dlinfo and clock_gettime. This commit improves the situation by checking the error conditions and acting accordingly. When the error condition is detected in a function with a means to indicate some error to its caller, then this patch prints some error message and returns. If there is no way to propagate the error up the call stack, print some error message and abort the process. For the os_info::os_info ctor, the only user is CreateThread, which checks that the built thread is Valid(). If not, nullptr is returned to the caller. It could be possible to use exceptions when functions cannot pass errors, but for now I only use abort as it is what abort would do with debug build. Change-Id: I815703c3b95777cc29bb89a7d654ac879c14a759	2023-04-17 09:48:11 -04:00
Lancelot SIX	72219b8237	Runtime::GetSystemInfo: Supress parentheses warning When building with g++-11.3.0, I have the following warning: /home/.../core/runtime/runtime.cpp: In member function ‘hsa_status_t rocr::core::Runtime::GetSystemInfo(hsa_system_info_t, void*)’: /home/.../core/runtime/runtime.cpp:693:56: warning: suggest parentheses around ‘&&’ within ‘\|\|’ [-Wparentheses] 693 \| kfd_version.KernelInterfaceMajorVersion == 1 && \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~ 694 \| kfd_version.KernelInterfaceMinorVersion >= 12) \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ This patch adds the parenthesis as suggested. This silences the compiler warning. No functional change expected. Change-Id: I69c1a73a432b0f2393dbaf36d4424cf0056c535f	2023-04-17 09:43:02 -04:00
Jesse Zhang	4d54d6e706	libhsakmt: Add compute core check for APU We should check compute core instead of cpu core, in order to exclude the case of APU. Signed-off-by: Jesse zhang <jesse.zhang@amd.com> Change-Id: I2ec2a6807f51f49f80e0e500f5d9af81c2efae37	2023-04-17 09:34:37 +08:00
Graham Sider	11a04fe1f5	kfdtest: Fix PersistentIterateShader for gfx target 9.4.x Replace 'flat_load_dword <...> glc' with appropriate macro. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I9fdc7c916c685304457cd9698e741577f6c10c82	2023-04-14 12:05:08 -04:00
Graham Sider	e2435d9e93	kfdtest: Add flat compatibility macros for gfx target 9.4.x For GC 9.4.0, modifications were made to various shaders since certain flat_ instructions no longer support glc/slc modifiers (replaced with nt/sc1/sc0). Instead of repeating conditionals inside various shader bodies, we can make use of LLVM AMDGCN macros. This patch modularizes the shader macros into separated defines. Prior to the core raw-string literal, each shader now starts with the SHADER_START literal (".text\n") plus any number of SHADER_MACRO_* literals. This allows us to separate the macro definitions logically and use the pre-processor to only include the required macro groups on a per-shader basis. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I19eb3fd14252a0601bb7509249051b68e7fdb02a	2023-04-14 12:05:08 -04:00
David Francis	680c8ca5a9	kfdtest: Make queue evict tests use constant number of wavefronts. Previously, KFDEvictTest.QueueTest and KFDSVMEvictTest.QueueTest would create a variable number of wavefronts, one for each 64MB of memory under test. This ran into limits on the buffers used by the wavefronts, and may at some point have exceeded the wavefront limit. Restrict the number of wavefronts to 512, and adjust the shader to accommodate a variable buffer size. Signed-off-by: David Francis <David.Francis@amd.com> Change-Id: I2ec292e2900e2efa62a08313bca3d2f4bdabca8b	2023-04-14 12:05:08 -04:00
Graham Sider	831d1ad352	libhsakmt: Mask stepping version for GC 9.4.3 checks GC 9.4.3 to set gfx target version to 9.4.x dependent on revision and capabilities. Due to this, where applicable, mask off the gfx target stepping version and only check major/minor version (9.4). There are no collisions due to this change since GC 9.4.3 is the only ASIC that uses gfx target version 9.4.x. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I72803e594c421f054d18ccfa7e92c507128fa5be	2023-04-14 12:03:23 -04:00
Philip Yang	598e3e8d86	kfdtest: KFDMemoryTest.DeviceHdpFlush requires large bar KFDMemoryTest.DeviceHdpFlush requires device node 0 is large bar to check VRAM content from CPU, run the test only if device 0 is large bar GPU. Change-Id: I874b153219550c50b724625e971e3ed3a84dc652 Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2023-04-14 10:03:38 -04:00
David Francis	e32278a612	kfdtest: Restrict DriverHDPFlush to systems with PCIe Nodes with XGMI have no HDP, so DriverHDPFlush should skip. Signed-off-by: David Francis <David.Francis@amd.com> Change-Id: If5a87e660712e51d03e750d8e044786036b2e603	2023-04-14 10:03:38 -04:00
David Francis	16c6530330	kfdtest: Deprecate PollNCMemoryIsa Even with the restriction to only compile on gfx90a, this shader still fails CompileShaders test. There don't seem to be any systems that actually use it. Leave it in the shader store, but remove it otherwise Signed-off-by: David Francis <David.Francis@amd.com> Change-Id: I41bec6ba10363d42b163ac101c3a92edaad6d6df	2023-04-14 10:03:38 -04:00
David Francis	2a01e5c33b	kfdtest: Use scalar path for PollMemoryIsa Shader on gfx940 A gfx940 code path was erroneously added to this shader. It's unnecessary; without this path, the shader uses the scalar store, which works just fine on gfx940 without changes. Remove it. Signed-off-by: David Francis <David.Francis@amd.com> Change-Id: I825cbbebbdb25c4a7c2f16e228c2bea6a6bcc30c	2023-04-14 10:03:38 -04:00
Ori Messinger	c234f84245	kfdtest: Update blacklist for Aqua Vanjaram Signed-off-by: Ori Messinger <Ori.Messinger@amd.com> Signed-off-by: Amber Lin <Amber.Lin@amd.com> Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com> Signed-off-by: David Francis <David.Francis@amd.com> Signed-off-by: Philip Yang <Philip.Yang@amd.com> Change-Id: I8f822bb71e8e5dbee6bdb62f77cbe5ea83faabb5	2023-04-14 10:03:38 -04:00
David Francis	30da9a3cf9	kfdtest: Update shaders to compile on gfx940 gfx940 changed the semantics of the glc and slc coherency options on vector stores and loads. This means that shaders that use those bits no longer compile on gfx940. Add precompilation if statements to those shaders to use the new coherency bits. Also add gfx940 to ASMTest so that compilation is tested. Note: One of the tests enabled by this patch on gfx940, KFDEvictTest.QueueTest, does not pass on gfx940 emulators. Signed-off-by: David Francis <David.Francis@amd.com> Change-Id: I942f9d2536e9eb5510c4d5af30df6ff1a95c8cf7	2023-04-14 10:03:38 -04:00
Graham Sider	ae659e5427	libhsakmt: Fix queue destroy SVM path free size Use q->total_mem_alloc_size for munmap in SVM codepath of free_queue. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I2fecaa1ddb337b1fe71f9cbba45a0c9467eff0c0	2023-04-14 10:03:38 -04:00
Mukul Joshi	a713fb766e	libhsakmt: Fix memory leak on queue destroy for GFX9.4.3 Currently, on queue destroy, context save restore memory is freed only for a single XCC. Instead, we need to free the entire context save restore memory, which was allocated for all XCCs. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Change-Id: I51ebb12fa8d5ebed41979d68e74f7c5392dca062	2023-04-14 10:03:38 -04:00
David Belanger	252a2cf959	libhsakmt: EOP Removal Do not allocate the EOP buffer when not required. Signed-off-by: David Belanger <david.belanger@amd.com> Change-Id: I1664a3f0a882219a72278174006cdb8d46fd4f5e	2023-04-14 10:03:38 -04:00
Mukul Joshi	8994c3ba0e	kfdtest: Program COMPUTE_PGM_RSRC3 for GFX 9.4.3 Program ACCUM_OFFSET to match the number of VGPRS used by the shader as part of Dispatch setup. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Change-Id: Icfa1fbe4de2a62f00743de567f3ed382d3378b17	2023-04-14 10:03:38 -04:00
David Yat Sin	f43a284b8e	Change error reported when receiving code 128 We used to report HSA_STATUS_ERROR_INVALID_ISA when receiving error code 128, but there are several other reasons why we could be exceeding number of VGPRs, so updating the error code. Change-Id: I6a6980d5b07b09c93d00dee5207a0d52399bc77e	2023-04-14 09:12:07 -04:00
Graham Sider	fd48f14ceb	libhsakmt: Update FD creation logic In multi-partition modes, e.g. CPX, we want to create new file descriptor despite using the same render node. Update open_drm_render_device to use a gpu_id to fd map partitioned by render node. Different gpu_id's requesting the same render node will be added to that render node's map list for fetching its fd. Different gpu_id's requesting different render nodes as well as the same gpu_id's requesting the same render node will behave as they did previously. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: Ie153d42355d4d75b1c6ba6ff40fac3295bc87009	2023-04-13 15:25:09 -04:00
Mukul Joshi	97a669a979	libhsakmt: Update context save handling for multi XCC Allocate debug area big enough for all XCCs in the partition. Also, fix the cu_num calculations as driver now reports cu_num as the total number of CUs in the partition. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Change-Id: I6e80d57196b770bb3c2506bc58cb366c0046084b	2023-04-13 15:25:09 -04:00
Graham Sider	6be4461a0d	libhsakmt: Add Aqua Vanjaram support Add gfx version for VGPR size per CU calc, add FAMILY_AV to KfdFamilyId, add blacklist filter to kfdtest.exclude. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I9b8072e45f4d497e0a8fd3f8f97f1425238e8b42	2023-04-13 15:25:09 -04:00
David Yat Sin	511855d344	Fix assertion when _GLIBCXX_ASSERTIONS is enabled On some platforms, e.g. Arch Linux, the -D_GLIBCXX_ASSERTIONS compile flag is enabled by default, causing a runtime assertion. Avoid assertion by using std::vector accessor function data(). Change-Id: I118cdf102c3e353f32c618823e363ee1059f3453	2023-04-11 11:40:10 +00:00
David Yat Sin	c5bf7eb112	Fix for overwriting pointer info size Fix for overwriting pointer info size provided by caller of hsa_amd_pointer_info. Change-Id: I2e5d73ab9ba1a32bc9b4d112bc29b4a99fd8b3b5	2023-04-06 16:35:37 -04:00
David Yat Sin	8ebf5f9c48	Adding scratch memory reservation Some applications will keep trying to allocate device memory until the allocation fails. This causes all device memory to be used up and we are then unable to allocate scratch memory for dispatches. Reserve enough memory for 1 small scratch allocation. Change-Id: I968400d41540ba1aca8f28581f229693eec02225	2023-04-06 15:13:36 +00:00
Kent Russell	d0c2770cde	CMakeLists: Use pkgconfig more effectively with DRM_DIR Instead of hard-coding lib64 and other include locations, just prepend the DRM_DIR to the beginning of the CMake prefix path. Then let pkgconfig find the package, the same way that it would if DRM_DIR wasn't set. DRM_DIR takes precedence, but the default paths will be used if DRM_DIR isn't set, or doesn't point to where libdrm is housed Note that /lib and /lib/$ARCH aren't required for DRM_DIR, just the path to the root folder for the package (e.g. /opt/amdgpu instead of /opt/amdgpu/lib or /opt/amdgpu/lib64 or /opt/amdgpu/lib/x86_64-linux-gnu etc) Change-Id: I56767db28476d14e3fa77be1089c3904e2a32450	2023-04-06 10:39:40 -04:00
Kent Russell	aab0e36538	README: Update README to point to current documentation Signed-off-by: Kent Russell <kent.russell@amd.com> Change-Id: I3fed80e94edf5ff08a70b2e43450fe8168c5d355	2023-04-05 10:35:49 -04:00
Graham Sider	287cb29340	Revert "kfdtest: add MES judging API in test utility." See description of previous revert. This reverts commit `564913526a`. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I969dc6469e62b50cd7ba0595918538602afa7516	2023-03-27 17:08:03 -04:00
Graham Sider	0750856d4a	Revert "kfdtest: Using non-paged memory allocation only on devices that have MES scheduler" This patch and the previous made it such that the queue ring buffer was allocated as non-paged for GFX11+. The queue ring buffer should not be mapped as non-paged; the non-paged requirement on GFX11 is only needed for the queue wptr. This patch was causing issues on various tests, such as intermittent CP_INTSRC_BAD_OPCODE interrupts. This reverts commit `e40ae8481e`. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I55b64aed73dc3b792f0756ae00daf6e10d93ce10	2023-03-27 17:07:59 -04:00
Graham Sider	5d80a4d214	kfdtest: Add KFDQMTest.BasicCuMaskingEven to GFX11 blacklist Test is inconsistent across ASICs. Add to blacklist to unblock QA. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I31e5aa2450165227107536bef8402db2c0dc6d7f	2023-03-23 11:14:58 -04:00
Alex Sierra	2a1d6ee8b5	libhsakmt: query svm info from userptrs at fault events Get more debug information about user pointers that were registered through SVM API, and triggered by memory exception events. A new kfdtest with this use case was also included inside KFDExceptionTest. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Change-Id: I0ef4929afe0625b9b5cbbbebef11ede66dda60ab	2023-03-22 13:34:02 -05:00


2959 commits
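
Commit e82025bffa in the log above replaces the obsolete tempnam with mkstemp. The following is only a minimal sketch of that general pattern, not the code from the commit: the helper name CreateTempFile and the path prefix used with it are hypothetical and chosen for illustration. Unlike tempnam, which only returns a name that another process may claim before the file is opened, mkstemp creates and opens the file in one step.

```cpp
#include <cstdlib>   // mkstemp (POSIX; declared in <stdlib.h>)
#include <string>
#include <unistd.h>  // close, unlink, write

// Hypothetical helper (not part of the repository): creates a uniquely named
// temporary file whose name starts with `prefix`. On success it returns an
// open file descriptor and stores the generated name in `path_out`;
// on failure it returns -1 and leaves `path_out` untouched.
int CreateTempFile(const std::string& prefix, std::string& path_out) {
  // mkstemp requires the template to end in exactly six 'X' characters,
  // which it overwrites in place with a unique suffix.
  std::string tmpl = prefix + "XXXXXX";

  // Since C++11, std::string storage is contiguous and null-terminated,
  // so &tmpl[0] is a valid mutable C string for mkstemp to modify.
  int fd = mkstemp(&tmpl[0]);
  if (fd == -1) {
    return -1;
  }

  path_out = tmpl;
  return fd;
}
```

The caller owns both the descriptor and the file: it is expected to close(fd) when done and unlink the path once the file is no longer needed.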