rocm-systems

Author	SHA1	Message	Date
James Zhu	4ba8f1fe77	kfdtest: Add test for event wait with event age tracking enable Add 5 different test scenario to cover new event age tracking features. Change-Id: Icab43240fd127208b18abbd7542d6444127ef0c7 Signed-off-by: James Zhu <James.Zhu@amd.com>	2023-06-10 11:41:50 -04:00
James Zhu	a0cbf90b90	libhsakmt: add event age tracking Keeping last signaled event age to avoid race conditions for HSA_EVENTTYPE_SIGNAL when event age init value is non-zero. Change-Id: Ifb9a11a6868e5762a9f92f579e45a0a2c8fa1017 Signed-off-by: James Zhu <James.Zhu@amd.com>	2023-06-10 11:41:50 -04:00
Laurent Morichetti	6a82b0a038	Fix a race condition in the trap handler status.priv may be read after returning from the trap handler, which causes sq_interrupt_word_wave.priv to be 0 even though the s_sendmsg instruction was initiated when status.priv was 1. To work around this, added a s_waitcnt lgkmcnt(0) after s_sendmsg to make sure the message is sent before continuing. Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com> Signed-off-by: Laurent Morichetti <Laurent.Morichetti@amd.com> Change-Id: Ieb75005ca1559ef03d0efac80e966f521e41fcb7	2023-06-09 10:03:55 -04:00
Ammar ELWazir	fc603d58d2	: Adding support to UMC & MMEA System Blocks Change-Id: I92601f37757e0cff3f1fdc10f2e5e0db51c1ee2d	2023-06-08 21:22:19 +00:00
Ori Messinger	4675492852	kfdtest: Fix minor typo The purpose of this patch is to fix a minor typo in KFDSVMRangeTest. Before: "Skipping test: no enough system memory." After: "Skipping test: Not enough system memory." Signed-off-by: Ori Messinger <Ori.Messinger@amd.com> Change-Id: I247cb558a177a1d25c393bf16c7386f4d79d0fba	2023-06-08 15:58:25 -04:00
Graham Sider	d1a095123d	kfdtest: Update GFX11 blacklist KFDQMTest.MultipleCpQueuesStressDispatch is fixed as of MES SCHQ version 0x3c (). Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I437f3eb5f12dc159339a9b7c7cff2e2b8214ad7c	2023-06-08 14:11:30 -04:00
Jonathan Kim	233413eb08	Remove Tab Indent on SDMA Status Fix Use spaces not tabs. Change-Id: Icaeb16158ebaddd8e5ac518103d285d55fe976f3	2023-06-07 16:47:04 -04:00
Xiaomeng Hou	389cd3564b	Do not reserve scratch memory on asic with finite vram resource Change-Id: I0a2207cb01f464ed3e73331637cfa9bd62f03d97	2023-06-06 22:01:31 +08:00
Sreekant Somasekharan	1428a7538e	kfdtest: RoundToPowerOf2 function modified for compiler compliant bit shift values Compiler behavior is undefined if the right operand is negative, or greater than or equal to the width of the promoted left operand. For release builds with address sanitizer enabled, this compiler optimization behavior leads to unsupported queue size value since current method shifts till 128 bits on a 64 bit value. Signed-off-by: Sreekant Somasekharan <sreekant.somasekharan@amd.com> Change-Id: Iafdc82d0dfb7f79e3012fb7bb70eda80e4b7a7a6	2023-06-05 18:14:58 -04:00
David Yat Sin	e4fffa140a	Removing __linux__ definition in CMake Removing this definition as this should already be defined by compiler. This is causing compile errors on newer versions of llvm because the macro is being redefined. Change-Id: Ica6a06f46a14e16d3f52e83b9b5ee8cfd7359510	2023-06-05 12:23:56 -04:00
Graham Sider	e2c3c3e510	Revert "Disable Queue_Validation_InvalidGroupMemory" This reverts commit `7b74271d5e`. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I8424c96d5e5c3c9a9e7711ecff7c5372190b0d2d	2023-06-05 09:41:02 -04:00
Graham Sider	dbe2a82e35	rocrtst: Remove extra clear_code_object() calls A patch was made in gfx940 npi branch to move the kernel object file loading to outside the rocrtstNeg.Queue_Validation_* main queue creation and submission loops, and added a clear_code_object() after the loop. Another patch was made to the non-npi branch which adds a clear_code_object() inside the loop. When the npi branch patch was merged, this was causing the code object to be cleared at the end of the first loop. Remove these clear_code_object() calls. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: Id4188e78411e81c5071bf715c1f02491f571ab79	2023-06-05 09:41:02 -04:00
Alex Sierra	728162c2c8	libhsakmt: include changes for upstream debugger API Signed-off-by: Alex Sierra <alex.sierra@amd.com> Change-Id: Id296e13dff431c7a151c5aae0b93412b1e116467	2023-06-02 15:56:19 -04:00
Xiaomeng Hou	557da77c4e	Correct the SDMA engine mask reported on apu There is only one SDMA instance on small APUs. Change-Id: I9d4dda511c40fc78f002be720e5f1909dc5b91e4	2023-06-02 19:10:08 +08:00
David Yat Sin	fc3b554121	Change failure to parse CPUID to warning Change-Id: If42dbcd11ac1be09597e43a8f11caa91cf37903e	2023-05-31 11:46:52 -04:00
Kent Russell	718d95de77	fmm.c: Fix possibly uninitialized variable usage If we end up in the first if clause, aperture_base is not set, unlike the other 2 clauses. Initialize it to NULL at declaration time, and only change its value in the final else clause, where we set it to aperture->base Change-Id: I2bf44dc93cae8a03e66f41cedd85d57be2115bba Signed-off-by: Kent Russell <kent.russell@amd.com>	2023-05-30 15:46:14 -04:00
Xiaogang Chen	f6183f937e	libhsakmt: allow the GPU node ID array to be null when the number of GPUs is zero. Allow the hsaKmtRegisterGraphicsHandleToNodes parameters NodeArray to be null and NumberOfNodes to be zero at the same time. This covers the case where the imported buffer should not be registered by kfd. Set gpu_id_array = NULL explicitly to avoid freeing an uninitialized gpuid array. Report: Yat Sin, David<David.YatSin@amd.com> Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: I3babc1160c9573e38dd11d81965c8de2b70cae2e	2023-05-29 00:15:14 -04:00
Xiaogang Chen	7e4e57ae5f	libhsakmt: have hsaKmtMapMemoryToGPU return same value as fmm_map_to_gpu. Have hsaKmtMapMemoryToGPU return same value as fmm_map_to_gpu to keep consistency. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: Ifabb72301e1d5a6c1310973bb1321714e12a1fa6	2023-05-29 00:15:14 -04:00
Xiaogang Chen	ac1db60fc2	libhsakmt: query/use render node fds that libdrm uses. Query render node fds that libdrm uses for current process and use them at Thunk if available. v2: avoid naming conflict with amdgpu_device_get_fd from amdgpu.h Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: Id7288c03730f4a4c9c3644e37ca4725fec71a471	2023-05-29 00:15:14 -04:00
Xiaogang Chen	9bebb276be	libhsakmt: add NodeId at HsaGraphicsResourceInfo. Return GPU NodeId that exported the DMA buffer from amdgpu graphic driver at fmm_register_graphics_handle. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: Iaeccce6e6d0b7e27f10b15ed89d1b5310d03d44b	2023-05-29 00:15:14 -04:00
Xiaogang Chen	989c6c617c	libhsakmt: add DMABuf import without address allocation. When gpu map info is not provided import DMABuf without VA assigned. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: I996ab4eb46977af5064126529c28a8bf20a67292	2023-05-29 00:15:14 -04:00
Xiaogang Chen	d2a37894bb	libhsakmt: support allocating a fixed address at mmap_aperture. When HsaMemFlags.ui32.FixedAddress=1 allocate fixed address at mmap_aperture. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: I1f3b532ec3c1a4fb0962126a0bd56441abaf6a9c	2023-05-29 00:15:14 -04:00
Xiaogang Chen	11ac57d293	libhsakmt: update HsaPointerInfo for address-only allocated VRAM. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: Ib88b34dff772997d2b2e5f3c7e333cef3092ef56	2023-05-29 00:15:14 -04:00
Xiaogang Chen	108c0e5f92	kfdtest: add kfdtest cases for VA-only, VRAM-only allocated VRAM. Alloc vram by kfd, then map by GEM api to GPU VM and map to CPU VM. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: Ib5b2f35662cd5473f622f6ffc9b62925fe57ae42	2023-05-29 00:15:14 -04:00
Xiaogang Chen	0138487aa4	libhsakmt: support vram-only and VA-only alloc/free. Signed-off-by: Xiaogang.Chen <Xiaogang.Chen@amd.com> Change-Id: I47cf53642d2ea197c08b20e84d7cae04b2d431e0	2023-05-29 00:15:14 -04:00
Xiaogang Chen	0a2989083b	libhsakmt: add/init a new manageable_aperture_t from NON_CANONICAL space. This new manageable_aperture_t is used for VRAM allocation-only and VA allocation-only. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: I3866ef9d35386d6aef7b6934ac8d4a89ef843b50	2023-05-29 00:15:14 -04:00
Xiaogang Chen	cc4fb2d1a9	libhsakmt: Revert "libhsakmt: Update FD creation logic" This reverts commit `fd48f14ceb`. Current amdgpu exposes one render node for one gpu node/partition, revert to previous way to open render node at Thunk. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: I436be74f8e872a7ab5c4a1420b4ea884f5a00e57	2023-05-29 00:15:14 -04:00
David Yat Sin	b290d65ec9	Bump interface versions due to hsa_amd_memory_async_copy_on_engine added Change-Id: Iff36719e800280d58217647bb70d3b5d5fcc91fe	2023-05-26 12:04:06 +00:00
Kent Russell	478a68d49c	kfdtest: Test XNACK on and off for SVM tests Add parameterization for KFDSVM tests so that we test with both XNACK enabled and XNACK disabled. This will be overridden by HSA_XNACK, if set Change-Id: Ie96eb61c03115f947e08cfa076ac459f7440f5d8	2023-05-25 12:08:16 -04:00
Graham Sider	0772e8d618	rocrtst: Throw on LocateKernelFile open() failures Throw runtime error instead of returning empty string when open() fails in LocateKernelFile() Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: Iafa360fbc2d3c9b01b9fe7ea4c11d70bd254ccce	2023-05-24 14:31:26 -04:00
David Yat Sin	41f6d0426d	Adding gfx941 and gfx942 Adding support for gfx941 and gfx942 ISAs. gfx940 ISA will use sc0:1 sc1:1 on load/store operations gfx942 ISA will use default load/store operations Change-Id: If1efbef86f59e2cf2d48fe359cd4166405a0a579	2023-05-23 11:13:16 -04:00
David Yat Sin	50e754d08b	ASAN: Remap first page of allocations to host mem When compiling in ASAN mode, remap the first page of device allocations to system memory. ASAN's memory allocator uses a small amount of extra memory to store data for housekeeping purpose. But because this memory is from the GPU memory pool, it might have uncommon memory type for host to access. Mapping this section of memory to the host makes this memory accessible to ASAN. Change-Id: I36f659d616a4d15558372592439a8723c5c84a69 Signed-off-by: Bing Ma <Bing.Ma@amd.com>	2023-05-22 20:58:54 -04:00
David Yat Sin	a1f3b619a7	Add mutex when reserving scratch This prevents race condition when creating queues concurrently. Change-Id: I5ea9714926fe06e1719fcb2559cb485063355e4f	2023-05-19 11:05:13 -04:00
David Yat Sin	a397373cea	Add HSA_ENABLE_PEER_SDMA env variable Add support for HSA_ENABLE_PEER_SDMA env variable that can be used to disable use of SDMA engines for device-to-device transfers. Note that setting HSA_ENABLE_SDMA=0 will disable all SDMA transfers and override HSA_ENABLE_PEER_SDMA values. Change-Id: I737b3c2b2efcf3ff237f98bc748f49b8252ed24a	2023-05-18 00:10:20 +00:00
Philip Yang	5df82e3d14	kfdtest: Enable KFDEvictTest and KFDSVMEvictTest on aqua_vanjaram For aqua_vanjaram APU mode, KFDEvictTest and KFDSVMEvictTest are skipped. Those tests passed on dGPU mode with memory reporting partition support on GFX 9.4.3. Change-Id: I56357843c6743b01b807359dbb37b32391fd9a25 Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2023-05-17 17:17:46 -04:00
Bing Ma	1e6d728730	libhsakmt: Add support functions for ASAN Add support functions to remap the first page of device memory (GPU/GTT) to share host ASAN logic. Signed-off-by: David Yat Sin <David.YatSin@amd.com> Change-Id: I4c27d5417ba80a172dccb0a079a597c5dc1c8f85	2023-05-17 13:38:19 -04:00
Ranjith Ramakrishnan	ad002f1e7b	Use the RUNPATH provided by build scripts RUNPATH in libraries will be : $ORIGIN RUNPATH in binaries will be : $ORIGIN/../lib Change-Id: Iafa66a8e02cc8c5783903d40927b63652042d2f1	2023-05-17 09:10:50 -04:00
David Yat Sin	39feb83b88	Update documentation for hsa_amd_pointer_info Update documentation for hsa_amd_pointer_info to clarify which fields are invalid when the allocation type is HSA_EXT_POINTER_TYPE_UNKNOWN. Change-Id: Idaed985962c4a98d281ebe01bef8ec2459da3985	2023-05-16 18:36:54 -04:00
David Yat Sin	38e832a682	Reserve scratch on first queue allocation Some workloads running on multi-GPU create 1 process per GPU. So each process creates a GPU agent on every GPU, but will only create queues on one GPU. This would cause un-necessary scratch reservation. Change-Id: I50a216f0bcc0b5f707f3943147390b0ecec1ac22	2023-05-15 17:10:57 -04:00
Graham Sider	bd63e5045c	Fix scratch allocation occupancy reduction loop If the required scratch allocation is too large, ROCr will attempt to reduce it by lowering the dispatch's targeted occupancy. The reduction loop however was prone to overflow if waves_per_cu was not a multiple of waves_per_group. Ensure no overflow by aligning waves_per_cu to waves_per_group. On GC 9.4.3 dGPU, dispatches with a large grid size and a waves_per_group of e.g. 16 may require to reduce occupancy such that waves_per_cu is less than waves_per_group to ensure the allocation size is small enough. Allow this while also ensuring the tmpring scratch wave count is kept divisible by the number of SEs per XCC. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: Ie4016dcd8166a9ae69e9decc26a3eec882b49480	2023-05-15 14:55:42 +00:00
Kent Russell	d966243783	kfdtest: Add include directory for ROCr merge When we merge thunk into ROCr, kfdtest will be in a different folder structure. Add the new location to ensure that we can build now and in the future with no disruptions Signed-off-by: Kent Russell <kent.russell@amd.com> Change-Id: I6517e061cb0da7137d903abbc380bfc7126f40d4	2023-05-15 10:13:49 -04:00
David Yat Sin	3477fbc661	Do not report reserved scratch cache as available Scratch cache reserved memory is only available for scratch memory use so do not report this memory as available to the user via the HSA_AMD_AGENT_INFO_MEMORY_AVAIL api. Change-Id: I52f96e62536458bcaa52b9f4be5de856d5680dc4	2023-05-15 09:45:31 -04:00
David Yat Sin	f0000da7b3	Removing invalid gfx entries Change-Id: I1a9a9a064f5f65ecc3e124c5dd7d6baf6b5ccb5c	2023-05-12 11:59:27 -04:00
David Yat Sin	7b74271d5e	Disable Queue_Validation_InvalidGroupMemory Temporarily disabling rocrtstNeg.Queue_Validation_InvalidGroupMemory until it is fixed. Change-Id: Ifc1973a960c8d0bae27e2628e4bfddc60f70325d	2023-05-12 11:03:26 -04:00
Yifan Zhang	d319660838	kfdtest: Using non-paged memory allocation for wptr on devices that have MES scheduler Starting with GFX11, wptr BOs must be mapped to GART for MES to determine work on unmapped queues for usermode queue oversubscription (no aggregated doorbell) Change-Id: I10e30fdc2bec587cef9427faa4874957988c34b3 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>	2023-05-12 01:06:37 -04:00
Yifan Zhang	53ed978c3d	kfdtest: add non paged wptr judging API. If MES is enabled, wptr has to be non paged memory, Add an API to check this condition. Change-Id: I53af1f6687d5332d102e7062c3d760e33b96e722 Signed-off-by: Yifan Zhang <yifan1.zhang@amd.com>	2023-05-12 01:06:37 -04:00
Saleel Kudchadker	adf6512dad	Report XGMI SDMA upon query Report XGMI SDMA engines when queried for H2D/D2H. Change-Id: I4fb7b24bc15d1745b3844485bdeab71282a787a5	2023-05-11 12:20:41 -04:00
David Yat Sin	9b35ce5b3b	Fix incorrect check for image support Change-Id: I77476204d40c245c9d9091853264a4e9fbb80725	2023-05-10 20:13:54 +00:00
Ranjith Ramakrishnan	b487f87363	Set the default value of ROCM_HEADER_WRAPPER_WERROR to OFF Using wrapper header files will result in #warning message by default Change-Id: I8301e433d39f3e5d39384ede6f0e4464d0eb20a6	2023-05-10 12:36:00 -04:00
Ranjith Ramakrishnan	fbcbcd9e73	Set the default value of ROCM_HEADER_WRAPPER_WERROR to OFF Using wrapper header files will result in #warning message by default Change-Id: I87739cabb365b9370b1182cf23ca9b54d99149c3	2023-05-10 00:47:33 -04:00


2959 Commits