rocm-systems

Author	SHA1	Message	Date
Alex Sierra	0cbf26c148	src: add debug API to support GPU core dump Functions added to the API to extract the following information from KFD: runtime information, device info and queues snapshot. Signed-off-by: Alex Sierra <Alex.Sierra@amd.com> Change-Id: If995ecc54497ab61189bb0f209c64af0bbb0f56f	2023-06-26 18:58:15 +00:00
Alex Sierra	5e0a32d7b3	add hsaKmtGetRuntimeCapabilities API Queries for runtime capabilities after it has been enabled. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Change-Id: I098c0e9862c0c1d5e304b111cdc281c0ccd09691	2023-06-26 18:58:15 +00:00
James Zhu	a0cbf90b90	libhsakmt: add event age tracking Keeping last signaled event age to avoid race conditions for HSA_EVENTTYPE_SIGNAL when event age init value is non-zero. Change-Id: Ifb9a11a6868e5762a9f92f579e45a0a2c8fa1017 Signed-off-by: James Zhu <James.Zhu@amd.com>	2023-06-10 11:41:50 -04:00
Alex Sierra	728162c2c8	libhsakmt: include changes for upstream debugger API Signed-off-by: Alex Sierra <alex.sierra@amd.com> Change-Id: Id296e13dff431c7a151c5aae0b93412b1e116467	2023-06-02 15:56:19 -04:00
Kent Russell	718d95de77	fmm.c: Fix possibly uninitialized variable usage If we end up in the first if clause, aperture_base is not set, unlike the other 2 clauses. Initialize it to NULL at declaration time, and only change its value in the final else clause, where we set it to aperture->base. Change-Id: I2bf44dc93cae8a03e66f41cedd85d57be2115bba Signed-off-by: Kent Russell <kent.russell@amd.com>	2023-05-30 15:46:14 -04:00
Xiaogang Chen	f6183f937e	libhsakmt: allow gpu nodeid array to be null and number of gpus to be zero. Allow the hsaKmtRegisterGraphicsHandleToNodes parameters NodeArray to be null and NumberOfNodes to be zero at the same time. This is the case where we want the imported buffer not to be registered by kfd. Set gpu_id_array = NULL explicitly to avoid freeing an uninitialized gpuid array. Report: Yat Sin, David<David.YatSin@amd.com> Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: I3babc1160c9573e38dd11d81965c8de2b70cae2e	2023-05-29 00:15:14 -04:00
Xiaogang Chen	7e4e57ae5f	libhsakmt: have hsaKmtMapMemoryToGPU return same value as fmm_map_to_gpu. Have hsaKmtMapMemoryToGPU return same value as fmm_map_to_gpu to keep consistency. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: Ifabb72301e1d5a6c1310973bb1321714e12a1fa6	2023-05-29 00:15:14 -04:00
Xiaogang Chen	ac1db60fc2	libhsakmt: query/use render node fds that libdrm uses. Query render node fds that libdrm uses for current process and use them at Thunk if available. v2: avoid naming conflict with amdgpu_device_get_fd from amdgpu.h Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: Id7288c03730f4a4c9c3644e37ca4725fec71a471	2023-05-29 00:15:14 -04:00
Xiaogang Chen	9bebb276be	libhsakmt: add NodeId at HsaGraphicsResourceInfo. Return GPU NodeId that exported the DMA buffer from amdgpu graphic driver at fmm_register_graphics_handle. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: Iaeccce6e6d0b7e27f10b15ed89d1b5310d03d44b	2023-05-29 00:15:14 -04:00
Xiaogang Chen	989c6c617c	libhsakmt: add DMABuf import without address allocation. When gpu map info is not provided import DMABuf without VA assigned. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: I996ab4eb46977af5064126529c28a8bf20a67292	2023-05-29 00:15:14 -04:00
Xiaogang Chen	d2a37894bb	libhsakmt: support allocating a fixed address at mmap_aperture. When HsaMemFlags.ui32.FixedAddress=1 allocate fixed address at mmap_aperture. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: I1f3b532ec3c1a4fb0962126a0bd56441abaf6a9c	2023-05-29 00:15:14 -04:00
Xiaogang Chen	11ac57d293	libhsakmt: update HsaPointerInfo for address-only allocated VRAM. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: Ib88b34dff772997d2b2e5f3c7e333cef3092ef56	2023-05-29 00:15:14 -04:00
Xiaogang Chen	0138487aa4	libhsakmt: support vram-only and VA-only alloc/free. Signed-off-by: Xiaogang.Chen <Xiaogang.Chen@amd.com> Change-Id: I47cf53642d2ea197c08b20e84d7cae04b2d431e0	2023-05-29 00:15:14 -04:00
Xiaogang Chen	0a2989083b	libhsakmt: add/init a new manageable_aperture_t from NON_CANONICAL space. This new manageable_aperture_t is used for VRAM allocation-only and VA allocation-only. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: I3866ef9d35386d6aef7b6934ac8d4a89ef843b50	2023-05-29 00:15:14 -04:00
Xiaogang Chen	cc4fb2d1a9	libhsakmt: Revert "libhsakmt: Update FD creation logic" This reverts commit `fd48f14ceb`. Current amdgpu exposes one render node for one gpu node/partition, revert to previous way to open render node at Thunk. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: I436be74f8e872a7ab5c4a1420b4ea884f5a00e57	2023-05-29 00:15:14 -04:00
Bing Ma	1e6d728730	libhsakmt: Add support functions for ASAN Add support functions to remap the first page of device memory (GPU/GTT) to share host ASAN logic. Signed-off-by: David Yat Sin <David.YatSin@amd.com> Change-Id: I4c27d5417ba80a172dccb0a079a597c5dc1c8f85	2023-05-17 13:38:19 -04:00
xinhui pan	77761836ae	thunk: Fix and optimise for pointer range search The previous code might fail to get the correct ln node and trigger an extra walk through the tree. Fix it. While walking through the tree, it is better to search from right to left, as node->start is likely close to the address. Change-Id: If86ddf73e59a1eb88225d1ea90797818e8165488 Signed-off-by: xinhui pan <xinhui.pan@amd.com>	2023-04-20 19:36:29 -04:00
Jesse Zhang	4d54d6e706	libhsakmt: Add compute core check for APU We should check compute core instead of cpu core, in order to exclude the case of APU. Signed-off-by: Jesse zhang <jesse.zhang@amd.com> Change-Id: I2ec2a6807f51f49f80e0e500f5d9af81c2efae37	2023-04-17 09:34:37 +08:00
Graham Sider	831d1ad352	libhsakmt: Mask stepping version for GC 9.4.3 checks GC 9.4.3 to set gfx target version to 9.4.x dependent on revision and capabilities. Due to this, where applicable, mask off the gfx target stepping version and only check major/minor version (9.4). There are no collisions due to this change since GC 9.4.3 is the only ASIC that uses gfx target version 9.4.x. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I72803e594c421f054d18ccfa7e92c507128fa5be	2023-04-14 12:03:23 -04:00
Graham Sider	ae659e5427	libhsakmt: Fix queue destroy SVM path free size Use q->total_mem_alloc_size for munmap in SVM codepath of free_queue. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I2fecaa1ddb337b1fe71f9cbba45a0c9467eff0c0	2023-04-14 10:03:38 -04:00
Mukul Joshi	a713fb766e	libhsakmt: Fix memory leak on queue destroy for GFX9.4.3 Currently, on queue destroy, context save restore memory is freed only for a single XCC. Instead, we need to free the entire context save restore memory, which was allocated for all XCCs. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Change-Id: I51ebb12fa8d5ebed41979d68e74f7c5392dca062	2023-04-14 10:03:38 -04:00
David Belanger	252a2cf959	libhsakmt: EOP Removal Do not allocate the EOP buffer when not required. Signed-off-by: David Belanger <david.belanger@amd.com> Change-Id: I1664a3f0a882219a72278174006cdb8d46fd4f5e	2023-04-14 10:03:38 -04:00
Graham Sider	fd48f14ceb	libhsakmt: Update FD creation logic In multi-partition modes, e.g. CPX, we want to create new file descriptor despite using the same render node. Update open_drm_render_device to use a gpu_id to fd map partitioned by render node. Different gpu_id's requesting the same render node will be added to that render node's map list for fetching its fd. Different gpu_id's requesting different render nodes as well as the same gpu_id's requesting the same render node will behave as they did previously. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: Ie153d42355d4d75b1c6ba6ff40fac3295bc87009	2023-04-13 15:25:09 -04:00
Mukul Joshi	97a669a979	libhsakmt: Update context save handling for multi XCC Allocate debug area big enough for all XCCs in the partition. Also, fix the cu_num calculations as driver now reports cu_num as the total number of CUs in the partition. Signed-off-by: Mukul Joshi <mukul.joshi@amd.com> Change-Id: I6e80d57196b770bb3c2506bc58cb366c0046084b	2023-04-13 15:25:09 -04:00
Graham Sider	6be4461a0d	libhsakmt: Add Aqua Vanjaram support Add gfx version for VGPR size per CU calc, add FAMILY_AV to KfdFamilyId, add blacklist filter to kfdtest.exclude. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I9b8072e45f4d497e0a8fd3f8f97f1425238e8b42	2023-04-13 15:25:09 -04:00
Alex Sierra	2a1d6ee8b5	libhsakmt: query svm info from userptrs at fault events Get more debug information about user pointers that were registered through SVM API, and triggered by memory exception events. A new kfdtest with this use case was also included inside KFDExceptionTest. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Change-Id: I0ef4929afe0625b9b5cbbbebef11ede66dda60ab	2023-03-22 13:34:02 -05:00
Alex Sierra	63c8cf115a	src: use SVM mechanism to register userptr memory Register and map userptrs through the Shared Virtual Memory (SVM) API at the kernel level when available. With this approach, performance improves because registering/unregistering memory does not trigger any system call to the KFD driver. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Change-Id: I3726b4b5e1c6a52a83786fbe0af6322eb29ae7c9	2023-03-22 13:33:35 -05:00
Felix Kuehling	332f59eb2a	libhsakmt: Implement dmabuf export for RDMA Implement hsaKmtExportDMABufHandle, which can be used for a new upstreamable RDMA solution. It exports a DMABuf handle for an arbitrary virtual address along with the offset of the address within the allocation. It also checks that the size of the intended export does not exceed the allocation. This uses the new AMDKFD_IOC_EXPORT_DMABUF, which requires KFD ioctl API version 1.12. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Change-Id: Ie5fdb1f73ab3c7fa36c315ce326b1fb89eacc8b6	2023-02-27 14:44:11 -05:00
David Yat Sin	53d53655d7	Fix for uninitialized variables Change-Id: Ie8a004db699248d0cde4213077520ea503754399	2023-02-14 14:19:31 +00:00
David Yat Sin	fb8f42233d	Fix uninitialized variable warning in valgrind Change-Id: I91e70d67671a8f7289b734407011380b6b97238a	2023-02-09 17:35:53 -05:00
Xiaogang Chen	efcc9b275b	libhsakmt: Correct reporting of Shader Engines number. The number of Shader Engines should be the shader array_count divided by simd_arrays_per_engine, not array_count. Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com> Change-Id: I808d1fedd6b9843500719e902ecf759f5668a7d1	2023-02-09 14:34:17 -05:00
David Belanger	0eb0bae38b	Revert "libhsakmt: Disabled allocation of CWSR with SVM for GFX11." This reverts commit `b25867c4b8`. Change-Id: I05bf82266f563c63c0b794a24b0926e7652ce42d Signed-off-by: David Belanger <david.belanger@amd.com>	2023-01-25 10:48:46 -05:00
David Belanger	a847a7b80e	libhsakmt: Fixed VGPR memory size for GFX11.0 and GFX11.1. Fixed VGPR memory size, size was too small for some GPU, causing a memory overflow. Refactored macro code into a function. Thanks to Jay Cornwall for locating the problem and proposing the fix. Change-Id: Iffedea1c4f341967f02c56d810ff048225b02c16 Signed-off-by: David Belanger <david.belanger@amd.com>	2023-01-25 10:45:44 -05:00
David Belanger	b25867c4b8	libhsakmt: Disabled allocation of CWSR with SVM for GFX11. This is a temporary work around for GPU hang issues observed on GFX11. Change-Id: I98fbedbbd1c51fe402c2116b35ca548931a390c9 Signed-off-by: David Belanger <david.belanger@amd.com>	2023-01-11 17:28:31 -05:00
Eric Huang	505287412f	Revert "libhsakmt: Remove unnecessary CPU unmap" This reverts commit `7787a039bd`. It causes a regression in pytorch benchmark. Change-Id: I96173dbd061cf38d6f451c02cb181ae51b7f625e Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>	2023-01-06 17:16:40 -05:00
Alex Sierra	f2bda56d04	Revert "src: use SVM mechanism to register userptr memory" This reverts commit `178a619b80`. There are some openMP issues that were introduced after SVM userptr feature was added. Signed-off-by: Alex Sierra <Alex.Sierra@amd.com> Change-Id: I7ef87c5232a3bcbe594c743fa4b4958601845ba5	2022-12-08 17:33:51 -06:00
Alex Sierra	d9f86ae02b	Revert "libhsakmt: query svm info from userptrs at fault events" This reverts commit `45fad29752`. There are some openMP issues that were introduced after SVM userptr feature was added. Signed-off-by: Alex Sierra <Alex.Sierra@amd.com> Change-Id: I6566c9f0d39d05ecb92f38159880763f432939a5	2022-12-08 17:33:50 -06:00
Alex Sierra	21e95a4f2a	Revert "libhsakmt: add env var to en/dis registration through SVM" This reverts commit `8a746bdaed`. There are some openMP issues that were introduced after SVM userptr feature was added. Signed-off-by: Alex Sierra <Alex.Sierra@amd.com> Change-Id: Ib01046571d2c84fa0fd228ecba0dee0eae3f994d	2022-12-08 17:33:48 -06:00
Felix Kuehling	7787a039bd	libhsakmt: Remove unnecessary CPU unmap This is handled by __fmm_release calling aperture_release_area. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Change-Id: Ib8ed300e1734f03aeb9dfc8074897ece310b8af9	2022-11-28 17:18:13 -05:00
Felix Kuehling	73b0fb3d7c	libhsakmt: Refactor and clean up CPU mappings Use a common helper for CPU mappings to reduce duplicate code. Consistently use MAP_SHARED for all render_fd mappings. Remove double-mapping for AQL queue buffers on the CPU. This workaround is only needed on the GPU. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Change-Id: Iff86c8cc9f1e5c982614b3f11129bc2cf8cbba02	2022-11-28 17:18:05 -05:00
Felix Kuehling	2d53430ce3	libhsakmt: Fix and simplify debug_get_reg_status The NULL pointer check was the only way for that function to fail. And it was done after the pointer was accessed. Simplify this by just returning the result as a return value instead of using a pointer as output parameter. This way the function can never fail and the caller doesn't need to do any error handling. Declare the function in libhsakmt.h instead of duplicating the declaration in fmm.c. Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Change-Id: I91b90d66166fd3b5cdc47c73a9bbc369c45b51fe	2022-11-28 17:17:43 -05:00
Alex Sierra	8a746bdaed	libhsakmt: add env var to en/dis registration through SVM Setting this variable to '0' will force to disable memory registration/allocation through SVM API mechanism. Not setting this or setting to '1', SVM API will be used only if all GPUs support it. Signed-off-by: Alex Sierra <Alex.Sierra@amd.com> Change-Id: Icdf7656de09aa9988b567ec6c024953398e9bb48	2022-11-28 13:42:43 -05:00
Felix Kuehling	8e69b9c70e	libhsakmt: Fix use of uninitialized variable When hsaKmtCreateQueue is called for the first time for a node, doorbells[NodeId].size is initialized to zero in init_process_doorbells but is used to calculate the doorbell offset. It works just by accident because doorbells[NodeId].size is uint32_t, so -1 will be 0xFFFFFFFF, which is zero-extended into 0x00000000FFFFFFFF, and it will work as long as the mmap offset bits are not within the lower 32 bits. Bug: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/issues/78 Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com> Change-Id: Ia791adfc51363d4704cb50fa4f01137b7dd48a75	2022-11-25 14:07:45 -05:00
David Yat Sin	f46ddb7ead	libhsakmt: Initialize fd to -1 Fix compile error due to warning in some environments Change-Id: Ie5fcfabb872c27c0de349eb215345b997fae7201	2022-11-25 15:01:53 +00:00
David Francis	88934cec2c	libhsakmt: Don't close kfd_fd When hsa is closed, it would close open fds for /dev/kfd but not for /dev/dri/renderD. This caused issues with CRIU checkpoint, which expects that /dev/kfd will be open if /dev/dri/renderD is. As a workaround for the CRIU behaviour, leave /dev/kfd open when closing hsa. Signed-off-by: David Francis <David.Francis@amd.com> Change-Id: Ie1b2d5b1d8986750b0e560ae2934b7c73cff942e	2022-11-17 10:04:24 -05:00
Alex Sierra	45fad29752	libhsakmt: query svm info from userptrs at fault events Get more debug information about user pointers that were registered through SVM API, and triggered by memory exception events. A new kfdtest with this use case was also included inside KFDExceptionTest. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Change-Id: I8e9df3c1c6c3f42d7b9235d12406d80d31746443	2022-10-21 15:33:14 -04:00
Alex Sierra	178a619b80	src: use SVM mechanism to register userptr memory Register and map userptrs through the Shared Virtual Memory (SVM) API at the kernel level when available. With this approach, performance improves because registering/unregistering memory does not trigger any system call to the KFD driver. Signed-off-by: Alex Sierra <alex.sierra@amd.com> Change-Id: I20723cbeb340bf48b95e1115f0102c031397bc14	2022-10-21 15:32:02 -04:00
Graham Sider	79279e860f	libhsakmt: Skip hsa_gfxip_table search for GFX11+ Prior to launch some ASICs may re-use PCI DIDs from older generations. This can cause issues during topology initialization as hsa_gfxip_table lookups will override sysfs-provided gfx versions, causing incorrect gfxip selection. Since no new entries will be added to hsa_gfxip_table, limit its search only to pre-GFX11 ASICs. Signed-off-by: Graham Sider <Graham.Sider@amd.com> Change-Id: I53eaefac5db2650a36a6ce9f21daf750f50cfd26	2022-09-21 14:09:35 -04:00
Philip Yang	093cf898fb	libhsakmt: Set CWSR SVM range MADV_DONTFORK When a process forks, the copy-on-write MMU notifier on the CWSR range evicts user queues, then updates the GPU mapping and resumes the queues. Use MADV_DONTFORK to avoid the COW MMU notifier callback on the CWSR SVM range. Use mmap to allocate the SVM range for CWSR because posix_memalign does not allocate a new range in the child process, so registering the SVM range fails because the range is an invalid address in the forked child process. Change-Id: Ibaea56a691dd6f577ed2e1f2d43f4a3500b8316f Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2022-09-08 22:53:47 -04:00
Philip Yang	b2691c359d	libhsakmt: Use mmap aligned for scratch allocation To remove duplicate mmap aligned allocation code. Change-Id: Ibc05cc4aaf6d190bd2382e33bdeca1496960c5f2 Signed-off-by: Philip Yang <Philip.Yang@amd.com>	2022-09-08 22:53:47 -04:00

600 Commits