Commit graph

600 Commits

Author SHA1 Message Date
Alex Sierra 0cbf26c148 src: add debug API to support GPU core dump
Functions added to the API to extract the following information from KFD:
runtime information, device info and queues snapshot.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: If995ecc54497ab61189bb0f209c64af0bbb0f56f
2023-06-26 18:58:15 +00:00
Alex Sierra 5e0a32d7b3 add hsaKmtGetRuntimeCapabilities API
Queries for runtime capabilities after it has been enabled

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I098c0e9862c0c1d5e304b111cdc281c0ccd09691
2023-06-26 18:58:15 +00:00
James Zhu a0cbf90b90 libhsakmt: add event age tracking
Keeping last signaled event age to avoid race conditions
for HSA_EVENTTYPE_SIGNAL when event age init value is non-zero.

Change-Id: Ifb9a11a6868e5762a9f92f579e45a0a2c8fa1017
Signed-off-by: James Zhu <James.Zhu@amd.com>
2023-06-10 11:41:50 -04:00
Alex Sierra 728162c2c8 libhsakmt: include changes for upstream debugger API
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: Id296e13dff431c7a151c5aae0b93412b1e116467
2023-06-02 15:56:19 -04:00
Kent Russell 718d95de77 fmm.c: Fix possibly uninitialized variable usage
If we end up in the first if clause, aperture_base is not set, unlike
the other 2 clauses. Initialize it to NULL at declaration time, and only
change its value in the final else clause, where we set it to
aperture->base

Change-Id: I2bf44dc93cae8a03e66f41cedd85d57be2115bba
Signed-off-by: Kent Russell <kent.russell@amd.com>
2023-05-30 15:46:14 -04:00
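The pattern this fix describes can be sketched roughly as follows. This is a minimal illustration, not the actual fmm.c code: the `struct aperture` stand-in, the `enum clause` selector, and the function name `pick_aperture_base` are all hypothetical and exist only to show the three-way branch.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the real aperture type; only `base` matters here. */
struct aperture {
    void *base;
};

/* Hypothetical selector for which of the three clauses is taken. */
enum clause { CLAUSE_FIRST, CLAUSE_SECOND, CLAUSE_FINAL };

void *pick_aperture_base(enum clause which, const struct aperture *aperture)
{
    /* Initialized at declaration, so every path returns a defined value. */
    void *aperture_base = NULL;

    if (which == CLAUSE_FIRST) {
        /* Nothing is assigned on this path; the value stays NULL. */
    } else if (which == CLAUSE_SECOND) {
        /* Nothing is assigned here either. */
    } else {
        /* Only the final clause takes the base from the aperture. */
        assert(aperture != NULL);
        aperture_base = aperture->base;
    }
    return aperture_base;
}
```

Initializing at the declaration means a later reader does not have to check every branch to know the variable is defined.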
Xiaogang Chen f6183f937e libhsakmt: allow gpu nodeid array to be null and number of gpus to be zero.
Allow the hsaKmtRegisterGraphicsHandleToNodes parameters NodeArray to be null
and NumberOfNodes to be zero at the same time. This is the case where we want the
imported buffer not to be registered by kfd. Set gpu_id_array = NULL explicitly to
avoid freeing an uninitialized gpuid array.

Report: Yat Sin, David<David.YatSin@amd.com>
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I3babc1160c9573e38dd11d81965c8de2b70cae2e
2023-05-29 00:15:14 -04:00
Xiaogang Chen 7e4e57ae5f libhsakmt: have hsaKmtMapMemoryToGPU return same value as fmm_map_to_gpu.
Have hsaKmtMapMemoryToGPU return same value as fmm_map_to_gpu to keep consistency.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Ifabb72301e1d5a6c1310973bb1321714e12a1fa6
2023-05-29 00:15:14 -04:00
Xiaogang Chen ac1db60fc2 libhsakmt: query/use render node fds that libdrm uses.
Query render node fds that libdrm uses for current process and
use them at Thunk if available.

v2: avoid naming conflict with amdgpu_device_get_fd from amdgpu.h

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Id7288c03730f4a4c9c3644e37ca4725fec71a471
2023-05-29 00:15:14 -04:00
Xiaogang Chen 9bebb276be libhsakmt: add NodeId at HsaGraphicsResourceInfo.
Return GPU NodeId that exported the DMA buffer from amdgpu graphic driver
at fmm_register_graphics_handle.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Iaeccce6e6d0b7e27f10b15ed89d1b5310d03d44b
2023-05-29 00:15:14 -04:00
Xiaogang Chen 989c6c617c libhsakmt: add DMABuf import without address allocation.
When gpu map info is not provided, import the DMABuf without a VA assigned.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I996ab4eb46977af5064126529c28a8bf20a67292
2023-05-29 00:15:14 -04:00
Xiaogang Chen d2a37894bb libhsakmt: support allocating a fixed address at mmap_aperture.
When HsaMemFlags.ui32.FixedAddress=1 allocate fixed address at mmap_aperture.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I1f3b532ec3c1a4fb0962126a0bd56441abaf6a9c
2023-05-29 00:15:14 -04:00
Xiaogang Chen 11ac57d293 libhsakmt: update HsaPointerInfo for address-only allocated VRAM.
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: Ib88b34dff772997d2b2e5f3c7e333cef3092ef56
2023-05-29 00:15:14 -04:00
Xiaogang Chen 0138487aa4 libhsakmt: support vram-only and VA-only alloc/free.
Signed-off-by: Xiaogang.Chen <Xiaogang.Chen@amd.com>
Change-Id: I47cf53642d2ea197c08b20e84d7cae04b2d431e0
2023-05-29 00:15:14 -04:00
Xiaogang Chen 0a2989083b libhsakmt: add/init a new manageable_aperture_t from NON_CANONICAL space.
This new manageable_aperture_t is used for VRAM allocation-only and
VA allocation-only.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I3866ef9d35386d6aef7b6934ac8d4a89ef843b50
2023-05-29 00:15:14 -04:00
Xiaogang Chen cc4fb2d1a9 libhsakmt: Revert "libhsakmt: Update FD creation logic"
This reverts commit fd48f14ceb.
Current amdgpu exposes one render node per gpu node/partition, so
revert to the previous way of opening the render node at Thunk.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I436be74f8e872a7ab5c4a1420b4ea884f5a00e57
2023-05-29 00:15:14 -04:00
Bing Ma 1e6d728730 libhsakmt: Add support functions for ASAN
Add support functions to remap the first page of device memory (GPU/GTT)
to share host ASAN logic.

Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Change-Id: I4c27d5417ba80a172dccb0a079a597c5dc1c8f85
2023-05-17 13:38:19 -04:00
xinhui pan 77761836ae thunk: Fix and optimise for pointer range search
The previous code might fail to get the correct ln node and trigger an extra
walk through the tree. Fix it.

While walking through the tree, it is better to search from right to left, as
node->start is likely close to *address*.

Change-Id: If86ddf73e59a1eb88225d1ea90797818e8165488
Signed-off-by: xinhui pan <xinhui.pan@amd.com>
2023-04-20 19:36:29 -04:00
Jesse Zhang 4d54d6e706 libhsakmt: Add compute core check for APU
We should check compute core instead of cpu core,
in order to exclude the case of APU.

Signed-off-by: Jesse zhang <jesse.zhang@amd.com>
Change-Id: I2ec2a6807f51f49f80e0e500f5d9af81c2efae37
2023-04-17 09:34:37 +08:00
Graham Sider 831d1ad352 libhsakmt: Mask stepping version for GC 9.4.3 checks
GC 9.4.3 sets the gfx target version to 9.4.x, depending on revision and
capabilities. Due to this, where applicable, mask off the gfx target
stepping version and only check major/minor version (9.4). There are no
collisions due to this change since GC 9.4.3 is the only ASIC that uses
gfx target version 9.4.x.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I72803e594c421f054d18ccfa7e92c507128fa5be
2023-04-14 12:03:23 -04:00
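A rough sketch of what masking off the stepping looks like. The packing used here (major in bits 16 and up, minor in bits 8 to 15, stepping in the low 8 bits) is an assumption made for illustration, and the names `GFX_VERSION`, `gfx_major_minor`, and `is_gc_9_4_x` are hypothetical rather than taken from the library.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed packing for illustration only: 0x00MMmmss (major, minor, stepping). */
#define GFX_VERSION(major, minor, stepping) \
    (((uint32_t)(major) << 16) | ((uint32_t)(minor) << 8) | (uint32_t)(stepping))

#define GFX_STEPPING_MASK 0xFFu

/* Clears the stepping field so only major/minor take part in comparisons. */
static inline uint32_t gfx_major_minor(uint32_t version)
{
    return version & ~(uint32_t)GFX_STEPPING_MASK;
}

/* True for any 9.4.x version, regardless of the stepping value. */
bool is_gc_9_4_x(uint32_t version)
{
    return gfx_major_minor(version) == GFX_VERSION(9, 4, 0);
}
```

With this shape, a check written against 9.4 keeps matching whichever stepping a given revision reports.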
Graham Sider ae659e5427 libhsakmt: Fix queue destroy SVM path free size
Use q->total_mem_alloc_size for munmap in SVM codepath of free_queue.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I2fecaa1ddb337b1fe71f9cbba45a0c9467eff0c0
2023-04-14 10:03:38 -04:00
Mukul Joshi a713fb766e libhsakmt: Fix memory leak on queue destroy for GFX9.4.3
Currently, on queue destroy, context save restore memory is freed
only for a single XCC. Instead, we need to free the entire context
save restore memory, which was allocated for all XCCs.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: I51ebb12fa8d5ebed41979d68e74f7c5392dca062
2023-04-14 10:03:38 -04:00
David Belanger 252a2cf959 libhsakmt: EOP Removal
Do not allocate the EOP buffer when not required.

Signed-off-by: David Belanger <david.belanger@amd.com>
Change-Id: I1664a3f0a882219a72278174006cdb8d46fd4f5e
2023-04-14 10:03:38 -04:00
Graham Sider fd48f14ceb libhsakmt: Update FD creation logic
In multi-partition modes, e.g. CPX, we want to create new file
descriptor despite using the same render node. Update
open_drm_render_device to use a gpu_id to fd map partitioned by render
node. Different gpu_id's requesting the same render node will be added
to that render node's map list for fetching its fd. Different gpu_id's
requesting different render nodes as well as the same gpu_id's
requesting the same render node will behave as they did previously.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: Ie153d42355d4d75b1c6ba6ff40fac3295bc87009
2023-04-13 15:25:09 -04:00
Mukul Joshi 97a669a979 libhsakmt: Update context save handling for multi XCC
Allocate debug area big enough for all XCCs in the partition. Also, fix
the cu_num calculations as driver now reports cu_num as the total number
of CUs in the partition.

Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: I6e80d57196b770bb3c2506bc58cb366c0046084b
2023-04-13 15:25:09 -04:00
Graham Sider 6be4461a0d libhsakmt: Add Aqua Vanjaram support
Add gfx version for VGPR size per CU calc, add FAMILY_AV to KfdFamilyId,
add blacklist filter to kfdtest.exclude.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I9b8072e45f4d497e0a8fd3f8f97f1425238e8b42
2023-04-13 15:25:09 -04:00
Alex Sierra 2a1d6ee8b5 libhsakmt: query svm info from userptrs at fault events
Get more debug information about user pointers that were registered
through SVM API, and triggered by memory exception events.
A new kfdtest with this use case was also included inside
KFDExceptionTest.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I0ef4929afe0625b9b5cbbbebef11ede66dda60ab
2023-03-22 13:34:02 -05:00
Alex Sierra 63c8cf115a src: use SVM mechanism to register userptr memory
Register and map userptrs through Shared Virtual Memory(SVM) API at
the Kernel level when available. Using this approach, performance
will improve, as register/unregister memory will not trigger any
system call to KFD driver.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I3726b4b5e1c6a52a83786fbe0af6322eb29ae7c9
2023-03-22 13:33:35 -05:00
Felix Kuehling 332f59eb2a libhsakmt: Implement dmabuf export for RDMA
Implement hsaKmtExportDMABufHandle, which can be used for a new
upstreamable RDMA solution. It exports a DMABuf handle for an arbitrary
virtual address along with the offset of the address within the
allocation. It also checks that the size of the intended export does
not exceed the allocation.

This uses the new AMDKFD_IOC_EXPORT_DMABUF, which requires KFD ioctl
API version 1.12.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ie5fdb1f73ab3c7fa36c315ce326b1fb89eacc8b6
2023-02-27 14:44:11 -05:00
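The offset and size check described above might look something like this sketch. It does not call hsaKmtExportDMABufHandle or the ioctl; the `struct allocation` type and the function `export_range_ok` are hypothetical and only model the arithmetic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one allocation covering [base, base + size). */
struct allocation {
    uint64_t base;
    uint64_t size;
};

/*
 * Computes the offset of `addr` within `alloc` and checks that
 * [addr, addr + export_size) does not run past the end of the allocation.
 * On success the offset is stored in *offset_out.
 */
bool export_range_ok(const struct allocation *alloc, uint64_t addr,
                     uint64_t export_size, uint64_t *offset_out)
{
    assert(alloc != NULL && offset_out != NULL);

    if (addr < alloc->base)
        return false;

    uint64_t offset = addr - alloc->base;
    if (offset > alloc->size)
        return false;

    /* Compare against the remaining bytes so offset + size cannot overflow. */
    if (export_size > alloc->size - offset)
        return false;

    *offset_out = offset;
    return true;
}
```

Comparing `export_size` with the remaining bytes, rather than adding it to the offset first, keeps the check correct even for very large sizes.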
David Yat Sin 53d53655d7 Fix for uninitialized variables
Change-Id: Ie8a004db699248d0cde4213077520ea503754399
2023-02-14 14:19:31 +00:00
David Yat Sin fb8f42233d Fix uninitialized variable warning in valgrind
Change-Id: I91e70d67671a8f7289b734407011380b6b97238a
2023-02-09 17:35:53 -05:00
Xiaogang Chen efcc9b275b libhsakmt: Correct reporting of Shader Engines number.
The Shader Engines number should be the shader array_count divided by
simd_arrays_per_engine, not array_count.

Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
Change-Id: I808d1fedd6b9843500719e902ecf759f5668a7d1
2023-02-09 14:34:17 -05:00
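As a small worked example of the corrected formula: with a hypothetical array_count of 8 and 2 SIMD arrays per engine, the result is 4 shader engines rather than 8. The function name and the zero guard below are illustrative assumptions, not the library's code.

```c
#include <stdint.h>

/* Shader engines = shader arrays / arrays per engine (hypothetical helper). */
uint32_t num_shader_engines(uint32_t array_count, uint32_t simd_arrays_per_engine)
{
    /* Guard added for the sketch: avoid dividing by zero on bad input. */
    if (simd_arrays_per_engine == 0)
        return 0;
    return array_count / simd_arrays_per_engine;
}
```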
David Belanger 0eb0bae38b Revert "libhsakmt: Disabled allocation of CWSR with SVM for GFX11."
This reverts commit b25867c4b8.

Change-Id: I05bf82266f563c63c0b794a24b0926e7652ce42d
Signed-off-by: David Belanger <david.belanger@amd.com>
2023-01-25 10:48:46 -05:00
David Belanger a847a7b80e libhsakmt: Fixed VGPR memory size for GFX11.0 and GFX11.1.
Fixed VGPR memory size; the size was too small for some GPUs, causing a memory overflow.
Refactored macro code into a function.
Thanks to Jay Cornwall for locating the problem and proposing the fix.

Change-Id: Iffedea1c4f341967f02c56d810ff048225b02c16
Signed-off-by: David Belanger <david.belanger@amd.com>
2023-01-25 10:45:44 -05:00
David Belanger b25867c4b8 libhsakmt: Disabled allocation of CWSR with SVM for GFX11.
This is a temporary workaround for GPU hang issues observed on GFX11.

Change-Id: I98fbedbbd1c51fe402c2116b35ca548931a390c9
Signed-off-by: David Belanger <david.belanger@amd.com>
2023-01-11 17:28:31 -05:00
Eric Huang 505287412f Revert "libhsakmt: Remove unnecessary CPU unmap"
This reverts commit 7787a039bd.

It causes a regression in pytorch benchmark.

Change-Id: I96173dbd061cf38d6f451c02cb181ae51b7f625e
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
2023-01-06 17:16:40 -05:00
Alex Sierra f2bda56d04 Revert "src: use SVM mechanism to register userptr memory"
This reverts commit 178a619b80.
There are some openMP issues that were introduced after SVM userptr
feature was added.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: I7ef87c5232a3bcbe594c743fa4b4958601845ba5
2022-12-08 17:33:51 -06:00
Alex Sierra d9f86ae02b Revert "libhsakmt: query svm info from userptrs at fault events"
This reverts commit 45fad29752.
There are some openMP issues that were introduced after SVM userptr
feature was added.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: I6566c9f0d39d05ecb92f38159880763f432939a5
2022-12-08 17:33:50 -06:00
Alex Sierra 21e95a4f2a Revert "libhsakmt: add env var to en/dis registration through SVM"
This reverts commit 8a746bdaed.
There are some openMP issues that were introduced after SVM userptr
feature was added.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: Ib01046571d2c84fa0fd228ecba0dee0eae3f994d
2022-12-08 17:33:48 -06:00
Felix Kuehling 7787a039bd libhsakmt: Remove unnecessary CPU unmap
This is handled by __fmm_release calling aperture_release_area.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ib8ed300e1734f03aeb9dfc8074897ece310b8af9
2022-11-28 17:18:13 -05:00
Felix Kuehling 73b0fb3d7c libhsakmt: Refactor and clean up CPU mappings
Use a common helper for CPU mappings to reduce duplicate code.
Consistently use MAP_SHARED for all render_fd mappings.
Remove double-mapping for AQL queue buffers on the CPU. This workaround
is only needed on the GPU.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Iff86c8cc9f1e5c982614b3f11129bc2cf8cbba02
2022-11-28 17:18:05 -05:00
Felix Kuehling 2d53430ce3 libhsakmt: Fix and simplify debug_get_reg_status
The NULL pointer check was the only way for that function to fail. And it
was done after the pointer was accessed. Simplify this by just returning
the result as a return value instead of using a pointer as output
parameter. This way the function can never fail and the caller doesn't
need to do any error handling.

Declare the function in libhsakmt.h instead of duplicating the
declaration in fmm.c.

Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: I91b90d66166fd3b5cdc47c73a9bbc369c45b51fe
2022-11-28 17:17:43 -05:00
Alex Sierra 8a746bdaed libhsakmt: add env var to en/dis registration through SVM
Setting this variable to '0' forces memory registration/allocation
through the SVM API mechanism to be disabled.
If it is not set, or is set to '1', the SVM API will be used only if all
GPUs support it.

Signed-off-by: Alex Sierra <Alex.Sierra@amd.com>
Change-Id: Icdf7656de09aa9988b567ec6c024953398e9bb48
2022-11-28 13:42:43 -05:00
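The decision rule in this message can be sketched as below. The commit message does not name the environment variable, so the sketch takes the name as a parameter; both function names are hypothetical, and treating any value other than "0" the same as "1" is an assumption.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Decides whether registration through the SVM API should be used.
 * env_value is the raw value of the variable, or NULL when it is unset.
 */
bool svm_registration_enabled(const char *env_value, bool all_gpus_support_svm)
{
    /* An explicit "0" always disables the SVM path. */
    if (env_value != NULL && strcmp(env_value, "0") == 0)
        return false;

    /* Unset or "1" (and, by assumption, anything else): depends on the GPUs. */
    return all_gpus_support_svm;
}

/* Convenience wrapper that reads the variable from the environment. */
bool svm_registration_enabled_from_env(const char *var_name, bool all_gpus_support_svm)
{
    return svm_registration_enabled(getenv(var_name), all_gpus_support_svm);
}
```

Keeping the decision separate from the `getenv` call makes the rule easy to exercise without touching the process environment.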
Felix Kuehling 8e69b9c70e libhsakmt: Fix use of uninitialized variable
When hsaKmtCreateQueue is called for the first time for a node,
doorbells[NodeId].size is initialized to zero in init_process_doorbells
but used to calculate the doorbell offset. It works just by accident
because doorbells[NodeId].size is uint32_t so -1 will be 0xFFFFFFFF which
is zero extended into 0x00000000FFFFFFFF and it will work as long as mmap
offset bits are not within lower 32 bits.

Bug: https://github.com/RadeonOpenCompute/ROCT-Thunk-Interface/issues/78
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ia791adfc51363d4704cb50fa4f01137b7dd48a75
2022-11-25 14:07:45 -05:00
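The wrap-around and zero extension described here can be reproduced in isolation. The sketch assumes the size is used as a `size - 1` mask on a 64-bit offset, which is an inference from the message; the function names are hypothetical.

```c
#include <stdint.h>

/* Mirrors the accidental computation: `size - 1` in 32 bits, then widened. */
uint64_t size_minus_one_widened(uint32_t size)
{
    uint32_t narrow = size - 1u;  /* wraps to 0xFFFFFFFF when size == 0 */
    return (uint64_t)narrow;      /* zero-extended: the upper 32 bits are 0 */
}

/* Assumed use for illustration: keep only the bits selected by (size - 1). */
uint64_t masked_offset(uint64_t mmap_offset, uint32_t size)
{
    return mmap_offset & size_minus_one_widened(size);
}
```

With a size of zero the mask keeps every bit in the lower 32 bits and drops everything above, which is why the old code only worked while the mmap offset bits stayed out of the lower half.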
David Yat Sin f46ddb7ead libhsakmt: Initialize fd to -1
Fix compile error due to warning in some environments

Change-Id: Ie5fcfabb872c27c0de349eb215345b997fae7201
2022-11-25 15:01:53 +00:00
David Francis 88934cec2c libhsakmt: Don't close kfd_fd
When hsa is closed, it would close open fds for /dev/kfd but
not for /dev/dri/renderD*. This caused issues with CRIU
checkpoint, which expects that /dev/kfd will be open if
/dev/dri/renderD* is.

As a workaround for the CRIU behaviour, leave /dev/kfd open
when closing hsa.

Signed-off-by: David Francis <David.Francis@amd.com>
Change-Id: Ie1b2d5b1d8986750b0e560ae2934b7c73cff942e
2022-11-17 10:04:24 -05:00
Alex Sierra 45fad29752 libhsakmt: query svm info from userptrs at fault events
Get more debug information about user pointers that were registered
through SVM API, and triggered by memory exception events.
A new kfdtest with this use case was also included inside
KFDExceptionTest.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I8e9df3c1c6c3f42d7b9235d12406d80d31746443
2022-10-21 15:33:14 -04:00
Alex Sierra 178a619b80 src: use SVM mechanism to register userptr memory
Register and map userptrs through Shared Virtual Memory(SVM) API at
the Kernel level when available. Using this approach, performance
will improve, as register/unregister memory will not trigger any
system call to KFD driver.

Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: I20723cbeb340bf48b95e1115f0102c031397bc14
2022-10-21 15:32:02 -04:00
Graham Sider 79279e860f libhsakmt: Skip hsa_gfxip_table search for GFX11+
Prior to launch some ASICs may re-use PCI DIDs from older generations.
This can cause issues during topology initialization as hsa_gfxip_table
lookups will override sysfs-provided gfx versions, causing incorrect
gfxip selection. Since no new entries will be added to hsa_gfxip_table,
limit its search only to pre-GFX11 ASICs.

Signed-off-by: Graham Sider <Graham.Sider@amd.com>
Change-Id: I53eaefac5db2650a36a6ce9f21daf750f50cfd26
2022-09-21 14:09:35 -04:00
Philip Yang 093cf898fb libhsakmt: Set CWSR SVM range MADV_DONTFORK
When a process forks, the copy-on-write MMU notifier on the CWSR range will
evict user queues, then update the GPU mapping and resume the queues. Use
MADV_DONTFORK to avoid the COW MMU notifier callback on the CWSR SVM range.

Use mmap to allocate the SVM range for CWSR, because posix_memalign doesn't
allocate a new range in the child process; registering the SVM range then fails
because the range is an invalid address in the forked child process.

Change-Id: Ibaea56a691dd6f577ed2e1f2d43f4a3500b8316f
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2022-09-08 22:53:47 -04:00
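A minimal sketch of the mmap plus MADV_DONTFORK combination on Linux. It shows only the two system calls, not the SVM registration itself, and the function name `alloc_dontfork` is hypothetical.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Maps an anonymous private region and marks it MADV_DONTFORK so the
 * range is not inherited by a child process after fork().
 * Returns NULL on failure.
 */
void *alloc_dontfork(size_t size)
{
    assert(size > 0);

    void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return NULL;

    if (madvise(p, size, MADV_DONTFORK) != 0) {
        munmap(p, size);
        return NULL;
    }
    return p;
}
```

The caller releases the region with `munmap(p, size)` when it is no longer needed.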
Philip Yang b2691c359d libhsakmt: Use mmap aligned for scratch allocation
To remove duplicate mmap aligned allocation code.

Change-Id: Ibc05cc4aaf6d190bd2382e33bdeca1496960c5f2
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
2022-09-08 22:53:47 -04:00
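One common way to get an aligned mapping from mmap is to over-allocate and trim the excess, sketched below. This is not necessarily how the library's helper works; the name `mmap_aligned_sketch` is hypothetical, and the sketch assumes `align` is a power of two and that both `size` and `align` are multiples of the page size.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Returns a mapping of `size` bytes whose start is aligned to `align`, or NULL. */
void *mmap_aligned_sketch(size_t size, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);

    /* Over-allocate so an aligned start is guaranteed to fit inside. */
    size_t padded = size + align;
    if (padded < size)
        return NULL; /* size_t overflow */

    void *raw = mmap(NULL, padded, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (raw == MAP_FAILED)
        return NULL;

    /* Round the raw address up to the next multiple of align. */
    uintptr_t start = ((uintptr_t)raw + align - 1) & ~((uintptr_t)align - 1);
    size_t head = start - (uintptr_t)raw;
    size_t tail = padded - head - size;

    /* Give back the unused pages before and after the aligned region. */
    if (head > 0)
        munmap(raw, head);
    if (tail > 0)
        munmap((void *)(start + size), tail);

    return (void *)start;
}
```

Keeping this in one helper means callers such as a scratch allocator do not each repeat the rounding and trimming logic.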