Графік комітів

32 Коміти

Автор SHA1 Повідомлення Дата
Longlong Yao c34ec1e52f wsl/librocdxg: Change scratch memory allocation
Calculate the actual scratch memory size required based on the
packet information for kernel dispatch.

If the required size exceeds the total allocated memory, scratch
memory must be reallocated. Otherwise, no action is needed.

miopen_gtest: Full/GPU_MIOpenDriverRegressionTest_FP16.MIOpenDriverRegressionHalf/0

Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>
Reviewed-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Horatio Zhang <Hongkun.Zhang@amd.com>
2026-01-06 10:12:04 +08:00
Longlong Yao 56eeaf26f8 librocdxg: query total shared GPU memory
Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>
2025-12-24 13:14:55 +08:00
Longlong Yao a2c5e19624 librocdxg: add interface to query segment info
Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>
2025-12-24 13:08:12 +08:00
Flora Cui 437e4b092e librocdxg: Convert all CmdUtil methods to static
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Longlong Yao <Longlong.Yao@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/113>
2025-11-28 14:52:56 +08:00
Flora Cui 1bc5af684c update hsa header
Signed-off-by: Flora Cui <flora.cui@amd.com>
2025-11-05 18:53:37 +08:00
Chengjun Yao b5dd613ccd librocdxg: Integrate DXCore loader into WDDM thunks
Replace direct D3DKMT API calls with DXCORE_CALL macro in WDDM
thunk layer. This enables dynamic loading of DXCore functions
while maintaining the same API interface.

Updated thunk functions:
- MapGpuVirtualAddress, CreateAllocation, DestroyAllocation
- ReserveGpuVirtualAddress, FreeGpuVirtualAddress
- MakeResident, Evict, ShareObjects
- QueryResourceInfoFromNtHandle, OpenResourceFromNtHandle

All existing functionality is preserved while adding flexibility
for runtime DXCore availability detection.

Signed-off-by: Chengjun Yao <Chengjun.Yao@amd.com>
Signed-off-by: Yang Su <Yang.Su2@amd.com>
Reviewed-by: Shi.Leslie <Yuliang.Shi@amd.com>
2025-11-05 18:53:37 +08:00
Flora Cui bf818a2e75 librocdxg: update rocr queue type to amd_queue_v2_t
Signed-off-by: Flora Cui <flora.cui@amd.com>
2025-11-05 18:53:37 +08:00
Flora Cui 28c81cffda librocdxg: include rocr headers
Signed-off-by: Flora Cui <flora.cui@amd.com>
2025-11-05 18:53:37 +08:00
Flora Cui 25c2b74037 librocdxg: add rocr header files
Signed-off-by: Flora Cui <flora.cui@amd.com>
2025-11-05 18:53:37 +08:00
Flora Cui 99da7e60ec wsl/libhsakmt: adapt to the new check for kernel object
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Longlong Yao <Longlong.Yao@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/99>
2025-11-05 18:53:37 +08:00
Flora Cui e2a1f0c7fc wsl/libhsakmt: refactor handling of kmd priv data
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Longlong Yao <Longlong.Yao@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/98>
2025-11-05 18:53:37 +08:00
Flora Cui 0e8f794b1c wsl/libhsakmt: simplify adapter_info
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Longlong Yao <Longlong.Yao@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/97>
2025-11-05 18:53:37 +08:00
Flora Cui 70b9951b0c wsl/libhsakmt: refactor WDDMDevice creation
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/95>
2025-11-05 18:53:37 +08:00
Flora Cui 838421c540 wsl/libhsakmt: refactor check for supported device
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/95>
2025-11-05 18:53:37 +08:00
Flora Cui 887056d64a wsl/libhsakmt: remove redundant #include "libhsakmt.h"
move libhsakmt.h inclusion to he makefile

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/95>
2025-11-05 18:53:37 +08:00
tiancyin 575e25b7e4 wsl/libhsakmt: move IPC functions from device to thunk runtime
IPC use system memory, it has nothing to do with wddm device.

Reviewed-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: tiancyin <tianci.yin@amd.com>
2025-11-05 18:53:37 +08:00
tiancyin 3e40beb68c wsl/libhsakmt: move ReserveGpuVirtualAddress from device to thunk runtime
For multi-GPU supporting, local heap and system heap managers are
implemented in thunk runtime, so the heap allocation function
ReserveGpuVirtualAddress should be moved to runtime too.

Reviewed-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: tiancyin <tianci.yin@amd.com>
2025-11-05 18:53:37 +08:00
tiancyin 593e919bcd wsl/libhsakmt: move handle aperture from device to thunk runtime
In multi-GPU, handle aperture is shared between all GPUs, not belongs to
specific one GPU, so move it from wddm device (which presents a specific GPU)
to thunk runtime which has gloable view, can manage handle aperture for all GPUs.

Reviewed-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: tiancyin <tianci.yin@amd.com>
2025-11-05 18:53:36 +08:00
tiancyin 557f888e1c wsl/libhsakmt: move system heap from device to thunk runtime
In multi-GPU, system heap space is shared between all GPUs, not belongs to
specific one GPU, so move it from wddm device (which presents a specific GPU)
to thunk runtime which has gloable view, can manage system heap for all GPUs.

Introduce a new va_Mgr instance to manage system heap, since local heap
and system heap both comply with SVM(Shared Virtual Memory), without
this new mgr, every allocation has to call KMD at least once (each GPU
needs a call) to allocate GPU VA, the new mgr manage the space itself,
no longer call KMD.

Reviewed-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: tiancyin <tianci.yin@amd.com>
2025-11-05 18:53:36 +08:00
tiancyin 602ed1aff8 wsl/libhsakmt: move local heap and va_Mgr from device to thunk runtime
In multi-GPU, local heap space is shared between all GPUs, not belongs to
specific one GPU, so move it from wddm device (which presents a specific GPU)
to thunk runtime which has gloable view, can manage local heap for all GPUs.

Reviewed-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: tiancyin <tianci.yin@amd.com>
2025-11-05 18:53:36 +08:00
Flora Cui a53f1a7c1e wsl/libhsakmt: add same process check for ipc buffer
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/85>
2025-11-05 18:53:36 +08:00
Flora Cui 6d941db5ec wsl/libhsakmt: refactor ipc implementation
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/85>
2025-11-05 18:53:36 +08:00
Flora Cui 61add17468 wsl/libhsakmt: add .NodeId() in WDDMDevice
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/82>
2025-11-05 18:53:36 +08:00
Longlong Yao 250d43508e wsl/libhsakmt: reimplement GetClockCounters
Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>
Reviewed-by: Flora Cui <flora.cui@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/80>
2025-11-05 18:53:36 +08:00
Flora Cui c01d09114b wsl/libhsakmt: correct gfx family id
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Tianci Yin <tianci.yin@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/54>
2025-11-05 18:53:36 +08:00
Longlong Yao 1c4f3e86fa libhsakmt: add support to get driver version number
Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>
Reviewed-by: lyndonli <Lyndon.Li@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/43>
2025-11-05 18:53:36 +08:00
tiancyin 5f219029c2 wsl/hsakmt: implement ipc signal
IPC Signal only support sys ram backend and CPU&GPU both accessible,
IPC Memory only support vram backend and only GPU accessible.

Reviewed-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: tiancyin <tianci.yin@amd.com>
2025-11-05 18:53:35 +08:00
tiancyin 2081ab01e6 wsl/hsakmt: implement ipc mem of rocr non-legacy mode
The legacy mode means buffer sharing through KFD, KFD provide a buffer
id to exporter, exporter pass it to importer, importer pass buffer id
to KFD to query and import this buffer.

The non-legcay mode relys on socket to pass dmabuf fd between processes.

In hsa-runtime, the legcay mode is the default mode, setting environment
variable HSA_ENABLE_IPC_MODE_LEGACY to 0 can force hsa-runtime to new
mode code path.

Reviewed-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Longlong Yao <Longlong.Yao@amd.com>
Signed-off-by: tiancyin <tianci.yin@amd.com>
2025-11-05 18:53:35 +08:00
lyndonli b4e6cce204 wsl/hsakmt: Implement fetching of UUID
Signed-off-by: lyndonli <Lyndon.Li@amd.com>
Reviewed-by: Shi.Leslie <Yuliang.Shi@amd.com>
Reviewed-by: Flora Cui <flora.cui@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/25>
2025-11-05 18:53:35 +08:00
Flora Cui 4b5a9a0f8c wsl/hsakmt: add ULARGE_INTEGER
for updated d3dukmdt.h

Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/22>
2025-11-05 18:53:35 +08:00
Longlong Yao 129da6526c wsl/hsakmt: Add is_dgpu check for wddm device
Reviewed-by: Flora Cui <flora.cui@amd.com>
Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/18>
2025-11-05 18:53:35 +08:00
Flora Cui 240dc71b91 wsl/hsakmt: move src/inc to include/impl
Signed-off-by: Flora Cui <flora.cui@amd.com>
Reviewed-by: Horatio Zhang <Hongkun.Zhang@amd.com>
Part-of: <http://10.67.69.192/wsl/rocr-runtime/-/merge_requests/15>
2025-11-05 18:53:35 +08:00