This enhances libhsakmt's capabilities for multi-context KFD support by implementing per-context topology management.
Changes:
* Add hsaKmtGetClockCountersCtx for multi-context support
- Add context-aware version of hsaKmtGetClockCounters
- Original API is retained as a wrapper calling the ctx-version with primary context
* Enable independent debug sessions across multiple KFD contexts
- Create hsa_kfd_debug_context, introduce context-aware debug APIs, and shift debug state to per-context storage
* Add perf sub-context for per-context performance counter management
- Introduce hsa_kfd_perf_context, move counter properties, add context-aware perf APIs, and update initialization
* Refactor FMM for per-context resource management
- Move multiple FMM-related global variables, including GPU ID arrays,
svm, cpuvm_aperture, and mem_handle_aperture, into hsa_kfd_fmm_context
* Implement per-context topology for complete context isolation
- Migrate global topology data (g_system, g_props, map_user_to_sysfs_node_id)
to per-context hsa_kfd_topology_context structure
- Update all topology functions to accept HsaKFDContext parameter for
context-aware operations (validate_nodeid, get_node_props, get_iolink_props, etc.)
- Refactor topology snapshot management for per-context isolation
- Add context-aware PMC trace access APIs
Signed-off-by: Junhua Shen <Junhua.Shen@amd.com>
* SWDEV-558848 - Move DRM calls to thunk for better abstraction
* Use thunk device handle instead of drm inside agent
* Update IPC functions with new thunk calls
* create hsaKmtHandleImport interface to support ipc
* Reset metadata inside hsaKmtMemHandleFree
* remove whitespaces and NULL usage
* Add thunk apis to libhsakmt.ver
* Add comments to new structs in thunk
* Minor fixes to declarations
* resolve merge conflicts in amd_kfd_driver
---------
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
* libhsakmt/virtio: Add alloc memory align api
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: Rename CLGL BO to AMDGPU BO
Rename VHSA_BO_CLGL to VHSA_BO_AMDGPU to support generic AMDGPU buffer objects, not just CL/GL interop.
* libhsakmt/virtio: Add atomic helpers and node lookup
Add vhsakmt_atomic_inc/dec macros and vhsakmt_get_node_by_id helper function.
* libhsakmt/virtio: Add AMDGPU device initialization support
Add vamdgpu_device_initialize and vamdgpu_device_deinitialize functions.
* libhsakmt/virtio: Add AMDGPU device handle and DRM command support
Add vamdgpu_device_get_fd, vdrmCommandWriteRead and update vhsaKmtGetAMDGPUDeviceHandle.
* libhsakmt/virtio: Add AMDGPU BO free and CPU map support
Add vamdgpu_bo_free and vamdgpu_bo_cpu_map functions.
* libhsakmt/virtio: Add AMDGPU BO import and export support
Add vamdgpu_bo_import, vamdgpu_bo_export and vhsakmt_bo_from_resid functions.
* libhsakmt/virtio: Add AMDGPU BO VA operation support
Add vamdgpu_bo_va_op function.
* libhsakmt/virtio: Add dma buf export support
Add vhsaKmtExportDMABufHandle API in virtio driver to support export
feature.
* libhsakmt/virtio: Fix potential deadlock in userptr deregistration
Refactor vhsakmt_deregister_userptr_non_svm to avoid calling
vhsakmt_destroy_userptr while holding the bo_handles_mutex lock.
Previously, destroying userptrs directly while iterating the tree
could cause deadlock issues due to nested locking.
- Move interval tree removal from vhsakmt_destroy_userptr to caller
- Collect BOs to free in a temporary array during tree traversal
- Destroy BOs after releasing the mutex to avoid lock contention
- Use dynamic array with realloc to handle arbitrary number of BOs
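The collect-then-destroy pattern described above can be sketched as follows. This is a minimal illustration, not the driver code: every `demo_` name is hypothetical, and a plain linked list stands in for the interval tree.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for a tracked userptr buffer object. */
struct demo_bo {
    struct demo_bo *next;   /* simple list instead of an interval tree */
    int destroyed;
};

static pthread_mutex_t demo_lock = PTHREAD_MUTEX_INITIALIZER;
static struct demo_bo *demo_head;

/* In a real driver this could take locks itself, so it must never be
 * called while demo_lock is held. */
static void demo_destroy(struct demo_bo *bo)
{
    bo->destroyed = 1;
}

/* Start tracking a BO. */
void demo_track(struct demo_bo *bo)
{
    pthread_mutex_lock(&demo_lock);
    bo->next = demo_head;
    demo_head = bo;
    pthread_mutex_unlock(&demo_lock);
}

/* Unlink every tracked BO while holding the lock, then destroy them after
 * the lock is released. Returns the number of BOs destroyed, or -1 if the
 * temporary array could not grow (BOs collected so far are still destroyed). */
int demo_deregister_all(void)
{
    struct demo_bo **pending = NULL;
    size_t count = 0, cap = 0;
    int rc = 0;

    pthread_mutex_lock(&demo_lock);
    while (demo_head) {
        if (count == cap) {
            size_t new_cap = cap ? cap * 2 : 8;
            struct demo_bo **tmp = realloc(pending, new_cap * sizeof(*tmp));
            if (!tmp) {
                rc = -1;
                break;
            }
            pending = tmp;
            cap = new_cap;
        }
        struct demo_bo *bo = demo_head;
        demo_head = bo->next;      /* removal happens in the caller */
        bo->next = NULL;
        pending[count++] = bo;
    }
    pthread_mutex_unlock(&demo_lock);

    for (size_t i = 0; i < count; i++)
        demo_destroy(pending[i]);  /* lock is no longer held here */
    free(pending);

    return rc ? rc : (int)count;
}
```

Because destruction runs only after the mutex is released, a destroy path that needs the same lock cannot deadlock against the traversal.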
Signed-off-by: Honglei Huang <honghuan@amd.com>
* rocr: driver/virtio: Implement DMA-BUF import/export and memory mapping APIs
Implement the missing DMA-BUF handling and memory mapping functions
in the virtio KFD driver to enable cross-process memory sharing:
- ExportDMABuf: Export HSA memory as DMA-BUF file descriptor
- ImportDMABuf: Import DMA-BUF fd as shareable buffer object
- Map: Map imported buffer into virtual address space with permissions
- Unmap: Unmap buffer from virtual address space
- ReleaseShareableHandle: Free imported buffer object
Also add drm_perm() helper to convert HSA access permissions to
AMDGPU VM page flags (READABLE/WRITEABLE).
These APIs enable IPC memory sharing between HSA processes through
DMA-BUF mechanism in virtualized environments.
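A permission-translation helper of the kind described above might look like the sketch below. The enum and flag values are local stand-ins chosen for illustration; the real definitions come from the HSA and amdgpu DRM headers, and the actual drm_perm() signature is not shown in this log.

```c
#include <stdint.h>

/* Local stand-ins for HSA access permissions (assumed values). */
enum demo_access {
    DEMO_ACCESS_NONE = 0,
    DEMO_ACCESS_RO   = 1,
    DEMO_ACCESS_WO   = 2,
    DEMO_ACCESS_RW   = 3
};

/* Local stand-ins for the AMDGPU VM page flags (assumed values). */
#define DEMO_VM_PAGE_READABLE  (1u << 1)
#define DEMO_VM_PAGE_WRITEABLE (1u << 2)

/* Translate an access permission into VM page flags. Unknown or NONE
 * permissions map to 0, meaning neither readable nor writeable. */
uint32_t demo_drm_perm(enum demo_access perm)
{
    switch (perm) {
    case DEMO_ACCESS_RO:
        return DEMO_VM_PAGE_READABLE;
    case DEMO_ACCESS_WO:
        return DEMO_VM_PAGE_WRITEABLE;
    case DEMO_ACCESS_RW:
        return DEMO_VM_PAGE_READABLE | DEMO_VM_PAGE_WRITEABLE;
    default:
        return 0;
    }
}
```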
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: Add register memory APIs
Add two new memory registration functions to the virtio HSA KMT library:
1. vhsaKmtRegisterMemory: A simplified wrapper for vhsaKmtRegisterMemoryWithFlags
that uses default CoarseGrain memory flags.
2. vhsaKmtRegisterMemoryToNodes: A stub implementation for registering memory
to specific nodes. Returns HSAKMT_STATUS_NOT_IMPLEMENTED as it's currently
not used in ROCR.
Changes:
- Added function declarations in hsakmt_virtio.h
- Implemented functions in hsakmt_virtio_memory.c
- Exported symbols in libhsakmt_virtio.ver
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: Add graphics handle registration and mapping APIs
- Add vhsaKmtRegisterGraphicsHandleToNodesExt() with flags support
- Add vhsaKmtMapGraphicHandle() and vhsaKmtUnmapGraphicHandle() stubs
- Refactor existing registration API to use extended version
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: Add virtio support for queue APIs
Implement vhsaKmtUpdateQueue, vhsaKmtSetQueueCUMask,
vhsaKmtAllocQueueGWS and vhsaKmtGetQueueInfo functions
with virtio protocol extensions and symbol exports.
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: Add new virtio API support for model, SMI, and XNACK mode
Add three new API functions to the virtio backend:
- vhsaKmtModelEnabled: Check if pre-silicon model is enabled (returns false for virtio)
- vhsaKmtOpenSMI: Open SMI interface for a node (not yet supported in virtio)
- vhsaKmtSetXNACKMode: Set XNACK mode via virtio control command
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: Add shared memory support for virtio backend
Implement shared memory APIs for the virtio backend to enable
memory sharing between processes:
- Add vhsaKmtShareMemory() to share memory regions and create
shared memory handles
- Add vhsaKmtRegisterSharedHandle() to register shared memory
handles in the current process
- Add vhsaKmtRegisterSharedHandleToNodes() for node-specific
shared memory registration
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: Add memory management APIs for virtio
Add the following new memory management APIs to virtio implementation:
- vhsaKmtSetMemoryUserData: Set user data for memory pointer
- vhsaKmtSetMemoryPolicy: Configure memory policy for nodes
- vhsaKmtSVMGetAttr: Get SVM (Shared Virtual Memory) attributes
- vhsaKmtSVMSetAttr: Set SVM attributes
- vhsaKmtReplaceAsanHeaderPage: ASAN header page replacement (stub)
- vhsaKmtReturnAsanHeaderPage: ASAN header page return (stub)
Changes include:
- Added API declarations in hsakmt_virtio.h
- Implemented functions in hsakmt_virtio_memory.c
- Extended protocol definitions in hsakmt_virtio_proto.h
- Added user_data field to vhsakmt_bo structure
- Exported new symbols in libhsakmt_virtio.ver
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: Add SPM APIs
Add three new SPM-related APIs to the virtio interface:
- vhsaKmtSPMAcquire: Acquire SPM resources on a preferred node
- vhsaKmtSPMRelease: Release SPM resources on a preferred node
- vhsaKmtSPMSetDestBuffer: Set destination buffer for SPM data with
optional userptr support and data loss detection
These APIs extend the virtio command protocol with new query types:
- VHSAKMT_CCMD_QUERY_SPM_ACQUIRE
- VHSAKMT_CCMD_QUERY_SPM_RELEASE
- VHSAKMT_CCMD_QUERY_SPM_SET_DST_BUFFER
The implementation includes proper buffer management for both
direct BO access and userptr fallback for smaller buffers.
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: Add virtio stub for hsaKmtAisReadWriteFile API
Add vhsaKmtAisReadWriteFile stub implementation for the virtio backend
to support AIS (Accelerated I/O Service) file read/write operations.
This stub currently returns HSAKMT_STATUS_NOT_IMPLEMENTED.
Changes include:
- Add vhsaKmtAisReadWriteFile declaration in hsakmt_virtio.h
- Add stub implementation in hsakmt_virtio_memory.c
- Export the symbol in libhsakmt_virtio.ver
Signed-off-by: energystoryhhl <energystoryhhl@users.noreply.github.com>
* libhsakmt/virtio: Add vamdgpu_bo_query_info and vamdgpu_bo_set_metadata APIs
Implement two new virtio wrapper functions for AMDGPU buffer object operations:
1. vamdgpu_bo_query_info: Query buffer object information including
allocation parameters, memory usage, and metadata.
2. vamdgpu_bo_set_metadata: Set metadata for a buffer object, allowing
applications to attach custom data to GPU memory allocations.
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: Add ProcessVMRead/Write stub implementations for virtio
Add vhsaKmtProcessVMRead and vhsaKmtProcessVMWrite stub functions
to the virtio interface. These APIs return HSAKMT_STATUS_NOT_IMPLEMENTED
since they are not supported in the baremetal implementation, matching
the behavior of the deprecated hsaKmtProcessVMRead/Write APIs.
Signed-off-by: energystoryhhl <energystoryhhl@users.noreply.github.com>
---------
Signed-off-by: Honglei Huang <honghuan@amd.com>
Signed-off-by: energystoryhhl <energystoryhhl@users.noreply.github.com>
Co-authored-by: energystoryhhl <energystoryhhl@users.noreply.github.com>
* Add HasExpertSchedMode device prop
* Add unit tests for HasExpertSchedMode
* Add gfx12 check for HasExpertSchedMode prop
* Update gfx major version check and test for ExpertSchedMode
* Minor fix and ROCr version bump
* Update projects/rocr-runtime/runtime/hsa-runtime/inc/hsa_ext_amd.h
* Update projects/rocr-runtime/runtime/hsa-runtime/inc/hsa_ext_amd.h
* Apply suggestion from @dayatsin-amd
* Apply suggestion from @dayatsin-amd
---------
Co-authored-by: Stefan Sokolovic <stefan.sokolovic2@amd.com>
Co-authored-by: David Yat Sin <77975354+dayatsin-amd@users.noreply.github.com>
* hsakmt: Expose CWSR and Control stack sizes
This is better than hardcoding values and hoping that they align with
KFD's definitions
Signed-off-by: Kent Russell <kent.russell@amd.com>
* hsakmt: Use CwsrSize and CtlStackSize if available
If KFD is providing the CwsrSize and CtlStackSize, use the maximum
of those and the old calculations for the ctx_save_restore_size
and ctl_stack_size defined in the queue
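The selection rule above amounts to taking a maximum. A minimal sketch follows; the function name is illustrative, and it assumes a reported value of 0 means the kernel did not provide the size.

```c
#include <stdint.h>

/* Return the larger of the locally computed size and the size reported by
 * KFD. When KFD reports 0 (assumed to mean "not available"), the computed
 * value wins automatically. */
uint32_t demo_pick_size(uint32_t computed, uint32_t kfd_reported)
{
    return kfd_reported > computed ? kfd_reported : computed;
}
```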
Signed-off-by: Kent Russell <kent.russell@amd.com>
* hsakmt: Add warning when ABI<1.20 on GFX1151
CwsrSize and CtlStackSize are reported by KFD ABI 1.20. GFX1151
specifically may have some issues if these regions are misaligned, so
report a strong warning during topology initialization if the system is
GFX1151 but is using KFD ABI < 1.20
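The check behind this warning could be expressed roughly as below. The numeric target encoding (1151) and the helper names are assumptions made for the sketch, not the identifiers used in hsakmt.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when version (major, minor) is older than (req_major, req_minor). */
bool demo_abi_older_than(uint32_t major, uint32_t minor,
                         uint32_t req_major, uint32_t req_minor)
{
    return major < req_major || (major == req_major && minor < req_minor);
}

/* Warn only for the affected target when the KFD ABI predates 1.20.
 * gfx_target is a hypothetical numeric encoding of the GFX version. */
bool demo_should_warn(uint32_t gfx_target, uint32_t major, uint32_t minor)
{
    return gfx_target == 1151 && demo_abi_older_than(major, minor, 1, 20);
}
```

Comparing major and minor as separate integers matters here: 1.9 is older than 1.20, which a floating-point or string comparison would get wrong.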
Signed-off-by: Kent Russell <kent.russell@amd.com>
---------
Signed-off-by: Kent Russell <kent.russell@amd.com>
Although the value is correct; there is no source of truth between
kernel and userspace. This leads to problems if the kernel has strict
restrictions (such as kernel 6.17 or earlier). The restrictions were
lifted in 6.17.9 and and 6.18, but there is no guarantee userspace is
using this.
So short term this value will be wrong. But on newer kernels the kernel
will communicate the right size and rocr-runtime will be adjusted to
use that.
Link: https://github.com/ROCm/TheRock/pull/2505
Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
* SWDEV-558848 - vmm api support for rocr on windows
* Fixes to VMM handle Map/Unmap Set/Get Access
* Fix GetShareableHandle to use pointer for shareable handle
* Update os specific map/unmap memory calls
* clang format update
* Minor syntax fixes from code review
Co-authored-by: Yiannis Papadopoulos <102817138+ypapadop-amd@users.noreply.github.com>
---------
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
Co-authored-by: Yiannis Papadopoulos <102817138+ypapadop-amd@users.noreply.github.com>
* kfdtest: Replace pthread with std::thread
Modify concurrent kfdtest to use std::thread
instead of pthread, eventually modify KFDTestLaunch
to take in a member function of test instance
instead of static function.
Convert KFDQMTest to pass in member function for
multi-gpu kfdtest.
* kfdtest: Convert KFDPerfCountersTest to use std::thread
Convert KFDPerfCountersTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDGraphicsInterop to use std::thread
Convert KFDGraphicsInterop to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDGWSTest to use std::thread
Convert KFDGWSTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDCWSRTest to use std::thread
Convert KFDCWSRTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDEventTest to use std::thread
Convert KFDEventTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDExceptionTest to use std::thread
Convert KFDExceptionTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDLocalMemoryTest to use std::thread
Convert KFDLocalMemoryTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDMemoryTest to use std::thread
Convert KFDMemoryTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDSVMRangeTest to use std::thread
Convert KFDSVMRangeTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Convert KFDHWSTest to use std::thread
Convert KFDHWSTest to use std::thread for
multi-gpu kfdtest.
* kfdtest: Remove pthread multigpu test structure
Remove older multi-gpu test framework which
uses pthread.
* libhsakmt/virtio: change shmem size to 80
Some DGPU props have a lot of information,
so it is necessary to increase the size of shmem.
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: use BO handle instead of pointer in memory registration
Change vhsakmt_map_to_gpu() return type from void* to vhsakmt_bo_handle
to properly handle buffer object information. This allows access to
both the host address and resource ID needed for memory registration.
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: Improve memory mapping logic
- Update vhsakmt_mappable() to check NoAddress flag and require HostAccess
- Remove mappable checks in cpu_map/unmap to allow all BOs to be mapped
- Set BO flags properly in vhsakmt_alloc_memory and scratch memory creation
- Ensure scratch memory is correctly flagged for proper handling
Signed-off-by: Honglei Huang <honghuan@amd.com>
* libhsakmt/virtio: add no svm mode for libhsakmt virtio
Add a no-SVM mode to the libhsakmt virtio driver. In no-SVM mode, userptrs
must be managed by the UMD, so add an interval tree to track them.
New Features:
- Add augmented red-black tree based interval tree implementation
* Implement RB-tree insertion, deletion, and color balancing
* Provide interval query for fast overlapping range lookup
* Based on Linux kernel's augmented rbtree implementation
- Improve userptr memory management
* Use interval tree to efficiently track userptr memory regions
* Support finding registered memory within given address ranges
* Optimize memory mapping and unmapping performance
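The query an interval tree answers here is "which registered range overlaps this address range". The sketch below shows only that overlap test, using a linear scan as a stand-in for the augmented red-black tree; all names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* A registered region with inclusive bounds. */
struct demo_range {
    uint64_t start;
    uint64_t last;
};

/* Two closed intervals overlap when each starts no later than the other ends. */
static int demo_overlaps(const struct demo_range *r, uint64_t start, uint64_t last)
{
    return r->start <= last && start <= r->last;
}

/* Return the first registered range overlapping [start, last], or NULL.
 * An interval tree answers the same query without visiting every entry. */
const struct demo_range *demo_find_overlap(const struct demo_range *ranges,
                                           size_t n, uint64_t start, uint64_t last)
{
    for (size_t i = 0; i < n; i++) {
        if (demo_overlaps(&ranges[i], start, last))
            return &ranges[i];
    }
    return NULL;
}
```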
Signed-off-by: Honglei Huang <honghuan@amd.com>
---------
Signed-off-by: Honglei Huang <honghuan@amd.com>
* Introduce HsaKFDContext structure and infrastructure for multiple KFD contexts, enabling
independent contexts within a single process.
* Refactor core components (queue, event, FMM, topology) to be context-aware,
using explicit HsaKFDContext parameters instead of global state.
* Replace global hsakmt_kfd_fd with context-specific file descriptors, ensuring full context isolation.
* Maintain backward compatibility by redirecting legacy APIs to use the primary context.
This refactoring establishes a foundation for multi-context support while preserving existing functionality.
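The backward-compatibility approach can be illustrated with a small sketch: a context-aware function takes its state from an explicit argument, and the legacy entry point forwards to it with the primary context. The struct layout and every `demo_` name are assumptions; the real HsaKFDContext is not shown in this log.

```c
#include <stddef.h>

/* Hypothetical context type holding state that used to be global. */
struct demo_ctx {
    int kfd_fd;
    unsigned long clock;
};

/* The primary context backs all legacy, context-free APIs. */
static struct demo_ctx demo_primary = { .kfd_fd = -1, .clock = 100 };

/* Context-aware variant: all state comes from the explicit ctx argument. */
int demo_get_clock_ctx(const struct demo_ctx *ctx, unsigned long *out)
{
    if (!ctx || !out)
        return -1;
    *out = ctx->clock;
    return 0;
}

/* Legacy entry point kept for existing callers: forwards to the
 * context-aware variant using the primary context. */
int demo_get_clock(unsigned long *out)
{
    return demo_get_clock_ctx(&demo_primary, out);
}
```

Existing callers keep working unchanged, while new code can create additional contexts and pass them explicitly.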
Signed-off-by: Junhua Shen <Junhua.Shen@amd.com>
* kfdtest: Enable GPU selection via CLI for multi-GPU tests
Replaced environment variable-based GPU selection with
GPU selection via command-line parameter --concurrentnodes (-c)
Modified g_TestGPUsNum to be passed in via command-line
parameter --testnodenum (-t)
Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
---------
Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
Co-authored-by: Alysa Liu <Alysa.Liu@amd.com>
* rocr: Add support for VMM and RDMA
Add extra CPU mapping so that kernel-mode drivers can look up the memory
mapping by virtual address.
* Update projects/rocr-runtime/runtime/hsa-runtime/core/runtime/runtime.cpp
Co-authored-by: Yiannis Papadopoulos <102817138+ypapadop-amd@users.noreply.github.com>
* Update projects/rocr-runtime/runtime/hsa-runtime/core/inc/runtime.h
Co-authored-by: Yiannis Papadopoulos <102817138+ypapadop-amd@users.noreply.github.com>
* rocr: Honor uncache flag in memory_lock_to_pool()
Also combine several flag options used in APIs into a
single integer.
Signed-off-by: Chris Freehill <cfreehil@amd.com>
* rocr: Fix hsa_amd_pointer_info on CPU agents
Fix hsa_amd_pointer_info query returning allowed on VMM pointers for CPU
agents when CPU mapping was mapped with PROT_NONE.
---------
Signed-off-by: Chris Freehill <cfreehil@amd.com>
Co-authored-by: Yiannis Papadopoulos <102817138+ypapadop-amd@users.noreply.github.com>
Co-authored-by: Chris Freehill <cfreehil@amd.com>
Co-authored-by: cfreeamd <166262151+cfreeamd@users.noreply.github.com>
Modify the code that computes the adjusted CU mask array to take
into account additional cases of inactive CUs.
Signed-off-by: David Belanger <david.belanger@amd.com>
* libhsakmt: Update hsakmt_fmm_get_handle to support address range
Currently, hsakmt_fmm_get_handle works only if the address is the allocated
(starting) value. Update it so it can find the handle if the address falls
within the valid allocated range. This is useful for the AMD Infinity
Storage feature, where data needs to be transferred to any memory within
the allocated range.
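A range-based lookup of this kind can be sketched as follows. The record layout and function name are hypothetical and do not reflect the FMM data structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical allocation record: [base, base + size) maps to handle. */
struct demo_alloc {
    uint64_t base;
    uint64_t size;
    uint32_t handle;
};

/* Find the handle of the allocation that contains addr. Matching only
 * addr == base would miss addresses in the middle of an allocation.
 * Returns 0 and stores the handle on success, -1 if nothing contains addr. */
int demo_get_handle(const struct demo_alloc *allocs, size_t n,
                    uint64_t addr, uint32_t *handle)
{
    for (size_t i = 0; i < n; i++) {
        /* addr - base < size avoids overflow in base + size. */
        if (addr >= allocs[i].base && addr - allocs[i].base < allocs[i].size) {
            *handle = allocs[i].handle;
            return 0;
        }
    }
    return -1;
}
```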
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
* libhsakmt: Introduce AMD Infinity Storage (AIS) API
Add hsaKmtAisReadWriteFile() API to support AMD Infinity Storage. The
API moves data directly from GPU VRAM to a file.
v2: Add in/out ioctl arguments to provide more status information to
user space. Modify hsaKmt API also accordingly.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
* rocr: Initial implementation of AMD Infinity Storage (AIS)
Implement first two API: hsa_amd_ais_file_write and hsa_amd_ais_file_read
v2: Change API from hsa_amd_ to hsa_amd_ais_
Change API to take in handle instead of fd for compatibility across
different platforms
Original Author: Chris Freehill <Chris.Freehill@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
---------
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
* libhsakmt: fix UB due to signed integer literal in 1 << 31
Bit shift operations on signed numbers should not shift into or beyond
the sign bit, as this results in undefined behaviour.
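For illustration, the usual fix is to make the shifted operand unsigned. The macro names below are hypothetical and not the ones used in libhsakmt.

```c
#include <stdint.h>

/* With a 32-bit int, 1 << 31 shifts a signed value into its sign bit,
 * which is undefined behaviour. An unsigned operand is well defined. */
#define DEMO_BIT(n)   (UINT32_C(1) << (n))

/* For bit positions of 32 and above, use a 64-bit unsigned operand. */
#define DEMO_BIT64(n) (UINT64_C(1) << (n))

/* Example use: mask with only the top bit of a 32-bit word set. */
uint32_t demo_top_bit(void)
{
    return DEMO_BIT(31);
}
```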
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
* libhsakmt: Fix UB due to signed integer literal in 1 << x
Bit shifting a signed integer into or beyond the sign bit is undefined behavior.
BUG: SWDEV-532853
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
* rocr: Fix UB in various places due to signed integer in bit shift
Bit shifting signed integers into or beyond the sign bit is undefined.
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
* rocr: Change signed integer literals to unsigned
Change the signed integer literals in the macro expressions throughout the
file to unsigned to avoid overflow.
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
---------
Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
Co-authored-by: Flora Cui <flora.cui@amd.com>
* libhsakmt: Update ioctl version to 1.18
Sync with kernel ioctl version.
Also explicitly set the ioctl flag to KFD_PROC_FLAG_MFMA_HIGH_PRECISION
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
* libhsakmt: Sync ioctl header by adding kfd_ioctl_profiler
Sync with kernel ioctl version. Add kfd_ioctl_profiler.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
---------
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Also, the advice parameter of the madvise() system call is not a bitmask,
so fix that as well.
v2: Use MAP_SHARED instead of MAP_PRIVATE. This avoids MMU notifiers and
evictions.
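Since madvise() advice values are enumerated constants rather than flags, each one needs its own call. A minimal Linux sketch, with a hypothetical helper name:

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Apply each advice value with a separate madvise() call. OR-ing two
 * advice constants together would produce a different, unrelated value
 * instead of requesting both behaviours.
 * Returns 0 on success, -1 as soon as one call fails. */
int demo_apply_advice(void *addr, size_t len, const int *advice, size_t n)
{
    assert(advice != NULL || n == 0);

    for (size_t i = 0; i < n; i++) {
        if (madvise(addr, len, advice[i]) != 0)
            return -1;
    }
    return 0;
}
```

A caller would typically map the region first, for example with MAP_SHARED | MAP_ANONYMOUS, and then pass an array such as { MADV_DONTFORK, MADV_DONTDUMP }.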
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
This patch uses the udmabuf driver to allocate system memory on APUs instead of
the amdgpu driver. This lets an application account for the system memory it
consumes through the cgroup mechanism. The feature is enabled by the
HSA_USE_UDMABUF environment variable.
Signed-off-by: Xiaogang Chen <Xiaogang.Chen@amd.com>
[ROCm/ROCR-Runtime commit: 996e8bbfb7]
This patch adds VirtIO support to the libhsakmt library, enabling communication
with AMD GPUs via VirtIO.
Details
- CMakeLists.txt: Added a new CMakeLists.txt file for the VirtIO component
of libhsakmt.
- hsakmt_virtio.c/h: Implemented the core VirtIO functionality, including
VirtIO GPU device initialization, command execution, and memory management.
- virtio_gpu.c/h: Contains the implementation of the VirtIO GPU device,
including ioctl handling, shared memory management, and command execution.
- hsakmt_virtio_events.c: Implements event handling for VirtIO, such as event
creation, destruction, setting, resetting, and querying event states.
- hsakmt_virtio_memory.c: Manages memory operations for VirtIO, including memory
allocation, freeing, mapping, and unmapping.
- hsakmt_virtio_queues.c: Implements queue management for VirtIO, including
queue creation, destruction, and updating.
- hsakmt_virtio_topology.c: Handles system and node properties for VirtIO.
- hsakmt_virtio_vm.c: Manages VM-related operations for VirtIO, such as
reserving and dereserving VA space.
- include/linux/virtgpu_drm.h: Contains DRM definitions for VirtIO GPU.
Key Features
- VirtIO GPU Initialization: The library can now initialize a VirtIO GPU device
and communicate with it.
- Command Execution: Supports executing commands on the VirtIO GPU device.
- Memory Management: Provides functions for allocating, freeing, mapping, and
unmapping memory for VirtIO operations.
- Event Handling: Implements a comprehensive event system for VirtIO.
- Queue Management: Allows for creating, destroying, and updating queues
on the VirtIO GPU device.
- System and Node Properties: Retrieves and manages system and node
properties for VirtIO.
Signed-off-by: Honglei Huang <Honglei1.Huang@amd.com>
[ROCm/ROCR-Runtime commit: 48d3719dba]
Support was removed for these eng samples, so remove them from the
blacklist, and make sure that we're using 942 for the shader store
[ROCm/ROCR-Runtime commit: f755981f03]
- Refactored scratch memory handling by introducing fmm_is_scratch_aperture to
replace repeated for-loops.
- Simplified code paths in hsakmt_fmm_release, hsakmt_fmm_map_to_gpu, and
hsakmt_fmm_unmap_from_gpu by using the new helper.
Signed-off-by: Honglei Huang <Honglei1.Huang@amd.com>
[ROCm/ROCR-Runtime commit: 72061a9024]
so that aql-to-pm4 conversion could verify the validity of the kernel
object.
Signed-off-by: Flora Cui <flora.cui@amd.com>
[ROCm/ROCR-Runtime commit: a765dd7e94]
This patch changes the type of several loop index variables from int to
uint32_t in fmm.c. The affected functions are:
- __fmm_release
- _fmm_map_to_gpu
- _fmm_unmap_from_gpu
To fix compile warning:
warning: comparison of integer expressions of different signedness:
'int' and 'uint32_t' {aka 'unsigned int'} [-Wsign-compare]
2009 | for (i = 0; i < object->handle_num; i++) {
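The shape of the fix is shown below in a reduced form. The struct and function are hypothetical; only the loop index type mirrors the change described.

```c
#include <stdint.h>

/* Hypothetical object with an unsigned element count, like handle_num. */
struct demo_object {
    uint32_t handle_num;
    uint32_t handles[4];
};

/* Sum the first handle_num handles. */
uint32_t demo_sum_handles(const struct demo_object *object)
{
    uint32_t sum = 0;

    /* A uint32_t index has the same signedness as handle_num, so the
     * comparison no longer triggers -Wsign-compare. */
    for (uint32_t i = 0; i < object->handle_num; i++)
        sum += object->handles[i];

    return sum;
}
```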
Signed-off-by: Honglei Huang <Honglei1.Huang@amd.com>
[ROCm/ROCR-Runtime commit: 45af009c5d]
blacklist the KFDEvictTest suite until the defects
SWDEV 535386 and 537002, where these test cases fail
inconsistently, are fixed
Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
[ROCm/ROCR-Runtime commit: 3115384874]
Disable the KFD RAS test cases because they cause a GPU reset
that affects the active kfdtest; these tests can only be
run successfully as separate processes
Signed-off-by: Apurv Mishra <Apurv.Mishra@amd.com>
[ROCm/ROCR-Runtime commit: d9a95605cc]