This enhances libhsakmt's capabilities for multi-context KFD support by implementing per-context topology management.
Changes:
* Add hsaKmtGetClockCountersCtx for multi-context support
- Add context-aware version of hsaKmtGetClockCounters
- Original API is retained as a wrapper calling the ctx-version with primary context
* Enable independent debug sessions across multiple KFD contexts
-Create hsa_kfd_debug_context, introduce context-aware debug APIs, shift debug state to per-context
* Add perf sub-context for per-context performance counter management
- Introduce hsa_kfd_perf_context, move counter properties, add context - aware perf APIs, and update initialization
* Refactor FMM for per-context resource management
- Refactor multiple global variables related to FMM, including
GPU ID arrays , svm, cpuvm_aperture, and mem_handle_aperture to hsa_kfd_fmm_context
* Implement per-context topology for complete context isolation
- Migrate global topology data (g_system, g_props, map_user_to_sysfs_node_id)
to per-context hsa_kfd_topology_context structure
- Update all topology functions to accept HsaKFDContext parameter for
context-aware operations (validate_nodeid, get_node_props, get_iolink_props, etc.)
- Refactor topology snapshot management for per-context isolation
- Add context-aware PMC trace access APIs
Signed-off-by: Junhua Shen <Junhua.Shen@amd.com>
* SWDEV-558848 - Move DRM calls to thunk for better abstraction
* Use thunk device handle instead of drm inside agent
* Update IPC functions with new thunk calls
* create hsaKmtHandleImport interface to support ipc
* Reset metadata inside hsaKmtMemHandleFree
* remove whitespaces and NULL usage
* Add thunk apis to libhsakmt.ver
* Add comments to new structs in thunk
* Minor fixes to declarations
* resolve merge conflicts in amd_kfd_driver
---------
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
* SWDEV-558848 - vmm api support for rocr on windows
* Fixes to VMM handle Map/Unmap Set/Get Access
* Fix GetShareableHandle to use pointer for shareable handle
* Update os specific map/unmap memory calls
* clang format update
* Minor syntax fixes from code review
Co-authored-by: Yiannis Papadopoulos <102817138+ypapadop-amd@users.noreply.github.com>
---------
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
Co-authored-by: Yiannis Papadopoulos <102817138+ypapadop-amd@users.noreply.github.com>
* libhsakmt: Update hsakmt_fmm_get_handle to support address range
Currently, hsakmt_fmm_get_handle works only if the address is allocated
(staring) value. Update it so it can find the handle if address falls in
the valid allocated range. This is useful for AMD infinity storage
feature where data needs to be transferred to any memory within in the
allocated range
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
* libhsakmt: Introduce AMD Infinity Storage (AIS) API
Add hsaKmtAisReadWriteFile() API to support AMD Infinity Storage. The
API moves data directly from GPU VRAM to a file.
v2: Add in/out ioctl arguments to provide more status information to
user space. Modify hsaKmt API also accordingly.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
* rocr: Initial implementation of AMD Infinity Storage (AIS)
Implement first two API: hsa_amd_ais_file_write and hsa_amd_ais_file_read
v2: Change API from hsa_amd_ to hsa_amd_ais_
Change API to take in handle instead of fd for compatibility accross
different platforms
Original Author: Chris Freehill <Chris.Freehill@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
---------
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
The over arching goal it so provide an API that pre-silicon models can latch into for software bring up.# Please enter the commit message for your changes. Lines starting
[ROCm/ROCR-Runtime commit: d4b85b6bf5]
Currently registering graphics memory without specifying a target
node will return a memory handle that's not a virtual address.
As a result, ROCr is forced to register with a target node for
IPC usage.
Mapping memory without specifying a target node afterwards will
result in mapping to the target node that was imported because the
previous import call flags this node targeting action to future mapping.
For ROCr IPC usage, ROCr wants to map to all GPU nodes if the target node
is not specified.
Allow the caller to register graphics handles that returns a virtual
address without having to specify the target node so that the caller
can make a subsequent map call to all GPUs.
Change-Id: I5a935092b885cc3568e4f3a5dd951c7ec6c84fca
[ROCm/ROCR-Runtime commit: 03463ed2c0]
Extend the current Thunk implementation of queue creation to target
specific SDMA engine IDs.
Also expose the new recommend SDMA engines per IO link from the KFD
sysfs.
Change-Id: I51f9a0d83c0f1fc4d5dc837f879a7ae332e7d7e9
[ROCm/ROCR-Runtime commit: 2f588a2406]
New API to support optional alignment parameter for memory allocations.
The alignment should be larger than or equal to page size and a power
of 2.
Change-Id: Ic3fec43b3c4281f74dd33a57ab4143dcf76e1186
Signed-off-by: Chris Freehill <cfreehil@amd.com>
[ROCm/ROCR-Runtime commit: a31e84eaef]