Implement two new APIs for cross-memory read and write operations.
- hsaKmtProcessVMRead
- hsaKmtProcessVMWrite
Add the new ioctls necessary for the above APIs.
Change-Id: I0c153e3b4e1f32b7a8b102ad5c774d9ae9bfc2fa
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
events.c and queues.c were accidentally changed to mode 755 by change
fc70f0c30976f4021f7d763bfc10d76a76029553. Change them back.
Change-Id: If51c0b91139afc23e9051cf94c83d61fc20297e6
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
This avoids unnecessary evictions and failed restores due to the
munmap of userptr BOs that are just about to be freed.
Change-Id: Icf2f0b73991455556a201c54c05ea7e20af80f47
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Should point directly to amd_queue_t.write_dispatch_id. Only noticeable
with HWS enabled, which is not yet stable.
Change-Id: I169906d45225379a3ca2729ff04d298fdbb9a9fb
Add IOMMUv2 to blocks returned by hsaKmtPmcGetCounterProperties(). IOMMU
information is read from sysfs.
Change-Id: I3a1c6f902f947913570a78700fc0ffc444e1dd72
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
The Thunk follows the Linux kernel coding convention of using tabs
instead of spaces.
Change-Id: I4eddcfa9a0513f16c869d9cc63f9f1dae0c39f83
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Add gfx803 10/11 device IDs that were recently added to KFD.
Change-Id: Id40b117ae47bacedefa6e333fdfdf58dea92cd2d
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
- Build system fixes
- No user-mode high-precision timer by default, use clock_gettime
- Use C11 aligned_alloc pending C++17 std::aligned_alloc
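
The last two bullets above can be illustrated with a minimal C sketch. It is
not the runtime's actual code: the helper names now_ns and alloc_aligned are
hypothetical, and the choice of CLOCK_MONOTONIC is an assumption. Only
clock_gettime and the C11 aligned_alloc call come from the text.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <time.h>

/* Hypothetical helper: monotonic timestamp in nanoseconds, 0 on failure. */
static uint64_t now_ns(void)
{
	struct timespec ts;

	if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0)
		return 0;
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/*
 * Hypothetical helper: C11 aligned_alloc expects the size to be a multiple
 * of the alignment, so round the request up before allocating.
 * Returns NULL for a zero-sized request or on overflow.
 */
static void *alloc_aligned(size_t size, size_t alignment)
{
	size_t rounded;

	assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
	if (size == 0 || size > SIZE_MAX - (alignment - 1))
		return NULL;
	rounded = (size + alignment - 1) & ~(alignment - 1);
	return aligned_alloc(alignment, rounded);
}
```

Memory returned by alloc_aligned is released with plain free().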
Change-Id: I268365bdfd11d1e817a89584b9e086ee5b86e1dc
Gfx9 requires monotonic write pointer and doorbell.
Count fields are 1-based compared with 0-based pre-Gfx9.
- Restructure implementation to use monotonic ring indices
- Remove redundant submission size checks (handled by AcquireWriteAddress)
- Unify copy/fill per-command limit (documentation is unclear)
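
The idea of monotonic ring indices can be sketched as follows. This is an
illustration only: struct ring and its helpers are hypothetical names, the
ring size is assumed to be a power of two, and this is not the body of
AcquireWriteAddress. The indices only ever grow, and the buffer offset is
derived by masking.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical ring descriptor; field names are illustrative only. */
struct ring {
	uint64_t write_index;	/* monotonic, never wraps back to 0 */
	uint64_t read_index;	/* monotonic, advanced by the consumer */
	uint32_t size;		/* ring size in bytes, power of two */
};

/* Map a monotonic index to a byte offset inside the ring buffer. */
static uint32_t ring_offset(const struct ring *r, uint64_t index)
{
	assert(r->size != 0 && (r->size & (r->size - 1)) == 0);
	return (uint32_t)(index & (r->size - 1));
}

/* Bytes still available: ring size minus bytes written but not yet read. */
static uint64_t ring_free_bytes(const struct ring *r)
{
	return r->size - (r->write_index - r->read_index);
}

/*
 * Reserve space for a submission. On success, store the monotonic start
 * index in *start and return 0; return -1 if the ring is too full.
 */
static int ring_reserve(struct ring *r, uint32_t bytes, uint64_t *start)
{
	if (bytes > ring_free_bytes(r))
		return -1;
	*start = r->write_index;
	r->write_index += bytes;
	return 0;
}
```

Because the reservation step already rejects oversized requests, callers in
such a design need no separate submission size check.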
Change-Id: I57c1675221d2e63aa319fee700d9951671e1bd65
Note: The implementation is the same as the 1.0 APIs for now.
A follow-up change will provide the complete implementation.
Change-Id: Ife633f74ff27eee0bb9b0c46952cf5233b0114e8
If fork() is called, clear all duplicated data that is invalid in the
child process.
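
One common way to do this kind of cleanup is a child handler registered with
pthread_atfork. The sketch below assumes that approach. The state variables
and function names are hypothetical and stand in for whatever per-process
data the library actually keeps.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical process-wide state that is meaningless in a forked child. */
static int device_fd = -1;
static bool runtime_open;
static bool handler_installed;

/* Runs in the child right after fork(): forget everything inherited. */
static void clear_after_fork(void)
{
	device_fd = -1;
	runtime_open = false;
}

/*
 * Register the child handler once. Handlers cannot be unregistered, so a
 * second registration would make the handler run twice per fork.
 */
static int install_fork_handler(void)
{
	int ret;

	assert(!handler_installed);
	ret = pthread_atfork(NULL, NULL, clear_after_fork);
	if (ret == 0)
		handler_installed = true;
	return ret;
}
```

The parent keeps its state untouched; only the child starts from a clean
slate and has to open the device again before using it.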
Change-Id: I4e27198060db593c630c6337b7071dfbd0d80b83
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
CWSR buffers can be large on dGPUs (~21MB on gfx803). Allocating them
in VRAM unnecessarily limits the number of queues that can be created.
Also make freeing of per-queue buffers symmetric with allocation. All
buffers are now allocated with allocate_exec_aligned_memory on dGPUs
and APUs, so use free_exec_aligned_memory to free them.
Change-Id: I45e8cb1801857d0268750202cdd422426611e457
Also emit error messages to stderr if no async queue error callback was
registered and queue fault messages are enabled (on by default).
Queue fault messages are controlled with env key
HSA_ENABLE_QUEUE_FAULT_MESSAGE.
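
The decision described above can be sketched in C as follows. The function
names are hypothetical, and treating the value "0" as "disabled" is an
assumption. The text only states the key name and that the messages are on
by default.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Assumed parsing rule: unset or empty keeps the default, "0" disables. */
static bool flag_enabled(const char *value, bool default_on)
{
	if (value == NULL || value[0] == '\0')
		return default_on;
	return strcmp(value, "0") != 0;
}

static bool queue_fault_messages_enabled(void)
{
	return flag_enabled(getenv("HSA_ENABLE_QUEUE_FAULT_MESSAGE"), true);
}

/*
 * Hypothetical reporting path: print to stderr only when no error callback
 * is registered and messages are enabled. Returns 1 if a message was
 * printed, 0 otherwise.
 */
static int report_queue_fault(bool have_callback, const char *msg)
{
	assert(msg != NULL);
	if (have_callback || !queue_fault_messages_enabled())
		return 0;
	fprintf(stderr, "Queue fault: %s\n", msg);
	return 1;
}
```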
Change-Id: I496487b8d048b83aa95b9784e92928211f167b17
Uncomment the HSA IPC code.
Change hsa_amd_ipc_memory_t to be 8 uint32_t values instead of 9 to
match the spec.
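
A size mismatch like this can be caught at compile time. The sketch below
uses a hypothetical stand-in type, example_ipc_memory_t, rather than the
real header definition, and assumes the handle is an opaque array of eight
32-bit words.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an opaque 8-word IPC handle. */
typedef struct example_ipc_memory_s {
	uint32_t handle[8];
} example_ipc_memory_t;

/* Fail the build if the handle ever drifts from 8 x 32 bits. */
_Static_assert(sizeof(example_ipc_memory_t) == 8 * sizeof(uint32_t),
	       "IPC handle must be exactly 8 uint32_t words");

/* Handles are opaque, so compare them bytewise. */
static bool example_ipc_handle_equal(const example_ipc_memory_t *a,
				     const example_ipc_memory_t *b)
{
	assert(a != NULL && b != NULL);
	return memcmp(a->handle, b->handle, sizeof(a->handle)) == 0;
}
```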
Change-Id: Id1523125e9b876a23c3743df1be29c98b47f6725
Implement three new APIs for IPC buffer sharing:
- hsaKmtShareMemory()
- hsaKmtRegisterSharedHandle()
- hsaKmtRegisterSharedHandleToNodes()
Add the new ioctls necessary for the above APIs.
Change-Id: Ia2b4d0dc91ec64bff959395d11c0536467404792
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>