* libhsakmt: Update hsakmt_fmm_get_handle to support address range
Currently, hsakmt_fmm_get_handle works only if the address is allocated
(staring) value. Update it so it can find the handle if address falls in
the valid allocated range. This is useful for AMD infinity storage
feature where data needs to be transferred to any memory within in the
allocated range
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
* libhsakmt: Introduce AMD Infinity Storage (AIS) API
Add hsaKmtAisReadWriteFile() API to support AMD Infinity Storage. The
API moves data directly from GPU VRAM to a file.
v2: Add in/out ioctl arguments to provide more status information to
user space. Modify hsaKmt API also accordingly.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
* rocr: Initial implementation of AMD Infinity Storage (AIS)
Implement first two API: hsa_amd_ais_file_write and hsa_amd_ais_file_read
v2: Change API from hsa_amd_ to hsa_amd_ais_
Change API to take in handle instead of fd for compatibility accross
different platforms
Original Author: Chris Freehill <Chris.Freehill@amd.com>
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
---------
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
* libhsakmt: Update ioctl version to 1.18
Sync with kernel ioctl version.
Also explicitly set the ioctl flag to KFD_PROC_FLAG_MFMA_HIGH_PRECISION
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
* libhsakmt: Sync ioctl header by adding kfd_ioctl_profiler
Sync with kernel ioctl version. Add kfd_ioctl_profiler.
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
---------
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
This patch uses udmabuf driver to allocate system memory instead of using amdgpu
driver for APU. With this function app can account its consumed system memory by
cgroup mechanism. This function is enabled by env variable HSA_USE_UDMABUF.
Signed-off-by: Xiaogang Chen<Xiaogang.Chen@amd.com>
[ROCm/ROCR-Runtime commit: 996e8bbfb7]
Environment variable HSA_HIGH_PRECISION_MODE can be used to control MFMA
precision
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
Change-Id: Ib78dd9dd8867025e090a3cca96ab6db4f65dea12
[ROCm/ROCR-Runtime commit: 2a64fa5e06]
to send data back to user.
Change-Id: If11fb4147e32c0eed319ccf76bcde9d76815ff67
Signed-off-by: James Zhu <James.Zhu@amd.com>
[ROCm/ROCR-Runtime commit: b07a80e505]
Extend the current Thunk implementation of queue creation to target
specific SDMA engine IDs.
Also expose the new recommend SDMA engines per IO link from the KFD
sysfs.
Change-Id: I51f9a0d83c0f1fc4d5dc837f879a7ae332e7d7e9
[ROCm/ROCR-Runtime commit: 2f588a2406]
KFD ioctl version is 1.16 on upstream for contiguous memory support.
Remove pc_sampling version, should be added after pc_sample upstream.
Change-Id: I6e6c3340bc8e371d68dd7741b02578be2fdef801
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>
[ROCm/ROCR-Runtime commit: 6e6f445f75]
Since PC Sampling not upstream yet, so use 1.16 for
contiguous VRAM allocation, and 1,17 for pc sampling.
Change-Id: Ib5d22e8f386ce7fe3f7111485b9632b61227e539
Signed-off-by: James Zhu <James.Zhu@amd.com>
Signed-off-by: Chris Freehill <cfreehil@amd.com>
[ROCm/ROCR-Runtime commit: 5786dbbb76]