This adds new acquire_vm ioctl.
Change-Id: Ia6794bfd291706cecdb2d06f4902b324b48577df
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 19dacdecd3]
Query GPUVM aperture limits of all dGPUs to determine SVM aperture
base and limit. This depends on a recent KFD change that reports
the GPUVM apurture limits for dGPUs in the
AMDKFD_IOC_GET_PROCESS_APERTURES_NEW ioctl (drm/amdkfd: Simplify
dGPU SVM aperture handling).
Only initialize SVM aperture once, instead of once per GPU.
Don't call AMDKFD_IOC_SET_PROCESS_DGPU_APERTURE. It's not needed any
more and will not be upstreamed.
Change-Id: Ib3389e8ba18505ba15fc33f45fe8a57e690a565d
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 85e1a9bf5e]
Define dgpu_mem_init before it's used and keep the code close to the
rest of the aperture initialization code.
Change-Id: I14ad11a364524a15affee9186b1298ba7d56d2c9
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: c5cfb7e25b]
kfdtest hsaKmtOpenKFD failed after 1019 loop if using --gtest_loop=-1,
because default max open file handle limit is 1024. Found shmem file handle
is not closed from lsof output.
Change-Id: I474de2bae6c03e879a219dedf5f18639118b73e5
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 105291849f]
On discrete GPUs place the EOP queue in VRAM. The reader/writer of this
queue is the CP and the size is small. Dispatch latency improves
through lower read latency in AQL completion phase.
Change-Id: Id8351dcddbd21fd7c7d699803c96434c9132db71
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
[ROCm/ROCR-Runtime commit: e2c353dc0d]
Invisible device memory is mmapped as PROT_NONE.
Normal CPU access to the memory is still not allowed but
struct vm_area_struct will be created for the memory address
so ptrace can access the memory via the vma.
Change-Id: I07c69208716c920ccce33e6b494b610b61a0a7c1
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
[ROCm/ROCR-Runtime commit: 25170c3c57]
UCX test cases are reporting uninitialized values when CMA fails. The
application should ideally ignore SizeCopied when the function fails but
it doesn't. This is leading to wrong diagnosis.
v2: Fill in partial SizeCopied in case of failure
Change-Id: I6b7e1c19a8b702ec91ca64201a3dda27bd897877
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
[ROCm/ROCR-Runtime commit: 7de0199e99]
Also update IPC signal API text to allow single process profiling with IPC signals.
Change-Id: I90b246623129d57183acb4ba1789beec360547c3
[ROCm/ROCR-Runtime commit: f59b001c75]
- Add support for R_AMDGPU_RELATIVE64 relocation record.
- Return status error if any unsupported relocation record encountered.
Change-Id: Icbb5dcb81109a70c1f2195412a0df58a11be9da1
[ROCm/ROCR-Runtime commit: d472b24d05]
New CMakeLists.txt sets a default module search so -DCMAKE_MODULE_PATH is
no longer required in the command.
Change-Id: I95189ce2f36016b7c4929239d0e512851bec5ef6
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
[ROCm/ROCR-Runtime commit: 7031a77428]
Latest Thunk requires the user to belong to video group. Add this
statement to README.md to notify external users on Github.
Change-Id: Id9843abf09de5b63a3b7c3f7b322bc9099c6ff1a
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
[ROCm/ROCR-Runtime commit: 8bc83e1e9b]
Temporary measure. Must be reverted once CRAT tables have been fixed.
Change-Id: Id2f2673edbf7b6fc5752f8d871042b4bf4de653c
[ROCm/ROCR-Runtime commit: b49e5b4917]
This change is needed to match other higher level components.
Change-Id: I45114d23f2ed428dfbbb836061b3020c5ab166ec
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
[ROCm/ROCR-Runtime commit: 0f83774635]
This reverts commit dbd9a8736c,
Plus a bug fix to patch "Cleanup fmm.c":
Call id_in_array with correct parameter. The third parameter
of id_in_array is size in byte of the array, not the number
of array items. Call it correctly.
Change-Id: I72d8e2fcc0df32af76c72967386e92c1be18c159
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
[ROCm/ROCR-Runtime commit: 786e470241]
to fmm_allocate_memory_object. This function name was confusingly
similar to fmm_allocate_device and __fmm_allocate_device. The new name
reflects its function better: allocate the VM object and the kernel
mode buffer object.
Change-Id: I6604d228004b4d41e871d4de784786823608b5d6
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 587d4f4bdf]
Create an interface for doorbell signals to reduce code duplication.
No functional changes.
Change-Id: I101a8997dd582ff99e1537758c804b21fe3bb6af
[ROCm/ROCR-Runtime commit: d2e70bb999]
Prefer using memfd_create() for the ring buffer.
We were using /dev/shm, but this won't work on systems that
either don't have /dev/shm or have mounted it with noexec, because
for everything other than gfx700 we map the ring buffer with PROT_EXEC.
memfd_create() is Linux specific and was added in Linux 3.17, so we
will fallback to using /dev/shm on systems where memfd_create() is
not available.
Change-Id: I58fb533eebc362f6d29dc3e316a80801014d50e8
[ROCm/ROCR-Runtime commit: b93ffafdc7]
Corrected semantics used in hsa_queue_load_write_index_relaxed.
The semantics that was used in hsa_queue_load_write_index_relaxed
didn't seem to match the name of the function.
I also removed a useless return keyword.
Change-Id: If3819d38fb367f122fc382edf8ee3771a23279ae
[ROCm/ROCR-Runtime commit: 5872b618de]