This includes the changes provided by Konstantin, "Add xnack from elf header" (Change 136389).
Change-Id: I95e51141caa0d7c21903b09212c02e4906ec54a3
[ROCm/ROCR-Runtime commit: 8e3d26c617]
Move opening of DRM render nodes from topology to FMM aperture
initialization. Keep the same FDs open for the lifetime of the
process to match how KFD uses the VMs in the FDs. Call the acquire_vm
ioctl during aperture initialization to let KFD use the VMs from
the render nodes.
Change-Id: Ie07d57788cbe685b1841cccc00820c12894a0356
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 8ac2150e81]
This adds the new acquire_vm ioctl.
Change-Id: Ia6794bfd291706cecdb2d06f4902b324b48577df
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 19dacdecd3]
Query GPUVM aperture limits of all dGPUs to determine SVM aperture
base and limit. This depends on a recent KFD change that reports
the GPUVM aperture limits for dGPUs in the
AMDKFD_IOC_GET_PROCESS_APERTURES_NEW ioctl (drm/amdkfd: Simplify
dGPU SVM aperture handling).
Only initialize SVM aperture once, instead of once per GPU.
Don't call AMDKFD_IOC_SET_PROCESS_DGPU_APERTURE. It's not needed any
more and will not be upstreamed.
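
A minimal sketch of deriving one SVM range from the per-GPU GPUVM limits, computed once rather than once per GPU. It assumes the SVM aperture is the intersection of all dGPU apertures (highest base, lowest limit); that rule, the struct gpuvm_aperture type, and the function name are illustrative assumptions, not the Thunk's actual code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-GPU record; the real values come from the
 * AMDKFD_IOC_GET_PROCESS_APERTURES_NEW ioctl. */
struct gpuvm_aperture {
    uint64_t base;
    uint64_t limit;
};

/*
 * Compute a single SVM range usable by every dGPU: the highest base
 * and the lowest limit. Returns false if there are no GPUs or the
 * apertures do not overlap.
 */
bool svm_range_from_apertures(const struct gpuvm_aperture *aps, size_t n,
                              uint64_t *svm_base, uint64_t *svm_limit)
{
    assert(svm_base != NULL && svm_limit != NULL);

    if (n == 0)
        return false;

    uint64_t base = 0;
    uint64_t limit = UINT64_MAX;

    for (size_t i = 0; i < n; i++) {
        if (aps[i].base > base)
            base = aps[i].base;
        if (aps[i].limit < limit)
            limit = aps[i].limit;
    }

    if (base > limit)
        return false;

    *svm_base = base;
    *svm_limit = limit;
    return true;
}
```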
Change-Id: Ib3389e8ba18505ba15fc33f45fe8a57e690a565d
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 85e1a9bf5e]
Define dgpu_mem_init before it's used and keep the code close to the
rest of the aperture initialization code.
Change-Id: I14ad11a364524a15affee9186b1298ba7d56d2c9
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: c5cfb7e25b]
The kfdtest hsaKmtOpenKFD test failed after 1019 loops when run with
--gtest_loop=-1, because the default limit on open file handles is 1024.
lsof output showed that the shmem file handle was not being closed.
Change-Id: I474de2bae6c03e879a219dedf5f18639118b73e5
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 105291849f]
On discrete GPUs, place the EOP queue in VRAM. The CP reads and writes
this queue, and its size is small. Dispatch latency improves through
lower read latency in the AQL completion phase.
Change-Id: Id8351dcddbd21fd7c7d699803c96434c9132db71
Signed-off-by: Jay Cornwall <Jay.Cornwall@amd.com>
[ROCm/ROCR-Runtime commit: e2c353dc0d]
Invisible device memory is mmapped as PROT_NONE.
Normal CPU access to the memory is still not allowed, but a
struct vm_area_struct is created for the address range,
so ptrace can access the memory via the VMA.
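
A minimal sketch of a PROT_NONE mapping. It uses anonymous memory as a stand-in for the device memory mapping, which in the Thunk goes through a device file descriptor; the helper name map_no_access is hypothetical.

```c
#define _DEFAULT_SOURCE /* exposes MAP_ANONYMOUS under strict -std modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Reserve 'size' bytes with no CPU access rights. The kernel still
 * creates a VMA for the range, so the address range is known to the
 * process even though loads and stores to it would fault.
 * Returns NULL on failure.
 */
void *map_no_access(size_t size)
{
    assert(size > 0);

    void *p = mmap(NULL, size, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

    return p == MAP_FAILED ? NULL : p;
}
```

The resulting range appears in /proc/self/maps with permissions `---p`, and should be released with munmap when no longer needed.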
Change-Id: I07c69208716c920ccce33e6b494b610b61a0a7c1
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
[ROCm/ROCR-Runtime commit: 25170c3c57]
UCX test cases are reporting uninitialized values when CMA fails. The
application should ideally ignore SizeCopied when the function fails, but
it doesn't. This leads to misdiagnosis.
v2: Fill in partial SizeCopied in case of failure
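
A minimal sketch of the pattern: the output size is written on every path, so a caller that reads it after a failure sees the partial count rather than an uninitialized value. struct copy_range and copy_ranges are hypothetical stand-ins, not the CMA interface itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical range descriptor standing in for the real CMA arguments. */
struct copy_range {
    void *dst;
    const void *src; /* NULL simulates a range that cannot be copied */
    size_t size;
};

/*
 * Copy the ranges in order. *size_copied is always defined on return:
 * it starts at 0 and holds the bytes copied so far, even when the
 * function returns an error part-way through.
 */
int copy_ranges(const struct copy_range *r, size_t n, size_t *size_copied)
{
    assert(size_copied != NULL);

    *size_copied = 0;

    for (size_t i = 0; i < n; i++) {
        if (r[i].src == NULL || r[i].dst == NULL)
            return -1; /* partial count is already stored */
        memcpy(r[i].dst, r[i].src, r[i].size);
        *size_copied += r[i].size;
    }

    return 0;
}
```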
Change-Id: I6b7e1c19a8b702ec91ca64201a3dda27bd897877
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
[ROCm/ROCR-Runtime commit: 7de0199e99]
Also update IPC signal API text to allow single process profiling with IPC signals.
Change-Id: I90b246623129d57183acb4ba1789beec360547c3
[ROCm/ROCR-Runtime commit: f59b001c75]
- Add support for R_AMDGPU_RELATIVE64 relocation record.
- Return status error if any unsupported relocation record encountered.
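
A minimal sketch of both points above, assuming the relocation is computed as B + A (load base plus addend) and stored as a 64-bit word, as described in the LLVM AMDGPU ABI documentation. The constant value 13, the function apply_reloc, and the status enum are assumptions for illustration; this is not the loader's actual code.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumed value; check it against the ELF headers you build with. */
#define R_AMDGPU_RELATIVE64 13

enum reloc_status { RELOC_OK = 0, RELOC_UNSUPPORTED = -1 };

/*
 * Apply one relocation record at 'place'.
 * load_base is the address the object was loaded at (B),
 * addend is the record's addend (A).
 */
int apply_reloc(uint32_t type, void *place, uint64_t load_base, int64_t addend)
{
    assert(place != NULL);

    switch (type) {
    case R_AMDGPU_RELATIVE64: {
        uint64_t value = load_base + (uint64_t)addend; /* B + A */
        memcpy(place, &value, sizeof(value)); /* safe for unaligned targets */
        return RELOC_OK;
    }
    default:
        /* Unknown record: report it instead of silently skipping. */
        return RELOC_UNSUPPORTED;
    }
}
```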
Change-Id: Icbb5dcb81109a70c1f2195412a0df58a11be9da1
[ROCm/ROCR-Runtime commit: d472b24d05]
The new CMakeLists.txt sets a default module search path, so
-DCMAKE_MODULE_PATH is no longer required on the command line.
Change-Id: I95189ce2f36016b7c4929239d0e512851bec5ef6
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
[ROCm/ROCR-Runtime commit: 7031a77428]
The latest Thunk requires the user to belong to the video group. Add this
statement to README.md to notify external users on GitHub.
Change-Id: Id9843abf09de5b63a3b7c3f7b322bc9099c6ff1a
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
[ROCm/ROCR-Runtime commit: 8bc83e1e9b]
Temporary measure. Must be reverted once CRAT tables have been fixed.
Change-Id: Id2f2673edbf7b6fc5752f8d871042b4bf4de653c
[ROCm/ROCR-Runtime commit: b49e5b4917]
This change is needed to match other higher level components.
Change-Id: I45114d23f2ed428dfbbb836061b3020c5ab166ec
Signed-off-by: Yong Zhao <yong.zhao@amd.com>
[ROCm/ROCR-Runtime commit: 0f83774635]
This reverts commit dbd9a8736c,
plus a bug fix to the patch "Cleanup fmm.c":
Call id_in_array with the correct parameter. The third parameter
of id_in_array is the size of the array in bytes, not the number
of array items.
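
A minimal sketch of why the distinction matters. This is an illustrative re-implementation written for this note, not the function from fmm.c; only the convention that the third parameter is a byte size is taken from the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Return true if 'id' occurs in 'ids'.
 * ids_size_bytes is the size of the array in BYTES, so the element
 * count is derived by dividing by the element size.
 */
bool id_in_array(uint32_t id, const uint32_t *ids, size_t ids_size_bytes)
{
    assert(ids != NULL || ids_size_bytes == 0);

    size_t count = ids_size_bytes / sizeof(uint32_t);

    for (size_t i = 0; i < count; i++) {
        if (ids[i] == id)
            return true;
    }
    return false;
}
```

With a four-element array, passing sizeof(ids) scans all four entries, while passing the item count 4 is read as 4 bytes and scans only the first entry.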
Change-Id: I72d8e2fcc0df32af76c72967386e92c1be18c159
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
[ROCm/ROCR-Runtime commit: 786e470241]
to fmm_allocate_memory_object. This function name was confusingly
similar to fmm_allocate_device and __fmm_allocate_device. The new name
reflects its function better: allocate the VM object and the kernel
mode buffer object.
Change-Id: I6604d228004b4d41e871d4de784786823608b5d6
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 587d4f4bdf]
Create an interface for doorbell signals to reduce code duplication.
No functional changes.
Change-Id: I101a8997dd582ff99e1537758c804b21fe3bb6af
[ROCm/ROCR-Runtime commit: d2e70bb999]