Allocate debug area big enough for all XCCs in the partition. Also, fix
the cu_num calculations as driver now reports cu_num as the total number
of CUs in the partition.
Signed-off-by: Mukul Joshi <mukul.joshi@amd.com>
Change-Id: I6e80d57196b770bb3c2506bc58cb366c0046084b
[ROCm/ROCR-Runtime commit: 97a669a979]
Implement hsaKmtExportDMABufHandle, which can be used for a new
upstreamable RDMA solution. It exports a DMABuf handle for an arbitrary
virtual address along with the offset of the address within the
allocation. It also checks that the size of the intended export does
not exceed the allocation.
This uses the new AMDKFD_IOC_EXPORT_DMABUF, which requires KFD ioctl
API version 1.12.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ie5fdb1f73ab3c7fa36c315ce326b1fb89eacc8b6
[ROCm/ROCR-Runtime commit: 332f59eb2a]
Query family id info from drm render node, then
ROCr can query this info directly from Thunk
instead of parsing the info by itself.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Change-Id: I030bd27ab2379fbf87f3d787302c3b8613456278
[ROCm/ROCR-Runtime commit: 66e9e97e0d]
To improve performance on queue preemption, allocate ctx s/r
area in VRAM instead of system memory, and migrate it back
to system memory when VRAM is full.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: If775782027188dbe84b6868260e429373675434c
[ROCm/ROCR-Runtime commit: 37be876cad]
It is to add new option for always keeping gpu mapping
and bump KFD version for the feature of unified save
restore memory.
Signed-off-by: Eric Huang <jinhuieric.huang@amd.com>
Change-Id: Iebee35e6de4d52fa29f82dd19f6bbf5640249492
[ROCm/ROCR-Runtime commit: e1d1a6fbb0]
System Management Interface event is read from anonymous file handle,
this helper wrap the ioctl interface to get anonymous file handle for
GPU nodeid.
Define SMI event IDs, event triggers, copy the same value from
kfd_ioctl.h to avoid translation.
Change-Id: I5c8ba5301473bb3b80bb4e2aa33a9f675bedb001
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 405fbd6f93]
Add KFDGpuID to HsaNodeProperties to return gpu_id to upper layer,
gpu_id is hash ID generated by KFD to distinguish GPUs on the system.
ROCr and ROCProfiler will use gpu_id to analyze SMI event message.
Change-Id: I6eabe6849230e04120674f5bc55e6ea254a532d6
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 88778e13dc]
New API function to report available memory per GPU
Signed-off-by: Daniel Phillips <Daniel.Phillips@amd.com>
Change-Id: I63c1e4ca0020c657977ab3635947ab0ed0a81440
[ROCm/ROCR-Runtime commit: 6da6058d4a]
Import the latest version from the kernel tree.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: If5f998ad55085ebd5020adaa382181204d834e3e
[ROCm/ROCR-Runtime commit: f88aaa933b]
We need to add some more information about the debug features supported
by the platform. We are adding the following:
- debug supported
- dispatch info always valid
- precise memop supported
- watchpoints shared
Change-Id: I68deed98619396d17e28c6e18bad424b58297485
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
[ROCm/ROCR-Runtime commit: 489db9fac6]
Query pointer info return HsaPointerInfo with MemFlags for all pointer
type now.
Change-Id: I3c02b7b71ba0af953035e3ed9cd6bb6435bb9b65
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 991cde0656]
sync with KFD ioctl version 1.6:
1.6 - Query clear flags in SVM get_attr API
Change import export handle args pad field to flags, to pass memory
alloc flags from alloc process to import process.
Change-Id: I69360b244651947e885c4a8da9f64a1163101d20
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: dee9c023a2]
kernel-headers provides the drm/drm.h path, while libdrm-dev[el]
provides the libdrm/drm.h path, which is what we want to use. Fix the
path so we use the newer drm.h header, as well as fixing SLES, which
doesn't provide drm.h in their kernel-headers.
Change-Id: Icb2b6643698d356169e3baeef17527a1b4e05483
[ROCm/ROCR-Runtime commit: 4f3440a8ac]
Add hsaKmtRuntimeEnable and disable.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Change-Id: I083f9293948e975546a1b3c1334cb41499b9ab1f
[ROCm/ROCR-Runtime commit: 1ce548829b]
The debugger and debug agent no longer use the Thunk API.
Remove all deprecated functions and keep commented
references for future KFD tests.
Update and the keep the version checks for future use
and hsaKmtRuntimeEnable/Disable.
Signed-off-by: Jonathan Kim <jonathan.kim@amd.com>
Signed-off-by: Laurent Morichetti <laurent.morichetti@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Change-Id: Ia2f10d82f5ac36d0bd1bda233810f26e8a154d55
[ROCm/ROCR-Runtime commit: 31ac82617c]
Update hsaKmtCreateQueue to initialize the new save area header with the
exception payload and event ID.
Signed-by-off: Jonathan Kim <jonathan.kim@amd.com>
Reviewed-by: Felix Kuehling <felix.kuehling@amd.com>
Reviewed-by: Sean Keely <sean.keely@amd.com>
Change-Id: Icd38062dc982cb29b30644699014eeb0b3e26d00
[ROCm/ROCR-Runtime commit: 96c7a5c9dc]
CoherentHostAccess flag member moved from HSA_MEMORYPROPERTY
to HSA_CAPABILITY struct. Now this is reported to the
topology as a capability of the device instead of a device
memory property.
Change-Id: I48e43e4b4a0635b711b62933734587facdfbf88b
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
[ROCm/ROCR-Runtime commit: f85b428265]
New properties SVMAPISupported added in Thunk spec HSA_CAPABILITY, read
from sysfs from KFD topology.
New local memory property flag CoherentHostAccess added to Thunk
HSA_MEMORYPROPERTY, read from sysfs from KFD topology.
Change-Id: I83933f0e5a61508508168873209dba4af0b77295
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: e8f369b385]
Add function definitions to support SVM (shared virtual memory)
and xnack set.
Change-Id: Ia97ad9d0c449d8d500d799f702e1a58e87d65a56
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: c44a4be776]
Add svm (shared virtual memory) range and xnack mode
APIs.
Change-Id: Ibd8d7fe566dc200730da0c892caa71aad7589ebd
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
[ROCm/ROCR-Runtime commit: ce26348f3a]
The old bit was deprecated, because old buggy user mode depends on it
being always 0. The correct value is now reported in a new bit. New user
mode handles the reported EDC setting correctly, so we can report the
correct value in a new bit.
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Change-Id: Ib5d5ed2519810e650458c6b69c97670dab435ddb
[ROCm/ROCR-Runtime commit: d287c60246]
This reverts commit 07b0758bee.
SVM is not ready yet. This was merged by accident.
Change-Id: I8901594a72e785ba5d25a6448718a570e76fe117
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 41cd7aea2f]
This reverts commit 08e65a397a.
SVM is not ready yet. This was merged by accident.
Change-Id: I1bee102823e7e612be8e8f2e0f50580e8692cc80
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 5edd00136d]
Reserve some space in the context save area for the debugger's
use. There should be 32 bytes per wave for a given queue.
Change-Id: I65ddb6123d0f6afd3149844617ad19023009101d
[ROCm/ROCR-Runtime commit: a83f9b67ce]
Add function definitions to support SVM (shared virtual memory)
and xnack set.
Change-Id: Ia97ad9d0c449d8d500d799f702e1a58e87d65a56
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: a352639df5]
Add svm (shared virtual memory) range and xnack mode
APIs.
Change-Id: Ibd8d7fe566dc200730da0c892caa71aad7589ebd
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
[ROCm/ROCR-Runtime commit: 5ae49f2321]
It is to provide an option to map specific memory as
uncached on A+A HW platform.
Signed-off-by: Eric Huang <JinhuiEric.Huang@amd.com>
Change-Id: Ib665cb306a0e78aba3ea5ee2f0e46cb62ae139f8
[ROCm/ROCR-Runtime commit: 2464bfc714]
Reserve some space in the context save area for the debugger's
use. There should be 32 bytes per wave for a given queue.
Change-Id: I65ddb6123d0f6afd3149844617ad19023009101d
[ROCm/ROCR-Runtime commit: 2ed2e46b9b]
This reverts commit c0a0ada18b.
Reason for revert: Change was submitted by accident
Change-Id: If05c705e22296fd3ca789f269737d379a933361d
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: fec3780c1a]
Signed-off-by: Gang Ba <gaba@amd.com>
Change-Id: I5c23a8dacf9bc50c740908aabe391432f2c7112e
Signed-off-by: Gang Ba <gaba@amd.com>
[ROCm/ROCR-Runtime commit: d675d1cce1]
KFD now passes the ASIC revision to user level through some bits
in the HSA topology's capability field. Some user-level software
wants this because different ASIC revisions may require user-level
software to do different things (e.g. patch code for things that
are changed in later hardware revisions).
Change-Id: I16f2a15ae0875edd01ebdb1f1685ec7865f7049e
[ROCm/ROCR-Runtime commit: 5ddd8fb68b]
Update the non-upstream ioctl numbers to align with the change in the
kernel.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Change-Id: Ie0ddccb343a023b55eb18477c59341acaa666e99
[ROCm/ROCR-Runtime commit: a37a88ddcb]
Read device unique id from sysfs and expose it in HsaNodeProperties.
For devices not supported the value will be 0
Signed-off-by: Divya Shikre <DivyaUday.Shikre@amd.com>
Change-Id: I97b8689dfa090971c6876de6feaa97652e28c03d
[ROCm/ROCR-Runtime commit: ebe7de1f99]
IPC memory was previously returned as HSA_POINTER_ALLOCATED and
had garbage in the node_id field. Due to ROCR_VISIBLE_DEVICES
we need to be able to distinguish between imported memory and
regular memory because imported memory may not be owned by an
agent that is visible in the process. Differentiating these flags
allows the users to expect null agent for the owning agent.
Fixes
Change-Id: Ide3489cec1ee2072dc9697fa5cb71ddb17999d14
[ROCm/ROCR-Runtime commit: 6957202df8]
NumCpQueues and NumSdmaQueuesPerEngine should be got by kfd driver not hardcode.
So add two data fields in HsaNodeProperties then thunk is able to get it from sysfs
that exposed by kfd.
v2: change NumCpQueues/NumSdmaQueuesPerEngine to one byte.
v3: merge two commits as one to avoid ABI update two times.
Change-Id: Ie386e4685f13493e22db6e207a399db6a4c5b9dc
Signed-off-by: Huang Rui <ray.huang@amd.com>
[ROCm/ROCR-Runtime commit: 06464b917d]
adds api and test to get newly create queue snapshot per ptraced process.
Change-Id: Ife97123a5b930e837ccaa386801145ef23c2cc2c
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
[ROCm/ROCR-Runtime commit: 8b01a1c4c5]
The debug trap accesses the data0/data1 registers, so we do not
want the userspace to write values to it. We remove the calls to
set the data0/data1 register values.
Change-Id: Iaba842a4c445f339f16a39fe1994526ff78a2f3c
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
[ROCm/ROCR-Runtime commit: 6933540c81]
To support adding new features to the kfd debugger, and not break
functionality, we need to be able to check the kfd debugger support
version info from the kernel.
Change-Id: Icd88e4edab8430c35eaed588e62d892c1b5c62ec
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
[ROCm/ROCR-Runtime commit: dbbd189b33]
To check the KFD debugger API support, we need to be able to check
the major/minor version of the kfd debugger version, so we need to
expose this function from the kernel.
Change-Id: I8a3dc617607e2efa9e65306d08b8583b8b1a2172
Signed-off-by: Philip Cox <Philip.Cox@amd.com>
[ROCm/ROCR-Runtime commit: 35d56297d3]
on NUMA system, node 0 may have no memory, application pass node id
0 to hsaKmtAllocMemory will fail because mbind to specify the allocation
from node 0 return EINVAL.
Add new flag NoNUMABind for application to pass it to hsaKmtAllocMemory
to skip mbind.
hsaKmtCreateEvent and hsaKmtCreateQueue specify the new flag NoNUMABind
to allocate system memory for event page and CWSR area, don't bind the
system memory to a specific NUMA node.
Change-Id: I854e5a57502c7807c4c5ff2e441d499ae515c309
Signed-off-by: Philip Yang <Philip.Yang@amd.com>
[ROCm/ROCR-Runtime commit: 42392f093f]
PCI domain has moved to 32-bits to accommodate virtualization,
so a 32-bit integer is exposed for domain to reflect this change.
Change-Id: I0d767acadcdc8e4277db203b5865dd67dd001cef
Signed-off-by: Ori Messinger <ori.messinger@amd.com>
[ROCm/ROCR-Runtime commit: f2173254e4]
enable thunk query api to report if queue is newly created
test new queue bit test on clear events.
also fixup cleanup to disable debug trap.
Change-Id: I3ebe2d85da66f28b8c82f0e68461ee7d32ec0b0d
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Reviewed-by: Philip Cox <Philip.Cox@amd.com>
Reviewed-by: Felix Kuehling <Felix.Kuehling@amd.com>
[ROCm/ROCR-Runtime commit: 1ff5cb33b2]
Add data out for enable trap to return poll fd to user space.
Add query debug events interface.
Change-Id: Ia4afde1cf167e6aa61d502380a8b329ee89d5f44
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
[ROCm/ROCR-Runtime commit: 836dfd0752]
The NodeId parameter is redundant and can be retrieved
from QueueId parameter.
Change-Id: I12853849b868b304bd27633fa7653ba644d69026
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>
[ROCm/ROCR-Runtime commit: f40f166e20]