Libraries normally don't print messages. We use pr_err, pr_warn,
pr_info, and pr_debug to print messages to stderr when prints are
enabled for debugging.
Change-Id: I9caf719343aa618c88e7b500f9737a46702e424a
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
The existing Thunk calls printf/fprintf in its code, although libraries
normally don't print any messages. This patch introduces a print mechanism
similar to how the Linux kernel prints to the console based on the log
level. The default is not to print any message; setting HSAKMT_DEBUG_LEVEL
enables the prints.
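A minimal sketch of such a level-gated print mechanism, for illustration
only. The numeric level values and the helper names init_debug_level and
debug_print are assumptions, not necessarily what the patch uses:

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Assumed level values, modeled on the kernel's syslog levels. */
enum { DBG_OFF = -1, DBG_ERR = 3, DBG_WARN = 4, DBG_INFO = 6, DBG_DEBUG = 7 };

static int debug_level = DBG_OFF; /* default: print nothing */

/* Read HSAKMT_DEBUG_LEVEL; meant to be called once at library init. */
static void init_debug_level(void)
{
    const char *env = getenv("HSAKMT_DEBUG_LEVEL");
    char *end;
    long value;

    debug_level = DBG_OFF;
    if (!env)
        return;
    value = strtol(env, &end, 10);
    if (end != env) /* at least one digit was parsed */
        debug_level = (int)value;
}

/* Print to stderr only if the message level is enabled.
 * Returns the number of characters written, or 0 when suppressed. */
static int debug_print(int level, const char *fmt, ...)
{
    va_list ap;
    int n;

    if (level > debug_level)
        return 0;
    va_start(ap, fmt);
    n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return n;
}

#define pr_err(...)   debug_print(DBG_ERR, __VA_ARGS__)
#define pr_warn(...)  debug_print(DBG_WARN, __VA_ARGS__)
#define pr_info(...)  debug_print(DBG_INFO, __VA_ARGS__)
#define pr_debug(...) debug_print(DBG_DEBUG, __VA_ARGS__)
```

With this sketch, HSAKMT_DEBUG_LEVEL=4 would enable pr_err and pr_warn
while pr_info and pr_debug stay silent.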
Change-Id: Ic071e122d35a82260218e9914cde4815e69df742
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
For experimental purposes, we need an option to change the compute
capability by forcing the GfxIp version. This patch allows using the
environment variable HSA_OVERRIDE_GFX_VERSION=major.minor.stepping to
replace the default version. For example:
export HSA_OVERRIDE_GFX_VERSION=9.0.1
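A possible way to parse the override, shown as a sketch only. The struct
gfx_version and the function apply_gfx_override are hypothetical names
used for illustration:

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical container for a GfxIp version. */
struct gfx_version {
    unsigned major;
    unsigned minor;
    unsigned stepping;
};

/* Replace *v with the version from HSA_OVERRIDE_GFX_VERSION.
 * Returns 1 if the override was applied, 0 if the variable is unset
 * (*v is left untouched), and -1 if the value is malformed. */
static int apply_gfx_override(struct gfx_version *v)
{
    const char *env = getenv("HSA_OVERRIDE_GFX_VERSION");
    unsigned major, minor, stepping;
    char trailing;

    if (!env)
        return 0;
    /* Exactly three fields must match; %c catches trailing garbage. */
    if (sscanf(env, "%u.%u.%u%c", &major, &minor, &stepping,
               &trailing) != 3)
        return -1;
    v->major = major;
    v->minor = minor;
    v->stepping = stepping;
    return 1;
}
```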
Change-Id: I90cfbd43619d9d3aebf53321d4e058f01bcd7088
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
ctl_stack_copy is allocated with malloc, so it must be released with free.
Change-Id: Ib924da20200d91f52f106fe173464d47862759a8
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
9.0.1 is the XNACK-enabled gfx900 compile target. The compiler must
generate ISA that is XNACK enabled.
Change-Id: Ic4987132ef9f8d06d9e2bcdb8f7eeb875cdd2b44
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Use the build machine's architecture to build the Debian package. Useful
for building on Power8 and ARM64 machines.
Change-Id: I97fc80a6723b139e753019a355f11ced0bba0dd4
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
The KFD implementation has been removed and will not be upstreamed.
This API has been superseded by hsaKmtRegisterGraphicsHandleToNodes.
Change-Id: I5f2d8da3260974618cdb6ea3fdcd77d37b82c9cb
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
For items in HsaQueueInfo, control stack information comes from KFD, CU
mask information is maintained in Thunk, and others (queue detail error
and queue type extended) are ignored (value = 0) at this point.
Change-Id: Ib21370b0f52b2bb4ebe6a9b4b6ec6139cccb25ca
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Use checkpatch.pl to fix the majority of errors. Some remain and will be
excluded:
- Use of typedefs/externs/volatile/sscanf
- Lines over 80 characters
The remaining errors are due to checkpatch misinterpreting the * symbol
with typedefs.
Also use this opportunity to spell manageable properly.
Change-Id: I0b335e9cb3e1eea38bee27eaa1f582b2c9b09b38
Use calloc to allocate event data. Otherwise random data may be filled
in for events that haven't actually signalled. This could trigger the
VM fault handler in the Runtime when no VM fault actually happened and
lead to intermittent HSA conformance test failures.
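A small sketch of the idea. The struct event_data below is a simplified,
hypothetical stand-in for the real event data, used only to show why
zero-initialized allocation matters:

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical, simplified per-event record. */
struct event_data {
    uint64_t event_id;
    uint32_t signalled;   /* nonzero once the event has fired */
    uint32_t fault_flags; /* only meaningful if signalled */
};

/* calloc zero-fills the array, so events that never signal read back
 * as "not signalled, no fault" instead of leftover heap contents.
 * The caller releases the array with free(). */
static struct event_data *alloc_event_data(size_t num_events)
{
    return calloc(num_events, sizeof(struct event_data));
}
```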
Change-Id: Icf702970e73a485b50633703c1b164f87fbb8606
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
This change breaks the ABI, and aligns it with the upstream ABI.
It also fixes some ioctl structures that are not 64-bit safe and
consolidates ioctl numbers.
Change-Id: Ib79944721534bd55a5299c5baf7bb5b3246cccd2
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
This patch adds more non-privileged PMC blocks to GFX9/gfx900 to cover
blocks added in HSA Thunk Spec.
Change-Id: Ia3d953213a32536b2275231149f11ba060791442
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
This patch adds more non-privileged PMC blocks to GFX8 products: gfx801,
gfx803, and gfx803. Most of them have the same counter IDs on the same
block. For certain blocks when the product doesn't have the same counter
IDs, gfx8_xx_ is used to represent the product.
Change-Id: I059913c974bf2eb875fd1cf6f8b0d8c9c9bd7c14
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
HSA Thunk Spec was updated to include more non-privileged blocks for
profiling. This patch adds those newly added non-privileged blocks for
gfx70x.
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Change-Id: Id745ac236c871e8e61a128a2460784f9c9c354b6
Set HSA_CHECK_USERPTR=1 to check user pointers on registration. If a
pointer doesn't point to a valid mapping, the process will segfault.
Change-Id: I459c0902cbc90338517fbf79678871ebfbe5183b
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
KFD added all direct IO links to sysfs, so this patch removes all code
related to direct links and modifies the indirect links function to
reflect the change.
Change-Id: Iaec7b5f6c59f9034f8f960ca1fe1145d51dab367
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Guard pages help catch out-of-bounds memory accesses by applications
by generating VM faults (GPU) and segfaults (CPU).
Remove address space reservation from scratch aperture. That address
space is managed by the Thunk client. Guard pages would cause Thunk's
address space management to get out of sync with the client's.
Change-Id: I2e5aee2923a90186358cc7b0e131baf547996df6
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
- Typo fix: *_link_tye to *_link_type, and add a missing word in comments
- Replace printf with fprintf(stderr, ...)
- Shorten lines to fit in 80 characters
Change-Id: Ibeb0b98d5c59d617ae06d9854a9dde16251ded52
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Many devices have the same counter IDs for a given hardware block. Devices
in the same GFX generation usually have the same block counters, so there
is no need to list each device individually. Instead, use one table shared
by all devices that have the same counter IDs, and separate tables for
devices that don't.
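The table sharing can be pictured roughly like this. All device names,
table names, and counter IDs below are made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>

struct counter_table {
    const uint32_t *ids;
    size_t num_ids;
};

/* Made-up counter IDs; real values come from the hardware spec. */
static const uint32_t shared_ids[] = { 0, 1, 2, 3 };
static const uint32_t special_ids[] = { 0, 1 };

static const struct counter_table shared_table = {
    shared_ids, sizeof(shared_ids) / sizeof(shared_ids[0])
};
static const struct counter_table special_table = {
    special_ids, sizeof(special_ids) / sizeof(special_ids[0])
};

/* Hypothetical devices: A and B share counter IDs, C differs. */
enum example_device { DEV_A, DEV_B, DEV_C, DEV_COUNT };

static const struct counter_table *const device_tables[DEV_COUNT] = {
    [DEV_A] = &shared_table,
    [DEV_B] = &shared_table,
    [DEV_C] = &special_table,
};

/* Look up the counter table for a device, or NULL if out of range. */
static const struct counter_table *counters_for(enum example_device dev)
{
    return (unsigned)dev < DEV_COUNT ? device_tables[dev] : NULL;
}
```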
Change-Id: I857056edc6f491f61af6e9598580e5dc7d372f94
Consolidate the supported device ID lists scattered across topology, queue,
and pmc into one place: topology.
Change-Id: If035cf8e4a6fc6caff6c94ec627647cfb11c3d79
Though the S_IWOTH flag is set in the open() call, the lock file is not
created as writable by others, so others fail to open it with O_RDWR. This
is because the default umask masks off S_IWOTH. This patch temporarily
changes the umask to S_IXOTH, which masks only a permission others don't
need and lets S_IWOTH through. The original umask is restored after the
file is opened.
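A sketch of the umask handling, using only standard POSIX calls. The
function name open_shared_lock_file and the exact mode bits are
assumptions for illustration:

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create (or open) a lock file that any user can open with O_RDWR.
 * Returns the file descriptor, or -1 on failure. */
static int open_shared_lock_file(const char *path)
{
    /* Mask off only S_IXOTH so the S_IWOTH bit below survives. */
    mode_t old_mask = umask(S_IXOTH);
    int fd = open(path, O_CREAT | O_RDWR,
                  S_IRUSR | S_IWUSR | S_IRGRP | S_IWGRP |
                  S_IROTH | S_IWOTH);

    /* Restore the caller's umask whether or not open() succeeded. */
    umask(old_mask);
    return fd;
}
```

Note that umask is process-wide, so in a multi-threaded process another
thread creating files during this window would see the temporary mask.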
Change-Id: I8a239e1566ce0b0b18821913385f239db7c3588e
StartTrace and StopTrace send ioctl requests to enable/disable performance
counters. QueryTrace reads the counter from the perf_event fd.
Change-Id: Ibf79675bc23fcf129371bfd100f8e262121bc684
Unless HSA_USERPTR_FOR_PAGED_MEM is explicitly set, don't use userptr
for all paged memory. This will also allow us to work around some 4.9
issues, and then we can explicitly set HSA_USERPTR_FOR_PAGED_MEM for
all usage once those issues are resolved.
Change-Id: I25ce22b73ae6e93f1567f2318d9d2b47d4a44e69
The control stack memory for CWSR is allocated in the kernel together with
the MQD allocation.
Change-Id: Ib1c0ab9402df3431e9555649394320380d6c6dd8
Signed-off-by: shaoyun.liu <shaoyun.liu@amd.com>
On SOC15 chips, the ABI for the create_queue ioctl is changed to
allow doorbell allocation independent of the queue ID. This is
necessary to accommodate doorbell routing to specific engines in
the BIF.
Change-Id: Ie98d0a758758149dd5fc09ae088afccc29904124
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
On gfx900, all doorbells and SDMA WPTRs need to be 64-bit.
Change-Id: I9b922e16442e967599ae3c928308451d5cc470b3
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Use KFD_IOC_ALLOC_MEM_FLAGS_COHERENT when allocating fine-grained
memory and doorbell BOs so that they will be mapped with MTYPE_UC
on GFX9 hardware.
Change-Id: I51adf45b13105f479e6bcdaf54955b467920ee9a
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
Copied from kernel repository.
Change-Id: I9ed021cfb5b297d9a91dce93ed6355c95fb1127b
Signed-off-by: Felix Kuehling <Felix.Kuehling@amd.com>
Reviewed-by: Shaoyun Liu <Shaoyun.Liu@amd.com>
This is in preparation for gfx900, which uses 64-bit doorbells. We
maintain the same number of doorbells per process by making the
doorbell page size bigger.
KFD will need to implement the same rule.
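The arithmetic behind this, as a sketch. The count of 1024 doorbells per
process is an assumption for illustration, as are the helper names:

```c
#include <stddef.h>

/* Assumed number of doorbells per process (illustrative value). */
#define DOORBELLS_PER_PROCESS 1024u

/* Page size needed to hold every doorbell of the given width. */
static size_t doorbell_page_size(size_t doorbell_size)
{
    return (size_t)DOORBELLS_PER_PROCESS * doorbell_size;
}

/* Byte offset of doorbell number idx within the doorbell page. */
static size_t doorbell_offset(unsigned idx, size_t doorbell_size)
{
    return (size_t)idx * doorbell_size;
}
```

Under that assumption, 32-bit doorbells fit in a 4096-byte page, while
64-bit doorbells need 8192 bytes to keep the same doorbell count.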
Change-Id: I3c4110869b191b83943b5a390a48edfc94d941d8
Existing code uses lockf to ensure exclusive PMC access for one process and
one TraceId. However, the Thunk spec allows hsaKmtPmcAcquireTraceAccess to
get exclusive access to the defined set of counters, not exclusive to one
process or one TraceId. Multiple counter sets with multiple TraceIds are
allowed if they meet the concurrent access limit evaluated by the
hardware/driver.
Change-Id: I59cacb855a707fe326a4070452fcbbd3c95ac223
Existing code assumes all counters sent to hsaKmtPmcRegisterTrace belong
to one PMC block and that this block is SQ. This patch handles cases where
counters are in different blocks, and removes the hard-coded SQ. As a
matter of fact, SQ is non-privileged, so the user shouldn't even use SQ
counters to register/release a trace. This patch also ignores
non-privileged blocks, as the HSA Thunk spec describes.
This patch also records counter information in the trace structure so
AcquireTrace can get the counter information using that TraceId.
Change-Id: Ifa5741050553d4615baab01f7485a9e09435b019
Signed-off-by: Amber Lin <Amber.Lin@amd.com>
Implement two new APIs for cross-memory read and write operations:
- hsaKmtProcessVMRead
- hsaKmtProcessVMWrite
Add the new ioctls necessary for the above APIs.
Change-Id: I0c153e3b4e1f32b7a8b102ad5c774d9ae9bfc2fa
Signed-off-by: Harish Kasiviswanathan <Harish.Kasiviswanathan@amd.com>
events.c and queues.c were accidentally changed to mode 755 by change
fc70f0c30976f4021f7d763bfc10d76a76029553. Change them back.
Change-Id: If51c0b91139afc23e9051cf94c83d61fc20297e6
Signed-off-by: Oak Zeng <Oak.Zeng@amd.com>