Default is OFF to conform to latest cmake standard (3.15) and
because this feature can cause some confusion for unaware developers.
Change-Id: I00f9ec2185c27d2f6a8d2c7f294512a268a4e3f5
Add namespacing to elf find module.
Stop using CMAKE_CXX_FLAGS and start using target properties for this.
Ideally we should remove the actual option strings and replace with cmake
compiler properties or compile features.
Change-Id: I57756387b3bd3c565c99a35fed4b37fe1a2d0556
Adds support for find_package(), locates dependencies with
find_package(), swaps the roles of /hsa/include/hsa and /include/hsa
as well as /lib & /hsa/lib.
Kernel code objects no longer build at every make call but only
as needed. Dependencies are tracked through to clang.
Device lib is still located with directory searches. build_devicelibs.sh
does not yet install the cmake config files on the build systems.
Corrects DAZ mode mismatch in code object compilation.
Still needs updating to compiler properties rather than direct
manipulation of CMAKE_CXX_FLAGS.
Change-Id: I02d946c8a77d5cf753681f8e3d3153fca4aae86a
Save and restore exec_lo/exec_hi around the mGetDoorbellId macro in
the signal_error case just like we do in the signal_debugger case.
Also reset the wave_id (ttmp4/ttmp5) to 0 since it isn't preserved.
0 will be detected as a new wave by the debugger api library.
Change-Id: I5123caa9431154ec1584bae85e42648c97c64c37
Initialize all the fields in the HSA queue object to known values
before calling the thunk to create the KFD queue. This ensures that
when the debugger detects that a KFD queue is created it can access
the values it requires. These include the aperture addresses, the
queue scratch memory base, and the HSA queue kind.
Change-Id: Ic985755b0402c6794d5987e60aff50d223f09eb9
- Check address is in the range of the mapped file.
- Correct calculation of offset within the file.
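
As a rough illustration of the two checks above, here is a minimal C++
sketch. MappedFile and FileOffsetOf are hypothetical names invented for
this example and are not the runtime's actual code; the sketch assumes a
single contiguous mapping whose first byte corresponds to a known file
offset.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of one file-backed mapping (illustration only).
struct MappedFile {
  std::uintptr_t start;       // first mapped address
  std::size_t length;         // number of mapped bytes
  std::uint64_t file_offset;  // file offset that corresponds to `start`
};

// Writes the file offset backing `addr` to `*out` and returns true, or
// returns false when `addr` lies outside the mapped range.
inline bool FileOffsetOf(const MappedFile& m, std::uintptr_t addr,
                         std::uint64_t* out) {
  // Range check: subtracting first avoids overflow in start + length.
  if (addr < m.start || addr - m.start >= m.length) return false;
  // Offset within the file = mapping's file offset + distance into the mapping.
  *out = m.file_offset + (addr - m.start);
  return true;
}
```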
Change-Id: I848a3ead4422698c2ef1c140bc8ae5e000a717f7
Set ttmp11[8] and send a signal for the debugger when the handler
is entered because of an address watch exception.
Change-Id: Icc83a79027bb7ca1e50e19e2f00464cb9ca862f3
Adds the following:
- New factory method to create a code object reader from
file with offset and size.
- A pair of queries on a loaded code object to get the URI name/length.
- A bump to the AMD vendor loader extension API and its associated table.
Change-Id: I17c83e9c2447d29a43c438459395365f786a3611
Zero size pools have no NUMA bindings. Selecting a pool with NUMA
bindings should prevent thrashing due to the NUMA balancing daemon.
Change-Id: Ib0082cb9af66e24e07a2adbb83c1045145d51403
Memory from the suballocator may be exported via IPC. If this
happens then the allocating process should not reuse that memory
since it would still be connected to the remote process. IPC exported
memory must be released back to the driver.
Change-Id: I2ab0c814f63191f753fc3640cc4140ee144bf07f
All types which could be generated from a fragment need to take this branch.
Taking the branch is correct for all types; it was a performance optimization
only and was missing IPC. Branch removal simplifies updates for any future
fragment use and will allow CQE to report any performance issues that might
require bringing the branch back.
Change-Id: I8041788c422e880b764e144eb1877f5126ba76f3
Thunk may report nullptr for host base if the host does not have
access. Use agent base in this case.
Change-Id: I44883d35a3fff0941b1e3037d16b059591a6c511
The 1st level trap handler jumps to the 2nd level trap handler on
context save requests or regular traps if (mode.debug_en
&& !status.halt) is true.
If we return from the 2nd level trap handler without status.halt=1, then
we need to make sure mode.debug_en is cleared, or we will re-enter the
2nd level trap handler again and again when trapsts.savectx is set.
Change-Id: I4db6369de8c91a32842f488a4df5c9d94fa65aa9
Missing attribute type. Also remove dangling word in
HSA_AMD_AGENT_INFO_MEMORY_WIDTH.
No code change, documentation only.
Change-Id: I1d0efdb721eaa0e2fb0bdb21f8d5e034beaf8857
In the case where SQ_WAVE_TRAPSTS_XNACK_ERROR_MASK is set, we also need
to set the TTMP11_EXCP_RAISED_BIT in ttmp11. If we don't, the debugger
may think that the wave is halted at launch (halted without events).
Change-Id: I8c19605bbfc145275728de4ad1979d3ba8bb478a
Mesa address lib faults if the only acceptable swizzle modes are
forbidden. The old address lib simply ignored the forbidden list
in this case. Mesa addrlib will not select linear unless there
is no other option so allowing linear mode for tiled images will
still use tiled modes when possible.
Change-Id: I1aa44d072db902c968484dbff67b482af03b45d9
- Correct definition of HSA_QUEUE_TYPE_COOPERATIVE to be a queue type
and not a bit mask.
- Correct implementation of hsa_queue_type_t to treat it as an
enumeration type and not a bit mask. In particular
HSA_QUEUE_TYPE_COOPERATIVE is a distinct queue type that uses the
multi producer protocol, and is not a bit set value.
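
The distinction can be sketched in C++ as follows. QueueType is a
hypothetical stand-in for hsa_queue_type_t, and its enumerator names and
numeric values are assumptions made for this example only. The point is
that each queue type is matched by equality (here via a switch), never by
testing individual bits.

```cpp
#include <cassert>

// Hypothetical stand-in for hsa_queue_type_t; names and values are
// illustrative assumptions, not copied from hsa.h.
enum QueueType : unsigned {
  kQueueTypeMulti = 0,
  kQueueTypeSingle = 1,
  kQueueTypeCooperative = 2,
};

// Enumeration handling: every queue type is a distinct value. Per the
// description above, the cooperative type uses the multi producer protocol.
inline bool UsesMultiProducerProtocol(QueueType type) {
  switch (type) {
    case kQueueTypeMulti:
    case kQueueTypeCooperative:
      return true;
    case kQueueTypeSingle:
      return false;
  }
  return false;  // unknown value
}
```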
Change-Id: I9415be8853671e5511e16e306caf16020e8c84af
There are a couple of problems with this. First, llvm-dis is an unstable
llvm development tool and 3rd party users should generally not rely on
it. The text format is unstable, and the regex here isn't even
explicitly looking for the target triple field, so it could
accidentally find something else. Second, picking the target to
compile based on the library you are linking is a fundamentally
backwards decision. The target you're compiling for changes the
library you would want to link. The device libraries are only ever
compiled with amdgcn-amd-amdhsa. If we had a second triple, this
should be explicitly building for any it cares about.
Change-Id: I3bae8398f60f78df61ab2177aa9e83f47ec6dea4
The ROCR trap handler should check for all end program instructions
and not halt on them. Mask off the imm16 before comparing the
instruction to the s_endpgm opcode.
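
The masked comparison might look like the C++ sketch below. The actual
handler is written in shader assembly; this only illustrates the logic.
The constants are assumptions: a 32-bit SOPP instruction word with imm16
in bits [15:0], and 0xBF810000 as the s_endpgm encoding with imm16 == 0.
IsEndProgram is a hypothetical helper name.

```cpp
#include <cassert>
#include <cstdint>

// Assumed SOPP layout: bits [15:0] carry imm16, the rest carry the
// encoding and opcode.
constexpr std::uint32_t kImm16Mask = 0x0000FFFFu;
// Assumed encoding of s_endpgm with imm16 == 0.
constexpr std::uint32_t kEndpgmOpcode = 0xBF810000u;

// True for any s_endpgm, regardless of the value in imm16.
inline bool IsEndProgram(std::uint32_t inst) {
  return (inst & ~kImm16Mask) == kEndpgmOpcode;
}
```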
Change-Id: I669ffc7f5b699d7daf0c8ec5761ed7bb193f07a7
New trap handler ABI: Record in ttmp11[8:7] the event that caused the
trap handler to be entered. We currently record 2 events, trap_raised
if an s_trap instruction was executed, or excp_raised if an exception
(MEM_VIOL or ILLEGAL_INST) was raised.
Change-Id: Ie278c8277437b3b67c2737dcd1a12fe6511df428
Remove the hard-coding of "SHARED" as the lib type, and move any
SO-specific linking to only happen if the .so exists in the first place.
Change-Id: I3f0bfd5c03f19b2425423b4dc8eed8fd87acc1d6
Changes in the compiler are being made to add controls for XNACK and SRAM ECC
for all targets which can support these features. By default the conservatively
correct settings of XNACK on and SRAM ECC on will be used. This change is to
facilitate these backend updates.
Change-Id: I2fd6b6bc1d32937737e7f56d8e08c70fe781c745
IPC create must only be used on whole ROCr allocations.
Fragments were allowing handle creation with offsets.
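
A minimal C++ sketch of such a whole-allocation check, assuming the
runtime can look up the base address and size of the allocation that
contains a pointer. Allocation and IsWholeAllocation are hypothetical
names for illustration, not ROCr's actual code.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record of one allocation (illustration only).
struct Allocation {
  const void* base;  // start of the allocation
  std::size_t size;  // size of the allocation in bytes
};

// An IPC handle may be created only when the request covers the whole
// allocation: it must start at the base (no offset) and span the full size.
inline bool IsWholeAllocation(const Allocation& a, const void* ptr,
                              std::size_t len) {
  return ptr == a.base && len == a.size;
}
```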
Change-Id: I1faa96d36bc7a6199bdc2e3ff1b8871d1a36a2fa
This has been the default mode for a while now since we don't
distribute or build the finalizer. Removing the attempt cleans
up debug mode messages that are causing confusion.
Change-Id: I8162c95abd5bbedaa22b90191f7a384a34c388ae
Lock API succeeds but the GPU still faults on the address.
This should be fixed in Thunk and/or KFD as well.
Change-Id: I8b2fbcae61ab181e4fe7f0b64e43a5f0772efb24
Iterate the loaded shared objects to see if the given elf image binary
is part of a loaded segment.
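
One way to perform this kind of lookup on Linux is glibc's
dl_iterate_phdr, sketched below in C++. Whether the runtime uses this
exact call is an assumption; IsInLoadedSegment, Query, and Visit are
hypothetical names for this example. The sketch tests a single address
against the PT_LOAD segments of every loaded object.

```cpp
#include <link.h>  // dl_iterate_phdr, dl_phdr_info, ElfW (glibc)

#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical query state passed through dl_iterate_phdr.
struct Query {
  std::uintptr_t addr;
  bool found;
};

// Called once per loaded object; checks each PT_LOAD segment.
static int Visit(dl_phdr_info* info, std::size_t /*size*/, void* data) {
  Query* q = static_cast<Query*>(data);
  for (int i = 0; i < info->dlpi_phnum; ++i) {
    const ElfW(Phdr)& ph = info->dlpi_phdr[i];
    if (ph.p_type != PT_LOAD) continue;
    // Segment address in memory = object load base + segment vaddr.
    std::uintptr_t begin = info->dlpi_addr + ph.p_vaddr;
    if (q->addr >= begin && q->addr - begin < ph.p_memsz) {
      q->found = true;
      return 1;  // a non-zero return stops the iteration
    }
  }
  return 0;  // keep iterating
}

// True when `p` points into a loaded segment of any loaded object.
bool IsInLoadedSegment(const void* p) {
  Query q{reinterpret_cast<std::uintptr_t>(p), false};
  dl_iterate_phdr(Visit, &q);
  return q.found;
}
```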
Change-Id: I074cacd99eb5b59f883f4ce2bd901e0e35a660b8
- Update the documentation comment in hsa_ext_amd.h, which contained
contradictory and incorrect information about an argument to the
hsa_amd_agents_allow_access function.
Change-Id: I60b0dbbdc761078cd81906bc2c63a27d7e6b53e1