Add namespacing to elf find module.
Stop using CMAKE_CXX_FLAGS and start using target properties for this.
Ideally we should remove the actual option strings and replace them with
CMake compiler properties or compile features.
Change-Id: I57756387b3bd3c565c99a35fed4b37fe1a2d0556
[ROCm/ROCR-Runtime commit: 9ff0268f4c]
Adds support for find_package(), locates dependencies with
find_package(), swaps the roles of /hsa/include/hsa and /include/hsa
as well as /lib & /hsa/lib.
Kernel code objects no longer build at every make call but only
as needed. Dependencies are tracked through to clang.
Device lib is still located with directory searches. build_devicelibs.sh
does not yet install the cmake config files on the build systems.
Corrects DAZ mode mismatch in code object compilation.
Still needs updating to compiler properties rather than direct
manipulation of CMAKE_CXX_FLAGS.
Change-Id: I02d946c8a77d5cf753681f8e3d3153fca4aae86a
[ROCm/ROCR-Runtime commit: 55a4f01b16]
Save and restore exec_lo/exec_hi around the mGetDoorbellId macro in
the signal_error case just like we do in the signal_debugger case.
Also reset the wave_id (ttmp4/ttmp5) to 0 since it isn't preserved.
0 will be detected as a new wave by the debugger api library.
Change-Id: I5123caa9431154ec1584bae85e42648c97c64c37
[ROCm/ROCR-Runtime commit: db6a781f0c]
Create symlink directory before attempting to create the symlink.
Change-Id: Ic4d07052e5bfc32280c7d71e58784cbba3536e2a
[ROCm/ROCR-Runtime commit: d163fac13d]
Initialize all the fields in the HSA queue object to known values
before calling the thunk to create the KFD queue. This ensures that
when the debugger detects that a KFD queue is created it can access
the values it requires. The values it requires include the aperture
addresses, queue scratch memory base, and the HSA queue kind.
Change-Id: Ic985755b0402c6794d5987e60aff50d223f09eb9
[ROCm/ROCR-Runtime commit: a74660c69a]
- Check address is in the range of the mapped file.
- Correct calculation of offset within the file.
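
A minimal sketch of the two checks above, not the actual ROCR code. The
MappedFile struct and FileOffsetOf name are hypothetical and only illustrate
a range test followed by an offset computation relative to the mapping.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical description of a file mapping (not the ROCR data structure).
struct MappedFile {
  std::uintptr_t base;    // address where the mapping starts
  std::size_t length;     // number of bytes mapped
  std::size_t file_start; // file offset that corresponds to `base`
};

// Writes the file offset for `addr` to `*out` and returns true, or returns
// false if `addr` lies outside the mapped range.
bool FileOffsetOf(const MappedFile& m, std::uintptr_t addr, std::size_t* out) {
  assert(out != nullptr);
  // Range check first: the subtraction below is only valid when addr >= base.
  if (addr < m.base || addr - m.base >= m.length) return false;
  // Offset within the file = offset of the mapping + offset inside the mapping.
  *out = m.file_start + (addr - m.base);
  return true;
}
```

Doing the range check before the subtraction avoids unsigned wraparound for
addresses below the mapping base.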
Change-Id: I848a3ead4422698c2ef1c140bc8ae5e000a717f7
[ROCm/ROCR-Runtime commit: 5f614c31f5]
Set ttmp11[8] and send a signal for the debugger when the handler
is entered because of an address watch exception.
Change-Id: Icc83a79027bb7ca1e50e19e2f00464cb9ca862f3
[ROCm/ROCR-Runtime commit: da6d892058]
Adds the following:
- New factory method to create a code object reader from
file with offset and size.
- A pair of queries on a loaded code object to get the URI name/length.
- A bump to the AMD vendor loader extension API and its associated table.
Change-Id: I17c83e9c2447d29a43c438459395365f786a3611
[ROCm/ROCR-Runtime commit: 9eb735ec24]
Zero size pools have no numa bindings. Selecting a pool with numa
bindings should prevent thrashing due to numa balancing daemon.
Change-Id: Ib0082cb9af66e24e07a2adbb83c1045145d51403
[ROCm/ROCR-Runtime commit: 32bb10086d]
This has nothing to do with registering agents.
Moved to Runtime::Load.
Change-Id: I0f84c9d8f5a68d458717111113f02af56c92f4f6
[ROCm/ROCR-Runtime commit: 40d1931209]
Memory from the suballocator may be exported via IPC. If this
happens then the allocating process should not reuse that memory
since it would still be connected to the remote process. IPC exported
memory must be released back to the driver.
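
A rough sketch of the rule above, assuming a per-block flag that records IPC
export. The Block and SubAllocatorSketch names are hypothetical and the
driver release is only counted here, not performed.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical bookkeeping for one suballocated block.
struct Block {
  void* ptr;
  bool ipc_exported;  // set when a handle to this memory was shared via IPC
};

class SubAllocatorSketch {
 public:
  // Exported blocks bypass the reuse list and go back to the driver;
  // everything else remains available for later allocations.
  void Free(const Block& b) {
    if (b.ipc_exported) {
      ++released_to_driver_;  // stand-in for the real driver release call
    } else {
      free_list_.push_back(b.ptr);
    }
  }

  std::size_t reusable_count() const { return free_list_.size(); }
  std::size_t released_count() const { return released_to_driver_; }

 private:
  std::vector<void*> free_list_;
  std::size_t released_to_driver_ = 0;
};
```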
Change-Id: I2ab0c814f63191f753fc3640cc4140ee144bf07f
[ROCm/ROCR-Runtime commit: 29b660c91e]
All types which could be generated from a fragment need to take this branch.
Taking the branch is correct for all types; it was a performance optimization
only and was missing IPC. Removing the branch simplifies updates for any future
fragment use and will allow CQE to report any performance issues that might
require bringing the branch back.
Change-Id: I8041788c422e880b764e144eb1877f5126ba76f3
[ROCm/ROCR-Runtime commit: 09ebc21d13]
Thunk may report nullptr for host base if the host does not have
access. Use agent base in this case.
Change-Id: I44883d35a3fff0941b1e3037d16b059591a6c511
[ROCm/ROCR-Runtime commit: 397608e2c0]
The 1st level trap handler jumps to the 2nd level trap handler on
context save requests or regular traps if (mode.debug_en
&& !status.halt) is true.
If we return from the 2nd level trap handler without status.halt=1, then
we need to make sure mode.debug_en is cleared, or we will re-enter the
2nd level trap handler again and again when trapsts.savectx is set.
Change-Id: I4db6369de8c91a32842f488a4df5c9d94fa65aa9
[ROCm/ROCR-Runtime commit: 584ef1e1ca]
Missing attribute type. Also remove dangling word in
HSA_AMD_AGENT_INFO_MEMORY_WIDTH.
No code change, documentation only.
Change-Id: I1d0efdb721eaa0e2fb0bdb21f8d5e034beaf8857
[ROCm/ROCR-Runtime commit: 012ffed459]
In the case where SQ_WAVE_TRAPSTS_XNACK_ERROR_MASK is set, we also need
to set the TTMP11_EXCP_RAISED_BIT in ttmp11. If we don't, the debugger
may think that the wave is halted at launch (halted without events).
Change-Id: I8c19605bbfc145275728de4ad1979d3ba8bb478a
[ROCm/ROCR-Runtime commit: 838c6bd6ad]
Mesa address lib faults if the only acceptable swizzle modes are
forbidden. The old address lib simply ignored the forbidden list
in this case. Mesa addrlib will not select linear unless there
is no other option so allowing linear mode for tiled images will
still use tiled modes when possible.
Change-Id: I1aa44d072db902c968484dbff67b482af03b45d9
[ROCm/ROCR-Runtime commit: c60364e1e0]
Peer accessibility query was not previously directed at the peer agent.
Change-Id: I259f0afac827a6e4778a56419a3acd296d00391b
[ROCm/ROCR-Runtime commit: 9c9064c2b7]
Constructor function must not be attempted twice even if the construction
attempt returns nullptr.
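
One way to express this rule, as a sketch only: track whether construction
was attempted separately from its result, so a nullptr result is cached
instead of triggering a retry. ConstructOnce is a hypothetical helper, not
the type used in ROCR.

```cpp
#include <mutex>

// Runs the factory at most once and caches its result, including nullptr.
template <typename T>
class ConstructOnce {
 public:
  using Factory = T* (*)();
  explicit ConstructOnce(Factory factory) : factory_(factory) {}

  T* Get() {
    std::lock_guard<std::mutex> lock(mutex_);
    if (!attempted_) {
      attempted_ = true;      // record the attempt before looking at the result
      instance_ = factory_(); // may legitimately return nullptr
    }
    return instance_;
  }

 private:
  Factory factory_;
  std::mutex mutex_;
  bool attempted_ = false;
  T* instance_ = nullptr;
};
```

Testing `instance_ == nullptr` instead of a separate flag would re-run the
factory after a failed attempt, which is the behaviour this change rules out.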
Change-Id: I75353e5e511769a96e4332f7f60887f6559c1cd5
[ROCm/ROCR-Runtime commit: 2fbacccaed]
- Correct definition of HSA_QUEUE_TYPE_COOPERATIVE to be a queue type
and not a bit mask.
- Correct implementation of hsa_queue_type_t to treat it as an
enumeration type and not a bit mask. In particular
HSA_QUEUE_TYPE_COOPERATIVE is a distinct queue type that uses the
multi producer protocol, and is not a bit set value.
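
To illustrate the difference, a sketch using a local stand-in enum. QueueType
and UsesMultiProducerProtocol are hypothetical names chosen for this example
and are not the hsa.h definitions.

```cpp
// Local stand-in for hsa_queue_type_t; enumerators are distinct values,
// not flags that can be OR-ed together.
enum class QueueType { kMulti, kSingle, kCooperative };

// Compare against whole enumerators. Testing individual bits would
// misclassify a value that is not composed of flag bits.
bool UsesMultiProducerProtocol(QueueType type) {
  switch (type) {
    case QueueType::kMulti:
    case QueueType::kCooperative:  // distinct type, same producer protocol
      return true;
    case QueueType::kSingle:
      return false;
  }
  return false;  // unknown value: be conservative
}
```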
Change-Id: I9415be8853671e5511e16e306caf16020e8c84af
[ROCm/ROCR-Runtime commit: bccb25fc33]
There are a couple of problems with this. First, llvm-dis is an unstable
llvm development tool and 3rd party users should generally not rely on
it. The text format is unstable, and the regex here isn't even
explicitly looking for the target triple field, so it could
accidentally find something else. Second, picking the target to
compile based on the library you are linking is a fundamentally
backwards decision. The target you're compiling for changes the
library you would want to link. The device libraries are only ever
compiled with amdgcn-amd-amdhsa. If we had a second triple, this
should be explicitly building for any it cares about.
Change-Id: I3bae8398f60f78df61ab2177aa9e83f47ec6dea4
[ROCm/ROCR-Runtime commit: 96d4140609]
The ROCR trap handler should check for all end program instructions
and not halt on them. Mask off the imm16 before comparing the
instruction to the s_endpgm opcode.
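
The comparison can be sketched on the host side as below. The trap handler
itself is shader assembly; this C++ version only illustrates the masking.
The 0xBF810000 encoding for s_endpgm and the 16-bit immediate in the low
half of the instruction word are assumptions of this example.

```cpp
#include <cstdint>

// Assumed layout: the low 16 bits of the instruction word hold imm16.
constexpr std::uint32_t kImm16Mask = 0x0000FFFFu;
// Assumed s_endpgm encoding with a zero immediate.
constexpr std::uint32_t kEndPgmOpcode = 0xBF810000u;

// True for s_endpgm regardless of the value in its immediate field.
bool IsEndProgram(std::uint32_t instruction) {
  return (instruction & ~kImm16Mask) == kEndPgmOpcode;
}
```

Without the mask, an end program instruction that carries a nonzero
immediate would not compare equal to the opcode.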
Change-Id: I669ffc7f5b699d7daf0c8ec5761ed7bb193f07a7
[ROCm/ROCR-Runtime commit: df03a377f5]
Image swizzle mode will be set by the preferred surface info
function.
Change-Id: I41e639be53cafbb4db6cf15c159aa2bd457ec5be
[ROCm/ROCR-Runtime commit: 1440da3e15]
New trap handler ABI: Record in ttmp11[8:7] the event that caused the
trap handler to be entered. We currently record 2 events, trap_raised
if an s_trap instruction was executed, or excp_raised if an exception
(MEM_VIOL or ILLEGAL_INST) was raised.
Change-Id: Ie278c8277437b3b67c2737dcd1a12fe6511df428
[ROCm/ROCR-Runtime commit: 00da82f951]
Remove the hard-coding of "SHARED" as the lib type, and move any
SO-specific linking to only happen if the .so exists in the first place.
Change-Id: I3f0bfd5c03f19b2425423b4dc8eed8fd87acc1d6
[ROCm/ROCR-Runtime commit: 33133ebd07]
Changes in the compiler are being made to add controls for XNACK and SRAM ECC
for all targets which can support these features. By default the conservatively
correct settings of XNACK on and SRAM ECC on will be used. This change is to
facilitate these backend updates.
Change-Id: I2fd6b6bc1d32937737e7f56d8e08c70fe781c745
[ROCm/ROCR-Runtime commit: 87202d4408]
IPC create must only be used on whole ROCr allocations.
Fragments were allowing handle creation with offsets.
Change-Id: I1faa96d36bc7a6199bdc2e3ff1b8871d1a36a2fa
[ROCm/ROCR-Runtime commit: 7712c7e743]
This has been the default mode for a while now since we don't
distribute or build the finalizer. Removing the attempt cleans
up debug mode messages that are causing confusion.
Change-Id: I8162c95abd5bbedaa22b90191f7a384a34c388ae
[ROCm/ROCR-Runtime commit: 3fe891d5da]
Pool size was being used where alloc_max_size should be.
Changes are necessary on NUMA systems where not all nodes have
installed memory.
Change-Id: If8f507cae50a8dfeae8572d4e39df757abe28599
[ROCm/ROCR-Runtime commit: a9470e3563]
Lock API succeeds but the GPU still faults on the address.
This should be fixed in Thunk and/or KFD as well.
Change-Id: I8b2fbcae61ab181e4fe7f0b64e43a5f0772efb24
[ROCm/ROCR-Runtime commit: 9fe44ed675]
Iterate the loaded shared objects to see if the given elf image binary
is part of a loaded segment.
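
A minimal sketch of this kind of lookup on Linux with glibc, using
dl_iterate_phdr. The helper name IsInLoadedSegment is hypothetical, and the
actual ROCR implementation may differ in what it matches and returns.

```cpp
#include <link.h>

#include <cstddef>
#include <cstdint>

namespace {

struct SegmentQuery {
  std::uintptr_t addr;
  bool found;
};

// Called once per loaded object; a nonzero return stops the iteration.
int CheckObject(dl_phdr_info* info, std::size_t /*size*/, void* data) {
  auto* query = static_cast<SegmentQuery*>(data);
  for (int i = 0; i < info->dlpi_phnum; ++i) {
    const ElfW(Phdr)& phdr = info->dlpi_phdr[i];
    if (phdr.p_type != PT_LOAD) continue;  // only segments mapped into memory
    const std::uintptr_t start = info->dlpi_addr + phdr.p_vaddr;
    if (query->addr >= start && query->addr - start < phdr.p_memsz) {
      query->found = true;
      return 1;
    }
  }
  return 0;
}

}  // namespace

// True if `ptr` falls inside a PT_LOAD segment of any loaded object.
bool IsInLoadedSegment(const void* ptr) {
  SegmentQuery query{reinterpret_cast<std::uintptr_t>(ptr), false};
  dl_iterate_phdr(CheckObject, &query);
  return query.found;
}
```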
Change-Id: I074cacd99eb5b59f883f4ce2bd901e0e35a660b8
[ROCm/ROCR-Runtime commit: 5f783494f1]