This partially avoids a difference in the include paths between a
build and install tree, and simplifies the install configuration.
Change-Id: If8119507594e0d284ac08c141c6c51c88ec619ef
Lower layers don't error-check this value, so we might end up writing a
bad value to a register and causing the SPI to hang.
Change-Id: I6da4ae71c66a25c63ebb804da4afe4ca7fb831b7
Device binaries that are embedded inside the host binary do not
require a copy. Their lifetime is guaranteed to exceed that of the
loaded executable.
Add a 'make_copy' parameter to amd::Program::addDeviceProgram. If
make_copy is false the original image will be used and will not
get freed when the amd::Program is destroyed.
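
The copy-versus-borrow behaviour described above can be sketched roughly as follows. This is only an illustration under stated assumptions: ImageHolder and its members are hypothetical names, not the actual amd::Program code, and the real addDeviceProgram signature is not reproduced here.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical stand-in for whatever holds a device program image.
// It is NOT the real amd::Program implementation.
class ImageHolder {
 public:
  // make_copy == true:  duplicate the bytes; the holder owns and frees them.
  // make_copy == false: borrow the caller's pointer; the caller guarantees
  //                     the image outlives the holder (e.g. it is embedded
  //                     in the host binary).
  ImageHolder(const void* image, std::size_t size, bool make_copy)
      : size_(size) {
    if (make_copy) {
      owned_ = std::make_unique<unsigned char[]>(size);
      std::memcpy(owned_.get(), image, size);
      data_ = owned_.get();
    } else {
      data_ = static_cast<const unsigned char*>(image);
    }
  }

  const unsigned char* data() const { return data_; }
  std::size_t size() const { return size_; }
  bool ownsImage() const { return owned_ != nullptr; }

 private:
  std::unique_ptr<unsigned char[]> owned_;  // stays empty when borrowing
  const unsigned char* data_ = nullptr;     // points at owned_ or caller data
  std::size_t size_;
};
```

With this shape, destroying a borrowing holder never frees the original image, because nothing was allocated for it in the first place.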
Change-Id: I7973bb0243f5a2d1b639b8a88445cfe6af919dd7
Remove queue limitation since we loop through HW queues now.
Add a DevLogError if we fail to create the hsa_queue. A ticket showed a regression there.
Change-Id: I4f58e405f88e75600a762f6d6352838c969cdb5e
This works around the performance penalty of the SDMA engine taking a
while to clock up from a lower DPM state. Add env var
GPU_FORCE_BLIT_COPY_SIZE (in KB; defaults to 1024 for HIP). Forcing the
Src and Dst agents to be amdgpu makes ROCr take the blit copy path for
what would otherwise have been an SDMA copy.
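
A minimal sketch of how such a size threshold might be read and applied. Only the variable name GPU_FORCE_BLIT_COPY_SIZE, its KB unit, and the 1024 default come from this message. The helper names, the parsing rules, and the "at or below the threshold" comparison are assumptions made for illustration.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: read the threshold in KB from the environment.
// Falls back to default_kb when the variable is unset, empty, or not a
// plain decimal number.
inline std::size_t forceBlitCopySizeKB(std::size_t default_kb = 1024) {
  const char* value = std::getenv("GPU_FORCE_BLIT_COPY_SIZE");
  if (value == nullptr || *value == '\0') return default_kb;
  char* end = nullptr;
  const unsigned long long parsed = std::strtoull(value, &end, 10);
  return (*end == '\0') ? static_cast<std::size_t>(parsed) : default_kb;
}

// Hypothetical policy: copies at or below the threshold take the blit
// path, larger copies stay on SDMA. The inclusive comparison is an
// assumption; the commit message does not state it.
inline bool useBlitCopy(std::size_t copy_bytes, std::size_t threshold_kb) {
  return copy_bytes <= threshold_kb * 1024;
}
```

Under these assumptions a threshold of 0 disables the workaround for every non-empty copy.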
Change-Id: I222f687155f86000d17d66d25182e490b6710463
Object libraries are weird, and producing a library by using the
target objects from them doesn't automatically import the interface
properties of the linked targets. These object libraries only have
single uses, so just directly create the final library from the
sources.
Leaves libelf as an object library, since there seems to be some cmake
oddity when trying to link an unexported target to an exported one.
Change-Id: Ic379612c89340c40085c9862cfe111fa4bbff425
SWDEV-232580
Allocate a P2P staging buffer when full P2P access is not available
between all devices. The staging buffer will be used later, when
required.
Change-Id: If8490ba7b1c52c432c1e942ae95421b9d2ec7097
There's a lot of unnecessary system configuration junk here which
isn't used, and is already available through compiler predefines. This
is also blindly placed without really checking the host architecture.
-DLINUX is unused.
-D__AMD64__ is predefined by the compiler, and is also redundant with
__x86_64__ and ATI_BITS_64.
__x86_64__ should also be removed. It's used in libelf, but I'm not
sure whether MSVC predefines it.
-DqLittleEndian is unused, and also doesn't follow macro naming
conventions (plus compilers have their own predefines for checking
this).
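
For reference, a sketch of relying on compiler predefines instead of build-system definitions. GCC and Clang predefine __x86_64__, __BYTE_ORDER__ and __ORDER_LITTLE_ENDIAN__, and MSVC predefines _M_X64 for x64 targets. The constant names and the little-endian fallback are assumptions for illustration, not code from this tree.

```cpp
#include <cstdint>
#include <cstring>

// x86-64 detection from compiler predefines, with no -D flags needed.
#if defined(__x86_64__) || defined(_M_X64)
constexpr bool kIsX86_64 = true;
#else
constexpr bool kIsX86_64 = false;
#endif

// Pointer width comes straight from the language.
constexpr bool kIs64BitPointers = sizeof(void*) == 8;

// Endianness from the GCC/Clang predefines when they exist.
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__)
constexpr bool kIsLittleEndian = (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__);
#else
// Assumption: compilers without these macros are targeting a
// little-endian platform here.
constexpr bool kIsLittleEndian = true;
#endif

// Runtime probe used only to cross-check the compile-time answer.
inline bool runtimeLittleEndian() {
  const std::uint32_t probe = 1;
  unsigned char first = 0;
  std::memcpy(&first, &probe, 1);  // lowest-addressed byte of the value
  return first == 1;
}
```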
Change-Id: I89f6fc4c88e861623be7f32df41aecbb4e9009ab
This should allow the cmake build for the opencl runtime to work
without manually adding these definitions. The PAL build also adds
these as private defines in its build, so change rocm to match. These
definitions should probably move into a config header to benefit other
builds, but this will at least avoid some clutter in the opencl build
for now.
Change-Id: I1044984b87ba3fc72e280e255ceea2dd9e3337ff
Use target specific forms for define/include. Don't set
CMAKE_CXX_FLAGS for the standard, which is already implied from the
parent build.
Change-Id: I4000893376d6685e9889b66ad8451fc493020272