The lower layers don't error-check this value, so we might end up writing a bad value to a register and causing the SPI to hang.
Change-Id: I6da4ae71c66a25c63ebb804da4afe4ca7fb831b7
[ROCm/clr commit: 6e985845b3]
Apply the optimization change to OpenCL too.
Clean up some unnecessary checks.
Change-Id: I840261fe35baeeadeba7388e86779d482f509aad
[ROCm/clr commit: 6c5a42b33c]
Device binaries that are embedded inside the host binary do not
require a copy. Their lifetime is guaranteed to exceed that of the
loaded executable.
Add a 'make_copy' parameter to amd::Program::addDeviceProgram. If
make_copy is false the original image will be used and will not
get freed when the amd::Program is destroyed.
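
A minimal sketch of the ownership rule described above, not the actual
runtime code. The DeviceImage class and its members are hypothetical names
used only for illustration; the make_copy flag mirrors the new parameter.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>

// Hypothetical stand-in for the image handling behind
// amd::Program::addDeviceProgram; names here are assumptions.
class DeviceImage {
 public:
  DeviceImage(const void* image, std::size_t size, bool make_copy) : size_(size) {
    assert(image != nullptr && size > 0);
    if (make_copy) {
      // The caller's buffer may go away: keep a private copy that is
      // released automatically when this object is destroyed.
      owned_ = std::make_unique<unsigned char[]>(size);
      std::memcpy(owned_.get(), image, size);
      data_ = owned_.get();
    } else {
      // The image is embedded in the host binary and outlives this
      // object: borrow the pointer and never free it.
      data_ = image;
    }
  }

  const void* data() const { return data_; }
  std::size_t size() const { return size_; }
  bool ownsImage() const { return owned_ != nullptr; }

 private:
  const void* data_ = nullptr;
  std::size_t size_ = 0;
  std::unique_ptr<unsigned char[]> owned_;  // empty when borrowing
};
```

With make_copy set to false, data() returns the original pointer and
nothing is freed on destruction; with make_copy set to true, the object
holds its own buffer.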
Change-Id: I7973bb0243f5a2d1b639b8a88445cfe6af919dd7
[ROCm/clr commit: 9e1964ddaa]
Remove queue limitation since we loop through HW queues now.
Add a DevLogError if we fail to create the hsa_queue. A ticket showed a regression there.
Change-Id: I4f58e405f88e75600a762f6d6352838c969cdb5e
[ROCm/clr commit: b54c3f7db9]
This workaround avoids the performance penalty of the SDMA engine
taking a while to clock up from a lower DPM state. Add the env var
GPU_FORCE_BLIT_COPY_SIZE (in KB; 1024 by default for HIP). Forcing
the Src and Dst agents to be amdgpu makes ROCr take the blit copy path
for what would otherwise have been an SDMA copy.
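
A rough sketch of how a size threshold like this could select the copy
path. The helper names (forceBlitCopySizeKB, chooseCopyPath) are
hypothetical, and treating the threshold as inclusive is an assumption;
the runtime's actual comparison may differ.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>

enum class CopyPath { Blit, Sdma };

// Reads GPU_FORCE_BLIT_COPY_SIZE (in KB). Falls back to default_kb when
// the variable is unset, empty, or not a plain decimal number.
inline std::size_t forceBlitCopySizeKB(std::size_t default_kb = 1024) {
  const char* text = std::getenv("GPU_FORCE_BLIT_COPY_SIZE");
  if (text == nullptr || *text == '\0') return default_kb;
  char* end = nullptr;
  const unsigned long long value = std::strtoull(text, &end, 10);
  return (*end == '\0') ? static_cast<std::size_t>(value) : default_kb;
}

// Assumption: copies up to and including the threshold take the blit
// path; larger copies stay on SDMA.
inline CopyPath chooseCopyPath(std::size_t copy_bytes, std::size_t threshold_kb) {
  // Guard against overflow when converting KB to bytes.
  assert(threshold_kb <= std::numeric_limits<std::size_t>::max() / 1024);
  return (copy_bytes <= threshold_kb * 1024) ? CopyPath::Blit : CopyPath::Sdma;
}
```

A caller would typically evaluate chooseCopyPath(size,
forceBlitCopySizeKB()) once per copy; a threshold of 0 effectively
disables the workaround in this sketch.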
Change-Id: I222f687155f86000d17d66d25182e490b6710463
[ROCm/clr commit: 5f64e6e7ad]
Object libraries are weird, and producing a library by using the
target objects from them doesn't automatically import the interface
properties of the linked targets. These object libraries only have
single uses, so just directly create the final library from the
sources.
Leaves libelf as an object library, since there seems to be some cmake
oddity when trying to link an unexported target to an exported one.
Change-Id: Ic379612c89340c40085c9862cfe111fa4bbff425
[ROCm/clr commit: cba7a4d20e]
SWDEV-232580 & SWDEV-232580
Allocate a P2P staging buffer when full P2P access is not available between all devices.
The P2P staging buffer will eventually be used when required.
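
The allocation condition can be sketched as a check over a device access
matrix. This is an illustration only: AccessMatrix and needsP2PStaging are
hypothetical names, and how the runtime actually queries peer access is
not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// access[i][j] is true when device i can directly access device j's memory.
using AccessMatrix = std::vector<std::vector<bool>>;

// Returns true when at least one pair of distinct devices lacks direct
// access, i.e. when full P2P access is not available between all devices.
inline bool needsP2PStaging(const AccessMatrix& access) {
  const std::size_t n = access.size();
  for (std::size_t i = 0; i < n; ++i) {
    assert(access[i].size() == n);  // the matrix must be square
    for (std::size_t j = 0; j < n; ++j) {
      if (i != j && !access[i][j]) return true;
    }
  }
  return false;
}
```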
Change-Id: If8490ba7b1c52c432c1e942ae95421b9d2ec7097
[ROCm/clr commit: f149fe0803]
There's a lot of unnecessary system configuration junk here which
isn't used, and is already available through compiler predefines. This
is also blindly placed without really checking the host architecture.
-DLINUX is unused.
-D__AMD64__ is predefined by the compiler, and is also redundant with
__x86_64__ and ATI_BITS_64.
__x86_64__ should also be removed. It's used in libelf, but I'm not
sure if msvc predefines this or not.
-DqLittleEndian is unused, and also doesn't follow macro naming
conventions (plus compilers have their own predefines for checking
this).
Change-Id: I89f6fc4c88e861623be7f32df41aecbb4e9009ab
[ROCm/clr commit: e7d6a5e5a6]
This should allow the cmake build for the opencl runtime to work
without manually adding these definitions. The PAL build also adds
these as private defines in its build, so change rocm to match. This
should probably include these in a config header to benefit other
builds, but this will at least avoid some clutter in the opencl build
for now.
Change-Id: I1044984b87ba3fc72e280e255ceea2dd9e3337ff
[ROCm/clr commit: c60d7d860d]
Use target specific forms for define/include. Don't set
CMAKE_CXX_FLAGS for the standard, which is already implied from the
parent build.
Change-Id: I4000893376d6685e9889b66ad8451fc493020272
[ROCm/clr commit: 815198bec9]
Don't use find_path on the header, it's redundant with the interface
include directories on the imported target. Use the target specific
forms for including and linking it.
Change-Id: I3923143c992888ee7d5ee1130084ac2e5eaa0f3a
[ROCm/clr commit: 83455f36c5]
This is almost never the correct thing to use since it breaks adding
this as a subproject build in a larger build. Switch to refer to
CMAKE_CURRENT_SOURCE_DIR, which is equivalent in a standalone build.
Change-Id: Ib8dbbc0668491f4227389b9a5b27da770b3bc5ce
[ROCm/clr commit: a36f19df51]
[ROCm][TCT][HIP] cooperative stream test case is failing.
Make sure lockXfer() in the blit manager returns a valid value.
Port the latest PAL backend logic into the ROCr backend.
This change doesn't fix the issue reported in the ticket.
Change-Id: I54101a824f49a2dcfbbf5414cb5b3af41745306d
[ROCm/clr commit: 89133a7301]
- Once a device assertion occurs, abort the host execution as well.
- TODO: This is the initial support. Since we need to drain the hostcall
queue to ensure the device assertion message is flushed out, the hostcall
listener needs an interface to explicitly drain its queue.
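
One possible shape for the drain interface mentioned in the TODO, as a
sketch only. HostcallListener here is a hypothetical stand-in, not the
runtime's listener; enqueue, drain, and onDeviceAssert are assumed names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <deque>
#include <functional>
#include <mutex>
#include <string>
#include <utility>

// Hypothetical listener that buffers messages until they are delivered.
class HostcallListener {
 public:
  explicit HostcallListener(std::function<void(const std::string&)> sink)
      : sink_(std::move(sink)) {
    assert(sink_ != nullptr);
  }

  void enqueue(std::string message) {
    std::lock_guard<std::mutex> lock(mutex_);
    pending_.push_back(std::move(message));
  }

  // Delivers every pending message and returns how many were flushed.
  std::size_t drain() {
    std::deque<std::string> batch;
    {
      std::lock_guard<std::mutex> lock(mutex_);
      batch.swap(pending_);
    }
    for (const std::string& message : batch) sink_(message);
    return batch.size();
  }

 private:
  std::function<void(const std::string&)> sink_;
  std::mutex mutex_;
  std::deque<std::string> pending_;
};

// On a device assertion: flush queued messages first, then stop the host.
[[noreturn]] inline void onDeviceAssert(HostcallListener& listener) {
  listener.drain();
  std::abort();
}
```

Draining before the abort is what lets the assertion message reach the
user; without it the process could exit with the message still queued.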
Change-Id: I8a04400aa7109bfd054ae5777c41a4abbf0db4a9
[ROCm/clr commit: 97f55b5c7f]