This addresses the rocprof hang seen with direct dispatch. The
workaround queues the handler back if any signal value in the batch
has not been decremented. To remember the last position in the list, we
save the parsed command in the current timestamp struct.
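The requeue logic can be sketched in standalone C++ as follows. This is
only an illustration under assumptions: Timestamp, parsedCommand and
processBatch are hypothetical stand-ins rather than the actual runtime
code, and a signal value of 0 is taken to mean "decremented".

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the runtime's timestamp struct.
struct Timestamp {
  std::vector<int> signalValues;  // one per command; 0 means the signal was decremented
  std::size_t parsedCommand = 0;  // index of the first command not yet processed
};

// Returns true when the whole batch is done, false when the handler
// must be queued back because a signal has not been decremented yet.
bool processBatch(Timestamp& ts) {
  for (std::size_t i = ts.parsedCommand; i < ts.signalValues.size(); ++i) {
    if (ts.signalValues[i] != 0) {
      ts.parsedCommand = i;  // remember where to resume on the next call
      return false;          // caller requeues the handler
    }
  }
  ts.parsedCommand = ts.signalValues.size();
  return true;
}
```

Saving the index means a requeued handler resumes at the pending signal
instead of rescanning commands that already completed.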
Change-Id: I02959e463cfe3cee83c54808ffd6e6f48f43b4e8
[ROCm/clr commit: e5e635f9bf]
Setting AMD_CPU_AFFINITY=1 makes the runtime honor the core affinity
that the process may set. This is disabled by default because it can
prevent worker threads, or any other thread the runtime creates, from
getting scheduled, thus affecting performance.
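A minimal sketch of how such an opt-in flag could be read. The helper
name honorCpuAffinity is an assumption, and the runtime itself likely
reads the variable through its own flag mechanism, not a direct getenv
call.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns true only when AMD_CPU_AFFINITY is set to "1".
// An unset variable keeps the default, where process affinity is not honored.
bool honorCpuAffinity() {
  const char* value = std::getenv("AMD_CPU_AFFINITY");
  return value != nullptr && std::strcmp(value, "1") == 0;
}
```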
Change-Id: Ibe4cc95e7b99caee5ce750b7bf66e09e999cc9a3
[ROCm/clr commit: 1398719b0d]
HIP should be built with HSAIL support disabled.
Currently HSAILProgram::info() and VirtualGPU::buildKernelInfo() expose
ACL interfaces directly. This should not be allowed.
Change-Id: Iae15d4f19be16806826f2f6cb600752c11f97fc1
[ROCm/clr commit: bbe6246f19]
Currently LiquidFlash cannot be supported from GitHub Enterprise,
hence we need to be able to build without it.
Allow this by setting -DWITH_LIQUID_FLASH=0.
Change-Id: I975e8ee16b7ba033e3eb95fe40955d8c1d4779b7
[ROCm/clr commit: 7034e749e3]
aclutGetTargetInfo() is an internal compiler lib helper function. This
will not be imported in the HSAIL shared library build, however it is
simple enough that we can maintain our own local copy of it.
Change-Id: I91d1a336c7da027bf8a7df8fae86a25add533611
[ROCm/clr commit: 7fd1e9c10a]
hipIpcOpenMemHandle should return a device pointer that corresponds to
the base pointer of the original allocation, even if a pointer at an
offset into that allocation was passed to hipIpcGetMemHandle.
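The intended behaviour can be modelled on the host without any HIP
calls. Everything below (AllocationTable, IpcHandle, getMemHandle,
openMemHandle) is a hypothetical model, not the HIP implementation, and
it ignores the address translation that happens between processes.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical model: maps each allocation's base address to its size.
using AllocationTable = std::map<std::uintptr_t, std::size_t>;

// Hypothetical handle: identifies the whole allocation, not the offset pointer.
struct IpcHandle {
  std::uintptr_t base = 0;
  std::size_t size = 0;
};

// Models hipIpcGetMemHandle: finds the allocation containing `ptr` and
// records its base. Returns false if `ptr` is not inside a known allocation.
bool getMemHandle(const AllocationTable& table, std::uintptr_t ptr, IpcHandle& out) {
  auto it = table.upper_bound(ptr);  // first allocation starting after ptr
  if (it == table.begin()) return false;
  --it;                              // allocation starting at or before ptr
  if (ptr - it->first >= it->second) return false;  // ptr is past the end
  out.base = it->first;
  out.size = it->second;
  return true;
}

// Models hipIpcOpenMemHandle: yields the base of the allocation,
// regardless of the offset used when the handle was created.
std::uintptr_t openMemHandle(const IpcHandle& handle) { return handle.base; }
```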
Change-Id: I99c0553e8c67c15b5fed880b6a4c74bce39c3aee
[ROCm/clr commit: 88fca7bf9e]
Device enqueue has an option to execute the scheduler on the current
queue and it's enabled by default. Make sure scratch is allocated
on the current queue for that case. Add max vgpr tracking per
program to adjust scratch size accordingly.
Change-Id: I2a6d796913a4551a1e7f343a2465d589eec60d8a
[ROCm/clr commit: e553b2763a]
MT doesn't use GPU waits; it uses CPU waits for sync between engines.
Change the threshold values for CPU waits for direct dispatch.
That will bring its behavior closer to MT.
Change-Id: Ia41c3cb812614962aff2746b6cf858f1bf77dda2
[ROCm/clr commit: ca2ea70a6c]
Enabling both LC and HSAIL will cause the DYN macro to be redefined.
Rename it for each compiler to avoid name clashing.
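A minimal sketch of the rename, assuming the macro is used to form
per-library entry point names. LC_DYN, HSAIL_DYN and the version
functions are hypothetical; the actual macro names in the runtime may
differ.

```cpp
#include <string>

// Hypothetical per-compiler macros. With a single shared name (DYN),
// enabling both LC and HSAIL in one build would redefine the macro.
#define LC_DYN(name) lc_##name
#define HSAIL_DYN(name) hsail_##name

// Stand-in entry points, one per compiler wrapper.
inline std::string LC_DYN(version)() { return "lc"; }
inline std::string HSAIL_DYN(version)() { return "hsail"; }
```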
Change-Id: I607f022f37c4d05bef4e3a8070d19bd3659d7bc2
[ROCm/clr commit: b771377665]
This change makes HSAIL usage similar to that of Comgr. By default, the
runtime will statically link against it, however if HSAIL_DYN_DLL is
defined, then the runtime will try to dynamically load HSAIL.
For now, stick to statically linking against HSAIL. A future patch will
enable the dynamic loading behaviour.
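The static/dynamic split could look roughly like the sketch below. Only
the HSAIL_DYN_DLL define comes from this change; hsailStubVersion,
HsailEntryPoints, loadHsail and the library file name are placeholders
for illustration.

```cpp
#if defined(HSAIL_DYN_DLL)
#include <dlfcn.h>
#endif

// Stand-in for an entry point that would normally come from the HSAIL library.
extern "C" int hsailStubVersion() { return 1; }

using VersionFn = int (*)();

// Hypothetical table of entry points that the runtime calls through.
struct HsailEntryPoints {
  VersionFn version = nullptr;
};

bool loadHsail(HsailEntryPoints& ep) {
#if defined(HSAIL_DYN_DLL)
  // Dynamic path: resolve the symbol at run time. The library name is a placeholder.
  void* lib = dlopen("libhsail-placeholder.so", RTLD_NOW);
  if (lib == nullptr) return false;
  ep.version = reinterpret_cast<VersionFn>(dlsym(lib, "hsailStubVersion"));
#else
  // Static path (the default): bind directly to the linked-in symbol.
  ep.version = &hsailStubVersion;
#endif
  return ep.version != nullptr;
}
```

Routing calls through a table keeps the call sites identical in both
configurations, so switching to dynamic loading later only changes how
the table is filled.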
Change-Id: I6a78a4375975cf847f236b200404c8cf941d012b
[ROCm/clr commit: c7b50bb890]