hipIpcOpenMemHandle should return a device pointer that corresponds to
the base pointer of the original allocation, even if a pointer offset
into that allocation was passed to hipIpcGetMemHandle.
Change-Id: I99c0553e8c67c15b5fed880b6a4c74bce39c3aee
Device enqueue has an option to execute the scheduler on the current
queue, and it is enabled by default. Make sure scratch is allocated
on the current queue in that case. Add max VGPR tracking per
program to adjust the scratch size accordingly.
Change-Id: I2a6d796913a4551a1e7f343a2465d589eec60d8a
MT uses CPU waits rather than GPU waits to sync between engines.
Change the CPU wait threshold values for direct dispatch to
bring its behavior closer to MT.
Change-Id: Ia41c3cb812614962aff2746b6cf858f1bf77dda2
Enabling both LC and HSAIL will cause the DYN macro to be redefined.
Rename it for each compiler to avoid name clashing.
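A minimal sketch of the clash and the rename, assuming each compiler
header wraps its entry-point table in a helper macro. The names LC_DYN,
HSAIL_DYN, and the entry-point structs are illustrative and not taken
from the source:

```cpp
#include <cassert>

// Hypothetical entry-point tables standing in for the two compiler libraries.
struct LcEntryPoints { int (*version)(); };
struct HsailEntryPoints { int (*version)(); };

static int lcVersion() { return 1; }
static int hsailVersion() { return 2; }

static LcEntryPoints lc_entry{&lcVersion};
static HsailEntryPoints hsail_entry{&hsailVersion};

// If both compiler headers defined DYN(NAME), the second definition would
// redefine the first. One distinctly named macro per compiler avoids that.
// LC_DYN and HSAIL_DYN are illustrative names only.
#define LC_DYN(NAME) lc_entry.NAME
#define HSAIL_DYN(NAME) hsail_entry.NAME

// Each call site now states which compiler's table it dispatches through.
int callLcVersion() {
  assert(lc_entry.version != nullptr);
  return LC_DYN(version)();
}

int callHsailVersion() {
  assert(hsail_entry.version != nullptr);
  return HSAIL_DYN(version)();
}
```

With distinct macro names, both tables can be used in the same
translation unit without either definition shadowing the other.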
Change-Id: I607f022f37c4d05bef4e3a8070d19bd3659d7bc2
This change makes HSAIL usage similar to that of Comgr. By default, the
runtime statically links against it. If HSAIL_DYN_DLL is defined, the
runtime instead tries to load HSAIL dynamically.
For now, keep linking statically to HSAIL. A future patch will enable
the dynamic loading behaviour.
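The static/dynamic switch can be sketched roughly as follows. Only the
HSAIL_DYN_DLL define comes from this change. The function
hsailDemoVersion, the library name libhsail_demo.so, and the use of
POSIX dlopen/dlsym are assumptions made for illustration:

```cpp
#include <cassert>
#if defined(HSAIL_DYN_DLL)
#include <dlfcn.h>  // POSIX dynamic loading, used only on the dynamic path
#endif

// Hypothetical stand-in for a function exported by the compiler library.
extern "C" int hsailDemoVersion() { return 7; }

using VersionFn = int (*)();

// Resolve the entry point either at link time or at run time.
VersionFn resolveHsailVersion() {
#if defined(HSAIL_DYN_DLL)
  // Dynamic path: the library and symbol names are placeholders.
  void* handle = dlopen("libhsail_demo.so", RTLD_LAZY);
  if (handle == nullptr) return nullptr;
  return reinterpret_cast<VersionFn>(dlsym(handle, "hsailDemoVersion"));
#else
  // Static path (the default): the linker resolves the symbol.
  return &hsailDemoVersion;
#endif
}

// Callers go through the resolved pointer and never see which path was used.
int queryHsailVersion() {
  VersionFn fn = resolveHsailVersion();
  assert(fn != nullptr && "HSAIL entry point could not be resolved");
  return fn();
}
```

Keeping the choice behind one resolver means the later switch to dynamic
loading would not need to touch call sites.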
Change-Id: I6a78a4375975cf847f236b200404c8cf941d012b
In addition to removing the HSAIL logic from the ROCm backend, guard all
of the HSAIL includes in the common layer behind the WITH_COMPILER_LIB
define. This is to avoid including HSAIL headers when building with
no support for it.
In the common logic, replace the use of the aclType enum with the new
Program::file_type_t enum. This is essentially a local copy of the HSAIL
enum to avoid including any HSAIL headers.
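A rough sketch of the pattern, assuming a simplified Program class. The
WITH_COMPILER_LIB define and the Program::file_type_t name come from
this change. The enumerators, the needsCompile helper, and the header
name in the comment are illustrative placeholders:

```cpp
#include <cassert>

// Compiler-library headers are pulled in only when support is built in.
// The header name is a placeholder, so the include stays commented out.
#if defined(WITH_COMPILER_LIB)
// #include "hsail_compiler_lib.h"  // hypothetical header
#endif

class Program {
 public:
  // Local enum so the common layer needs no compiler-library header.
  // Illustrative enumerators only; the real list mirrors aclType.
  enum file_type_t {
    FILE_TYPE_DEFAULT = 0,
    FILE_TYPE_SOURCE = 1,
    FILE_TYPE_BINARY = 2,
  };

  explicit Program(file_type_t type) : type_(type) {
    assert(type >= FILE_TYPE_DEFAULT && type <= FILE_TYPE_BINARY);
  }

  file_type_t type() const { return type_; }

  // Common logic can branch on the local enum without knowing aclType.
  bool needsCompile() const { return type_ == FILE_TYPE_SOURCE; }

 private:
  file_type_t type_;
};
```

Because the enum lives in the common layer, this code compiles the same
way whether or not WITH_COMPILER_LIB is defined.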
Change-Id: Ica0651d1b29dfccc255cc584eb82a5cb35e1b520
- Add HSAIL ID for Hawaii as gfx702
- Add HSAIL ID for Renoir without xnack as gfx90c
Fixes: SWDEV-271289, SWDEV-272761
Change-Id: I92cf4619cdfd550462ff8ec3740443ef1e5a5f96
The check has to be performed inside the signal loop, because
active signals need to be processed to avoid a stale timestamp
class.
Change-Id: I26af8287aae18eb19c096d9358cd0b86cfd1c561
- With direct dispatch, the profiling state is enabled to trigger the
callback on the HSA signal. However, ROCr has very low performance on
the first call to get the profiling info, which impacts some tiny
performance tests.
Change-Id: Idacd1b10a473fcfb5feef3074b7191d35743f769
This is part 2 of the change, for the PAL backend.
The parent buffer sometimes has newer data than the sub buffer or image.
We always need to copy the data into copybuffer in the pitch workaround.
Tests:
clinfo
Conformance tests: all images test, info, API, basic.
Internal runtime tests
Change-Id: I97d876ac75b240e69b48244be4c9e522db24f8ac
This is part 2 of the code change for PAL.
The copy image workaround could be used recursively by the ROCclr blit kernel.
Avoid this situation by using a stack variable.
Tests:
clinfo.
Conformance tests - basic, API, info, and all images tests.
Internal runtime tests - all passed.
Change-Id: I3c822e55398cdf35c2c4a46ed9fc20fbee7cc908
The copy image workaround could be used recursively by the ROCclr blit kernel.
Avoid this situation by using a stack variable.
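The hazard can be illustrated with a small sketch, assuming the
workaround previously kept its staging state in a member shared across
calls. BlitDemo and both method names are hypothetical and only model
the re-entrancy problem, not the actual blit code:

```cpp
#include <cassert>
#include <string>

class BlitDemo {
 public:
  // Unsafe variant: staging state lives in the object, so a nested call
  // overwrites it before the outer call has finished using it.
  std::string copyWithMember(const std::string& data, int depth) {
    assert(depth >= 0);
    staging_ = data;
    if (depth > 0) copyWithMember("nested", depth - 1);
    return staging_;
  }

  // Safe variant: each invocation owns its staging state on the stack,
  // so recursion cannot clobber the caller's copy.
  std::string copyWithLocal(const std::string& data, int depth) {
    assert(depth >= 0);
    std::string staging = data;
    if (depth > 0) copyWithLocal("nested", depth - 1);
    return staging;
  }

 private:
  std::string staging_;
};
```

In this model the member variant returns the nested call's data to the
outer caller, while the stack variant returns the data it was given.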
Change-Id: Iadaa8cad9216220194760dd461a9533bb236aea0