Heap initialization used device queue, but it shoudl be used for
cooperative launches only. Heap initialization must use the same queue
as the current dispatch.
Change-Id: I856621bf82bbdeb1c2d0fbc4970e90d09af805cb
Move hidden heap creation to the kernel launch to make sure it's
allocated on the actual first usage.
Change-Id: I1b65a82fc06d9129ed45a69765bf14ea3d945b04
Remove assert for kernel arg size, because COv5 reports a value
bigger than the actual usage in the most of cases
Change-Id: I8e15bc45a9e21b58a5894f9977511ca84408ce61
Metadata in Codeobject version 5 is the extension of CO3 and CO4.
Add the detection of the new fields and program them in
the setup of the kernel arguments.
Change-Id: I27e58df77320ad00f4f16d35912668db803826af
Report proper target id for xnack in HSAIL path. Runtime
will use ISA table and report hsailName().
Fix offline compilation path for PAL.
Change-Id: Ic0250bf6b9c193d867aec9800a319da1bf00c3ee
In HSAIL path, kernel akc info is obtained after code object
loading, and kernel signature creation requires the akc info
when one of the kernel argument is a reference object.
Change-Id: I9cdb1dbf2c72f4620b0b6e46a88402a2473c3e97
This change makes HSAIL usage similar to that of Comgr. By default, the
runtime will statically link against it, however if HSAIL_DYN_DLL is
defined, then the runtime will try to dynamically load HSAIL.
Currently stick to statically linking to HSAIL. In a feature patch the
dynamic loading behaviour will be enabled.
Change-Id: I6a78a4375975cf847f236b200404c8cf941d012b
Since the majority of the Hostcall implementation now sits in the
commmon layer, the PAL backend simply just needs to invoke it. One thing
that is missing though is HSA signal support.
The newly added pal::Signal class is a light emulaion of what HSA
signals provide. The current implementation is just enough to get
Hostcall working, but it can be expanded in the future if needed to
fully emulate HSA signals.
The major difference for now between PAL and ROCm hostcall
implemenations is that PAL doesn't support blocking signals. This will
be enabled in the near future. For now use active wait for PAL.
Change-Id: I746557354ab9d71a7d4a31f9320fcc2fee5aee7f
For roc devices create hsa_agent_t handles using a pointer to the
device::Device. This ensures each device has a different hsa_agent_t
handle. This may be necessary to ensure the loader symbol lookup will
search only symbols for the correct device.
Change-Id: Iee6dd40d68bf22a02ce8c75cbe5ac8f5a0d9e418
- Add assertions to enforce that objects are of the correct kind and
have been allocated.
- Make destructors check if objects have been allocated before
deleting.
- Operations that require a non-NullDevice return failure if given a
NullDevice.
- Use static_cast rather than reinterpret_cast when cohersing from a
base class to a derived class.
Change-Id: I02ee0ea9d7982fd7ca29d49c9b02cfae111b7127
Rename functions that access devices to reflect the derived device
they return. This includes the base device::Device and the derived
gpu/pal/roc device classes in both NullDevice and Device forms. Change
to use the least derived versions to clarify what operations will be
available.
Change-Id: I1abb6bfed7efa24852bc8d0d49acaea357d8b5d0