Init and fini kernels need to be launched when a code object is loaded and unloaded. Avoid looping through all kernels within a code object just to run the init and fini kernels. The compiler currently generates only one init kernel and one fini kernel.
[ROCm/clr commit: cd46294b31]
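A minimal sketch of the idea in the message above: look the init and fini kernels up directly instead of walking every kernel in the code object. The kernel names, the CodeObject type and runIfPresent() are hypothetical stand-ins for illustration, not the runtime's actual symbols.

```cpp
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical well-known names; the real names emitted by the compiler
// are not stated in the commit message.
const std::string kInitKernel = "init_kernel";
const std::string kFiniKernel = "fini_kernel";

// Hypothetical code object: kernels indexed by name.
struct CodeObject {
  std::unordered_map<std::string, std::function<void()>> kernels;
};

// Runs the named kernel if the code object has one.
// Returns true when the kernel was found and launched.
bool runIfPresent(const CodeObject& co, const std::string& name) {
  auto it = co.kernels.find(name);  // direct lookup, no loop over all kernels
  if (it == co.kernels.end()) return false;
  it->second();
  return true;
}
```

On load the caller would invoke runIfPresent(co, kInitKernel), and on unload runIfPresent(co, kFiniKernel).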
This PR adds UberTrace-based tracing support to ROCclr's PAL device class.
Legacy RGP-based tracing is still available and is the default.
If UberTrace support is enabled tool-side, this new code path will activate.
Change-Id: I268b2dcef70e850a50e2caef8355f38bf51d4641
[ROCm/clr commit: e550032d25]
The "optimized" version of memcpy is outdated and
was used in win32 only.
Change-Id: I7f2e0e9051e37cec95438266824b5b0025c324c6
[ROCm/clr commit: 7448113cfc]
This reverts commit 1b05247a03.
Reason for revert: Waiting for staging results before finally merging it.
Change-Id: Iaabb510325f50147f368108e98531291217627c0
[ROCm/clr commit: 77be355fd9]
Add trap handler code to the runtime and compile/load it during
device initialization. The current interface for the trap handler in
PAL is obsolete and the new interface will be provided later.
Change-Id: I1fa702c5d1f2e6731f781369c980d546cf422328
[ROCm/clr commit: e1d34cb24f]
- Program a unique AQL index for the debugger. The logic manages an array of AQL packets per HW queue.
- Provide debug state to PAL
Change-Id: I38fa1f5435fa711fd1d44dc391f2e61eb2a25efa
[ROCm/clr commit: d97cc0abbd]
Report proper target id for xnack in HSAIL path. Runtime
will use ISA table and report hsailName().
Fix offline compilation path for PAL.
Change-Id: Ic0250bf6b9c193d867aec9800a319da1bf00c3ee
[ROCm/clr commit: a543d4a860]
- Create hash values for binaries
- Add the binaries into RGP trace
- Add corresponding hash value for every dispatch
Change-Id: I2c3ce004d69f37d0d46bc4744e12f24273517f5e
[ROCm/clr commit: 2a298f2ec3]
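The three steps listed above can be sketched as follows. This is an illustration only: the Trace type and its members are hypothetical, and FNV-1a is used here as an assumed hash function, since the commit message does not say which hash the runtime uses.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

using Binary = std::vector<std::uint8_t>;

// 64-bit FNV-1a over the binary's bytes (assumed choice of hash).
std::uint64_t hashBinary(const Binary& bin) {
  std::uint64_t h = 14695981039346656037ull;  // FNV offset basis
  for (std::uint8_t b : bin) {
    h ^= b;
    h *= 1099511628211ull;  // FNV prime
  }
  return h;
}

// Hypothetical trace container.
struct Trace {
  std::unordered_map<std::uint64_t, Binary> binaries;  // hash -> binary
  std::vector<std::uint64_t> dispatches;               // one hash per dispatch

  // Stores the binary once per distinct hash and returns that hash.
  std::uint64_t addBinary(const Binary& bin) {
    const std::uint64_t h = hashBinary(bin);
    binaries.emplace(h, bin);  // no-op if the hash is already present
    return h;
  }

  // Tags a dispatch with the hash of the binary it came from.
  void recordDispatch(std::uint64_t binaryHash) {
    dispatches.push_back(binaryHash);
  }
};
```

Keying the stored binaries by hash lets a tool map each dispatch back to its binary without storing the binary more than once.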
PCMark10 counts the time spent in clCreateKernel as part of execution
time, so as a workaround for the PAL path, move code object loading
back to clBuildProgram.
Change-Id: I3b9cf1879ece08ab59f447ec165b0525bc8593a4
[ROCm/clr commit: 1d0364e590]
- Device Reset should not purge allocations that were not made by the user
- Addresses QMCPack Test abort due to the removal of all the mem objects during reset
Change-Id: I7b7a123e72bcc985d7e51d17c2382bc618d3e041
[ROCm/clr commit: 924695fb5e]
In HSAIL path, kernel akc info is obtained after code object
loading, and kernel signature creation requires the akc info
when one of the kernel arguments is a reference object.
Change-Id: I9cdb1dbf2c72f4620b0b6e46a88402a2473c3e97
[ROCm/clr commit: 8cac880779]
HIP should be built with HSAIL support disabled.
Currently HSAILProgram::info() and VirtualGPU::buildKernelInfo() expose
ACL interfaces directly. This should not be allowed.
Change-Id: Iae15d4f19be16806826f2f6cb600752c11f97fc1
[ROCm/clr commit: bbe6246f19]
Device enqueue has an option to execute scheduler on the current
queue and it's enabled by default. Make sure scratch is allocated
on the current queue for that case. Add max vgpr tracking per
program to adjust scratch size accordingly.
Change-Id: I2a6d796913a4551a1e7f343a2465d589eec60d8a
[ROCm/clr commit: e553b2763a]
This change makes HSAIL usage similar to that of Comgr. By default, the
runtime will statically link against it, however if HSAIL_DYN_DLL is
defined, then the runtime will try to dynamically load HSAIL.
For now, keep linking HSAIL statically. Dynamic loading will be
enabled in a future patch.
Change-Id: I6a78a4375975cf847f236b200404c8cf941d012b
[ROCm/clr commit: c7b50bb890]
- Correct GSL path to report targets using the TargetID syntax.
- Correct GSL path to check compatibility of code objects when
loading.
- Add the concept of a device ISA and create a registry used by ROCm,
PAL and GSL.
- Support XNACK and SRAMECC target features consistently for PAL and ROCm.
- Correct logic for NullDevices and asserts to avoid memory corruption.
- Allow all NullDevices to be created for HIP.
- Numerous other code improvements.
Change-Id: I40abf3d2b22249c1492d1af5919665f8184f4e0e
[ROCm/clr commit: c7e8d91e14]
For roc devices create hsa_agent_t handles using a pointer to the
device::Device. This ensures each device has a different hsa_agent_t
handle. This may be necessary to ensure the loader symbol lookup will
search only symbols for the correct device.
Change-Id: Iee6dd40d68bf22a02ce8c75cbe5ac8f5a0d9e418
[ROCm/clr commit: 76c371d78a]
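A minimal sketch of deriving a per-device handle from the device pointer, as the message above describes. AgentHandle is a stand-in that mirrors an opaque 64-bit handle, and Device, toHandle() and fromHandle() are hypothetical names; the real runtime types are not reproduced here.

```cpp
#include <cstdint>

// Stand-in for an opaque agent handle carrying a 64-bit value.
struct AgentHandle {
  std::uint64_t handle;
};

// Hypothetical device type.
struct Device {
  int index;
};

// Distinct live devices have distinct addresses, so their handles differ.
inline AgentHandle toHandle(const Device* dev) {
  return AgentHandle{
      static_cast<std::uint64_t>(reinterpret_cast<std::uintptr_t>(dev))};
}

// Recovers the device pointer from a handle produced by toHandle().
inline const Device* fromHandle(AgentHandle h) {
  return reinterpret_cast<const Device*>(
      static_cast<std::uintptr_t>(h.handle));
}
```

Because the handle is just the address, it stays unique only while the device object is alive.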
- Use std::unique_ptr to leverage RAII for managing the allocation in
the presence of errors.
Change-Id: I55de515bbf72938e1dd09731c5e51f538cf9d34a
[ROCm/clr commit: 682774a09d]
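A minimal sketch of the pattern, assuming a two-step "allocate, then create()" setup. Resource and makeResource() are hypothetical names used for illustration only.

```cpp
#include <memory>
#include <new>

// Hypothetical object with a fallible second-phase initialization.
struct Resource {
  static int live;  // number of Resource objects currently alive
  bool ok;
  explicit Resource(bool ok) : ok(ok) { ++live; }
  ~Resource() { --live; }
  bool create() const { return ok; }  // pretend initialization step
};
int Resource::live = 0;

// Returns an owning pointer on success, nullptr on failure.
std::unique_ptr<Resource> makeResource(bool ok) {
  std::unique_ptr<Resource> res(new (std::nothrow) Resource(ok));
  if (!res || !res->create()) {
    return nullptr;  // res goes out of scope and frees the object here
  }
  return res;  // ownership moves to the caller
}
```

Every early return releases the allocation automatically, so no error path needs an explicit delete.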
- Add assertions to enforce that objects are of the correct kind and
have been allocated.
- Make destructors check if objects have been allocated before
deleting.
- Operations that require a non-NullDevice return failure if given a
NullDevice.
- Use static_cast rather than reinterpret_cast when coercing from a
base class to a derived class.
Change-Id: I02ee0ea9d7982fd7ca29d49c9b02cfae111b7127
[ROCm/clr commit: e5431676d4]
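A minimal sketch of why static_cast is the safer downcast, combined with the "return failure for a NullDevice" rule from the list above. All class names are hypothetical. With multiple inheritance the base subobject may sit at a non-zero offset; static_cast adjusts the pointer accordingly, while reinterpret_cast would reuse the address unchanged.

```cpp
// Hypothetical base device; the default behaves like a NullDevice.
struct BaseDevice {
  virtual ~BaseDevice() = default;
  virtual bool isNull() const { return true; }
};

// Unrelated first base, so BaseDevice is not at offset zero below.
struct Other {
  virtual ~Other() = default;
  int pad = 0;
};

// Hypothetical real device.
struct DerivedDevice : Other, BaseDevice {
  bool isNull() const override { return false; }
  int queueCount() const { return 4; }
};

// Returns the derived device, or nullptr (failure) for a null pointer or
// a NullDevice-like object.
inline DerivedDevice* asDerived(BaseDevice* dev) {
  if (dev == nullptr || dev->isNull()) return nullptr;
  return static_cast<DerivedDevice*>(dev);  // applies the pointer adjustment
}
```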
Rename functions that access devices to reflect the derived device
they return. This includes the base device::Device and the derived
gpu/pal/roc device classes in both NullDevice and Device forms. Change
to use the least derived versions to clarify what operations will be
available.
Change-Id: I1abb6bfed7efa24852bc8d0d49acaea357d8b5d0
[ROCm/clr commit: 001fd66cac]
When HIP_ENABLE_DEFERRED_LOADING=0, many global variables are
referenced before they have been initialized. This patch uses
constexpr to initialize global constant variables at compile
time.
Change-Id: I9d538b7abc6a0ce700ec3332b97fc144db5fc1ef
[ROCm/clr commit: fdef6f722f]
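A minimal sketch of the technique, with hypothetical constant names. A constexpr global is constant-initialized, so it already holds its value before any dynamic initializer runs, whichever translation unit reads it first.

```cpp
#include <cstddef>

// Hypothetical constants, for illustration only.
constexpr std::size_t kKi = 1024;
constexpr std::size_t kMi = kKi * kKi;

struct Limits {
  std::size_t maxAlloc;
  unsigned waveSize;
};

// Evaluated at compile time; no runtime initializer is emitted.
constexpr Limits kDefaultLimits{256 * kMi, 64};

static_assert(kDefaultLimits.maxAlloc == 268435456,
              "value is available at compile time");

// Safe to call from another global's initializer during early startup.
std::size_t earlyStartupQuery() {
  return kDefaultLimits.maxAlloc / kKi;
}
```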
These are unnecessary and an obstacle to producing a relocatable
package.
Change-Id: I0059bf7a2d11fcece0cd7ab47d7545d0df4d7099
[ROCm/clr commit: 1d267c9c08]