- Added missing validation, as a graph node should not be created
if its parameters are invalid
- Fix conversion of input params to graphNode params
Change-Id: I37ab04942b5fb2eb07386850cb7dbbf26f9ca967
[ROCm/clr commit: db8527f655]
If a module is loaded on one device, hipModuleGetFunction and other similar APIs should be able to run successfully from another device.
Change-Id: I96084cbd6c6dcf2a81019779a6ab1842ef2f35d1
[ROCm/clr commit: c46f843b99]
This is to avoid calling the HIP_INIT macro during the shutdown process.
Change-Id: I2e65f6e10491918a17445ee1e8ddd08286070358
[ROCm/clr commit: 5e3a29078d]
Adds UberTrace support for pre-dispatch markers and barrier begin/end markers.
Moves shared definitions out of palgpuopen.hpp into the shared header
palcapturemgr.hpp.
Change-Id: I9f464c689e7ff12c54eca043fc1ad65e1836a64f
[ROCm/clr commit: 541c449ce2]
- When using shader copy, make sure to use release scope for the AQL
packet. This is a potential bug, but it is hidden because hipMemcpyAsync
always needs synchronization (which inserts a barrier with release
scope), and for hipMemcpy we use a barrier packet to make sure it is
blocking. Either way a barrier is always used, which masks the
potential bug.
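
A minimal sketch of what "release scope in the AQL packet" refers to,
assuming the standard HSA packet header layout. The constants below are
local mirrors of the HSA_PACKET_HEADER_* and HSA_FENCE_SCOPE_* values
from hsa.h, and MakeAqlHeader/ReleaseScopeOf are hypothetical helpers,
not functions from the runtime:

```cpp
#include <cassert>
#include <cstdint>

// Bit positions in the 16-bit AQL packet header (mirrors of hsa.h values).
constexpr unsigned kHeaderTypeShift = 0;           // 8-bit packet type
constexpr unsigned kHeaderAcquireScopeShift = 9;   // 2-bit acquire fence scope
constexpr unsigned kHeaderReleaseScopeShift = 11;  // 2-bit release fence scope

enum FenceScope : uint16_t { kScopeNone = 0, kScopeAgent = 1, kScopeSystem = 2 };
constexpr uint16_t kPacketTypeKernelDispatch = 2;

// Hypothetical helper: pack type and fence scopes into a header word.
inline uint16_t MakeAqlHeader(uint16_t type, FenceScope acquire, FenceScope release) {
  assert(type < 256);  // the type field is 8 bits wide
  return static_cast<uint16_t>((type << kHeaderTypeShift) |
                               (acquire << kHeaderAcquireScopeShift) |
                               (release << kHeaderReleaseScopeShift));
}

// Hypothetical helper: extract the release fence scope from a header word.
inline FenceScope ReleaseScopeOf(uint16_t header) {
  return static_cast<FenceScope>((header >> kHeaderReleaseScopeShift) & 0x3);
}
```

A shader-copy dispatch built with kScopeNone as its release scope would
rely on a later barrier packet to make its writes visible, which is the
dependency described above.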
Change-Id: I57fb7f769c3179e76d712471c0905104c801d7ba
[ROCm/clr commit: c9dd95bf6c]
- Resolve the stream once for event record. We should avoid calling
getStream again in addMarker.
Change-Id: I78448c4f151ae10a5c8e8c248b2f4078b84191cb
[ROCm/clr commit: a22c45d635]
- When we use blit (compute) copies, two subsequent copies may read from
the same source buffer. The buffer may get modified by the host in
between, and if the source buffer was allocated with the non-coherent
flag, the device may simply use a stale value from a previous cacheline
fetch. This is a corner case.
Change-Id: I2ce261c6f6fa4e5bb608f116548e5cc711ae6f3c
[ROCm/clr commit: b63005d550]
This reverts commit 0830d95f6d.
Reason for revert: the memcpy size needs to change
Change-Id: If4f51769731e54743ac705b19b4f81b2d5925d5a
[ROCm/clr commit: 446ed661a0]
We are passing this arg as an address, and memcpy complains about
overreading (8 bytes instead of 4).
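
The overread can be illustrated with a small sketch. ReadU32Arg is a
hypothetical helper, not the function touched by this change; it only
shows sizing the copy by the argument's type rather than by the size of
a pointer:

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: read a 4-byte argument that is passed by address.
inline uint32_t ReadU32Arg(const void* arg) {
  assert(arg != nullptr);
  uint32_t value;
  // Size the copy by the value type (4 bytes). Using sizeof(void*) here
  // would read 8 bytes on a 64-bit build and overrun the 4-byte source.
  std::memcpy(&value, arg, sizeof(value));
  return value;
}
```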
Change-Id: Ica9207f6c5f6056a4bfc968280c76e779ded13ae
[ROCm/clr commit: a6f2a2c2af]
- Added DEBUG_CLR_SKIP_RELEASE_SCOPE flag to force the release scope to
SCOPE_NONE in the AQL packet header
Change-Id: Ife02cddb9d5cd4749103ce585d3d5fe9024c6868
[ROCm/clr commit: 8155943c5f]
The URI decoder currently silently skips memory URIs. This patch
enables the existing logic to process the offset and size of a loaded
code object that has a memory URI.
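
A rough sketch of extracting offset and size from a memory URI. The
URI shape memory://<pid>#offset=<value>&size=<value> is an assumption
here, and MemoryUri/ParseMemoryUri are hypothetical names, not the
decoder modified by this patch:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <exception>
#include <string>

struct MemoryUri {
  uint64_t pid;
  uint64_t offset;
  uint64_t size;
};

// Hypothetical parser for "memory://<pid>#offset=<value>&size=<value>".
// Returns false for any other scheme or for malformed input.
inline bool ParseMemoryUri(const std::string& uri, MemoryUri* out) {
  assert(out != nullptr);
  const std::string scheme = "memory://";
  const std::string offset_key = "#offset=";
  const std::string size_key = "&size=";
  if (uri.compare(0, scheme.size(), scheme) != 0) return false;
  const std::size_t off_pos = uri.find(offset_key, scheme.size());
  if (off_pos == std::string::npos) return false;
  const std::size_t off_begin = off_pos + offset_key.size();
  const std::size_t size_pos = uri.find(size_key, off_begin);
  if (size_pos == std::string::npos) return false;
  try {
    // pid is decimal; base 0 lets offset and size be decimal or 0x-prefixed.
    out->pid = std::stoull(uri.substr(scheme.size(), off_pos - scheme.size()), nullptr, 10);
    out->offset = std::stoull(uri.substr(off_begin, size_pos - off_begin), nullptr, 0);
    out->size = std::stoull(uri.substr(size_pos + size_key.size()), nullptr, 0);
  } catch (const std::exception&) {
    return false;  // empty or non-numeric field
  }
  return true;
}
```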
Change-Id: If03579cefb11d91f667410464dc89404df9270a3
[ROCm/clr commit: 11cd37ce0b]
Updated CHANGELOG to include the performance fix for
kernel launch latency with increasing number of idle streams.
Change-Id: I509e14cb8f8cd3abe61c6ede78808e96ef8f06e1
[ROCm/clr commit: a55118f63d]
NOPTION is meant for component options or aliases of runtime options,
so either the option group must not be OA_RUNTIME or OA_MISC_ALIAS must
be set. Otherwise we incorrectly assume that the option has an option
variable, and attempting to write to it corrupts OptionVariables.
Change-Id: Iafb5a8f743e5ed0f87be36061c44578178f6cfde
[ROCm/clr commit: caa10572cb]
The vector with all kernels is preallocated at executable init.
Thus, reduce the scope of the global lock to binary creation only.
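
A simplified sketch of the narrowed lock scope. Executable, Kernel and
Build are hypothetical stand-ins for the real classes, and the sketch
assumes that concurrent callers write to distinct preallocated slots:

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

struct Kernel {
  std::string name;
};

class Executable {
 public:
  // All kernel slots are allocated up front, so the vector never resizes.
  explicit Executable(std::size_t kernel_count) : kernels_(kernel_count) {}

  void Build(std::size_t index, const std::string& name) {
    assert(index < kernels_.size());
    {
      // Only binary creation touches shared state, so only it is locked.
      std::lock_guard<std::mutex> guard(GlobalLock());
      ++binaries_created_;
    }
    // Writing to a preallocated slot needs no global lock, provided each
    // caller uses a different index.
    kernels_[index].name = name;
  }

  const Kernel& kernel(std::size_t index) const { return kernels_[index]; }
  std::size_t binaries_created() const { return binaries_created_; }

 private:
  static std::mutex& GlobalLock() {
    static std::mutex lock;
    return lock;
  }

  std::vector<Kernel> kernels_;
  std::size_t binaries_created_ = 0;
};
```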
Change-Id: I73035013a6562175069137e895bba815f466ee35
[ROCm/clr commit: 0640d36019]
- Added wptr and rptr to ClPrint in dispatchBarrierPacket and dispatchBarrierValuePacket
Change-Id: I8a62289deb23c9f657a9b0ac6138bb55eafecba2
[ROCm/clr commit: 078fe7e5de]
Support the gfx9-4-generic target to cover mi3XX.
Support the sramecc and xnack features in the generic target.
Improve code formatting.
Add more compiler logging.
Change-Id: I6b3c6af55c60cffd43ce6f17b75998f751b75713
[ROCm/clr commit: 3ad8f1b811]