Adding a general stage for agents to release their resources on
shutdown. This avoids a circular dependency during shutdown because
we have to delete allocated resources before deleting memory pools, but
we also have to delete memory pools before destroying agents.
In PcSamplingCreateFromId, convert number of bytes into number of
dwords because DmaFill expects a count of 32-bit words, not raw bytes.
This prevents OOB writes on large sampling buffers.
The over arching goal it so provide an API that pre-silicon models can latch into for software bring up.# Please enter the commit message for your changes. Lines starting
Poll the dependent signals twice on all gfx9.0 GPUs except gfx90a.
This is needed as a work-around for a rare issue where SDMA_POLL_REGMEM
may return before the memory is actually cleared.
Resets event_age when signals move. Prior to this PR, event_age
can become unaligned with hsa_event, causing hangs if the event_age
exceeds the true hsa_event age.
Remove hard assertions for signal validation on hsa_amd_signal_wait_* operations, instead ignore 0/NULL/invalid signals in the dependency condition evaluation to align with HSA specs for barrier-AND and barrier-OR packets.
Signed-off-by: zichguan-amd <zichuan.guan@amd.com>
The scratch_backing_memory_byte_size is not used by CP, but it is
currently used by rocgdb. Putting the field back, but we need to find a
solution for alt_scratch_backing_memory_byte_size.
Also, completely disabling alternate scratch as we need some changes to
support debugger.
Check for RLIMIT_CORE before collecting data for coredump. If the
current limit is 0, then we can return early without spending time
collecting coredump data.
A HSA_IMAGE_ENABLE_3D_SWIZZLE_DEBUG environment flag exists already to
enable/disable this. Default value is false (view3dAs2dArray = 1)
Enabling this flag will enable support for swizzles that do 3D
interleaving on GFX9, GF10 and GFX11. By default support for swizzles that
do 3D interleaving is disabled.
Use the core Driver object in the CPU agent to make it OS/driver
agnostic.
Implement the GetMemoryProperties() and GetCacheProperties methods
for the KFD driver.
Add support for these 2 new queries:
- HSA_AMD_AGENT_INFO_SCRATCH_LIMIT_MAX
Maximum amount of scratch memory allowed on this agent
- HSA_AMD_AGENT_INFO_SCRATCH_LIMIT_CURRENT
Current limit for scratch memory on this agent
Updating ROCr code to match new handshake protocol with CP FW for
asynchronous scratch reclaim.
Increase previous limits when scratch reclaim feature is available.
- When waiting on non-interrupt signals, do not uSleep. This causes
regressions compared to interrupt signal usage.
- Cleanup code.
Change-Id: I706bda0b13e64ffec0b607c1915d8380a2ce0dea
BUILD_SHARED_LIBS is a global flag so we don't need to set a default
option for it in both libhsakmt and hsa-runtime, only the top level
CMakeLists file. Also updated README to reflect that libhsakmt is
always built statically and gets linked to libhsa-runtime.
Change-Id: I1511f68a268032bec9758bc731d8074f33ec980f
Added HSA_IMAGE_ENABLE_3D_SWIZZLE_DEBUG environment flag to
enable/disable this. Default value is false (view3dAs2dArray = 1)
Enabling this flag will enable support for swizzles that do 3D
interleaving. Note that all features of 3D images are supported
with 2D swizzles,it's just that the access patterns are different
and therefore cache hit-rates may be better or worse, depending
on how it's used. Volumetric algorithms do better with 3D and apps
that tend to access a single slice at a time do better with 2D.
Change-Id: Id8574a6710fe4333a1ee331e5ce9195a81434198
Replaces WaitAny with WaitMultiple to more closely align with the
underlying driver API for waiting on multiple events.
WaitMultiple adds a single parameter, wait_on_all, to the WaitAny
interface providing a single function for waiting on multiple
events when we only need AND and OR semantics for the signal
checking logic.
Change-Id: I68a4a45d48151d9d69aef02fd8f7263b9e6c0e75
<squashed with patch for gfx950 generic targets>
Signed-off-by: Chris Freehill <Chris.Freehill@amd.com>
Change-Id: Ifec6d93cf46c7fbf736c6572882299e279260af6