Remove hard assertions for signal validation on hsa_amd_signal_wait_* operations, instead ignore 0/NULL/invalid signals in the dependency condition evaluation to align with HSA specs for barrier-AND and barrier-OR packets.
Signed-off-by: zichguan-amd <zichuan.guan@amd.com>
The scratch_backing_memory_byte_size is not used by CP, but it is
currently used by rocgdb. Putting the field back, but we need to find a
solution for alt_scratch_backing_memory_byte_size.
Also, completely disabling alternate scratch as we need some changes to
support debugger.
Check for RLIMIT_CORE before collecting data for coredump. If the
current limit is 0, then we can return early without spending time
collecting coredump data.
A HSA_IMAGE_ENABLE_3D_SWIZZLE_DEBUG environment flag exists already to
enable/disable this. Default value is false (view3dAs2dArray = 1)
Enabling this flag will enable support for swizzles that do 3D
interleaving on GFX9, GF10 and GFX11. By default support for swizzles that
do 3D interleaving is disabled.
Use the core Driver object in the CPU agent to make it OS/driver
agnostic.
Implement the GetMemoryProperties() and GetCacheProperties methods
for the KFD driver.
Add support for these 2 new queries:
- HSA_AMD_AGENT_INFO_SCRATCH_LIMIT_MAX
Maximum amount of scratch memory allowed on this agent
- HSA_AMD_AGENT_INFO_SCRATCH_LIMIT_CURRENT
Current limit for scratch memory on this agent
Updating ROCr code to match new handshake protocol with CP FW for
asynchronous scratch reclaim.
Increase previous limits when scratch reclaim feature is available.
- When waiting on non-interrupt signals, do not uSleep. This causes
regressions compared to interrupt signal usage.
- Cleanup code.
Change-Id: I706bda0b13e64ffec0b607c1915d8380a2ce0dea
BUILD_SHARED_LIBS is a global flag so we don't need to set a default
option for it in both libhsakmt and hsa-runtime, only the top level
CMakeLists file. Also updated README to reflect that libhsakmt is
always built statically and gets linked to libhsa-runtime.
Change-Id: I1511f68a268032bec9758bc731d8074f33ec980f
Added HSA_IMAGE_ENABLE_3D_SWIZZLE_DEBUG environment flag to
enable/disable this. Default value is false (view3dAs2dArray = 1)
Enabling this flag will enable support for swizzles that do 3D
interleaving. Note that all features of 3D images are supported
with 2D swizzles,it's just that the access patterns are different
and therefore cache hit-rates may be better or worse, depending
on how it's used. Volumetric algorithms do better with 3D and apps
that tend to access a single slice at a time do better with 2D.
Change-Id: Id8574a6710fe4333a1ee331e5ce9195a81434198
Replaces WaitAny with WaitMultiple to more closely align with the
underlying driver API for waiting on multiple events.
WaitMultiple adds a single parameter, wait_on_all, to the WaitAny
interface providing a single function for waiting on multiple
events when we only need AND and OR semantics for the signal
checking logic.
Change-Id: I68a4a45d48151d9d69aef02fd8f7263b9e6c0e75
<squashed with patch for gfx950 generic targets>
Signed-off-by: Chris Freehill <Chris.Freehill@amd.com>
Change-Id: Ifec6d93cf46c7fbf736c6572882299e279260af6
Generalize the driver discovery and move driver-specific
functionality to the concrete driver implementations.
Currently, this process is tightly coupled to the hsakmt
which is GPU and OS specific.
Change-Id: Ie1c53fef407a71b5ec4c6eaf3a3ed00871184409
The recent static initialization changes cause this clean up to
happen when it previously never did. The result of ~RuntimeCleanup()
being executed is that the static global "loaded_" is set to false,
which in turn prevents hsa_init() from executing again. Clean up
already happens when hsa_shut_down() occurs.
Change-Id: Ib5cefb80d82880c1945e04eb6ec246bc2c7d2324
Minor modifications to multiple source and header
files based on Coverity report
Change-Id: I4a73d0f56640983c4d5124e13c8c280245cca672
Signed-off-by: Apurv Mishra <apurv.mishra@amd.com>
Add return checks, initialization and clean
redundant memory operations
fix 1: check return value of 'setsockopt' for error
fix 2: check return value of 'PtrInfo' for error
fix 3: move 'tool_names' instead of copying
fix 4: call 'munmap' for 'va' only once
fix 5: use 'ssize' for possible return values of -1 (err)
fix 6: add missing initialization in constructors
fix 7: add initialization for some scalars and pointers
Change-Id: I07d90e36d4e1fe48c4de4f44e18083e5ed4c5fbc
Signed-off-by: Apurv Mishra <apurv.mishra@amd.com>
Make sure waiting_ count for queue signal is always > 0 so that we
always call hsaKmtWaitOnEvent to force hsaKmtWaitOnEvent to return.
Remove incorrect warning print when running in debug mode.
Call internal Signal::WaitAny instead of AMD::hsa_amd_signal_wait_any
to avoid extra function calls.
Change-Id: I9e41b704643e4e8ee7402b1379b1c30ff4c544ef