Remove hard assertions for signal validation on hsa_amd_signal_wait_* operations. Instead, ignore 0/NULL/invalid signals when evaluating the dependency condition, to align with the HSA spec for barrier-AND and barrier-OR packets.
Signed-off-by: zichguan-amd <zichuan.guan@amd.com>
[ROCm/ROCR-Runtime commit: e4d027191c]
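As a rough illustration of the dependency evaluation described in the hsa_amd_signal_wait_* change above, the sketch below skips null handles instead of asserting on them. FakeSignal, BarrierAndSatisfied and BarrierOrSatisfied are hypothetical names, not ROCr code. The sketch treats a dependency as satisfied when its value is 0, as barrier packets do, and the handling of an all-null barrier-OR list is an assumption.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a signal; this is not the ROCr Signal class.
struct FakeSignal {
  uint64_t handle;  // 0 means "no signal" (NULL handle)
  int64_t value;    // the dependency counts as satisfied when value == 0
};

// Barrier-AND style check: every non-null dependency must have reached 0.
// Null handles are skipped rather than asserted on.
bool BarrierAndSatisfied(const std::vector<FakeSignal>& deps) {
  for (const FakeSignal& s : deps) {
    if (s.handle == 0) continue;  // ignore 0/NULL signals
    if (s.value != 0) return false;
  }
  return true;
}

// Barrier-OR style check: any non-null dependency that has reached 0 is enough.
// Assumption of this sketch: if every handle is null there is nothing to wait
// for, so the condition is reported as satisfied.
bool BarrierOrSatisfied(const std::vector<FakeSignal>& deps) {
  bool saw_valid = false;
  for (const FakeSignal& s : deps) {
    if (s.handle == 0) continue;  // ignore 0/NULL signals
    saw_valid = true;
    if (s.value == 0) return true;
  }
  return !saw_valid;
}
```

A null entry never blocks the AND case and never satisfies the OR case on its own, so a caller can leave unused dependency slots at 0.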
The scratch_backing_memory_byte_size is not used by CP, but it is
currently used by rocgdb. Putting the field back, but we need to find a
solution for alt_scratch_backing_memory_byte_size.
Also, completely disable alternate scratch for now, as it needs some
changes to support the debugger.
[ROCm/ROCR-Runtime commit: 02b38d0614]
We cannot guarantee system-scope coherency on systems with only PCIe
connections, so do not expose extended fine-grain memory pool on these
systems.
[ROCm/ROCR-Runtime commit: 6dac90c89a]
Check for RLIMIT_CORE before collecting data for coredump. If the
current limit is 0, then we can return early without spending time
collecting coredump data.
[ROCm/ROCR-Runtime commit: d031af9eb5]
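A minimal sketch of the early-out described in the RLIMIT_CORE change above, using the POSIX getrlimit() call. CoreDumpEnabled is a hypothetical helper name, and the choice to keep collecting data when the query itself fails is an assumption, not something the commit states.

```cpp
#include <cassert>
#include <sys/resource.h>

// Returns false when the soft core-file size limit is 0, in which case no
// core file would be written and data collection can be skipped.
// Hypothetical helper; not the function name used in ROCr.
bool CoreDumpEnabled() {
  struct rlimit limit;
  if (getrlimit(RLIMIT_CORE, &limit) != 0) {
    // Assumption: if the limit cannot be queried, do not skip collection.
    return true;
  }
  return limit.rlim_cur != 0;
}
```

A caller would check CoreDumpEnabled() first and return before gathering any per-agent or per-queue state when it reports false.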
A HSA_IMAGE_ENABLE_3D_SWIZZLE_DEBUG environment flag exists already to
enable/disable this. Default value is false (view3dAs2dArray = 1)
Enabling this flag will enable support for swizzles that do 3D
interleaving on GFX9, GFX10 and GFX11. By default support for swizzles that
do 3D interleaving is disabled.
[ROCm/ROCR-Runtime commit: 0984a1f0fd]
Use the core Driver object in the CPU agent to make it OS/driver
agnostic.
Implement the GetMemoryProperties() and GetCacheProperties() methods
for the KFD driver.
[ROCm/ROCR-Runtime commit: a9f6bc8d0e]
Add support for these 2 new queries:
- HSA_AMD_AGENT_INFO_SCRATCH_LIMIT_MAX
Maximum amount of scratch memory allowed on this agent
- HSA_AMD_AGENT_INFO_SCRATCH_LIMIT_CURRENT
Current limit for scratch memory on this agent
[ROCm/ROCR-Runtime commit: 107b48fb15]
Updating ROCr code to match new handshake protocol with CP FW for
asynchronous scratch reclaim.
Increase previous limits when scratch reclaim feature is available.
[ROCm/ROCR-Runtime commit: aa2f98e6f9]
Allow IPC signals to be registered with hsa_amd_signal_async_handler.
This forces AsyncEventsLoop to switch to polling instead of interrupts.
[ROCm/ROCR-Runtime commit: fa8be44df9]
- When waiting on non-interrupt signals, do not uSleep. This causes
regressions compared to interrupt signal usage.
- Cleanup code.
Change-Id: I706bda0b13e64ffec0b607c1915d8380a2ce0dea
[ROCm/ROCR-Runtime commit: 890399a7cf]
Set underlying type of hsa_region_info_t, hsa_amd_region_info_t
to int.
Change-Id: Ibf97a025eec6176d8e28af8009e9bd6795ca061f
[ROCm/ROCR-Runtime commit: 166b08346b]
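For readers unfamiliar with fixed underlying types, the sketch below shows what the enum underlying-type change above means in C++ terms. The enum and its enumerators are hypothetical and only mimic the shape of a core enum with vendor extension values. How the real headers spell this for C compilers is not shown here.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical enum for illustration; not the real hsa_region_info_t.
enum example_region_info_t : int {
  EXAMPLE_REGION_INFO_SEGMENT = 0,
  EXAMPLE_REGION_INFO_SIZE = 2,
  // A vendor extension value, far outside the range of the core values.
  EXAMPLE_AMD_REGION_INFO_HOST_ACCESSIBLE = 0xA000
};

// With a fixed underlying type, the enum's representation is exactly int.
static_assert(
    std::is_same<std::underlying_type<example_region_info_t>::type, int>::value,
    "underlying type should be int");

// Any int value can be converted to the enum type with defined behavior,
// because the underlying type is fixed.
inline example_region_info_t ToRegionInfo(int raw) {
  return static_cast<example_region_info_t>(raw);
}
```

One plausible benefit, stated here as an assumption, is that extension attribute values can be passed through APIs that take the core enum type without relying on the compiler's choice of representation.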
BUILD_SHARED_LIBS is a global flag so we don't need to set a default
option for it in both libhsakmt and hsa-runtime, only the top level
CMakeLists file. Also updated README to reflect that libhsakmt is
always built statically and gets linked to libhsa-runtime.
Change-Id: I1511f68a268032bec9758bc731d8074f33ec980f
[ROCm/ROCR-Runtime commit: ff01f62777]
Added HSA_IMAGE_ENABLE_3D_SWIZZLE_DEBUG environment flag to
enable/disable this. Default value is false (view3dAs2dArray = 1)
Enabling this flag will enable support for swizzles that do 3D
interleaving. Note that all features of 3D images are supported
with 2D swizzles; it's just that the access patterns are different
and therefore cache hit-rates may be better or worse, depending
on how it's used. Volumetric algorithms do better with 3D and apps
that tend to access a single slice at a time do better with 2D.
Change-Id: Id8574a6710fe4333a1ee331e5ce9195a81434198
[ROCm/ROCR-Runtime commit: 6361466baa]
Replaces WaitAny with WaitMultiple to more closely align with the
underlying driver API for waiting on multiple events.
WaitMultiple adds a single parameter, wait_on_all, to the WaitAny
interface. This gives us one function for waiting on multiple
events, since the signal checking logic only needs AND and OR
semantics.
Change-Id: I68a4a45d48151d9d69aef02fd8f7263b9e6c0e75
[ROCm/ROCR-Runtime commit: 8a38f121ea]
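The AND/OR selection that wait_on_all provides in the WaitMultiple change above can be sketched as a single predicate. WaitConditionMet and the vector of per-event flags are hypothetical; the real WaitMultiple signature and the driver call behind it are not reproduced here.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical predicate: 'signaled' holds one flag per event being waited on.
// wait_on_all == true  -> AND semantics: every event must be signaled.
// wait_on_all == false -> OR semantics: one signaled event is enough.
bool WaitConditionMet(const std::vector<bool>& signaled, bool wait_on_all) {
  auto is_set = [](bool flag) { return flag; };
  if (wait_on_all) {
    return std::all_of(signaled.begin(), signaled.end(), is_set);
  }
  return std::any_of(signaled.begin(), signaled.end(), is_set);
}
```

Folding both modes behind one boolean avoids separate wait-any and wait-all entry points that would differ only in this final check.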
Set priority to maximum for signal event handler and minimum for
exceptions event handler.
Change-Id: I1b982d3c2e4c880fafc073fe1a542d01692a6fdc
[ROCm/ROCR-Runtime commit: 7ea25ebb85]
Generalize the driver discovery and move driver-specific
functionality to the concrete driver implementations.
Currently, this process is tightly coupled to the hsakmt
which is GPU and OS specific.
Change-Id: Ie1c53fef407a71b5ec4c6eaf3a3ed00871184409
[ROCm/ROCR-Runtime commit: 15107afb11]
The recent static initialization changes cause this clean up to
happen when it previously never did. The result of ~RuntimeCleanup()
being executed is that the static global "loaded_" is set to false,
which in turn prevents hsa_init() from executing again. Clean up
already happens when hsa_shut_down() occurs.
Change-Id: Ib5cefb80d82880c1945e04eb6ec246bc2c7d2324
[ROCm/ROCR-Runtime commit: b1d6cacf79]
This adds support for GC version 11.5.3
Change-Id: I1d55e33198620d3493967558c25c636d5f7ab347
Signed-off-by: Tim Huang <tim.huang@amd.com>
[ROCm/ROCR-Runtime commit: e515b0bca5]
This is to avoid use after free at the program's end, when statics
are destructed.
Change-Id: Id6bf26f25a58d13bdf1ee99c852adae8add76569
[ROCm/ROCR-Runtime commit: 67b0082443]
if .supports_exception_debugging is not enabled.
Change-Id: I944fe7aa4f3068964f47e23f5259c3802d1e9556
Signed-off-by: Flora Cui <flora.cui@amd.com>
[ROCm/ROCR-Runtime commit: ac64c54d74]
Minor modifications to multiple source and header
files based on Coverity report
Change-Id: I4a73d0f56640983c4d5124e13c8c280245cca672
Signed-off-by: Apurv Mishra <apurv.mishra@amd.com>
[ROCm/ROCR-Runtime commit: 699d0140be]
Add return checks and initialization, and clean up
redundant memory operations
fix 1: check return value of 'setsockopt' for error
fix 2: check return value of 'PtrInfo' for error
fix 3: move 'tool_names' instead of copying
fix 4: call 'munmap' for 'va' only once
fix 5: use 'ssize' for possible return values of -1 (err)
fix 6: add missing initialization in constructors
fix 7: add initialization for some scalars and pointers
Change-Id: I07d90e36d4e1fe48c4de4f44e18083e5ed4c5fbc
Signed-off-by: Apurv Mishra <apurv.mishra@amd.com>
[ROCm/ROCR-Runtime commit: 441bd9fe6c]
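As an illustration of the kind of change listed as "fix 3" in the Coverity cleanup above, the sketch below takes a list of names by value and moves it into a member instead of copying it. ToolRegistry is a hypothetical class; the actual code that holds tool_names in ROCr is not shown here.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical holder for a list of tool library names.
class ToolRegistry {
 public:
  // Take the vector by value and move it into the member, so a caller that
  // passes a temporary or uses std::move pays for no element-wise copy.
  explicit ToolRegistry(std::vector<std::string> tool_names)
      : tool_names_(std::move(tool_names)) {}

  const std::vector<std::string>& tool_names() const { return tool_names_; }

 private:
  std::vector<std::string> tool_names_;
};
```

A caller that no longer needs its own copy would write ToolRegistry registry(std::move(names)); afterwards the caller's vector should be treated as unspecified and not read.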
Make sure waiting_ count for queue signal is always > 0 so that we
always call hsaKmtWaitOnEvent to force hsaKmtWaitOnEvent to return.
Remove incorrect warning print when running in debug mode.
Call internal Signal::WaitAny instead of AMD::hsa_amd_signal_wait_any
to avoid extra function calls.
Change-Id: I9e41b704643e4e8ee7402b1379b1c30ff4c544ef
[ROCm/ROCR-Runtime commit: 5da1889fb7]