* clr: Adjust call to ICmdBuffer::CmdCopyMemoryToImage for PAL >= 955
PAL starting versino 955 adds a new argument to
ICmdBuffer::CmdCopyMemoryToImage. Adjust teh callsite to account
fort his.
* clr: Handle new GpuUtil::TraceSessionState cases for PAL >= 939
Starting PAL API version 939, GpuUtil::TraceSessionState changes its
possible values. Adjust for it.
* clr: require PAL version 954
Bump the PAL required vesion to 954, as this is required for proper
debugger support.
1. Create a set of mini numa interface.
In Linux, the interface is based on system call rather than libnuma.
In Windows, the interface can also work, but the policy class is dummy.
Different from Linux, Windows doesn't provide numactl tool or numa lib to setup numa policy, thus
the default policy is followed in Windows, that is, using the closest host numa node to allocate
pinned host memory in hipHostMalloc().
To get the closest host numa node of a GPU device, you need query the new attribute
hipDeviceAttributeHostNumaId. Then you can create a thread with CPU affinity on the numa node.
For example, reference the test in hip-tests/catch/perftests/memory/hipPerfHostNumaAllocWin.cc.
2. Remove pfnSetThreadGroupAffinity and pfnGetNumaNodeProcessorMaskEx as the functions have been exposed since Win7 and Win server 2008.
3. Other minor fixes.
Prior we were enabling dynamic loading mode if BUILD_SHARED_LIBS, but this is not correct. We should only be loading dynamically if the amd_comgr library itself is shared.
Background: we have a configuration where we use a static linked comgr stub in order to achieve LLVM isolation (it dynamically loads the comgr and compiler into a dedicated link namespace) in an otherwise dynamic linked clr.
The build of ROCR backend will be enabled by default in Windows.
It requires the dll loader until ROCR dll will be always available in Windows for any configuration.
* SWDEV-520352 - Remove HostThread and legacy monitor
Remove HostThread, semaphore and legacy monitor.
Make original logics of thread and command queue stricker.
Add more comments to make logics clearer.
Some other minor improvement.
Also part of SWDEV-458943.
[ROCm/clr commit: 96cadbc9e9]
Add the new cmake option AMD_COMPUTE_WIN to build HIP on Windows
from the public github. AMD_COMPUTE_WIN should point to a special
repo with the PAL static libs
[ROCm/clr commit: a3effa16f1]
Add initial implementation of virtual memory heap with
dynamic virtual memory mapping support for memory pools.
DEBUG_HIP_MEM_POOL_VMHEAP controls the new method.
Change-Id: I8dc5be2e0f34ab472f1800f43bb6243639a5e500
[ROCm/clr commit: 296dce5570]
- Header files inside rocclr/utils when included from hipamd or opencl should be included as #include "rocclr/utils/xxx.h" instead of "utils/xxx.h"
Change-Id: Ic0760c33b9d091f5620dec67e5482c9698d22093
[ROCm/clr commit: 78f62d3230]
This PR adds UberTrace-based tracing support to ROCclr's PAL device class.
Legacy RGP-based tracing is still available and is the default.
If UberTrace support is enabled tool-side, this new code path will activate.
Change-Id: I268b2dcef70e850a50e2caef8355f38bf51d4641
[ROCm/clr commit: e550032d25]
Restore PAL platform destruction.
Update CmdAllocatorCreateInfo::AllocInfo for the new interface.
Change-Id: Iea418eed7ee26166039a4a9cc1999438856e9097
[ROCm/clr commit: bd00826446]