Add HSA_ENABLE_SRAMECC environment variable that can be used to
override SRAM ECC mode reported by KFD
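
A minimal sketch of how such an override could be resolved. The helper
names and the accepted values ("0" and "1") are assumptions made for
illustration; the actual parsing in ROCr may differ.

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: returns the effective SRAM ECC mode.
// kfd_reported is the mode reported by KFD; env_value is the raw value
// of HSA_ENABLE_SRAMECC, or nullptr when the variable is unset.
// Assumption: "0" disables, "1" enables, anything else is ignored.
bool ResolveSramEcc(bool kfd_reported, const char* env_value) {
  if (env_value == nullptr) return kfd_reported;
  if (std::strcmp(env_value, "0") == 0) return false;
  if (std::strcmp(env_value, "1") == 0) return true;
  return kfd_reported;
}

// Convenience wrapper that reads the process environment.
bool EffectiveSramEcc(bool kfd_reported) {
  return ResolveSramEcc(kfd_reported, std::getenv("HSA_ENABLE_SRAMECC"));
}
```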
Change-Id: I2b95511820a2d3d146a76b03070659c0695b61fd
[ROCm/ROCR-Runtime commit: a180c9ee78]
The gfx940 does not support IMAGE instructions. Any get_info with
IMAGE attributes should return failure.
Signed-off-by: Mike Li <Tianxinmike.Li@amd.com>
Change-Id: I12005628f92780f551ab6f8b41526c66b54c6a59
[ROCm/ROCR-Runtime commit: 46b667e530]
The function IDs used to be 0 on previous asics but on gfx94x and newer
asics, these bits are set. These bits are used by user applications to
uniquely identify the locations of GPU nodes. These extra bits break
hwloc and are not needed for rocrtst.
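
One way to drop the function bits is sketched below. It assumes the
location ID follows the conventional PCI layout of bus[15:8],
device[7:3], function[2:0]; the helper name is hypothetical and is not
taken from this change.

```cpp
#include <cstdint>

// Assumption: the function number occupies the low three bits, as in
// the conventional PCI bus/device/function encoding.
constexpr uint32_t kPciFunctionMask = 0x7;

// Hypothetical helper: clears the function number from a BDF value so
// that the result matches what older ASICs reported (function 0).
constexpr uint32_t StripPciFunction(uint32_t bdf) {
  return bdf & ~kPciFunctionMask;
}
```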
Signed-off-by: Mike Li <Tianxinmike.Li@amd.com>
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Change-Id: I1202f504645b0662d009b9c0926eebb7ddc08d73
[ROCm/ROCR-Runtime commit: d7fa654338]
gfx940 uses ttmp11 to hold the queue packet index so the first level
trap handler uses ttmp13 instead to save ib_sts.
Repurpose ttmp11[31] to mean that the ttmps are initialized. The issue
was that the debugger could not tell whether ttmp6 was written by the
trap handler when determining the stop reason.
If ttmp11[31]=0, then the trap handler has not been executed and ttmp6
should be assumed to be 0. If ttmp11[31]=1, then ttmp6 holds the
trap_id, if an s_trap instruction caused the exception.
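
The rule above can be sketched as follows from the debugger's side. The
helper name is hypothetical, and the position of trap_id inside ttmp6 is
deliberately left out because this message does not state it.

```cpp
#include <cstdint>

// ttmp11[31]: set once the trap handler has initialized the ttmps.
constexpr uint32_t kTtmpsInitializedBit = 1u << 31;

// Hypothetical helper: returns the value of ttmp6 that should be used
// when determining the stop reason. If the trap handler has not run,
// ttmp6 is treated as 0 regardless of its raw contents.
constexpr uint32_t EffectiveTtmp6(uint32_t ttmp6, uint32_t ttmp11) {
  return (ttmp11 & kTtmpsInitializedBit) != 0 ? ttmp6 : 0u;
}
```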
Signed-off-by: Laurent Morichetti <laurent.morichetti@amd.com>
Signed-off-by: Lancelot Six <lancelot.six@amd.com>
Change-Id: I9af903abae044b9ec530306229caf3b883f3ee46
[ROCm/ROCR-Runtime commit: f31b312611]
tempnam has been marked as obsolete.
Signed-off-by: Alex Sierra <alex.sierra@amd.com>
Change-Id: Ie64d9a351bf386da00a96ceff059f685e11f2cca
[ROCm/ROCR-Runtime commit: e82025bffa]
On Linux, the os_thread abstraction is built on top of pthread. Many of
the pthread calls might fail and return error codes. The error
conditions are only checked via assertions (if ever checked) which means
that when doing a release build, no error condition is checked. The
same goes for dlsym/dlinfo and clock_gettime.
This commit improves the situation by checking the error conditions
and acting accordingly. When the error condition is detected in a
function with a means to indicate an error to its caller, then this
patch prints some error message and returns. If there is no way to
propagate the error up the call stack, print some error message and
abort the process.
For the os_info::os_info ctor, the only user is CreateThread, which
checks that the built thread is Valid(). If not, nullptr is returned to
the caller.
It could be possible to use exceptions when functions cannot pass
errors, but for now I only use abort as it is what assert would do with
a debug build.
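
The two strategies described above can be sketched with clock_gettime.
The function names are hypothetical and the code is an illustration
under those assumptions, not the code from this patch.

```cpp
#include <cerrno>
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <time.h>

// The caller can be told about the failure: print a message and return.
bool ReadMonotonicNs(uint64_t* out_ns) {
  timespec ts;
  if (clock_gettime(CLOCK_MONOTONIC, &ts) != 0) {
    std::fprintf(stderr, "clock_gettime failed: %s\n", std::strerror(errno));
    return false;
  }
  *out_ns = static_cast<uint64_t>(ts.tv_sec) * 1000000000ull +
            static_cast<uint64_t>(ts.tv_nsec);
  return true;
}

// The signature cannot carry an error: the message has already been
// printed by ReadMonotonicNs, so abort the process.
uint64_t ReadMonotonicNsOrAbort() {
  uint64_t ns = 0;
  if (!ReadMonotonicNs(&ns)) std::abort();
  return ns;
}
```

Unlike an assert(), both checks remain active in a release build.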
Change-Id: I815703c3b95777cc29bb89a7d654ac879c14a759
[ROCm/ROCR-Runtime commit: 183f5d90aa]
When building with g++-11.3.0, I have the following warning:
/home/.../core/runtime/runtime.cpp: In member function ‘hsa_status_t rocr::core::Runtime::GetSystemInfo(hsa_system_info_t, void*)’:
/home/.../core/runtime/runtime.cpp:693:56: warning: suggest parentheses around ‘&&’ within ‘||’ [-Wparentheses]
693 | kfd_version.KernelInterfaceMajorVersion == 1 &&
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
694 | kfd_version.KernelInterfaceMinorVersion >= 12)
| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This patch adds the parentheses as suggested, which silences the
compiler warning.
No functional change expected.
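
A small sketch of why this is behavior-preserving: && already binds
tighter than ||, so the added parentheses only make the existing
grouping explicit. The function name and the left-hand side of the ||
are assumptions; only the "== 1 && ... >= 12" part appears in the
warning above.

```cpp
#include <cstdint>

// Hypothetical predicate with the grouping written out explicitly.
// Without the inner parentheses the expression parses the same way,
// but g++ emits -Wparentheses.
bool InterfaceAtLeast1_12(uint32_t major, uint32_t minor) {
  return major > 1 || (major == 1 && minor >= 12);
}
```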
Change-Id: I69c1a73a432b0f2393dbaf36d4424cf0056c535f
[ROCm/ROCR-Runtime commit: 72219b8237]
We used to report HSA_STATUS_ERROR_INVALID_ISA when receiving error code
128, but there are several other reasons why we could be exceeding the
number of VGPRs, so the error code is updated.
Change-Id: I6a6980d5b07b09c93d00dee5207a0d52399bc77e
[ROCm/ROCR-Runtime commit: f43a284b8e]
On some platforms, e.g. Arch Linux, the -D_GLIBCXX_ASSERTIONS compile
flag is enabled by default, causing a runtime assertion.
Avoid the assertion by using the std::vector accessor function data().
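
A minimal sketch of the difference, assuming the assertion came from
operator[] being applied to an empty vector (this message does not say
which index was involved). The helper name is hypothetical.

```cpp
#include <vector>

// Hypothetical helper: returns the start of the vector's storage.
// &v[0] on an empty vector trips the bounds check in operator[] when
// _GLIBCXX_ASSERTIONS is defined; data() has no such precondition and
// is valid to call on an empty vector.
const char* BufferStart(const std::vector<char>& v) {
  return v.data();
}
```

For a non-empty vector, data() and &v[0] yield the same pointer.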
Change-Id: I118cdf102c3e353f32c618823e363ee1059f3453
[ROCm/ROCR-Runtime commit: 511855d344]
Fix overwriting of the pointer info size provided by the caller of
hsa_amd_pointer_info.
Change-Id: I2e5d73ab9ba1a32bc9b4d112bc29b4a99fd8b3b5
[ROCm/ROCR-Runtime commit: c5bf7eb112]
Some applications will keep trying to allocate device memory until the
allocation fails. This causes all device memory to be used up and we are
then unable to allocate scratch memory for dispatches. Reserve enough
memory for 1 small scratch allocation.
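
The idea can be sketched with a toy pool that keeps a reserve which
only scratch allocations may use. The class, its members and the sizes
are all hypothetical; the real accounting in ROCr is not shown in this
message.

```cpp
#include <cstddef>

// Hypothetical device memory pool with a scratch reserve.
class DevicePool {
 public:
  DevicePool(size_t total, size_t scratch_reserve)
      : total_(total), reserve_(scratch_reserve), used_(0) {}

  // User allocations must leave the reserve untouched.
  bool AllocateUser(size_t bytes) {
    const size_t free_bytes = total_ - used_;
    if (free_bytes < reserve_ || bytes > free_bytes - reserve_) return false;
    used_ += bytes;
    return true;
  }

  // Scratch allocations may consume the reserve.
  bool AllocateScratch(size_t bytes) {
    if (bytes > total_ - used_) return false;
    used_ += bytes;
    return true;
  }

 private:
  size_t total_;
  size_t reserve_;
  size_t used_;
};
```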
Change-Id: I968400d41540ba1aca8f28581f229693eec02225
[ROCm/ROCR-Runtime commit: 8ebf5f9c48]
Wait on completion signal for amd_aql_pm4_ib processing
on ASICs with gfx version >= 9.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Change-Id: Ia704d9cc5b2535dcf8564a30f694262b113f77a2
[ROCm/ROCR-Runtime commit: aec7200cb2]
An engine offset equal to the maximum number of engines is still valid
because offset enum 0 is occupied by blit copies, so raise the limit by 1.
Change-Id: I6fcab106290e6647702efe297a4281861da4e0b8
[ROCm/ROCR-Runtime commit: fc8f3f9fd5]
Package ASAN libraries and license file
Suffix "asan" added to package name
Change-Id: I2af416d86a9068a41e3880836a21c9005e45271b
[ROCm/ROCR-Runtime commit: dd9b7b3b3a]
Using the backward compatibility paths now produces an #error message.
A compile-time option is added to enable or disable the #error message.
When it is disabled, a #warning message is produced instead.
Change-Id: Ib48e361b72176e2845c8f74f980f0234e7eb4a7d
[ROCm/ROCR-Runtime commit: 629ddde072]
Adds hsa_amd_portable_export_dmabuf and hsa_amd_portable_close_dmabuf
which allow obtaining dmabuf handles to rocr allocations. These handles
may be shared with other APIs to support cross vendor & cross device
memory sharing.
Adds a query that returns whether dmabuf export is supported.
Signed-off-by: Jonathan Kim <Jonathan.Kim@amd.com>
Signed-off-by: David Yat Sin <David.YatSin@amd.com>
Change-Id: I7f98501087d9563d07fc2cb428cc886b1e518b1e
[ROCm/ROCR-Runtime commit: 42243c1e8f]
A couple of places did not account for SDMA blit engine indices being
offset by the DevToDev entry at position 0.
Change-Id: Ie811d8281bc812738ed0107694f3dffde5e93685
[ROCm/ROCR-Runtime commit: 7364a93b98]
Use mwaitx instructions when busy waiting for signals to reduce CPU
energy usage.
This can be disabled by setting HSA_ENABLE_MWAITX=0
Change-Id: Ic207895a491b2bf6dacba47ef0921df3faad5b5a
[ROCm/ROCR-Runtime commit: cc48dfdbff]
Copying memory from device to host with a CPU agent
performs poorly because the CPU reads uncached device
memory.
Fix it by using a GPU agent.
Signed-off-by: Lang Yu <Lang.Yu@amd.com>
Change-Id: Ia3b562758fe73ef9efaa284f47e67bf569cc7b7b
[ROCm/ROCR-Runtime commit: 8501c0bcb1]
ROCr uses the same allocation_map_ list to track both its own internal
memory allocations and allocations made by users of the ROCr library.
In some edge cases, the library user would call hsa_amd_pointer_info on
an invalid pointer, but ROCr would report the pointer as valid because
it belongs to a memory range that was allocated internally within ROCr.
Add a flag to differentiate between internal and external allocations.
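
A minimal sketch of the idea, using a toy map keyed by base address.
The type and member names are hypothetical and do not mirror the real
allocation_map_ entries.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical record: size of the range plus the new "internal" flag.
struct Allocation {
  size_t size;
  bool internal;
};

class AllocationMap {
 public:
  void Add(uintptr_t base, size_t size, bool internal) {
    map_[base] = {size, internal};
  }

  // Returns true only when addr lies inside a user-visible allocation.
  bool IsUserPointer(uintptr_t addr) const {
    auto it = map_.upper_bound(addr);   // first range starting after addr
    if (it == map_.begin()) return false;
    --it;                               // candidate range containing addr
    if (addr - it->first >= it->second.size) return false;
    return !it->second.internal;        // hide runtime-internal ranges
  }

 private:
  std::map<uintptr_t, Allocation> map_;
};
```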
Change-Id: I98c52bd85f3985d1ba1b0e3101d2254b003412cf
[ROCm/ROCR-Runtime commit: 59685f4492]
Track and report the size, in bytes, of pending unexecuted blit
commands. To be used in copy ganging.
Change-Id: Ia7453ff88571e927df771c6c819b73c17e67708e
[ROCm/ROCR-Runtime commit: 27596aef0c]
Fixes a hang caused by a change in the initialization order of
libraries that have cyclical dependencies and call hsa_init() during
their initialization phase.
This implementation looks for a symbol called "HSA_AMD_TOOL_PRIORITY"
across all loaded shared libraries using dynamic section entries of the
loaded lib instead of using dlopen and dlsym for the same purpose.
Change-Id: I4865f2fd18dd186ec311a432ec38fbb5583805d2
[ROCm/ROCR-Runtime commit: 8aac885318]
Report whether IOMMU V2 is supported.
IOMMU V1 support is not relevant to the user, so it is not reported.
Change-Id: I77389484a87a352da9c2f7b2a5d9de264f90ee53
[ROCm/ROCR-Runtime commit: e30be76f37]
Currently, Wavefront::GetInfo(HSA_WAVEFRONT_INFO_SIZE.. always returns
64. Instead, return the proper wavefront size based on the ISA.
For now, we return only one wavefront size for each ISA, because the
upper layers have no mechanism to select the correct wavefront size
when several are supported. We temporarily return 32 for all gfx1xxx
cards, even though they also support 64, because kernels for gfx1xxx
are compiled for wavefront-32 by default.
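
The selection described above can be sketched as a function of the
major gfx version. The helper name is hypothetical, and returning 64
for every pre-gfx10 ISA is an assumption based on the previous
behavior.

```cpp
#include <cstdint>

// Hypothetical helper: the single wavefront size reported for an ISA.
// gfx10 and newer (gfx1xxx) report 32; older ISAs keep reporting 64.
constexpr uint32_t DefaultWavefrontSize(uint32_t gfx_major) {
  return gfx_major >= 10 ? 32u : 64u;
}
```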
Change-Id: Ic6c2917b7e6d3704daf742d243f5ec7f49430de9
[ROCm/ROCR-Runtime commit: f7e3782b42]