The hack dosn't really track the commands status. It may be not
necessary for HIP, but will cause early resource release.
Change-Id: I791ad36dd8abd3b6b3d2c9b16a210a555c08ca64
This is helpfull to do when debugging issues on lowend asics. Navi14 can be emulated as Navi10. So can Navi22 be emulated as Navi21.
Change-Id: I693ffd45a5b03657822afdc872781901bc69b65c
The change reuses HSA signals for dispatches as a wait signal.
Skipping the barrier requires to disable L2 cache for sysmem
allocations and extra tracking for HDP access with the large bar.
ROC_BARRIER_SYNC=0 activates the new logic. Barrier sync is
still used by default.
ROC_ACTIVE_WAIT=1 enables unconditional active wait in ROCr.
The change also consolidated ROCr wait logic under single function.
Change-Id: I6bd1be30aa88258da1b1f9de319ef5a45852afd8
This reverts commit 5f055d227d.
This change broke the legacy-complib build in p4. It seems that we can't use any flags in debug.cpp.
Change-Id: I17bb83651b85d6f415d9074634b479658fd4c3f9
- Expose ROCclr interfaces for HIP usage
- ROCr interfaces aren't available in staging, thus control the
build with AMD_HMM_SUPPORT define
Change-Id: Iadc2bcc230e78d3b0dc22b235189c8cc80843446
This reverts commit a3b730b595.
Reason for revert: HIP_ENABLE_LAZY_KERNEL_LOADING is needed before the runtime is initialized, so this utility cannot be used
Change-Id: I49f8ddb98c9a85b9a77b8fd4b236d06b6b2b0f32
This workaround is to avoid performance penalty of SDMA engine
taking a while to clock up from a lower DPM state. Add env var
GPU_FORCE_BLIT_COPY_SIZE (1024 by default for HIP in KB). Forcing
Src and Dst agent to be amdgpu makes ROCr take blit copy path for
what otherwise should have been SDMA copy
Change-Id: I222f687155f86000d17d66d25182e490b6710463
[hipclang-vdi-rocm][perf]~45% to 50% of Performance drop on
rocBLAS_int8 test
- Enable AMD_OPT_FLUSH optimization by default to match HCC
- Disable CPU writes to GPU memory on boards with large bar,
because it requires HDP flush tracking.
- Enable L2 cache on kernel arguments, because L2 will be
invalidated on memory reuse .
Change-Id: I124cf250bdd4d19c523ce542c163813828f8fbdc