We unmap memory with a different pointer.
The ROCr runtime might be confused and silently ignore the unmap request.
Change-Id: Ic5a1387a426cf02a985a4ef8ff8ff05e6a870cbf
When HIP_ENABLE_DEFERRED_LOADING=0, many global variables are
referenced before they have been initialized. This patch uses
constexpr to initialize global constant variables at compile
time.
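
A minimal sketch of the idea, not the patch itself. All names here
(Limits, kLimits, kStagingSize, EarlyUser) are hypothetical. A constexpr
global is constant-initialized, which happens before any dynamic
initialization, so code that runs early during start-up cannot see it
uninitialized.

```cpp
#include <cstddef>

// Hypothetical names for illustration only; none are taken from the patch.
struct Limits {
  std::size_t maxQueues;
  std::size_t pageSize;
};

// constexpr forces constant initialization: the values are fixed at compile
// time, so no dynamic initializer has to run for them at start-up.
constexpr Limits kLimits{4, 4096};
constexpr std::size_t kStagingSize = 4 * kLimits.pageSize;

// Stand-in for code that runs early, e.g. from another global's constructor.
struct EarlyUser {
  std::size_t seen;
  EarlyUser() : seen(kStagingSize) {}
};

// Compile-time check that the constant really is a constant expression.
static_assert(kStagingSize == 16384, "kStagingSize is evaluated at compile time");
```

A plain `const` global with a non-constant initializer would instead be
initialized dynamically, in an order that is unspecified across
translation units.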
Change-Id: I9d538b7abc6a0ce700ec3332b97fc144db5fc1ef
- Expose ROCclr interfaces for HIP usage
- ROCr interfaces aren't available in staging, so control the
build with the AMD_HMM_SUPPORT define
Change-Id: Iadc2bcc230e78d3b0dc22b235189c8cc80843446
1. Enable pitch workaround
2. When we use copy image, we don't need to create the custom pitch image
3. wrtBackImageBuffer_ stores a device memory object, not an amd image object.
Tests:
Conformance kernel read/write tests pass with this code change.
Change-Id: I7dca3127adde6ac83e78dd270a2256ebed55c60d
When we're aligning rowPitch to imagePitchAlignment_, rowPitch is in pixels,
but imagePitchAlignment_ is in bytes, so we end up overaligning the pitch.
Convert imagePitchAlignment_ to pixels before doing any logic.
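
A rough sketch of the unit conversion, not the actual change. The helper
names (alignUp, alignRowPitchPixels) and the elementSizeBytes parameter
are hypothetical, and the sketch assumes the byte alignment is a nonzero
multiple of the element size.

```cpp
#include <cstddef>

// Round value up to the next multiple of alignment (alignment must be > 0).
constexpr std::size_t alignUp(std::size_t value, std::size_t alignment) {
  return ((value + alignment - 1) / alignment) * alignment;
}

// Hypothetical helper: rowPitchPixels is in pixels, pitchAlignmentBytes is in
// bytes. Assumes pitchAlignmentBytes is a nonzero multiple of elementSizeBytes.
// The alignment is converted to pixels first so both operands share a unit.
constexpr std::size_t alignRowPitchPixels(std::size_t rowPitchPixels,
                                          std::size_t pitchAlignmentBytes,
                                          std::size_t elementSizeBytes) {
  return alignUp(rowPitchPixels, pitchAlignmentBytes / elementSizeBytes);
}

// Mixing units: 100 pixels aligned to 256 (bytes) gives 256 pixels.
static_assert(alignUp(100, 256) == 256, "overaligned when units are mixed");
// Converted: 256 bytes / 4 bytes per pixel = 64 pixels, so 100 becomes 128.
static_assert(alignRowPitchPixels(100, 256, 4) == 128, "aligned in pixels");
```

With 4-byte pixels and a 256-byte alignment, the mixed-unit version yields
a 1024-byte row where 512 bytes would already satisfy the alignment.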
Change-Id: Ia5ab9d54bed150fe974e86b060dbadc196165b29
[hipclang-vdi-rocm][perf] ~45% to 50% performance drop on
rocBLAS_int8 test
- Enable AMD_OPT_FLUSH optimization by default to match HCC
- Disable CPU writes to GPU memory on boards with large BAR,
because it requires HDP flush tracking.
- Enable L2 cache on kernel arguments, because L2 will be
invalidated on memory reuse.
Change-Id: I124cf250bdd4d19c523ce542c163813828f8fbdc