~45% to 50% performance drop on rocBLAS_int8 test
Use the last command in the queue for a wait.
Add extra print information about processed commands.
Add an option to disable file location printing.
Change-Id: I4187883e1a90e571fde3128af98368108fda8785
When we're aligning rowPitch to imagePitchAlignment, rowPitch is in pixels,
but imagePitchAlignment_ is in bytes, so we end up over-aligning the pitch.
Convert imagePitchAlignment_ to pixels before doing any logic.
Change-Id: Ia5ab9d54bed150fe974e86b060dbadc196165b29
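The unit mismatch in the pitch-alignment fix above can be illustrated with a minimal sketch. The helper names (alignUp, alignRowPitchPixels) and the bytesPerPixel parameter are hypothetical and are not taken from the runtime source; the sketch also assumes that the pixel size divides the byte alignment evenly, or the reverse.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Round value up to the next multiple of alignment (alignment must be > 0).
constexpr std::size_t alignUp(std::size_t value, std::size_t alignment) {
  return ((value + alignment - 1) / alignment) * alignment;
}

// Hypothetical helper: rowPitchPixels is in pixels, pitchAlignmentBytes is in
// bytes. The alignment is converted to pixels before it is applied.
inline std::size_t alignRowPitchPixels(std::size_t rowPitchPixels,
                                       std::size_t pitchAlignmentBytes,
                                       std::size_t bytesPerPixel) {
  assert(bytesPerPixel > 0 && pitchAlignmentBytes > 0);
  // Assumption: one of the two sizes is a multiple of the other.
  assert(pitchAlignmentBytes % bytesPerPixel == 0 ||
         bytesPerPixel % pitchAlignmentBytes == 0);
  // Bytes -> pixels; a pixel at least as large as the alignment needs no padding.
  const std::size_t alignmentPixels =
      std::max<std::size_t>(1, pitchAlignmentBytes / bytesPerPixel);
  return alignUp(rowPitchPixels, alignmentPixels);
}
```

With a 256-byte alignment and 4-byte pixels, a 100-pixel row is padded to 128 pixels, whereas applying the byte value directly to the pixel count would pad it to 256 pixels.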
hip_threadfence_system passes locally with this change. This also fixes
hipHostMalloc() failures when hipHostMallocMapped flag is used.
Change-Id: Id412efe502accc7c6e7676b52c05ccb9d8fbbe67
Remove a workaround for CS_PARTIAL_FLUSH added in CL#1495187,
since PAL no longer uses CS_PARTIAL_FLUSH.
Change-Id: I03edc7595459e19aad33b2b0901f0ebe4754d310
[hipclang-vdi-rocm][perf] ~45% to 50% performance drop on
rocBLAS_int8 test
- Enable AMD_OPT_FLUSH optimization by default to match HCC
- Disable CPU writes to GPU memory on boards with large BAR,
because it requires HDP flush tracking.
- Enable L2 cache on kernel arguments, because L2 will be
invalidated on memory reuse.
Change-Id: I124cf250bdd4d19c523ce542c163813828f8fbdc
Update a use of the deprecated amd_comgr_action_info_set_options to
instead use amd_comgr_action_info_set_option_list.
Completely remove all references to amd_comgr_action_info_set_options
and amd_comgr_action_info_get_options from the runtime.
Change-Id: I12a0803c87430722364ec22818e249caf3798c88
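Moving from a single options string to an option list means the caller has to supply the options as separate entries. A minimal sketch of that preparation step follows; splitOptions and toCStringArray are hypothetical helpers, not runtime functions. The sketch assumes the options arrive as one whitespace-separated string and does not handle quoted options that contain spaces.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: split a whitespace-separated option string into
// individual options. Quoted options containing spaces are not handled.
inline std::vector<std::string> splitOptions(const std::string& options) {
  std::istringstream stream(options);
  std::vector<std::string> result;
  for (std::string token; stream >> token;) {
    result.push_back(token);
  }
  return result;
}

// Hypothetical helper: build an array-of-C-strings view over the options.
// The pointers stay valid only while `options` is alive and unmodified.
inline std::vector<const char*> toCStringArray(
    const std::vector<std::string>& options) {
  std::vector<const char*> pointers;
  pointers.reserve(options.size());
  for (const auto& option : options) {
    pointers.push_back(option.c_str());
  }
  return pointers;
}
```

The resulting data() pointer and size() count are the kind of arguments a list-based setter such as amd_comgr_action_info_set_option_list would take; the exact signature should be checked against the comgr header in use.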
The last commit to replace the cl_* types with standard types
failed to correct issues introduced in the PAL and GPU backend.
Change-Id: I926997234dfbe346fc165a7bc4e1b8aabab7bac5
SWDEV-2 - Change OpenCL version number from 3085 to 3086.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2833 edit
SWDEV-197836 - Drop the use of llvm header files in opencl runtime
- COv2 doesn't report the HostCall argument properly. Add a workaround for it.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#34 edit
SWDEV-2 - Change OpenCL version number from 3084 to 3085.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2832 edit
SWDEV-2 - Change OpenCL version number from 3083 to 3084.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2831 edit
SWDEV-2 - Change OpenCL version number from 3082 to 3083.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2830 edit
SWDEV-2 - Change OpenCL version number from 3081 to 3082.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2829 edit
SWDEV-2 - Change OpenCL version number from 3080 to 3081.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2828 edit
SWDEV-79445 - OCL generic changes and code clean-up
Make the conversion from amd::Coord3D to size_t* explicit.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/platform/object.hpp#21 edit
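What an explicit conversion to size_t* means for callers can be sketched as follows. The Coord3D struct here is a hypothetical stand-in, since the real amd::Coord3D definition is not shown in this log, and volume is an illustrative consumer only.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for amd::Coord3D; the real class may differ.
struct Coord3D {
  std::size_t c[3];
  Coord3D(std::size_t x, std::size_t y = 0, std::size_t z = 0) : c{x, y, z} {}
  // 'explicit' prevents silent decay to a pointer; callers must cast on purpose.
  explicit operator const std::size_t*() const { return c; }
};

// Hypothetical consumer that takes a raw pointer to three size_t values.
inline std::size_t volume(const std::size_t* dims) {
  assert(dims != nullptr);
  return dims[0] * dims[1] * dims[2];
}
```

With this definition, volume(static_cast<const std::size_t*>(Coord3D(2, 3, 4))) compiles, while volume(Coord3D(2, 3, 4)) is rejected, so an accidental pointer conversion shows up at compile time.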
SWDEV-2 - Change OpenCL version number from 3079 to 3080.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2827 edit
SWDEV-79445 - OCL generic changes and code clean-up
Allow amd::Coord3D to decay into size_t*. This allows creating an amd::BufferRect object without explicitly passing size_t[3] arguments.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/18473/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/platform/object.hpp#20 edit
SWDEV-2 - Change OpenCL version number from 3078 to 3079.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2826 edit
SWDEV-219121 - [Navi][ROCm] Add performance counter support
This is the initial implementation of Navi 10 performance counters in the OpenCL runtime.
Tests:
1. http://ocltc.amd.com:8111/viewModification.html?modId=130609&personal=true&tab=vcsModificationBuilds
2. ./ocltst -m oclruntime.so -t OCLPerfCounters
Before this change, a segmentation fault occurred inside the OpenCL runtime. After this change, the error occurs inside HSA instead: a C++ try block in hsa_ven_amd_aqlprofile_start raises an exception with the message "GFXIP is not supported(gfx1010)". HSA needs to add support for Navi.
ReviewBoards: http://ocltc.amd.com/reviews/r/18463/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/roccounters.cpp#5 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/roccounters.hpp#4 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#94 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/runtime/OCLPerfCounters.cpp#50 edit
SWDEV-2 - Change OpenCL version number from 3077 to 3078.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2825 edit
SWDEV-2 - Change OpenCL version number from 3076 to 3077.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2824 edit
SWDEV-2 - Change OpenCL version number from 3075 to 3076.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2823 edit