Test window quits or shows an error message when running OpenCL APIs.
Add a workaround for the race condition with the first page
during pinning.
Change-Id: I9a27b4e173cf94c84aefcb94e255f11169453d94
The printf call in the device code is expanded by the compiler into a
series of hostcalls that together form a "message". This change
introduces the following functionality in the runtime:
1. Receive a generic message consisting of a series of hostcalls.
2. Process a printf message.
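Step 1 above can be sketched as follows. This is only an illustration of accumulating a message from several calls, not the runtime's actual protocol: the Packet layout, the end flag, and the MessageAssembler class are all hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Hypothetical packet: one hostcall's contribution to a message.
struct Packet {
  std::uint64_t messageId;             // identifies which message this belongs to
  bool end;                            // true on the last packet of the message
  std::vector<std::uint64_t> payload;  // data words carried by this packet
};

// Hypothetical assembler: collects packets until a message is complete.
class MessageAssembler {
 public:
  // Appends the packet's payload to its pending message. Returns true and
  // moves the full message into *out once the final packet has arrived.
  bool receive(const Packet& p, std::vector<std::uint64_t>* out) {
    assert(out != nullptr);
    std::vector<std::uint64_t>& buf = pending_[p.messageId];
    buf.insert(buf.end(), p.payload.begin(), p.payload.end());
    if (!p.end) return false;
    *out = std::move(buf);
    pending_.erase(p.messageId);
    return true;
  }

  // Number of messages that have started but not yet finished.
  std::size_t pendingCount() const { return pending_.size(); }

 private:
  std::map<std::uint64_t, std::vector<std::uint64_t>> pending_;
};
```

A completed message would then be handed to a type-specific handler such as the printf processing in step 2, which this sketch does not cover.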
Change-Id: I9d667d6f91607a907a96e46cc5fca55734339747
~45% to 50% performance drop on rocBLAS_int8 test
Add support for active waits without blocking the host thread.
Change-Id: Ie7bb48dcafcb4c93d448bf74749b829b626c3578
cl_bool values need to be represented as uint32_t instead of bool, because cl_bool is a typedef of cl_uint (a 32-bit unsigned integer).
Currently clGetDeviceInfo() reports an incorrect size for the return value, because cl_bool is 4 bytes and a C++ bool is 1 byte.
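The size mismatch can be illustrated with a minimal sketch. The typedefs below are stand-ins so the example compiles without the OpenCL headers, and writeInfo() is a hypothetical helper, not the runtime's actual clGetDeviceInfo() implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Stand-ins for the OpenCL typedefs (assumption: mirrors CL/cl.h, where
// cl_bool is a typedef of cl_uint, a 32-bit unsigned integer).
using cl_uint = std::uint32_t;
using cl_bool = cl_uint;

// Hypothetical helper: copies a value into the caller's buffer and returns
// the number of bytes reported for it.
template <typename T>
std::size_t writeInfo(const T& value, void* dst, std::size_t dstSize) {
  assert(dstSize >= sizeof(T));
  std::memcpy(dst, &value, sizeof(T));
  return sizeof(T);
}

// Storing the property as a C++ bool reports sizeof(bool) bytes.
std::size_t reportedSizeWithBool() {
  bool available = true;
  unsigned char buf[8] = {};
  return writeInfo(available, buf, sizeof(buf));
}

// Storing the property as uint32_t reports 4 bytes, matching cl_bool.
std::size_t reportedSizeWithUint32() {
  std::uint32_t available = 1;
  unsigned char buf[8] = {};
  return writeInfo(available, buf, sizeof(buf));
}
```

With the uint32_t representation the reported size equals sizeof(cl_bool), which is what a caller querying a cl_bool property expects.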
Change-Id: I647a4b8873627059865c84c8ca27694dbc0916de
Add MS HWS support. PAL reports just one compute engine
in that mode, and the runtime needs extra logic to detect RT queues.
Change-Id: I011f1f1b18dec6a7195a4f1fe939f8029bc269ae
~45% to 50% performance drop on rocBLAS_int8 test
Use the last command in the queue for a wait.
Add extra print information about processed commands.
Add an option to disable file location printing.
Change-Id: I4187883e1a90e571fde3128af98368108fda8785
When aligning rowPitch to imagePitchAlignment_, rowPitch is in pixels
but imagePitchAlignment_ is in bytes, so the pitch ends up overaligned.
Convert imagePitchAlignment_ to pixels before applying the alignment.
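A minimal sketch of the unit conversion, assuming the element size in bytes evenly divides the byte alignment. The function names and the elementSize parameter are hypothetical and only illustrate the arithmetic.

```cpp
#include <cassert>
#include <cstddef>

// Rounds value up to the next multiple of alignment.
std::size_t alignUp(std::size_t value, std::size_t alignment) {
  assert(alignment > 0);
  return ((value + alignment - 1) / alignment) * alignment;
}

// rowPitchPixels is in pixels, pitchAlignmentBytes is in bytes, and
// elementSize is the size of one pixel in bytes (hypothetical parameter).
std::size_t alignRowPitchPixels(std::size_t rowPitchPixels,
                                std::size_t pitchAlignmentBytes,
                                std::size_t elementSize) {
  assert(elementSize > 0);
  // Convert the byte alignment to pixels first, keeping it at least 1.
  std::size_t alignmentPixels = pitchAlignmentBytes / elementSize;
  if (alignmentPixels == 0) alignmentPixels = 1;
  return alignUp(rowPitchPixels, alignmentPixels);
}
```

For a 100-pixel row with 4-byte pixels and a 256-byte alignment, the converted alignment is 64 pixels and the pitch becomes 128 pixels. Applying the byte value directly to the pixel count would give 256 pixels, twice what is needed.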
Change-Id: Ia5ab9d54bed150fe974e86b060dbadc196165b29
hip_threadfence_system passes locally with this change. This also fixes
hipHostMalloc() failures when hipHostMallocMapped flag is used.
Change-Id: Id412efe502accc7c6e7676b52c05ccb9d8fbbe67
Remove a workaround for CS_PARTIAL_FLUSH added in CL#1495187,
since PAL no longer uses CS_PARTIAL_FLUSH.
Change-Id: I03edc7595459e19aad33b2b0901f0ebe4754d310
[hipclang-vdi-rocm][perf] ~45% to 50% performance drop on
rocBLAS_int8 test
- Enable AMD_OPT_FLUSH optimization by default to match HCC
- Disable CPU writes to GPU memory on boards with large BAR,
because it requires HDP flush tracking.
- Enable L2 cache on kernel arguments, because L2 will be
invalidated on memory reuse.
Change-Id: I124cf250bdd4d19c523ce542c163813828f8fbdc
Update a use of the deprecated amd_comgr_action_info_set_options to
instead use amd_comgr_action_info_set_option_list.
Completely remove all references to amd_comgr_action_info_set_options
and amd_comgr_action_info_get_options from the runtime.
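The list-based call takes an array of C strings and a count rather than a single string. A minimal sketch of preparing such an array, assuming the options start out as one whitespace-separated string (the source does not say how the runtime stores them). Both helper names are hypothetical, and the comgr call appears only in a comment so the example compiles without the comgr headers.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits a whitespace-separated option string into individual options.
// Quoted options containing spaces are not handled in this sketch.
std::vector<std::string> splitOptions(const std::string& options) {
  std::istringstream in(options);
  std::vector<std::string> out;
  for (std::string tok; in >> tok;) out.push_back(tok);
  return out;
}

// Builds the array-of-C-strings view of the options. The pointers remain
// valid only while the options vector is alive and unmodified.
std::vector<const char*> toOptionPointers(
    const std::vector<std::string>& options) {
  std::vector<const char*> ptrs;
  ptrs.reserve(options.size());
  for (const std::string& o : options) ptrs.push_back(o.c_str());
  return ptrs;
}

// A caller would then pass the array and its length, for example:
//   amd_comgr_action_info_set_option_list(actionInfo, ptrs.data(), ptrs.size());
```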
Change-Id: I12a0803c87430722364ec22818e249caf3798c88