1dcac07a7eda127a3a24ef516370d91abd38f9fa
SWDEV-160930 - SPECworkstation 3 benchmark GPU Compute tests fail Root cause: Caffe compute benchmark fails within SPECWorkstation app because one of the Caffe's OCL kernel tries to launch a kernel with the local_work_size of 1024 causing the clEnqueueNDRangeKernel API to return CL_INVALID_WORK_GROUP_SIZE (i.e., the maximum allowable number is 256) Proposed workaround: In order to run a kernel with a local_work_size of 1024, we check the number of used VGPRs in the Kernel and if the Kernel is not using all the available VGPRs we let the Kernel to use 1024 as the local_work_size. ReviewURLBoard = http://ocltc.amd.com/reviews/r/15638/ Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.cpp#58 edit
Description
No description provided
Languages
C++
67.5%
C
20.6%
Python
6.6%
CMake
3.4%
Shell
0.6%
Other
1.1%