foreman 1dcac07a7e P4 to Git Change 1594730 by asalmanp@asalmanp-ocl-stg on 2018/08/16 17:32:14
SWDEV-160930 - SPECworkstation 3 benchmark GPU Compute tests fail
	Root cause: The Caffe compute benchmark fails within the SPECworkstation app because one of Caffe's OCL kernels is launched with a local_work_size of 1024, which causes clEnqueueNDRangeKernel to return CL_INVALID_WORK_GROUP_SIZE (the maximum allowed value is 256).
	Proposed workaround: To allow a kernel to run with a local_work_size of 1024, we check the number of VGPRs the kernel uses; if the kernel is not using all the available VGPRs, we let it use 1024 as the local_work_size.
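
	Illustrative sketch of the check described above (not the actual palkernel.cpp change). The function names, the constants, and the way the VGPR counts are passed in are all hypothetical; only the values 256 and 1024 and the "not using all the available VGPRs" condition come from this description.

```cpp
#include <cstddef>

// Assumed limits for illustration only; a real runtime would query the device.
constexpr std::size_t kDefaultMaxWorkGroupSize = 256;
constexpr std::size_t kExtendedMaxWorkGroupSize = 1024;

// Hypothetical helper: returns the largest local_work_size accepted for a kernel.
// The extended limit is granted only when the kernel leaves some VGPRs unused.
std::size_t maxWorkGroupSizeFor(std::size_t usedVgprs, std::size_t availableVgprs) {
  return (usedVgprs < availableVgprs) ? kExtendedMaxWorkGroupSize
                                      : kDefaultMaxWorkGroupSize;
}

// Hypothetical helper: false corresponds to the case where the enqueue
// would be rejected with CL_INVALID_WORK_GROUP_SIZE.
bool isLocalWorkSizeAllowed(std::size_t localWorkSize,
                            std::size_t usedVgprs,
                            std::size_t availableVgprs) {
  return localWorkSize <= maxWorkGroupSizeFor(usedVgprs, availableVgprs);
}
```

	With these assumptions, a kernel using fewer VGPRs than are available is accepted at 1024, while a kernel using all of them stays capped at 256.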

	ReviewURLBoard = http://ocltc.amd.com/reviews/r/15638/

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.cpp#58 edit
2018-08-16 17:49:03 -04:00