foreman 520cfc439d P4 to Git Change 1214781 by smekhano@stas-rampitec-hsa on 2015/11/25 14:45:09
SWDEV-82596 - HSA HLC: Create AMDInline pass
	The generic LLVM inlining heuristics do not work well for GPUs.
	In particular, several tests share a common problem:
	if a pointer to a private array is passed into a function, the array is not optimized out, leaving scratch usage.
	The pass increases the inline threshold to allow inlining in this case.
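
	A minimal, LLVM-free sketch of the idea described above. The struct, the function names, and all numeric values are illustrative assumptions and are not taken from AMDInline.cpp:

```cpp
// Hypothetical model of a call site; this is NOT an LLVM type.
struct CallSiteModel {
    bool hasInlineHint;          // callee carries an inline hint
    bool passesPrivateArrayPtr;  // an argument points into a private (scratch) array
};

// Placeholder values chosen for illustration only.
const int kDefaultThreshold    = 225;
const int kInlineHintThreshold = 325;
const int kPrivateArrayBonus   = 2000;

// Pick the base threshold, then raise it when a private array pointer
// is passed, so the callee is more likely to be inlined.
inline int inlineThreshold(const CallSiteModel &cs) {
    int threshold = cs.hasInlineHint ? kInlineHintThreshold : kDefaultThreshold;
    if (cs.passesPrivateArrayPtr)
        threshold += kPrivateArrayBonus;
    return threshold;
}

// A call is inlined when its estimated cost is below the threshold.
inline bool shouldInline(const CallSiteModel &cs, int estimatedCost) {
    return estimatedCost < inlineThreshold(cs);
}
```

	With these placeholder numbers, a callee with an estimated cost of 1000 is rejected by default but accepted once a private array pointer is passed to it.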

	This also lets us move at least some AMD inlining customizations from the common code into this file.
	Inline hint threshold is moved in this change.

	Performance impact on ocltst sha256, 32 bit, Fiji:

	                    AMDIL   HSAIL   Diff      HSAIL+Inliner  Diff      Diff
	                            before  to AMDIL                 to HSAIL  to AMDIL
	OCLPerfSHA256[  0]  43.843  40.894  0.93      69.910         1.71      1.59
	OCLPerfSHA256[  1]  53.611  51.083  0.95      80.919         1.58      1.51
	OCLPerfSHA256[  2]  52.127  51.528  0.99      80.640         1.56      1.55
	OCLPerfSHA256[  3]  60.952  57.027  0.94      68.615         1.20      1.13
	OCLPerfSHA256[  4]  76.173  70.150  0.92      80.582         1.15      1.06
	OCLPerfSHA256[  5]  75.886  70.264  0.93      81.000         1.15      1.07
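
	The Diff columns appear to be plain ratios of the measured values (new result divided by the baseline). A small sketch under that assumption, with a hypothetical helper name:

```cpp
// Ratio of a new measurement to its baseline; a value above 1.0 means
// the new measurement is larger. The name `diffRatio` is hypothetical.
inline double diffRatio(double newValue, double baseline) {
    return newValue / baseline;
}

// Row OCLPerfSHA256[0] from the table above:
//   diffRatio(40.894, 43.843)  -> HSAIL before vs. AMDIL
//   diffRatio(69.910, 40.894)  -> HSAIL+Inliner vs. HSAIL before
//   diffRatio(69.910, 43.843)  -> HSAIL+Inliner vs. AMDIL
```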

	Testing: smoke, precheckin, ocltst sha256
	Reviewed by Danill Fukalov

Affected files ...

... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/opt_level.cpp#28 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/InitializePasses.h#93 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/LinkAllPasses.h#49 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/Transforms/IPO.h#32 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/IPO/AMDInline.cpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/IPO/CMakeLists.txt#24 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/IPO/IPO.cpp#32 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/IPO/Inliner.cpp#42 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/tools/opt/amdopt.inc#28 edit


[ROCm/clr commit: 5e3d4f5a01]
2015-11-25 15:23:51 -05:00