ECR #333753 - Partial fix for Bug 10478 "Fix -fno-bin-llvmir/-fno-bin-hsail options"
If option -fno-bin-llvmir is set, the .llvmir section is deleted from the BIF in the CG phase instead of the FE phase. Both HSA & AMDIL are affected.
[Fixed] The -fno-bin-llvmir option causes clBuildProgram to fail with error -11.
This occurred only when compiling from OpenCL.
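A minimal sketch of the ordering, assuming code generation reads the .llvmir section as its input (an assumption, not stated in this changelist). ToyBinary and codegen are hypothetical stand-ins, not Compiler Lib types or functions:

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy stand-in for a BIF: section name -> section contents.
struct ToyBinary {
    std::map<std::string, std::string> sections;
};

// Toy code generation phase. It consumes .llvmir, emits .text, and only
// then drops .llvmir if the caller asked not to keep it.
// Returns false when there is no LLVM IR to compile.
bool codegen(ToyBinary& bin, bool keepLlvmir) {
    auto it = bin.sections.find(".llvmir");
    if (it == bin.sections.end()) {
        return false;  // IR was removed too early, nothing to compile
    }
    bin.sections[".text"] = "isa-for:" + it->second;
    if (!keepLlvmir) {
        bin.sections.erase(".llvmir");
    }
    return true;
}
```

In this toy model, removing the section before codegen() makes the build fail, while removing it inside codegen() does not.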
[TODO] If possible -fno-bin-hsail should avoid putting HSAIL binary (BRIG) into BIF.
[Tests] pre check-in, make smoke, complib
[Reviewers] Brian Sumner, Nikolay Haustov
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/frontend.cpp#31 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/frontend_clang.cpp#17 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#58 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/aoc2.cpp#63 edit
... //depot/stg/opencl/drivers/opencl/tests/hsa/src/complib/options/-fbin-llvmir/HelloWorld_Kernel_cl.cl#1 add
... //depot/stg/opencl/drivers/opencl/tests/hsa/src/complib/options/-fno-bin-llvmir/HelloWorld_Kernel_cl.cl#1 add
... //depot/stg/opencl/drivers/opencl/tests/hsa/tlst/complib.tlst#3 edit
EPR #405889 - Added an option to set VGPR/SGPR/LDS usage in the ISA to a certain value greater than the actual usage, for debugging purposes. If the given value is smaller than the actual value, this option has no effect.
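The rule above amounts to taking the larger of the two values. A minimal sketch, where overrideUsage is a hypothetical helper name and not a function from scCompileSI.cpp:

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: returns the resource count (VGPR, SGPR or LDS) to
// report. A requested value below the actual usage is ignored.
std::uint32_t overrideUsage(std::uint32_t actual, std::uint32_t requested) {
    return std::max(actual, requested);
}
```

For example, overrideUsage(24, 64) yields 64, while overrideUsage(24, 8) leaves the usage at 24.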
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/scwrapper/SI/scCompileSI.cpp#52 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/scwrapper/scHWShaderInfo.h#2 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/OPTIONS.def#121 edit
EPR #409950 - [IV][OCL] Multiple OCL samples crashed on multiple machines for 32-bit OS.
There are two issues:
1. The SC dll should be dynamically loaded only when it is available. This allows apps to run on the CPU device without the SC dll. This CL fixes it. It also allows the user to set the env var AMD_OCL_SC_LIB to provide the name or complete path of the SC dll to load.
2. The test fails because amdhsasc.dll is not included in the base driver for 32-bit OS. The proper solution is to ask the package team to include amdhsasc.dll in the base driver. Also, amdhsasc.dll should be renamed to amdoclsc.dll since it is used not only for HSAIL but also for AMDIL. The benefits of keeping the SC component as a separate shared library are decreased build time, since changes in SC do not require a rebuild of amdocl.dll, and easier debugging and regression analysis, since the SC component can be swapped.
However, since the 15.10 branch is close, there is not enough time to make changes to the package. Therefore this CL implements a workaround for this issue without changing the package. We will implement the proper fix in the next release.
The workaround implemented by this CL embeds SC statically in amdocl.dll. The runtime loads the SC dll specified by env var AMD_OCL_SC_LIB only if it is available. If the SC dll is not available, it uses the embedded SC.
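A sketch of the fallback logic described above. Only the env var name AMD_OCL_SC_LIB comes from this changelist. ScSource, selectShaderCompiler and the tryLoad callback (a stand-in for the platform loader call) are hypothetical names:

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <string>

enum class ScSource { Dynamic, Embedded };

// tryLoad stands in for the platform loader (e.g. LoadLibrary or dlopen)
// and returns true when the library at the given path could be loaded.
using TryLoad = std::function<bool(const std::string&)>;

// Uses the dll named by envValue when it is set and loadable, otherwise
// falls back to the statically embedded SC.
ScSource selectShaderCompiler(const char* envValue, const TryLoad& tryLoad) {
    if (envValue != nullptr && *envValue != '\0' && tryLoad(envValue)) {
        return ScSource::Dynamic;
    }
    return ScSource::Embedded;
}

// Convenience wrapper that reads the env var named in the changelist.
ScSource selectFromEnvironment(const TryLoad& tryLoad) {
    return selectShaderCompiler(std::getenv("AMD_OCL_SC_LIB"), tryLoad);
}
```

Passing the loader as a callback keeps the decision testable without a real dll on disk.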
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/build/Makefile.api#96 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/api/v0_8/acl.cpp#22 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/api/v0_8/aclLoaders.cpp#9 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/Makefile#44 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/sclibdefs.opencl#20 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclStructs.h#13 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclTypes.h#4 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/build/Makefile.aoc2#21 edit
... //depot/stg/opencl/drivers/opencl/opencldefs#148 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#485 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#220 edit
ECR #333753 - Compiler Lib/aoc2/devloader: move devloader functionality into aoc2
[Purpose] To get rid of obsolete runtimenew dependency in compiler
1. Devloader functionality moved into aoc2;
2. Devloader is removed from the tree & make system;
3. Related changes in test_driver.pl;
4. Functions alignedMalloc & alignedFree are moved to libUtils.h;
5. Function aclHsaLoader is renamed to _aclHsaLoader to indicate that it is not a Compiler Lib API function.
[Testing] make smoke, pre check-in
[Reviewers] Nikolay Haustov, Brian Sumner
Affected files ...
... //depot/stg/opencl/drivers/opencl/Makefile#48 edit
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/amdocl.def.in#10 edit
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/amdocl.map.in#11 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/amdoclcl.def.in#8 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/amdoclcl.map.in#7 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/brig_loader.cpp#15 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/scwrapper/scClientAPI.cpp#20 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.h#13 edit
... //depot/stg/opencl/drivers/opencl/compiler/loader/devloader/Makefile#8 delete
... //depot/stg/opencl/drivers/opencl/compiler/loader/devloader/build/Makefile#3 delete
... //depot/stg/opencl/drivers/opencl/compiler/loader/devloader/build/Makefile.devloader#11 delete
... //depot/stg/opencl/drivers/opencl/compiler/loader/devloader/devloader.cpp#6 delete
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/aoc2.cpp#61 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/build/Makefile.aoc2#20 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#185 edit
... //depot/stg/opencl/drivers/opencl/tests/hsa/bin/test_driver.pl#5 edit
ECR #333753 - hsa_foundation RT/Compiler Lib: recompilation algorithm rework
1. Recompilation algorithm rework in order to avoid superfluous recompilations.
2. Replace aclExtractSymbol/Section with aclQueryInfo for symbol/section detection.
The replaced calls in RT previously performed actual extraction of the sections from the BIF, with memory allocation and copying. What is actually needed is only to determine whether the section exists in the BIF, in order to decide which recompilations are needed. With aclQueryInfo and the newly added enums RT_CONTAINS_LLVMIR, RT_CONTAINS_OPTIONS, RT_CONTAINS_BRIG, RT_CONTAINS_HSAIL, RT_CONTAINS_ISA, the Runtime no longer queries whole sections but a bool flag that indicates the existence of the corresponding section(s), without any memory allocations. Every compilation in RT starting from LLVMIR is affected by the change, including compilation of blit kernels.
3. Fix in Compiler Lib for correct ACL_INVALID_ARG detection (for wrong/unsupported compilations).
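The difference between extracting a section and querying for its existence can be sketched on a toy container. ToyBif, extractSection and containsSection are hypothetical names. The real aclQueryInfo signature is not reproduced here:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a BIF: section name -> raw bytes.
struct ToyBif {
    std::map<std::string, std::vector<unsigned char>> sections;
};

// Old style: copies the whole section out (allocation + copy) even when
// the caller only wants to know whether the section is there.
std::vector<unsigned char> extractSection(const ToyBif& bif,
                                          const std::string& name) {
    auto it = bif.sections.find(name);
    if (it == bif.sections.end()) {
        return {};
    }
    return it->second;
}

// New style: answers the existence question with a bool, no copy.
bool containsSection(const ToyBif& bif, const std::string& name) {
    return bif.sections.count(name) != 0;
}
```

A caller deciding on recompilation would then test containsSection(bif, ".llvmir") instead of checking whether the extracted copy is empty.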
[Side Effects] performance improvement, memory consumption reduction
[ToDo] Do not finalize program if ISA is already provided in BIF and options are unchanged.
[Testing] pre check-in, ocltst complib, ocl conformance 2.0 compiler & api
[Reviewers] German Andryeyev, Brian Sumner
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#56 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsaprogram.cpp#8 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsaprogram.hpp#3 edit
EPR #405357 - [CQE DTB][valgrind][OCL2.0]:MemLeaks are observed with MonteCarloAsian sample.
Need to delete amdrtFunctions when it is no longer used.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/linker.cpp#116 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/AMDLLVMContextHook.h#23 edit
ECR #333753 - Performance: Stop obligatory BRIG disassembling to HSAIL
[Important]: HSAIL is not being disassembled from BRIG and not being inserted into BIF anymore by default.
Testing: pre check-in, smoke_clang
Reviewers: Stanislav Mekhanoshin, Brian Sumner, Artem Tamazov
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#55 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/hsail_be.cpp#33 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/hsail_be.hpp#11 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/aoc2.cpp#60 edit
EPR #407056, #407061, #406980 - Back out changelist 1083545 since it causes a number of perf degradations. Will add a heuristic for -scras=2 for memory bound kernels only.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/OPTIONS.def#118 edit
ECR #333753 - new tests are added to ocltst -m oclcomplib -t OCLRTGetInfo
tests on aclQueryInfo for:
RT_KERNEL_NAMES, RT_CONTAINS_LLVMIR, RT_CONTAINS_OPTIONS, RT_CONTAINS_BRIG, RT_CONTAINS_HSAIL, RT_CONTAINS_ISA
+ query for RT_CONTAINS_HSAIL is fixed in Compiler Lib: looking for symbol symHSAILText instead of section aclCODEGEN, because aclCODEGEN section may contain also symOpenclMeta, symOpenclKernel, symOpenclStub besides symHSAILText.
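Why a section-level check can give a false positive is easy to see on a toy model. The symbol names below come from the entry above, but ToySections and both helper functions are hypothetical:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Toy model: section name -> names of the symbols stored in it.
using ToySections = std::map<std::string, std::set<std::string>>;

// Section-level check: true as soon as the section holds any symbol.
bool sectionPresent(const ToySections& s, const std::string& section) {
    auto it = s.find(section);
    return it != s.end() && !it->second.empty();
}

// Symbol-level check: true only if that exact symbol is in the section.
bool symbolPresent(const ToySections& s, const std::string& section,
                   const std::string& symbol) {
    auto it = s.find(section);
    return it != s.end() && it->second.count(symbol) != 0;
}
```

With only symOpenclMeta stored in the codegen section, sectionPresent reports true although no HSAIL text exists, while symbolPresent for symHSAILText correctly reports false.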
Testing: pre check-in, ocltst -m oclcomplib
Reviewer: Brian Sumner
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#53 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/oclrtGetInfo.cpp#16 edit
ECR #333753 - Compiler Lib/RT/Performance: Replace aclExtractSymbol/Section with aclQueryInfo for symbol/section detection.
The replaced calls in RT previously performed actual extraction of the sections from the BIF, with memory allocation and copying. What is actually needed is only to determine whether the section exists in the BIF, in order to decide which recompilations are needed. With aclQueryInfo and the newly added enums RT_CONTAINS_LLVMIR, RT_CONTAINS_OPTIONS, RT_CONTAINS_BRIG, RT_CONTAINS_HSAIL, RT_CONTAINS_ISA, the Runtime no longer queries whole sections but a bool flag that indicates the existence of the corresponding section(s), without any memory allocations. Every compilation in RT starting from LLVMIR is affected by the change, including compilation of blit kernels.
Side Effects: performance improvement, memory consumption reduction
Testing: pre check-in, ocl conformance (api, basic, compiler), ocltst complib
Reviewers: Brian Sumner, German Andryeyev, Artem Tamazov
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/api/v0_8/acl.cpp#21 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#52 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclEnums.h#14 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#180 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/CLEnumCheck.cpp#38 edit
ECR #333753 - HSA RT/Compiler Lib/Performance: Elimination of HSAIL text usage in RT
Extracting HSAIL from the binary and parsing it for the kernel names in RT were replaced with aclQueryInfo call for RT_KERNEL_NAMES.
Kernel names are now obtained from the names of the corresponding metadata symbols, which are already present in the BIF at the kernel finalization stage.
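A sketch of deriving kernel names from symbol names instead of parsing HSAIL text. The prefix/suffix convention is passed in as parameters because the actual metadata symbol naming scheme is not given here. kernelNamesFromSymbols is a hypothetical helper:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the part between prefix and suffix for every symbol that has
// both. Symbols that do not match the convention are skipped.
std::vector<std::string> kernelNamesFromSymbols(
        const std::vector<std::string>& symbols,
        const std::string& prefix,
        const std::string& suffix) {
    std::vector<std::string> names;
    const std::size_t wrapper = prefix.size() + suffix.size();
    for (const std::string& sym : symbols) {
        if (sym.size() <= wrapper) {
            continue;  // too short to contain a kernel name
        }
        if (sym.compare(0, prefix.size(), prefix) != 0) {
            continue;  // prefix mismatch
        }
        if (sym.compare(sym.size() - suffix.size(), suffix.size(),
                        suffix) != 0) {
            continue;  // suffix mismatch
        }
        names.push_back(sym.substr(prefix.size(), sym.size() - wrapper));
    }
    return names;
}
```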
Side effect: performance improvement
Next Step: Performance: elimination of BRIG disassembling to HSAIL as obligatory stage in Compiler Lib (previously was needed only by RT).
Testing: pre check-in, ocl conformance 2.0 (basic, api, compiler, workgroups, device_execution)
Reviewers: Stanislav Mekhanoshin, German Andryeyev, Brian Sumner
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/api/v0_8/acl.cpp#20 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#51 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclEnums.h#13 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bifbase.cpp#50 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bifbase.hpp#22 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.h#11 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#266 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#179 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/CLEnumCheck.cpp#37 edit
ECR #377625 - AMDIL Function support: allow functions without names to be not inlined.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/linker.cpp#114 edit
ECR #304775 - clp re-implementation - refactoring and generalization of clpVectorExpansion to work on both AMDIL and CPU path, HSAIL path not included yet.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/clc/clpSrc/build/Makefile.clp#5 edit
... //depot/stg/opencl/drivers/opencl/compiler/clc/clpSrc/clpVectorExpansion.cpp#27 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/linker.cpp#113 edit
... //depot/stg/opencl/drivers/opencl/library/common/src/commonConversions.cl#16 edit
... //depot/stg/opencl/drivers/opencl/library/x86/gen/build/Makefile.gen#16 edit
ECR #333753 - HSA HLC: decouple hsail inlining options and threshold from amdil/cpu
This allows selective enablement of the feature and selective tuning of the threshold depending on the target.
Testing: smoke, smoke_clang, precheckin
Reviewed by Nikolay Haustov
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/opt_level.cpp#21 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/Transforms/IPO/AMDOptOptions.h#6 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/IPO/AMDOptOptions.cpp#6 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/IPO/AMDPassManagerBuilder.cpp#44 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/tools/opt/amdopt.inc#21 edit
ECR #333753 - Compiler Lib/RT: Metadata related code refactor, annotation, minor fixes & additional checks
+ refactor if_aclQueryInfo() in order to simplify code and to avoid direct usage of aclMetadata struct members types
+ annotation on why we need to use deserializeCLMetadata on "serialized" (to NULL) pointers
+ erroneously forgotten RT_KERNEL_NAME was added to aclQueryType enum
+ OCLRTGetInfo, CLEnumCheck tests from ocltst oclcomplib were updated to use RT_KERNEL_NAME
+ testing of printf is added to OCLRTGetInfo
+ minor fixes and additional checks
tests: pre check-in, ocltst -m oclcomplib
Reviewers: Artem Tamazov, Brian Sumner, German Andryeyev
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#49 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclEnums.h#12 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclStructs.h#12 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#265 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/CLEnumCheck.cpp#36 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/clSourceShaders.h#5 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/oclrtGetInfo.cpp#14 edit
EPR #402000 - [CQE OCL][Perf][QR] ~6-7% perf drop in CompuCL Benchmark (Graphics: T-Rex subtest).
Add option to disable SC merge memory loads and stores. By default it is disabled. Will decide whether to enable it by default after performance runs.
cherrypick 1076590 and CL#1077419 from sc stg for adding option in sc.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/sc/Interface/SCCommon.h#42 integrate
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/sc/Src/CompilerBase.cpp#51 integrate
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/sc/Src/CompilerBase.hpp#35 integrate
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/sc/Src/HwUtils.cpp#36 integrate
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/scwrapper/scState.cpp#32 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/OPTIONS.def#114 edit
ECR #377625 - AMDIL Function support: Calculate total private memory usage by a kernel including memory used by called functions.
This cannot be done by IPA since stack size is known only after register allocation due to potential register spill, but MachineFunctionAnalysis cannot persist after CGSCC pass with current LLVM version.
This change adds private memory usage metadata for non-kernel functions. The total private memory usage by a kernel is calculated when AMDIL is split for different kernels. BIF will contain total private memory size.
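One way to combine per-function metadata into a kernel total is sketched below. It assumes an acyclic call graph and a worst-case call chain model (own usage plus the largest callee total). The changelist does not say which aggregation is actually used, and all names are hypothetical:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

struct FunctionInfo {
    std::size_t privateBytes;          // per-function metadata value
    std::vector<std::string> callees;  // functions called directly
};

using CallGraph = std::map<std::string, FunctionInfo>;

// Total for a function = its own private memory + the deepest callee
// chain. Unknown callees contribute 0. Recursion is not handled.
std::size_t totalPrivateBytes(const CallGraph& graph,
                              const std::string& name) {
    auto it = graph.find(name);
    if (it == graph.end()) {
        return 0;
    }
    std::size_t deepest = 0;
    for (const std::string& callee : it->second.callees) {
        deepest = std::max(deepest, totalPrivateBytes(graph, callee));
    }
    return it->second.privateBytes + deepest;
}
```

For a kernel using 16 bytes that calls f (32 bytes, which calls h with 4 bytes) and g (8 bytes), this model gives 16 + (32 + 4) = 52 bytes.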
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/amdilUtils.cpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/amdilUtils.hpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Target/AMDIL/AMDILKernelManager.cpp#451 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Target/AMDIL/AMDILKernelManager.h#51 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#175 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.hpp#54 edit
ECR #333753 - HSA HLC: remove code changing the default filetype which is set by an external tool such as llc
Effectively, llc now produces a text hsail file by default, which is standard llc behaviour. Use -filetype=obj to obtain brig.
Note, test_driver.pl is already patched to preserve old behaviour.
Testing: smoke, smoke_clang, precheckin
Reviewed by Nikolay Haustov
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/codegen.cpp#58 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Target/HSAIL/HSAILTargetMachine.cpp#33 edit
EPR #405194 - Change unroll threshold to LLVM default to partially work around Linpack performance problem.
Prior to CL 1058428, which increased the unroll threshold to 200, this was only 100 which is lower than the LLVM default. Linpack's new ISA has increased register usage, but decreasing the unroll threshold to the previous level does not reduce the register count to its previous level. The increased register usage is probably a new SC problem, so this should probably be increased again in the future. There is no change in register usage with 100 vs. 150 on Linpack.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/OPTIONS.def#113 edit
ECR #304775 - Remove _ in hsail_64 triple enum name. It isn't consistent with itself, or most other targets. The string form is already "hsail64", but the target name is sometimes "hsail-64". Does not remove the - in "hsail-64" for the target name since users could be depending on that, although that should also be fixed.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/codegen.cpp#57 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/linker.cpp#110 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/ADT/Triple.h#36 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/AMDLLVMContextHook.h#22 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/MC/MCObjectFileInfo.cpp#14 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Support/Triple.cpp#47 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Target/HSAIL/TargetInfo/HSAILTargetInfo.cpp#4 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/SPIR/AMDSPIRLoader.cpp#82 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/Scalar/AMDLowerAtomics.cpp#13 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/Utils/AMDUtils.cpp#2 edit
ECR #333756 - HSA Finalizer: added runtime option to force buffer instructions for global access
This can be used under ORCA RT.
Testing: smoke, smoke_clang, precheckin, clbas dgemm
Reviewed by Nikolay Haustov
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/scwrapper/SI/scStateSI.cpp#24 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/OPTIONS.def#112 edit
EPR #405458 - clinfo segfaults when ENABLE_CAL_SHUTDOWN=1.
For the global variables of:
std::map <std::string, int> OptionNameMap[2];
std::map <std::string, int> NoneSeparatorOptionMap[2];
std::map <std::string, int> FOptionMap;
std::map <std::string, int> MOptionMap;
We don't need to call the clear() method explicitly, since the std::map destructor will clean things up (valgrind mem-check doesn't report any leak related to these global variables after this change). Besides, on Linux amd::option::teardown() is called after the global variables' destructors are called, and it will cause segfault.
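A reduced sketch of the resulting pattern. FOptionMap is the name from the list above. The sketch namespace, registerOption and the option key are illustrative only:

```cpp
#include <cassert>
#include <map>
#include <string>

namespace sketch {

// Global map with static storage duration. Its destructor runs
// automatically at program exit and releases all entries.
std::map<std::string, int> FOptionMap;

void registerOption(const std::string& name, int id) {
    FOptionMap[name] = id;
}

// teardown() deliberately leaves the global maps alone. If it ran after
// the map destructors, calling clear() on an already destroyed object
// would be undefined behaviour.
void teardown() {}

}  // namespace sketch
```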
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/options.cpp#29 edit
ECR #333753 - unify online/offline linkers
The code for "FixUpModule" from the online linker is now moved to
a common file under llvm/lib. This replaces the copy present in
llvm/tools/llvm-link, thus unifying the two linkers.
Reviewed by Stanislav Mekhanoshin, Yaxun Liu (Sam)
Passes smoke, smoke_clang and precheckin.
Also passes OpenCL 2.0 conformance tests.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/linker.cpp#109 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/AMDFixupKernelModule.h#1 add
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/AMDUtils.h#1 add
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Linker/AMDFixupKernelModule.cpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/Utils/AMDUtils.cpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/llvm/tools/llvm-link/AMDFixUpModule.cpp#12 delete
... //depot/stg/opencl/drivers/opencl/compiler/llvm/tools/llvm-link/llvm-link.cpp#48 edit
ECR #333753 - Compiler Lib: improve & refactor HSAIL text routines
+ HSAIL text is always being inserted into BIF now in one place of Codegen phase
+ AMDIL & HSAIL paths are unified at Codegen phase
+ Error handling is improved
Testing: make smoke_clang, pre check-in
Reviewers: Brian Sumner, Nikolay Haustov
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#47 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/hsail_be.cpp#32 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/hsail_be.hpp#10 edit
ECR #333753 - Compiler Lib & RT: Fix for Compiler's build log printing on RT.
+ RT now correctly requests the Compiler's build log via aclGetCompilerLog().
+ BuildLog is added for HSAILKernel by moving it from NullKernel class to Kernel class.
+ Compiler Lib's appendLogToCL() is fixed.
+ Usage of the API functions aclExtractSection/aclExtractSymbol/aclInsertSection/aclInsertSymbol inside Compiler Lib itself is replaced by their internal implementations extSec/extSym/insSec/insSym, because the API functions clear the build log unnecessarily.
+ Phase info is added to the build log even if a CallBack function is not provided for aclCompiler.
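The build log issue with the API entry points can be sketched on a toy class. ToyCompiler and its members are hypothetical. Only the idea that the public call clears the log while the internal one does not is taken from the list above:

```cpp
#include <cassert>
#include <string>

class ToyCompiler {
public:
    // Public API style entry point: starts from an empty build log, so
    // anything logged by earlier phases is lost.
    bool insertSection(const std::string& name) {
        log_.clear();
        return insSec(name);
    }

    // Internal implementation: does the work and leaves the log intact.
    bool insSec(const std::string& name) {
        if (name.empty()) {
            log_ += "error: empty section name\n";
            return false;
        }
        return true;
    }

    void appendLog(const std::string& msg) { log_ += msg; }
    const std::string& log() const { return log_; }

private:
    std::string log_;
};
```

Calling insSec() from inside the library keeps earlier phase messages, whereas calling insertSection() at the same point would wipe them.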
How to verify:
set AMD_OCL_BUILD_OPTIONS_APPEND="-print-compile-phases -buildlog=stdout"
test_integer_ops integer_ctz
test_integer_ops integer_ctz cpu
Testing: make smoke_clang, selective OCL conf. tests, pre check-in
Reviewer: Brian Sumner, German Andryeyev
Review board: http://ocltc.amd.com/reviews/r/5582/
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#46 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/brig_loader.cpp#13 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/hsail_be.cpp#31 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.cpp#4 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#228 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#262 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#100 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#174 edit
ECR #333753 - RS compilation path.
RS compilation will be in 2 stages, first generates BRIG and the next is done via HSA Finalize API (that involves a load step).
Existing code in compiler/lib has a bug: when the final output expected is HSAIL_BINARY, compilation should stop after invoking the llvm compiler (and the built-in assembler), not go all the way to ISA.
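The intended behaviour can be sketched as a pipeline that runs only up to the requested output. The Stage enum and stagesToRun are hypothetical and do not mirror the actual Compiler Lib types:

```cpp
#include <cassert>
#include <vector>

// Illustrative stage order. Each stage consumes the previous one.
enum class Stage { Source = 0, LlvmIr = 1, HsailBinary = 2, Isa = 3 };

// Returns the stages to execute to get from `from` to `to`, stopping at
// the requested target instead of always continuing to ISA.
std::vector<Stage> stagesToRun(Stage from, Stage to) {
    std::vector<Stage> run;
    for (int s = static_cast<int>(from) + 1; s <= static_cast<int>(to); ++s) {
        run.push_back(static_cast<Stage>(s));
    }
    return run;
}
```

With Stage::HsailBinary as the target, the returned list ends at BRIG generation, and the finalization step is left to the HSA Finalize API.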
Tests: precheckin, hsa smoke
hsa/tests/RS/ test harness will be changed in a separate changelist.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#44 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/aoc2.cpp#57 edit