SWDEV-76911 - Bug 11070: ViennaCL fails in amd::hsa::loader::Executable::Destroy
Destructors of global variables run in undefined order, so loader cannot
be global. Encapsulate loader functionality in Loader class and link its lifecycle
to Runtime.
In HSA Runtime, create Loader as part of Runtime singleton.
In ORCA Runtime, create Loader as part of HSAILProgram class.
Testing: smoke, pre-checkin
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/sc/HSAIL/ext/amdhsacod/amdhsacod.cpp#10 integrate
... //depot/stg/opencl/drivers/opencl/compiler/sc/HSAIL/ext/loader/executable.cpp#12 integrate
... //depot/stg/opencl/drivers/opencl/compiler/sc/HSAIL/ext/loader/executable.hpp#8 integrate
... //depot/stg/opencl/drivers/opencl/compiler/sc/HSAIL/include/amd_hsa_loader.hpp#5 integrate
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/aoc2.cpp#78 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#210 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.hpp#61 edit
ECR #333753 - ORCA RT/Compiler Lib: HSA Code Object/RT independent loader introducing/integration into OpenCL.
Changes by Evgeniy Mankov.
Purpose:
Use the same Finalizer & loader for both HSA & ORCA RT.
AMDIL path is not affected.
Changes:
1. The whole BRIG is finalized now instead of per kernel finalization (both in gpuprogram & hsail_be).
2. HSALoader is changed in order to work with CodeObject and new HSA Loader's API <96> Context. Now it is in ORCA<92>s gpuprogram instead of Compiler Lib.
3. brig_loader.cpp is removed from compiler lib, as well as __aclHSALoader function exports from the whole stack.
4. BIF .text section now contains the whole finalized HSA CodeObject instead of separate symbols for finalized kernels.
5. ORCA RT now works directly with amd_kernel_code_t and doesn't need any SC metadata anymore.
6. aoc2 is supplemented with fake offline loader correspondingly.
7. amdocl/complib make sytem changes.
8. test_driver.pl update.
ToDo:
1. Implement disassemble() & BuildLog() functions to support ISA dumping & SC error handling (Konstantin).
2. Global variables initialization by pragma reference (Konstantin). Test to verify: test_basic progvar_prog_scope_init.
3. Code Object without kernels support (Nikolay - ready). Test to verify: test_generic_address_space.exe library_function
testing: windows smoke, pre check-in, ocl conformance 2.0, ocl SDK 2.9
Reviewers: Nikolay Haustov, German Andryeyev
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/amdocl.def.in#13 edit
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/amdocl.map.in#15 edit
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/build/Makefile.api#116 edit
... //depot/stg/opencl/drivers/opencl/compiler/legacy-lib/amdoclcl.def.in#2 edit
... //depot/stg/opencl/drivers/opencl/compiler/legacy-lib/amdoclcl.map.in#2 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/amdoclcl.def.in#12 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/amdoclcl.map.in#11 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#70 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/build/Makefile.gpu#32 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/hsail_be.cpp#44 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/build/Makefile.complib#85 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.cpp#9 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.h#18 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/aoc2.cpp#70 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/build/Makefile.aoc2#24 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#248 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudefs.hpp#121 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#288 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#112 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#194 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.hpp#59 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuscsi.cpp#33 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#368 edit
... //depot/stg/opencl/drivers/opencl/tests/hsa/bin/test_driver.pl#12 edit
ECR #333753 - HSA RT: avoiding superfluous recompilations on ORCA RT/HSA path (part 2)
+ support of -fno-bin-llvmir & -fno-bin-hsail options: do not check compiler options for recompilation decision.
As a result if the binary contains ISA, BRIG & HSAIL and the above options are specified when compiling from binary, then compilation options are not compared, recompilation doesn't occur. This makes possible to compile from binary with different set of options, for example: -just-kernel.
P.S. Brig & HSAIL should be in binary in order to initialize & execute kernel (even if ISA is presented).
Testing: pre check-in, compiler, api & basic ocl conformance 2.0 tests
Reviewers: German Andryeyev, Artem Tamazov
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#178 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.hpp#56 edit
ECR #333753 - HSA RT: avoiding superfluous recompilations on ORCA RT/HSA path
Next compilation stage determination based on binary sections and options (while linkImpl).
If current HSAILProgram options are equal to binarys ones:
- Do not generate BRIG if BRIG sections are already presented in binary.
- Do not finalize BRIG->ISA if ISA is already presented in binary.
- Perform only CG phase if HSAIL is absent in binary.
Always perform only brig loading (even in case of ISA presented).
Testing: pre check-in, compile & basic ocl conformance 2.0 tests
Reviewer: German Andryeyev
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpucompiler.cpp#150 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#264 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#101 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#177 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.hpp#55 edit
ECR #377625 - AMDIL Function support: Calculate total private memory usage by a kernel including memory used by called functions.
This cannot be done by IPA since stack size is known only after register allocation due to potential register spill, but MachineFunctionAnalysis cannot persist after CGSCC pass with current LLVM version.
This change adds private memory usage metadata for non-kernel functions. The total private memory usage by a kernel is calculated when AMDIL is split for different kernels. BIF will contain total private memory size.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/amdilUtils.cpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/amdilUtils.hpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Target/AMDIL/AMDILKernelManager.cpp#451 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Target/AMDIL/AMDILKernelManager.h#51 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#175 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.hpp#54 edit