SWDEV-76911 - Set the output pipeline of multi-shader compilation the same way the driver does. Fill output shader pointers only if they will be generated from the input shaders; otherwise set them to NULL.
ReviewBoardURL = http://dxreview.amd.com/r/20059/
Affected files ...
... //depot/stg/sc/Src/Dev/TestEngine.cpp#512 edit
SWDEV-90482 - [Afterswitch] Interop from OpenGL to OpenCL is broken in one driver and crashes in another
- Make sure SRD resource is reported to OS if program contains static samplers
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#314 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.hpp#67 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.cpp#4 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palprogram.hpp#4 edit
SWDEV-90618 - cl_kernel_info_amd always returns 0 when working via HSAIL path
- Don't access GPU device specific data for offline compilation
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#313 edit
SWDEV-80874 - fixed out of bound access to the printf format string
We do not really need two separate induction variables, pos and i, and we had a bug of not incrementing i as needed.
The only reason it used to work is because all strings we used for testing ended with '\n'.
The bug resulted in ignoring this '\n', but the code unconditionally adds '\n', so nobody noticed.
If you try to print anything with any other escape, a '\n' that is not at the end, or a colon, an assertion fires.
That is fixed, and a newline is now added only if the last symbol in the user's format was not a newline, because otherwise
we would now print two newlines. NB: I prefer to use a bool variable rather than addressing the last symbol of the string,
which could be empty.
A side note: why do we run the flex scanner past the last colon? If we did not, we would not need this double encoding at all.
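The scan described above can be sketched as follows. This is a minimal illustration, not the code in gpukernel.cpp: the function name decodeFormat and the set of handled escapes are assumptions. It uses a single index for the whole string and a bool to remember whether the last emitted symbol was a newline.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical decoder: turns two-character escapes such as "\\n" or "\\:"
// back into single characters, using one index for the whole scan.
std::string decodeFormat(const std::string& encoded) {
    std::string out;
    bool endsWithNewline = false;  // tracked explicitly; `out` may be empty
    for (std::size_t i = 0; i < encoded.size(); ++i) {
        char c = encoded[i];
        if (c == '\\' && i + 1 < encoded.size()) {
            ++i;  // consume the escaped character as well
            switch (encoded[i]) {
                case 'n': c = '\n'; break;
                case 't': c = '\t'; break;
                default:  c = encoded[i]; break;  // e.g. "\\:" becomes ':'
            }
        }
        out.push_back(c);
        endsWithNewline = (c == '\n');
    }
    // Append a newline only when the user's format did not end with one.
    if (!endsWithNewline) out.push_back('\n');
    assert(out.back() == '\n');
    return out;
}
```

With this shape, an encoded "x\\n" yields a single trailing newline, and an empty format yields just a newline.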
Testing: smoke, precheckin, conformance printf with HSAIL forced, custom test
Reviewed by German Andreev
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#309 edit
SWDEV-80864 - HSAIL Metadata Workgroup Size Hint and Vec Type Hint added to HSAIL Runtime
Runtime changes required for the use of these two metadata:
- Runtime's gpukernel.cpp requires new aclQueries during HSAILKernel::Init
- One for querying WorkGroupSizeHint's array
- Two for size of VecTypeHint and fetching VecTypeHint's string
- initArgList needs to be moved to end of HSAILKernel::init to allow createSignature to get non empty values
- Compiler lib's workgroup hint (wsh) needs to match runtime's type (size_t)
- In the Kernel constructor, set the workGroupInfo struct's variables to their defaults explicitly instead of using memset, which corrupts std::string
Also fixed wavesPerSimdHint to use size_t to match runtime.
Updated CLAssumptionCheck.cpp since aclMetadata structure was modified.
Note: This is the runtime counterpart to submitted CL#1204512. (Post Review#8808, SWDEV-79695)
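The memset issue mentioned in the list above can be sketched like this. The struct and its field names are hypothetical stand-ins, not the runtime's actual workGroupInfo definition; the point is only that a struct holding a std::string has to be reset field by field.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>

// Hypothetical stand-in for the runtime's work-group info structure.
struct WorkGroupInfo {
    std::size_t size;
    std::size_t compileSizeHint[3];
    std::string compileVecTypeHint;
    std::size_t wavesPerSimdHint;
};

// A std::string member makes the struct non-trivially-copyable, so
// memset(&info, 0, sizeof(info)) would clobber the string's internal state.
static_assert(!std::is_trivially_copyable<WorkGroupInfo>::value,
              "memset is not safe on this type");

// Set every field explicitly instead.
void resetWorkGroupInfo(WorkGroupInfo& info) {
    info.size = 0;
    for (std::size_t& hint : info.compileSizeHint) hint = 0;
    info.compileVecTypeHint.clear();
    info.wavesPerSimdHint = 0;
}
```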
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/legacy-lib/include/v0_8/aclStructs.h#5 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclStructs.h#22 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#260 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#308 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/CLAssumptionCheck.cpp#48 edit
SWDEV-79399 - OpenCL printf does not print correctly when the printf builtin function is called twice - clear the local printf info each time
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#305 edit
SWDEV-78915 - SYCL - segfault building SPIR binary where the kernel name exceeds 255 characters - changed kernel/arg name type from char[] to string to avoid the 256 character limitation.
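A minimal sketch of the idea, assuming a hypothetical KernelRecord type (the real kernel and argument structures are not shown in this description): storing the name in a std::string removes the fixed-size buffer and with it the length limit.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical kernel/argument record. A fixed `char name[256]` member would
// overflow or truncate for names of 256 characters or more.
struct KernelRecord {
    std::string name;  // grows to fit any name length
};

KernelRecord makeKernelRecord(const char* rawName, std::size_t length) {
    assert(rawName != nullptr);
    KernelRecord rec;
    rec.name.assign(rawName, length);  // copies exactly `length` characters
    return rec;
}
```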
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#304 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#122 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#211 edit
SWDEV-59579 - using existing function to simplify the code.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#301 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#385 edit
SWDEV-77321 - Runtime to replace metadata LimitWave with WavePerSimdHint.
The compiler has changed LimitWave to WavePerSimdHint, so the runtime needs to make the corresponding change for the Wave Limiter to continue working.
WavePerSimdHint=1,...,10 will be dealt with later.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#258 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#300 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#119 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuwavelimiter.cpp#8 edit
ECR #333753 - ORCA RT/Compiler Lib/aoc2: AMD HSA Code Object Import feature (part II) - arbitrary hidden (extra) kernargs support
Only HSAIL path is affected. It doesn't affect blit kernels.
To use offline by aoc2:
aoc2 -hsacodeobject=<importing_code_object_filename> -numhiddenkernargs=<num> -cl-std=CL2.0 -march=hsail(-64) -mdevice=Bonaire <source_cl_filename>
To use online by setting env:
AMD_DEBUG_HSA_NUM_HIDDEN_KERNARGS=<num>
where num >= 0. If num == 0, then no additional arguments will be added on RT for every kernel. The default value is unchanged and equal to 6 for now.
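For illustration only, the online setting could be read along these lines. The helper names are hypothetical and the runtime's real flag handling is not shown in this description; only the variable name and the default of 6 come from the text above. Falling back to the default on malformed input is an assumption of this sketch.

```cpp
#include <cassert>
#include <cstdlib>

// Default taken from the description above; everything else is illustrative.
const unsigned kDefaultHiddenKernargs = 6;

// Parses a count from `text`. A null pointer, an empty string, trailing junk
// or a negative value all fall back to the default.
unsigned parseHiddenKernargs(const char* text) {
    if (text == nullptr || *text == '\0') return kDefaultHiddenKernargs;
    char* end = nullptr;
    long value = std::strtol(text, &end, 10);
    assert(end != nullptr);  // strtol always sets the end pointer
    if (*end != '\0' || value < 0) return kDefaultHiddenKernargs;
    return static_cast<unsigned>(value);
}

// Reads the environment variable named in the description.
unsigned hiddenKernargsFromEnv() {
    return parseHiddenKernargs(std::getenv("AMD_DEBUG_HSA_NUM_HIDDEN_KERNARGS"));
}
```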
Misc:
+ get rid of PRE & POST defines in Compiler Lib, as they started to conflict with ugl\gl\gs\hwl\ headers with the same defines.
+ minor copy/paste eliminations & typo fixes
+ ocltst complib tests update
Testing: pre check-in, manually based on ocl sdk MatrixMultiplication
Reviewers: Brian Sumner, German Andryeyev, Nikolay Haustov, Artem Tamazov
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#72 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/hsail_be.cpp#49 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/metadata.cpp#8 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclDefs.h#5 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclEnums.h#19 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclStructs.h#17 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/bif_section_labels.hpp#21 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.cpp#10 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.h#20 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/aoc2.cpp#74 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#181 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#249 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#291 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#113 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#199 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#369 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsaprogram.cpp#38 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsakernel.cpp#8 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsakernel.hpp#5 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsaprogram.cpp#19 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsavirtual.cpp#43 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/CLAssumptionCheck.cpp#43 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/CLEnumCheck.cpp#44 edit
EPR #419362 - Forum [170348]: problem with printf for OpenCL 2.0 kernel build - added a set of missing symbols to flex, including space; added an extra backslash to escape sequences that could not be handled by flex, and also to the colon, which is defined in flex as a delimiter.
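The extra-backslash encoding can be sketched as below. The function name and the exact set of escaped characters are assumptions; only the colon is named explicitly in the description.

```cpp
#include <cassert>
#include <string>

// Illustrative encoder: which characters need escaping in the real parser is
// an assumption here, apart from the colon.
std::string encodeForMetadata(const std::string& fmt) {
    std::string out;
    for (char c : fmt) {
        switch (c) {
            case '\n': out += "\\n";  break;
            case '\t': out += "\\t";  break;
            case '\\': out += "\\\\"; break;
            case ':':  out += "\\:";  break;  // ':' is the field delimiter
            default:   out.push_back(c); break;
        }
    }
    assert(out.size() >= fmt.size());  // encoding never shortens the string
    return out;
}
```

The consumer of the metadata then has to undo this encoding after the scanner has split the fields.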
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/MDParser/AMDILMDParser.l#3 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/MDParser/AMDILMDParser.output#2 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/MDParser/lex.yy.cpp#4 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Target/HSAIL/HSAILProducePrintfMetadata.cpp#6 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/Scalar/AMDPrintfRuntimeBinding.cpp#10 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#290 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/compiler/OCLPrintSpecialChars.cpp#1 add
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/compiler/OCLPrintSpecialChars.h#1 add
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/compiler/TestList.cpp#41 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/compiler/build/Makefile.compiler#44 edit
ECR #304775 - First batch of build fixes for clang.
Fixes hard source errors and a handful of simple warnings, but leaves most other warnings for later. Other errors not fixed here are from adding compile flags that are not understood.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/clc/src/e2lCommon.h#11 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/opt_level.hpp#4 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Target/HSAIL/BRIGAsmPrinter.cpp#117 edit
... //depot/stg/opencl/drivers/opencl/opencldefs#162 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#289 edit
ECR #333753 - ORCA RT/Compiler Lib: HSA Code Object/RT independent loader introducing/integration into OpenCL.
Changes by Evgeniy Mankov.
Purpose:
Use the same Finalizer & loader for both HSA & ORCA RT.
AMDIL path is not affected.
Changes:
1. The whole BRIG is finalized now instead of per kernel finalization (both in gpuprogram & hsail_be).
2. HSALoader is changed in order to work with CodeObject and the new HSA Loader's API - Context. Now it is in ORCA's gpuprogram instead of Compiler Lib.
3. brig_loader.cpp is removed from compiler lib, as well as __aclHSALoader function exports from the whole stack.
4. BIF .text section now contains the whole finalized HSA CodeObject instead of separate symbols for finalized kernels.
5. ORCA RT now works directly with amd_kernel_code_t and doesn't need any SC metadata anymore.
6. aoc2 is supplemented with fake offline loader correspondingly.
7. amdocl/complib make system changes.
8. test_driver.pl update.
ToDo:
1. Implement disassemble() & BuildLog() functions to support ISA dumping & SC error handling (Konstantin).
2. Global variables initialization by pragma reference (Konstantin). Test to verify: test_basic progvar_prog_scope_init.
3. Code Object without kernels support (Nikolay - ready). Test to verify: test_generic_address_space.exe library_function
testing: windows smoke, pre check-in, ocl conformance 2.0, ocl SDK 2.9
Reviewers: Nikolay Haustov, German Andryeyev
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/amdocl.def.in#13 edit
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/amdocl.map.in#15 edit
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/build/Makefile.api#116 edit
... //depot/stg/opencl/drivers/opencl/compiler/legacy-lib/amdoclcl.def.in#2 edit
... //depot/stg/opencl/drivers/opencl/compiler/legacy-lib/amdoclcl.map.in#2 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/amdoclcl.def.in#12 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/amdoclcl.map.in#11 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#70 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/build/Makefile.gpu#32 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/hsail_be.cpp#44 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/build/Makefile.complib#85 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.cpp#9 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.h#18 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/aoc2.cpp#70 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/build/Makefile.aoc2#24 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#248 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudefs.hpp#121 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#288 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#112 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#194 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.hpp#59 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuscsi.cpp#33 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#368 edit
... //depot/stg/opencl/drivers/opencl/tests/hsa/bin/test_driver.pl#12 edit
ECR #333756 - HSA Finalizer: Make sure the size of the kernarg segment and the alignment of the kernarg, private and group segments are multiples of 16. Update ORCA runtime assert. [ OpenCL integration of CL 1151953]
Change by Nikolay Haustov
Testing: http://ocltc:8111/viewModification.html?modId=51851&personal=true&init=1&tab=vcsModificationBuilds
Also fix uncovered problem in test.
Testing: pre-checkin
Reviewed by: German
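The multiple-of-16 requirement from this change can be illustrated with a generic round-up helper. The function names are hypothetical; this is not the finalizer's code.

```cpp
#include <cassert>
#include <cstdint>

// Rounds `value` up to the next multiple of `alignment` (a power of two).
std::uint64_t alignUp(std::uint64_t value, std::uint64_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// Check of the kind a runtime assert could perform on finalizer output.
bool isMultipleOf16(std::uint64_t value) {
    return (value & 15u) == 0;
}
```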
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/sc/HSAIL/hsail-fin/HSAILFinalizer.cpp#16 integrate
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/sc/HSAIL/tests/src/finalizer/features/structural_analysis/short_circuit/short_circuit06.hsail#4 integrate
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#284 edit
ECR #304775 - Wave limiter: Fix bug in adaptation.
The dumped waves/simd value was incorrect.
Adaptation should exit only after the changed waves/simd value is applied.
Added a wave limiter manager to handle the situation where one kernel is enqueued to more than one queue. A wave limiter is created for each virtual device.
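A rough sketch of the manager idea, with hypothetical type and member names (the real classes live in gpuwavelimiter.hpp and are not reproduced here): the kernel owns a manager that hands out one limiter per virtual device, so queues do not share adaptation state.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>

struct VirtualDevice {};  // hypothetical placeholder for a queue's device

// Hypothetical per-queue state; the real limiter tracks far more.
struct WaveLimiter {
    unsigned wavesPerSimd = 0;  // 0 means "no limit" in this sketch
};

// Owned by a kernel: hands out one limiter per virtual device, so two queues
// running the same kernel never share adaptation state.
class WaveLimiterManager {
public:
    WaveLimiter& limiterFor(const VirtualDevice* vdev) {
        assert(vdev != nullptr);
        std::unique_ptr<WaveLimiter>& slot = limiters_[vdev];
        if (!slot) slot.reset(new WaveLimiter());  // created on first use
        return *slot;
    }
    std::size_t limiterCount() const { return limiters_.size(); }

private:
    std::map<const VirtualDevice*, std::unique_ptr<WaveLimiter>> limiters_;
};
```

The sketch is not thread-safe; a real runtime would need locking or per-queue ownership around the lookup.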
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#245 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#283 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#109 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#360 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuwavelimiter.cpp#4 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuwavelimiter.hpp#3 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.cpp#70 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.hpp#76 edit
EPR #403782 - IOMMU2/SVM
- Enable SCOption_R1200_ENABLE_XNACK whenever IOMMUv2 is supported.
- Add "-sc-xnack-iommu" option for compile and link and pass this to SCWrapper in the options string.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/7266/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/scwrapper/SI/scStateSI.cpp#30 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/OPTIONS.def#122 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#282 edit
EPR #413091 - fixed a bug in gpukernel processing: for mGPU support, the writer counting of an SVM memory object passed as a kernel argument also needs to be updated if the memory object is writable by the kernel.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#276 edit
EPR #410797 - Specific OCL kernel is 5x slower on Hawaii than on Nvidia K40 GPU when tested under Linux.
- The logic for local workgroup size search was prioritizing ALU utilization, but with multidimensional launches the X dimension could affect address calculation and cacheline utilization more than the others. Take the cacheline size into consideration.
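One way such a search could weigh the cacheline is sketched below. This is a hypothetical heuristic written for illustration, not the logic in gpukernel.cpp: the function name, the parameters and the preference order are all assumptions.

```cpp
#include <cassert>
#include <cstddef>

struct LocalSize { std::size_t x; std::size_t y; };

// Picks a 2D local size with x * y <= maxGroupSize where both values divide
// the global size. Among the candidates it prefers, in order:
//   1. x covering at least one cache line of consecutive work-items,
//   2. the largest total work-group size (ALU utilization).
LocalSize pickLocalSize(std::size_t globalX, std::size_t globalY,
                        std::size_t maxGroupSize,
                        std::size_t cachelineBytes, std::size_t bytesPerItem) {
    assert(bytesPerItem != 0 && maxGroupSize != 0);
    const std::size_t wantX = cachelineBytes / bytesPerItem;
    LocalSize best = {1, 1};
    bool bestCovers = (wantX <= 1);
    for (std::size_t x = 1; x <= maxGroupSize && x <= globalX; ++x) {
        if (globalX % x != 0) continue;
        for (std::size_t y = 1; x * y <= maxGroupSize && y <= globalY; ++y) {
            if (globalY % y != 0) continue;
            const bool covers = (x >= wantX);
            const bool bigger = (x * y > best.x * best.y);
            if ((covers && !bestCovers) || (covers == bestCovers && bigger)) {
                best = {x, y};
                bestCovers = covers;
            }
        }
    }
    return best;
}
```

A pure ALU-utilization search could pick a tall, narrow group such as 1x256 for a 1024x1024 launch; with a 64-byte line and 4-byte items, this sketch settles on 16x16 instead.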
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#270 edit
EPR #403782 - IOMMU2/SVM
- For finegrainsystem, the app can pass a malloced pointer directly to the kernel. Copy pointer directly to the aqlArgBuf without exiting.
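The direct copy can be pictured as follows. The helper and the byte-vector argument buffer are hypothetical; the real aqlArgBuf handling is not shown in this description.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Writes the raw host pointer value into the kernel argument buffer at
// `offset`. No memory object lookup is involved in this sketch.
void writePointerArg(std::vector<unsigned char>& argBuf, std::size_t offset,
                     const void* hostPtr) {
    if (argBuf.size() < offset + sizeof(hostPtr)) {
        argBuf.resize(offset + sizeof(hostPtr));  // grow to fit the slot
    }
    assert(argBuf.size() >= offset + sizeof(hostPtr));
    std::memcpy(argBuf.data() + offset, &hostPtr, sizeof(hostPtr));
}
```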
ReviewBoardURL = http://ocltc.amd.com/reviews/r/6378/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#269 edit