EPR #419072 - [OpenCL2.0] Enable 16MB large on device queues
- Enable device queue creation up to 12MB. That should allow running the Intel SDK sample from the EPR, which requires only a 6MB queue.
- Currently a queue larger than 12.5MB suffers significant performance degradation, so the maximum is capped at 12MB for now. In general it is preferable to use a queue size suited to the task rather than the maximum possible.
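A minimal sketch of sizing a device queue for the task under the 12MB cap described above. The constant and function names are hypothetical and do not come from the runtime; rejecting an oversized request (rather than clamping it) is an assumption made for illustration.

```python
MIB = 1024 * 1024
# 12MB cap from the description above; names here are hypothetical.
MAX_DEVICE_QUEUE_BYTES = 12 * MIB


def choose_device_queue_size(needed_bytes: int,
                             device_max_bytes: int = MAX_DEVICE_QUEUE_BYTES) -> int:
    """Size the queue for the task; reject requests above the device cap."""
    if needed_bytes <= 0:
        raise ValueError("queue size must be positive")
    if needed_bytes > device_max_bytes:
        raise ValueError("requested %d bytes, device allows at most %d"
                         % (needed_bytes, device_max_bytes))
    return needed_bytes
```

With these assumptions a 6MB request is accepted unchanged, while a 16MB request is rejected.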
Affected files ...
... //depot/stg/opencl/drivers/opencl/library/hsa/hsail/src/devenq/schedule.cl#10 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#115 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.hpp#38 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudefs.hpp#123 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#517 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusched.hpp#17 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#372 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.hpp#131 edit
EPR #010002 - Change OpenCL version number from 1868 to 1869.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1615 edit
EPR #010002 - Change OpenCL version number from 1867 to 1868.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1614 edit
EPR #010002 - Change OpenCL version number from 1866 to 1867.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1613 edit
ECR #304775 - Fork llvm-link into new tool called opencl-link
Most of what the patched llvm-link does has nothing to do with linking LLVM IR, and more to do with loading SPIR and specifically handling the builtin library.
Forking this into a separate tool is the fastest way to remove the OpenCL build's dependency on large LLVM patches. With this in place, it should now be possible to move the various linker and SPIR conversions out of the llvm directory and into the compiler library.
Ideally this would be fixed by:
1. Not always lowering the library from SPIR
2. Having a separate SPIR lowering tool for testing
3. Using function attributes and stub libraries for library FP options instead of linker flags
4. Structuring all of the SPIR conversions as passes and having a single PassManager handle all of the lowering / linking / optimization passes.
But accomplishing all of these will be more time consuming.
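As a rough illustration of item 4 only, the sketch below models a single pipeline object that runs lowering, linking, and optimization steps in order. It is not LLVM's PassManager API; the module representation and the three pass functions are hypothetical stand-ins.

```python
from typing import Callable, List

# A "module" is modelled as a list of strings purely for illustration.
Module = List[str]
Pass = Callable[[Module], Module]


class PassPipeline:
    """Runs registered passes in order, as a single pass manager would."""

    def __init__(self) -> None:
        self._passes: List[Pass] = []

    def add(self, p: Pass) -> "PassPipeline":
        self._passes.append(p)
        return self

    def run(self, module: Module) -> Module:
        for p in self._passes:
            module = p(module)
        return module


# Hypothetical stand-ins for SPIR lowering, library linking, and optimization.
def lower_spir(m: Module) -> Module:
    return [s.replace("spir:", "ir:") for s in m]


def link_builtins(m: Module) -> Module:
    return m + ["ir:builtin_lib"]


def drop_duplicates(m: Module) -> Module:
    return list(dict.fromkeys(m))
```

The point of the design is that ordering lives in one place: each conversion becomes a pass, and the tool only decides which passes to register.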
Branching
//depot/stg/opencl/drivers/opencl/compiler/llvm/tools/llvm-link/...
to //depot/stg/opencl/drivers/opencl/compiler/lib/linker/tools/opencl-link/...
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/build/Makefile.api#118 edit
... //depot/stg/opencl/drivers/opencl/compiler/Makefile#60 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/Makefile#34 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/build/Makefile.common#33 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/linker.cpp#131 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/build/Makefile.complib#88 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/compliblinkerlibs#1 add
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/Makefile#1 add
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/include/AMDFixupKernelModule.h#1 move/add
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/include/AMDPrelink.h#1 move/add
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/include/AMDResolveLinker.h#1 move/add
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/lib/AMDFixupKernelModule.cpp#1 move/add
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/lib/AMDPrelink.cpp#1 move/add
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/lib/AMDResolveLinker.cpp#1 move/add
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/lib/Makefile#1 add
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/lib/build/Makefile#1 add
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/lib/build/Makefile.linker#1 add
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/lib/clpVectorExpansion.cpp#1 move/add
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/tools/opencl-link/Android.mk#1 branch
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/tools/opencl-link/Makefile#1 branch
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/tools/opencl-link/build/Makefile#1 branch
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/tools/opencl-link/build/Makefile.opencl-link#1 add
... //depot/stg/opencl/drivers/opencl/compiler/lib/linker/tools/opencl-link/opencl-link.cpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/AMDFixupKernelModule.h#3 move/delete
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/AMDPrelink.h#2 move/delete
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/AMDResolveLinker.h#7 move/delete
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Linker/AMDFixupKernelModule.cpp#9 move/delete
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Linker/AMDPrelink.cpp#2 move/delete
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Linker/AMDResolveLinker.cpp#11 move/delete
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Linker/clpVectorExpansion.cpp#3 move/delete
... //depot/stg/opencl/drivers/opencl/compiler/llvm/llvmdefs#41 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/tools/llvm-link/llvm-link.cpp#55 edit
... //depot/stg/opencl/drivers/opencl/openclrules#87 edit
... //depot/stg/opencl/drivers/opencl/tests/hsa/bin/hsa_dist.pl#2 edit
... //depot/stg/opencl/drivers/opencl/tests/hsa/bin/test_driver.pl#18 edit
ECR #333753 - Compiler Lib: switch Bif Version to 3.1 by default for HSAIL
This is needed because of the recent introduction of the AMD HSA Code Object in BIF.
TODO (in separate changes):
1. Analyze the changes in sections/symbols and remove (if needed) those no longer used in BIF31 (for example, symISAMeta); check backward compatibility.
2. Move the bif versions/conversions code from libUtils to loader\Bif.
3. Refactor the bif versions/conversions code in order to get rid of copy/paste (templates?).
4. Drop aclBIFVersionCAL.
Testing: pre check-in
Reviewer: Brian Sumner, Nikolay Haustov
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/api/v0_8/acl.cpp#33 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclEnums.h#20 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bif.hpp#3 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bif20.cpp#14 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bif20.hpp#7 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bif21.cpp#7 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bif21.hpp#7 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bif30.cpp#26 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bif30.hpp#8 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bif31.cpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bif31.hpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bifbase.cpp#53 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bifbase.hpp#23 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/bif_section_labels.hpp#22 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.cpp#11 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.h#21 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/aoc2.cpp#75 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/binary/BIFAssumptionCheck.cpp#9 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/binary/BIFAssumptionCheck.hpp#5 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/CLEnumCheck.cpp#45 edit
EPR #010002 - Change OpenCL version number from 1865 to 1866.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1612 edit
EPR #421017 - IOMMU2/SVM on CZ Win10, the bit INST_ATC of COMPUTE_PGM_HI needs to be set for device enqueue.
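A minimal sketch of setting the INST_ATC flag when building the COMPUTE_PGM_HI value. The bit position (bit 8) and the 8-bit address field are assumptions that should be checked against the register specification; the function name is hypothetical.

```python
# Assumption: INST_ATC is bit 8 of COMPUTE_PGM_HI; verify against the register spec.
INST_ATC_SHIFT = 8
INST_ATC_MASK = 1 << INST_ATC_SHIFT


def compute_pgm_hi(code_address_hi: int, use_atc: bool) -> int:
    """Combine the high address bits with the INST_ATC flag."""
    value = code_address_hi & 0xFF  # assumed 8-bit address field
    if use_atc:
        value |= INST_ATC_MASK
    return value
```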
Affected files ...
... //depot/stg/opencl/drivers/opencl/library/hsa/hsail/src/devenq/schedule.cl#9 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusched.hpp#16 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#370 edit
EPR #424562 - Add Averaging algorithm to Wave Limiter.
1. Extract the algorithms to a sub-class of the wave limiter class.
2. Add Averaging algorithm
This averaging algorithm typically improves the performance of the BasemarkCL wave simulation by 8% on Tonga/Fiji compared with the current smooth algorithm. This change does not enable the averaging algorithm yet. Follow-up changes should intelligently select which algorithm to use.
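The change description does not spell out the averaging algorithm, so the following is only one plausible reading: keep a mean of measured kernel durations per candidate wave limit and prefer the limit with the lowest mean. The class and method names are hypothetical and are not taken from gpuwavelimiter.cpp.

```python
from collections import defaultdict
from typing import Dict, List


class AveragingWaveLimiter:
    """Tracks mean kernel duration per wave limit and picks the fastest."""

    def __init__(self, candidates: List[int]) -> None:
        if not candidates:
            raise ValueError("need at least one candidate wave limit")
        self._candidates = list(candidates)
        self._total: Dict[int, float] = defaultdict(float)
        self._count: Dict[int, int] = defaultdict(int)

    def record(self, waves: int, duration_us: float) -> None:
        """Add one measured kernel duration for the given wave limit."""
        self._total[waves] += duration_us
        self._count[waves] += 1

    def average(self, waves: int) -> float:
        n = self._count[waves]
        return self._total[waves] / n if n else float("inf")

    def best(self) -> int:
        # Unmeasured candidates average to infinity, so measured ones win.
        return min(self._candidates, key=self.average)
```

Averaging smooths out one-off slow or fast runs, at the cost of reacting more slowly when the workload changes.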
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuwavelimiter.cpp#7 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuwavelimiter.hpp#4 edit
EPR #010002 - Change OpenCL version number from 1864 to 1865.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1611 edit
EPR #397491 - Back out changelist 1177450. Disable 32 bit generic address space
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#130 edit
EPR #397491 - Enable generic address space for 32 bit on Windows
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#129 edit
ECR #333753 - ORCA RT/Compiler Lib/aoc2: AMD HSA Code Object Import feature (part II) - arbitrary hidden (extra) kernargs support
Only HSAIL path is affected. It doesn't affect blit kernels.
To use offline by aoc2:
aoc2 -hsacodeobject=<importing_code_object_filename> -numhiddenkernargs=<num> -cl-std=CL2.0 -march=hsail(-64) -mdevice=Bonaire <source_cl_filename>
To use online by setting env:
AMD_DEBUG_HSA_NUM_HIDDEN_KERNARGS=<num>
where num >= 0. If num == 0, the RT adds no additional arguments to any kernel. The default value is unchanged and is 6 for now.
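A minimal sketch of how the online setting could be read, using the environment variable name and the default of 6 given above. The function name and the handling of empty or negative values are assumptions, not the runtime's actual behavior.

```python
import os

ENV_NAME = "AMD_DEBUG_HSA_NUM_HIDDEN_KERNARGS"
DEFAULT_HIDDEN_KERNARGS = 6  # default stated in the description above


def num_hidden_kernargs(env=None) -> int:
    """Return the hidden-kernarg count, falling back to the default of 6."""
    env = os.environ if env is None else env
    raw = env.get(ENV_NAME)
    if raw is None or raw.strip() == "":
        return DEFAULT_HIDDEN_KERNARGS
    value = int(raw)  # raises ValueError on non-numeric input
    if value < 0:
        raise ValueError(ENV_NAME + " must be >= 0")
    return value
```

Passing a plain dict as `env` makes the lookup easy to exercise without touching the real process environment.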
Misc:
+ get rid of PRE & POST defines in Compiler Lib, as they started to conflict with ugl\gl\gs\hwl\ headers with the same defines.
+ minor copy/paste eliminations & typo fixes
+ ocltst complib tests update
Testing: pre check-in, manually based on ocl sdk MatrixMultiplication
Reviewers: Brian Sumner, German Andryeyev, Nikolay Haustov, Artem Tamazov
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#72 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/hsail_be.cpp#49 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/metadata.cpp#8 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclDefs.h#5 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclEnums.h#19 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclStructs.h#17 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/bif_section_labels.hpp#21 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.cpp#10 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.h#20 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/aoc2.cpp#74 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#181 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#249 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#291 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#113 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#199 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#369 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsaprogram.cpp#38 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsakernel.cpp#8 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsakernel.hpp#5 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsaprogram.cpp#19 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsavirtual.cpp#43 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/CLAssumptionCheck.cpp#43 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/CLEnumCheck.cpp#44 edit
EPR #010002 - Change OpenCL version number from 1863 to 1864.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1610 edit
EPR #010002 - Change OpenCL version number from 1862 to 1863.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1609 edit
EPR #010002 - Change OpenCL version number from 1861 to 1862.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1608 edit
EPR #010002 - Change OpenCL version number from 1860 to 1861.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1607 edit
EPR #010002 - Change OpenCL version number from 1859 to 1860.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1606 edit
EPR #010002 - Change OpenCL version number from 1858 to 1859.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1605 edit
EPR #419362 - Forum [170348]: problem with printf for OpenCL 2.0 kernel build - added a set of missing symbols (including space) to flex, added an extra backslash to escape sequences that flex could not handle, and also escaped the colon, which is defined in flex as a delimiter!
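The idea of escaping format-string characters before they reach a delimiter-based parser can be sketched as below. The exact character set is an assumption made for illustration; the real rules live in AMDILMDParser.l and the printf metadata passes, and the function names are hypothetical.

```python
# Assumption: ':' delimits metadata fields and these characters need escaping.
_ESCAPES = {"\\": "\\\\", ":": "\\:", "\n": "\\n", "\t": "\\t"}
_UNESCAPES = {"\\": "\\", ":": ":", "n": "\n", "t": "\t"}


def escape_format(fmt: str) -> str:
    """Prefix delimiter and control characters with a backslash."""
    return "".join(_ESCAPES.get(ch, ch) for ch in fmt)


def unescape_format(text: str) -> str:
    """Invert escape_format on the consumer side."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            nxt = text[i + 1]
            out.append(_UNESCAPES.get(nxt, nxt))
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)
```

After escaping, an unescaped colon can only be a field delimiter, so the parser can split fields safely and restore the original string afterwards.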
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/MDParser/AMDILMDParser.l#3 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/MDParser/AMDILMDParser.output#2 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/MDParser/lex.yy.cpp#4 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Target/HSAIL/HSAILProducePrintfMetadata.cpp#6 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/Scalar/AMDPrintfRuntimeBinding.cpp#10 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#290 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/compiler/OCLPrintSpecialChars.cpp#1 add
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/compiler/OCLPrintSpecialChars.h#1 add
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/compiler/TestList.cpp#41 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/compiler/build/Makefile.compiler#44 edit
EPR #010002 - Change OpenCL version number from 1857 to 1858.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1604 edit
EPR #010002 - Change OpenCL version number from 1856 to 1857.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1603 edit
ECR #333753 - cl-denorms-are-zero runtime change - comment out "singleFpDenorm_ = true" for now.
So AMD_GPU_FORCE_SINGLE_FP_DENORM is required to enable this, until a few issues are resolved.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#316 edit
EPR #010002 - Change OpenCL version number from 1855 to 1856.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1602 edit
ECR #333753 - Compiler Lib/aoc2: AMD HSA Code Object Import feature
Works only for a whole compilation from CL source to ISA. It does not apply to -srctoir, -irtocg, or -cgtoisa, where importing is meaningless.
Importing occurs instead of finalization.
To use offline by aoc2:
aoc2 -hsacodeobject=<importing_code_object_filename> -cl-std=CL2.0 -march=hsail(-64) -mdevice=Bonaire <source_cl_filename>
To use online by setting env:
AMD_DEBUG_HSA_CODE_OBJECT_INPUT=<importing_code_object_filename>
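A minimal sketch of the decision described above: import a code object instead of finalizing, but only for a whole source-to-ISA compilation. The environment variable name comes from the description; the function name, the stage labels, and the return convention are hypothetical.

```python
import os

ENV_NAME = "AMD_DEBUG_HSA_CODE_OBJECT_INPUT"
# Partial-compilation modes for which the description says import does not apply.
PARTIAL_STAGES = {"srctoir", "irtocg", "cgtoisa"}


def code_object_to_import(stage=None, env=None):
    """Return the code object path to import, or None to finalize as usual.

    `stage` is None for a whole source-to-ISA compilation.
    """
    env = os.environ if env is None else env
    path = env.get(ENV_NAME)
    if not path:
        return None
    if stage in PARTIAL_STAGES:
        return None
    return path
```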
Misc:
+ fix a few aoc2 bugs & typos
+ readFile/writeFile functions duplication removal
+ std::getenv instead of ::getenv (ToDo: ::putenv -> std::system in separate change)
+ fix memory leak in hsail_be.cpp on CodeObject
Testing: pre check-in, ursa.pl -t complib -M hsacodeobject
Reviewers: Nikolay Haustov, Brian Sumner
http://ocltc.amd.com/reviews/r/8058/
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/api/v0_8/acl.cpp#32 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#71 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/MDParser/Main.cxx#2 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/hsail_be.cpp#46 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.h#19 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/aoc2.cpp#73 edit
... //depot/stg/opencl/drivers/opencl/tests/hsa/bin/test_driver.pl#14 edit
... //depot/stg/opencl/drivers/opencl/tests/hsa/src/complib/aoc2/hsacodeobject/hsacodeobject.1.cl#1 add
... //depot/stg/opencl/drivers/opencl/tests/hsa/src/complib/aoc2/hsacodeobject/hsacodeobject.cl#1 add
... //depot/stg/opencl/drivers/opencl/tests/hsa/tlst/complib.tlst#7 edit
EPR #010002 - Change OpenCL version number from 1854 to 1855.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1601 edit
EPR #010002 - Change OpenCL version number from 1853 to 1854.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1600 edit
EPR #010002 - Change OpenCL version number from 1852 to 1853.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1599 edit
EPR #010002 - Change OpenCL version number from 1851 to 1852.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1598 edit
EPR #394700 - SubAllocation Scheme for ConstantBuffers Implemented in GLL.
This CL introduces sub-allocation for constant buffers on all ASICs. It is currently supported by GLL, and OpenCL and OES can easily opt in as well.
OpenCL changes for the sub-allocation scheme for constant buffers. For more info please see CL#1172125.
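The general idea of sub-allocation can be sketched as a bump allocator that hands out aligned slices of one large parent buffer instead of creating a separate buffer per request. This is not the GLL implementation; the class name, the 256-byte default alignment, and the reset policy are assumptions for illustration.

```python
class ConstantBufferSubAllocator:
    """Bump allocator handing out aligned slices of one large buffer."""

    def __init__(self, capacity: int, alignment: int = 256) -> None:
        if alignment <= 0 or alignment & (alignment - 1):
            raise ValueError("alignment must be a power of two")
        self.capacity = capacity
        self.alignment = alignment
        self._offset = 0

    def allocate(self, size: int):
        """Return the slice offset, or None if the parent buffer is full."""
        if size <= 0:
            raise ValueError("size must be positive")
        # Round the current offset up to the next aligned boundary.
        start = (self._offset + self.alignment - 1) & ~(self.alignment - 1)
        if start + size > self.capacity:
            return None
        self._offset = start + size
        return start

    def reset(self) -> None:
        """Recycle the whole parent buffer, e.g. once the GPU is done with it."""
        self._offset = 0
```

A caller that gets None back would typically switch to a fresh parent buffer or wait until the current one can be reset.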
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.cpp#75 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#126 edit