ECR #333753 - HSA RT/Compiler Lib/Performance: Elimination of HSAIL text usage in RT
Extracting HSAIL from the binary and parsing it for the kernel names in RT was replaced with an aclQueryInfo call for RT_KERNEL_NAMES.
Kernel names are now obtained from the names of the corresponding metadata symbols, which are already present in the BIF at the kernel finalization stage.
Side effect: performance improvement
Next Step: Performance: elimination of BRIG disassembling to HSAIL as an obligatory stage in Compiler Lib (previously it was needed only by RT).
Testing: pre check-in, ocl conformance 2.0 (basic, api, compiler, workgroups, device_execution)
Reviewers: Stanislav Mekhanoshin, German Andryeyev, Brian Sumner
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/api/v0_8/acl.cpp#20 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#51 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclEnums.h#13 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bifbase.cpp#50 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/loaders/bif/bifbase.hpp#22 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.h#11 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#266 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#179 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/CLEnumCheck.cpp#37 edit
[ROCm/clr commit: f7c2190e63]
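The kernel-name lookup described in the entry above (ECR #333753) can be sketched roughly as follows. This is a minimal illustration only: it does not call the real aclQueryInfo interface, and the "__OpenCL_" prefix and "_metadata" suffix are a hypothetical naming scheme assumed for the example, not the scheme confirmed by this changelist. The helper name kernelNamesFromSymbols is likewise invented here.

```cpp
#include <string>
#include <vector>

// ASSUMPTION (illustration only): a kernel's metadata symbol is named
// "<prefix><kernel name><suffix>". The real BIF naming scheme may differ.
static const std::string kPrefix = "__OpenCL_";
static const std::string kSuffix = "_metadata";

// Hypothetical helper: recovers kernel names from a list of symbol names.
// Symbols that do not match the assumed scheme, or that would yield an
// empty kernel name, are skipped.
std::vector<std::string> kernelNamesFromSymbols(
    const std::vector<std::string>& symbols) {
  std::vector<std::string> names;
  for (const std::string& sym : symbols) {
    // Must be long enough to hold prefix, suffix and at least one character.
    if (sym.size() <= kPrefix.size() + kSuffix.size()) continue;
    if (sym.compare(0, kPrefix.size(), kPrefix) != 0) continue;
    if (sym.compare(sym.size() - kSuffix.size(), kSuffix.size(), kSuffix) != 0)
      continue;
    names.push_back(sym.substr(
        kPrefix.size(), sym.size() - kPrefix.size() - kSuffix.size()));
  }
  return names;
}
```

The point of the sketch is that the names come straight from the symbol table, so no HSAIL text has to be extracted or parsed.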
EPR #010002 - Change OpenCL version number from 1650 to 1651.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1397 edit
[ROCm/clr commit: 1001028b9b]
EPR #010002 - Change OpenCL version number from 1649 to 1650.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1396 edit
[ROCm/clr commit: 2343bbf8c6]
ECR #377625 - AMDIL Function support: allow functions without names not to be inlined.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/linker.cpp#114 edit
[ROCm/clr commit: 26ad0e1a8e]
EPR #010002 - Change OpenCL version number from 1648 to 1649.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1395 edit
[ROCm/clr commit: c0c60308d3]
ECR #304775 - clp re-implementation - refactoring and generalization of clpVectorExpansion to work on both the AMDIL and CPU paths; the HSAIL path is not included yet.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/clc/clpSrc/build/Makefile.clp#5 edit
... //depot/stg/opencl/drivers/opencl/compiler/clc/clpSrc/clpVectorExpansion.cpp#27 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/linker.cpp#113 edit
... //depot/stg/opencl/drivers/opencl/library/common/src/commonConversions.cl#16 edit
... //depot/stg/opencl/drivers/opencl/library/x86/gen/build/Makefile.gen#16 edit
[ROCm/clr commit: bb6fa26029]
ECR #304775 - Reduce the total number of renames to 16.
- Use 128KB for CB size on SI+
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#286 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#334 edit
[ROCm/clr commit: f48b935b43]
ECR #333753 - HSA HLC: decouple hsail inlining options and threshold from amdil/cpu
This allows selective enablement of the feature and selective tuning of the threshold depending on the target.
Testing: smoke, smoke_clang, precheckin
Reviewed by Nikolay Haustov
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/opt_level.cpp#21 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/Transforms/IPO/AMDOptOptions.h#6 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/IPO/AMDOptOptions.cpp#6 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/IPO/AMDPassManagerBuilder.cpp#44 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/tools/opt/amdopt.inc#21 edit
[ROCm/clr commit: 3faaeb958f]
EPR #010002 - Change OpenCL version number from 1647 to 1648.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1394 edit
[ROCm/clr commit: 654c244bd1]
EPR #406110 - OCL20:Basic subtest fails when running on GPU
- Reduce max prog variable size to 90% of max single allocation
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#465 edit
[ROCm/clr commit: 617422f40f]
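The 90% cap from the entry above (EPR #406110) amounts to simple arithmetic, sketched below. The function name maxProgramVariableSize and its parameter are hypothetical; the actual computation in gpudevice.cpp is not shown in this log and may round differently.

```cpp
#include <cstdint>

// Hypothetical helper: caps the program-variable size at 90% of the largest
// single allocation. Dividing before multiplying keeps the intermediate
// value from overflowing for very large inputs, at the cost of rounding
// the result down by at most 9 bytes.
std::uint64_t maxProgramVariableSize(std::uint64_t maxSingleAlloc) {
  return maxSingleAlloc / 10 * 9;
}
```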
EPR #010002 - Change OpenCL version number from 1646 to 1647.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1393 edit
[ROCm/clr commit: ad5a7696d1]
EPR #010002 - Change OpenCL version number from 1645 to 1646.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1392 edit
[ROCm/clr commit: 511742dee7]
EPR #010002 - Change OpenCL version number from 1644 to 1645.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1391 edit
[ROCm/clr commit: ae5911684c]
EPR #405824 - Back out changelist 1079967. It causes regressions on other APUs.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#464 edit
[ROCm/clr commit: 93e05cdac8]
EPR #399601 - Back out changelist 1080047 to have CZ report as 2.0 device.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#285 edit
[ROCm/clr commit: a510b4480b]
ECR #333753 - HSA RT: avoiding superfluous recompilations on ORCA RT/HSA path (part 2)
+ support of the -fno-bin-llvmir & -fno-bin-hsail options: do not check compiler options when deciding whether to recompile.
As a result, if the binary contains ISA, BRIG & HSAIL and the above options are specified when compiling from binary, the compilation options are not compared and recompilation does not occur. This makes it possible to compile from binary with a different set of options, for example: -just-kernel.
P.S. BRIG & HSAIL must be in the binary in order to initialize & execute a kernel (even if ISA is present).
Testing: pre check-in, compiler, api & basic ocl conformance 2.0 tests
Reviewers: German Andryeyev, Artem Tamazov
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#178 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.hpp#56 edit
[ROCm/clr commit: d13ba8f18c]
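The recompilation decision described in the entry above (ECR #333753, part 2) might look roughly like the sketch below. All type and function names here are hypothetical and are not taken from gpuprogram.cpp. Two further assumptions are made: that both -fno-bin-llvmir and -fno-bin-hsail must be given for the option comparison to be skipped, and that without them the stored and requested option strings are simply compared.

```cpp
#include <string>

// Hypothetical description of what a program binary contains.
struct BinaryContents {
  bool hasIsa;
  bool hasBrig;
  bool hasHsail;
};

// Hypothetical description of a build-from-binary request.
struct BuildRequest {
  bool noBinLlvmir;     // -fno-bin-llvmir was specified
  bool noBinHsail;      // -fno-bin-hsail was specified
  std::string options;  // remaining compile options, e.g. "-just-kernel"
};

// Returns true if the program has to be recompiled.
bool needsRecompile(const BinaryContents& bin, const BuildRequest& req,
                    const std::string& storedOptions) {
  // BRIG and HSAIL are needed to initialize and execute a kernel even when
  // ISA is present, so an incomplete binary always forces a recompile.
  if (!(bin.hasIsa && bin.hasBrig && bin.hasHsail)) return true;

  // ASSUMPTION: both flags are required to skip the option comparison.
  if (req.noBinLlvmir && req.noBinHsail) return false;

  // ASSUMPTION: otherwise, differing options trigger recompilation.
  return req.options != storedOptions;
}
```

With this shape, a request such as `{true, true, "-just-kernel"}` against a complete binary built with other options returns false, which matches the "compile from binary with a different set of options" case in the description.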
EPR #010002 - Change OpenCL version number from 1643 to 1644.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1390 edit
[ROCm/clr commit: 98d18cf816]
EPR #010002 - Change OpenCL version number from 1642 to 1643.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1389 edit
[ROCm/clr commit: 92f1356f92]
EPR #399601 - Back out changelist 1076725 so that CZ does NOT report as a 2.0 device. To be cherry-picked to mainline, with the back-out reverted afterwards.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#284 edit
[ROCm/clr commit: 7d55aee58a]
EPR #406216 - Revert CL#1076975 for Linux for now due to ASIC hang.
Keep the change for Windows.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#283 edit
[ROCm/clr commit: 81b9faadba]
EPR #405824 - On APUs, if we run out of local memory to allocate cl_mem objects, the OCL runtime will use remote (system) memory. Update maxMemAllocSize_ to include that.
Reviewed by: German
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#463 edit
[ROCm/clr commit: a16bef5482]
EPR #010002 - Change OpenCL version number from 1641 to 1642.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1388 edit
[ROCm/clr commit: 5ef908c5e7]
ECR #333753 - Compiler Lib/RT: Metadata related code refactor, annotation, minor fixes & additional checks
+ refactor if_aclQueryInfo() in order to simplify the code and to avoid direct usage of aclMetadata struct member types
+ annotation on why we need to use deserializeCLMetadata on "serialized" (to NULL) pointers
+ the erroneously omitted RT_KERNEL_NAME was added to the aclQueryType enum
+ the OCLRTGetInfo and CLEnumCheck tests from ocltst oclcomplib were updated to use RT_KERNEL_NAME
+ testing of printf is added to OCLRTGetInfo
+ minor fixes and additional checks
tests: pre check-in, ocltst -m oclcomplib
Reviewers: Artem Tamazov, Brian Sumner, German Andryeyev
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#49 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclEnums.h#12 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclStructs.h#12 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#265 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/CLEnumCheck.cpp#36 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/clSourceShaders.h#5 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/complib/oclrtGetInfo.cpp#14 edit
[ROCm/clr commit: d50fa706e3]
EPR #010002 - Change OpenCL version number from 1640 to 1641.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1387 edit
[ROCm/clr commit: 9f99843ca0]
EPR #010002 - Change OpenCL version number from 1639 to 1640.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1386 edit
[ROCm/clr commit: 5df649cb7c]
ECR #304775 - Add extra CP write operation for the resource warm-up
- Vidmm will page in the constant buffers before the actual usage
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#192 edit
[ROCm/clr commit: 2e23538a01]
EPR #010002 - Change OpenCL version number from 1638 to 1639.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1385 edit
[ROCm/clr commit: 52fa4fec8a]
EPR #397491 - disable OpenCL 2.0 for mainline when there are multiple devices in the system, because the SVM test fails even when run on the first device
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#462 edit
[ROCm/clr commit: e7b10515af]
EPR #010002 - Change OpenCL version number from 1637 to 1638.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1384 edit
[ROCm/clr commit: ddf76db1d4]