ECR #333753 - hsa_foundation RT/Compiler Lib: recompilation algorithm rework
1. Rework the recompilation algorithm to avoid superfluous recompilations.
2. Replace aclExtractSymbol/Section with aclQueryInfo for symbol/section detection.
The replaced calls in RT extracted the sections from the BIF, which involved memory allocation and copying. RT only needs to know whether a section exists in the BIF in order to decide which recompilations are needed. With aclQueryInfo and the newly added enums RT_CONTAINS_LLVMIR, RT_CONTAINS_OPTIONS, RT_CONTAINS_BRIG, RT_CONTAINS_HSAIL and RT_CONTAINS_ISA, the runtime no longer queries whole sections. It queries a bool flag that indicates whether the corresponding section(s) exist, without any memory allocation. Every compilation in RT that starts from LLVMIR is affected by the change, including compilation of blit kernels.
3. Fix ACL_INVALID_ARG detection in Compiler Lib (for wrong/unsupported compilations).
[Side Effects] performance improvement, memory consumption reduction
[ToDo] Do not finalize program if ISA is already provided in BIF and options are unchanged.
[Testing] pre check-in, ocltst complib, ocl conformance 2.0 compiler & api
[Reviewers] German Andryeyev, Brian Sumner
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#56 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsaprogram.cpp#8 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsaprogram.hpp#3 edit
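Illustration for item 2 of ECR #333753 (not code from this change): a minimal sketch of why an existence query is cheaper than an extraction. FakeBinary, containsByExtraction and containsByQuery are hypothetical names made up for this note; the real aclQueryInfo and aclExtractSection signatures are not shown here and are not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for a BIF container: section name -> raw bytes.
// Nothing here is part of the Compiler Lib API.
struct FakeBinary {
    std::map<std::string, std::vector<char>> sections;
};

// Old approach: extract the section (allocate + copy) only to learn
// whether it exists. bytesCopied accumulates the wasted copy size.
inline bool containsByExtraction(const FakeBinary& bin,
                                 const std::string& name,
                                 std::size_t& bytesCopied) {
    auto it = bin.sections.find(name);
    if (it == bin.sections.end()) {
        return false;
    }
    std::vector<char> copy(it->second);  // allocation and copy of the payload
    bytesCopied += copy.size();
    return true;
}

// New approach: answer the yes/no question without touching the payload.
inline bool containsByQuery(const FakeBinary& bin, const std::string& name) {
    return bin.sections.count(name) != 0;
}
```

Both functions return the same answer; only the extraction variant pays for an allocation and a copy proportional to the section size.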
EPR #010002 - Change OpenCL version number from 1679 to 1680.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1426 edit
EPR #408459 - Changed the implementation of svmAlloc so that the first device creates the amd::Memory object and the remaining devices only add GPU memory to it. This is part of the changes for mgpu support in svmalloc.
code review:
http://ocltc.amd.com/reviews/r/6245/
precheckin testing results:
http://ocltc.amd.com:8111/viewModification.html?modId=43136&personal=true&buildTypeId=&tab=vcsModificationTests
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpudevice.hpp#88 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#233 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#479 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#133 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsadevice.cpp#87 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsadevice.hpp#44 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsadevice.cpp#19 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsadevice.hpp#6 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/context.cpp#34 edit
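Illustration for EPR #408459 (not code from this change): a minimal sketch of the "first device creates, the rest attach" pattern described above. SharedAllocation, Device and allocateOnAll are hypothetical names; they do not model amd::Memory or the real device interface.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical shared memory object; stands in for the single object
// that all devices end up sharing.
struct SharedAllocation {
    std::size_t size = 0;
    std::vector<int> deviceIds;  // devices that attached backing memory
};

struct Device {
    int id;

    // If 'existing' is null, this device is the first one and creates the
    // shared object. Otherwise it only attaches its own memory to it.
    std::shared_ptr<SharedAllocation> svmAlloc(
        std::size_t size, std::shared_ptr<SharedAllocation> existing) const {
        if (!existing) {
            existing = std::make_shared<SharedAllocation>();
            existing->size = size;
        }
        assert(existing->size == size);  // all devices must agree on the size
        existing->deviceIds.push_back(id);
        return existing;
    }
};

// Walk all devices, threading the shared object through each call.
inline std::shared_ptr<SharedAllocation> allocateOnAll(
    const std::vector<Device>& devices, std::size_t size) {
    std::shared_ptr<SharedAllocation> mem;
    for (const Device& d : devices) {
        mem = d.svmAlloc(size, mem);
    }
    return mem;
}
```

With three devices, allocateOnAll produces one shared object that records three attached devices.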
EPR #408185 - Use pinned memory if directaccess is true and remoteAlloc is used.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#478 edit
EPR #405357 - [CQE DTB][valgrind][OCL2.0]:MemLeaks are observed with MonteCarloAsian sample.
Need to delete amdrtFunctions when it is no longer used.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/linker.cpp#116 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/AMDLLVMContextHook.h#23 edit
EPR #010002 - Change OpenCL version number from 1678 to 1679.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1425 edit
EPR #408459 - Added an env variable OCL_FORCE_CPU_SVM to the runtime so that the SVM feature can be enabled manually on a CPU device, even when the CPU device does not support OpenCL 2.0.
code review:
http://ocltc.amd.com/reviews/r/6190/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpudevice.cpp#268 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#218 edit
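Illustration for the OCL_FORCE_CPU_SVM entry above (not code from this change): a minimal sketch of an environment override for a capability check. The runtime defines its flags in flags.hpp; the std::getenv lookup, the rule that only the value "1" enables the flag, and the function names below are assumptions made for this note.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Assumption: the flag is on only when the variable is set to exactly "1".
inline bool parseFlag(const char* value) {
    return value != nullptr && std::strcmp(value, "1") == 0;
}

// Reads the override from the process environment.
inline bool forceCpuSvmFromEnv() {
    return parseFlag(std::getenv("OCL_FORCE_CPU_SVM"));
}

// SVM is enabled when the device supports OpenCL 2.0, or when the
// override forces it on for a device that does not.
inline bool cpuSvmEnabled(bool deviceSupportsOpenCL20, bool forced) {
    return deviceSupportsOpenCL20 || forced;
}
```

A caller would combine the two as cpuSvmEnabled(supports20, forceCpuSvmFromEnv()).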
EPR #406328 - Removed the customSVMallocator from the runtime and renamed customSvmAllocDevice to svmAllocDevice, because devices do not use a custom SVM allocator.
precheckin testing:
http://ocltc.amd.com:8111/viewModification.html?modId=43040&personal=true&buildTypeId=&tab=vcsModificationBuilds&show_all_builds=true
code review:
http://ocltc.amd.com/reviews/r/6222/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpusettings.cpp#29 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#171 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#232 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#293 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsasettings.cpp#36 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/context.cpp#33 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/context.hpp#23 edit
EPR #010002 - Change OpenCL version number from 1677 to 1678.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1424 edit
EPR #408506 - Extended the reported global memory size (CL_DEVICE_GLOBAL_FREE_MEMORY_AMD) to include a portion of remote memory for APUs.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#476 edit
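Illustration for EPR #408506 (not code from this change): a minimal sketch of reporting free memory that includes a portion of remote memory on an APU. The entry does not say how large the portion is, so remotePercent is a hypothetical parameter, and reportedFreeKiB is a made-up name.

```cpp
#include <cassert>
#include <cstdint>

// Returns the free memory to report, in KiB.
// Assumption: the "portion" of remote memory is a percentage (0..100)
// supplied by the caller; the real policy is not stated in the entry.
inline std::uint64_t reportedFreeKiB(std::uint64_t localFreeKiB,
                                     std::uint64_t remoteFreeKiB,
                                     unsigned remotePercent,
                                     bool isApu) {
    assert(remotePercent <= 100);
    if (!isApu) {
        return localFreeKiB;  // discrete GPU: local memory only
    }
    return localFreeKiB + (remoteFreeKiB * remotePercent) / 100;
}
```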
EPR #010002 - Change OpenCL version number from 1676 to 1677.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1423 edit
EPR #010002 - Change OpenCL version number from 1675 to 1676.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1422 edit
EPR #010002 - Change OpenCL version number from 1674 to 1675.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1421 edit
EPR #010002 - Change OpenCL version number from 1673 to 1674.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1420 edit
EPR #010002 - Change OpenCL version number from 1672 to 1673.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1419 edit
ECR #304775 - add flag to force CL_FP_DENORM on gpu
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#475 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#216 edit
ECR #304775 - Add a check for NULL dev pointer.
- A subbuffer was created but never used, so its dev memory could be NULL while lastWriter_ had been passed from the parent object on creation.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/platform/memory.cpp#112 edit
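Illustration for the NULL dev pointer check above (not code from this change): a minimal sketch of guarding a lazily created device memory pointer. DeviceMemory, MemoryObject and syncIfAllocated are hypothetical names and do not reflect the layout of the runtime's memory classes.

```cpp
// Hypothetical device-side backing store.
struct DeviceMemory {
    int syncCount = 0;
    void sync() { ++syncCount; }
};

// Hypothetical memory object whose device memory is created lazily,
// so it stays null for an object that was created but never used.
struct MemoryObject {
    DeviceMemory* devMem = nullptr;

    // Returns true if a sync was performed, false if there was no
    // device memory to act on.
    bool syncIfAllocated() {
        if (devMem == nullptr) {
            return false;  // never used on a device: nothing to do
        }
        devMem->sync();
        return true;
    }
};
```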
ECR #304775 - Align the queue size to match the multidispatch scheduler requirements
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#337 edit
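Illustration for the queue size alignment above (not code from this change): a minimal sketch that rounds a size up to a multiple of a granularity. The entry does not state the scheduler's actual requirement, so treating "align" as "round up to a multiple" and the alignUp name are assumptions.

```cpp
#include <cassert>
#include <cstddef>

// Rounds 'value' up to the next multiple of 'granularity'.
// Values that are already a multiple are returned unchanged.
inline std::size_t alignUp(std::size_t value, std::size_t granularity) {
    assert(granularity != 0);
    return ((value + granularity - 1) / granularity) * granularity;
}
```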
EPR #010002 - Change OpenCL version number from 1671 to 1672.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1418 edit
EPR #406328 - Modified the OpenCL runtime so that SVM allocation is done for every SVM-capable device, not just one device. This is part of the changes for SVM multiple-device support.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_svm.cpp#7 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/context.cpp#32 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/context.hpp#22 edit
EPR #397491 - Changed the CPU SVM capability to be available only for OpenCL 2.0, not for 1.2.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpudevice.cpp#267 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpusettings.cpp#28 edit
EPR #010002 - Change OpenCL version number from 1670 to 1671.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1417 edit
EPR #010002 - Change OpenCL version number from 1669 to 1670.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1416 edit
EPR #010002 - Change OpenCL version number from 1668 to 1669.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1415 edit
EPR #010002 - Change OpenCL version number from 1667 to 1668.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1414 edit
EPR #010002 - Change OpenCL version number from 1666 to 1667.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1413 edit
EPR #010002 - Change OpenCL version number from 1665 to 1666.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1412 edit
EPR #407469 - disabled the SVM fine grain buffer support for CZ on mainline
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#472 edit
EPR #010002 - Change OpenCL version number from 1664 to 1665.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1411 edit
EPR #010002 - Change OpenCL version number from 1663 to 1664.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1410 edit
EPR #010002 - Change OpenCL version number from 1662 to 1663.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1409 edit
ECR #333753 - clc2: disallow implicit function declarations (bug 10328)
In addition, the change fixes the following side-effects:
1. Fix a typo in runtime/.../gpuschedcl.cpp, which fails due to the stricter check in Clang.
2. Unconditionally add sub_group builtins for pipes, without checking if the extension is enabled. See bug 10366.
3. Also added a test in ocl_features_clang to check for the sub_group builtins.
Passes smoke, smoke_clang, precheckin.
Additionally passes new tests added in ocl_features.
Reviewed by Brian Sumner.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/clang/lib/Sema/SemaLookup.cpp#9 edit
... //depot/stg/opencl/drivers/opencl/compiler/clc2/wrapper/ClangWrapper.cpp#12 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuschedcl.cpp#33 edit
EPR #010002 - Change OpenCL version number from 1661 to 1662.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1408 edit
EPR #010002 - Change OpenCL version number from 1660 to 1661.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1407 edit
EPR #405824 - On APUs, if the OCL runtime runs out of local memory when allocating cl_mem objects, it uses remote (system) memory. Update maxMemAllocSize_ to include that memory.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#469 edit
EPR #010002 - Change OpenCL version number from 1659 to 1660.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1406 edit