EPR #010002 - Change OpenCL version number from 1636 to 1637.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1383 edit
[ROCm/clr commit: 0ae51a1467]
EPR #402000 - [CQE OCL][Perf][QR] ~6-7% perf drop in CompuCL Benchmark (Graphics: T-Rex subtest).
Add an option to disable SC merging of memory loads and stores. By default it is disabled. Whether to enable it by default will be decided after performance runs.
Cherry-pick CL#1076590 and CL#1077419 from SC stg to add the option in SC.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/sc/Interface/SCCommon.h#42 integrate
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/sc/Src/CompilerBase.cpp#51 integrate
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/sc/Src/CompilerBase.hpp#35 integrate
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/sc/Src/HwUtils.cpp#36 integrate
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/scwrapper/scState.cpp#32 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/OPTIONS.def#114 edit
[ROCm/clr commit: a49ebf6f6e]
ECR #333753 - HSA RT: avoiding superfluous recompilations on ORCA RT/HSA path
Determine the next compilation stage from the binary's sections and options (during linkImpl).
If the current HSAILProgram options are equal to the binary's:
- Do not generate BRIG if BRIG sections are already present in the binary.
- Do not finalize BRIG->ISA if ISA is already present in the binary.
- Perform only the CG phase if HSAIL is absent from the binary.
Always perform only BRIG loading (even if ISA is present).
Testing: pre check-in, compile & basic ocl conformance 2.0 tests
Reviewer: German Andryeyev
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpucompiler.cpp#150 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#264 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#101 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#177 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.hpp#55 edit
[ROCm/clr commit: ff7ab4a0b2]
EPR #010002 - Change OpenCL version number from 1635 to 1636.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1382 edit
[ROCm/clr commit: 44b425a7a5]
ECR #304775 - Update resource cache behavior
Currently, the resource cache is fixed at 64MB regardless of available video memory size. Changed the logic to use max(1/8th of video memory, 64MB). This can still be overridden with the environment variable GPU_RESOURCE_CACHE_SIZE.
Improvements with this change: 18% decrease in video chat face detect time on 95W Kaveri (no change in PCMark8 score, as we already achieved 30fps), 14% improvement on 19W Kaveri (this does result in an improvement in PCMark8, since the APU is slower).
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#279 edit
[ROCm/clr commit: fc2687df3a]
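The sizing rule from the resource cache change above can be sketched as follows. This is a minimal illustration, not the code in gpusettings.cpp: the function names are hypothetical, and the sketch assumes that GPU_RESOURCE_CACHE_SIZE is given in megabytes, which the description does not state.

```cpp
#include <algorithm>
#include <cstdint>
#include <cstdlib>
#include <optional>

constexpr std::uint64_t kMiB = 1024ull * 1024ull;
constexpr std::uint64_t kMinCacheBytes = 64 * kMiB;

// Default rule from the change description: max(video memory / 8, 64MB).
std::uint64_t defaultResourceCacheBytes(std::uint64_t videoMemBytes) {
    return std::max(videoMemBytes / 8, kMinCacheBytes);
}

// An explicit override wins over the default rule.
// Assumption: the override value is expressed in megabytes.
std::uint64_t resourceCacheBytes(std::uint64_t videoMemBytes,
                                 std::optional<std::uint64_t> overrideMiB) {
    if (overrideMiB) {
        return *overrideMiB * kMiB;
    }
    return defaultResourceCacheBytes(videoMemBytes);
}

// Hypothetical helper: reads the override from the environment, if present.
std::optional<std::uint64_t> readCacheOverrideFromEnv() {
    const char* text = std::getenv("GPU_RESOURCE_CACHE_SIZE");
    if (text == nullptr || *text == '\0') {
        return std::nullopt;
    }
    char* end = nullptr;
    const unsigned long long value = std::strtoull(text, &end, 10);
    if (end == text) {
        return std::nullopt;  // not a number; fall back to the default rule
    }
    return value;
}
```

Under these assumptions a device with 256MB of video memory keeps the 64MB floor, while a device with 2GB gets a 256MB cache unless the override is set.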
EPR #010002 - Change OpenCL version number from 1634 to 1635.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1381 edit
[ROCm/clr commit: fe0dedc497]
EPR #010002 - Change OpenCL version number from 1633 to 1634.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1380 edit
[ROCm/clr commit: 585664f084]
ECR #377625 - AMDIL Function support: Calculate total private memory usage by a kernel including memory used by called functions.
This cannot be done by IPA since stack size is known only after register allocation due to potential register spill, but MachineFunctionAnalysis cannot persist after CGSCC pass with current LLVM version.
This change adds private memory usage metadata for non-kernel functions. The total private memory usage by a kernel is calculated when AMDIL is split for different kernels. BIF will contain total private memory size.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/amdilUtils.cpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/amdilUtils.hpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Target/AMDIL/AMDILKernelManager.cpp#451 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Target/AMDIL/AMDILKernelManager.h#51 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#175 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.hpp#54 edit
[ROCm/clr commit: 42f4b2af97]
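One way to picture the per-kernel total described in the AMDIL function support change above is a walk over the call graph that adds up the per-function private memory metadata. The change does not say how the per-function sizes are combined, so this sketch assumes a conservative sum over every function reachable from the kernel, counting each function once. All type and function names are hypothetical.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for the per-function metadata emitted by the backend.
struct FunctionInfo {
    std::uint32_t privateBytes;        // private memory used by this function alone
    std::vector<std::string> callees;  // functions it calls directly
};

using CallGraph = std::map<std::string, FunctionInfo>;

// Sums privateBytes over the kernel and every function reachable from it.
// Assumption: usage is additive and each function is counted once, which
// also keeps the walk finite if the graph contains a cycle.
std::uint32_t totalPrivateBytes(const CallGraph& graph, const std::string& kernel) {
    std::set<std::string> visited;
    std::vector<std::string> worklist{kernel};
    std::uint32_t total = 0;
    while (!worklist.empty()) {
        const std::string name = worklist.back();
        worklist.pop_back();
        if (!visited.insert(name).second) {
            continue;  // already counted
        }
        const auto it = graph.find(name);
        if (it == graph.end()) {
            continue;  // no metadata for this function
        }
        total += it->second.privateBytes;
        for (const auto& callee : it->second.callees) {
            worklist.push_back(callee);
        }
    }
    return total;
}
```

A tighter bound would take the deepest call path instead of the sum; which one matches the actual implementation is not stated in the change description.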
ECR #333753 - HSA HLC: remove code changing the default filetype which is set by an external tool such as llc
As a result, llc produces a text HSAIL file by default, which is standard llc behaviour. Use -filetype=obj to obtain BRIG.
Note, test_driver.pl is already patched to preserve old behaviour.
Testing: smoke, smoke_clang, precheckin
Reviewed by Nikolay Haustov
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/codegen.cpp#58 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Target/HSAIL/HSAILTargetMachine.cpp#33 edit
[ROCm/clr commit: fea6100aa9]
EPR #010002 - Change OpenCL version number from 1632 to 1633.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1379 edit
[ROCm/clr commit: 6fddbe6449]
EPR #405753 - Fixed incorrect value of slicePitch returned from clEnqueueMapImage for a 1D array.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.cpp#111 edit
[ROCm/clr commit: edb288692d]
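For context on why the slicePitch value in the fix above matters: per the OpenCL specification, for a 1D image array the slice pitch returned by clEnqueueMapImage is the size in bytes of each 1D image in the mapped region, so callers use it as the stride between array layers. The helper below is a hypothetical illustration of that addressing and is not part of the runtime.

```cpp
#include <cstddef>
#include <cstdint>

// Address of texel `x` in layer `layer` of a mapped 1D image array.
// slicePitch is the byte stride between consecutive layers, as reported
// by clEnqueueMapImage; elementSize is the size of one texel in bytes.
inline const std::uint8_t* texelAddress1DArray(const std::uint8_t* mapped,
                                               std::size_t slicePitch,
                                               std::size_t layer,
                                               std::size_t x,
                                               std::size_t elementSize) {
    return mapped + layer * slicePitch + x * elementSize;
}
```

With a wrong slicePitch, every layer after the first would be read from or written to the wrong offset.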
EPR #010002 - Change OpenCL version number from 1631 to 1632.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1378 edit
[ROCm/clr commit: 93ee6bb034]
EPR #405194 - Change unroll threshold to LLVM default to partially work around Linpack performance problem.
Prior to CL 1058428, which increased the unroll threshold to 200, this was only 100 which is lower than the LLVM default. Linpack's new ISA has increased register usage, but decreasing the unroll threshold to the previous level does not reduce the register count to its previous level. The increased register usage is probably a new SC problem, so this should probably be increased again in the future. There is no change in register usage with 100 vs. 150 on Linpack.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/OPTIONS.def#113 edit
[ROCm/clr commit: d5f7502ca8]
ECR #304775 - Remove _ in hsail_64 triple enum name. It isn't consistent with itself, or most other targets. The string form is already "hsail64", but the target name is sometimes "hsail-64". Does not remove the - in "hsail-64" for the target name since users could be depending on that, although that should also be fixed.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/codegen.cpp#57 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/linker.cpp#110 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/ADT/Triple.h#36 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/AMDLLVMContextHook.h#22 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/MC/MCObjectFileInfo.cpp#14 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Support/Triple.cpp#47 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Target/HSAIL/TargetInfo/HSAILTargetInfo.cpp#4 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/SPIR/AMDSPIRLoader.cpp#82 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/Scalar/AMDLowerAtomics.cpp#13 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/Utils/AMDUtils.cpp#2 edit
[ROCm/clr commit: c02dacedb2]
EPR #403493 - Block index error for CI and VI, OCL code change
Problem description: The OCL implementation requires HSA to use different block index values for CI and VI. However, the same index value was used for the same counter block on both CI and VI, which caused a segmentation fault.
Root cause: The HSA implementation did not account for this difference beforehand.
Solution: Use a different counter block index for CI than for VI.
Functional area: HSA perf counter implementation...
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsacounters.cpp#4 edit
[ROCm/clr commit: f517eefd51]
ECR #333756 - HSA Finalizer: added runtime option to force buffer instructions for global access
This can be used under ORCA RT.
Testing: smoke, smoke_clang, precheckin, clBLAS dgemm
Reviewed by Nikolay Haustov
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/scwrapper/SI/scStateSI.cpp#24 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/OPTIONS.def#112 edit
[ROCm/clr commit: d35be99f01]
EPR #010002 - Change OpenCL version number from 1630 to 1631.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1377 edit
[ROCm/clr commit: 612ed6149b]
EPR #010002 - Change OpenCL version number from 1629 to 1630.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1376 edit
[ROCm/clr commit: 8751996be9]
EPR #400016 - Keep the path of the temp folder if the app is a Windows app
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/os/os_win32.cpp#39 edit
[ROCm/clr commit: 8f5b43ffd1]
EPR #010002 - Change OpenCL version number from 1628 to 1629.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1375 edit
[ROCm/clr commit: c777c3e198]
EPR #405458 - clinfo segfaults when ENABLE_CAL_SHUTDOWN=1.
For the global variables:
std::map <std::string, int> OptionNameMap[2];
std::map <std::string, int> NoneSeparatorOptionMap[2];
std::map <std::string, int> FOptionMap;
std::map <std::string, int> MOptionMap;
We don't need to call the clear() method explicitly, since the std::map destructor will clean things up (valgrind memcheck doesn't report any leak related to these global variables after this change). Besides, on Linux amd::option::teardown() is called after the global variables' destructors have run, so calling clear() there causes a segfault.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/options.cpp#29 edit
[ROCm/clr commit: b9e695d254]
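The pattern behind the clinfo fix above can be sketched like this: namespace-scope std::map objects are destroyed automatically at program exit, so a teardown routine that may run after static destruction should not touch them. The namespace, the option names, and the function signatures here are hypothetical; the real amd::option::teardown() may look different.

```cpp
#include <map>
#include <string>

namespace sketch {

// Namespace-scope containers, similar in spirit to the maps listed in the
// change description. Their destructors run automatically at program exit.
std::map<std::string, int> FOptionMap;
std::map<std::string, int> MOptionMap;

// Populates the maps with placeholder entries (hypothetical option names).
bool init() {
    FOptionMap["example-f-option"] = 0;
    MOptionMap["example-m-option"] = 1;
    return true;
}

// Deliberately leaves the global maps alone. If this runs after static
// destruction, calling clear() on an already destroyed map would be
// undefined behaviour; the destructors release the memory anyway.
bool teardown() {
    return true;
}

}  // namespace sketch
```

The design choice is simply to let the language own the lifetime of the globals instead of duplicating cleanup in a function whose call order relative to static destruction is not guaranteed.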
EPR #010002 - Change OpenCL version number from 1627 to 1628.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1374 edit
[ROCm/clr commit: 97c9f5611c]
EPR #010002 - Change OpenCL version number from 1626 to 1627.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1373 edit
[ROCm/clr commit: 126e8c33e1]
EPR #010002 - Change OpenCL version number from 1625 to 1626.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1372 edit
[ROCm/clr commit: 7bd8bf4f9c]
EPR #398128 - Windows 2015, WDDM2.0, New Residency Model
- Modify MarkUsedInCmdBuf in IOL to make sure that MakeResident is called for OpenCL (Part 2)
ReviewBoardURL = http://ocltc.amd.com/reviews/r/5684/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#99 edit
[ROCm/clr commit: ef1f9267eb]
ECR #304775 - Add batching to the device enqueue for possible asynchronous execution
- Increase the max device queue size to 512KB. This allows conformance tests that enqueue more jobs than the queue size to pass.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#459 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusched.hpp#13 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuschedcl.cpp#28 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#333 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.cpp#65 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.h#39 edit
[ROCm/clr commit: 2738b30287]
EPR #010002 - Change OpenCL version number from 1624 to 1625.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1371 edit
[ROCm/clr commit: b2cfd32629]
EPR #010002 - Change OpenCL version number from 1623 to 1624.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1370 edit
[ROCm/clr commit: 467ec09d69]
EPR #010002 - Change OpenCL version number from 1622 to 1623.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1369 edit
[ROCm/clr commit: 6f7a3b20f8]
EPR #010002 - Change OpenCL version number from 1621 to 1622.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1368 edit
[ROCm/clr commit: 17edecc562]
ECR #333753 - unify online/offline linkers
The code for "FixUpModule" from the online linker is now moved to
a common file under llvm/lib. This replaces the copy present in
llvm/tools/llvm-link, thus unifying the two linkers.
Reviewed by Stanislav Mekhanoshin, Yaxun Liu (Sam)
Passes smoke, smoke_clang and precheckin.
Also passes OpenCL 2.0 conformance tests.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/linker.cpp#109 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/AMDFixupKernelModule.h#1 add
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/AMDUtils.h#1 add
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Linker/AMDFixupKernelModule.cpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/Utils/AMDUtils.cpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/llvm/tools/llvm-link/AMDFixUpModule.cpp#12 delete
... //depot/stg/opencl/drivers/opencl/compiler/llvm/tools/llvm-link/llvm-link.cpp#48 edit
[ROCm/clr commit: 7f55691ebc]
EPR #010002 - Change OpenCL version number from 1619 to 1620.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1366 edit
[ROCm/clr commit: c018bed751]
ECR #304775 - Optimization for rectangular copies (Part 2). Due to the HW restriction of 14 bits for the src and dst pitch, it's advantageous to choose the optimal bpp: the higher the bpp, the larger the byte pitch that can be expressed. This indirectly helps reduce the number of packets for a buffer copy (line by line vs. a single sub_win raw packet).
ReviewBoardURL = http://ocltc.amd.com/reviews/r/5605/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#109 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#191 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.hpp#76 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.cpp#64 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.h#38 edit
[ROCm/clr commit: 5efe63df44]
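The bpp selection idea in the rectangular copy change above can be sketched as picking the widest element size that keeps both pitches representable. This is an illustration under stated assumptions, not the code in gpublit.cpp: it assumes the 14-bit pitch field counts elements with a maximum of 2^14 - 1, that candidate element sizes are 16, 8, 4, 2 and 1 bytes, and that pitches and row width must be multiples of the chosen size. Offsets are ignored for brevity, and all names are hypothetical.

```cpp
#include <cstdint>
#include <initializer_list>

// Assumption: the pitch field holds at most 2^14 - 1 elements.
constexpr std::uint32_t kMaxPitchElements = (1u << 14) - 1;

// Returns the largest element size in bytes for which both pitches and the
// row width are aligned and both pitches fit the 14-bit field, or 0 if no
// candidate works (the caller would then copy line by line).
std::uint32_t chooseCopyElementSize(std::uint64_t srcPitchBytes,
                                    std::uint64_t dstPitchBytes,
                                    std::uint64_t rowBytes) {
    for (std::uint32_t bpp : {16u, 8u, 4u, 2u, 1u}) {
        const bool aligned = srcPitchBytes % bpp == 0 &&
                             dstPitchBytes % bpp == 0 &&
                             rowBytes % bpp == 0;
        const bool fits = srcPitchBytes / bpp <= kMaxPitchElements &&
                          dstPitchBytes / bpp <= kMaxPitchElements;
        if (aligned && fits) {
            return bpp;
        }
    }
    return 0;
}
```

For example, a 32768-byte pitch does not fit the field at 1 byte per element, but it does at 16 bytes per element (2048 elements), which is what lets the whole rectangle go out as a single packet.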
EPR #010002 - Change OpenCL version number from 1618 to 1619.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1365 edit
[ROCm/clr commit: 61fa04cf2b]