SWDEV-2 - Change OpenCL version number from 1961 to 1962.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1708 edit
[ROCm/clr commit: 9de775b2f3]
SWDEV-82256 - Limit the workaround to Win 7 only, because KMD has fixed the TDR issue on Win 8.1/10
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#336 edit
[ROCm/clr commit: 6b762d400f]
SWDEV-2 - Change OpenCL version number from 1960 to 1961.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1707 edit
[ROCm/clr commit: f3a106f125]
SWDEV-2 - Change OpenCL version number from 1959 to 1960.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1706 edit
[ROCm/clr commit: ae519cdf56]
SWDEV-2 - Change OpenCL version number from 1958 to 1959.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1705 edit
[ROCm/clr commit: 828fe4a2d8]
SWDEV-2 - Change OpenCL version number from 1957 to 1958.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1704 edit
[ROCm/clr commit: 6c19b7d71c]
SWDEV-82205 - Increased workload to pass this test.
- This is a workaround because KMD does not yet have a solution for the TDR issue in 15.30.
- This workaround, including CL#1201765, should be reverted once KMD has a fix.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#335 edit
[ROCm/clr commit: 217ef518c4]
SWDEV-2 - Change OpenCL version number from 1956 to 1957.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1703 edit
[ROCm/clr commit: ca1ab4b444]
SWDEV-2 - Change OpenCL version number from 1955 to 1956.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1702 edit
[ROCm/clr commit: 26c0df5551]
SWDEV-82596 - HSA HLC: Create AMDInline pass
The generic LLVM inlining heuristics do not work well for GPU.
In particular, several tests share a common problem:
if a pointer to a private array is passed into a function, the array is not optimized out, leaving scratch usage.
The pass increases the inline threshold to allow inlining in this case.
It also gives us a place to move at least some AMD inlining customizations out of the common code.
Inline hint threshold is moved in this change.
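The pattern described above can be illustrated with a minimal C++ sketch. The function names are hypothetical and this is plain host code rather than an OpenCL kernel; it only shows the shape of the problem, where the address of a small private array escapes into a callee:

```cpp
// Hypothetical callee: it only sees a pointer, so it cannot know the data
// lives in a small private array of its caller.
static int sumFirst(const int* values, int count) {
  int total = 0;
  for (int i = 0; i < count; ++i) {
    total += values[i];
  }
  return total;
}

// Hypothetical caller: the address of `scratch` escapes into sumFirst().
// Until the call is inlined, the array has to stay in addressable memory;
// after inlining, the optimizer is free to keep the elements in registers.
int kernelBody(int seed) {
  int scratch[4] = {seed, seed + 1, seed + 2, seed + 3};
  return sumFirst(scratch, 4);
}
```

Raising the inline threshold is aimed at making calls like the one to sumFirst() eligible for inlining.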
Performance impact on ocltst sha256, 32 bit, Fiji:
                   AMDIL    HSAIL    Diff      HSAIL+Inliner  Diff      Diff
                            before   to AMDIL                 to HSAIL  to AMDIL
OCLPerfSHA256[ 0]  43.843   40.894   0.93      69.910         1.71      1.59
OCLPerfSHA256[ 1]  53.611   51.083   0.95      80.919         1.58      1.51
OCLPerfSHA256[ 2]  52.127   51.528   0.99      80.640         1.56      1.55
OCLPerfSHA256[ 3]  60.952   57.027   0.94      68.615         1.20      1.13
OCLPerfSHA256[ 4]  76.173   70.150   0.92      80.582         1.15      1.06
OCLPerfSHA256[ 5]  75.886   70.264   0.93      81.000         1.15      1.07
Testing: smoke, precheckin, ocltst sha256
Reviewed by Danill Fukalov
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/opt_level.cpp#28 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/InitializePasses.h#93 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/LinkAllPasses.h#49 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/include/llvm/Transforms/IPO.h#32 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/IPO/AMDInline.cpp#1 add
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/IPO/CMakeLists.txt#24 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/IPO/IPO.cpp#32 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/lib/Transforms/IPO/Inliner.cpp#42 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/tools/opt/amdopt.inc#28 edit
[ROCm/clr commit: 5e3d4f5a01]
SWDEV-2 - Change OpenCL version number from 1954 to 1955.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1701 edit
[ROCm/clr commit: 4cefa6126f]
SWDEV-2 - Change OpenCL version number from 1953 to 1954.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1700 edit
[ROCm/clr commit: cf8e10a104]
SWDEV-82054 - [CQE OCL][QR][LNX] RQ Conformance "Integer_Ops" test is crashing on CPU; Faulty CL#1206023.
In llvm32, llvm::DisablePrettyStackTrace is off by default, so a trap handler is installed by default. It interferes with the trap handler in the runtime, causing unhandled SIGFPE exceptions when executing conformance/integer_ops on certain CPUs.
To fix this, put stack trace dumping under the env var AMD_DUMP_STACK_TRACE and set llvm::DisablePrettyStackTrace=true by default.
An env var is used here because at this stage there is still no ELF binary through which to pass a compiler option to if_aclCompilerInit.
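A minimal sketch of the opt-in gate described above. The helper names are hypothetical, and how the variable's value is interpreted (unset, empty, or "0" meaning off) is an assumption, not taken from the actual change:

```cpp
#include <cstdlib>
#include <string>

// Hypothetical helper: true only when the value explicitly asks for stack
// trace dumping. Treating unset, empty, and "0" as "off" is an assumption.
inline bool stackTraceDumpRequested(const char* value) {
  return value != nullptr && value[0] != '\0' && std::string(value) != "0";
}

// Mirrors the default described above: pretty stack traces stay disabled
// unless AMD_DUMP_STACK_TRACE opts in.
inline bool disablePrettyStackTrace() {
  return !stackTraceDumpRequested(std::getenv("AMD_DUMP_STACK_TRACE"));
}
```

The result of disablePrettyStackTrace() would then be assigned to the LLVM flag during compiler initialization.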
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/legacy-lib/backends/common/v0_8/if_acl.cpp#8 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#86 edit
[ROCm/clr commit: ccc2b4ce79]
SWDEV-2 - Change OpenCL version number from 1952 to 1953.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1699 edit
[ROCm/clr commit: 4035e3b21b]
SWDEV-2 - Change OpenCL version number from 1951 to 1952.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1698 edit
[ROCm/clr commit: 8da0a97e7f]
SWDEV-2 - Change OpenCL version number from 1950 to 1951.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1697 edit
[ROCm/clr commit: 9db300e6d4]
SWDEV-2 - Change OpenCL version number from 1949 to 1950.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1696 edit
[ROCm/clr commit: 719f92981b]
SWDEV-2 - Change OpenCL version number from 1948 to 1949.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1695 edit
[ROCm/clr commit: a3c5a06983]
SWDEV-2 - Change OpenCL version number from 1947 to 1948.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1694 edit
[ROCm/clr commit: 4f570b1585]
SWDEV-2 - Change OpenCL version number from 1946 to 1947.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1693 edit
[ROCm/clr commit: a8257c0b47]
SWDEV-2 - Change OpenCL version number from 1945 to 1946.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1692 edit
[ROCm/clr commit: 2b6f4f6477]
SWDEV-2 - Change OpenCL version number from 1944 to 1945.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1691 edit
[ROCm/clr commit: 08e5cc5695]
SWDEV-2 - Change OpenCL version number from 1943 to 1944.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1690 edit
[ROCm/clr commit: ced3e2df46]
SWDEV-2 - Change OpenCL version number from 1942 to 1943.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1689 edit
[ROCm/clr commit: b70c72025b]
SWDEV-81805 - Fix compiler lib bug: incorrect type name %opencl.pipe_t0 is generated when using clang.
Clang does not return an llvm::Module. It saves the bitcode to a memory buffer and passes it back to the compiler lib, where the bitcode reader is used to get the llvm::Module. Clang and the bitcode reader use the same LLVMContext, which is created earlier in aclCompileInternal. Named struct types are shared between modules in an LLVMContext, so when the bitcode reader loads the module, name collisions happen for named struct types, which causes them to be postfixed with a number, e.g. %opencl.pipe_t => %opencl.pipe_t0.
This causes a failure in the SPIR-V drop-in conformance test.
The fix is to let clang use a separate LLVMContext.
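The collision can be modeled without LLVM. The sketch below uses a toy ToyContext class (hypothetical, not the LLVM API) that owns a set of struct names and postfixes a number on collision, following the %opencl.pipe_t => %opencl.pipe_t0 behavior described above:

```cpp
#include <set>
#include <string>

// Toy stand-in for a context that owns named struct types. This is not the
// LLVM API; it only models "names are unique per context".
class ToyContext {
 public:
  // Registers a named struct and returns the name actually used. On a
  // collision a number is postfixed, as in the behavior described above.
  std::string createNamedStruct(const std::string& name) {
    std::string unique = name;
    unsigned suffix = 0;
    while (!names_.insert(unique).second) {
      unique = name + std::to_string(suffix++);
    }
    return unique;
  }

 private:
  std::set<std::string> names_;
};
```

Registering "opencl.pipe_t" twice in one ToyContext yields "opencl.pipe_t0" the second time, while a second, separate context keeps the original name.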
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/frontend_clang.cpp#26 edit
[ROCm/clr commit: 56ea6c56a1]
SWDEV-2 - Change OpenCL version number from 1941 to 1942.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1688 edit
[ROCm/clr commit: 2972bacfc1]
SWDEV-2 - Change OpenCL version number from 1940 to 1941.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1687 edit
[ROCm/clr commit: 98f48d3a6d]
SWDEV-80874 - fixed out of bound access to the printf format string
We do not really need two separate induction variables, pos and i, and we had a bug of not incrementing i as needed.
The only reason it used to work is that all strings we used for testing ended with '\n'.
The bug resulted in ignoring this '\n', but the code unconditionally adds '\n', so nobody noticed.
If you try to print anything with any other escape, a '\n' not at the end, or a colon, an assertion fires.
That is fixed, and a newline is now only added if the last symbol in the user's format was not a newline, because otherwise
we would now print 2 newlines. NB, I prefer to use a bool variable rather than addressing the last symbol of the string,
which could be empty.
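A minimal sketch of the newline handling described above, assuming a hypothetical finishFormat() helper. It leaves out the escape encoding done by the real code and only shows the single induction variable and the bool flag:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: copies the user's format string with one induction
// variable and appends '\n' only when the format does not already end in one.
std::string finishFormat(const std::string& fmt) {
  std::string out;
  // Tracked in a bool so that an empty format string needs no special case.
  bool endsWithNewline = false;
  for (std::size_t i = 0; i < fmt.size(); ++i) {
    out.push_back(fmt[i]);
    endsWithNewline = (fmt[i] == '\n');
  }
  if (!endsWithNewline) {
    out.push_back('\n');
  }
  return out;
}
```

A format with '\n' in the middle, or an empty one, gets exactly one trailing newline; a format that already ends with '\n' is returned unchanged.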
A side note: why do we run the flex scanner past the last colon? If we did not, we would not need this double encoding at all.
Testing: smoke, precheckin, conformance printf with HSAIL forced, custom test
Reviewed by German Andreev
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#309 edit
[ROCm/clr commit: eea9bc6733]
SWDEV-2 - Change OpenCL version number from 1939 to 1940.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1686 edit
[ROCm/clr commit: 05ef4a4226]
SWDEV-77584 - ORCA RT: Preparations for enabling HSAIL on OpenCL 1.2 by default. Integrate new algorithm for device program choice.
[Reasons]
1. Keep the switching change as small as possible.
2. Give a chance to test HSA_foundation device work on OCL 1.2 beforehand (asked by Nikolay).
Mostly reviewed already:
http://ocltc.amd.com/reviews/r/8850/
Additionally:
1. Linking logic was changed: if the target of one of the binaries is hsail-(64), linking goes through HSAIL; otherwise it goes through AMDIL. Previously, -cl-std=CL2.0 in any of the linked binaries was the criterion for HSAIL, which will be wrong for HSAIL 1.2 after switching. The -clang & -edg options are now set to distinguish the path while linking.
2. -cl-std=CL2.0 was restored as a criterion for HSAIL in the isHSAILProgram() method; the -clang & -edg options were also added as criteria.
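The new linking rule from item 1 can be sketched as follows. The LinkPath enum, the chooseLinkPath() name, and the target strings are assumptions for illustration; the runtime's actual types and target names may differ:

```cpp
#include <string>
#include <vector>

enum class LinkPath { AMDIL, HSAIL };

// Hypothetical helper: linking goes through HSAIL as soon as one input binary
// targets HSAIL, otherwise through AMDIL. Matching on an "hsail" prefix is an
// assumption about how targets are named.
LinkPath chooseLinkPath(const std::vector<std::string>& targets) {
  for (const std::string& target : targets) {
    if (target.rfind("hsail", 0) == 0) {  // true when target starts with "hsail"
      return LinkPath::HSAIL;
    }
  }
  return LinkPath::AMDIL;
}
```

The decision depends only on the binaries' targets, not on whether any of them was built with -cl-std=CL2.0.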
[ToDo] After enabling HSAIL by default remove -cl-std, -clang & -edg checks from the code.
[Testing] Pre-checkin
http://ocltc.amd.com:8111/viewModification.html?modId=61929&personal=true&buildTypeId=&tab=vcsModificationBuilds&show_all_builds=true
[Reviewers] German Andryeyev, Nikolay Haustov
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_program.cpp#39 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpudevice.cpp#279 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpudevice.hpp#93 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#261 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#534 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#154 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsadevice.cpp#47 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsadevice.hpp#22 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/program.cpp#76 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/program.hpp#38 edit
[ROCm/clr commit: 539fef47eb]
SWDEV-79957 - use system memory to calculate the largest available memory size on Linux APU system.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#533 edit
[ROCm/clr commit: 6f0457c510]
SWDEV-2 - Change OpenCL version number from 1938 to 1939.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1685 edit
[ROCm/clr commit: a1146f6e4d]
SWDEV-2 - Change OpenCL version number from 1937 to 1938.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1684 edit
[ROCm/clr commit: b7c9a38645]
SWDEV-2 - Change OpenCL version number from 1936 to 1937.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1683 edit
[ROCm/clr commit: f0aea225b6]
SWDEV-77172 - IOMMUv2 changes for Windows 10
- Clear passing SVM flag from top level and fix GL interop on SVM
- Add/Remove gpuvmOffset before WDDM calls, as it is added manually for the SUA model
ReviewBoardURL = http://ocltc.amd.com/reviews/r/8914/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#230 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDeviceGL.cpp#25 edit
[ROCm/clr commit: b0b6b55051]
SWDEV-80874 - fixed staging buffer overflow with HSA printf
The staging buffer is ~2 times smaller than the allocated printf buffer, so if the amount of data in the printf buffer exceeds the size of the staging buffer,
we hit an assertion in the memory copy. Printing 2 integers with 64K workitems is enough to hit the assertion.
Added a loop to read the printf buffer into staging in portions.
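A minimal sketch of reading in portions, assuming a hypothetical readInPortions() helper that works on plain vectors instead of the runtime's device memory objects:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: moves `src` through a smaller staging buffer in
// portions, appending each portion to `dst`. Returns the number of portions.
std::size_t readInPortions(const std::vector<std::uint32_t>& src,
                           std::size_t stagingSize,
                           std::vector<std::uint32_t>& dst) {
  assert(stagingSize > 0);
  std::vector<std::uint32_t> staging(stagingSize);
  std::size_t portions = 0;
  for (std::size_t offset = 0; offset < src.size(); offset += stagingSize) {
    // Never copy more than the staging buffer can hold.
    const std::size_t count = std::min(stagingSize, src.size() - offset);
    std::copy_n(src.begin() + offset, count, staging.begin());
    dst.insert(dst.end(), staging.begin(), staging.begin() + count);
    ++portions;
  }
  return portions;
}
```

With 10 values and a staging size of 4, the loop makes 3 passes (4, 4, and 2 values), and no single copy exceeds the staging buffer.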
Testing: smoke, precheckin, conformance printf with HSAIL forced, custom tests
Reviewed by German Andreev
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.cpp#41 edit
[ROCm/clr commit: e18cd1d76e]
SWDEV-80874 - Fixed ORCA RT HSA printf buffer indexing issues
The format of the buffer is: printf_id, <arg1>, <arg2>, ...
The RT did not advance the index past the printf_id field, so for example for a format string "%d" we printed printf_id instead of the actual argument for every other string.
The other issue is that outputDbgBuffer adjusts its last argument (idx) by the number of consumed DWORD values,
but PrintfDbgHSA::output() also adjusts dbgBufferPtr, so the adjustment was done twice, printing only half of the actual data and then printing zeroes from the buffer.
The resolution for both is to always pass 1 as the index to outputDbgBuffer(): 1 because 0 is printf_id.
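The record layout above can be sketched with a hypothetical decodeRecord() helper. It is not the runtime's outputDbgBuffer(); it only shows that arguments start at DWORD index 1 because index 0 holds printf_id:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical view of one record: printf_id, <arg1>, <arg2>, ...
struct PrintfRecord {
  std::uint32_t printfId;
  std::vector<std::uint32_t> args;
};

// Splits a record into its id and its arguments. Arguments begin at index 1
// because index 0 is the printf_id field.
PrintfRecord decodeRecord(const std::uint32_t* record, std::size_t dwords) {
  assert(record != nullptr && dwords >= 1);
  PrintfRecord result;
  result.printfId = record[0];
  result.args.assign(record + 1, record + dwords);
  return result;
}
```

For the record {7, 42, 43}, the id is 7 and the arguments are 42 and 43; starting at index 0 instead would print the id as if it were the first argument.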
Testing: smoke, precheckin, conformance printf with HSAIL forced, custom tests
Reviewed by Brian Sumner and German Andreev
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.cpp#40 edit
[ROCm/clr commit: 047f87bb4f]
SWDEV-2 - Change OpenCL version number from 1935 to 1936.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1682 edit
[ROCm/clr commit: 3206a987aa]