EPR #411058 - [CQE OCL][Lnx][QR][CZ]MultiDevice_Context fails in 2.0 conformance wimpyfull due to CL# 1101352
- Detecting the different map types becomes overcomplicated once multiple maps and a multithreaded environment are possible. Thus keep the USWC indirect map optimization based on the allocation flags.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.cpp#114 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.hpp#46 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#342 edit
[ROCm/clr commit: 593d1e3b8d]
EPR #010002 - Change OpenCL version number from 1701 to 1702.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1448 edit
[ROCm/clr commit: c722a0a2da]
EPR #010002 - Change OpenCL version number from 1700 to 1701.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1447 edit
[ROCm/clr commit: 68108a505c]
EPR #405889 - Added an option to set the VGPR/SGPR/LDS usage in the ISA to a given value greater than the actual usage, for debugging purposes. If the given value is smaller than the actual value, the option has no effect.
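A minimal sketch of the clamping rule described above, assuming the option supplies one requested count per resource. ResourceUsage and applyDebugOverride are hypothetical names for illustration, not the identifiers used in scCompileSI.cpp.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical container for the resource counts reported in the ISA.
struct ResourceUsage {
    std::uint32_t vgprs;
    std::uint32_t sgprs;
    std::uint32_t ldsBytes;
};

// Raise each reported count to the requested value, but never below the
// actual usage. A request smaller than the actual value has no effect.
inline ResourceUsage applyDebugOverride(const ResourceUsage& actual,
                                        const ResourceUsage& requested) {
    return ResourceUsage{
        std::max(actual.vgprs, requested.vgprs),
        std::max(actual.sgprs, requested.sgprs),
        std::max(actual.ldsBytes, requested.ldsBytes),
    };
}

// Usage: 24 VGPRs are padded to 64; the SGPR request of 16 is ignored
// because 32 SGPRs are actually in use.
inline void exampleUsage() {
    const ResourceUsage out = applyDebugOverride({24, 32, 0}, {64, 16, 0});
    assert(out.vgprs == 64 && out.sgprs == 32 && out.ldsBytes == 0);
}
```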
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/scwrapper/SI/scCompileSI.cpp#52 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/scwrapper/scHWShaderInfo.h#2 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/OPTIONS.def#121 edit
[ROCm/clr commit: 9f760b7bf0]
EPR #010002 - Change OpenCL version number from 1699 to 1700.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1446 edit
[ROCm/clr commit: ed3642807b]
EPR #010002 - Change OpenCL version number from 1698 to 1699.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1445 edit
[ROCm/clr commit: f4addd58c9]
EPR #410824 - [CQE OCL][CZ][S/G][QR] Two Bolt sample failing on CPU; Faulty CL: 1101352
- The test performs two maps with different map flags. The optimization could choose a different map scheme for each call, and memory coherency could be broken. Add extra conditions to detect multiple maps and use the same path as the first map.
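The "same path as the first map" rule can be sketched as below. This is an illustration only: MapPath and MapState are hypothetical names, and the real conditions in gpumemory.hpp are not shown in this log.

```cpp
#include <cassert>

enum class MapPath { Direct, Indirect };

// Hypothetical per-buffer map bookkeeping.
class MapState {
public:
    // `preferred` is the path the optimization would pick for this call's
    // flags in isolation. Only the first outstanding map gets to decide.
    MapPath acquire(MapPath preferred) {
        if (activeMaps_ == 0) {
            path_ = preferred;
        }
        ++activeMaps_;
        return path_;  // overlapping maps follow the first one
    }

    void release() {
        assert(activeMaps_ > 0);
        --activeMaps_;
    }

private:
    unsigned activeMaps_ = 0;
    MapPath path_ = MapPath::Direct;
};
```

Once every map has been released, the next map call is free to choose its own path again.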
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.hpp#45 edit
[ROCm/clr commit: a1202e54be]
EPR #010002 - Change OpenCL version number from 1697 to 1698.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1444 edit
[ROCm/clr commit: 3444e16d99]
EPR #010002 - Change OpenCL version number from 1696 to 1697.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1443 edit
[ROCm/clr commit: 2d58fc417b]
EPR #410736 - [CQE OCL][ISV][QR][G] FFMPEG app generating corrupted video output; Faulty CL:1101352
- Add detection for AHP allocation.
FFmpeg uses AHP allocations with the CL_MAP_READ flag, but actually performs CPU writes into the buffer. With an indirect map, the runtime executes a useless transfer on map and doesn't write the updated memory back on unmap, because the app sent the wrong flag.
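A toy model of the failure mode, assuming an indirect map works through a staging copy that is written back only when the write flag is set. ToyBuffer and its members are hypothetical and only mimic the behavior described above; they are not the runtime's types.

```cpp
#include <cassert>
#include <vector>

enum MapFlags : unsigned { kMapRead = 1u << 0, kMapWrite = 1u << 1 };

struct ToyBuffer {
    std::vector<int> device;   // stands in for the device-visible memory
    std::vector<int> staging;  // temporary host copy used by an indirect map
    unsigned mappedFlags = 0;

    // Indirect map: transfer into staging and hand out the staging pointer.
    int* mapIndirect(unsigned flags) {
        mappedFlags = flags;
        staging = device;
        return staging.data();
    }

    // Indirect unmap: write back only if the app asked for write access.
    void unmapIndirect() {
        assert(mappedFlags != 0);
        if (mappedFlags & kMapWrite) {
            device = staging;
        }
        mappedFlags = 0;
    }

    // Direct map: the app works on the real memory, flags do not matter.
    int* mapDirect(unsigned /*flags*/) { return device.data(); }
};
```

In this model a CPU write made under kMapRead is lost on the indirect path and survives on the direct path, which matches the corrupted output reported for the app.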
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.cpp#113 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.hpp#44 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#341 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/perf/TestList.cpp#40 edit
[ROCm/clr commit: f9f5df731e]
EPR #010002 - Change OpenCL version number from 1695 to 1696.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1442 edit
[ROCm/clr commit: 750e1bf9bd]
EPR #010002 - Change OpenCL version number from 1694 to 1695.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1441 edit
[ROCm/clr commit: 6824541acd]
EPR #010002 - Change OpenCL version number from 1693 to 1694.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1440 edit
[ROCm/clr commit: ba8e6fefbe]
EPR #397491 - Replace "switch" with "if" so that new ASIC ids don't need to be added.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#103 edit
[ROCm/clr commit: afe1835f56]
EPR #403782 - IOMMU2/SVM
- For finegrainsystem, the app can pass a malloced pointer directly to the kernel. Copy pointer directly to the aqlArgBuf without exiting.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/6378/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#269 edit
[ROCm/clr commit: 2ba0f2a112]
EPR #010002 - Change OpenCL version number from 1692 to 1693.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1439 edit
[ROCm/clr commit: 7bf07ad054]
EPR #010002 - Change OpenCL version number from 1691 to 1692.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1438 edit
[ROCm/clr commit: 9583bf4f36]
EPR #010002 - Change OpenCL version number from 1690 to 1691.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1437 edit
[ROCm/clr commit: e0f3106f52]
EPR #409950 - [IV][OCL] Multiple OCL samples crashed on multiple machines for 32-bit OS.
There are two issues:
1. The SC dll should be dynamically loaded only when it is available. This allows apps to run on the CPU device without the SC dll. This CL fixes it. It also allows the user to set the env var AMD_OCL_SC_LIB to the name or complete path of the SC dll to load.
2. The test fails because amdhsasc.dll is not included in the base driver for 32-bit OS. The proper solution is to ask the package team to include amdhsasc.dll in the base driver. Also, amdhsasc.dll should be renamed amdoclsc.dll, since it is used not only for HSAIL but also by AMDIL. The benefits of a separate SC component as a shared library are decreased build time, since changes in SC do not require a rebuild of amdocl.dll, and easier debugging and regression analysis, since the SC component can be swapped.
However, since the 15.10 branch is close, there is not enough time to make changes to the package. Therefore this CL implements a workaround for this issue without changes to the package. We will implement the proper fix in the next release.
The workaround implemented by this CL embeds SC statically in amdocl.dll. The runtime loads the SC dll specified by env var AMD_OCL_SC_LIB only if it is available. If the SC dll is not available, it uses the embedded SC.
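The fallback order can be sketched as follows. Only the env var name AMD_OCL_SC_LIB comes from this CL; ScSource, selectShaderCompiler and the tryLoad callback (a stand-in for the platform loader such as dlopen or LoadLibrary) are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstdlib>
#include <functional>
#include <string>

enum class ScSource { External, Embedded };

// `tryLoad` is a hypothetical loader hook that returns true when the named
// library was found and loaded.
inline ScSource selectShaderCompiler(
    const char* envValue,
    const std::function<bool(const std::string&)>& tryLoad) {
    assert(static_cast<bool>(tryLoad));  // a loader hook must be supplied
    if (envValue != nullptr && *envValue != '\0' && tryLoad(envValue)) {
        return ScSource::External;  // use the SC dll named by the env var
    }
    return ScSource::Embedded;  // otherwise fall back to the static SC
}

// Convenience wrapper that reads the env var named in this CL.
inline ScSource selectShaderCompilerFromEnv(
    const std::function<bool(const std::string&)>& tryLoad) {
    return selectShaderCompiler(std::getenv("AMD_OCL_SC_LIB"), tryLoad);
}
```

Passing the loader as a callback keeps the selection logic testable without a real shared library on disk.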
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/build/Makefile.api#96 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/api/v0_8/acl.cpp#22 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/api/v0_8/aclLoaders.cpp#9 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/Makefile#44 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/sclibdefs.opencl#20 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclStructs.h#13 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/include/v0_8/aclTypes.h#4 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/build/Makefile.aoc2#21 edit
... //depot/stg/opencl/drivers/opencl/opencldefs#148 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#485 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#220 edit
[ROCm/clr commit: 16ebf68e43]
EPR #010002 - Change OpenCL version number from 1689 to 1690.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1436 edit
[ROCm/clr commit: 05afab8ccf]
EPR #010002 - Change OpenCL version number from 1688 to 1689.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1435 edit
[ROCm/clr commit: 48dbc6d01e]
EPR #010002 - Change OpenCL version number from 1687 to 1688.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1434 edit
[ROCm/clr commit: 6d7aaf21a6]
ECR #333753 - Compiler Lib/aoc2/devloader: move devloader functionality into aoc2
[Purpose] To get rid of obsolete runtimenew dependency in compiler
1. Devloader functionality moved into aoc2;
2. Devloader is removed from the tree & make system;
3. Related changes in test_driver.pl;
4. Functions alignedMalloc & alignedFree are moved to libUtils.h;
5. Function aclHsaLoader is renamed to _aclHsaLoader to indicate that it is not a Compiler Lib API function.
[Testing] make smoke, pre check-in
[Reviewers] Nikolay Haustov, Brian Sumner
Affected files ...
... //depot/stg/opencl/drivers/opencl/Makefile#48 edit
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/amdocl.def.in#10 edit
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/amdocl.map.in#11 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/amdoclcl.def.in#8 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/amdoclcl.map.in#7 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/brig_loader.cpp#15 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/scwrapper/scClientAPI.cpp#20 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.h#13 edit
... //depot/stg/opencl/drivers/opencl/compiler/loader/devloader/Makefile#8 delete
... //depot/stg/opencl/drivers/opencl/compiler/loader/devloader/build/Makefile#3 delete
... //depot/stg/opencl/drivers/opencl/compiler/loader/devloader/build/Makefile.devloader#11 delete
... //depot/stg/opencl/drivers/opencl/compiler/loader/devloader/devloader.cpp#6 delete
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/aoc2.cpp#61 edit
... //depot/stg/opencl/drivers/opencl/compiler/tools/aoc2/build/Makefile.aoc2#20 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#185 edit
... //depot/stg/opencl/drivers/opencl/tests/hsa/bin/test_driver.pl#5 edit
[ROCm/clr commit: 6244599f99]
EPR #409798 - clCompileProgram and clLinkProgram regression for SPIR - set the correct IR type while extracting from binary (aclSPIR, aclLLVMIR) for single SPIR module for CPU.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpuprogram.cpp#62 edit
[ROCm/clr commit: 57a45f9066]
EPR #409840 - [CQE OCL][LNX][QR] OpenCL SPIR Conf test "Compile_and_link" failed in all Asics due to CL#1098110 - Set the IR type to SPIR only for single SPIR modules.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#184 edit
[ROCm/clr commit: f324bf5f80]
EPR #010002 - Change OpenCL version number from 1686 to 1687.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1433 edit
[ROCm/clr commit: 5573f4f56b]
EPR #409798 - clCompileProgram and clLinkProgram regression for SPIR - set the correct IR type while extracting from binary (aclSPIR, aclLLVMIR)
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#183 edit
[ROCm/clr commit: d604c03916]
EPR #010002 - Change OpenCL version number from 1685 to 1686.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1432 edit
[ROCm/clr commit: 1b0c2439f2]
EPR #010002 - Change OpenCL version number from 1684 to 1685.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1431 edit
[ROCm/clr commit: fca0dee30a]
ECR #304775 - Optimize oclBandwidthTest from nVidia SDK
- Cache pinned memory, since the benchmark sends the same transfer in a single batch, so we can avoid pin/unpin.
- Swap the SDMA engine allocation order. The blit manager allocates a queue on the device, so the first app queue was getting the second (paging) SDMA engine.
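A rough sketch of the pinned-memory caching idea, assuming pins are keyed by host address and size. PinnedCache is a hypothetical class; the real change lives in gpublit.cpp and gpuvirtual.cpp and may use a different policy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <set>
#include <utility>

class PinnedCache {
    using Key = std::pair<std::uintptr_t, std::size_t>;

public:
    // Returns true when a new pin was needed, false when a cached pin for
    // the same host range was reused.
    bool acquire(const void* host, std::size_t size) {
        assert(host != nullptr && size > 0);
        const Key key{reinterpret_cast<std::uintptr_t>(host), size};
        const bool inserted = pinned_.insert(key).second;
        if (inserted) {
            ++pinCount_;
        }
        return inserted;
    }

    // Drop all cached pins, for example when the batch is done.
    void flush() { pinned_.clear(); }

    unsigned pinCount() const { return pinCount_; }

private:
    std::set<Key> pinned_;
    unsigned pinCount_ = 0;
};
```

When the same host range is transferred repeatedly within a batch, only the first transfer pays for the pin.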
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#112 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.hpp#37 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#339 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.hpp#121 edit
[ROCm/clr commit: dc8a3205ce]
EPR #010002 - Change OpenCL version number from 1683 to 1684.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1430 edit
[ROCm/clr commit: ab2a9ee5fc]
EPR #010002 - Change OpenCL version number from 1682 to 1683.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1429 edit
[ROCm/clr commit: 3f1af9d6c4]
EPR #010002 - Change OpenCL version number from 1681 to 1682.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1428 edit
[ROCm/clr commit: 5658b6a1b4]
ECR #333755 - Part 2- Update to foundation spec 1.0 20141019:
- hsa_dispatch_packet_t now becomes hsa_kernel_dispatch_packet_t
- All bit masks in structs are removed and replaced by enums that indicate the bit position and width.
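The enum-based style can be illustrated as below. The field names, positions and widths here are hypothetical and chosen only to show the pattern; they are not copied from the foundation spec headers.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical layout of a 16-bit packet header: one enum gives each
// field's bit position, a second enum gives its width.
enum HeaderField : unsigned { kHeaderType = 0, kHeaderBarrier = 8 };
enum HeaderFieldWidth : unsigned { kHeaderWidthType = 8, kHeaderWidthBarrier = 1 };

// Store `value` into the field at [pos, pos + width) of `word`.
inline std::uint16_t setField(std::uint16_t word, unsigned pos,
                              unsigned width, unsigned value) {
    assert(width > 0 && pos + width <= 16);
    const unsigned mask = ((1u << width) - 1u) << pos;
    return static_cast<std::uint16_t>((word & ~mask) | ((value << pos) & mask));
}

// Read the field at [pos, pos + width) of `word`.
inline unsigned getField(std::uint16_t word, unsigned pos, unsigned width) {
    assert(width > 0 && pos + width <= 16);
    return (word >> pos) & ((1u << width) - 1u);
}
```

Compared with C bit-fields, the layout is explicit in the enums and does not depend on how a compiler packs the struct.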
Test: TC precheckin
Review: Hari, Fan, Shucai, German, Yunjun.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#268 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#103 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusched.hpp#15 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#338 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsavirtual.cpp#25 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsavirtual.hpp#12 edit
[ROCm/clr commit: c7988f7209]
EPR #010002 - Change OpenCL version number from 1680 to 1681.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1427 edit
[ROCm/clr commit: 381e955dbf]
ECR #333753 - hsa_foundation RT/Compiler Lib: recompilation algorithm rework
1. Recompilation algorithm rework in order to avoid superfluous recompilations.
2. Replace aclExtractSymbol/Section with aclQueryInfo for symbol/section detection.
The calls replaced in RT previously extracted the sections from the BIF, with memory allocation and copying. All that is actually needed is to determine whether a section exists in the BIF, in order to decide which recompilations are required. With aclQueryInfo and the newly added enums RT_CONTAINS_LLVMIR, RT_CONTAINS_OPTIONS, RT_CONTAINS_BRIG, RT_CONTAINS_HSAIL and RT_CONTAINS_ISA, the runtime no longer queries whole sections; it queries a bool flag that indicates whether the corresponding section(s) exist, without any memory allocation. Every compilation in RT that starts from LLVMIR is affected by the change, including compilation of blit kernels.
3. Fix in Compiler Lib for correct ACL_INVALID_ARG detection (for wrong/unsupported compilations).
[Side Effects] performance improvement, memory consumption reduction
[ToDo] Do not finalize program if ISA is already provided in BIF and options are unchanged.
[Testing] pre check-in, ocltst complib, ocl conformance 2.0 compiler & api
[Reviewers] German Andryeyev, Brian Sumner
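The difference between extraction and an existence query (item 2 above) can be sketched with a toy container. ToyBinary, extractSection, containsSection and the section names are hypothetical stand-ins; they do not reproduce the aclExtractSection or aclQueryInfo signatures.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy binary: section name mapped to section bytes.
struct ToyBinary {
    std::map<std::string, std::vector<char>> sections;
};

// Old style: return a copy of the section; an empty result means "absent".
// Every call allocates and copies even if the caller only wants a yes/no.
inline std::vector<char> extractSection(const ToyBinary& bin,
                                        const std::string& name) {
    assert(!name.empty());
    const auto it = bin.sections.find(name);
    if (it == bin.sections.end()) {
        return std::vector<char>();
    }
    return it->second;
}

// New style: answer the existence question directly, with no section copy.
inline bool containsSection(const ToyBinary& bin, const std::string& name) {
    assert(!name.empty());
    return bin.sections.count(name) != 0;
}
```

A recompilation decision that only needs to know which sections are present can be driven entirely by containsSection-style queries.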
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/v0_8/if_acl.cpp#56 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsaprogram.cpp#8 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsaprogram.hpp#3 edit
[ROCm/clr commit: 9c4a22118e]
EPR #010002 - Change OpenCL version number from 1679 to 1680.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1426 edit
[ROCm/clr commit: f858ed6336]
EPR #408459 - Changed the implementation of svmAlloc so that the first device creates the amd::Memory object and the remaining devices only add GPU memory to it. This is part of the changes for mgpu support in svmalloc.
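The create-once, attach-per-device flow can be sketched as below. ToyMemory and deviceSvmAlloc are hypothetical; they only mirror the shape of the change and are not the amd::Memory or device interfaces.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Stand-in for the shared memory object owned by the context.
struct ToyMemory {
    std::size_t size;
    std::vector<int> deviceBackings;  // ids of devices that attached memory
};

// Called once per device in the context. The first device creates the
// shared object; every later device only attaches its own memory to it.
inline void deviceSvmAlloc(std::unique_ptr<ToyMemory>& shared, int deviceId,
                           std::size_t size) {
    if (!shared) {
        shared.reset(new ToyMemory{size, {}});
    }
    assert(shared->size == size);  // all devices back the same allocation
    shared->deviceBackings.push_back(deviceId);
}
```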
code review:
http://ocltc.amd.com/reviews/r/6245/
precheckin testing results:
http://ocltc.amd.com:8111/viewModification.html?modId=43136&personal=true&buildTypeId=&tab=vcsModificationTests
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpudevice.hpp#88 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#233 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#479 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#133 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsadevice.cpp#87 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsadevice.hpp#44 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsadevice.cpp#19 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsadevice.hpp#6 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/context.cpp#34 edit
[ROCm/clr commit: efbedb25be]
EPR #408185 - Use pinned memory if directaccess is true and remoteAlloc is used.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#478 edit
[ROCm/clr commit: c24b46e708]