SWDEV-133851 - [CQE OCL][1.2][LNX-PRO] A subtest from OCLcompiler is failing due to faulty cl#1458879
- If pinning failed and allocation was forced to system memory, then copy the original data
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#577 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#64 edit
[ROCm/clr commit: af98be0351]
SWDEV-2 - Change OpenCL version number from 2507 to 2508.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2255 edit
[ROCm/clr commit: 2276473e9d]
SWDEV-2 - Change OpenCL version number from 2506 to 2507.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2254 edit
[ROCm/clr commit: 75653253bb]
SWDEV-2 - Change OpenCL version number from 2505 to 2506.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2253 edit
[ROCm/clr commit: c7ea67c7c6]
SWDEV-95919 - Expose counters by instances and not by number of counters. Also expose EA and RMI instances.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/13502/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palcounters.cpp#15 edit
... //depot/stg/opencl/drivers/opencl/tests/ocltst/module/runtime/OCLPerfCounters.cpp#41 edit
[ROCm/clr commit: db63366bcb]
SWDEV-2 - Change OpenCL version number from 2504 to 2505.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2252 edit
[ROCm/clr commit: 4460abcd9c]
SWDEV-132899 - [OCL][GFX10] Add support for GFX10
Adjusting WaveFrontSize for Null Devices based on the gfxip (the WaveFrontSize is 32 for gfxip10)
ReviewBoardURL = http://ocltc.amd.com/reviews/r/13486/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#63 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.cpp#37 edit
[ROCm/clr commit: cb4585939d]
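The WaveFrontSize adjustment in SWDEV-132899 can be pictured with a minimal sketch. The helper name `wavefrontSizeForGfxip` is hypothetical and is not the runtime's actual code. The value 32 for gfxip10 comes from the description above. The value 64 for earlier generations, and treating anything newer than gfxip10 like gfxip10, are assumptions made for illustration.

```cpp
#include <cstdint>

// Hypothetical helper (not taken from paldevice.cpp): picks a wavefront
// size for a null device from the major gfxip version.
// From the change description: gfxip10 uses 32.
// Assumption: older gfxip levels use 64, and levels above 10 follow gfxip10.
constexpr std::uint32_t wavefrontSizeForGfxip(std::uint32_t gfxipMajor) {
  return (gfxipMajor >= 10) ? 32u : 64u;
}
```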
SWDEV-2 - Change OpenCL version number from 2503 to 2504.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2251 edit
[ROCm/clr commit: b236524590]
SWDEV-2 - Change OpenCL version number from 2502 to 2503.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2250 edit
[ROCm/clr commit: c7dbd490b0]
SWDEV-86035 - Switch back to 8 CBs due to HW hangs with HWSC on VI.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#59 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.hpp#32 edit
[ROCm/clr commit: e53571d05b]
SWDEV-118564 - [OCL-LC-ROCm] Remove options, which have already been moved to AMDGPUToolChain by https://reviews.llvm.org/rL312524
In order to have a similar set of options for online and offline compilation, a mechanism for setting default options in AMDGPUToolChain was implemented by https://reviews.llvm.org/rL312524. That commit also sets two default options in AMDGPUToolChain: -m64 and -O3 (the latter is set only if there is no -O{N} option in the args). The commit has already reached amd-common.
The current change relates to LC only and removes setting of -m64 from compileImpl_LC() as it is set later in TranslateArgs(); for online -O{N} is set as before by RT and stays unchanged in AMDGPUToolChain; for offline it is set to -O3 by TranslateArgs() if no -O{N} is passed through args.
Also remove comments regarding "-x cl" as it is now correctly set in OpenCL driver.
Review: http://ocltc.amd.com/reviews/r/13454/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palcompiler.cpp#18 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/roccompiler.cpp#36 edit
[ROCm/clr commit: a348a08391]
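The offline default described in SWDEV-118564 (add -O3 only when no -O{N} option was passed) can be sketched as below. This is an illustration under stated assumptions, not the code of TranslateArgs() or compileImpl_LC(). The names `isOptLevelFlag` and `withDefaultOptLevel` are hypothetical, and the options are modelled as plain strings rather than the toolchain's own argument types.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical check: treats any argument starting with "-O" as an
// optimization-level option (-O0, -O3, -Os, ...).
bool isOptLevelFlag(const std::string& arg) {
  return arg.size() >= 2 && arg[0] == '-' && arg[1] == 'O';
}

// Returns the argument list unchanged if it already carries an -O{N}
// option; otherwise appends the default "-O3".
std::vector<std::string> withDefaultOptLevel(std::vector<std::string> args) {
  const bool hasOptLevel =
      std::any_of(args.begin(), args.end(), isOptLevelFlag);
  if (!hasOptLevel) {
    args.push_back("-O3");
  }
  return args;
}
```

With this shape an explicit level from the caller always wins, which matches the online path described above, where the runtime keeps setting -O{N} itself.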
SWDEV-2 - Change OpenCL version number from 2501 to 2502.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2249 edit
[ROCm/clr commit: 5b85bc981c]
SWDEV-2 - Change OpenCL version number from 2500 to 2501.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2248 edit
[ROCm/clr commit: 61de12fb47]
SWDEV-120036 - Supporting the cl_amd_device_attribute_query on the ROC device - Back out changelist 1459984
- not all device attributes are supported, will re-submit the changes when every attribute is supported.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#64 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocsettings.cpp#26 edit
[ROCm/clr commit: fb18f128c9]
SWDEV-111439 - Add query for preferred constant size
- fixed a mistake of using 64KiB for the size, which should be 16KiB.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#576 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#62 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#63 edit
[ROCm/clr commit: a851cc152c]
SWDEV-2 - Change OpenCL version number from 2499 to 2500.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2247 edit
[ROCm/clr commit: f8bc731619]
SWDEV-2 - Change OpenCL version number from 2498 to 2499.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2246 edit
[ROCm/clr commit: 040f46d584]
SWDEV-129129 - [[CQE OCL][Vega vs Fiji] Upto 12% Performance drop observed on VEGA10 compared to FIJI while running BlackMagic Davinci Resolve
More benchmark tuning:
- Keep system memory locked in the resource cache. That removes a huge number of lock/unlock calls to the OS caused by resource creation and destruction
- Reduce the command buffer size to 256 commands and increase the number of CBs to 16
- Increase the amount of resident resources to 2048
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#574 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palmemory.cpp#17 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.cpp#37 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#58 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.hpp#31 edit
[ROCm/clr commit: 4066449a8b]
SWDEV-2 - Change OpenCL version number from 2497 to 2498.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2245 edit
[ROCm/clr commit: 44b7cfefaf]
SWDEV-132238 - [CQE OCL][Vega10][DTB-Blocker][QR] 'Allocation (Single)' test of WF Conformance is failing; Faulty CL# 1451444
- Disable reporting extra HBCC memory by default. Reporting extra memory can be reenabled with GPU_ADD_HBCC_SIZE=1
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#60 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#280 edit
[ROCm/clr commit: 068bf554fb]
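The opt-in behaviour from SWDEV-132238 can be sketched as follows. The runtime reads GPU_ADD_HBCC_SIZE through its own mechanism in flags.hpp; the helpers here (`flagEnabled`, `reportedGlobalMemSize`) are hypothetical and only illustrate the idea that extra HBCC memory is reported when the flag is set to a non-zero value and omitted otherwise.

```cpp
#include <cstdint>
#include <cstdlib>

// Hypothetical parser: an unset variable or a value of 0 means "disabled".
bool flagEnabled(const char* value) {
  return value != nullptr && std::atoi(value) != 0;
}

// Hypothetical size computation: the extra HBCC segment is added to the
// reported size only when the flag is enabled.
std::uint64_t reportedGlobalMemSize(std::uint64_t localSize,
                                    std::uint64_t extraHbccSize,
                                    bool addHbccSize) {
  return addHbccSize ? localSize + extraHbccSize : localSize;
}
```

A caller could combine the two as `reportedGlobalMemSize(local, extra, flagEnabled(std::getenv("GPU_ADD_HBCC_SIZE")))`.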
SWDEV-2 - Change OpenCL version number from 2496 to 2497.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2244 edit
[ROCm/clr commit: d0cd65755a]
SWDEV-130808 - Set the local sizes to preferredWorkGroupSize_ when no local work size is given to clEnqueueNDRangeKernel and the kernel does not have a required workgroup size.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#320 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#411 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.cpp#36 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#56 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rockernel.cpp#26 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#43 edit
[ROCm/clr commit: 8aef16e13c]
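The selection order implied by SWDEV-130808 can be sketched for a single dimension. This is a simplification and not the runtime's code: `chooseLocalSize` is a hypothetical helper, a null pointer stands in for a NULL local_work_size argument, and a required size of 0 is assumed to mean that the kernel declares no required workgroup size. Divisibility of the global size by the chosen local size is not handled here.

```cpp
#include <cstddef>

// Hypothetical helper: picks the local size for one dimension.
//   userLocal     - local size from the enqueue call, or nullptr if omitted
//   requiredSize  - kernel's required workgroup size, 0 if none (assumption)
//   preferredSize - the device's preferred workgroup size
std::size_t chooseLocalSize(const std::size_t* userLocal,
                            std::size_t requiredSize,
                            std::size_t preferredSize) {
  if (userLocal != nullptr) {
    return *userLocal;     // the application's explicit choice wins
  }
  if (requiredSize != 0) {
    return requiredSize;   // the kernel's required size comes next
  }
  return preferredSize;    // otherwise fall back to the preferred size
}
```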
SWDEV-130722 - Channel order in an interop buffer from OpenCL to OpenGL is flipped on Vega
Follow-up for CL#1456230. Add a new table that maps the OGL surface formats (hData.format) returned by the wglResourceAttachAMD function to the OCL image format. The hData.format is the internal image surface format created for an interop by OGL and should be used by OCL for cl_gl interop.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/13421/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.hpp#20 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevicegl.cpp#6 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.cpp#36 edit
[ROCm/clr commit: 23f12d5ea4]
SWDEV-122517 - DVR toolbar and timer are corrupted when recording in fullscreen with portrait oriented monitors using Eyefinity.
Fixed by obtaining the rotation information from the OGL driver and setting the displayable attribute accordingly. (For OCL RT changes)
- fix the type casting issue that causes build error
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDeviceGL.cpp#31 edit
[ROCm/clr commit: a88ad35556]
SWDEV-122517 - DVR toolbar and timer are corrupted when recording in fullscreen with portrait oriented monitors using Eyefinity.
Fixed by obtaining the rotation information from the OGL driver and setting the displayable attribute accordingly. (For OCL RT changes)
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDeviceGL.cpp#30 edit
[ROCm/clr commit: 3373a1ef2f]
SWDEV-2 - Change OpenCL version number from 2495 to 2496.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2243 edit
[ROCm/clr commit: 1ff1a9a9c7]
SWDEV-2 - Change OpenCL version number from 2494 to 2495.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2242 edit
[ROCm/clr commit: 5f7750e9b1]
SWDEV-129129 - [[CQE OCL][Vega vs Fiji] Upto 12% Performance drop observed on VEGA10 compared to FIJI while running BlackMagic Davinci Resolve
- Force tiny read_only buffers into USWC memory. That will avoid expensive tiny data uploads, which occur every frame.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#59 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#278 edit
[ROCm/clr commit: caa25fc792]
SWDEV-130808 - Add support of two new queries: CL_DEVICE_PREFERRED_WORK_GROUP_SIZE_AMD, CL_DEVICE_MAX_WORK_GROUP_SIZE_AMD.
- Initialize the "preferredWorkGroupSize_" for CPU device so that CL_MAX_WORK_GROUP_SIZE correctly reports CPU_MAX_WORKGROUP_SIZE.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpudevice.cpp#281 edit
[ROCm/clr commit: 00e913da6d]
SWDEV-2 - Change OpenCL version number from 2493 to 2494.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2241 edit
[ROCm/clr commit: cd4c8a168b]
SWDEV-130305 - For Vega CF configuration on specific chipset (AMD Ryzen 7 1800X) slave ASIC comes out of BACO when ReLive is enabled
- Finalize() in PAL shouldn't be called during enumeration. This creates a paging queue in WDDM, which causes the second GPU to come out of BACO. Move Finalize to initializeHeapResources.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/13410/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#56 edit
[ROCm/clr commit: 881613438a]
SWDEV-130722 - Channel order in an interop buffer from OpenCL to OpenGL is flipped on Vega
OCL calls the glGetTexLevelParameteriv_ function to get the internal GL format, but this format is the one chosen by the app in an OGL API call such as glTexImage2D.
The issue is that OGL sometimes selects a different format than the one requested in glTexImage2D, and this causes problems in cl_gl interop. One example is shown below:
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGBA/**internal format**/, width, height, 0, GL_BGRA/**external format**/, GL_UNSIGNED_BYTE, NULL);
In this case GL_RGBA is selected by the app as the internal format, but OGL switches to BGRA8 internally. This causes an issue later in cl_gl interop (i.e., the R and B channels are swapped) because OCL gets GL_RGBA as the internal format from the glGetTexLevelParameteriv_ call.
To avoid this issue, OCL needs to query the real internal GL format in wglResourceAttachAMD and adjust the CL format accordingly.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/13408/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.hpp#19 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevicegl.cpp#5 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.cpp#35 edit
[ROCm/clr commit: e8395888c5]
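The format adjustment in SWDEV-130722 amounts to a lookup keyed on the format OGL really allocated rather than the one the app requested. The sketch below uses hypothetical enums (`GlSurfaceFormat`, `ClChannelOrder`) and a hypothetical two-entry table; the actual surface format values returned by wglResourceAttachAMD and the runtime's real table are not shown in this log and are not reproduced here.

```cpp
// Hypothetical stand-ins for the driver's surface formats and the CL
// channel orders; the real enumerations are larger.
enum class GlSurfaceFormat { RGBA8, BGRA8, Other };
enum class ClChannelOrder { RGBA, BGRA };

struct FormatMapping {
  GlSurfaceFormat gl;
  ClChannelOrder cl;
};

// Hypothetical mapping table from the real internal GL format to the
// CL channel order that matches its memory layout.
const FormatMapping kFormatMap[] = {
    {GlSurfaceFormat::RGBA8, ClChannelOrder::RGBA},
    {GlSurfaceFormat::BGRA8, ClChannelOrder::BGRA},
};

// Looks up the CL channel order for the real internal format.
// Returns false (leaving *out untouched) when the format is not in the table.
bool mapChannelOrder(GlSurfaceFormat realFormat, ClChannelOrder* out) {
  for (const FormatMapping& m : kFormatMap) {
    if (m.gl == realFormat) {
      *out = m.cl;
      return true;
    }
  }
  return false;
}
```

In the glTexImage2D example above, the lookup would be made with the BGRA8 format that OGL switched to, so the CL image would get a BGRA channel order and the R and B channels would no longer be swapped.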
SWDEV-2 - Change OpenCL version number from 2492 to 2493.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2240 edit
[ROCm/clr commit: a90c0025a4]
SWDEV-131497 - [CQE OCL][Vega10][OclTst][QR][DTB-Blocker] 'Spir' test of OCLTST is crashing randomly 3/10 times; Faulty CL# 1451293
- The test doesn't release command queues, which may cause a crash on the device destruction. Force the app's queue destruction if the app didn't release them.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#54 edit
[ROCm/clr commit: d3d97c5010]
SWDEV-2 - Change OpenCL version number from 2491 to 2492.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#2239 edit
[ROCm/clr commit: b71ee777ff]
SWDEV-79278 - [OCL][PAL] refactoring PAL Null device create function to account for creating all the gfx9+ subtarget devices such as gfx901/gfx902/etc
ReviewboardURL = http://ocltc.amd.com/reviews/r/13378/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldefs.hpp#21 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#53 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.hpp#18 edit
[ROCm/clr commit: 9772217dcb]
SWDEV-131493 - [CQE OCL][Vega10][QR][DTB-Blocker] Soft Hang is observed while running 'Mipmaps-clCopyImage' tests of WF Conformance due to Faulty CL# 1451293
Multiple runtime locks could conflict with each other:
- Remove PAL lock from the resource creation/destruction. PAL should be thread safe for those operations.
- Avoid queue execution lock for a mipmap view destruction in submitUnmapMemory
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.cpp#34 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#55 edit
[ROCm/clr commit: 6b103f1bf6]
SWDEV-79278 - [OCL] Don't add gfx9+ devices into the offline devices list in the orca path as they will be added in pal.
ReviewboardURL = http://ocltc.amd.com/reviews/r/13396/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#569 edit
[ROCm/clr commit: 121ffcc6ec]