SWDEV-183848 - Update OpenCL perfcounter blocks to match the block lists used by GPUPerfAPI (which is in turn used by RCP and CodeXL)
- The PAL block index for the PA_SC block was incorrect in all hw generations (in PAL, GpuBlock::SC has ordinal 4, not 3)
- In GFX9, the MC and SRBM blocks are not supported
- In GFX10, the following changes are made:
-- There are 8 CB instances per SE
-- There are 8 DB instances per SE
-- There are 2 PA_SU instances per SE
-- There are 4 PA_SC instances per SE
-- There are 4 RMI instances per SE
-- There are 2 GL1A instances per SE
-- There are 2 GL1C instances per SE
-- There are 8 GL1CG instances per SE
-- There are 4 global GL2A instances
-- There are 16 global GL2C instances
-- There are 4 global CHC instances
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palcounters.cpp#19 edit
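The GFX10 instance counts listed above can be sketched as a small lookup table. This is only an illustration, not the actual table in palcounters.cpp: the BlockInstances struct, kGfx10Blocks, and totalInstances are hypothetical names, and the sketch assumes the 4-instance per-SE block is PA_SC and the 16-instance global block is GL2C.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical descriptor; the real data layout in palcounters.cpp is not shown here.
struct BlockInstances {
  const char* name;        // hardware block name as used in the list above
  std::uint32_t count;     // instances per shader engine, or total if global
  bool perShaderEngine;    // true: count is per SE; false: count is chip-wide
};

// GFX10 counts copied from the change description (PA_SC and GL2C are assumed names).
static const BlockInstances kGfx10Blocks[] = {
    {"CB", 8, true},    {"DB", 8, true},    {"PA_SU", 2, true},
    {"PA_SC", 4, true}, {"RMI", 4, true},   {"GL1A", 2, true},
    {"GL1C", 2, true},  {"GL1CG", 8, true}, {"GL2A", 4, false},
    {"GL2C", 16, false}, {"CHC", 4, false},
};

// Total instances of a block on a chip with numShaderEngines SEs.
// Returns 0 when the block is not in the table.
inline std::uint32_t totalInstances(const char* name, std::uint32_t numShaderEngines) {
  for (const BlockInstances& b : kGfx10Blocks) {
    if (std::strcmp(b.name, name) == 0) {
      return b.perShaderEngine ? b.count * numShaderEngines : b.count;
    }
  }
  return 0;
}
```

With this sketch, totalInstances("CB", numSE) scales with the shader-engine count, while global blocks such as GL2A return the same value for any numSE.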
SWDEV-132899 - [OCL][GFX10] add support for Navi10_A0 (gfx1010) (new entry from PAL for Navi10 A0 boards)
ReviewBoardURL = http://ocltc.amd.com/reviews/r/17022/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldefs.hpp#50 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#74 edit
SWDEV-168145 - Add ECC target feature to OpenCL runtime
- Hard-code the SRAM ECC target feature for now, since ROCr disables sram-ecc reporting via the ISA until HCC is fixed
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#123 edit
SWDEV-183848 - Updating gfx10BlockIdPal based on the request from CodeXL
ReviewRequestURL = http://ocltc.amd.com/reviews/r/16992/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palcounters.cpp#18 edit
SWDEV-132899 - [OCL][GFX10] add "wavefrontsize64" to the linkOptions if they had previously been added to the compile options
ReviewRequestURL = http://ocltc.amd.com/reviews/r/16966/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#35 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#71 edit
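A minimal sketch of the option propagation described above, assuming options are kept as space-separated strings and that the option is spelled exactly "wavefrontsize64" as quoted in the description. The helper name propagateWave64 is hypothetical and is not taken from devprogram.cpp.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: if the compile options carry "wavefrontsize64",
// make sure the link options carry it too (without adding it twice).
inline void propagateWave64(const std::string& compileOptions, std::string& linkOptions) {
  static const std::string kOpt = "wavefrontsize64";  // spelling assumed from the description
  if (compileOptions.find(kOpt) == std::string::npos) {
    return;  // not requested at compile time, nothing to propagate
  }
  if (linkOptions.find(kOpt) != std::string::npos) {
    return;  // already present in the link options
  }
  if (!linkOptions.empty()) {
    linkOptions += ' ';
  }
  linkOptions += kOpt;
}
```

The substring check keeps the sketch short; a real implementation would more likely match whole option tokens.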
SWDEV-132899 - [OCL][GFX10] passing "force-wgp-mode" option to Finalizer to enable WGP mode by default on gfx10+
and allow GPU_ENABLE_WGP_MODE to control the WGP/CU mode for HSAIL/SC path as well.
- Also, for Ariel (Navi10Lite), wave32 should be disabled in LC, but GPU_ENABLE_WAVE32_MODE can still control it for testing if needed.
ReviewRequestURL = http://ocltc.amd.com/reviews/r/16926/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#34 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#70 edit
SWDEV-86035 - Fix asserts in PAL after latest integration
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.cpp#71 edit
SWDEV-180834 - [Forum] - Washed-Out Colors in Premiere Pro CC 2018 When 10bit Enabled
- Correct OGL->OCL mapping for CM_SURF_FMT_RGB10_X2 format
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDeviceGL.cpp#33 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevicegl.cpp#10 edit
SWDEV-180407 - Observed failure while running OCL 2.0 conformance API : min_max_device_version
- revert CL1739455 to use OCL version 1.2 as default to avoid this issue for ROCm 2.2 release
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#117 edit
SWDEV-181012 - [CQE OCL][DTB-BLOCKER][QR][Windows][19.10] clinfo results in "clBuildProgram" error with OCL binaries. Faulty CL#1737731
- Remove some dead code as suggested by Konstantin Zhuravlyov
- These strings are no longer needed, loader will never ask about them
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudefs.hpp#158 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#248 edit
SWDEV-180834 - [Forum] - Washed-Out Colors in Premiere Pro CC 2018 When 10bit Enabled
- Add CL_RGBA 101010 support for GL interop in GSL path.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudefs.hpp#157 edit
SWDEV-179047 - [CQE OCL][DTB-BLOCKER][QR][Windows][19.10] clinfo results in "clBuildProgram" error with OCL binaries. Faulty CL#1737731
- Tested on Picasso: the PAL stack passed, the ORCA stack has the issue.
- PAL has the xnack feature supported, but ORCA doesn't.
- Added the xnack feature and tested; it works fine on both Bristol and Picasso.
- This is a simple temporary fix; a proper fix will be implemented later in another ticket.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudefs.hpp#155 edit
SWDEV-169154 - Implement OpenCL extension function to set stable pstate on ORCA stack
-Enable StablePstate feature in linux brahma stack
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#189 edit
SWDEV-176282 - FP16_MatrixTranspose is failing on NAVI10/VEGA10 PAL/LC path
- add COMGR logging support to show the build log
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#28 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.hpp#16 edit
SWDEV-169154 - Implement OpenCL extension function to set stable pstate on ORCA stack
-Disable StablePstate feature in linux brahma stack
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#187 edit
SWDEV-134107 - Add support for respecting target's xnack setting
- Enable the XNACK feature for all APU systems and remove the xnackEnabled_ field in the AMDDeviceInfo struct
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#332 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdefs.hpp#23 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#116 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.cpp#98 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocsettings.cpp#41 edit
SWDEV-178313 - Properly enable OpenCL 2.0 on ROCm/LC path for Vega10+.
OPENCL_VERSION_STR is 2.1, but we only enable 2.0 since we don't have compiler's support for 2.1.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#115 edit
SWDEV-178313 - Enable OpenCL 2.0 on ROCm/LC path for Vega10+
Doorbell self-ring doesn't work for Fiji, so we enable 2.0 only for Vega10+ for now.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#114 edit
SWDEV-178459 - Navi10 Regression in driver builds with OpenCL v2811.3 causing issues with several OpenCL apps and workloads
Switch to Wave64 for HSAIL/SC path for now as the Wave32 in HSAIL/SC path causes multiple regressions and some OCL apps cannot be run
ReviewBoardURL = http://ocltc.amd.com/reviews/r/16662/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#66 edit
SWDEV-172504 - [PAL/LC] OpenCL PAL Runtime does not support new isa naming convention
- Using new isa naming convention in ORCA path
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudefs.hpp#154 edit
SWDEV-127767 - Don't guess at the suffix for the device libraries
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/CMakeLists.txt#18 edit
SWDEV-79445 - OCL generic changes and code clean-up
- Add mapping of the 101010 GL interop formats to CL_RGBA. This ensures channel-order consistency between OGL and OCL
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_gl.cpp#62 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldefs.hpp#47 edit
SWDEV-132899 - [OCL][GFX10] 70 subtests of Conformance Mipmaps (clCopyImage) test failed for image type 1Darray
This is the follow up for CL#1517501
The copyImage1DA blit kernel uses the image2d_array_t type for src/dst images. On gfx10, the number of arrays/layers is expected in the Z component for a 2Darray image, so a swap is required for 1Darray images when a 2Darray image is used for the image copy. copyImage1DA has code for swapping the z and y components as follows:
if (srcOrigin.w != 0) {
coordsSrc.z = coordsSrc.y;
coordsSrc.y = 0;
}
if (dstOrigin.w != 0) {
coordsDst.z = coordsDst.y;
coordsDst.y = 0;
}
So, to use this path, force the w component to 1 for src and dst images on gfx10 if the image type is 1Darray.
ReviewRequestURL = http://ocltc.amd.com/reviews/r/16538/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#28 edit
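The effect of forcing the w component can be modelled on the host with a small sketch. It assumes a plain four-component integer struct in place of the kernel's int4; Int4, prepareOrigin, and remapCoords are hypothetical names and do not come from palblit.cpp or the blit kernel source.

```cpp
#include <cassert>

// Stand-in for the kernel's int4 type (hypothetical, host-side only).
struct Int4 {
  int x, y, z, w;
};

// Host-side step described above: on gfx10, force origin.w to 1 for
// 1Darray images so the kernel takes the swap path.
inline Int4 prepareOrigin(Int4 origin, bool isGfx10, bool is1DArray) {
  if (isGfx10 && is1DArray) {
    origin.w = 1;
  }
  return origin;
}

// Mirrors the swap in copyImage1DA: when origin.w is non-zero, the array
// layer index moves from y to z and y is cleared.
inline Int4 remapCoords(Int4 coords, const Int4& origin) {
  if (origin.w != 0) {
    coords.z = coords.y;
    coords.y = 0;
  }
  return coords;
}
```

For a 1Darray coordinate with x = 5 and layer 3 stored in y, the remapped coordinate on gfx10 has y = 0 and z = 3, which is the layout a 2Darray image expects.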
SWDEV-174282 - [AMF] WIN10 Converter fails when scale YUY2 image with certain output width
- When OCL creates an image view use the pitch value from the original surface
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#185 edit
SWDEV-172202 - Work around the scheduler for systems that don't support PCIe 3 atomics properly.
The idea is that the scheduler uses a device-side global as write_index and only writes the write_index back to the hsa queue when the last thread of the scheduler leaves.
This change, along with the library-side change, has been tested on systems with and without proper PCIe 3 atomics support.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocblit.cpp#29 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocsched.hpp#2 edit
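The write_index scheme described above can be modelled on the host as follows. This is only a sketch of the idea, not the device-side scheduler: SchedulerState, reserveSlot, and leave are hypothetical names, std::atomic stands in for device-local atomics, and a plain integer stands in for the hsa queue's write index.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical host-side model of the scheduler state.
struct SchedulerState {
  std::atomic<std::uint64_t> deviceWriteIndex;  // device-side global write_index
  std::atomic<std::uint32_t> activeThreads;     // scheduler threads that have not left yet
  std::uint64_t queueWriteIndex;                // stand-in for the hsa queue's write index

  explicit SchedulerState(std::uint32_t numThreads)
      : deviceWriteIndex(0), activeThreads(numThreads), queueWriteIndex(0) {}
};

// Reserve a queue slot using only the device-side index, so no atomic
// operation on the queue's own write index is needed per reservation.
inline std::uint64_t reserveSlot(SchedulerState& s) {
  return s.deviceWriteIndex.fetch_add(1);
}

// Called when a scheduler thread leaves. Only the last thread writes the
// accumulated index back to the queue. Returns true if this call published.
inline bool leave(SchedulerState& s) {
  if (s.activeThreads.fetch_sub(1) == 1) {
    s.queueWriteIndex = s.deviceWriteIndex.load();
    return true;
  }
  return false;
}
```

In this model the queue's write index changes only once per scheduler run, by a single plain store from the last thread to leave.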
SWDEV-162389 - OpenCL Support for COMgr
- Added the machineTargetLC_ values, which were introduced in CL1702548, for Carrizo and Hawaii
- requested by Joseph Greathouse for public users (https://github.com/RadeonOpenCompute/ROCm/issues/668)
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdefs.hpp#22 edit