SWDEV-134107 - Add support for respecting target's xnack setting
- Enable the XNACK feature for all APU systems and remove the xnackEnabled_ field from the AMDDeviceInfo struct
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#332 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdefs.hpp#23 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#116 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.cpp#98 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocsettings.cpp#41 edit
SWDEV-178313 - Properly enable OpenCL 2.0 on ROCm/LC path for Vega10+.
OPENCL_VERSION_STR is 2.1, but we only enable 2.0 since we don't have compiler support for 2.1.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#115 edit
SWDEV-178313 - Enable OpenCL 2.0 on ROCm/LC path for Vega10+
Doorbell self-ring doesn't work for Fiji, so we enable 2.0 only for Vega10+ for now.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#114 edit
SWDEV-178459 - Navi10 Regression in driver builds with OpenCL v2811.3 causing issues with several OpenCL apps and workloads
Switch to Wave64 on the HSAIL/SC path for now, because Wave32 on that path causes multiple regressions and some OCL apps cannot be run
ReviewBoardURL = http://ocltc.amd.com/reviews/r/16662/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#66 edit
SWDEV-172504 - [PAL/LC] OpenCL PAL Runtime does not support new isa naming convention
- Using new isa naming convention in ORCA path
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudefs.hpp#154 edit
SWDEV-127767 - Don't guess at the suffix for the device libraries
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/CMakeLists.txt#18 edit
SWDEV-79445 - OCL generic changes and code clean-up
- Add a mapping of the 101010 GL interop formats to CL_RGBA. This change keeps the channel order consistent between OGL and OCL
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_gl.cpp#62 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldefs.hpp#47 edit
SWDEV-132899 - [OCL][GFX10] 70 subtests of Conformance Mipmaps (clCopyImage) test failed for image type 1Darray
This is a follow-up to CL#1517501
The copyImage1DA blit kernel uses the image2d_array_t type for src/dst images. On gfx10, the number of arrays/layers is expected in the Z component for a 2Darray image, so a swap is required for 1Darray images when a 2Darray image is used for the image copy. copyImage1DA has code for swapping the z and y components as follows:
    if (srcOrigin.w != 0) {
        coordsSrc.z = coordsSrc.y;
        coordsSrc.y = 0;
    }
    if (dstOrigin.w != 0) {
        coordsDst.z = coordsDst.y;
        coordsDst.y = 0;
    }
So, to use this path, force the w component to 1 for the src and dst images on gfx10 if the image type is 1Darray.
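As a rough illustration of the description above, the following host-side C++ sketch models the swap and the forced w component. The Int4 struct and both function names are hypothetical stand-ins, not names from palblit.cpp or the blit kernel; only the swap logic itself is taken from the snippet quoted above.

```cpp
// Hypothetical stand-in for the 4-component integer vectors used by the blit kernel.
struct Int4 {
    int x, y, z, w;
};

// Mirrors the swap quoted above: when origin.w is non-zero, the layer index
// that a 1Darray image keeps in y is moved to z, where a 2Darray image
// expects it, and y is cleared.
Int4 remapArray1DCoords(Int4 coords, const Int4& origin) {
    if (origin.w != 0) {
        coords.z = coords.y;
        coords.y = 0;
    }
    return coords;
}

// Assumed host-side helper: force w to 1 for 1Darray images on gfx10 so the
// kernel takes the swap path; every other case leaves the origin untouched.
Int4 prepareOrigin(Int4 origin, bool isGfx10, bool isArray1D) {
    if (isGfx10 && isArray1D) {
        origin.w = 1;
    }
    return origin;
}
```

With these helpers, a coordinate at layer 3 of a 1Darray image ends up with the layer in z on gfx10 and stays unchanged elsewhere.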
ReviewRequestURL = http://ocltc.amd.com/reviews/r/16538/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#28 edit
SWDEV-174282 - [AMF] WIN10 Converter fails when scale YUY2 image with certain output width
- When OCL creates an image view use the pitch value from the original surface
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#185 edit
SWDEV-172202 - Work around the scheduler for systems that don't support PCIe 3 atomics properly.
The idea is that the scheduler uses a device-side global as write_index and only writes write_index back to the hsa queue when the last thread of the scheduler leaves.
This change, along with the library-side change, has been tested on systems with and without proper PCIe 3 atomics support.
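The idea described above can be modeled on the host with std::atomic, purely as a sketch. The real scheduler is a device-side kernel; the HostQueue and SchedulerState types and the enter/reserveSlot/leave functions below are hypothetical names introduced for illustration only.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical model of the queue's write index, which lives in memory that
// would otherwise need PCIe atomics to update.
struct HostQueue {
    std::atomic<uint64_t> writeIndex{0};
};

// Hypothetical scheduler state kept in device memory.
struct SchedulerState {
    std::atomic<uint64_t> localWriteIndex{0};  // the device-side global
    std::atomic<uint32_t> activeThreads{0};    // scheduler threads still running
};

// A scheduler thread registers itself before doing any work.
void enter(SchedulerState& s) {
    s.activeThreads.fetch_add(1, std::memory_order_acq_rel);
}

// Reserve one slot using only the device-side counter.
uint64_t reserveSlot(SchedulerState& s) {
    return s.localWriteIndex.fetch_add(1, std::memory_order_acq_rel);
}

// The last thread to leave publishes the accumulated index with a plain
// store. Returns true if this call performed the write-back.
bool leave(SchedulerState& s, HostQueue& q) {
    if (s.activeThreads.fetch_sub(1, std::memory_order_acq_rel) == 1) {
        q.writeIndex.store(s.localWriteIndex.load(std::memory_order_acquire),
                           std::memory_order_release);
        return true;
    }
    return false;
}
```

In this model the queue's index is touched exactly once per scheduler run, which is the property the workaround relies on.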
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocblit.cpp#29 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocsched.hpp#2 edit
SWDEV-162389 - OpenCL Support for COMgr
- Added the machineTargetLC_ values, which were introduced in CL1702548, for Carrizo and Hawaii
- Requested by Joseph Greathouse for public users (https://github.com/RadeonOpenCompute/ROCm/issues/668)
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdefs.hpp#22 edit
SWDEV-172202 - Back out changelist 1730757.
Failure in OCLDynamic tests in various TC Sanity tests.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#111 edit
SWDEV-132899 - [OCL][GFX10] report number of WGP by default on gfx10 ASICs
Both the HSAIL/SC and LC compilers use WGP mode by default on gfx10 ASICs (i.e., COMPUTE_PGM_RSRC1.WGP_MODE is set to 1 by both compilers), so the runtime should report the number of WGPs (i.e., CU/2) on gfx10 ASICs by default.
The new environment variable (GPU_ENABLE_WGP_MODE = 0) can be used to force CU mode on LC (i.e., the -mcumode option) if it's needed (HSAIL/SC doesn't have any compiler option for forcing CU mode).
Also, use the new environment variable (GPU_ENABLE_WAVE32_MODE) to control wave32 mode on gfx10+.
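A minimal sketch of the reporting rule described above, assuming the WGP setting has already been read into a boolean. The function name and parameters are hypothetical and are not taken from paldevice.cpp.

```cpp
#include <cstdint>

// Hypothetical helper: the compute-unit count the runtime reports.
// wgpModeEnabled stands in for the GPU_ENABLE_WGP_MODE setting.
uint32_t reportedComputeUnits(uint32_t physicalCUs, bool isGfx10, bool wgpModeEnabled) {
    if (isGfx10 && wgpModeEnabled) {
        return physicalCUs / 2;  // report WGPs, i.e. CU/2
    }
    return physicalCUs;  // CU mode, or an ASIC older than gfx10
}
```

For example, a gfx10 part with 40 CUs would report 20 in WGP mode and 40 when CU mode is forced.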
ReviewRequestURL = http://ocltc.amd.com/reviews/r/16435/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#329 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#27 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#121 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#65 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#301 edit
SWDEV-79445 - OCL generic changes and code clean-up
- Add IS_LIGHTNING check for the rocr initialization, because currently for LC builds GPU_ENABLE_PAL is forced to 1.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#241 edit
SWDEV-79445 - OCL generic changes and code clean-up
- Allow ROCr and PAL support from a single runtime binary. The runtime uses the ROCr path by default; GPU_ENABLE_PAL=1 forces PAL.
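The selection rule above could be sketched as follows. This is an assumption-laden illustration: the Backend enum and both functions are hypothetical, and the runtime's actual flag parsing (see flags.hpp) may differ from the simple string comparison used here.

```cpp
#include <cstdlib>
#include <cstring>

enum class Backend { ROCr, PAL };

// gpuEnablePal is the raw value of GPU_ENABLE_PAL, or nullptr when unset.
// Assumption: only the exact string "1" forces PAL.
Backend selectBackend(const char* gpuEnablePal) {
    if (gpuEnablePal != nullptr && std::strcmp(gpuEnablePal, "1") == 0) {
        return Backend::PAL;
    }
    return Backend::ROCr;  // default path
}

// Convenience wrapper that reads the process environment.
Backend selectBackendFromEnv() {
    return selectBackend(std::getenv("GPU_ENABLE_PAL"));
}
```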
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/build/Makefile.api#183 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#240 edit
SWDEV-79445 - OCL generic changes and code clean-up
- Keep the body of all methods in the Program interface
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.cpp#97 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.hpp#44 edit
SWDEV-79445 - OCL generic changes and code clean-up
- Move the constructor body of LightningProgram to the header
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.cpp#96 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.hpp#43 edit
SWDEV-174551 - [CQE OCL][QR][DTB-Blocker] 7 tests are failing in Conformance | Faulty CL#1720236
- Back out changelist 1720236. Conformance swaps the RGB components to BGR and fails if real RGB is used. OCL can't switch to RGB until a fix is applied to the conformance tests.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldefs.hpp#46 edit
SWDEV-79445 - Back out changelist 1722556
- More changes are necessary on ROCm backend to support a dynamic switch between HSAIL and LC
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#107 edit
SWDEV-145570 - Use Subwindow copy SDMA for D->H and H->D copies if possible, or fall back to line-by-line copies if the pitch is unaligned.
- Set correct flags for SVM finegrain buffer for ROC backend
ReviewBoardURL = http://ocltc.amd.com/reviews/r/16353/diff/
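The fallback rule in this change can be illustrated with a small sketch. The CopyPath enum, the function name, and in particular the alignment value are assumptions made for illustration; the changelist does not state the actual pitch alignment requirement.

```cpp
#include <cstddef>

enum class CopyPath { SubwindowSdma, LineByLine };

// Placeholder value only: the real alignment requirement is not given here.
constexpr std::size_t kAssumedPitchAlignment = 4;

// Pick the subwindow SDMA copy when both pitches are aligned; otherwise fall
// back to copying one line at a time.
CopyPath chooseCopyPath(std::size_t srcPitchBytes, std::size_t dstPitchBytes,
                        std::size_t alignment = kAssumedPitchAlignment) {
    const bool aligned = (srcPitchBytes % alignment == 0) &&
                         (dstPitchBytes % alignment == 0);
    return aligned ? CopyPath::SubwindowSdma : CopyPath::LineByLine;
}
```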
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocblit.cpp#27 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocmemory.cpp#41 edit
SWDEV-162389 - OpenCL Support for COMgr
- Fix a bug where an incorrect header file name was included
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#23 edit
SWDEV-79445 - OCL generic changes and code clean-up
- Allow a ROCM build within the same workspace as PAL. Please note that the ROCM default path in this case will be HSAIL.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#105 edit
SWDEV-132899 - [OCL][GFX10] correctly set the wavefrontWidth_ for gfx10.
PAL reports 64 for wavefrontSize, so set it to 32 if either of the conditions below is met:
1- we are in the HSAIL path and GPU_FORCE_WAVE_SIZE_32 is set
2- or we are in the LC path and the ASIC is Navi10Plus
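The two conditions above can be expressed as a small predicate. This is a sketch only: the function name and boolean parameters are hypothetical and stand in for whatever state paldevice.cpp actually consults.

```cpp
#include <cstdint>

// Hypothetical helper returning the value to store in wavefrontWidth_.
// palWavefrontSize is the value PAL reports (64 according to the description).
uint32_t wavefrontWidth(uint32_t palWavefrontSize, bool isLightningPath,
                        bool forceWaveSize32, bool isNavi10Plus) {
    const bool hsailWave32 = !isLightningPath && forceWaveSize32;  // condition 1
    const bool lcWave32 = isLightningPath && isNavi10Plus;         // condition 2
    return (hsailWave32 || lcWave32) ? 32u : palWavefrontSize;
}
```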
ReviewRequestURL = http://ocltc.amd.com/reviews/r/16329/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#120 edit
SWDEV-172784 - No Video playback while 10 bit pixel format is enable
- Correct the channel's order for RGB10
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldefs.hpp#45 edit
SWDEV-79445 - COMGR update
- Use full names for dll/lib load, since not all paths in loadLibrary() can recognize short names.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/comgrctx.cpp#2 edit
SWDEV-169078 - Also copy private_segment_size/group_segment_size to runtime handle for COMgr support
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.cpp#75 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rockernel.cpp#46 edit
SWDEV-162389 - Prepare the runtime code for enabling COMGR by default in the non-LC workspace
- Make sure OCL runtime can dynamically switch between HSAIL and LC paths
- For now, use both the WITH_LIGHTNING_COMPILER and USE_COMGR_LIBRARY defines to identify LC-specific code. The clean-up will come later
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/build/Makefile.api#179 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#238 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#327 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#14 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.hpp#10 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#19 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.hpp#12 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#249 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#118 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.cpp#74 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palprogram.cpp#85 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#63 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/program.cpp#101 edit