SWDEV-79445 - OCL generic changes and code clean-up
- More changes for VanGoghLite support in OCL
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#168 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#103 edit
SWDEV-192384 - [HIP CQE][HIPonPAL][19.40] hipBindTexRef1DFetch, hipTextureRef2D are failed on all ASICs for both Win/Lnx
1. Don't ignore the PAL_ALWAYS_RESIDENT flag for HIP.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/18061/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#101 edit
SWDEV-192384 - [HIP CQE][HIPonPAL][19.40] hipBindTexRef1DFetch, hipTextureRef2D are failed on all ASICs for both Win/Lnx
The runtime cannot trivially determine all the resources that a kernel will use, so it can fail to make all of them resident.
1. Add new runtime flag PAL_ALWAYS_RESIDENT. Enabling this setting will cause resources to become resident at allocation time.
2. Set the default value of the above flag to true for HIP and false for OCL.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/18054/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.cpp#79 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.hpp#30 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#100 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.hpp#27 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#153 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#319 edit
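The PAL_ALWAYS_RESIDENT behaviour described in the change above can be sketched roughly as follows. This is only an illustration, not the runtime's code: Settings, Resource, ResidencyTracker and makeDefaultSettings are hypothetical names, and the only facts taken from the description are the per-API default (true for HIP, false for OCL) and residency at allocation time.

```cpp
#include <cstddef>

// Hypothetical stand-in for the runtime settings object.
struct Settings {
  bool alwaysResident;  // mirrors the PAL_ALWAYS_RESIDENT flag
};

// Default taken from the change description: true for HIP, false for OCL.
inline Settings makeDefaultSettings(bool isHip) { return Settings{isHip}; }

// Hypothetical stand-in for a device resource.
struct Resource {
  std::size_t size = 0;
  bool resident = false;
};

class ResidencyTracker {
 public:
  explicit ResidencyTracker(const Settings& settings) : settings_(settings) {}

  // With alwaysResident set, the resource becomes resident at allocation
  // time, so a later kernel launch does not have to discover it.
  Resource allocate(std::size_t size) const {
    Resource r;
    r.size = size;
    r.resident = settings_.alwaysResident;
    return r;
  }

 private:
  Settings settings_;
};
```

With these assumptions, a HIP-style tracker returns resources that are already resident, while an OCL-style tracker leaves residency to the submission path.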
SWDEV-200614 - [Schneider] Crash in Agisoft when run in mGPU environment
- Add a workaround for the memory pinning path. It performs a 2-step copy to make sure memory pinning doesn't occur on the first unaligned page, because on Windows the memory manager can access the allocation header from the CPU in another thread, which makes a race condition possible.
- Change some default settings for the staging and pinned paths because of PCIe Gen3 performance.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#96 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#150 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#317 edit
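One way to picture the 2-step copy from the change above is to split the host range at the first page boundary: the unaligned head goes through a non-pinned (staged) copy, and only the page-aligned remainder is pinned. The sketch below assumes that split and a 4096-byte page; CopySplit and splitAtFirstPageBoundary are hypothetical names, and the real runtime may divide the transfer differently.

```cpp
#include <cstddef>
#include <cstdint>

// Sizes of the two copy steps (hypothetical helper type).
struct CopySplit {
  std::size_t stagedBytes;  // leading bytes up to the next page boundary, copied without pinning
  std::size_t pinnedBytes;  // remaining bytes, which start on a page boundary
};

// Splits a host range so that pinning never touches the first unaligned page.
// The page size is an assumption; a real implementation would query the OS.
inline CopySplit splitAtFirstPageBoundary(std::uintptr_t hostAddr, std::size_t size,
                                          std::size_t pageSize = 4096) {
  const std::size_t offsetInPage = hostAddr % pageSize;
  if (offsetInPage == 0) {
    return {0, size};  // already aligned: everything can take the pinned path
  }
  std::size_t head = pageSize - offsetInPage;  // bytes left in the first page
  if (head > size) {
    head = size;  // the whole transfer fits inside the first page
  }
  return {head, size - head};
}
```

For example, under these assumptions a transfer of 8192 bytes starting 16 bytes into a page is split into a 4080-byte staged step followed by a 4112-byte pinned step.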
SWDEV-79445 - OCL generic changes and code clean-up
- Remove the double wave limit logic for wave32, since the HW spec claims the same 32 waves per CU
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#94 edit
SWDEV-79445 - OCL generic changes and code clean-up
- Make sure PAL_DISABLE_SDMA is fully functional. Currently, CP DMA is used for buffer transfers, and kernels are used for image and buffer rect copies.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#32 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#150 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#92 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.hpp#25 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#142 edit
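The path selection implied by the PAL_DISABLE_SDMA change above might look like the sketch below. TransferKind, CopyPath and selectCopyPath are hypothetical names. The mapping with SDMA disabled follows the description (CP DMA for buffers, kernels for images and buffer rects); treating every transfer as SDMA when it is enabled is a simplifying assumption.

```cpp
// Hypothetical classification of a transfer request.
enum class TransferKind { Buffer, BufferRect, Image };

// Hypothetical set of copy paths.
enum class CopyPath { Sdma, CpDma, Kernel };

// Picks a copy path. disableSdma mirrors the PAL_DISABLE_SDMA setting.
inline CopyPath selectCopyPath(TransferKind kind, bool disableSdma) {
  if (!disableSdma) {
    return CopyPath::Sdma;  // assumption: SDMA handles everything when enabled
  }
  // With SDMA disabled: plain buffers use CP DMA, while images and
  // buffer rect copies fall back to blit kernels.
  return kind == TransferKind::Buffer ? CopyPath::CpDma : CopyPath::Kernel;
}
```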
SWDEV-189787 - [NWNIT] WX9100 crashing under heavy load in Agisoft
- Don't report the SPIR extension, since HSAIL doesn't support it
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#90 edit
SWDEV-195023 - [CQE OCL][Navi10][RESOLVE] corruption seen in thumbnail for mxf clip after enabling temporal denoiser in Davinci resolve app
- Add a workaround for the missing custom pitch support in gfx10 HW. It can be disabled with GPU_IMAGE_BUFFER_WAR=0. The workaround implements a double copy through an image without pitch.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palmemory.cpp#26 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palmemory.hpp#12 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#89 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.hpp#24 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#138 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.hpp#62 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#313 edit
SWDEV-180872 - Runtime support changes for Cooperative Group Features
- Keep this feature for Linux only. Windows doesn't enable GWS by default
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#85 edit
SWDEV-79445 - OCL generic changes and code clean-up
- Force LC for HIP, since it doesn't support the HSAIL path
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#83 edit
SWDEV-79445 - OCL generic changes and code clean-up
Optimize the scratch buffer calculation in preparation for coop group launch, since the current limit affects the max waves calculation:
- Switch to 32 waves per CU as the max possible limit
- Use the VGPR count for the wave limit calculation to avoid unconditionally using the max possible value
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#141 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.hpp#38 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#82 edit
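A rough sketch of a VGPR-based wave limit like the one described in the change above is given here. The function name and all default parameters are assumptions for illustration: 32 waves per CU comes from the description, while 256 VGPRs per SIMD and 4 SIMDs per CU are placeholder values that differ between ASICs. The runtime's actual formula may differ.

```cpp
#include <algorithm>
#include <cstdint>

// Estimates how many waves per CU need scratch space, based on the number of
// VGPRs the kernel uses. All defaults are illustrative assumptions.
inline std::uint32_t scratchWavesPerCu(std::uint32_t kernelVgprs,
                                       std::uint32_t vgprsPerSimd = 256,
                                       std::uint32_t simdsPerCu = 4,
                                       std::uint32_t maxWavesPerCu = 32) {
  if (kernelVgprs == 0) {
    return maxWavesPerCu;  // no register pressure: fall back to the cap
  }
  // Waves that fit in one SIMD register file, with at least one wave.
  const std::uint32_t wavesPerSimd =
      std::max<std::uint32_t>(1, vgprsPerSimd / kernelVgprs);
  // Never exceed the maximum possible limit per CU.
  return std::min(maxWavesPerCu, wavesPerSimd * simdsPerCu);
}
```

Under these placeholder values, a kernel using 128 VGPRs is limited to 8 waves per CU instead of the unconditional 32, which shrinks the scratch buffer accordingly.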
SWDEV-132899 - [OCL][GFX10] increase the numScratchWavesPerCu in Wave32 mode and use the actual num of CUs not the total num of WGPs when calculating the scratch buffer size
ReviewBoardURL = http://ocltc.amd.com/reviews/r/17474/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#139 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#81 edit
SWDEV-189140 - Add P2P support in PAL path
- PAL requires a P2P resource to be opened on the device that uses it. Add a new interface to open the resource
- Add hidden P2P device object creation to amd::Memory. It can be activated with an OCL context that has a single device.
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_p2p_amd.cpp#2 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#337 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#134 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palmemory.cpp#25 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.cpp#74 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.hpp#28 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#80 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.hpp#23 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#133 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#126 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.cpp#93 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/memory.cpp#136 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/memory.hpp#109 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#306 edit
SWDEV-132899 - [OCL][GFX10] add support for Navi10_A0 (gfx1010) (new entry from PAL for Navi10 A0 boards)
ReviewBoardURL = http://ocltc.amd.com/reviews/r/17022/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldefs.hpp#50 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#74 edit
SWDEV-132899 - [OCL][GFX10] add "wavefrontsize64" to the linkOptions if they had previously been added to the compile options
ReviewRequestURL = http://ocltc.amd.com/reviews/r/16966/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#35 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#71 edit
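The option propagation described in the change above could be sketched as below. propagateWave64 is a hypothetical helper, not the code in devprogram.cpp. To avoid guessing the exact flag spelling, it copies whichever whitespace-separated compile option contains "wavefrontsize64" verbatim.

```cpp
#include <sstream>
#include <string>

// Returns linkOptions, extended with the wavefrontsize64 option if that
// option was present in compileOptions and is not already in linkOptions.
inline std::string propagateWave64(const std::string& compileOptions,
                                   std::string linkOptions) {
  const std::string key = "wavefrontsize64";
  if (linkOptions.find(key) != std::string::npos) {
    return linkOptions;  // already present, nothing to add
  }
  std::istringstream in(compileOptions);
  std::string token;
  while (in >> token) {
    if (token.find(key) != std::string::npos) {
      if (!linkOptions.empty()) {
        linkOptions += ' ';
      }
      linkOptions += token;  // copy the option exactly as it was spelled
      break;
    }
  }
  return linkOptions;
}
```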
SWDEV-132899 - [OCL][GFX10] pass the "force-wgp-mode" option to the Finalizer to enable WGP mode by default on gfx10+
and allow GPU_ENABLE_WGP_MODE to control the WGP/CU mode for the HSAIL/SC path as well.
- Also, for Ariel (Navi10Lite), wave32 should be disabled in LC, but GPU_ENABLE_WAVE32_MODE can still control it for testing if needed.
ReviewrequestURL = http://ocltc.amd.com/reviews/r/16926/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#34 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#70 edit
SWDEV-178459 - Navi10 Regression in driver builds with OpenCL v2811.3 causing issues with several OpenCL apps and workloads
Switch to Wave64 for the HSAIL/SC path for now, as Wave32 in the HSAIL/SC path causes multiple regressions and some OCL apps cannot be run.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/16662/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#66 edit
SWDEV-132899 - [OCL][GFX10] report number of WGP by default on gfx10 ASICs
Both the HSAIL/SC and LC compilers use WGP mode by default on gfx10 ASICs (i.e., COMPUTE_PGM_RSRC1.WGP_MODE is set to 1 by both compilers), so the runtime should report the number of WGPs (i.e., CU/2) on gfx10 ASICs by default.
The new environment variable (GPU_ENABLE_WGP_MODE = 0) can be used to force CU mode on LC (i.e., the -mcumode option) if needed (HSAIL/SC doesn't have any compiler option for forcing CU mode).
Also, use the new environment variable (GPU_ENABLE_WAVE32_MODE) to control the wave32 mode on gfx10+.
ReviewRequestURL = http://ocltc.amd.com/reviews/r/16435/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#329 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#27 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#121 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#65 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#301 edit
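The reporting rule from the change above (WGP count, i.e. CU/2, on gfx10 unless CU mode is forced) can be illustrated with the sketch below. envFlag and reportedComputeUnits are hypothetical helpers. Reading the variable directly with std::getenv is an assumption made to keep the example self-contained; the runtime presumably goes through its own flag mechanism in flags.hpp.

```cpp
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Reads a boolean-like environment variable. An unset or empty variable
// yields defaultValue, "0" yields false, anything else yields true.
inline bool envFlag(const char* name, bool defaultValue) {
  const char* value = std::getenv(name);
  if (value == nullptr || *value == '\0') {
    return defaultValue;
  }
  return std::strcmp(value, "0") != 0;
}

// In WGP mode on gfx10+ the runtime reports work-group processors, each of
// which spans two CUs; otherwise it reports the physical CU count.
inline std::uint32_t reportedComputeUnits(std::uint32_t physicalCus,
                                          bool isGfx10Plus, bool wgpMode) {
  return (isGfx10Plus && wgpMode) ? physicalCus / 2 : physicalCus;
}
```

A caller might combine the two as reportedComputeUnits(cus, isGfx10Plus, envFlag("GPU_ENABLE_WGP_MODE", true)), so that setting GPU_ENABLE_WGP_MODE=0 restores the full CU count.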
SWDEV-162389 - Prepare the runtime code for enabling COMGR by default in the non-LC workspace
- Make sure OCL runtime can dynamically switch between HSAIL and LC paths
- For now, use both the WITH_LIGHTNING_COMPILER and USE_COMGR_LIBRARY defines to identify LC-specific code. The clean-up will come later
Affected files ...
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/build/Makefile.api#179 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#238 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#327 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#14 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.hpp#10 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#19 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.hpp#12 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#249 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#118 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.cpp#74 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palprogram.cpp#85 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#63 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/program.cpp#101 edit
SWDEV-162389 - Prepare the runtime code for enabling COMGR by default in the non-LC workspace
- Make sure OCL runtime can dynamically switch between HSAIL and LC paths
- For now, use both the WITH_LIGHTNING_COMPILER and USE_COMGR_LIBRARY defines to identify LC-specific code. The clean-up will come later
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#236 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#325 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.cpp#12 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devkernel.hpp#8 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.cpp#17 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/devprogram.hpp#10 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#247 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#116 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palkernel.cpp#72 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palprogram.cpp#83 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#61 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/program.cpp#99 edit
SWDEV-145570 - [HIP] - Enable largest possible allocation on HIP
ReviewBoardURL = http://ocltc.amd.com/reviews/r/15803/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#56 edit
SWDEV-133815 - PAL support for Linux Pro w/OpenCL 2.0 support
- Re-enable OCL 2.0 support on Linux.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palsettings.cpp#53 edit