SWDEV-94611 - [OCL-LC-ROCm] Use GFX IP for device name. Set the name to "gfx[M][m][s]" (M:major,m:minor,stepping). Removed the device name strings from the DeviceInfo table. Keep the machineTarget_ field until the compiler is changed to accept gfxip strings.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdefs.hpp#6 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#15 edit
SWDEV-94644 - Run prepare-builtins from the modules build directory, instead of right before generating the include files. Renamed the files to match the opensource build names (except for the .amdgcn suffix). Automatically generate a single include file for all libraries.
Affected files ...
... //depot/stg/opencl/drivers/opencl/library/build/Makefile.library#54 edit
... //depot/stg/opencl/drivers/opencl/make/amdgcn.git/irif/build/Makefile.irif#7 edit
... //depot/stg/opencl/drivers/opencl/make/amdgcn.git/ockl/build/Makefile.ockl#8 edit
... //depot/stg/opencl/drivers/opencl/make/amdgcn.git/oclc/build/Makefile.oclc#10 edit
... //depot/stg/opencl/drivers/opencl/make/amdgcn.git/ocml/build/Makefile.ocml#8 edit
... //depot/stg/opencl/drivers/opencl/make/amdgcn.git/opencl/build/Makefile.opencl#10 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.cpp#30 edit
SWDEV-86035 - Add PAL backend to OpenCL
- Add (PAL) suffix to the driver version
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#556 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#21 edit
SWDEV-96354 - Wrong usage of hsaImageData_ and deviceMemory_.
Use hsaImageData_ as the original pointer before alignment and only for that purpose. The deviceMemory_ is where the data is located. No one ever needs to use hsaImageData_ really. This is only an issue with tiled images
ReviewBoardURL = http://ocltc.amd.com/reviews/r/11331/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#14 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.hpp#6 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocmemory.cpp#4 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocmemory.hpp#4 edit
SWDEV-94610 - Make sure each kernarg segment sits on a different cache line (align the kernargs on cache lines at minimum). Minor misc cleanups.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#13 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rockernel.cpp#14 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rockernel.hpp#8 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.cpp#27 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#13 edit
SWDEV-101383 - [RS_DVR][MGPU] Slave GPU is blocked from going into BACO when DVR process is active (no recording or instant replay)
- Fix a memory leak
- Also make sure to use VALIDATE_ONLY flag properly as bindExternalDevice can be called even during context creation for which we cant close the adaper
ReviewBoardURL = http://ocltc.amd.com/reviews/r/11330/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#555 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#174 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.h#62 edit
SWDEV-79278 - [OpenCL][PAL] fixing a regression in gfx9 after CL#1309875 which caused all the OCLTST tests to fail on gfx9 emulator. Dont add any extra entry to the GfxIpDeviceInfo table as this table must match with GfxIpLevel enum (located in //depot/stg/pal/inc/core/palDevice.h).
ReviewBoardURL = http://ocltc.amd.com/reviews/r/11313/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldefs.hpp#11 edit
SWDEV-101448 - [CQE OCL][Brahma][PERF][QR] ~21% perf drop is observed with lulesh-cl subtest of ComputeApps tests : Faulty CL # 1306133
- Use the logic for transfer size before CL#1306133
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#124 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#10 edit
SWDEV-101315 - Fix PerfCounter not working under CodeXL.
1. Need to map ORCA PerfCounter block to PAL PerfCounter block/instance.
2. CodeXL could try to create PerfCouters that don't exist in HW, so need to handle that and return 0 as result.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palcounters.cpp#5 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palcounters.hpp#6 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palvirtual.cpp#21 edit
SWDEV-101853 - Fix the build, add a "return NULL" after the assert.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rockernel.hpp#7 edit
SWDEV-94610 - Fill the compileSize_ and compileSizeHint_ info from the LC metadata.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rockernel.cpp#13 edit
SWDEV-101853 - roc::Kernel cleanups:
- Remove unused classes & member functions/variables.
- Flatten vector arguments for the HSAIL path to remove the need for numElem_.
- Consolidate initArguments in a single loop for the HSAIL path.
- Use the Kernel::Argument to fill the OCL descriptor as much as possible.
- Set the access qualifier for both buffers and images.
- Fix the indentation and coding conventions.
- Add new ROC_ARG_TYPE type for hidden arguments
- Add an index_ field the roc::Kernel::Argument to record the OCL signature index for this argument, or -1 for hidden arguments
- Handle the hidden arguments as any other argument at dispatch (now included in the hsailArgList_)
- roc::Kernel::hsailArgAt(int) now returns the kernel argument for the given position in the OCL signature, not the position the the hsailArgList_.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rockernel.cpp#12 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rockernel.hpp#6 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#12 edit
SWDEV-101383 - Back out CL1310033 as it is causing Carrizo Win 10 Sanity test to crash at ocltst module ocldx.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#553 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#172 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.h#61 edit
SWDEV-2 - Change OpenCL version number from 2216 to 2217.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1963 edit
SWDEV-2 - Change OpenCL version number from 2215 to 2216.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1962 edit
SWDEV-101169 - Compile the PCH file from <stdin> instead of a file reference. This removes the requirement to have the original file present when using the PCH file.
Affected files ...
... //depot/stg/opencl/drivers/opencl/make/amdgcn.git/headers/build/Makefile.headers#9 edit
... //depot/stg/opencl/drivers/opencl/make/amdgcn.git/irif/build/Makefile.irif#6 edit
... //depot/stg/opencl/drivers/opencl/make/amdgcn.git/ockl/build/Makefile.ockl#7 edit
... //depot/stg/opencl/drivers/opencl/make/amdgcn.git/oclc/build/Makefile.oclc#9 edit
... //depot/stg/opencl/drivers/opencl/make/amdgcn.git/ocml/build/Makefile.ocml#7 edit
... //depot/stg/opencl/drivers/opencl/make/amdgcn.git/opencl/build/Makefile.opencl#9 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/roccompiler.cpp#14 edit
SWDEV-2 - Change OpenCL version number from 2214 to 2215.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1961 edit
SWDEV-94610 - The spec says that the value returned for HSA_EXECUTABLE_SYMBOL_INFO_NAME_LENGTH does not include the NUL terminator. We should add one before using the string.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.cpp#25 edit
SWDEV-101383 - [RS_DVR][MGPU] Slave GPU is blocked from going into BACO when DVR process is active (no recording or instant replay)
- if the OS is Win10, no need to do extensive adapter init.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/11241/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#552 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#171 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.h#60 edit
SWDEV-101621 - [CQE OCL][OpenCL on PAL] 6 WF Conformance tests are failing
- Make sure the rowPitch is aligned to pixels for images created from buffer
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#20 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palresource.cpp#10 edit
SWDEV-79278 - [OpenCL][PAL] force Vega10(gfx9)(aka: Greenland) to use PAL backend
ReviewBoardURL = http://ocltc.amd.com/reviews/r/11279/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#551 edit
SWDEV-79445 - OCL generic changes and code clean-up
- Improve image fill performance with multiple writes in a single thread. The current split has 3 regions
Affected files ...
... //depot/stg/opencl/drivers/opencl/library/common.hsa/src/blitKernels.cl#4 edit
... //depot/stg/opencl/drivers/opencl/library/common/src/blitKernels.cl#4 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#123 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.hpp#40 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#8 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.hpp#4 edit
SWDEV-94610 - Restore the amdgpu_metadata.[ch]pp files. We need to share these files between different projects, and should avoid branching them. Ideally, they would be part of a metadata utility library.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/amdgpu_metadata.cpp#1 add
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/amdgpu_metadata.hpp#1 branch
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rockernel.cpp#9 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocmetadata.cpp#3 delete
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocmetadata.hpp#4 delete
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.hpp#11 edit
SWDEV-2 - Change OpenCL version number from 2213 to 2214.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1960 edit
SWDEV-94610 - Use the metadata to set the correct size for pointer arguments. Pointers to different address spaces may be of different sizes.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocvirtual.cpp#11 edit
SWDEV-94610 - Fix the argName length issue. The string returned by the ROCR is already NUL-terminated.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.cpp#22 edit
SWDEV-94610 - Fix the API::get_kernel_arg_info conformance test failure. The runtime metadata needs to return references from Name() and TypeName() instead of temporary strings. Name().c_str() should be valid until the program is destroyed.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rockernel.cpp#8 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocmetadata.cpp#2 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocmetadata.hpp#3 edit
SWDEV-2 - Change OpenCL version number from 2212 to 2213.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1959 edit
SWDEV-94644 - Run prepare-builtins on the control functions.
Affected files ...
... //depot/stg/opencl/drivers/opencl/library/build/Makefile.library#53 edit
... //depot/stg/opencl/drivers/opencl/make/amdgcn.git/oclc/build/Makefile.oclc#7 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.cpp#19 edit
SWDEV-86035 - Enable PAL for GFX9 by default
- GPU_ENABLE_PAL=0 will force GSL backend for GFX9
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#550 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/paldevice.cpp#19 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#256 edit
SWDEV-101678 - Create a new instance of the ROCm-OpenCL-Driver for each call to compileImpl and linkImpl.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#202 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/roccompiler.cpp#11 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.cpp#12 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocdevice.hpp#5 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.cpp#18 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/rocm/rocprogram.hpp#9 edit
SWDEV-2 - Change OpenCL version number from 2211 to 2212.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1958 edit
SWDEV-101206 - [CQE OCL][Perf][G][QR] Upto ~9% Performance drop observed while running Video Composition subtest of Compubench; Faulty CL#1306133
- Use the original logic without DMA flush. Flush on staging write helps with a blocking op only, but currently VDI doesn't have that information.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#122 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#7 edit
SWDEV-2 - Change OpenCL version number from 2210 to 2211.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1957 edit
SWDEV-2 - Change OpenCL version number from 2209 to 2210.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1956 edit