EPR #010002 - Change OpenCL version number from 1818 to 1819.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1565 edit
EPR #010002 - Change OpenCL version number from 1817 to 1818.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1564 edit
ECR #333753 - HSA HLC: fix for cannot select __hsail_memfence intrinsic under -O0
Fix for bug 10808.
We missed the always-inliner pass in the runtime for -O0 builds only. aoc2 and opt.exe run it, but the conformance binary itself does not.
As a result, a set of always_inline functions was not inlined and we could not lower the __hsail_memfence call, which accepts only immediate arguments,
whereas we pass call arguments to it when it is not inlined.
The fix ensures that one of the two inliners always runs when building for GPU. Under -O0 that is the always-inliner pass.
In addition, three helper functions in the library were marked as always_inline to ensure they are also inlined into atomic_work_item_fence() and thus
supply immediate arguments to the __hsail_memfence call. The dialect we use in clang to build the library does not provide inlining of "static inline"
functions, so an additional attribute was needed.
The fix also unifies the -O0 inliner invocation code for offline opt.exe and complib.
Additionally fixed a complib bug: under -O0, OptLevel::setup() exits early and so does not change the HLC_Disable_Amd_Inline_All variable.
The variable therefore kept the value left after blit kernels compilation. Added the corresponding setup code to GPUO0OptLevel::optimize().
In the long run we should call AMDPassManagerBuilder under -O0 as well and handle all that logic there.
Testing: test_c11_atomics atomic_fence -O0, pipes -O0, smoke, precheckin, ocl_features
Reviewed by Brian Sumner, Matthew Arsenault and Evgeny Mankov
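For illustration only, a minimal C++ sketch of the helper-attribute idea described above. The names Scope, scopeToFenceArg and fenceArgForWorkGroup are hypothetical and do not come from the library; the sketch assumes a GCC- or Clang-compatible compiler, where the always_inline attribute requests inlining even at -O0.

```cpp
#include <cassert>

// Hypothetical memory-scope constants; not the real library's names.
enum Scope { kWorkItem = 0, kWorkGroup = 1, kDevice = 2 };

// "static inline" alone is only a hint. The always_inline attribute
// (GCC/Clang extension) asks the compiler to inline the helper even at -O0,
// so a caller that passes a constant exposes that constant at the call site.
static inline __attribute__((always_inline)) int scopeToFenceArg(Scope s) {
    assert(s >= kWorkItem && s <= kDevice);  // precondition on the hypothetical enum
    return s == kWorkGroup ? 1 : (s == kDevice ? 2 : 0);
}

// Hypothetical caller standing in for a function such as atomic_work_item_fence().
int fenceArgForWorkGroup() {
    return scopeToFenceArg(kWorkGroup);
}
```

The sketch shows only the attribute placement; it does not model the __hsail_memfence lowering itself.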
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/opt_level.cpp#27 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm/tools/opt/opt.cpp#70 edit
... //depot/stg/opencl/drivers/opencl/library/hsa/hsail/src/misc/atomicWorkItemFence.cl#9 edit
EPR #010002 - Change OpenCL version number from 1816 to 1817.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1563 edit
ECR #304775 - Mipmaps support in OpenCL
- Keep mipmaps in staging only
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#233 edit
ECR #304775 - Fix for a memory "leak" in one GEHC sample
- The app builds a dependency graph in which each new command waits for the previous one. Our runtime couldn't release the waited-on commands in this situation, because the release was done in the command destructor. The fix releases the events from the wait list as soon as the command is done.
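A simplified C++ sketch of why releasing in the destructor keeps a whole chain alive. The Command type, its live counter, setComplete and liveAfterChain are hypothetical and use std::shared_ptr in place of the runtime's own reference counting, which is not shown here.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical, simplified command; not the runtime's actual class.
struct Command {
    static int live;                                  // commands currently alive
    std::vector<std::shared_ptr<Command>> waitList;   // events this command waits on

    explicit Command(std::vector<std::shared_ptr<Command>> deps = {})
        : waitList(std::move(deps)) { ++live; }
    ~Command() { --live; }

    // Called when the command finishes: drop the wait-list references now
    // instead of waiting for the destructor.
    void setComplete() { waitList.clear(); }
};
int Command::live = 0;

// Builds a chain in which each command waits on the previous one and reports
// how many commands are alive while only the newest one is still referenced.
int liveAfterChain(int length, bool releaseOnComplete) {
    std::shared_ptr<Command> tail = std::make_shared<Command>();
    if (releaseOnComplete) tail->setComplete();
    for (int i = 1; i < length; ++i) {
        tail = std::make_shared<Command>(
            std::vector<std::shared_ptr<Command>>{tail});
        if (releaseOnComplete) tail->setComplete();
    }
    return Command::live;
}
```

Without the early release every command in the chain stays alive as long as the newest one does, which matches the "leak" described above.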
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.cpp#71 edit
EPR #396242 - Update to HCtoDCmapping: Add guards for HCtoDCmapping when mapping parameters from LLVM to MSVC. Add a new struct packing rule for doubles on Windows. Use dc_alignment and hc_alignment to track parameter alignment on the device and host compilers respectively.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpumapping.cpp#2 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpumapping.hpp#2 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpuprogram.cpp#66 edit
EPR #420344 - Forum [180211]: enqueueNDRangeKernel crashes to execute device binary if it contains printf statements
This is a temporary workaround to avoid an app crash when a kernel uses printf but the program object is built from a binary (i.e., the printf info is not propagated if the program object is built from a binary).
ReviewBoardURL = http://ocltc.amd.com/reviews/r/7676/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.cpp#36 edit
EPR #010002 - Change OpenCL version number from 1815 to 1816.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1562 edit
EPR #010002 - Change OpenCL version number from 1814 to 1815.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1561 edit
EPR #010002 - Change OpenCL version number from 1813 to 1814.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1560 edit
EPR #420584 - [CQE OCL][ISV][QR][SI] FAHBenchmark application is crashing on all SI cards.
Wave limiter causes FAH crash on SI. Disable wave limiter for SI as a workaround.
Opened bug #10817 to track this issue.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuwavelimiter.cpp#6 edit
EPR #010002 - Change OpenCL version number from 1812 to 1813.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1559 edit
EPR #010002 - Change OpenCL version number from 1811 to 1812.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1558 edit
EPR #010002 - Change OpenCL version number from 1810 to 1811.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1557 edit
EPR #419347 - Fix a d3d9 memory leak.
According to https://msdn.microsoft.com/en-us/library/windows/desktop/bb174386(v=vs.85).aspx: "Calling IDirect3DDevice9::GetDirect3D will increase the internal reference count on the IDirect3D9 interface. Failure to call IUnknown::Release when finished using this IDirect3D9 interface results in a memory leak."
Although p3d9devEx->Release() has the same effect as p3d9dev->Release(), for clarity it is better to use p3d9dev->Release() instead.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDeviceD3D9.cpp#13 edit
EPR #010002 - Change OpenCL version number from 1809 to 1810.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1556 edit
ECR #304775 - Mipmaps support in OpenCL
- Enable PAD2 bit for miplevel views
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#218 edit
EPR #010002 - Change OpenCL version number from 1808 to 1809.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1555 edit
EPR #010002 - Change OpenCL version number from 1807 to 1808.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1554 edit
EPR #010002 - Change OpenCL version number from 1806 to 1807.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1553 edit
EPR #010002 - Change OpenCL version number from 1805 to 1806.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1552 edit
EPR #403782 - IOMMU2/SVM
- Update the caching and hit logic for the resource cache to reflect allocation attributes for SVM. Otherwise it can give wrong hits, leading to hangs if a regular surface is used for shader upload etc. IOMMUv2 strictly needs shader and command buffers to have the EXECUTE attribute.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/7572/diff/
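A rough C++ sketch of a cache lookup that takes allocation attributes into account. AllocAttr, CachedResource and ResourceCache are hypothetical; the actual flags and cache structure in gpuresource.cpp are not shown in this log.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical allocation attributes; the runtime's real flags are not shown here.
enum AllocAttr : std::uint32_t { kNone = 0, kSvm = 1u << 0, kExecute = 1u << 1 };

struct CachedResource {
    std::size_t size;
    std::uint32_t attrs;
};

class ResourceCache {
public:
    void add(const CachedResource& r) { entries_.push_back(r); }

    // A hit requires both the size and the allocation attributes to match,
    // so a regular surface is never handed out for an SVM/EXECUTE request.
    bool take(std::size_t size, std::uint32_t attrs, CachedResource* out) {
        assert(out != nullptr);
        for (std::size_t i = 0; i < entries_.size(); ++i) {
            if (entries_[i].size == size && entries_[i].attrs == attrs) {
                *out = entries_[i];
                entries_.erase(entries_.begin() + static_cast<std::ptrdiff_t>(i));
                return true;
            }
        }
        return false;
    }

private:
    std::vector<CachedResource> entries_;
};
```

Matching on size alone would return the regular surface for the SVM request, which is the kind of wrong hit the change guards against.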
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#217 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.hpp#80 edit
EPR #397491 - Disable generic address space on 32-bit Windows too for now.
Back out revision 116 from //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#117 edit
EPR #397491 - According to HSA-Finalizer-ADD, for GPUVM32 private_segment_aperture_base_hi and group_segment_aperture_base_hi should be equal to the 32 bits of the 32 bit private and group segment flat address aperture.
Reviewed by: German
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#362 edit
EPR #010002 - Change OpenCL version number from 1804 to 1805.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1551 edit
ECR #333756 - HSA Finalizer: Make sure the size of the kernarg segment and the alignment of the kernarg, private and group segments are multiples of 16. Update ORCA runtime assert. [ OpenCL integration of CL 1151953]
Change by Nikolay Haustov
Testing: http://ocltc:8111/viewModification.html?modId=51851&personal=true&init=1&tab=vcsModificationBuilds
Also fix a problem this uncovered in a test.
Testing: pre-checkin
Reviewed by: German
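As a small worked example of the multiple-of-16 requirement, a C++ sketch of rounding a size up to an alignment. alignUp and isMultipleOf16 are hypothetical helpers, not the finalizer's or runtime's code.

```cpp
#include <cassert>
#include <cstddef>

// Rounds value up to the next multiple of alignment.
// The alignment must be a non-zero power of two.
inline std::size_t alignUp(std::size_t value, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

// Hypothetical check of the kind a runtime assert might perform.
inline bool isMultipleOf16(std::size_t n) { return (n % 16) == 0; }
```

For instance, a 40-byte kernarg segment would be padded to 48 bytes under this rule.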
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/sc/HSAIL/hsail-fin/HSAILFinalizer.cpp#16 integrate
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/gpu/sc/HSAIL/tests/src/finalizer/features/structural_analysis/short_circuit/short_circuit06.hsail#4 integrate
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#284 edit
EPR #010002 - Change OpenCL version number from 1803 to 1804.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1550 edit
EPR #403782 - IOMMU2/SVM
- Disable DX interop on SVM. This is a feature for SVM and may need more work.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/7555/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#310 edit
EPR #010002 - Change OpenCL version number from 1802 to 1803.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1549 edit
ECR #304775 - Mipmaps support
- Following CL#1151650. Change the comparison condition to 1.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#216 edit
ECR #304775 - Mipmaps support
- Enable miplevel flag even for the first mip level when runtime creates a view. Otherwise GSL may change the pitch alignment for the created view.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#215 edit
EPR #418590 - Fix improper check for temp path.
- Existing code will change temp path to "." if tempPath is "C:\Windows\Temp\"
- Need to make sure temp path will be changed to "." if tempPath is "C:\Windows\" or "C:\Windows"
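A C++ sketch of one possible reading of the two notes above, assuming the intent is that only the Windows directory itself, with or without a trailing separator, falls back to "." while a subdirectory such as "C:\Windows\Temp\" is left alone. normalizePath and chooseTempPath are hypothetical names; the real check in os_win32.cpp is not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: strips trailing separators and lower-cases the path
// so that "C:\Windows" and "C:\WINDOWS\" compare equal.
static std::string normalizePath(std::string p) {
    while (!p.empty() && (p.back() == '\\' || p.back() == '/')) p.pop_back();
    std::transform(p.begin(), p.end(), p.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
    return p;
}

// Returns "." when tempPath names the Windows directory itself; otherwise
// returns tempPath unchanged. An exact comparison (not a prefix match)
// keeps subdirectories of the Windows directory usable.
std::string chooseTempPath(const std::string& tempPath,
                           const std::string& windowsDir) {
    assert(!windowsDir.empty());
    return normalizePath(tempPath) == normalizePath(windowsDir)
               ? std::string(".")
               : tempPath;
}
```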
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/os/os_win32.cpp#44 edit
EPR #397491 - Disable generic address space only on 32 bit Linux
Reviewed by: German
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#116 edit
EPR #412821 - Fix a crash when the ThreadTrace Object is freed before a ThreadTrace command is processed.
Also change 'None' to 'Undefined', since 'None' is a macro defined in X.h.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/platform/command.hpp#77 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/threadtrace.hpp#5 edit
EPR #010002 - Change OpenCL version number from 1801 to 1802.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1548 edit
ECR #333753 - Compiler Lib/RT: libutils.h usage removal due to non-API interface
Utils are to be used only by Compiler Lib itself.
Testing: pre checkin
Reviewers: German Andryeyev, Brian Sumner, Yaxun Liu
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.cpp#8 edit
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/v0_8/libUtils.h#17 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#178 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#193 edit
EPR #010002 - Change OpenCL version number from 1800 to 1801.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1547 edit
EPR #419351 - clEnqueueNDRange crashes if the app doesn't create a device queue but uses device enqueue in the kernel
- Add a check for defQueue being NULL in case the app didn't create one.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#510 edit