SWDEV-108024 - [ROCm CQE] Printf broken on ROCm, LC and HSAIL path
- Update GSL and PAL backends to reflect a change in ROCM (CL#1344871)
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.cpp#43 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palprintf.cpp#4 edit
SWDEV-102417 - Forum [205433] : Memory leak with printf statement inside kernel code
A memory leak can occur if a printf statement appears in the .cl source code but is not reachable from the __kernel code (e.g., a function in the .cl code that uses printf but is never called by the __kernel). In this case the compiler generates the printf metadata, but printf is not used by the __kernel (i.e., the printf buffer is empty).
To fix this issue, release the transfer buffer object before returning false in the PrintfDbgHSA::output function.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/11394/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.cpp#42 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palprintf.cpp#3 edit
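The SWDEV-102417 fix above can be pictured with a minimal sketch. It is not the runtime code: TransferPool and outputSketch are hypothetical stand-ins, and the empty-buffer early return is an assumption drawn from the description. The point is only that the early return releases what was acquired.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the runtime's transfer-buffer bookkeeping.
struct TransferPool {
    int outstanding = 0;  // buffers acquired but not yet released
    void acquire() { ++outstanding; }
    void release() {
        assert(outstanding > 0);
        --outstanding;
    }
};

// Hypothetical analogue of an output() routine: acquires a transfer buffer,
// returns false when the printf buffer is empty, true otherwise.
bool outputSketch(TransferPool& pool, std::size_t bytesWritten) {
    pool.acquire();
    if (bytesWritten == 0) {
        pool.release();  // release before the early return; omitting this leaks
        return false;
    }
    // ... copying and decoding of the printf data would happen here ...
    pool.release();
    return true;
}
```

Calling outputSketch with bytesWritten == 0 leaves pool.outstanding at 0, which is the property the fix restores.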
SWDEV-80874 - Fixed staging buffer overflow with HSA printf
The staging buffer is ~2 times smaller than the allocated printf buffer, so if the amount of data in the printf buffer exceeds the size of the staging buffer,
we hit an assertion in the memory copy. Printing 2 integers with 64K work-items is enough to hit the assertion.
Added a loop to read the printf buffer into the staging buffer in portions.
Testing: smoke, precheckin, conformance printf with HSAIL forced, custom tests
Reviewed by German Andreev
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.cpp#41 edit
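The portioned read described for the staging buffer overflow fix above might look roughly like the following sketch. The name readInPortions and the use of std::vector for both buffers are assumptions made for illustration; the real code copies from device memory.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Reads `src` through a smaller staging buffer, one portion at a time,
// and returns everything that was read. Sizes are in DWORDs.
std::vector<std::uint32_t> readInPortions(const std::vector<std::uint32_t>& src,
                                          std::size_t stagingDwords) {
    assert(stagingDwords > 0);
    std::vector<std::uint32_t> staging(stagingDwords);
    std::vector<std::uint32_t> out;
    out.reserve(src.size());
    for (std::size_t offset = 0; offset < src.size(); offset += stagingDwords) {
        // Never copy more than the staging buffer can hold.
        const std::size_t count = std::min(stagingDwords, src.size() - offset);
        std::memcpy(staging.data(), src.data() + offset,
                    count * sizeof(std::uint32_t));
        out.insert(out.end(), staging.begin(),
                   staging.begin() + static_cast<std::ptrdiff_t>(count));
    }
    return out;
}
```

With a 10-DWORD source and a 4-DWORD staging buffer the loop runs three times, copying 4, 4 and 2 DWORDs.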
SWDEV-80874 - Fixed ORCA RT HSA printf buffer indexing issues
The format of the buffer is: printf_id, <arg1>, <arg2>, ...
The RT did not advance the index past the printf_id field, so for example for a format string "%d" we were printing printf_id instead of the actual argument for every other string.
The other issue is that outputDbgBuffer adjusts its last argument (idx) by the number of consumed DWORD values,
but PrintfDbgHSA::output() also adjusts dbgBufferPtr, so the adjustment was done twice: only half of the actual data was printed, followed by zeroes from the buffer.
The resolution for both is to always pass 1 as the index to outputDbgBuffer(): 1 because index 0 is printf_id.
Testing: smoke, precheckin, conformance printf with HSAIL forced, custom tests
Reviewed by Brian Sumner and German Andreev
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.cpp#40 edit
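To illustrate the buffer indexing fix above, here is a simplified sketch of walking records laid out as printf_id, <arg1>, <arg2>, ... The helper names recordArgs and collectArgs are hypothetical, and the sketch assumes every record carries the same number of DWORD-sized arguments, which the real buffer does not guarantee.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Returns the arguments of one record. Index 0 holds printf_id,
// so reading always starts at index 1.
std::vector<std::uint32_t> recordArgs(const std::uint32_t* record,
                                      std::size_t argCount) {
    assert(record != nullptr);
    constexpr std::size_t kFirstArg = 1;
    return std::vector<std::uint32_t>(record + kFirstArg,
                                      record + kFirstArg + argCount);
}

// Walks consecutive records. Only this loop advances the position, so the
// adjustment happens once per record rather than twice.
std::vector<std::uint32_t> collectArgs(const std::vector<std::uint32_t>& buffer,
                                       std::size_t argCount) {
    std::vector<std::uint32_t> all;
    const std::size_t stride = 1 + argCount;  // printf_id plus its arguments
    for (std::size_t pos = 0; pos + stride <= buffer.size(); pos += stride) {
        const auto args = recordArgs(buffer.data() + pos, argCount);
        all.insert(all.end(), args.begin(), args.end());
    }
    return all;
}
```

For a "%d" format with printf_id 7, the buffer {7, 10, 7, 20, 7, 30} yields the arguments 10, 20 and 30, and printf_id is never printed.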
SWDEV-78024 - SYCL - Issue with printf when printing a string without format specifier - removed the condition to expand printf only if it has more than one argument.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/llvm32/lib/Target/AMDIL/AMDILPrintfConvert.cpp#2 edit
... //depot/stg/opencl/drivers/opencl/compiler/llvm32/test/CodeGen/AMDIL/printf_without_format_specifier.ll#1 add
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.cpp#39 edit
EPR #420344 - Forum [180211]: enqueueNDRangeKernel crashes to execute device binary if it contains printf statements
This is a temporary workaround to avoid an app crash when a kernel has printf but the program object is built from a binary (i.e., the printf info is not propagated if the program object is built from a binary).
ReviewBoardURL = http://ocltc.amd.com/reviews/r/7676/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.cpp#36 edit