EPR #010002 - Change OpenCL version number from 1600 to 1601.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1347 edit
ECR #333753 - linker: rely on builtins library triple instead of elf binary
This is the first of several changes aimed at unifying the
offline linker (llvm-link) with the online linker in the compiler
library. The online linker is considered state-of-the-art, and its
code needs to be made available to the offline linker.
This change teaches the online linker to determine the target by
examining the target triple on the builtins library modules
instead of checking the ELF binary target. The assumption is that
the builtins library always matches the actual target, as
confirmed in CL 1041226. This removes one dependency on compiler
library functions so that the affected code can eventually be
moved to llvm/lib and shared with the offline linker.
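A minimal sketch of the idea, not the code in linker.cpp: the target is derived only from the triples carried by the builtins library modules. ModuleInfo, archOf, targetFromBuiltins and the module names are hypothetical stand-ins; the real linker works on LLVM modules.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an LLVM module; only the triple matters here.
struct ModuleInfo {
    std::string name;
    std::string targetTriple;  // e.g. "amdgcn--amdhsa"
};

// Returns the architecture component of a triple (text before the first '-').
// If the triple has no '-', the whole string is returned.
std::string archOf(const std::string& triple) {
    return triple.substr(0, triple.find('-'));
}

// Derives the link target from the builtins library modules alone.
// Returns an empty string if there are no modules or if they disagree.
std::string targetFromBuiltins(const std::vector<ModuleInfo>& builtins) {
    if (builtins.empty()) {
        return std::string();
    }
    const std::string arch = archOf(builtins.front().targetTriple);
    for (const ModuleInfo& module : builtins) {
        if (archOf(module.targetTriple) != arch) {
            return std::string();  // mismatched libraries: refuse to guess
        }
    }
    return arch;
}
```

Because the function needs nothing beyond the modules themselves, a helper of this shape has no dependency on the ELF binary, which is the property the change is after.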
The change passes smoke_clang (Orca build), smoke (HSA build) and TeamCity pre-checkin.
Reviewed by Brian Sumner, Yaxun Liu, Stanislav Mekhanoshin
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/backends/common/linker.cpp#107 edit
[ROCm/clr commit: 4396288d55]
ECR #392041 - Implement high performance state on Linux
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuappprofile.cpp#7 edit
[ROCm/clr commit: 7be05e924e]
EPR #010002 - Change OpenCL version number from 1599 to 1600.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1346 edit
[ROCm/clr commit: e5523be947]
ECR #304775 - Use accelerated copy path for read/writeRect if the host memory has offsets. This avoids re-pinning the memory, giving nearly a 100% perf boost for such copies.
ReviewBoardURL = http://ocltc.amd.com/reviews/r/5371/diff/
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#328 edit
[ROCm/clr commit: 0758f1e95b]
EPR #010002 - Change OpenCL version number from 1598 to 1599.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1345 edit
[ROCm/clr commit: d42ad806ad]
EPR #010002 - Change OpenCL version number from 1597 to 1598.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1344 edit
[ROCm/clr commit: 0ac5d305af]
EPR #010002 - Change OpenCL version number from 1596 to 1597.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1343 edit
[ROCm/clr commit: d12b4d2364]
EPR #399808 - Fix the value of HSA image channel order for CL_RGB
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#188 edit
[ROCm/clr commit: 5f93384dbc]
EPR #010002 - Change OpenCL version number from 1595 to 1596.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1342 edit
[ROCm/clr commit: 8431455a87]
ECR #304775 - Bug 10112 - Raise default unroll threshold. The current default is 100, which is even lower than the LLVM default of 150. Raising it to 200 is a modest increase, and it should probably go even higher.
Affected files ...
... //depot/stg/opencl/drivers/opencl/compiler/lib/utils/OPTIONS.def#111 edit
[ROCm/clr commit: 2c5424663c]
EPR #402935 - Reset Resource::pinOffset_ if gslResource couldn't be created for pinned memory.
When the pinned memory to be created is too large, the gslResource cannot be created and local memory is created instead. If pinOffset_ is not reset in this case, the stale offset breaks later copies of the local memory.
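The fallback can be sketched as follows. This is an illustration under assumptions, not the code in gpuresource.cpp: the Resource struct is a simplified stand-in, and the size limit, the page size, and the reading of pinOffset_ as the host pointer's offset within its page are all hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Assumed values, for illustration only.
constexpr std::size_t kPageSize = 4096;
constexpr std::size_t kMaxPinnedBytes = 1u << 20;

// Hypothetical, simplified stand-in for the runtime's Resource class.
struct Resource {
    std::size_t pinOffset_ = 0;  // assumed: offset of the host pointer within its page
    bool pinned_ = false;        // true if the pinned resource was created

    // Stand-in for creating a gslResource over pinned host memory;
    // fails when the request is too large.
    bool createPinned(std::size_t bytes) const {
        return bytes <= kMaxPinnedBytes;
    }

    void create(std::size_t hostAddress, std::size_t bytes) {
        pinOffset_ = hostAddress % kPageSize;
        pinned_ = createPinned(bytes + pinOffset_);
        if (!pinned_) {
            // Falling back to local memory: the pinning offset no longer
            // applies, so clear it before any copy uses it.
            pinOffset_ = 0;
        }
    }
};
```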
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#187 edit
[ROCm/clr commit: 1e0a5f64f5]
EPR #010002 - Change OpenCL version number from 1594 to 1595.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1341 edit
[ROCm/clr commit: a7c60aeaed]
ECR #304775 - Device enqueuing
- Use atomic fetch for enqueue flags
- Switch to a multithreaded scheduler
- Add a workaround for Linux host_multi_queue failures. Linux has only 2 queues, but the test allocates multiple host queues, so the same HW ring can end up being shared
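As a rough host-side illustration of the first bullet, the sketch below claims a flag with a single atomic read-modify-write, so that with several scheduler threads only one of them sees the bit set. The flag name and the claimPending helper are hypothetical; the actual change is in the device scheduler and may look quite different.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical flag bit marking an enqueue slot as waiting to be scheduled.
constexpr std::uint32_t kEnqueuePending = 1u << 0;

// Atomically clears the pending bit and reports whether this caller was
// the one that cleared it. fetch_and returns the value held before the
// update, so exactly one concurrent caller observes the bit as set.
bool claimPending(std::atomic<std::uint32_t>& flags) {
    const std::uint32_t previous =
        flags.fetch_and(~kEnqueuePending, std::memory_order_acq_rel);
    return (previous & kEnqueuePending) != 0;
}
```

A separate load followed by a store would leave a window in which two threads both read the bit as set; the single fetch operation closes that window.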
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#106 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#449 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#127 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuschedcl.cpp#22 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#325 edit
[ROCm/clr commit: d2b905f18e]
EPR #010002 - Change OpenCL version number from 1593 to 1594.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/utils/versions.hpp#1340 edit
[ROCm/clr commit: 1e8c506c75]
ECR #304775 - Device enqueuing
- Add an L2 cache flush after the scheduler execution. Although the CP is expected to work through the L2 cache, some functionality appears to rely on direct memory access; without an explicit L2 flush, the CP can pick up stale values in the template.
Affected files ...
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#324 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.cpp#61 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.h#35 edit
[ROCm/clr commit: 4599bd0d4a]