1410f002f3c8c194cc5f2b950e5ea96ffbc80b15
Use a barrier packet for every profile marker that gets submitted,
and use its completion signal to get the GPU timestamp. This gives the
most accurate dispatch time. Combine the cache flush with the profile
marker if there is a pending dispatch that needs a cache flush. This
optimization saves an extra barrier and improves wall time.
Change-Id: Ib62d6d7aabf4743827b561be6c9c5afa813203da
[ROCm/clr commit: 59c6cb0268]
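
The packet-saving idea described above can be illustrated with a small standalone model. This is only a sketch under stated assumptions: `ModelQueue`, `Packet`, `profileMarker` and `profileMarkerUnfused` are hypothetical names invented for illustration, and nothing here is the ROCm/clr or HSA runtime API. It only counts the packets that would be queued in each scheme.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model for illustration only; not the ROCm/clr or HSA API.
enum class PacketKind { Dispatch, Barrier };

struct Packet {
  PacketKind kind;
  bool flushesCache;  // packet also performs the pending cache flush
  bool hasSignal;     // packet carries a completion signal to read a GPU timestamp from
};

class ModelQueue {
 public:
  // Queue a kernel dispatch and remember that its cache flush is still pending.
  void dispatch() {
    packets_.push_back({PacketKind::Dispatch, false, false});
    pendingFlush_ = true;
  }

  // Queue a profile marker as one barrier packet with a completion signal.
  // If a dispatch still needs a cache flush, fold the flush into this barrier.
  void profileMarker() {
    packets_.push_back({PacketKind::Barrier, pendingFlush_, true});
    pendingFlush_ = false;
  }

  // Comparison case: a separate flush barrier followed by the marker barrier.
  void profileMarkerUnfused() {
    if (pendingFlush_) {
      packets_.push_back({PacketKind::Barrier, true, false});
      pendingFlush_ = false;
    }
    packets_.push_back({PacketKind::Barrier, false, true});
  }

  // Number of barrier packets queued so far.
  std::size_t barrierCount() const {
    std::size_t n = 0;
    for (const Packet& p : packets_) {
      if (p.kind == PacketKind::Barrier) ++n;
    }
    return n;
  }

  const std::vector<Packet>& packets() const { return packets_; }

 private:
  std::vector<Packet> packets_;
  bool pendingFlush_ = false;
};
```

In this model, a dispatch followed by `profileMarker()` queues a single barrier that both flushes and signals, while the same sequence with `profileMarkerUnfused()` queues two barriers.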