When CL-GL interop is used, a GL context can be used by two or more threads at the same time, which causes a race condition.
Solution:
Add a lock around GL function calls made during CL-GL interop.
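
A minimal sketch of the locking idea, not the actual runtime change. `FakeGlContext`, `AcquireGlObject` and `RunWorkers` are hypothetical names invented for illustration, and a plain counter stands in for the real GL calls. The only point shown is that every access to the shared context happens while the mutex is held.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a shared GL context; not a real GL or runtime type.
struct FakeGlContext {
  std::mutex lock;            // serializes every access to the context
  int bound_thread = -1;      // worker that currently holds the context, -1 if none
  long calls = 0;             // number of completed "GL" accesses
  bool overlap_seen = false;  // set if two workers were ever inside at once
};

// Hypothetical helper: performs one interop step while holding the lock.
void AcquireGlObject(FakeGlContext& ctx, int worker_id) {
  assert(worker_id >= 0);
  std::lock_guard<std::mutex> guard(ctx.lock);
  if (ctx.bound_thread != -1) ctx.overlap_seen = true;
  ctx.bound_thread = worker_id;  // stands in for making the context current
  ++ctx.calls;                   // stands in for the GL calls themselves
  ctx.bound_thread = -1;         // stands in for releasing the context
}

// Drives the helper from several threads and returns the total call count.
long RunWorkers(FakeGlContext& ctx, int num_threads, int iterations) {
  std::vector<std::thread> workers;
  for (int t = 0; t < num_threads; ++t) {
    workers.emplace_back([&ctx, t, iterations] {
      for (int i = 0; i < iterations; ++i) AcquireGlObject(ctx, t);
    });
  }
  for (auto& w : workers) w.join();
  return ctx.calls;
}
```

Without the `std::lock_guard`, the increments and the `bound_thread` updates would race in the same way the unguarded GL calls do.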
Change-Id: I3a34da3cbdf74c401111cc4e3a04ad84cc52709e
- Create a vector to allow multiple TS to be stored in a Command.
- This means we don't have to wait for the entire batch in the Accumulate
command to finish when we exhaust signals.
- Reduce the number of signals created at init to 64. This minimum
may still need to be tuned, but the KFD allows a maximum of 4094
interrupt signals per device.
- Store kernel names whenever they are available, not just when
profiling. If profiling is enabled dynamically, as Torch does, a crash
can happen if hipGraphInstantiate wasn't included in the Torch profile
scope, because we previously recorded kernel names only when a profiler
was attached.
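
A rough sketch of the data-structure side of the items above, assuming TS refers to timestamp records. `Timestamp`, `AccumulateCommand` and every member name are hypothetical and do not mirror the runtime's actual classes. The command keeps a vector of timestamps instead of a single one, and kernel names are recorded unconditionally so they are present even if profiling is switched on later.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical timestamp record; the real runtime type is not shown here.
struct Timestamp {
  std::uint64_t start;
  std::uint64_t end;
};

// Hypothetical command that owns several timestamps rather than exactly one.
class AccumulateCommand {
 public:
  void AddTimestamp(const Timestamp& ts) {
    assert(ts.end >= ts.start);
    timestamps_.push_back(ts);
  }

  // Recorded unconditionally, whether or not a profiler is attached.
  void AddKernelName(std::string name) {
    kernel_names_.push_back(std::move(name));
  }

  // Span from the earliest start to the latest end; 0 if nothing is recorded.
  std::uint64_t ElapsedTicks() const {
    if (timestamps_.empty()) return 0;
    std::uint64_t first = timestamps_.front().start;
    std::uint64_t last = timestamps_.front().end;
    for (const Timestamp& ts : timestamps_) {
      first = std::min(first, ts.start);
      last = std::max(last, ts.end);
    }
    return last - first;
  }

  std::size_t timestamp_count() const { return timestamps_.size(); }
  const std::vector<std::string>& kernel_names() const { return kernel_names_; }

 private:
  std::vector<Timestamp> timestamps_;
  std::vector<std::string> kernel_names_;
};
```

The signal pool and the wait logic are deliberately left out; the sketch only shows that a single command can carry more than one timestamp and always has its kernel names available.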
Change-Id: I34e7881a25bbc763f82fdeb3408a8ea58e1ec006