9db52f9a46
- Added the optimized multi stream path in graph execution. It uses a fixed number of async streams in the execution - Optimize the launch latency, where commands creation and execution is done at the same time - Optimize the scheduling to use less barriers and waiting signals if the same queue can be detected - The new path is controlled by DEBUG_HIP_FORCE_GRAPH_QUEUES environment variable, where 0 will use the original path and any other value will force the number of asynchronous queues for execution - DEBUG_HIP_FORCE_ASYNC_QUEUE can force single queue async execution in graphs(applicable for Navi families only) Change-Id: I7eb40bc15c45f508d6911868a6f6d4c3598d380e