Saleel Kudchadker f3aedfbec0 SWDEV-301667 - Create TS for each node recorded in graph
- Create a vector to allow multiple TS to be stored in Command.
- This means we don't wait for the entire batch in the Accumulate command
to finish when we exhaust signals.
- Reduce the number of signals created at init to 64. This minimum value
may still need to be tuned, but the KFD allows a maximum of 4094 interrupt
signals per device.
- Store kernel names whenever they are available, not just when
profiling. If profiling is enabled dynamically, as with Torch, a crash
can happen if hipGraphInstantiate wasn't included in the Torch profile
scope, because we previously stored kernel names only when a profiler
was attached.
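
A minimal sketch of the idea in the first and last bullets, not the actual ROCclr code. It assumes "TS" means a profiling timestamp. The names `Timestamp`, `AccumulateCommand`, `addNode`, and `span` are hypothetical and chosen only for illustration: the command keeps one timestamp per graph node in a vector and always stores the kernel name, whether or not a profiler is attached.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a per-node profiling timestamp.
struct Timestamp {
  std::uint64_t start;
  std::uint64_t end;
};

// Hypothetical command that holds one timestamp per recorded graph node.
class AccumulateCommand {
 public:
  // Record a node. The kernel name is stored unconditionally, so enabling
  // profiling later does not find missing names.
  void addNode(const std::string& kernelName, Timestamp ts) {
    kernelNames_.push_back(kernelName);
    timestamps_.push_back(ts);
  }

  std::size_t nodeCount() const { return timestamps_.size(); }

  const std::vector<std::string>& kernelNames() const { return kernelNames_; }

  // Earliest start and latest end across all recorded nodes.
  // Returns {0, 0} when no node has been recorded.
  Timestamp span() const {
    if (timestamps_.empty()) {
      return Timestamp{0, 0};
    }
    Timestamp result = timestamps_.front();
    for (const Timestamp& ts : timestamps_) {
      result.start = std::min(result.start, ts.start);
      result.end = std::max(result.end, ts.end);
    }
    return result;
  }

 private:
  std::vector<Timestamp> timestamps_;     // one entry per graph node
  std::vector<std::string> kernelNames_;  // parallel to timestamps_
};
```

Because each node has its own entry, a caller could in principle release or inspect a single node's timestamp without waiting for the whole batch, which is the behavior the second bullet describes.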

Change-Id: I34e7881a25bbc763f82fdeb3408a8ea58e1ec006


[ROCm/clr commit: c157bfb202]
2024-03-26 14:47:24 -04:00
Description
No description provided
282 MiB
Languages
C++ 67.5%
C 20.6%
Python 6.6%
CMake 3.4%
Shell 0.6%
Other 1.1%