[Docs][rocprofv3]Add Consecutive Kernels Parameter Description to Docs (#1111)
* Add consecutive kernels parameter description
* Remove space
* Update docs and CHANGELOG
@@ -195,6 +195,7 @@ Full documentation for ROCprofiler-SDK is available at [rocm.docs.amd.com/projec
- Perfetto support for scratch memory.
- Support in the `rocprofv3` avail tool for command-line arguments.
- Documentation for `rocprofv3` advanced options.
- Support for multi-dispatch ATT files.

### Changed

@@ -105,6 +105,14 @@ The following table lists the parameters relevant to thread tracing:
| att-gpu-index            | Integer |         |           | Comma-separated list of integers. If enabled, only the GPU   |
|                          | (List)  |         |           | indexes in the list will be profiled by thread trace.        |
+--------------------------+---------+---------+-----------+--------------------------------------------------------------+
| att-consecutive-kernels  | Integer | >=0     |           | Starting at the targeted kernel, enables thread trace for    |
|                          |         |         |           | the next N kernel dispatches, which share a single ATT file, |
|                          |         |         |           | stats.csv, and UI directory. See --kernel-include-regex and  |
|                          |         |         |           | --kernel-iteration-range. If another targeted kernel is      |
|                          |         |         |           | dispatched within that window, the count of N dispatches     |
|                          |         |         |           | restarts from 0. Using --att-gpu-index is recommended        |
|                          |         |         |           | because thread trace is otherwise enabled for all GPUs.      |
+--------------------------+---------+---------+-----------+--------------------------------------------------------------+

For AMD Instinct accelerators, enable perfmon streaming using:

@@ -145,6 +153,18 @@ By default, ``rocprofv3`` enables thread trace only once per kernel instance. Th
To enable thread trace for multiple kernel instances, use the ``kernel-iteration-range`` parameter.
It's recommended to use the ``kernel-include-regex`` parameter to filter the desired kernel names instead of tracing everything.

Typically, each kernel profile has its own ATT output file.
To combine multiple kernel profiles into a single output file, use the ``att-consecutive-kernels`` parameter.
With this parameter set, ``rocprofv3`` begins profiling when it encounters a targeted kernel.
It then continues profiling subsequent kernels until a total of ``n`` kernels have been profiled, including the initial targeted kernel,
where ``n`` is the non-negative integer passed to ``att-consecutive-kernels``.
The kernels that follow the initial targeted kernel do not themselves have to be targeted.
If one of them is a targeted kernel, the profiler profiles another ``n - 1`` kernels after it,
so a generated ATT file can contain more than ``n`` profiled kernels.
All the profiled kernels are then combined into a single ATT file.
If a new targeted kernel is encountered after ``rocprofv3`` has finished profiling a batch of kernels,
the profiler restarts profiling at that kernel and creates another ATT file with multiple kernels.
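
The batching rule described above can be modeled with a short Python sketch. This is a simulation of the documented behavior, not code from ``rocprofv3``: the function name ``att_batches``, the use of a regex match to decide which kernels are targeted, and the restriction to ``n >= 1`` are all assumptions made for illustration.

```python
import re


def att_batches(dispatches, include_regex, n):
    """Group dispatched kernel names into batches, one batch per ATT file.

    Hypothetical model of the documented rule: a targeted kernel opens
    (or re-arms) a window of n dispatches, counting the targeted kernel.
    """
    if n < 1:
        # Assumption: the behavior for n = 0 is not modeled in this sketch.
        raise ValueError("this sketch only models n >= 1")
    targeted = re.compile(include_regex)
    batches, current, remaining = [], [], 0
    for name in dispatches:
        if targeted.search(name):
            remaining = n  # a targeted kernel restarts the count
        if remaining == 0:
            continue  # outside any window: this dispatch is not profiled
        current.append(name)
        remaining -= 1
        if remaining == 0:
            batches.append(current)  # window closed: one ATT file
            current = []
    if current:
        batches.append(current)  # the trace ended inside an open window
    return batches


trace = ["gemm", "relu", "gemm", "relu", "copy", "norm", "gemm"]
for batch in att_batches(trace, r"^gemm$", n=3):
    print(batch)
```

With ``n=3``, the second ``gemm`` arrives while the first window is still open, so the first batch holds five kernels rather than three. ``norm`` falls outside any window and is skipped, and the final ``gemm`` starts a second batch.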

.. _output-files:

rocprofv3 output files