diff --git a/projects/rocprofiler-compute/docs/data/pc_sampling/pc_sampling_host_trap_single_kernel.png b/projects/rocprofiler-compute/docs/data/pc_sampling/pc_sampling_host_trap_single_kernel.png
new file mode 100644
index 0000000000..d39158938c
Binary files /dev/null and b/projects/rocprofiler-compute/docs/data/pc_sampling/pc_sampling_host_trap_single_kernel.png differ
diff --git a/projects/rocprofiler-compute/docs/data/pc_sampling/pc_sampling_no_kernel_filtering.png b/projects/rocprofiler-compute/docs/data/pc_sampling/pc_sampling_no_kernel_filtering.png
new file mode 100644
index 0000000000..50b5b5f253
Binary files /dev/null and b/projects/rocprofiler-compute/docs/data/pc_sampling/pc_sampling_no_kernel_filtering.png differ
diff --git a/projects/rocprofiler-compute/docs/data/pc_sampling/pc_sampling_sort_by_count.png b/projects/rocprofiler-compute/docs/data/pc_sampling/pc_sampling_sort_by_count.png
new file mode 100644
index 0000000000..b7e6cdf3c6
Binary files /dev/null and b/projects/rocprofiler-compute/docs/data/pc_sampling/pc_sampling_sort_by_count.png differ
diff --git a/projects/rocprofiler-compute/docs/data/pc_sampling/pc_sampling_stochastic_single_kernel.png b/projects/rocprofiler-compute/docs/data/pc_sampling/pc_sampling_stochastic_single_kernel.png
new file mode 100644
index 0000000000..09b9d631b5
Binary files /dev/null and b/projects/rocprofiler-compute/docs/data/pc_sampling/pc_sampling_stochastic_single_kernel.png differ
diff --git a/projects/rocprofiler-compute/docs/how-to/pc_sampling.rst b/projects/rocprofiler-compute/docs/how-to/pc_sampling.rst
index 1c6e4acd19..64aff4ae2a 100644
--- a/projects/rocprofiler-compute/docs/how-to/pc_sampling.rst
+++ b/projects/rocprofiler-compute/docs/how-to/pc_sampling.rst
@@ -20,10 +20,10 @@ Profiling options
 ---------------------
 For using profiling options for PC sampling the configuration needed are:
 
-* ``--pc-sampling-method``: Should be either ``stochastic`` or ``host_trap``
-* ``--pc-sampling-interval``: For stochastic sampling, the interval is in cycles. The finest granularity is 1 cycle. For host_trap sampling, the interval is in microsecond (DEFAULT: 1048576). The interval should be the power of 2. You are recommended to try to starting from 1048576, and lowering until reaching 65536.
+* ``--pc-sampling-method``: Should be either ``stochastic`` or ``host_trap``, (DEFAULT: stochastic)
+* ``--pc-sampling-interval``: For stochastic sampling, the interval is in cycles. The finest granularity is 1 cycle. For ``host_trap`` sampling, the interval is in microsecond (DEFAULT: 1048576). The interval should be the power of 2. You are recommended try starting from 1048576, and lowering until reaching 65536.
 
-**Sample command:**
+**Sample command:**
 
 .. code-block:: shell
 
@@ -42,8 +42,33 @@ For using analysis options for PC sampling the configuration needed are:
 
    $ rocprof-compute analyze -p workloads/pc_test/MI300A_A1/ -b 21 -k 0 --pc-sampling-sorting-type offset
 
+**Sample output:**
+
+Selecting single kernel host trap PC sampling:
+
+.. image:: ../data/pc_sampling/pc_sampling_host_trap_single_kernel.png
+   :align: left
+   :alt: Host trap PC sampling snapshot
+
+Selecting single kernel stochastic PC sampling:
+
+.. image:: ../data/pc_sampling/pc_sampling_stochastic_single_kernel.png
+   :align: left
+   :alt: Stochastic PC sampling snapshot
+
+If you don't filter by kernel, the output will fall back to the original data from ``rocprofv3`` csv output for all the kernels:
+
+.. image:: ../data/pc_sampling/pc_sampling_no_kernel_filtering.png
+   :align: left
+   :alt: Host trap PC sampling snapshot no_kernel_filtering
+
+Selecting single kernel sorting by PC count:
+
+.. image:: ../data/pc_sampling/pc_sampling_sort_by_count.png
+   :align: left
+   :alt: Host trap PC sampling sorting snapshot
+
 .. note::
 
    * PC sampling feature is currently in BETA version. To enable PC sampling, you have to explicitly enable it with block index 21.
-   * To associate PC sampling info back to HIP source code, you need to build the profiling target app with ``-g`` to keep the symbols. Otherwise, PC sampling info would be only associated with assembly lines.
-
+   * To associate PC sampling info back to HIP source code, you need to build the profiling target app with ``-g`` to keep the symbols. Otherwise, PC sampling info will be only associated with assembly lines.