vedithal-amd 354fe5f52c Unified configuration for metrics (#726)
* Show description of metrics during analysis
    * Use --include-cols Description to show the Description column in analyze mode (hidden by default)
    * Remove tips field from analysis config

* Align metric names in analysis config and documentation

* Add unified config utils/unified_config.yaml

* Add python script utils/split_config.py to auto generate analysis configuration and documentation metrics description
   * Add test case to ensure unified config is older than auto-generated config
   * Auto generate analysis config and documentation metrics description

* Update CONTRIBUTING.md to add instructions to build documentation assets
    * Add docker image and compose file to build documentation

* Update CHANGELOG and Documentation

* Use jinja template instead of hardcoding metric tables in documentation

[ROCm/rocprofiler-compute commit: bb44e90b2d]
2025-07-25 14:01:34 -04:00

.. meta::
   :description: ROCm Compute Profiler performance model: Local data share (LDS)
   :keywords: Omniperf, ROCm Compute Profiler, ROCm, profiler, tool, Instinct, accelerator, local, data, share, LDS

**********************
Local data share (LDS)
**********************

.. _lds-sol:

LDS Speed-of-Light
==================

.. warning::

   The theoretical maximum throughput for some metrics in this section is
   currently computed with the maximum achievable clock frequency, as reported
   by ``rocminfo``, for an accelerator. This may not be realistic for all
   workloads.

The :ref:`LDS <desc-lds>` speed-of-light chart shows a number of key metrics for
the LDS as a comparison with the peak achievable values of those metrics.

.. jinja:: lds-sol
   :file: _templates/metrics_table.j2

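
As a rough illustration of the caveat in the warning above, consider a metric
whose peak is assumed to scale linearly with clock frequency. The symbols here
are illustrative and are not taken from the tool: :math:`T` is the achieved
throughput, :math:`r` is a hypothetical peak rate per cycle,
:math:`f_{\max}` is the maximum clock reported by ``rocminfo``, and
:math:`f_{\text{run}} \le f_{\max}` is the clock the kernel actually ran at.
Under that assumption, the reported percent-of-peak is the utilization at the
actual clock scaled down by :math:`f_{\text{run}} / f_{\max}`:

.. math::

   \frac{T}{r \, f_{\max}} \times 100\% = \frac{f_{\text{run}}}{f_{\max}} \cdot \frac{T}{r \, f_{\text{run}}} \times 100\%
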
.. rubric:: Footnotes
.. [#lds-workload] Here we assume the typical case where the workload evenly distributes
   LDS operations over all SIMDs in a CU (that is, waves on different SIMDs are
   executing similar code). For highly unbalanced workloads, where, for example, one
   SIMD pair in the CU does not issue LDS instructions at all, this metric is
   better interpreted as the percentage of SIMDs issuing LDS instructions on
   :ref:`SIMD pairs <desc-lds>` that are actively using the LDS, averaged over
   the lifetime of the kernel.

.. [#lds-bank-conflict] The maximum value of the bank conflict rate is less than 100%
   (specifically, 96.875%), as the first cycle in the
   :ref:`LDS scheduler <desc-lds>` is never considered contended.

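
The 96.875% bound quoted in the bank conflict footnote above is consistent with
the following arithmetic. It assumes a worst case in which a single access
occupies 32 scheduler cycles, of which only the first is counted as
uncontended. The figure of 32 is inferred from the quoted value and is not
stated in this section:

.. math::

   \frac{32 - 1}{32} = \frac{31}{32} = 0.96875 = 96.875\%
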
.. _lds-stats:

Statistics
==========

The LDS statistics panel gives a more detailed view of the hardware:

.. jinja:: lds-stats
   :file: _templates/metrics_table.j2