Getting Started
.. toctree::
:glob:
:maxdepth: 4
Quickstart
- Launch & Profile the target application with the command line profiler
The command line profiler launches the target application, calls the rocProfiler API via the rocProf binary, and collects profile results for the specified kernels, dispatches, and/or hardware components. If no filters are specified, Omniperf defaults to collecting all available counters for all kernels/dispatches launched by the user's executable.
To collect the default set of data for all kernels in the target application, run, for example:
$ omniperf profile -n vcopy_data -- ./vcopy -n 1048576 -b 256
The app runs, each kernel is launched, and profiling results are generated. By default, results are written to a subdirectory named after your accelerator, e.g., ./workloads/vcopy_data/MI200/ (where the workload name is configurable via the `-n` argument).
To collect all requested profile information, Omniperf may need to replay kernels multiple times.
- Customize data collection
Options are available to specify the kernels and metrics for which data should be collected. Filtering can be applied at either the profiling or the analysis stage; however, filtering during profiling often shortens the overall profiling run time.
Some common filters include:
  - `-k`/`--kernel` enables filtering kernels by name.
  - `-d`/`--dispatch` enables filtering based on dispatch ID.
  - `-b`/`--block` collects metrics for only the specified (one or more) hardware component blocks.
To view available metrics by hardware block, use the `--list-metrics` argument:
$ omniperf analyze --list-metrics <sys_arch>
- Analyze at the command line
After generating a local output folder (e.g., ./workloads/vcopy_data/MI200), you can also use the command line tool to quickly interface with profiling results. View different metrics derived from your profiled results and get immediate access to all metrics organized by hardware blocks.
If no kernel, dispatch, or hardware block filters are applied at this stage, the analysis reflects the entirety of the profiling data.
To interact with profiling results from a different session, provide the workload path:
  - `-p`/`--path` enables users to analyze existing profiling data in the Omniperf CLI.
- Analyze in the Grafana GUI
For a more in-depth analysis of profiling results, we recommend the Omniperf Grafana GUI. To interact with profiling results there, users must import their data to the MongoDB instance included in the Omniperf Dockerfile.
To interact with the Grafana GUI data stored in the Omniperf DB, users can enter database mode. For example:
$ omniperf database --import [CONNECTION OPTIONS]
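The first three Quickstart steps can be sketched as a small dry-run script. It reuses the vcopy example from step 1. The kernel name `vecCopy`, dispatch ID `0`, block name `SQ`, and the `MI200` subdirectory are illustrative assumptions rather than values confirmed by this guide, and `run` is a hypothetical helper that prints each command and executes it only when `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Dry-run sketch of the Quickstart flow: profile with filters, then analyze.
set -eu

# Hypothetical helper: print the command; execute it only when DRY_RUN=0.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
  fi
}

workload="vcopy_data"   # workload name passed to -n
arch="MI200"            # assumed accelerator subdirectory; yours may differ
workload_dir="./workloads/$workload/$arch"

# Step 1: default collection for every kernel.
run omniperf profile -n "$workload" -- ./vcopy -n 1048576 -b 256

# Step 2: narrow collection at profile time (filter values are assumptions).
run omniperf profile -n "$workload" -k vecCopy -- ./vcopy -n 1048576 -b 256
run omniperf profile -n "$workload" -d 0 -- ./vcopy -n 1048576 -b 256
run omniperf profile -n "$workload" -b SQ -- ./vcopy -n 1048576 -b 256

# Step 3: analyze the stored results from the command line.
run omniperf analyze -p "$workload_dir"
```

Printing the commands first makes it easy to review the filters before committing to a profiling run, which may replay kernels several times.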
Usage
Modes
Modes change the fundamental behavior of the Omniperf command line tool. Depending on which mode is chosen, different command line options become available.
- Profile: The target application is launched on the local system using AMD’s ROC Profiler. Depending on the profiling options chosen, selected kernels, dispatches, and/or hardware components in the application are profiled and results are stored locally in an output folder (./workloads/<name>).
$ omniperf profile --help
- Analyze: Profiling data from the `-p`/`--path` directory is loaded into the Omniperf CLI analyzer, where users have immediate access to profiling results and generated metrics. Metrics are quickly generated from the entirety of your profiled application or a subset you’ve identified through the Omniperf CLI analysis filters.
To generate a lightweight GUI interface, users can add the `--gui` flag to their analysis command.
This mode is designed to be a middle ground to the highly detailed Omniperf Grafana GUI and is great for users who want immediate access to a hardware component they’re already familiar with.
$ omniperf analyze --help
- Database: Our detailed Grafana GUI is built on a MongoDB database. Use `--import` to import profiling results to the DB and interact with the workload in Grafana, or `--remove` to remove the workload from the DB.
Connection options will need to be specified. See the Grafana Analysis import section for more details.
$ omniperf database --help
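Because each mode exposes its own options, it can help to view the three help screens side by side. The loop below is a minimal sketch: it runs the `--help` commands listed above when `omniperf` is on the PATH and otherwise only prints what it would run. The `command -v` guard is an addition for safety, not part of Omniperf.

```shell
#!/bin/sh
# Show the help text for every mode, or the command that would do so.
modes="profile analyze database"

for mode in $modes; do
  if command -v omniperf >/dev/null 2>&1; then
    omniperf "$mode" --help
  else
    echo "would run: omniperf $mode --help"
  fi
done
```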
Global Options
The Omniperf command line tool has a set of 'global' options that are available across all modes.
| Argument | Description |
|---|---|
| -v / --version | Print Omniperf version and exit. |
| -V / --verbose | Increase output verbosity (use multiple times for higher levels). |
| -q / --quiet | Reduce output and run quietly. |
| -s / --specs | Print system specs and exit. |
Omniperf also recognizes the `OMNIPERF_COLOR` environment variable for users who want to disable colored output. To disable the default colored behavior, set this variable to `0`.
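As a minimal sketch, assuming `OMNIPERF_COLOR` is read from the process environment, colored output could be turned off for a shell session as follows. The `command -v` guard is only there so the snippet is harmless on machines without Omniperf installed.

```shell
#!/bin/sh
# Turn off colored Omniperf output for this shell session.
export OMNIPERF_COLOR=0

# Guard: only call omniperf when it is actually on PATH.
if command -v omniperf >/dev/null 2>&1; then
  omniperf --version
else
  echo "omniperf not found on PATH; OMNIPERF_COLOR=$OMNIPERF_COLOR is set for later runs"
fi
```

To affect a single invocation only, the variable can instead be prefixed to the command, e.g. `OMNIPERF_COLOR=0 omniperf --version`.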
Basic Operations
| Operation | Mode | Required Arguments |
|---|---|---|
| Profile a workload | profile | --name, -- <profile_cmd> |
| Standalone roofline analysis | profile | --name, --roof-only, -- <profile_cmd> |
| Import a workload to database | database | --import, --host, --username, --workload, --team |
| Remove a workload from database | database | --remove, --host, --username, --workload, --team |
| Launch standalone GUI from CLI | analyze | --path, --gui |
| Interact with profiling results from CLI | analyze | --path |
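To make the required arguments concrete, the sketch below assembles command lines for three rows of the table above: standalone roofline analysis, database import, and the standalone GUI. The host, username, and team values are hypothetical placeholders, and passing the workload directory to `--workload` is an assumption. The script only prints the commands and does not execute them.

```shell
#!/bin/sh
# Dry-run sketch: print command lines for three rows of the table above.
set -eu

name="vcopy_data"
workload_dir="./workloads/$name/MI200"   # assumed output path from a prior profile run
db_host="db.example.com"                 # hypothetical connection values
db_user="alice"
db_team="demo"

# Standalone roofline analysis (profile mode).
roofline_cmd="omniperf profile --name $name --roof-only -- ./vcopy -n 1048576 -b 256"

# Import a workload to the database (database mode); --workload value is assumed.
import_cmd="omniperf database --import --host $db_host --username $db_user --workload $workload_dir --team $db_team"

# Launch the standalone GUI from the CLI (analyze mode).
gui_cmd="omniperf analyze --path $workload_dir --gui"

printf '%s\n' "$roofline_cmd" "$import_cmd" "$gui_cmd"
```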