Signed-off-by: Peter Jun Park <peter.park@amd.com>
Signed-off-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
[ROCm/rocprofiler-compute commit: a0dc485ceb]
Getting Started
.. toctree::
   :glob:
   :maxdepth: 4
Quickstart
-
Launch & Profile the target application with the command line profiler
The command line profiler launches the target application, calls the rocProfiler API, and collects profile results for the specified kernels, dispatches, and/or IP blocks. If none are specified, Omniperf defaults to collecting all available counters for all kernels and dispatches launched by the user's executable.
To collect the default set of data for all kernels in the target application, run, for example:
$ omniperf profile -n vcopy_data -- ./vcopy 1048576 256
The app runs, each kernel is launched, and profiling results are generated. By default, results are written to, for example, ./workloads/vcopy_data (configurable via the -n argument). Collecting all requested profile information may require replaying kernels multiple times.
-
Customize data collection
Options are available to specify which kernels and metrics to collect data for. Filtering can be applied at either the profiling or the analysis stage; however, filtering during profiling often shortens the overall profiling run time.
Some common filters include:
-k/--kernel enables filtering kernels by name.
-d/--dispatch enables filtering based on dispatch ID.
-b/--ipblocks collects metrics for only the specified (one or more) IP Blocks.
To view a list of all available metrics organized by IP Block, use the --list-metrics argument:
$ omniperf analyze --list-metrics <sys_arch>
-
Analyze at the command line
After generating a local output folder (./workloads/<name>), the command line tool can also be used to quickly interface with profiling results. View different metrics derived from your profiled results and get immediate access to all metrics organized by IP block.
If no kernel, dispatch, or IP block filters are applied at this stage, the analysis reflects the entirety of the profiling data.
To interact with profiling results from a different session, provide the workload path.
-p/--path enables users to analyze existing profiling data in the Omniperf CLI.
-
Analyze in the Grafana GUI
For a more in-depth analysis of profiling results, we recommend the Omniperf Grafana GUI. To interact with profiling results there, users must import their data into the MongoDB instance included in the Omniperf Dockerfile.
To interact with Grafana GUI data stored in the Omniperf DB, users can enter database mode. For example:
$ omniperf database --import [CONNECTION OPTIONS]
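The command line steps of the quickstart above (profile, optionally filter during profiling, then analyze) can be sketched as a small dry-run script. This is only an illustration: the helper names `profile_cmd` and `analyze_cmd` and the kernel name `vecCopy` are hypothetical, and passing `./workloads/<name>` directly to `-p` is an assumption based on the output folder named above, so check the folder that profiling actually created. The script prints the commands rather than running them.

```shell
#!/bin/sh
# Dry-run sketch of the quickstart flow. It only prints command lines;
# copy a printed line into your terminal to run it for real.

# Print a profile command: $1 is the workload name, the remaining
# arguments are optional filters followed by "-- <application command>".
profile_cmd() {
    name=$1
    shift
    echo "omniperf profile -n $name $*"
}

# Print an analyze command for workload name $1.
# Assumption: results live directly in ./workloads/<name>.
analyze_cmd() {
    echo "omniperf analyze -p ./workloads/$1"
}

# 1. Collect the default set of data for all kernels.
profile_cmd vcopy_data -- ./vcopy 1048576 256

# 2. Same run, restricted to one kernel by name ("vecCopy" is hypothetical).
profile_cmd vcopy_data -k vecCopy -- ./vcopy 1048576 256

# 3. Analyze the stored results at the command line.
analyze_cmd vcopy_data
```

Printing the commands keeps the sketch safe to run on a machine without Omniperf or the vcopy binary installed.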
Usage
Modes
Modes change the fundamental behavior of the Omniperf command line tool. Depending on which mode is chosen, different command line options become available.
-
Profile: Target application is launched on the local system utilizing AMD’s ROC Profiler. Depending on the profiling options chosen, selected kernels, dispatches, and/or IP Blocks in the application are profiled and results are stored locally in an output folder (./workloads/<name>).
$ omniperf profile --help
-
Analyze: Profiling data from the -p/--path directory is loaded into the Omniperf CLI analyzer, where users have immediate access to profiling results and generated metrics. Metrics are quickly generated from the entirety of your profiled application or a subset you’ve identified through the Omniperf CLI analysis filters.
To generate a lightweight GUI, users can add the --gui flag to their analysis command.
This mode is designed to be a middle ground to the highly detailed Omniperf Grafana GUI and is great for users who want immediate access to an IP Block they’re already familiar with.
$ omniperf analyze --help
-
Database: Our detailed Grafana GUI is built on a MongoDB database. Use --import to import profiling results to the DB and interact with the workload in Grafana, or --remove to remove the workload from the DB.
Connection options will need to be specified. See the Grafana Analysis import section for more details.
$ omniperf database --help
Basic Operations
| Operation | Mode | Required Arguments |
|---|---|---|
| Profile a workload | profile | --name, -- <profile_cmd> |
| Standalone roofline analysis | profile | --name, --roof-only, -- <profile_cmd> |
| Import a workload to database | database | --import, --host, --username, --workload, --team |
| Remove a workload from database | database | --remove, --host, --username, --workload, --team |
| Launch standalone GUI from CLI | analyze | --path, --gui |
| Interact with profiling results from CLI | analyze | --path |
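As a rough illustration of how the required arguments in the table combine into a command line, the sketch below maps each operation to a command skeleton. The function name `usage_for` and the short operation keys are hypothetical, and the angle-bracket values are placeholders to fill in; the expected format of each value is not specified here, so consult `--help` for the relevant mode.

```shell
#!/bin/sh
# Sketch: print a command skeleton for each operation in the table above.
# Only the mode names and required arguments come from the table;
# the operation keys used here (profile, roofline, ...) are hypothetical.
usage_for() {
    case $1 in
        profile)  echo 'omniperf profile --name <name> -- <profile_cmd>' ;;
        roofline) echo 'omniperf profile --name <name> --roof-only -- <profile_cmd>' ;;
        import)   echo 'omniperf database --import --host <host> --username <username> --workload <workload> --team <team>' ;;
        remove)   echo 'omniperf database --remove --host <host> --username <username> --workload <workload> --team <team>' ;;
        gui)      echo 'omniperf analyze --path <path> --gui' ;;
        analyze)  echo 'omniperf analyze --path <path>' ;;
        *)        echo "unknown operation: $1" >&2
                  return 1 ;;
    esac
}

# Print every skeleton, one per line.
for op in profile roofline import remove gui analyze; do
    usage_for "$op"
done
```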