* Add `rocpd` choice for `--format-rocprof-output` option
* Add rocpd_data.py which defines SQL queries to extract data from rocpd database
* Use sqlite3 package to read the database
* Add `--retain-rocpd-output` option in profile mode to retain raw
rocpd database
* Add a warning that `--format-rocprof-output rocpd` will become the
default in a future release
For rocpd output:
* Use only `pmc_perf.csv` instead of reading the individual coll_level result CSV files
* Post-process CSV files with pandas in analysis mode instead of profile mode
* Use ACCUM counters instead of SQ_ACCUM_PREV_HIRES
* Add test cases for rocpd output format
* Fix code formatting issues
* Update CHANGELOG
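
A minimal sketch of reading counter data from a SQLite database with the standard `sqlite3` package, in the spirit of the queries in rocpd_data.py. The table name `pmc_events` and its columns are hypothetical stand-ins, not the real rocpd schema, and the in-memory database exists only to make the example runnable.

```python
import sqlite3


def read_counter_totals(conn):
    """Return {(kernel_name, counter_name): summed value}.

    The table and column names are hypothetical; the real rocpd schema may differ.
    """
    query = """
        SELECT kernel_name, counter_name, SUM(value)
        FROM pmc_events
        GROUP BY kernel_name, counter_name
        ORDER BY kernel_name, counter_name
    """
    return {(kernel, counter): total for kernel, counter, total in conn.execute(query)}


# Build a tiny in-memory database so the query has something to read.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pmc_events (kernel_name TEXT, counter_name TEXT, value REAL)")
conn.executemany(
    "INSERT INTO pmc_events VALUES (?, ?, ?)",
    [
        ("vecadd", "SQ_WAVES", 64.0),
        ("vecadd", "SQ_WAVES", 64.0),
        ("vecadd", "GRBM_GUI_ACTIVE", 1000.0),
    ],
)
totals = read_counter_totals(conn)
conn.close()
```

Aggregating in SQL keeps the Python side small; the resulting mapping can then be written out as a single CSV for later post-processing.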
* Show description of metrics during analysis
* Use `--include-cols Description` to show the Description column in analyze mode (hidden by default)
* Remove tips field from analysis config
* Align metric names in analysis config and documentation
* Add unified config utils/unified_config.yaml
* Add Python script utils/split_config.py to auto-generate the analysis configuration and the metric descriptions for the documentation
* Add test case to ensure unified config is older than auto-generated config
* Auto generate analysis config and documentation metrics description
* Update CONTRIBUTING.md to add instructions to build documentation assets
* Add docker image and compose file to build documentation
* Update CHANGELOG and Documentation
* Use jinja template instead of hardcoding metric tables in documentation
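
A rough sketch of the idea behind splitting one unified config into an analysis config and documentation descriptions. The entry layout (`value`, `unit`, `description`), the metric names, and the expressions are illustrative assumptions, not the actual contents of utils/unified_config.yaml, and `split_config` here is a hypothetical function, not the real script.

```python
# Hypothetical unified config: one entry per metric, holding both the analysis
# fields and the documentation text. The real file layout may differ.
unified = {
    "Wavefronts": {
        "value": "AVG(SQ_WAVES)",  # placeholder expression
        "unit": "Wavefronts",
        "description": "Number of wavefronts launched.",
    },
    "Example Metric": {
        "value": "AVG(EXAMPLE_COUNTER)",  # placeholder expression
        "unit": "Count",
        "description": "Placeholder description for illustration.",
    },
}


def split_config(unified_config):
    """Split unified entries into (analysis config, documentation descriptions)."""
    analysis, docs = {}, {}
    for name, entry in unified_config.items():
        # Everything except the description feeds the analysis config.
        analysis[name] = {k: v for k, v in entry.items() if k != "description"}
        # The description feeds the documentation tables.
        docs[name] = entry["description"]
    return analysis, docs


analysis_config, doc_descriptions = split_config(unified)
```

Keeping both outputs derived from one source is what lets metric names stay aligned between the analysis config and the documentation.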
* Do not require unsupported metrics to be set to None for older GPU
architectures
* Remove metrics which are explicitly set to None
* Update CHANGELOG
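
One way to picture this change, as a sketch only: a metric that an architecture does not support is simply absent from its table, so no `None` placeholder is needed. The helper names and the metric entries below are hypothetical.

```python
def prune_none(metrics):
    """Return a copy of the metric table without entries explicitly set to None."""
    return {name: expr for name, expr in metrics.items() if expr is not None}


def is_supported(metrics, name):
    """A metric counts as unsupported when it is absent from the table."""
    return name in metrics


# Hypothetical table for an older architecture, before cleanup.
old_arch_metrics = {
    "Wavefronts": "AVG(SQ_WAVES)",  # placeholder expression
    "Newer Arch Only Metric": None,  # explicit None, no longer required
}
cleaned = prune_none(old_arch_metrics)
```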
* Fix analysis configuration to fix baseline comparisons across all gpu
architectures
* Add missing 1812 section for gfx908
* Add missing 1812 section for gfx90a
* Baseline comparison will only show common metrics
* First workload will be used to set Metric ID index column
* Analysis report block based filtering is the default now
* Update documentation
* Update CHANGELOG
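
The baseline comparison behaviour described above can be sketched in plain Python: keep only the metrics every workload reports, and let the first workload decide the order of the Metric ID index. The function name, metric IDs, and values are made up for illustration; the actual analysis code works on pandas data frames and may differ.

```python
def common_metrics(workloads):
    """Keep only metric IDs present in every workload, ordered by the first one."""
    first, *rest = workloads
    # Iterating a dict yields its keys, so this intersects the metric IDs.
    shared = set(first).intersection(*rest)
    # The first workload sets the index order.
    index = [metric_id for metric_id in first if metric_id in shared]
    return [{metric_id: w[metric_id] for metric_id in index} for w in workloads]


# Hypothetical workloads keyed by metric ID.
baseline = {"2.1.0": 10.0, "2.1.1": 5.0, "18.1.2": 1.0}
other = {"2.1.1": 6.0, "2.1.0": 12.0}
aligned = common_metrics([baseline, other])
```

Here `18.1.2` is dropped because only the first workload reports it, and both results share the index order of `baseline`.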
* Fix tests
* Replace hardware block based filtering tests with report block
based filtering tests
* Fix roofline rocm version bug
* Fix utils bug
* Remove unnecessary tests
* Do not check textual-fspicker package in cmake build
* Use rocprofv3 to test MI 100 and fix tests
* Update current bins to have rocm6 suffix. Add new rocm7 bins, built on rocm7.0 latest due to hip updates.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Re-add rocm version check for roof bins.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Rebuild roofline binaries on top of latest rocm7 (#16379) after recent mainline promotions.
Adjust version and distro combinations of bins to follow the OSes supported by rocm6 vs rocm7:
  * rhel8 is not supported on rocm7 and is no longer built
  * sles15 is not supported on rocm7 but is still being built
  * ubuntu stays at 22.04 and above for rocm7
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Minor fixes after testing.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Fixed bad copy found while testing with ctest.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Remove runpath from new bin
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Rework target_binary map return in detect_roofline: we should not return maps of different sizes or with different keys from the same method. Expected output should be consistent in case we run into a bad position, or for testing purposes. Manually tested all possible roofline bin expected cases to confirm functionality and expected user output.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
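
To illustrate the "same keys on every path" idea from the target_binary rework, here is a small sketch. The function name, the key names (`distro`, `rocm_major`, `path`), and the binary names in the lookup table are all hypothetical and do not reflect the real detect_roofline code.

```python
from typing import Dict, Optional, Tuple

# Hypothetical lookup table, for illustration only.
KNOWN_BINS: Dict[Tuple[str, int], str] = {
    ("ubuntu22_04", 6): "roofline-ubuntu22_04-rocm6",
    ("ubuntu22_04", 7): "roofline-ubuntu22_04-rocm7",
}


def detect_target_binary(distro: str, rocm_major: int) -> Dict[str, Optional[str]]:
    """Always return the same keys; 'path' is None when no binary matches."""
    return {
        "distro": distro,
        "rocm_major": str(rocm_major),
        "path": KNOWN_BINS.get((distro, rocm_major)),
    }


found = detect_target_binary("ubuntu22_04", 7)
missing = detect_target_binary("rhel8", 7)
```

Callers and tests can then check `result["path"] is None` instead of guessing which keys exist on the failure path.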
* Update changelog with new roofline distro minimums
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Because python3.8 and above is supported, the union type defined in one of the methods was throwing errors on anything older than python3.10. Swapping the `|` operator for `Optional[]` resolves the errors on systems using <3.10. No functional changes.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
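
For reference, a short sketch of the annotation style that works on python3.8 and later. The function `pick_binary` is a hypothetical example, not the method that was changed. Writing `str | None` in an annotation is evaluated when the function is defined and raises `TypeError` before python3.10, while `Optional[str]` expresses the same type on all supported versions.

```python
from typing import Optional, get_type_hints


# On python < 3.10 the following annotation would fail at definition time:
#     def pick_binary(distro: str | None = None) -> str | None: ...
# Optional[...] is the equivalent spelling that works on python 3.8+.
def pick_binary(distro: Optional[str] = None) -> Optional[str]:
    """Hypothetical example: return a label for the distro, or None if unset."""
    if distro is None:
        return None
    return "roofline-" + distro


hints = get_type_hints(pick_binary)
```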
* Add custom rocprofiler-sdk counter definitions file for MI 100
* Update CHANGELOG to mention that accumulation counters will not be
collected when profiling on MI 100 using rocprofiler-sdk/rocprofv3
* Migrate accum_counters.yaml to code
Change when Roofline PDFs are generated: during both general profiling and `--roof-only` profiling (skipped only when the `--no-roof` option is present)
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* added progress printout for profiler
* added comments and fixed readability
* removed redundant newlines
* moved format_time helper function to utils
* removed tqdm and redundant time calc
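
A sketch of what a dependency-free progress printout with a `format_time` helper could look like. The signature, the HH:MM:SS output format, and the `progress_line` helper are assumptions for illustration; the actual helper in utils may differ.

```python
def format_time(seconds: float) -> str:
    """Format a duration in seconds as HH:MM:SS (assumed format)."""
    whole = int(seconds)
    hours, remainder = divmod(whole, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def progress_line(done: int, total: int, elapsed: float) -> str:
    """Build one progress message without relying on tqdm."""
    return f"[{done}/{total}] profiling passes done, elapsed {format_time(elapsed)}"


line = progress_line(2, 4, 60.0)
```

In a profiling loop, `elapsed` would typically come from `time.monotonic()` differences, and each line would be passed to the logger or `print`.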
* enable roofline plot in cli.
* Add roofline to analysis config.
* Unify global variables.
* Disable roofline for baseline comparison and gfx908.
* Add check for roofline.csv
* Revert of https://github.com/ROCm/rocprofiler-compute/pull/738
* Change default rocprof backend interface to rocprofv3
* Add MI 350 support in documentation
* Added known issue: MI 100 profiling will not work unless rocprofv1
is explicitly opted in
* Remove MI 50 soc gfx python class since MI 50 is not supported