* Revert of https://github.com/ROCm/rocprofiler-compute/pull/738
* Change default rocprof backend interface to rocprofv3
* Add MI 350 support in documentation
* Add known issue: MI 100 profiling does not work unless rocprofv1
is explicitly opted in
* Remove MI 50 soc gfx python class since MI 50 is not supported
Add option to print the roofline plot in the terminal using plotext.
Takes one datatype and returns the str from plot.build(), which contains the roofline plot for that datatype.
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Update roofline binaries from rocm-amdgpu-bench
- uses hip to find number of CUs dynamically instead of hardcoded values in table
Remove duplicate AI plot points printing
- only print ai points once on plot since we are measuring using total flops and value is same
- remove datatype from legend labels
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Fix --gui option for analyze mode, which failed due to a missing arg in the load_kernel_top call in pre_processing
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* rocprof-compute TUI (Text User Interface): provides users an interactive analysis experience with visuals.
* Analyze results with tables, charts, plots.
* Add menu bar, terminal, directory dialog. Improve logging and ui.
* Add display config file to manipulate result categorization.
* Add support for recently opened dirs.
* Update licensing and version.
* Setting the ROCPROF=rocprofiler-sdk environment variable uses the rocprofiler-sdk C++ library instead of the rocprofv3 Python script
* Add runtime option --rocprofiler-sdk-library-path to use a custom version of the rocprofiler-sdk library
* Add --rocprofiler-sdk-library-path conftest option for tests
* Set up the environment variables needed to inject rocprofiler-sdk code into the user command
* Add environment variables for counter collection and filtering
* Add environment variables for PC sampling
* Use Python bindings to list counters supported by rocprofiler-sdk
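The backend selection described above can be sketched as follows. This is only an illustration under stated assumptions: `select_backend` and `DEFAULT_BACKEND` are hypothetical names, not functions from rocprof-compute, and the sketch only reads the `ROCPROF` variable; it does not set up any injection environment.

```python
import os

# Default backend after this change set (rocprofv3), per the notes above.
DEFAULT_BACKEND = "rocprofv3"


def select_backend(env=None):
    """Hypothetical helper: pick the profiling backend from ROCPROF.

    Returns (backend_name, uses_sdk_library). The second value is True only
    when ROCPROF is set to "rocprofiler-sdk", i.e. when the C++ library
    would be used instead of the rocprofv3 Python script.
    """
    env = os.environ if env is None else env
    backend = env.get("ROCPROF", DEFAULT_BACKEND)
    uses_sdk_library = backend == "rocprofiler-sdk"
    return backend, uses_sdk_library
```

Passing the environment as a mapping keeps the helper easy to test without mutating `os.environ`.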
* Avoid crash when profiling data not generated
- Handle case where program has no kernel launches
- Improve error messages
- Avoid roofline when profiling data is missing
Signed-off-by: benrichard-amd <ben.richard@amd.com>
* Update other soc_gfx files to catch missing pmc_perf.csv
* Fix formatting
* Fix incorrectly ordered imports
---------
Signed-off-by: benrichard-amd <ben.richard@amd.com>
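A minimal sketch of the kind of guard these changes describe, assuming the profiler writes `pmc_perf.csv` into the workload directory. The helper name `has_profiling_data` is hypothetical, and treating a header-only file as "no kernel launches" is an assumption made for illustration.

```python
import csv
from pathlib import Path


def has_profiling_data(workload_dir):
    """Return True only if pmc_perf.csv exists and has at least one data row.

    Hypothetical check: a caller could skip roofline generation and print a
    clear error message when this returns False.
    """
    path = Path(workload_dir) / "pmc_perf.csv"
    if not path.is_file():
        return False
    with path.open(newline="") as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row, if any
        return next(reader, None) is not None
```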
Rework of roofline binaries generated from rocm-amdgpu-bench
- removed arch identifier in bin name
- removed rocm5 bins altogether
Updated required distros for roofline
- updated distro checks and bin naming
- moved up ubuntu20.04->22.04 and sles15.3->15.6 per rocm support
Enabled ctests for mi350 for test_roof_*
- removed mi350 series check to skip these specific tests
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Add test for gfx942 number of xcds.
* Improve the structure of mi gpu specs, add num_xcds_spec_class test.
* Add to ctest.
---------
Signed-off-by: xuchen-amd <xuchen@amd.com>
- add check for rhel10 (platform:el10), force use rhel roof binary
- update changelog in 'unreleased- added' section
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
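A sketch of how a `platform:el10` check could look, assuming the identifier is read from the `PLATFORM_ID` field of `/etc/os-release`. The function name `is_rhel10` is hypothetical and the actual distro check in the tool may differ.

```python
def is_rhel10(os_release_text):
    """Hypothetical check: True if PLATFORM_ID is platform:el10.

    Takes the text of an os-release file so it can be tested without
    touching the real /etc/os-release.
    """
    for line in os_release_text.splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "PLATFORM_ID":
            # Values in os-release may be wrapped in double quotes.
            return value.strip().strip('"') == "platform:el10"
    return False
```

A caller would typically read `/etc/os-release` once and pass its contents in, then fall back to the other distro checks when this returns False.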
Build testing for the new RHEL10 addition reported invalid rpaths on the roofline binaries. Removed the rpaths to prevent rpath check failures.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Profile mode:
Fix roofline plots for datatypes that have peakVALU only. Use the highest roofline to draw the bandwidth lines to the proper height instead of relying on peakMFMA existing for every datatype.
Analyze mode:
Add roofline-data-type option for viewing PDFs in standalone gui. The default is FP32, the same as in profile mode.
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
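The profile-mode fix can be illustrated with a small sketch. The function names and the dictionary layout are assumptions for illustration only; the idea is to take the highest available compute ceiling (peakVALU, peakMFMA, or both) and extend each bandwidth line until it meets that ceiling.

```python
def roofline_top(peaks):
    """Return the highest available compute ceiling.

    `peaks` is a hypothetical mapping such as
    {"peakVALU": 100.0, "peakMFMA": None}; missing ceilings are None.
    """
    available = [v for v in peaks.values() if v is not None]
    if not available:
        raise ValueError("no compute ceiling available for this datatype")
    return max(available)


def bandwidth_line_end(bandwidth, top):
    """Arithmetic intensity at which y = bandwidth * AI reaches `top`."""
    return top / bandwidth
```

With this approach a datatype that has only peakVALU still gets bandwidth lines drawn to the right height, because the ceiling never depends on a specific key being present.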
* Add MI 350 hardware information
* Refactor MI GPU YAML file and corresponding interface
* Add SoC file for gfx950 architecture
* Add analysis report configs for MI 350 containing existing metrics
* Add placeholder None valued metrics for previous architectures to make
baseline comparison work
* Enable testing on MI 350
* Analysis config metric changes
  - SPI changes
    - Update metric formula for default SPI pipe counter
    - Use efficiently collected pipe wise SPI counters
    - Add SPI Wave Occupancy
    - Add Scheduler-Pipe Wave Utilization
    - Update formula for VGPR Writes
    - Add Scheduler-Pipe FIFO Full Rate
  - CPC changes
    - Add CPC SYNC FIFO Full Rate
    - Add CPC CANE Stall Rate
    - Add CPC ADC Utilization
  - SQ changes
    - Add VALU co-issue efficiency
    - Add F6F4 datatype metrics
    - Update formula for total FLOPs by adding F6F4 counters
    - Add LDS STORE / LOAD / ATOMIC metrics
    - Add LDS STORE / LOAD / ATOMIC bandwidth
    - Add LDS FIFO and TA ADDR / CMD / DATA FIFO full rates
* Collect TCP_TCP_LATENCY_sum only for gfx950 (MI 350)
* Do not inject SQ_ACCUM_PREV_HIRES unnecessarily
* Do not hardcode memory and shader clock speeds
* Write num_hbm_channels to sysinfo.csv instead of hbm_bw while profiling
* Move sysinfo.csv generation to the pre-processing step of profiling
* Add warnings to use --specs-correction for missing sysinfo.csv values during analysis phase
* Update CHANGELOG
* Analysis phase warning to use --specs-correction when needed
In a wheel environment, rocprof-compute in the bin folder is not a soft link. To execute rocprof-compute from the bin folder, the system path must also include the dependency script paths, so those paths were added.
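A minimal sketch of this kind of path setup, assuming the dependency scripts live in directories relative to the launcher script. `add_dependency_paths` and the example relative directory are hypothetical; the real install layout is not stated here.

```python
import sys
from pathlib import Path


def add_dependency_paths(script_file, relative_dirs, path_list=None):
    """Prepend dependency directories, resolved relative to the launcher.

    Hypothetical helper: `relative_dirs` are directories relative to the
    folder containing `script_file`. Defaults to modifying sys.path.
    """
    path_list = sys.path if path_list is None else path_list
    base = Path(script_file).resolve().parent
    for rel in relative_dirs:
        candidate = str((base / rel).resolve())
        if candidate not in path_list:  # avoid duplicate entries
            path_list.insert(0, candidate)
    return path_list
```

For example, a launcher could call `add_dependency_paths(__file__, ["../libexec/rocprofiler-compute"])` before importing its own modules, where the relative directory shown is only a placeholder.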