* Setting the ROCPROF=rocprofiler-sdk environment variable uses the rocprofiler-sdk C++ library instead of the rocprofv3 Python script
* Add runtime option --rocprofiler-sdk-library-path to use a custom version of the rocprofiler-sdk library
* Add --rocprofiler-sdk-library-path conftest option for tests
* Set up the environment variables needed to inject rocprofiler-sdk code into the user command
* Add environment variables for counter collection and filtering
* Add environment variables for PC sampling
* Use Python bindings to list counters supported by rocprofiler-sdk
* Avoid crash when profiling data is not generated
  - Handle case where program has no kernel launches
  - Improve error messages
  - Avoid roofline when profiling data is missing
Signed-off-by: benrichard-amd <ben.richard@amd.com>
* Update other soc_gfx files to catch missing pmc_perf.csv
* Fix formatting
* Fix incorrectly ordered imports
---------
Signed-off-by: benrichard-amd <ben.richard@amd.com>
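The missing-data handling in the commit above can be illustrated with a minimal sketch. It assumes the guard only needs to confirm that a non-empty pmc_perf.csv exists in the workload directory. The helper names `profiling_data_available` and `maybe_run_roofline` are hypothetical and are not taken from the soc_gfx files.

```python
import logging
from pathlib import Path

logger = logging.getLogger("sketch")


def profiling_data_available(workload_dir: Path) -> bool:
    """Return True only if pmc_perf.csv exists and is non-empty (hypothetical helper)."""
    csv_path = workload_dir / "pmc_perf.csv"
    if not csv_path.is_file() or csv_path.stat().st_size == 0:
        # A clear message instead of a crash further down the pipeline.
        logger.warning(
            "No profiling data found at %s; the program may have no kernel launches.",
            csv_path,
        )
        return False
    return True


def maybe_run_roofline(workload_dir: Path) -> str:
    """Skip the roofline step when profiling data is missing."""
    if not profiling_data_available(workload_dir):
        return "skipped"
    return "ran"  # placeholder standing in for the real roofline step
```

Checking once up front keeps later steps from each having to handle a missing file on their own.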
Rework of roofline binaries generated from rocm-amdgpu-bench
- removed arch identifier in bin name
- removed rocm5 bins altogether
Updated required distros for roofline
- updated distro checks and bin naming
- moved up ubuntu20.04->22.04 and sles15.3->15.6 per rocm support
Enabled ctests for mi350 for test_roof_*
- removed mi350 series check to skip these specific tests
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Add rp-compute technical writer directly for any documentation review.
Remove existing packaging review requests for single user; every repo owner should be notified.
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Add test for gfx942 number of XCDs.
* Improve the structure of MI GPU specs, add num_xcds_spec_class test.
* Add to ctest.
---------
Signed-off-by: xuchen-amd <xuchen@amd.com>
- Add check for RHEL 10 (platform:el10) and force use of the RHEL roofline binary
- Update changelog in the 'Unreleased - Added' section
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
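A minimal sketch of the RHEL 10 check, assuming the `platform:el10` string is matched against the `PLATFORM_ID` field of /etc/os-release. The function names are hypothetical, and the actual distro check in the roofline code may read the value differently.

```python
def parse_os_release(text: str) -> dict:
    """Parse KEY=VALUE lines in os-release format into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double or single quotes.
        fields[key.strip()] = value.strip().strip("\"'")
    return fields


def is_rhel10(os_release_text: str) -> bool:
    """Return True when PLATFORM_ID identifies an el10 platform (assumed field)."""
    return parse_os_release(os_release_text).get("PLATFORM_ID") == "platform:el10"
```

In practice the text would come from reading /etc/os-release; passing it in as a string keeps the check testable without depending on the host system.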
Build testing for the new RHEL 10 addition reported invalid rpaths on the roofline binaries. Removed the rpaths to prevent rpath check failures.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Profile mode:
Fix roofline plots for datatypes that have peakVALU only. Check for the highest roofline so that the bandwidth lines are plotted to the proper height; do not rely on peakMFMA existing for every datatype.
Analyze mode:
Add roofline-data-type option for viewing PDFs in the standalone GUI. The default is FP32, the same as in profile mode.
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
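The profile-mode fix in the commit above can be sketched as follows, assuming each datatype carries a mapping of compute peaks in which a missing peak is stored as None. The function name `highest_roofline` and the numbers in the usage note are hypothetical; only the peakVALU and peakMFMA names come from the commit message.

```python
from typing import Mapping, Optional


def highest_roofline(peaks: Mapping[str, Optional[float]]) -> float:
    """Return the largest available compute peak for one datatype.

    Bandwidth lines can then be drawn up to this height, whether or not
    a peakMFMA value exists for the datatype.
    """
    available = [value for value in peaks.values() if value is not None]
    if not available:
        raise ValueError("no compute peak available for this datatype")
    return max(available)
```

For example, `highest_roofline({"peakVALU": 100.0, "peakMFMA": None})` returns the peakVALU value, so a datatype without MFMA support still gets bandwidth lines of the right height.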
* Add MI 350 hardware information
* Refactor MI GPU YAML file and corresponding interface
* Add SoC file for gfx950 architecture
* Add analysis report configs for MI 350 containing existing metrics
* Add placeholder None-valued metrics for previous architectures to make baseline comparison work
* Enable testing on MI 350
* Analysis config metric changes
  - SPI changes
    - Update metric formula for default SPI pipe counter
    - Use efficiently collected pipe-wise SPI counters
    - Add SPI Wave Occupancy
    - Add Scheduler-Pipe Wave Utilization
    - Update formula for VGPR Writes
    - Add Scheduler-Pipe FIFO Full Rate
  - CPC changes
    - Add CPC SYNC FIFO Full Rate
    - Add CPC CANE Stall Rate
    - Add CPC ADC Utilization
  - SQ changes
    - Add VALU co-issue efficiency
    - Add F6F4 datatype metrics
    - Update formula for total FLOPs by adding F6F4 counters
    - Add LDS STORE / LOAD / ATOMIC metrics
    - Add LDS STORE / LOAD / ATOMIC bandwidth
    - Add LDS FIFO and TA ADDR / CMD / DATA FIFO full rates
* Collect TCP_TCP_LATENCY_sum only for gfx950 (MI 350)
* Do not inject SQ_ACCUM_PREV_HIRES unnecessarily
* Do not hardcode memory and shader clock speeds
* Write num_hbm_channels to sysinfo.csv instead of hbm_bw while profiling
* Move sysinfo.csv generation to the preprocessing step of profiling
* Add warnings to use --specs-correction for missing sysinfo.csv values during analysis phase
* Update CHANGELOG
* Analysis phase warning to use --specs-correction when needed
* Add new sample applications.
* Generalize py test launcher for additional apps.
* Add TCP pytest, and add to ctest.
* Update licensing.
* Disable for non-mi300 machines.
In a wheel environment, rocprof-compute in the bin folder is not a soft link. To execute rocprof-compute from the bin folder, the system path must also include the dependency script paths, so those paths were added.
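A minimal sketch of this kind of path setup, assuming "system path" here means Python's `sys.path` and that the dependency scripts live in directories relative to the launcher script. The function name `add_dependency_paths` and the directory names are hypothetical; the actual wheel layout is not stated in the commit message.

```python
import sys
from pathlib import Path
from typing import List


def add_dependency_paths(script_path: Path, relative_dirs: List[str]) -> List[str]:
    """Prepend dependency directories, resolved relative to the script, to sys.path."""
    base = script_path.resolve().parent
    added = []
    for rel in relative_dirs:
        candidate = str((base / rel).resolve())
        if candidate not in sys.path:
            # Insert at the front so bundled scripts take precedence.
            sys.path.insert(0, candidate)
            added.append(candidate)
    return added
```

A launcher could call something like `add_dependency_paths(Path(__file__), ["../libexec"])` before importing its own modules, where `../libexec` is only a placeholder directory.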
Rebuild of rocm-amdgpu-bench roofline binaries for MI200/MI300 systems with rocm6.
Added datatype options to roofline feature.
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Move console logging to logger function to avoid circular dependency in utils module
Signed-off-by: coleramos425 <colramos@amd.com>
* Apply python formatting
Signed-off-by: coleramos425 <colramos@amd.com>
* Remove the default StreamHandler before adding the custom one
If the default handler is not explicitly removed, it can cause duplicate output.
Signed-off-by: coleramos425 <colramos@amd.com>
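The handler change above can be sketched with the standard `logging` module. This is an illustration under assumptions, not the project's logger setup: the function name `setup_console_handler` and the format string are hypothetical.

```python
import logging
import sys


def setup_console_handler(logger: logging.Logger) -> None:
    """Attach exactly one custom StreamHandler to the given logger."""
    # Remove handlers that are already attached so each record is emitted once.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)

    custom = logging.StreamHandler(sys.stdout)
    custom.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
    logger.addHandler(custom)
    logger.setLevel(logging.INFO)
```

Iterating over a copy of `logger.handlers` matters because `removeHandler` mutates the list. Calling the function twice still leaves a single handler, so repeated setup does not reintroduce duplicate output.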
* Fix lingering bugs from merge conflict resolution
Signed-off-by: coleramos425 <colramos@amd.com>
* Comply with Python formatting and update pre-commit hook helper
Signed-off-by: coleramos425 <colramos@amd.com>
* Remove the redundant console_log call at the get_mi300_num_xcds() call; otherwise ALL MI200 profiling runs will print this message
Signed-off-by: coleramos425 <colramos@amd.com>
---------
Signed-off-by: coleramos425 <colramos@amd.com>