When using rocprof v3:
* Use --kernel-include-regex for kernel name filtering
* Use --kernel-iteration-range for kernel dispatch filtering
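
The flag mapping above could be sketched as follows. This is a minimal illustration, not the project's actual code: `build_v3_filter_args` and its parameters are hypothetical names, and the values are passed through unchanged because the accepted value syntax is defined by rocprof v3.

```python
from typing import List, Optional


def build_v3_filter_args(kernel_regex: Optional[str] = None,
                         iteration_range: Optional[str] = None) -> List[str]:
    """Hypothetical helper: turn optional filters into rocprof v3 arguments."""
    args: List[str] = []
    if kernel_regex:
        # Kernel name filtering
        args += ["--kernel-include-regex", kernel_regex]
    if iteration_range:
        # Kernel dispatch filtering; the value is forwarded as-is
        args += ["--kernel-iteration-range", iteration_range]
    return args
```

Omitting both filters yields an empty list, so the helper can be appended to a base command unconditionally.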
Update changelog
Added a debug log for the case where no flops are recorded (total_flops is 0), in which case AI points are not plotted.
Removed a commented-out print statement that is nonfunctional because it calls a nonexistent method.
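
The zero-flops guard described above might look roughly like this. It is a sketch under assumptions: the function name, logger name, and the `total_bytes` parameter are hypothetical, and the extra guard against zero bytes is an addition for safety rather than something this commit states.

```python
import logging
from typing import Optional

logger = logging.getLogger("roofline_sketch")  # hypothetical logger name


def arithmetic_intensity(total_flops: float, total_bytes: float) -> Optional[float]:
    """Return flops per byte, or None when no AI point should be plotted."""
    if total_flops == 0:
        # Nothing to plot: record why at debug level and skip the point.
        logger.debug("No flops recorded (total_flops is 0); skipping AI point.")
        return None
    if total_bytes == 0:
        # Assumption: avoid division by zero when no traffic was measured.
        return None
    return total_flops / total_bytes
```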
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Higher versions (e.g., 0.4.1) have external dependencies that cause errors and force early exits before any roofline plots are created.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Analysis report block based filtering for profiling
* Profiling mode changes
- `-b` option now additionally accepts metric id(s), similar to `-b` option in analyze mode (e.g. 6, 6.2, 6.23)
- Only counters mentioned in the selected analysis report blocks will be collected
- Add parsing logic to identify hardware counters from analysis report blocks
- Add filtering logic to only write filtered counters in perfmon files
- Log counters that were not collected on a single line
- `--list-metrics` option added in profile mode to list possible metric id(s) similar to analyze mode
- Write arguments provided during profiling in profiling_configuration.yaml file
* Analysis mode changes
- During analysis mode, only show report blocks selected during profiling
- If `-b` option is provided in analysis mode, then follow provided filters
- Do not show empty tables in analysis report
* Miscellaneous changes
- Update CHANGELOG
- Add test cases
- Instruction mix report block filter
- Instruction mix and Memory chart report block filter
- Instruction mix report block filter and CPC hardware block filter
- TA hardware block filter
- --list-metrics in profile mode should work
- Move binary handler fixtures to conftest.py to avoid importing
fixtures
- cmake file in tests directory has been updated to compile sample/vmem.hip for testing
* Public documentation changes
- Use the term "Hardware report block" instead of "Hardware block"
- Add documentation for "--list-metrics" option in profile mode
- Add example of filtering by hardware report block such as instruction
mix and wavefront launch statistics
- Add deprecation warning for hardware component (sq, tcc) based filtering
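
The counter parsing and filtering steps listed under the profiling mode changes could be sketched as below. This is only an illustration under assumptions: the function names, the regular expression, the block ids (`X.1`, `X.2`), and the metric expressions are all hypothetical and do not reflect the real report block configuration.

```python
import re
from typing import Dict, Iterable, List, Set

# Assumption: counters look like an upper-case prefix, an underscore, then a name.
COUNTER_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]*_[A-Za-z0-9_]+\b")


def counters_for_blocks(block_exprs: Dict[str, List[str]],
                        selected: Iterable[str]) -> Set[str]:
    """Collect counter names referenced by the selected report blocks."""
    needed: Set[str] = set()
    for block_id in selected:
        for expr in block_exprs.get(block_id, []):
            needed.update(COUNTER_PATTERN.findall(expr))
    return needed


def filter_perfmon(counters: List[str], needed: Set[str]) -> List[str]:
    """Keep only the counters that a selected block actually uses."""
    return [c for c in counters if c in needed]
```

Selecting a single block then restricts collection to the counters its expressions mention, which is the behaviour the commit describes for the `-b` option in profile mode.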
* Fix post analysis gui in standalone binary (#591)
* Fix post analysis gui in standalone binary
* Add post analysis gui assets and required server libraries for GUI
server and web page
* Add port forwarding to docker test compose
* Update README to use `docker compose up` instead of `docker compose run`
to run containers with port forwarding and to leverage other
functionalities of docker compose
* Fix rocprofv1 output processing. (#588)
* fix rocprof-compute binary name in package manager install docs
---------
Co-authored-by: vedithal-amd <Vignesh.Edithal@amd.com>
Co-authored-by: xuchen-amd <xuchen@amd.com>
Adding FP8 datatype to roofline feature in rocprof-compute on MI300-based systems.
FP8 now shows in terminal output and roofline csv, and outputs a standalone PDF.
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Fix post analysis gui in standalone binary
* Add post analysis gui assets and required server libraries for GUI
server and web page
* Add port forwarding to docker test compose
* Update README to use `docker compose up` instead of `docker compose run`
to run containers with port forwarding and to leverage other
functionalities of docker compose
* Add cmake function to create standalone binary
* Mention licenses used by dependencies in the LICENSE file
* Add test cases for standalone binary by adding --call-binary option for pytest
* Docker compose file to create standalone binary in standardized RHEL 8 environment
* Add README instructions on how to create and test standalone binary
* Move docker files from utils to docker folder; Add standalone binary testing instructions
* Add CHANGELOG statement
* Use different service names in docker compose files
* Use volume mounting in docker files
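
A `--call-binary` pytest option like the one mentioned above is typically registered through the `pytest_addoption` hook in `conftest.py`. The sketch below assumes a boolean flag; the `base_command` helper and both command lines are hypothetical placeholders, not the repository's actual paths.

```python
from typing import List


def pytest_addoption(parser) -> None:
    """pytest hook: register the --call-binary flag."""
    parser.addoption(
        "--call-binary",
        action="store_true",
        default=False,
        help="Run tests against the standalone binary",
    )


def base_command(call_binary: bool) -> List[str]:
    """Hypothetical helper: choose which executable the tests invoke."""
    if call_binary:
        return ["./rocprof-compute.bin"]  # placeholder binary path
    return ["python3", "src/rocprof-compute"]  # placeholder script path
```

A test or fixture could then read the flag with `request.config.getoption("--call-binary")` and pass it to the helper.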
* initial hack to fix v3 getting stuck on MI100 because of the -m parameter and a missing counter csv file
* proper formatting
* refactored profiler option function to take soc arch
* resolve missing step that caused error for profiler option
* fix typo of arch name
* change method of putting soc info into profiler option
* isort and black format
* add comment for the part that handles missing counter csv file
* remove unnecessary import
---------
Co-authored-by: YANG WANG <ywang@ywang-ubuntu.amd.com>