# Changelog for ROCm Compute Profiler
Full documentation for ROCm Compute Profiler is available at [https://rocm.docs.amd.com/projects/rocprofiler-compute/en/latest/](https://rocm.docs.amd.com/projects/rocprofiler-compute/en/latest/).
## Unreleased
### Added
* Add Docker files to package the application and dependencies into a single portable and executable standalone binary file
* Analysis report based filtering
  * `-b` option in profile mode now also accepts metric id(s) for analysis report based filtering
  * `-b` option in profile mode still accepts a hardware IP block for filtering; however, this support will be deprecated soon
  * `--list-metrics` option added in profile mode to list possible metric id(s), similar to analyze mode
### Changed
* Change normal_unit default to per_kernel
* Change dependency from rocm-smi to amd-smi
### Resolved issues
* Fixed the specs-correction option
## (Unreleased) ROCm Compute Profiler 3.1.0 for ROCm 6.4.0
### Added
* Roofline support for Ubuntu 24.04
* Experimental support for rocprofv3 (not enabled by default)
* Experimental feature: Spatial multiplexing
### Resolved issues
* Fixed PoP of VALU Active Threads
* Added a workaround for broken mclk in old versions of rocm-smi
## ROCm Compute Profiler 3.0.0 for ROCm 6.3.0
### Changed
* Renamed Omniperf to ROCm Compute Profiler (#475)
## Omniperf 2.0.1 for ROCm 6.2.1
### Changed
* enable rocprofv1 for MI300 hardware (#391)
* refactoring and updating documentation (#362, #394, #398, #414, #420)
* branch renaming and workflow updates (#389, #404, #409)
* bug fix for analysis output
* add dependency checks on application launch (#393)
* patch for profiling multi-process/multi-GPU applications (#376, #396)
* packaging updates (#386)
* rename CHANGES to CHANGELOG.md (#410)
* rollback Grafana version in Dockerfile for Angular plugin compatibility (#416)
* enable CI triggers for Azure CI (#426)
* add GPU model distinction for MI300 systems (#423)
* new MAINTAINERS.md guide for omniperf publishing procedures (#402)
### Optimized
* reduced running time of Omniperf when profiling (#384)
* console logging improvements
## Omniperf 2.0.1 for ROCm 6.2.0
### Added
* new option to force hardware target via `OMNIPERF_ARCH_OVERRIDE` global (#370)
* CI/CD support for MI300 hardware (#373)
* support for MI308X hardware (#375)
### Optimized
* cmake build improvements (#374)
## Omniperf 2.0.0 (17 May 2024)
* improved logging that spans all modes (#177) (#317) (#335) (#341)
* overhauled CI/CD that spans all modes (#179)
* extensible SoC classes to better support adding new hardware configs (#180)
* `--kernel-verbose` no longer overwrites kernel names (#193)
* general cleanup and improved organization of source code (#200) (#210)
* separate requirement files for docs and testing dependencies (#205) (#262) (#358)
* add support for MI300 hardware (#231)
* upgrade Grafana assets and build script to latest release (#235)
* update minimum ROCm and Python requirements (#277)
* sort rocprofiler input files prior to profiling (#304)
* new `--quiet` option will suppress verbose output and show a progress bar (#308)
* roofline support for Ubuntu 22.04 (#319)
## Omniperf 1.1.0-PR1 (13 Oct 2023)
* standardize headers to use 'avg' instead of 'mean'
* add color code thresholds to standalone GUI to match Grafana
* modify kernel name shortener to use cpp_filt (#168)
* enable stochastic kernel dispatch selection (#183)
* patch Grafana plugin module to address a known issue in the latest version (#186)
* enhanced communication between analyze mode kernel flags (#187)
## Omniperf 1.0.10 (22 Aug 2023)
* critical patch for detection of LLVM in ROCm installs on SLURM systems
## Omniperf 1.0.9 (17 Aug 2023)
* add units to L2 per-channel panel (#133)
* new quickstart guide for Grafana setup in docs (#135)
* more detail on kernel and dispatch filtering in docs (#136, #137)
* patch manual join utility for ROCm >5.2.x (#139)
* add % of peak values to low level speed-of-light panels (#140)
* patch critical bug in Grafana by removing a deprecated plugin (#141)
* enhancements to KernelName demangler (#142)
* general metric updates and enhancements (#144, #155, #159)
* add min/max/avg breakdown to instruction mix panel (#154)
## Omniperf 1.0.8 (30 May 2023)
* add `--kernel-names` option to toggle kernelName overlay in standalone roofline plot (#93)
* remove unused Python modules (#96)
* fix empirical roofline calculation for single dispatch workloads (#97)
* match color of arithmetic intensity points to corresponding bandwidth lines
* UX improvements in standalone GUI (#101)
* enhanced readability for filtering dropdowns in standalone GUI (#102)
* new logfile to capture rocprofiler output (#106)
* roofline support for SLES15 SP4 and future service packs (#109)
* adding Dockerfiles for all supported Linux distros
* new examples for `--roof-only` and `--kernel` options added to documentation
* enable CLI analysis in Windows (#110)
* optional random port number in standalone GUI (#111)
* limit length of visible kernelName in `--kernel-names` option (#115)
* adjust metric definitions (#117, #130)
* manually merge rocprof runs, overriding default rocprofiler implementation (#125)
* fixed compatibility issues with Python 3.11 (#131)
## Omniperf 1.0.8-PR2 (17 Apr 2023)
* UX improvements in standalone GUI (#101)
* enhanced readability for filtering dropdowns in standalone GUI (#102)
* new logfile to capture rocprofiler output (#106)
* roofline support for SLES15 SP4 and future service packs (#109)
* adding Dockerfiles for all supported Linux distros
* new examples for `--roof-only` and `--kernel` options added to documentation
## Omniperf 1.0.8-PR1 (13 Mar 2023)
* add `--kernel-names` option to toggle kernelName overlay in standalone roofline plot (#93)
* remove unused Python modules (#96)
* fix empirical roofline calculation for single dispatch workloads (#97)
* match color of arithmetic intensity points to corresponding bandwidth lines
## Omniperf 1.0.7 (21 Feb 2023)
* update documentation (#52, #64)
* improved detection of invalid command line arguments (#58, #76)
* enhancements to standalone roofline (#61)
* enable Omniperf on systems with X-server (#62)
* raise minimum version requirement for ROCm (#64)
* enable baseline comparison in CLI analysis (#65)
* add multi-normalization to new metrics (#68, #81)
* support alternative profilers (#70)
* add MI100 configs to override rocprofiler's incomplete default (#75)
* improve error message when no GPU(s) detected (#85)
* separate CI tests by Linux distro and add status badges
## Omniperf 1.0.6 (21 Dec 2022)
* CI update: documentation now published via GitHub Action (#22)
* better error detection for incomplete ROCm installs (#56)
## Omniperf 1.0.5 (13 Dec 2022)
* store application command-line parameters in profiling output (#27)
* enable additional normalizations in CLI mode (#30)
* add missing Ubuntu 20.04 roofline binary to packaging (#34)
* update L1 bandwidth metric calculations (#36)
* add L1 <-> L2 bandwidth calculation (#37)
* documentation updates (#38, #41)
* enhanced subprocess logging to identify critical errors in rocprofiler (#50)
* maintain git sha in production installs from tarball (#53)
## Omniperf 1.0.4 (11 Nov 2022)
* update Python requirements.txt with minimum versions for numpy and pandas
* addition of progress bar indicator in web-based GUI (#8)
* reduced default content for web-based GUI to reduce load times (#9)
* minor packaging and CI updates
* variety of documentation updates
* added an optional argument to vcopy.cpp workload example to specify device id
## Omniperf 1.0.3 (07 Nov 2022)
* initial Omniperf release