* Updated stream code to handle special cases when stream value is 0x01 or 0x02
* Removed extra definitions and updated tests to account for special case
* Modified stream.cpp so that each thread is assigned a unique stream ID when hipStreamPerThread is used as the stream value. Modified tests to check that threads are assigned unique, repeated values when hipStreamPerThread is used
* Updated idx_offset, stream_map, and thread counter to be in one struct.
* Update stream.cpp to only use add_stream() and update tests to add a separate unit test for hipStreamPerThread
* Remove unnecessary comment
* Removed unnecessary line
* Updated tests and stream.cpp to update stream ID correctly
* Updated test structure
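
The per-thread stream ID behavior described in the bullets above can be illustrated with a small sketch. It is written in Python for brevity and is not the code in stream.cpp. The `StreamState` class, the semantics given to `idx_offset`, `stream_map`, and `thread_counter`, the signature of `add_stream`, and the choice of `0x02` as the per-thread handle value are all assumptions made for illustration.

```python
import threading
from dataclasses import dataclass, field

# Assumption: 0x02 stands in for the hipStreamPerThread handle value.
HIP_STREAM_PER_THREAD = 0x02


@dataclass
class StreamState:
    """Keeps the offset, map, and counter together (illustrative semantics)."""
    idx_offset: int = 1
    stream_map: dict = field(default_factory=dict)
    thread_counter: int = 0
    lock: threading.Lock = field(default_factory=threading.Lock)


def add_stream(state, stream_value, thread_id=None):
    """Return a stable ID for a stream; per-thread handles get one ID per thread."""
    if thread_id is None:
        thread_id = threading.get_ident()
    # The per-thread handle is keyed by (handle, thread) so each thread gets its
    # own ID; every other handle is keyed by the handle alone.
    if stream_value == HIP_STREAM_PER_THREAD:
        key = (stream_value, thread_id)
    else:
        key = (stream_value, None)
    with state.lock:
        if key not in state.stream_map:
            state.stream_map[key] = state.idx_offset + state.thread_counter
            state.thread_counter += 1
        return state.stream_map[key]
```

Under these assumptions, two threads passing the per-thread handle receive different IDs, while repeated calls from the same thread return the same ID.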
* Add single kernel filtering for roofline
* Add --kernel to documentation
* Add kernel labels to roofline pdfs
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Add test cases
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Add autodetect for mode (profile or analyze) during roof validate and filter
Prevent --kernel from affecting roofline in GUI mode, although this may already be broken in the develop branch
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Add note about roof-only usage checking for existing profiling files in the dir. If roof-only is not provided, rocprof-compute currently assumes it has to profile in full regardless. Will look into this another day.
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Update CHANGELOG.md
Add line in resolved issues section to highlight that kernel filtering is now working for roofline plots
* Apply changes suggested by docs team
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
* Update projects/rocprofiler-compute/CHANGELOG.md
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
---------
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
* Removing regex from the tool
* Adding alternative for regex regarding handling
* Adding ROCpd
* Removing regex include
* Apply suggestion from @jomadsen_amdeng
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
* Apply suggestion from @jomadsen_amdeng
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
* Apply suggestion from @jomadsen_amdeng
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
* Adding Standalone Regex Header File
* Fixing Regex to handle grouping and
* Fixing Regex to handle grouping and
* Fixing Regex to handle grouping and
* Formatting Fix
* Update rocprofiler-sdk-restrictions.yml
* Separating regex.hpp to source and header & Adding Tests for parity with std::regex
* Update regex.cpp
* Using snake_case for naming and addressing some comments
* Adding more tests & README for regex implementation
* Updating rocprofiler sdk restrictions workflow
* Updating more tests & README for regex implementation
* Update README_regex.md
* Rename README_regex.md to README.md
---------
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
* added f4f6 description and VALU FLOPS split
* changed peak ammolite vars to local
* reverted to dict peak initialization
* ruff check format
* updated VALU descriptions
* updated VALU descriptions
* Update parser.py
* Update parser.py
Added graceful NameError handling
Moved globals() update to init_metric_evaluation with ammolite__ vars and raw pmc_df
* update formatting
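
A minimal sketch of what graceful NameError handling during metric evaluation might look like. It does not reproduce parser.py. The `evaluate_metric` helper and the variable names in `namespace` are hypothetical, and using `eval` on expression strings is an assumption about how metrics are evaluated.

```python
def evaluate_metric(expression, namespace):
    """Evaluate a metric expression string; return None if a name is undefined."""
    try:
        # Builtins are disabled so only names supplied in `namespace` resolve.
        return eval(expression, {"__builtins__": {}}, namespace)
    except NameError as err:
        # Report the problem and continue instead of aborting the whole analysis.
        print(f"Skipping metric '{expression}': {err}")
        return None


# Hypothetical variables; the real ammolite__ names are not shown in this log.
namespace = {"ammolite__peak_flops": 100.0, "measured_flops": 25.0}
ok = evaluate_metric("100 * measured_flops / ammolite__peak_flops", namespace)
missing = evaluate_metric("measured_flops / ammolite__undefined", namespace)
```

Here `ok` evaluates normally, while `missing` becomes `None` because its expression references a name that is absent from the namespace.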
* add working-directory to ver_check step in rocprofiler-compute-packaging.yml
* Remove compute mi-rhel9 workflow badge since workflow is no longer in develop
* Update actions to v5 in rocprofiler-compute-docs
* Add working directory to steps in rocprofiler-compute-docs.yml
* Revert back to v4 pages
* Remove rocprofiler-compute-docs.yml workflow
* Remove docs workflow badge from rocprofiler-compute in README.md
* Remove rocprofiler-compute-packaging.yml, update README.md badges
Analysis data dump
* Add `--output-format` and `--output-name` option to analyze mode
* Remove `--output` and `--save-dfs` options from analyze mode
* Add documentation on `rocpd` output format and analysis database file
* Create sqlite3 database using the object-relational mapping (ORM) provided
by the sqlalchemy library
* Fix metrics config to remove metrics marked as `null`, fix `Unit` header, add
missing `title`
* Add test cases to ensure analysis data dump works
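
The metrics config fix mentioned above (removing metrics marked as `null`) could be sketched as follows. This is an illustration only: the `drop_null_metrics` helper and the sample config layout are hypothetical, and the sketch assumes the config has already been loaded so that `null` entries appear as `None`.

```python
def drop_null_metrics(config):
    """Return a copy of a nested config with None-valued entries removed."""
    if isinstance(config, dict):
        return {k: drop_null_metrics(v) for k, v in config.items() if v is not None}
    if isinstance(config, list):
        return [drop_null_metrics(v) for v in config if v is not None]
    return config


# Hypothetical config fragment; real metric names and layout may differ.
sample = {
    "title": "Example panel",
    "metrics": {"metric_a": {"unit": "GB/s"}, "metric_b": None},
}
cleaned = drop_null_metrics(sample)
```

After the call, `cleaned["metrics"]` contains only `metric_a`, and `sample` is left unmodified.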
* Move amd-smi to use caching mechanism
* Add VCN and JPEG activity to rocpd
* Switch cpu_freq to use caching mechanism
* Different approach with xcp activity & applied suggestions from code review
* Applied suggestions from code review
* Fix shadowing
* Applied suggestions from code review
* Revert "SWDEV-547589 - Add hipDeviceMallocUncached to hipMemCreate (#815)"
This reverts commit 5ce7103555.
* Revert "SWDEV-547589 - comment for flag hipDeviceMallocUncached in hipMemcreate (#339)"
This reverts commit 04dac5eae3.
* SWDEV-551942 - implement hipMemAllocationTypeUncached in hipMemCreate
When stochastic sampling is not active, the trap handler incorrectly
branches to .check_exceptions, bypassing the software trap ID checks
and in turn not advancing the PC. Fixed the issue to always check software
traps regardless of PC sampling state.
Co-authored-by: Shweta Khatri <shweta.khatri@amd.com>
* Using semaphore to sync with all peer processes in finalization stage
[rocprofv3] Implement synchronization using POSIX semaphore in finalization
* clang format code
* clang 11 format code
* Add process sync option for rocprofv3
* Default value of process sync is false
* Update source/lib/rocprofiler-sdk-tool/tool.cpp
Apply suggestion by Copilot
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* update according to comments
* add new line to helper.hpp
---------
Co-authored-by: Huanran Wang <huanrwan@amd.com>
Co-authored-by: Huanran Wang <huanran.wang@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
This commit reverts the following related commits which cause
test failures:
6d15779b3e rocr/driver: add PC sampling support to driver interface
56cb9390ff rocr/driver: add PC sampling support to driver interface
76bf829f09 rocr/driver: add ASAN header page management to Driver class
a47c060d6a rocr/driver: add ASAN header page management to Driver class
02d7eaf3b7 rocr: add memory sharing call to Driver interface
9312468655 rocr: add memory sharing call to Driver interface
Assigning a null terminator at the end of the string wrote past the
end of the allocated buffer. This patch corrects that.
Signed-off-by: Ashutosh Mishra <ashutosh.mishra@amd.com>