Commit Graph

630 Commits

Author SHA1 Message Date
Elwazir, Ammar 95e0341266 CI Update: Removing OLD ROCProfiler-SDK files (#232)
Removing OLD ROCProfiler-SDK files
2025-02-21 11:59:25 -06:00
Kandula, Venkateshwar reddy 5cb4ad449f SWDEV-515574: Cache Number_Node static value. (#217)
* Cache Number_Node static value. To avoid value overwriting in consecutive dispatch callbacks.

* Format.

* tests for number_node evaluate.

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
2025-02-19 19:10:16 -06:00
Elwazir, Ammar b4b81f9095 Temp: Disable CI for Mi300x (#221)
Update continuous_integration.yml
2025-02-18 11:41:49 -06:00
Madsen, Jonathan 470f347e50 SDK: remove majority of exceptions (#176)
* SDK: remove majority of exceptions

- replace with ROCP_FATAL, ROCP_CI_LOG(WARNING), etc.
- improve logging of symbolic link
- add --readlink and --realpath (hidden options) to rocprofv3 to follow symlinks for preloaded libraries

* Add rocprofv3 --rocm-root argument

* Fix registration resolved_exists

* Fix rocprofv3_avail.py

* Update logging for rocprofiler_configure search

- relax failure conditions

* Misc clang-tidy fixes

* Fix merge

* Fix merge

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
2025-02-18 10:44:37 -06:00
Welton, Benjamin fd99654433 Fix install for conversion-script (#211) 2025-02-13 19:00:20 -06:00
Elwazir, Ammar 376c2a96ad Lowering log level for COMGR logs (#210)
* Lowering log level for COMGR logs

* Format Fix
2025-02-13 12:59:29 -06:00
Bhardwaj, Gopesh 215875de32 Adding CodeQL Analysis Workflow (#172)
* adding codeql.yml

* update codeql

* update codeql

* excluding external repos

* filter external

* filter external and build

* Apply suggestions from code review

* Removed experimental test line

* Adding config

* moving codeql config out of workflows

* Disable Cdash

* update codeql

* replacing run-ci with simple cmake build

* cmake fix

* removing codeql_config

* Adding rule for python and actions

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2025-02-13 12:55:44 -06:00
Madsen, Jonathan 0d7ca72d84 SWDEV-514449: Fix missing thread pre/post callbacks (#204)
* SWDEV-514449: Fix missing thread pre/post callbacks

- invoke pre/post-callback around internal thread creation

* Update changelog

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-13 12:39:10 -06:00
Bhardwaj, Gopesh 848242eb5c SWDEV-514046 documentation build fix (#208) 2025-02-13 09:25:30 -06:00
Welton, Benjamin 27c4277222 Update VERSION (#207)
* Update VERSION

Update version to 0.7.0

* Fixing test install build step issue

* Updates from editor

---------

Co-authored-by: Ammar ELWazir <Ammar.ELWazir@amd.com>
2025-02-13 08:51:10 -06:00
Jakaraddi, Manjunath c77596b703 SWDEV-499989: Conversion Script to change counter collection output format from v3 to v1 (#107)
* SWDEV-499989: Add script to convert rocprofv3 counter collection output format to that of v1

* Add logging and argparsing

* Dropping duplicated counters in pmc multiple lines

* Adding test for conversion

* moving conversion script to test files

* copy conversion script from scripts folder
2025-02-12 11:31:17 -08:00
Trowbridge, Ian cc0c401615 Memory Allocation Counter Track Shows Total Allocation (#71)
* Counter track for memory allocation is now a running sum showing total allocation

* Address review comments

* Update source/lib/output/generatePerfetto.cpp

Co-authored-by: Meserve, Mark <Mark.Meserve@amd.com>

* Updated to reflect review comments

* Fix compilation errors on CI

* remove braces on scalar

* Fix struct compilation issues

* Removed name_to_id for sanitizer

---------

Co-authored-by: Meserve, Mark <Mark.Meserve@amd.com>
2025-02-12 12:59:53 -06:00
Nagaraj, Sriraksha 5d0b220c37 target cu to string input (#198)
* target cu to string input

* review comments

* review comments
2025-02-12 12:51:39 -06:00
Kandula, Venkateshwar reddy 6427fbafc2 Accum_vgpr support in Rocprofv3 (#70)
* output accumulate vgpr count

* fix logic for computing accum_vgpr

* add accum_vgpr to csv.

* accumulation vgpr's docs and support for rocprofv3

* CHANGELOG.md

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
2025-02-12 10:47:46 -08:00
Bhardwaj, Gopesh 075d36eb82 output format envs doc update (#173) 2025-02-11 21:37:12 -06:00
Baraldi, Giovanni 831e469320 SWDEV-513725: Update readme for gfx11+ power states (#193)
* Update readme

* Update README.md

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Address review comments

* Update README.md

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2025-02-11 18:12:41 -06:00
Madsen, Jonathan 59b41ab5aa rocprofv3: Update rocprofv3 command line for ATT (#201)
* rocprofv3: suppress agent info when no data collected

* Update output config serialization

- full serialization of output configuration

* Update rocprofiler-sdk-att/tests

- add version and soversion
- change output directory
- generate libatt_decoder_summary
- disable tests instead of removing them

* Update rocprofv3 command-line

- make --att-library-path hidden by default
- simplify check_att_capability
- reorder pc sampling options
- add hidden --echo option
- remove ROCPROF_LIST_AVAIL_TOOL_LIBRARY from preload

* Add new rocprofv3 tests for specifying the ATT library path

* Tweak to rocprofv3-test-hsa-multiqueue-att tests

* Update rocprofv3 tool to enable output with att

* Fix standalone test installation

* Revert to fetchcontent_makeavailable to fetchcontent_populate

* Revert tests/common/CMakeLists.txt

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 18:10:48 -06:00
Madsen, Jonathan 6246ec4040 SDK: Agent UUIDs, agent runtime visibility, kernel symbol address (#154)
* [DO NOT MERGE] Misc UUID updates

- this is WIP

* Agent visibility

- Support for ROCR_VISIBLE_DEVICES, HIP_VISIBLE_DEVICES, CUDA_VISIBLE_DEVICES, GPU_DEVICE_ORDINAL

* Update CHANGELOG

* tweak to rocprofiler_agent_runtime_visiblity_t

* Code object kernel address

- new fields in code_object_kernel_symbol_register_data_t
  - kernel_code_entry_byte_offset
  - kernel_address

* Support ROCR_VISIBLE_DEVICES reordering devices for HIP

* Addressed code review changes

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 14:36:23 -06:00
Madsen, Jonathan 3071199386 rocprofv3: do not abort if counter does not have dimensions (#150)
* rocprofv3: do not abort if counter does not have dimensions

* Relax error handling further in rocprofv3 metadata

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 14:31:25 -06:00
Kandula, Venkateshwar reddy 143f84fe6b [BUG FIX] store dimensions in counter id when used reduce operator (#181)
* save other dimension in counter id.

* Formatting

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
2025-02-11 13:05:57 -06:00
Madsen, Jonathan 070b659a9a Re-enable clang-tidy for core workflows + clang-tidy fixes (#197)
* Ensure the clang-tidy is updated + clang-tidy fixes

* update-ci workflow

* Enable clang-tidy checks

* Add extra logging to device counter collection samples

* Misc clang-tidy fixes

* Disable device counter collection samples for ThreadSanitizer

* Formatting

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 10:58:47 -06:00
Bhardwaj, Gopesh cdf22eba7d Adding pc sampling how to guide (#160)
* Adding pc sampling how to guide

* doc update

* Fixing indentation

* updating index

* updating doc

* updating doc

* Added field information

* Fixing Formatting

* fix formatting error

* Added json format for pc sampling

* feedback resolved

* formatting for text

* PC Sampling API doc

* Reformatted

* Note for shared systems

* update docs

* correcting relative path for cross-referencing

---------

Co-authored-by: vlaindic_amdeng <vladimir.indic@amd.com>
2025-02-10 20:33:05 -06:00
Elwazir, Ammar c478c24616 Disabling Mi325 temp. (#199) 2025-02-10 20:07:20 -06:00
Welton, Benjamin 080b2ba451 [SWDEV-513658] Force HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG value to be used with HSA calls (#192)
* Force HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG value to be used with HSA calls

Fix for CI

* More tweaks

* Increase reproducible-runtime kernel sleep granularity

* Fix data race in synchronous device counter collection sample

* Update device counting service

- add get_active_context function

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-10 11:34:26 -06:00
Indic, Vladimir e67a4451d8 Show host-trap configurations only (#194) 2025-02-10 11:32:53 -06:00
Elwazir, Ammar 5410fabd3d Fixing Clang tidy errors (#195)
* Fixing Clang tidy errors

* format-fix

* Update code_object.hpp

* Clang Tidy Fixes on the whole Source folder

* Update source/CMakeLists.txt

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Addressing reviews

* Correcting the logic for parsing att counters

* Format Fix

* Update source/lib/rocprofiler-sdk-att/tests/dummy_decoder.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Update source/lib/rocprofiler-sdk-att/tests/standalone_tool_main.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Update source/lib/rocprofiler-sdk-tool/config.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Formatting

* Deactivate clang-tidy in source/lib/rocprofiler-sdk-att/tests

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-07 22:33:32 -06:00
Elwazir, Ammar 835e466b65 Temp: Disable Navi3/4 from CI (#196)
Update continuous_integration.yml
2025-02-07 15:21:14 -06:00
Bhardwaj, Gopesh 7821657d65 SWDEV-510794 Adding MPI usage with rocprofv3 (#183)
* swdev-510794 Adding MPI usage with rocprofv3

* update doc

* Fixed build issues

* updating doc

* doc update

* Fixed Typos

* csv format

* change format to shell
2025-02-07 12:01:31 +05:30
Madsen, Jonathan e743bf5a93 Undefined behavior warnings caught by ROCPROFILER_DEFAULT_FAIL_REGEX (#23)
* Add regex for undefined behavior to ROCPROFILER_DEFAULT_FAIL_REGEX

- add UBSAN_OPTIONS to setup-sanitizer-env.sh

* Improve ROCPROFILER_DEFAULT_FAIL_REGEX

* Use -fno-sanitize-recover=undefined flag

- this compiler flag causes all undefined behavior errors to exit

* Revert ROCPROFILER_DEFAULT_FAIL_REGEX

* fix for shift overflow

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Manjunath-Jakaraddi <manjunath.jakaraddi@amd.com>
2025-02-06 08:55:57 -06:00
Welton, Benjamin 6c396adf83 Add example for synchronous reading of device counters (#64)
* Add example for synchronous reading of device counters

We already have test cases for this use case, but this is a sample
so that our collaborators have a place to quickly pull
code from for use on their end (and to serve as a working example).

* Formatting fix

* Formatting fix

* Minor change for testing

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
2025-02-06 08:35:55 -06:00
Madsen, Jonathan 0fbe6cc7b6 SDK: No bg thread if no clients use SDK (#123)
* SDK: No bg thread if no clients use SDK

* Update CHANGELOG

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-06 08:34:56 -06:00
U, Srihari 90ae424c57 Initialize extremes to max and min values (#184)
* Initialize extremes to max and min values

* Address review comment

* Adding clang format
2025-02-06 08:32:37 -06:00
Nagaraj, Sriraksha 03e5a1d9cc remove duplication (#190) 2025-02-06 08:31:53 -06:00
Elwazir, Ammar 02a519e84e 6.4 fixes for HSA and HIP (#191)
* Adding support for hsa_amd_signal_wait_all

* Fixes for HIP

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
2025-02-06 07:56:08 -06:00
Bhardwaj, Gopesh 12508b9521 Adding ROCTx usage doc (#159)
* Adding Roctx usage doc

* updated CHANGELOG

* doc update

* Fixing Related Pages issue

* updating doc

* updating docs

* Adding Resource naming section

* Fixed Formatting

* format fix

* format fix

* Fixing build due to incorrect indentation
2025-02-05 11:04:24 -06:00
Jakaraddi, Manjunath 9c89b475b0 SWDEV-506317: Kernel trace failing due to Code object errors (#170)
SWDEV-506317: Kernel trace failing
2025-02-04 18:01:42 -06:00
Elwazir, Ammar dd5c0ea257 Support new HIP APIs (#179)
* Adding New HIP APIs

* Format Fix

* Format Fix

* Removing changes from ostream and moving it to format

* Addressing Code Review Comments

* Versioning the new hip calls formatting

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
2025-02-04 15:50:18 -06:00
Elwazir, Ammar b9ad800194 Tests: Scratch memory validate bug, summary validate bug (#187)
Scratch memory validate bug, summary validate bug

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
2025-02-04 12:48:28 -06:00
Welton, Benjamin 0c4a56c6bb [SWDEV-509876] Remove buffer requirement from device counting service (#132)
* [SWDEV-509876] Remove buffer requirement from device counting service

No longer require a buffer to be given when setting up device counting
service. This is to reduce performance overhead in cases where immediate
return of counting samples is being used (synchronous mode).

* Missed file

* Update source/include/rocprofiler-sdk/device_counting_service.h

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Update source/lib/rocprofiler-sdk/counters/controller.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Update source/lib/rocprofiler-sdk/counters/device_counting.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Fixes for build

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-02-04 06:06:03 -06:00
Nagaraj, Sriraksha d4a51e4102 Adding att v3 support (#84)
* Adding att v3 support

* misc fix

* bug fix

* Python linting workflow and rules

* fix regex

* Adding temporary args

* fix temporary args

* fix format

* remove att_perfcounters from test input

* Review comments (#163)

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

* Revert "Review comments (#163)"

This reverts commit 9ef0f8e5a4489d5581255e1b70ced2aef5c1c1d0.

* Address review comments 2

* review changes

* review comments

* review

* cmake alias

* review

* review

* review

* review

* Enabling percounter in v3 script

* review

* formatting

* formatting

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com>
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-02-04 04:05:38 -06:00
Madsen, Jonathan 72a27feb04 Fix HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG for ROCm < 6.3 (#178)
Fix HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG for ROCm < 6.4

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-04 04:05:19 -06:00
Madsen, Jonathan 7fcd80f744 Fix async memory copy validation tests (#182)
* Fix async copy validation test

- make the async copy tracing test work regardless of however many HSA memory copies the HIP memory copy decomposes into

* Fix rocprofv3 memory copy tests

* Fix compilation support for hipGraphBatchMemOpNodeGetParams

* Fix rocprofv3-test-summary-*-validate

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-04 02:41:19 -06:00
Madsen, Jonathan f3752faa0a Update HIP string formatting for ROCm 6.4.0 (#144)
Fix HIP data type stringify

- when ROCPROFILER_CI is not defined, provide default for case statements
- Add support for hipGraphNodeTypeBatchMemOp when HIP version is >= 6.4.0

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-30 20:51:52 -06:00
Kandula, Venkateshwar reddy 121901c321 add gfx12 for counter collection tests (#108)
* add gfx12 for counter def.

* Update continuous_integration.yml

* Update counter_defs.yaml

* commenting logging.

* Update ioctl.cpp

* add gfx12 to tests

* Update ioctl.cpp

* Add description to GFX12 GL2C_EA_RDREQ counter

* Updates from editor

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Kuricheti, Mythreya <Mythreya.Kuricheti@amd.com>
2025-01-30 15:16:48 -06:00
Madsen, Jonathan 1f49d6c57b Partial fix of legacy rocprofiler project name (#110)
* Partial fix of legacy rocprofiler project name

* Formatting fix

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-30 13:43:35 -06:00
Kuricheti, Mythreya d43070bf08 Fix navi48 counter event IDs (#158)
* Initial fix for navi48 counters

* Add GL2C navi4x gfx12 counters
2025-01-30 13:40:25 -06:00
Elwazir, Ammar acab62706b SLES Git Safe Directory (#177)
* Update continuous_integration.yml

* Updates from editor
2025-01-30 12:32:58 -06:00
Kuricheti, Mythreya 58ecbd83a9 Generate code coverage comment as collapsible summary (#169)
* Generate codecoverage comment as collapsible summary

* Tweak markdown formatting
2025-01-30 12:04:07 -06:00
Baraldi, Giovanni 39db6d842f Fix for ATT context stop while packets are being processed (#171)
Fix for context stop while packets are being processed

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-01-29 11:06:32 -08:00
Mallya, Ameya Keshava 35f8374e35 Added !verify trigger 2025-01-28 20:07:15 -08:00