Commit Graph

370 Commits

Author SHA1 Message Date
Kandula, Venkateshwar reddy 5cb4ad449f SWDEV-515574: Cache Number_Node static value. (#217)
* Cache Number_Node static value. To avoid value overwriting in consecutive dispatch callbacks.

* Format.

* tests for number_node evaluate.

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
2025-02-19 19:10:16 -06:00
Madsen, Jonathan 470f347e50 SDK: remove majority of exceptions (#176)
* SDK: remove majority of exceptions

- replace with ROCP_FATAL, ROCP_CI_LOG(WARNING), etc.
- improve logging of symbolic link
- add --readlink and --realpath (hidden options) to rocprofv3 to follow symlinks for preloaded libraries

* Add rocprofv3 --rocm-root argument

* Fix registration resolved_exists

* Fix rocprofv3_avail.py

* Update logging for rocprofiler_configure search

- relax failure conditions

* Misc clang-tidy fixes

* Fix merge

* Fix merge

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
2025-02-18 10:44:37 -06:00
Elwazir, Ammar 376c2a96ad Lowering log level for COMGR logs (#210)
* Lowering log level for COMGR logs

* Format Fix
2025-02-13 12:59:29 -06:00
Madsen, Jonathan 0d7ca72d84 SWDEV-514449: Fix missing thread pre/post callbacks (#204)
* SWDEV-514449: Fix missing thread pre/post callbacks

- invoke pre/post-callback around internal thread creation

* Update changelog

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-13 12:39:10 -06:00
Trowbridge, Ian cc0c401615 Memory Allocation Counter Track Shows Total Allocation (#71)
* Counter track for memory allocation is now a running sum showing total allocation

* Address review comments

* Update source/lib/output/generatePerfetto.cpp

Co-authored-by: Meserve, Mark <Mark.Meserve@amd.com>

* Updated to reflect review comments

* Fix compilation errors on CI

* remove braces on scalar

* Fix struct compilation issues

* Removed name_to_id for sanitizer

---------

Co-authored-by: Meserve, Mark <Mark.Meserve@amd.com>
2025-02-12 12:59:53 -06:00
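The running-sum behavior described in the commit above (cc0c401615) can be illustrated with a minimal sketch. It assumes a hypothetical list of `(timestamp, delta_bytes)` events, where allocations are positive and frees are negative; the function name and event layout are illustrative only and are not taken from `generatePerfetto.cpp`.

```python
from itertools import accumulate


def running_allocation_total(events):
    """Turn (timestamp, delta_bytes) events into (timestamp, total_bytes) points.

    Hypothetical helper: each point is the sum of all deltas up to and
    including that timestamp, so the track shows total allocation rather
    than the size of each individual allocation.
    """
    ordered = sorted(events, key=lambda event: event[0])
    totals = accumulate(delta for _, delta in ordered)
    return [(timestamp, total) for (timestamp, _), total in zip(ordered, totals)]


# Events may arrive out of order; they are sorted by timestamp first.
events = [(30, -256), (10, 1024), (20, 256)]
points = running_allocation_total(events)
```

Sorting before accumulating keeps the track monotonic in time even if events are recorded out of order.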
Kandula, Venkateshwar reddy 6427fbafc2 Accum_vgpr support in Rocprofv3 (#70)
* output accumulate vgpr count

* fix logic for computing accum_vgpr

* add accum_vgpr to csv.

* accumulation vgpr's docs and support for rocprofv3

* CHANGELOG.md

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
2025-02-12 10:47:46 -08:00
Madsen, Jonathan 59b41ab5aa rocprofv3: Update rocprofv3 command line for ATT (#201)
* rocprofv3: suppress agent info when no data collected

* Update output config serialization

- full serialization of output configuration

* Update rocprofiler-sdk-att/tests

- add version and soversion
- change output directory
- generate libatt_decoder_summary
- disable tests instead of removing them

* Update rocprofv3 command-line

- make --att-library-path hidden by default
- simplify check_att_capability
- reorder pc sampling options
- add hidden --echo option
- remove ROCPROF_LIST_AVAIL_TOOL_LIBRARY from preload

* Add new rocprofv3 tests for specifying the ATT library path

* Tweak to rocprofv3-test-hsa-multiqueue-att tests

* Update rocprofv3 tool to enable output with att

* Fix standalone test installation

* Revert to fetchcontent_makeavailable to fetchcontent_populate

* Revert tests/common/CMakeLists.txt

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 18:10:48 -06:00
Madsen, Jonathan 6246ec4040 SDK: Agent UUIDs, agent runtime visibility, kernel symbol address (#154)
* [DO NOT MERGE] Misc UUID updates

- this is WIP

* Agent visibility

- Support for ROCR_VISIBLE_DEVICES, HIP_VISIBLE_DEVICES, CUDA_VISIBLE_DEVICES, GPU_DEVICE_ORDINAL

* Update CHANGELOG

* tweak to rocprofiler_agent_runtime_visiblity_t

* Code object kernel address

- new fields in code_object_kernel_symbol_register_data_t
  - kernel_code_entry_byte_offset
  - kernel_address

* Support ROCR_VISIBLE_DEVICES reordering devices for HIP

* Addressed code review changes

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 14:36:23 -06:00
Madsen, Jonathan 3071199386 rocprofv3: do not abort if counter does not have dimensions (#150)
* rocprofv3: do not abort if counter does not have dimensions

* Relax error handling further in rocprofv3 metadata

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 14:31:25 -06:00
Kandula, Venkateshwar reddy 143f84fe6b [BUG FIX] store dimensions in counter id when used reduce operator (#181)
* save other dimension in counter id.

* Formatting

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
2025-02-11 13:05:57 -06:00
Madsen, Jonathan 070b659a9a Re-enable clang-tidy for core workflows + clang-tidy fixes (#197)
* Ensure the clang-tidy is updated + clang-tidy fixes

* update-ci workflow

* Enable clang-tidy checks

* Add extra logging to device counter collection samples

* Misc clang-tidy fixes

* Disable device counter collection samples for ThreadSanitizer

* Formatting

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 10:58:47 -06:00
Welton, Benjamin 080b2ba451 [SWDEV-513658] Force HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG value to be used with HSA calls (#192)
* Force HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG  value to be used with HSA calls

Fix for CI

* More tweaks

* Increase reproducible-runtime kernel sleep granularity

* Fix data race in synchronous device counter collection sample

* Update device counting service

- add get_active_context function

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-10 11:34:26 -06:00
Indic, Vladimir e67a4451d8 Show host-trap configurations only (#194) 2025-02-10 11:32:53 -06:00
Elwazir, Ammar 5410fabd3d Fixing Clang tidy errors (#195)
* Fixing Clang tidy errors

* format-fix

* Update code_object.hpp

* Clang Tidy Fixes on the whole Source folder

* Update source/CMakeLists.txt

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Addressing reviews

* Correcting the logic for parsing att counters

* Format Fix

* Update source/lib/rocprofiler-sdk-att/tests/dummy_decoder.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Update source/lib/rocprofiler-sdk-att/tests/standalone_tool_main.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Update source/lib/rocprofiler-sdk-tool/config.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Formatting

* Deactivate clang-tidy in source/lib/rocprofiler-sdk-att/tests

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-07 22:33:32 -06:00
Madsen, Jonathan e743bf5a93 Undefined behavior warnings caught by ROCPROFILER_DEFAULT_FAIL_REGEX (#23)
* Add regex for undefined behavior to ROCPROFILER_DEFAULT_FAIL_REGEX

- add UBSAN_OPTIONS to setup-sanitizer-env.sh

* Improve ROCPROFILER_DEFAULT_FAIL_REGEX

* Use -fno-sanitize-recover=undefined flag

- this compiler flag causes all undefined behavior errors to exit

* Revert ROCPROFILER_DEFAULT_FAIL_REGEX

* fix for shift overflow

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Manjunath-Jakaraddi <manjunath.jakaraddi@amd.com>
2025-02-06 08:55:57 -06:00
Madsen, Jonathan 0fbe6cc7b6 SDK: No bg thread if no clients use SDK (#123)
* SDK: No bg thread if no clients use SDK

* Update CHANGELOG

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-06 08:34:56 -06:00
U, Srihari 90ae424c57 Initialize extremes to max and min values (#184)
* Initialize extremes to max and min values

* Address review comment

* Adding clang format
2025-02-06 08:32:37 -06:00
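The idea named in the commit above (90ae424c57) can be sketched as follows. The commit does not say which statistics are affected, so this assumes a simple running minimum/maximum tracker: the minimum starts at the largest possible value and the maximum at the smallest, so the first sample always replaces both. The class and attribute names are hypothetical.

```python
import math


class Extremes:
    """Hypothetical running min/max tracker."""

    def __init__(self):
        # Start the minimum at +inf and the maximum at -inf so that any
        # real sample overrides them; starting both at 0 would report a
        # wrong minimum for all-positive data.
        self.minimum = math.inf
        self.maximum = -math.inf

    def update(self, value):
        self.minimum = min(self.minimum, value)
        self.maximum = max(self.maximum, value)


stats = Extremes()
for duration in (7, 3, 9):
    stats.update(duration)
```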
Nagaraj, Sriraksha 03e5a1d9cc remove duplication (#190) 2025-02-06 08:31:53 -06:00
Elwazir, Ammar 02a519e84e 6.4 fixes for HSA and HIP (#191)
* Adding support for hsa_amd_signal_wait_all

* Fixes for HIP

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
2025-02-06 07:56:08 -06:00
Jakaraddi, Manjunath 9c89b475b0 SWDEV-506317: Kernel trace failing due to Code object errors (#170)
SWDEV-506317: Kernel trace failing
2025-02-04 18:01:42 -06:00
Elwazir, Ammar dd5c0ea257 Support new HIP APIs (#179)
* Adding New HIP APIs

* Format Fix

* Format Fix

* Removing changes from ostream and moving it to format

* Addressing Code Review Comments

* Versioning the new hip calls formatting

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
2025-02-04 15:50:18 -06:00
Welton, Benjamin 0c4a56c6bb [SWDEV-509876] Remove buffer requirement from device counting service (#132)
* [SWDEV-509876] Remove buffer requirement from device counting service

No longer require a buffer to be given when setting up device counting
service. This is to reduce performance overhead in cases where immediate
return of counting samples is being used (synchronous mode).

* Missed file

* Update source/include/rocprofiler-sdk/device_counting_service.h

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Update source/lib/rocprofiler-sdk/counters/controller.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Update source/lib/rocprofiler-sdk/counters/device_counting.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Fixes for build

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-02-04 06:06:03 -06:00
Nagaraj, Sriraksha d4a51e4102 Adding att v3 support (#84)
* Adding att v3 support

* misc fix

* bug fix

* Python linting workflow and rules

* fix regex

* Adding temporary args

* fix temporary args

* fix format

* remove att_perfcounters from test input

* Review comments (#163)

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

* Revert "Review comments (#163)"

This reverts commit 9ef0f8e5a4489d5581255e1b70ced2aef5c1c1d0.

* Address review comments 2

* review changes

* review comments

* review

* cmake alias

* review

* review

* review

* review

* Enabling percounter in v3 script

* review

* formatting

* formatting

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com>
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-02-04 04:05:38 -06:00
Madsen, Jonathan 72a27feb04 Fix HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG for ROCm < 6.3 (#178)
Fix HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG for ROCm < 6.4

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-04 04:05:19 -06:00
Madsen, Jonathan 7fcd80f744 Fix async memory copy validation tests (#182)
* Fix async copy validation test

- make the async copy tracing test work regardless of however many HSA memory copies the HIP memory copy decomposes into

* Fix rocprofv3 memory copy tests

* Fix compilation support for hipGraphBatchMemOpNodeGetParams

* Fix rocprofv3-test-summary-*-validate

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-04 02:41:19 -06:00
Madsen, Jonathan f3752faa0a Update HIP string formatting for ROCm 6.4.0 (#144)
Fix HIP data type stringify

- when ROCPROFILER_CI is not defined, provide default for case statements
- Add support for hipGraphNodeTypeBatchMemOp when HIP version is >= 6.4.0

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-30 20:51:52 -06:00
Kandula, Venkateshwar reddy 121901c321 add gfx12 for counter collection tests (#108)
* add gfx12 for counter def.

* Update continuous_integration.yml

* Update counter_defs.yaml

* commenting logging.

* Update ioctl.cpp

* add gfx12 to tests

* Update ioctl.cpp

* Add description to GFX12 GL2C_EA_RDREQ counter

* Updates from editor

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Kuricheti, Mythreya <Mythreya.Kuricheti@amd.com>
2025-01-30 15:16:48 -06:00
Kuricheti, Mythreya d43070bf08 Fix navi48 counter event IDs (#158)
* Initial fix for navi48 counters

* Add GL2C navi4x gfx12 counters
2025-01-30 13:40:25 -06:00
Baraldi, Giovanni 39db6d842f Fix for ATT context stop while packets are being processed (#171)
Fix for context stop while packets are being processed

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-01-29 11:06:32 -08:00
Welton, Benjamin 0d701cdaac [SWDEV-482060] Set execute permission for HSA allocated memory (#151)
We need execute permission for HSA memory (req for IB buffers).
Enforcement is upcoming which will break counter collection (see
ticket).

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
2025-01-28 16:46:22 -08:00
Indic, Vladimir e4d736839d Temporarily allow only host-trap sampling (#156) 2025-01-27 13:26:11 -06:00
Rawat, Swati 97b7a6315d update copyright date to 2025 (#102)
* Update LICENSE

* Update conf.py

* Update copyright year

* [fix] Update copyright year

* Update copyright year "ROCm Developer Tools"

* Add license headers to c++ files

* Add license to *.py

* Update licenses in rocdecode sources

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Mythreya <mythreya.kuricheti@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-22 19:11:20 -06:00
Trowbridge, Ian 042c761b64 Changed memory_allocation.csv name to memory_allocation_trace.csv (#111) 2025-01-22 11:14:42 -06:00
Welton, Benjamin 3076660e60 Add gfx941/gfx942 to ValuPipeIssueUtil (#139) 2025-01-21 14:39:03 -08:00
Welton, Benjamin c6b52701c7 Add gfx940/gfx9 to ValuPipeIssueUtil (#138)
Was dropped, likely by mistake, in the transition to yaml
2025-01-21 12:12:39 -08:00
Baraldi, Giovanni 081419b745 Fix throw on repeated filename (#124)
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-01-20 17:24:28 -08:00
Trowbridge, Ian e307b89ca4 rocDecode API Tracing Support (#49)
* rocDecode API Tracing support

* Test bin file added to rocdecode. Need to add validate python methods

* Added option to not make rocDecode tests

* Added rocdecode and rocprofv3 tests

* Added csv test

* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependency. Changed cmake option to disable tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI

* Add option to avoid building rocdecode tests

* Added option to avoid building rocdecode bin file

* Merge conflict error

* CMake files changed in response to review comments. Attempting to implement callbacks.

* Turned off test building for rocdecode

* Minor fixes for review comments

* Review comments

* Updated formatting

* Document changes and format.hpp reversion. Need to remove iterate args support for now for later update.

* Remove iterate args support

* Remove iterate-args

* enforce abi versioning in macro if

* Fix doc error

* removed spaces to fix indentation error

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2025-01-17 14:42:25 -08:00
Baraldi, Giovanni 1f01526eed SWDEV-478762+FEAT-62196: Fix crash on AQL replay (#104)
* SWDEV-478762: Fix crash on replay

* Fix iteration range

* Format

* Refactor

* Addressing review comments

* Address review comments

* Formatting

* Format

* Refactor

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-01-17 12:23:48 -06:00
Jakaraddi, Manjunath dfee6489b1 SWDEV-500520: Updated documentation for hang issue (#79)
* SWDEV-500520: Updated documentation for hang issue

* Avoid fatal error when invalid metric is found

* removing invalid metrics

* clang formatting
2025-01-16 02:14:22 -08:00
Welton, Benjamin 536fbba627 [SWDEV-509659] Skip rocprof device counting tests if lacking permissions (#125)
* [SWDEV-509659] Skip rocprof device counting tests if lacking permissions

Skips non-intercept test if proper permissions are not obtained
(SYS_PERFMON). This should be the only test that fails due to permission
issues (others do not require the IOCTL to pass).

Regex match sample: https://regexr.com/8b29s

* Update source/lib/rocprofiler-sdk/counters/tests/CMakeLists.txt

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* fix

* Update source/lib/rocprofiler-sdk/counters/tests/CMakeLists.txt

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2025-01-16 01:51:06 -06:00
Welton, Benjamin 71dc203b0c Small debug print fix in ioctl.cpp (#120)
* Small debug print fix in ioctl.cpp

Fix debug print statement to print agent id.

* formatting

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-15 18:00:04 -08:00
Madsen, Jonathan fae4ad614c Fix host function logging (#63)
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-15 19:54:35 -06:00
Madsen, Jonathan 1b7ab08ded Split ABI checks for rocprofiler-sdk-roctx into separate file (#21)
* Split ABI checks for rocprofiler-sdk-roctx into separate file

* Update source/lib/rocprofiler-sdk-roctx/abi.cpp

Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>

* Update source/lib/rocprofiler-sdk-roctx/abi.cpp

Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>

* New line

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>
2025-01-15 19:32:50 -06:00
Baraldi, Giovanni a2fa188e14 Adding source snapshot and partial serialization (#99)
* Adding source snapshot

* Adding option to serialize only on target kernel

* Fix for tidy

* Formatting

* Testing the new flag

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-01-10 15:43:06 -08:00
Baraldi, Giovanni 27266eb242 SWDEV-508485: Adding MFMA F8 metric (#112)
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-01-09 15:27:16 -06:00
Baraldi, Giovanni fddd8ac4aa SWDEV-490031: Adding new rdc ops metrics (#96)
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-01-06 11:02:23 +00:00
Baraldi, Giovanni 0a8c31842d SWDEV-492607: Fix for bvh (#87)
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2024-12-24 15:47:13 +01:00
Nagaraj, Sriraksha 202853d579 fix abort-app CI fail (#39)
* fix abort-app CI fail

* Update source/lib/rocprofiler-sdk-tool/tool.cpp

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2024-12-22 14:17:48 -06:00
Indic, Vladimir 0ce75c1043 ROCProfV3: fatal message if PC sampling unsupported, but requested (#60)
If a user requests PC sampling on a system that does not support this feature,
report a fatal error message and stop executing the program.
2024-12-20 08:04:16 -08:00
Baraldi, Giovanni 200a8624bc SWDEV-495749: Adding SIMD_UTILIZATION metric (#74)
* SWDEV-495749: Adding SIMD_UTILIZATION metric

* Fix mfmautil

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2024-12-20 14:35:18 +00:00