Commit graph

467 commits

Author SHA1 Message Date
Kandula, Venkateshwar reddy 5cb4ad449f SWDEV-515574: Cache Number_Node static value. (#217)
* Cache Number_Node static value. To avoid value overwriting in consecutive dispatch callbacks.

* Format.

* tests for number_node evaluate.

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
2025-02-19 19:10:16 -06:00
Madsen, Jonathan 470f347e50 SDK: remove majority of exceptions (#176)
* SDK: remove majority of exceptions

- replace with ROCP_FATAL, ROCP_CI_LOG(WARNING), etc.
- improve logging of symbolic link
- add --readlink and --realpath (hidden options) to rocprofv3 to follow symlinks for preloaded libraries

* Add rocprofv3 --rocm-root argument

* Fix registration resolved_exists

* Fix rocprofv3_avail.py

* Update logging for rocprofiler_configure search

- relax failure conditions

* Misc clang-tidy fixes

* Fix merge

* Fix merge

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
2025-02-18 10:44:37 -06:00
Welton, Benjamin fd99654433 Fix install for conversion-script (#211) 2025-02-13 19:00:20 -06:00
Elwazir, Ammar 376c2a96ad Lowering log level for COMGR logs (#210)
* Lowering log level for COMGR logs

* Format Fix
2025-02-13 12:59:29 -06:00
Madsen, Jonathan 0d7ca72d84 SWDEV-514449: Fix missing thread pre/post callbacks (#204)
* SWDEV-514449: Fix missing thread pre/post callbacks

- invoke pre/post-callback around internal thread creation

* Update changelog

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-13 12:39:10 -06:00
Bhardwaj, Gopesh 848242eb5c SWDEV-514046 documentation build fix (#208) 2025-02-13 09:25:30 -06:00
Jakaraddi, Manjunath c77596b703 SWDEV-499989: Conversion Script to change counter collection output format from v3 to v1 (#107)
* SWDEV-499989: Add script to convert rocprofv3 counter collection output format to that of v1

* Add logging and argparsing

* Dropping duplicated counters in pmc multiple lines

* Adding test for conversion

* moving conversion script to test files

* copy conversion script from scripts folder
2025-02-12 11:31:17 -08:00
Trowbridge, Ian cc0c401615 Memory Allocation Counter Track Shows Total Allocation (#71)
* Counter track for memory allocation is now a running sum showing total allocation

* Address review comments

* Update source/lib/output/generatePerfetto.cpp

Co-authored-by: Meserve, Mark <Mark.Meserve@amd.com>

* Updated to reflect review comments

* Fix compilation errors on CI

* remove braces on scalar

* Fix struct compilation issues

* Removed name_to_id for sanitizer

---------

Co-authored-by: Meserve, Mark <Mark.Meserve@amd.com>
2025-02-12 12:59:53 -06:00
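The commit above describes the memory-allocation counter track as a running sum of total allocation. A minimal sketch of that idea follows, assuming a hypothetical list of `(timestamp, delta_bytes)` events; this is an illustration only and not the record layout or logic used in `generatePerfetto.cpp`.

```python
from itertools import accumulate


def running_allocation(events):
    """Return (timestamp, total_bytes) pairs from (timestamp, delta_bytes) events.

    Positive deltas model allocations and negative deltas model frees.
    The event layout is hypothetical, not the rocprofiler-sdk record format.
    """
    # Order events by timestamp so the running total follows time.
    ordered = sorted(events, key=lambda event: event[0])
    # accumulate() yields the cumulative sum of the byte deltas.
    totals = accumulate(delta for _, delta in ordered)
    return [(ts, total) for (ts, _), total in zip(ordered, totals)]


samples = running_allocation([(10, 1024), (30, -1024), (20, 512)])
```

Each output point is the total outstanding allocation at that timestamp, which is what a counter track would plot instead of the individual allocation sizes.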
Nagaraj, Sriraksha 5d0b220c37 target cu to string input (#198)
* target cu to string input

* review comments

* review comments
2025-02-12 12:51:39 -06:00
Kandula, Venkateshwar reddy 6427fbafc2 Accum_vgpr support in Rocprofv3 (#70)
* output accumulate vgpr count

* fix logic for computing accum_vgpr

* add accum_vgpr to csv.

* accumulation vgpr's docs and support for rocprofv3

* CHANGELOG.md

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
2025-02-12 10:47:46 -08:00
Bhardwaj, Gopesh 075d36eb82 output format envs doc update (#173) 2025-02-11 21:37:12 -06:00
Madsen, Jonathan 59b41ab5aa rocprofv3: Update rocprofv3 command line for ATT (#201)
* rocprofv3: suppress agent info when no data collected

* Update output config serialization

- full serialization of output configuration

* Update rocprofiler-sdk-att/tests

- add version and soversion
- change output directory
- generate libatt_decoder_summary
- disable tests instead of removing them

* Update rocprofv3 command-line

- make --att-library-path hidden by default
- simplify check_att_capability
- reorder pc sampling options
- add hidden --echo option
- remove ROCPROF_LIST_AVAIL_TOOL_LIBRARY from preload

* Add new rocprofv3 tests for specifying the ATT library path

* Tweak to rocprofv3-test-hsa-multiqueue-att tests

* Update rocprofv3 tool to enable output with att

* Fix standalone test installation

* Revert to fetchcontent_makeavailable to fetchcontent_populate

* Revert tests/common/CMakeLists.txt

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 18:10:48 -06:00
Madsen, Jonathan 6246ec4040 SDK: Agent UUIDs, agent runtime visibility, kernel symbol address (#154)
* [DO NOT MERGE] Misc UUID updates

- this is WIP

* Agent visibility

- Support for ROCR_VISIBLE_DEVICES, HIP_VISIBLE_DEVICES, CUDA_VISIBLE_DEVICES, GPU_DEVICE_ORDINAL

* Update CHANGELOG

* tweak to rocprofiler_agent_runtime_visiblity_t

* Code object kernel address

- new fields in code_object_kernel_symbol_register_data_t
  - kernel_code_entry_byte_offset
  - kernel_address

* Support ROCR_VISIBLE_DEVICES reordering devices for HIP

* Addressed code review changes

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 14:36:23 -06:00
Madsen, Jonathan 3071199386 rocprofv3: do not abort if counter does not have dimensions (#150)
* rocprofv3: do not abort if counter does not have dimensions

* Relax error handling further in rocprofv3 metadata

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 14:31:25 -06:00
Kandula, Venkateshwar reddy 143f84fe6b [BUG FIX] store dimensions in counter id when used reduce operator (#181)
* save other dimension in counter id.

* Formatting

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
2025-02-11 13:05:57 -06:00
Madsen, Jonathan 070b659a9a Re-enable clang-tidy for core workflows + clang-tidy fixes (#197)
* Ensure the clang-tidy is updated + clang-tidy fixes

* update-ci workflow

* Enable clang-tidy checks

* Add extra logging to device counter collection samples

* Misc clang-tidy fixes

* Disable device counter collection samples for ThreadSanitizer

* Formatting

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 10:58:47 -06:00
Bhardwaj, Gopesh cdf22eba7d Adding pc sampling how to guide (#160)
* Adding pc sampling how to guide

* doc update

* Fixing indentation

* updating index

* updating doc

* updating doc

* Added field information

* Fixing Formatting

* fix formatting error

* Added json format for pc sampling

* feedback resolved

* formatting for text

* PC Sampling API doc

* Reformatted

* Note for shared systems

* update docs

* correcting relative path for cross-referencing

---------

Co-authored-by: vlaindic_amdeng <vladimir.indic@amd.com>
2025-02-10 20:33:05 -06:00
Welton, Benjamin 080b2ba451 [SWDEV-513658] Force HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG value to be used with HSA calls (#192)
* Force HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG  value to be used with HSA calls

Fix for CI

* More tweaks

* Increase reproducible-runtime kernel sleep granularity

* Fix data race in synchronous device counter collection sample

* Update device counting service

- add get_active_context function

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-10 11:34:26 -06:00
Indic, Vladimir e67a4451d8 Show host-trap configurations only (#194) 2025-02-10 11:32:53 -06:00
Elwazir, Ammar 5410fabd3d Fixing Clang tidy errors (#195)
* Fixing Clang tidy errors

* format-fix

* Update code_object.hpp

* Clang Tidy Fixes on the whole Source folder

* Update source/CMakeLists.txt

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Addressing reviews

* Correcting the logic for parsing att counters

* Format Fix

* Update source/lib/rocprofiler-sdk-att/tests/dummy_decoder.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Update source/lib/rocprofiler-sdk-att/tests/standalone_tool_main.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Update source/lib/rocprofiler-sdk-tool/config.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Formatting

* Deactivate clang-tidy in source/lib/rocprofiler-sdk-att/tests

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-07 22:33:32 -06:00
Bhardwaj, Gopesh 7821657d65 SWDEV-510794 Adding MPI usage with rocprofv3 (#183)
* swdev-510794 Adding MPI usage with rocprofv3

* update doc

* Fixed build issues

* updating doc

* doc update

* Fixed Typos

* csv format

* change format to shell
2025-02-07 12:01:31 +05:30
Madsen, Jonathan e743bf5a93 Undefined behavior warnings caught by ROCPROFILER_DEFAULT_FAIL_REGEX (#23)
* Add regex for undefined behavior to ROCPROFILER_DEFAULT_FAIL_REGEX

- add UBSAN_OPTIONS to setup-sanitizer-env.sh

* Improve ROCPROFILER_DEFAULT_FAIL_REGEX

* Use -fno-sanitize-recover=undefined flag

- this compiler flag causes all undefined behavior errors to exit

* Revert ROCPROFILER_DEFAULT_FAIL_REGEX

* fix for shift overflow

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Manjunath-Jakaraddi <manjunath.jakaraddi@amd.com>
2025-02-06 08:55:57 -06:00
Madsen, Jonathan 0fbe6cc7b6 SDK: No bg thread if no clients use SDK (#123)
* SDK: No bg thread if no clients use SDK

* Update CHANGELOG

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-06 08:34:56 -06:00
U, Srihari 90ae424c57 Initialize extremes to max and min values (#184)
* Initialize extremes to max and min values

* Address review comment

* Adding clang format
2025-02-06 08:32:37 -06:00
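Initializing extremes to the maximum and minimum values, as the commit title above puts it, is a common pattern for tracking a running minimum and maximum: the minimum starts at the largest possible value and the maximum at the smallest, so the first sample always replaces both. A minimal sketch, assuming floating-point samples; the function name is hypothetical and this is not the code changed in the commit.

```python
import math


def extremes(values):
    """Return (lowest, highest) over values using sentinel initial extremes."""
    lowest = math.inf    # any real sample is <= +inf, so it replaces this
    highest = -math.inf  # any real sample is >= -inf, so it replaces this
    for value in values:
        lowest = min(lowest, value)
        highest = max(highest, value)
    return lowest, highest
```

With an empty input the sentinels are returned unchanged, which makes the "no samples" case easy to detect and avoids reporting a spurious zero as an extreme.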
Nagaraj, Sriraksha 03e5a1d9cc remove duplication (#190) 2025-02-06 08:31:53 -06:00
Elwazir, Ammar 02a519e84e 6.4 fixes for HSA and HIP (#191)
* Adding support for hsa_amd_signal_wait_all

* Fixes for HIP

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
2025-02-06 07:56:08 -06:00
Bhardwaj, Gopesh 12508b9521 Adding ROCTx usage doc (#159)
* Adding Roctx usage doc

* updated CHANGELOG

* doc update

* Fixing Related Pages issue

* updating doc

* updating docs

* Adding Resource naming section

* Fixed Formatting

* format fix

* format fix

* Fixing build due to incorrect indentation
2025-02-05 11:04:24 -06:00
Jakaraddi, Manjunath 9c89b475b0 SWDEV-506317: Kernel trace failing due to Code object errors (#170)
SWDEV-506317: Kernel trace failing
2025-02-04 18:01:42 -06:00
Elwazir, Ammar dd5c0ea257 Support new HIP APIs (#179)
* Adding New HIP APIs

* Format Fix

* Format Fix

* Removing changes from ostream and moving it to format

* Addressing Code Review Comments

* Versioning the new hip calls formatting

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
2025-02-04 15:50:18 -06:00
Welton, Benjamin 0c4a56c6bb [SWDEV-509876] Remove buffer requirement from device counting service (#132)
* [SWDEV-509876] Remove buffer requirement from device counting service

No longer require a buffer to be given when setting up device counting
service. This is to reduce performance overhead in cases where immediate
return of counting samples is being used (synchronous mode).

* Missed file

* Update source/include/rocprofiler-sdk/device_counting_service.h

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Update source/lib/rocprofiler-sdk/counters/controller.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Update source/lib/rocprofiler-sdk/counters/device_counting.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Fixes for build

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-02-04 06:06:03 -06:00
Nagaraj, Sriraksha d4a51e4102 Adding att v3 support (#84)
* Adding att v3 support

* misc fix

* bug fix

* Python linting workflow and rules

* fix regex

* Adding temporary args

* fix temporary args

* fix format

* remove att_perfcounters from test input

* Review comments (#163)

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

* Revert "Review comments (#163)"

This reverts commit 9ef0f8e5a4489d5581255e1b70ced2aef5c1c1d0.

* Address review comments 2

* review changes

* review comments

* review

* cmake alias

* review

* review

* review

* review

* Enabling percounter in v3 script

* review

* formatting

* formatting

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com>
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-02-04 04:05:38 -06:00
Madsen, Jonathan 72a27feb04 Fix HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG for ROCm < 6.3 (#178)
Fix HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG for ROCm < 6.4

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-04 04:05:19 -06:00
Madsen, Jonathan 7fcd80f744 Fix async memory copy validation tests (#182)
* Fix async copy validation test

- make the async copy tracing test work regardless of however many HSA memory copies the HIP memory copy decomposes into

* Fix rocprofv3 memory copy tests

* Fix compilation support for hipGraphBatchMemOpNodeGetParams

* Fix rocprofv3-test-summary-*-validate

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-04 02:41:19 -06:00
Madsen, Jonathan f3752faa0a Update HIP string formatting for ROCm 6.4.0 (#144)
Fix HIP data type stringify

- when ROCPROFILER_CI is not defined, provide default for case statements
- Add support for hipGraphNodeTypeBatchMemOp when HIP version is >= 6.4.0

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-30 20:51:52 -06:00
Kandula, Venkateshwar reddy 121901c321 add gfx12 for counter collection tests (#108)
* add gfx12 for counter def.

* Update continuous_integration.yml

* Update counter_defs.yaml

* commenting logging.

* Update ioctl.cpp

* add gfx12 to tests

* Update ioctl.cpp

* Add description to GFX12 GL2C_EA_RDREQ counter

* Updates from editor

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Kuricheti, Mythreya <Mythreya.Kuricheti@amd.com>
2025-01-30 15:16:48 -06:00
Madsen, Jonathan 1f49d6c57b Partial fix of legacy rocprofiler project name (#110)
* Partial fix of legacy rocprofiler project name

* Formatting fix

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-30 13:43:35 -06:00
Kuricheti, Mythreya d43070bf08 Fix navi48 counter event IDs (#158)
* Initial fix for navi48 counters

* Add GL2C navi4x gfx12 counters
2025-01-30 13:40:25 -06:00
Baraldi, Giovanni 39db6d842f Fix for ATT context stop while packets are being processed (#171)
Fix for context stop while packets are being processed

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-01-29 11:06:32 -08:00
Welton, Benjamin 0d701cdaac [SWDEV-482060] Set execute permission for HSA allocated memory (#151)
We need execute permission for HSA memory (req for IB buffers).
Enforcement is upcoming which will break counter collection (see
ticket).

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
2025-01-28 16:46:22 -08:00
Indic, Vladimir e4d736839d Temporarily allow only host-trap sampling (#156) 2025-01-27 13:26:11 -06:00
Elwazir, Ammar 19a912d476 Fixing collection period rocprofv3 help message (#148)
Update rocprofv3.py
2025-01-24 08:39:40 -06:00
Rawat, Swati 97b7a6315d update copyright date to 2025 (#102)
* Update LICENSE

* Update conf.py

* Update copyright year

* [fix] Update copyright year

* Update copyright year "ROCm Developer Tools"

* Add license headers to c++ files

* Add license to *.py

* Update licenses in rocdecode sources

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Mythreya <mythreya.kuricheti@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-22 19:11:20 -06:00
Meadows, Lawrence 93f19cf5ca Add ELFABIVERSION_AMDGPU_HSA_V6 emitted by recent llvm compilers (#141)
Co-authored-by: Larry Meadows <lmeadows@amd.com>
2025-01-22 13:42:48 -08:00
Bhardwaj, Gopesh 73e7f8cfb1 ROCTx Documentation (#29)
* Add roctx doc

* Add roctx doxyfile input

* Update links and toc

* Build doxysphinx for both doxygen files

* Update scripts

* Generate roctx doxygen files

* Change doxygen path

to allow for 2 doxyfiles

* Make doxygen dir for script

* Call make _doxygen dir with p flag

* Create _doxygen dir in workflow

* Create doc dirs for doxygen

* Run update docs as sudo

* Fix typo in mkdir command

* Include graphviz for dot

* Install dot for docs CI

* Install dot as sudo due to permission denied

* Install doxygen via sudo

* Install doxysphinx

* Add postcheckout step to RTD to config and gen doxygen docs

* On RTD, update doxygen after creating env

* update docs.yml

* update docs.yml

* fixing build-docs-from-source

* Fixing build docs from source

* update docs.yml

* trying to fix readthedocs

* trying to fix readthedocs

* update docs.yml

* improve mainpage documentation

* update docs

* clang-format fix

---------

Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2025-01-22 11:34:21 -06:00
Trowbridge, Ian 73e72bb088 Documentation Update to reflect that memory allocation trace records null pointers for free operations (#127)
Update documentation to reflect that null pointers can be recorded in free memory operations
2025-01-22 11:20:50 -06:00
Cheruvally, Aravindan 6bb60bf930 Enhance CMAKE install instructions with std install location/destination (#85)
* Enhancement - usage of package name flags commonly across for getting unique folder name

* Enhancements - updating libexec/pkg usage, avoid sbin

* CMAKE Format Update

* Python Format Update

* Revert "Enhancement - usage of package name flags commonly across for getting unique folder name"

This reverts commit 2dcd1ac5f22ab90112d90648e4b5dab5c54bc639.

* Review Comments - Revert PACKAGE_NAME usage

* Review Comments - Update source folders accordingly to new cmake install locations
2025-01-22 11:19:47 -06:00
Madsen, Jonathan 89cfb5317d Update docs jinja requirements (#118)
- Jinja < 3.1.5 has a sandbox breakout through malicious filenames
- Jinja < 3.1.5 has a sandbox breakout through indirect reference to format method

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-22 11:18:19 -06:00
Trowbridge, Ian 042c761b64 Changed memory_allocation.csv name to memory_allocation_trace.csv (#111) 2025-01-22 11:14:42 -06:00
Welton, Benjamin 3076660e60 Add gfx941/gfx942 to ValuPipeIssueUtil (#139) 2025-01-21 14:39:03 -08:00
Welton, Benjamin c6b52701c7 Add gfx940/gfx9 to ValuPipeIssueUtil (#138)
Was dropped, likely by mistake, in the transition to yaml
2025-01-21 12:12:39 -08:00