655 Commits

Author SHA1 Message Date
Bhardwaj, Gopesh c66e315ad0 fix building OpenMP test target (#292)
* fix building openmp test target

* cmake format correction

* cmake format correction

[ROCm/rocprofiler-sdk commit: e0859f7d33]
2025-03-18 14:16:18 +05:30
Bhardwaj, Gopesh 9764f96427 removing gfx940 and gfx941 targets (#286)
* removing gfx940 and gfx941 targets

* updated changelog

[ROCm/rocprofiler-sdk commit: f5c9663c51]
2025-03-17 15:21:12 -05:00
Vaddireddy, Sushma aef4f2f4c5 MI355X Support - PC Sampling and updating counter_defs.yaml (#206)
* Update mi350/gfx950 counter_defs.yaml (#131)

* Update gfx950 counter_defs.yaml

* Update F8 MFMA for gfx950

* Update counter_defs.yaml

* Update counter_defs.yaml

* add simd_util counter

* add new rdc ops gfx950

* Update counter_defs.yaml

* New mi350 CPC counters

* Update counter_defs.yaml

* New mi350 spi counters

* Update new mi350 sq counter_defs.yaml

* Update TA counter_defs.yaml

* Update TD GFX950counter_defs.yaml

* Update TCP gfx950 counter_defs.yaml

* Update new gfx950 tcc counter_defs.yaml

* Update TCP_PENDING_STALL_CYCLES counter_defs.yaml

* MI355X Host-Trap PC sampling Support (#130)

* Adding gfx12 to CU_NUM

* Add ELFABIVERSION_AMDGPU_HSA_V6

* add gfx950 to TEST_YAML_LOAD metric

* add gfx950 to append counters tests

* Updated CHANGELOG.md

---------

Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

[ROCm/rocprofiler-sdk commit: 09c7d44cc4]
2025-03-17 15:20:40 -05:00
Baraldi, Giovanni ac6e512e25 SWDEV-516846: Fix serialization services conflicts and ATT counter streaming (#230)
* Update TT API

* Rework serialization

* update att_core

* Fix tests

* Fix tool

* Formatting

* Fix perfcounter

* Formatting

* Rename agent TT

* Format

* Workaround for codeQL alert

* Tidy fix

* Fix compiler error

* Tidy

* Fix some tests

* Fixing some tests

* formatting

* Fixing ATT serialization

* Format

* Fix test commandline

* Fixing init order

* Format

* Tidy fixes

* Removing unused sample

* Fix tests and schema

* Added ATT + PMC test

* Fix mode

* Fix file mode

* Review comments

* Fix typo

* Review comments

* Review comments

* Fix missing id inc after review comment

* Review comments

* Suggested Fixes

* Testing changes

* Test fix

* Build fixes

* Minor build fix

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>

[ROCm/rocprofiler-sdk commit: 821918a512]
2025-03-14 18:11:10 -07:00
Mallya, Ameya Keshava f0834cbd12 Added release trigger for further releases
[ROCm/rocprofiler-sdk commit: 914923f688]
2025-03-14 13:48:05 -07:00
Kuricheti, Mythreya d1aeee3599 Add an option to disable perfetto debug annotations in json tool (#258)
* Add opt-in to disable perfetto annotations

Add an env option `ROCPROFILER_DISABLE_PERFETTO_ANNOTATIONS`
to disable perfetto function-arg annotations.

If there are a large number of records, the tests that use this tool time out
on some machines.

* Update iteration kind

* Remove test_retired_correlation_ids for page-migration
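
A minimal sketch of how a test harness might opt in to the option described above when launching a child process. The variable name `ROCPROFILER_DISABLE_PERFETTO_ANNOTATIONS` comes from the commit message; the value `"1"` and the helper name `env_without_annotations` are assumptions for illustration, since the commit does not state which values the tool accepts.

```python
import os
from typing import Mapping, Optional

# Name taken from the commit message above.
DISABLE_VAR = "ROCPROFILER_DISABLE_PERFETTO_ANNOTATIONS"


def env_without_annotations(base: Optional[Mapping[str, str]] = None) -> dict:
    """Return a copy of the environment with the opt-in variable set.

    The value "1" is an assumption; check the tool's documentation for the
    values it actually recognizes.
    """
    env = dict(os.environ if base is None else base)
    env[DISABLE_VAR] = "1"
    return env


# The returned dict can be passed as `env=` to subprocess.run(...).
child_env = env_without_annotations({"PATH": "/usr/bin"})
```

The helper copies the environment instead of mutating `os.environ`, so only the launched process sees the setting.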

[ROCm/rocprofiler-sdk commit: bbacf70ec7]
2025-03-14 13:06:18 -07:00
Trowbridge, Ian 7aeaffd871 HIP Streams to Queues Translation (#235)
* rocprofiler_stream_id_t: opaque handle for a stream

- e.g. HIP stream
- the same HIP stream may map to different HSA queues at different points in the application
- added to:
  - rocprofiler_buffer_tracing_hip_api_record_t
  - rocprofiler_buffer_tracing_memory_copy_record_t
  - rocprofiler_callback_tracing_hip_api_data_t
  - rocprofiler_callback_tracing_memory_copy_data_t
---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Mark Meserve <mark.meserve@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Jakaraddi, Manjunath <Manjunath.Jakaraddi@amd.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
Co-authored-by: Nagaraj, Sriraksha <Sriraksha.Nagaraj@amd.com>
Co-authored-by: U, Srihari <Srihari.U@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>

[ROCm/rocprofiler-sdk commit: ccd1e54293]
2025-03-14 02:45:13 -07:00
Welton, Benjamin c08db2daa1 [SWDEV-512693] Iteration based counter multiplexing (#272)
Adds iteration based multiplexing to counter collection. Counter groups can now be specified. These counter groups are collected on a device individually until a specified interval period is reached. When the interval is reached, the next counter group is set to be collected on subsequent kernel executions.

Supplies two new argument types that can be included in YAML/JSON inputs:

pmc_groups: an array of arrays containing the counter groups to run (e.g. [ ["SQ_WAVES", "GRBM_COUNT"], ["GRBM_GUI_ACTIVE"] ])
pmc_group_interval: the number of kernel invocations on a GPU of a group before rotating to the next group

Note: the linked ticket originally proposed a random_seed_generator. It was not implemented, since there are very few cases where you would want the groups to be selected at random (and if you do, you can randomly generate the pattern and supply it as a large list of groups in pmc_groups).

All existing counter functionality should be preserved (selection of counters on specific devices only, profiling of only specific kernels, etc).
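
The rotation described above can be sketched as a small simulation. This is not the SDK's implementation: the function name `group_for_dispatch` is hypothetical, and wrapping back to the first group after the last one is an assumption the commit message does not confirm.

```python
from typing import List


def group_for_dispatch(dispatch_index: int,
                       pmc_groups: List[List[str]],
                       pmc_group_interval: int) -> List[str]:
    """Return the counter group active for a 0-based kernel dispatch on one GPU.

    Each group stays active for `pmc_group_interval` dispatches, then the
    next group takes over. Wrap-around after the last group is assumed.
    """
    if not pmc_groups:
        raise ValueError("pmc_groups must contain at least one group")
    if pmc_group_interval <= 0:
        raise ValueError("pmc_group_interval must be positive")
    slot = (dispatch_index // pmc_group_interval) % len(pmc_groups)
    return pmc_groups[slot]


# Same groups as the example above, rotating every 2 kernel invocations.
pmc_groups = [["SQ_WAVES", "GRBM_COUNT"], ["GRBM_GUI_ACTIVE"]]
schedule = [group_for_dispatch(i, pmc_groups, 2) for i in range(6)]
```

With an interval of 2, dispatches 0 and 1 collect the first group, dispatches 2 and 3 collect the second, and (under the wrap-around assumption) dispatches 4 and 5 return to the first.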

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>

[ROCm/rocprofiler-sdk commit: aa88dd44c7]
2025-03-14 02:05:36 -07:00
Welton, Benjamin 509298ba75 [SWDEV-518071] Return HSA not loaded status (device counter collection) (#242)
* [SWDEV-518071] Return HSA not loaded status (device counter collection)

This is a state that a caller would want to know about to understand if
they got no counters because of a failure or if they were trying to
collect counters too early (as is the case in the sample, which can
attempt to collect counters before HSA is inited).

* Minor edit

* format

* [SWDEV-518081] Simplify Metric Loading (#243)

* [SWDEV-518071] Return HSA not loaded status (device counter collection)

This is a state that a caller would want to know about to understand if
they got no counters because of a failure or if they were trying to
collect counters too early (as is the case in the sample, which can
attempt to collect counters before HSA is inited).
* [SWDEV-518324] Add AST update support

Allows ASTs to be updated (instead of being an unchangeable
static value). Adds a shared pointer return type to protect against
static destructors/modifications from invalidating potentially in-use
AST definitions. No functionality/use changes in this PR.
* [SWDEV-518593] Add updatable dimension cache + fix string issues (#252)

* [SWDEV-518593] Add updatable dimension cache + fix string issues

Updates dimension cache to use the same design pattern as AST/Metrics.

Fixes the string scoping issue seen in ASTs, which appears here as well.

* Add rocprofiler_create_counter

Creates derived counters based on input from the API. This PR does three
things:

1. Adds the API + test case
2. Validates that an AST can be constructed from the counter supplied.
3. Updates metrics, ast, and dimension caches to include the new metric.

Metric should be available for use immediately after the call completes.

Due to the regeneration of ASTs, this call should not be performed in
performance sensitive code.

* Suggestion fixes

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>

* Minor tweak

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>

* Fixes for comments

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>

[ROCm/rocprofiler-sdk commit: 007285272b]
2025-03-14 01:07:16 -07:00
Nagaraj, Sriraksha 864a9c328d Adding agent-index (#189)
* Adding agent-index

* review changes

* review comments addressed

* minor fix

* fix CI failure

* review comments

* Fix agent index test and address review comments

* Build Fixes

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>

[ROCm/rocprofiler-sdk commit: c30bb7cbda]
2025-03-14 00:51:32 -07:00
Madsen, Jonathan 17272d5df1 Re-enable OpenMP target and testing (#126)
* Re-enable OpenMP target and testing

* Enable openmp target tests on mi200 jobs

* Fix direct self-inclusion of header file

* Enable openmp-target testing on vega20

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>

[ROCm/rocprofiler-sdk commit: 2fe63d873e]
2025-03-13 22:29:07 -07:00
Trowbridge, Ian be053cb10b Temporarily Fix Incorrect Kernel Perfetto Trace Duration due to Firmware Timestamp Bug (#134)
* Perfetto duration temp fix setup

* Add timestamp change amounts to ROCP Info

* Groups kernel dispatch info by agent and queue id before sorting. Midpoint interpolation is then performed on the sorted kernels

* Moved dispatch bins into the for-loop

* Fix compilation error by using const ref

* Modified for review comments

* Changed variable names
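
A rough sketch of the grouping step mentioned above, covering only the binning and sorting. The record fields (`agent_id`, `queue_id`, `start`) and the choice of start timestamp as the sort key are assumptions; the midpoint interpolation itself is not shown because the commit message does not describe it.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def bin_dispatches(dispatches: List[dict]) -> Dict[Tuple[int, int], List[dict]]:
    """Group dispatch records by (agent_id, queue_id), then sort each bin.

    Field names and the sort key (start timestamp) are hypothetical.
    """
    bins: Dict[Tuple[int, int], List[dict]] = defaultdict(list)
    for record in dispatches:
        bins[(record["agent_id"], record["queue_id"])].append(record)
    for records in bins.values():
        records.sort(key=lambda r: r["start"])
    return dict(bins)


sample = [
    {"agent_id": 0, "queue_id": 1, "start": 30},
    {"agent_id": 0, "queue_id": 1, "start": 10},
    {"agent_id": 1, "queue_id": 0, "start": 20},
]
binned = bin_dispatches(sample)
```

Binning first keeps any later timestamp adjustment local to kernels that share an agent and queue.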

[ROCm/rocprofiler-sdk commit: 6518c5463d]
2025-03-13 20:40:03 -07:00
Verma, Saurabh e75ab64492 Fixes for runtime errors reported in id_decode.hpp:set_dim_in_rec() by Mi300 UndefinedBehaviorSanitizer job (#114)
* Initial fix for runtime error in id_decode.hpp:set_dim_in_rec()

* actual fix: corrected the handling of case where dim==1 (ROCPROFILER_DIMENSION_NONE)

* removing magic numbers

* minor fix

* fix for invalid bool value at runtime

* clang format

* build fix

---------

Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>

[ROCm/rocprofiler-sdk commit: cffda33d3c]
2025-03-13 20:17:32 -07:00
Baraldi, Giovanni 2e3191bd73 Update codeobj disassembly to use comgr va2fo API (#250)
* Update codeobj disassembly to use comgr va2fo API

* Format

* Tidy fix

* Tidy fix

* Review comments

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

[ROCm/rocprofiler-sdk commit: 970bebafeb]
2025-03-13 12:35:25 -07:00
Baraldi, Giovanni 985d0eda01 SWDEV-518826: Adding nullptr check after gpu name query (#257)
* Fix segfault on fail to query GPU name

* Format

* Review comments

* Format

* Review comment

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

[ROCm/rocprofiler-sdk commit: 346c7149dd]
2025-03-13 16:25:16 +00:00
Kandula, Venkateshwar reddy 2a33544c0e SWDEV-518356: added check to avoid out of range hip host to device. (#267)
added check to avoid out of range.

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>

[ROCm/rocprofiler-sdk commit: 8735ae4eb0]
2025-03-11 15:37:59 -05:00
Welton, Benjamin f621d8a32a Add debug printing statement to packet submission (#212)
* Add debug printing statement to packet submission

Adds debug printing to packets being submitted to HSA Queue in device
counting mode.

* Minor change

* Small fix

* formatting

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>

[ROCm/rocprofiler-sdk commit: f7e94c1ee8]
2025-03-10 14:02:30 -07:00
Kandula, Venkateshwar reddy 6db4554b89 rocprofv3-test-trace-hip-in-libraries-validate failed in PSDB (#248)
* capture streams by reference

* Fix sync_stream in tests/bin/vector-operations

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: cc18b95c7f]
2025-03-07 14:43:29 -06:00
Rawat, Swati 27d0bc087c Update CHANGELOG.md: editorial review (#254)
Update CHANGELOG.md

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: d74ea0876f]
2025-03-06 12:32:02 +05:30
Madsen, Jonathan 15c8c05f0c [rocprofv3] Fix calculation of services which collected data (#265)
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: e7c64e12f9]
2025-03-05 10:59:06 -06:00
Bhardwaj, Gopesh af84efb389 SWDEV-518428 Fixing experimental filesystem compilation issue (#262)
* SWDEV-518428 Fixing experimental filesystem compilation issue

* addressing feedback

[ROCm/rocprofiler-sdk commit: 73aa1bdeab]
2025-03-04 08:48:23 +05:30
Rawat, Swati f7d1f14c60 Documentation updates (#236)
* Documentation updates

* formatting

* Update using-rocprofv3.rst

* Update counter_collection_services.md

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>

[ROCm/rocprofiler-sdk commit: 31b8f61c8e]
2025-02-28 10:10:26 +05:30
Kuricheti, Mythreya c4ccaa4f5c [fix] typo in name rocprofiler_rocjpef_api_id_t -> rocprofiler_rocjpeg_api_id_t (#237)
[fix] rocjpeg fix

[ROCm/rocprofiler-sdk commit: b9eaf88fa3]
2025-02-25 11:23:31 -06:00
Indic, Vladimir 08e81f5972 PCS vs CC test: initializing buffer_id to zero (#229)
[ROCm/rocprofiler-sdk commit: f7b5759a20]
2025-02-24 04:41:07 -08:00
Trowbridge, Ian 74efdad1aa rocJPEG API Tracing (#73)
* rocDecode API Tracing support

* Test bin file added to rocdecode. Need to add validate python methods

* Added option to not make rocDecode tests

* Added rocdecode and rocprofv3 tests

* Added csv test

* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependency. Changed cmake option to disable tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI

* Add option to avoid building rocdecode tests

* Added option to avoid building rocdecode bin file

* Support for rocJPEG API Trace

* Added newline to rocjpeg_version.h

* json-tool code added, initial test/bin commit

* Formatting

* Resolved rocjpeg bin test compilation errors

* Tests implemented. Perfetto module currently resulting in errors, so need to retest whenever it is fixed

* Formatting and compilation errors

* Minor fixes

* Copyright year update and minor fixes

* Doc update fix

* Added rocjpeg csv file in data

* Addresses review comments: Updated fixed Findroc.. and uses root directory as a hint, fixed documentation error, changed tables to use _CORE, minor style fixes

* Added rocdecode and rocjpeg to CI

* Removed rocdecode and rocjpeg from CI and added back build tests option

* Updated Cmake Files

* Added rocDecode and rocJPEG to CI

* Remove cmake line added in error

* Temporarily modified tests to pass if rocdecode or rocjpeg tracing are not supported for CI, cmake changes

* Added find_package for test

* Added back use of system rocDecode and rocJPEG, modifies system files to include prefix path

* Updated no-link to include INCLUDE_DIR/roc(decode|jpeg), added comments for tests

* Resolve merge conflicts and formatting

* Added regex find and replace instead of include for CI

* VAAPI package causing errors on Vega20

* Removed system rocjpeg and rocdecode use temporarily until cmake issues resolved

* Removed workflows regex

* Formatting and minor test modification

* Modified test for vega20

* Update rocDecode and rocJPEG cmake and tests

* Changelog

* Fix merge conflict

* Added back if-statements around add-tests since cmake-generator-expressions are resulting in errors when the packages are missing

* Removed if found statements, replaced with TARGET:EXISTS

* Skip json file for rocjpeg and rocdecode tests if not supported

* Add os import

---------

Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 31fe8858d1]
2025-02-21 13:43:49 -08:00
Elwazir, Ammar afa1951db3 CI Update: Removing OLD ROCProfiler-SDK files (#232)
Removing OLD ROCProfiler-SDK files

[ROCm/rocprofiler-sdk commit: 95e0341266]
2025-02-21 11:59:25 -06:00
Kandula, Venkateshwar reddy 2181882d24 SWDEV-515574: Cache Number_Node static value. (#217)
* Cache Number_Node static value. To avoid value overwriting in consecutive dispatch callbacks.

* Format.

* tests for number_node evaluate.

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>

[ROCm/rocprofiler-sdk commit: 5cb4ad449f]
2025-02-19 19:10:16 -06:00
Elwazir, Ammar db152085be Temp: Disable CI for Mi300x (#221)
Update continuous_integration.yml

[ROCm/rocprofiler-sdk commit: b4b81f9095]
2025-02-18 11:41:49 -06:00
Madsen, Jonathan e503b1f4cc SDK: remove majority of exceptions (#176)
* SDK: remove majority of exceptions

- replace with ROCP_FATAL, ROCP_CI_LOG(WARNING), etc.
- improve logging of symbolic link
- add --readlink and --realpath (hidden options) to rocprofv3 to follow symlinks for preloaded libraries

* Add rocprofv3 --rocm-root argument

* Fix registration resolved_exists

* Fix rocprofv3_avail.py

* Update logging for rocprofiler_configure search

- relax failure conditions

* Misc clang-tidy fixes

* Fix merge

* Fix merge

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: 470f347e50]
2025-02-18 10:44:37 -06:00
Welton, Benjamin 95ac740f25 Fix install for conversion-script (#211)
[ROCm/rocprofiler-sdk commit: fd99654433]
2025-02-13 19:00:20 -06:00
Elwazir, Ammar a28b3b6d19 Lowering log level for COMGR logs (#210)
* Lowering log level for COMGR logs

* Format Fix

[ROCm/rocprofiler-sdk commit: 376c2a96ad]
2025-02-13 12:59:29 -06:00
Bhardwaj, Gopesh e3b2d94005 Adding CodeQL Analysis Workflow (#172)
* adding codeql.yml

* update codeql

* update codeql

* excluding external repos

* filter external

* filter external  and build

* Apply suggestions from code review

* Removed experimental test line

* Adding config

* moving codeql config out of workflows

* Disable Cdash

* update codeql

* replacing run-ci with simple cmake build

* cmake fix

* removing codeql_config

* Adding rule for python and actions

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

[ROCm/rocprofiler-sdk commit: 215875de32]
2025-02-13 12:55:44 -06:00
Madsen, Jonathan fba0d6ba76 SWDEV-514449: Fix missing thread pre/post callbacks (#204)
* SWDEV-514449: Fix missing thread pre/post callbacks

- invoke pre/post-callback around internal thread creation

* Update changelog

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 0d7ca72d84]
2025-02-13 12:39:10 -06:00
Bhardwaj, Gopesh b8033b785f SWDEV-514046 documentation build fix (#208)
[ROCm/rocprofiler-sdk commit: 848242eb5c]
2025-02-13 09:25:30 -06:00
Welton, Benjamin 4d43482e3b Update VERSION (#207)
* Update VERSION

Update version to 0.7.0

* Fixing test install build step issue

* Updates from editor

---------

Co-authored-by: Ammar ELWazir <Ammar.ELWazir@amd.com>

[ROCm/rocprofiler-sdk commit: 27c4277222]
2025-02-13 08:51:10 -06:00
Jakaraddi, Manjunath 0608bbb4db SWDEV-499989: Conversion Script to change counter collection output format from v3 to v1 (#107)
* SWDEV-499989: Add script to convert rocprofv3 counter collection output format to that of v1

* Add logging and argparsing

* Dropping duplicated counters in pmc multiple lines

* Adding test for conversion

* moving conversion script to test files

* copy conversion script from scripts folder

[ROCm/rocprofiler-sdk commit: c77596b703]
2025-02-12 11:31:17 -08:00
Trowbridge, Ian 3a26de9e53 Memory Allocation Counter Track Shows Total Allocation (#71)
* Counter track for memory allocation is now a running sum showing total allocation

* Address review comments

* Update source/lib/output/generatePerfetto.cpp

Co-authored-by: Meserve, Mark <Mark.Meserve@amd.com>

* Updated to reflect review comments

* Fix compilation errors on CI

* remove braces on scalar

* Fix struct compilation issues

* Removed name_to_id for sanitizer
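
The running-sum counter track from the first bullet above can be illustrated as follows. This is a sketch only: the event shape `(timestamp, delta_bytes)` and the function name `running_allocation` are assumptions, not the code in `generatePerfetto.cpp`.

```python
from itertools import accumulate
from typing import Iterable, List, Tuple


def running_allocation(events: Iterable[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Turn (timestamp, delta_bytes) events into (timestamp, total_bytes) points.

    Positive deltas are allocations and negative deltas are frees, so each
    output point carries the total allocated at that time.
    """
    ordered = sorted(events, key=lambda e: e[0])
    totals = accumulate(delta for _, delta in ordered)
    return [(ts, total) for (ts, _), total in zip(ordered, totals)]


points = running_allocation([(3, -100), (1, 100), (2, 50)])
```

Sorting by timestamp before accumulating matters, since a running sum over out-of-order events would report totals that never existed.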

---------

Co-authored-by: Meserve, Mark <Mark.Meserve@amd.com>

[ROCm/rocprofiler-sdk commit: cc0c401615]
2025-02-12 12:59:53 -06:00
Nagaraj, Sriraksha 547ce227c3 target cu to string input (#198)
* target cu to string input

* review comments

* review comments

[ROCm/rocprofiler-sdk commit: 5d0b220c37]
2025-02-12 12:51:39 -06:00
Kandula, Venkateshwar reddy 7fde16067f Accum_vgpr support in Rocprofv3 (#70)
* output accumulate vgpr count

* fix logic for computing accum_vgpr

* add accum_vgpr to csv.

* accumulation vgpr's docs and support for rocprofv3

* CHANGELOG.md

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>

[ROCm/rocprofiler-sdk commit: 6427fbafc2]
2025-02-12 10:47:46 -08:00
Bhardwaj, Gopesh 9874a65bea output format envs doc update (#173)
[ROCm/rocprofiler-sdk commit: 075d36eb82]
2025-02-11 21:37:12 -06:00
Baraldi, Giovanni 8d709bc12f SWDEV-513725: Update readme for gfx11+ power states (#193)
* Update readme

* Update README.md

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Address review comments

* Update README.md

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

[ROCm/rocprofiler-sdk commit: 831e469320]
2025-02-11 18:12:41 -06:00
Madsen, Jonathan 81250fa3d4 rocprofv3: Update rocprofv3 command line for ATT (#201)
* rocprofv3: suppress agent info when no data collected

* Update output config serialization

- full serialization of output configuration

* Update rocprofiler-sdk-att/tests

- add version and soversion
- change output directory
- generate libatt_decoder_summary
- disable tests instead of removing them

* Update rocprofv3 command-line

- make --att-library-path hidden by default
- simplify check_att_capability
- reorder pc sampling options
- add hidden --echo option
- remove ROCPROF_LIST_AVAIL_TOOL_LIBRARY from preload

* Add new rocprofv3 tests for specifying the ATT library path

* Tweak to rocprofv3-test-hsa-multiqueue-att tests

* Update rocprofv3 tool to enable output with att

* Fix standalone test installation

* Revert to fetchcontent_makeavailable to fetchcontent_populate

* Revert tests/common/CMakeLists.txt

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 59b41ab5aa]
2025-02-11 18:10:48 -06:00
Madsen, Jonathan 5cc6244389 SDK: Agent UUIDs, agent runtime visibility, kernel symbol address (#154)
* [DO NOT MERGE] Misc UUID updates

- this is WIP

* Agent visibility

- Support for ROCR_VISIBLE_DEVICES, HIP_VISIBLE_DEVICES, CUDA_VISIBLE_DEVICES, GPU_DEVICE_ORDINAL

* Update CHANGELOG

* tweak to rocprofiler_agent_runtime_visiblity_t

* Code object kernel address

- new fields in code_object_kernel_symbol_register_data_t
  - kernel_code_entry_byte_offset
  - kernel_address

* Support ROCR_VISIBLE_DEVICES reordering devices for HIP

* Addressed code review changes

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 6246ec4040]
2025-02-11 14:36:23 -06:00
Madsen, Jonathan 96ec52f2da rocprofv3: do not abort if counter does not have dimensions (#150)
* rocprofv3: do not abort if counter does not have dimensions

* Relax error handling further in rocprofv3 metadata

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 3071199386]
2025-02-11 14:31:25 -06:00
Kandula, Venkateshwar reddy a585468121 [BUG FIX] store dimensions in counter id when used reduce operator (#181)
* save other dimension in counter id.

* Formatting

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>

[ROCm/rocprofiler-sdk commit: 143f84fe6b]
2025-02-11 13:05:57 -06:00
Madsen, Jonathan d7495f9f1a Re-enable clang-tidy for core workflows + clang-tidy fixes (#197)
* Ensure the clang-tidy is updated + clang-tidy fixes

* update-ci workflow

* Enable clang-tidy checks

* Add extra logging to device counter collection samples

* Misc clang-tidy fixes

* Disable device counter collection samples for ThreadSanitizer

* Formatting

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 070b659a9a]
2025-02-11 10:58:47 -06:00
Bhardwaj, Gopesh 6eb343aa4a Adding pc sampling how to guide (#160)
* Adding pc sampling how to guide

* doc update

* Fixing indentation

* updating index

* updating doc

* updating doc

* Added field information

* Fixing Formatting

* fix formatting error

* Added json format for pc sampling

* feedback resolved

* formatting for text

* PC Sampling API doc

* Reformatted

* Note for shared systems

* update docs

* correcting relative path for cross-referencing

---------

Co-authored-by: vlaindic_amdeng <vladimir.indic@amd.com>

[ROCm/rocprofiler-sdk commit: cdf22eba7d]
2025-02-10 20:33:05 -06:00
Elwazir, Ammar f8bff7b835 Disabling Mi325 temp. (#199)
[ROCm/rocprofiler-sdk commit: c478c24616]
2025-02-10 20:07:20 -06:00
Welton, Benjamin b90f127957 [SWDEV-513658] Force HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG value to be used with HSA calls (#192)
* Force HSA_AMD_MEMORY_POOL_EXECUTABLE_FLAG  value to be used with HSA calls

Fix for CI

* More tweaks

* Increase reproducible-runtime kernel sleep granularity

* Fix data race in synchronous device counter collection sample

* Update device counting service

- add get_active_context function

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 080b2ba451]
2025-02-10 11:34:26 -06:00
Indic, Vladimir cd8578cf53 Show host-trap configurations only (#194)
[ROCm/rocprofiler-sdk commit: e67a4451d8]
2025-02-10 11:32:53 -06:00