Graf commitů

103 Commity

Autor SHA1 Zpráva Datum
Welton, Benjamin 007285272b [SWDEV-518071] Return HSA not loaded status (device counter collection) (#242)
* [SWDEV-518071] Return HSA not loaded status (device counter collection)

This is a state that a caller would want to know about to understand if
they got no counters because of a failure or if they were trying to
collect counters too early (as is the case in the sample, which can
attempt to collect counters before HSA is inited).

* Minor edit

* format

* [SWDEV-518081] Simplify Metric Loading (#243)

* [SWDEV-518071] Return HSA not loaded status (device counter collection)

This is a state that a caller would want to know about to understand if
they got no counters because of a failure or if they were trying to
collect counters too early (as is the case in the sample, which can
attempt to collect counters before HSA is inited).
* [SWDEV-518324] Add AST update support

Allows the ability for ASTs to be updated (instead of an unchangable
static value). Adds a shared pointer return type to protect against
static destructors/modifications from invalidating potentially in use
AST definitions. No functionality/use changes in this PR.
* [SWDEV-518593] Add updatable dimension cache + fix string issues (#252)

* [SWDEV-518593] Add updatable dimension cache + fix string issues

Updates dimension cache to use the same design pattern as AST/Metrics.

Fixes the string scoping issue seen in ASTs, which appears here as well.

* Add rocprofiler_create_counter

Creates derived counters based on input from the API. This PR does three
things:

1. Adds the API + test case
2. Validates that an AST can be constructed from the counter supplied.
3. Updates metrics, ast, and dimension caches to include the new metric.

Metric should be available for use immediately after the call completes.

Due to the regeneration of ASTs, this call should not be performed in
performance sensitive code.

* Suggestion fixes

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>

* Minor tweak

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>

* Fixes for comments

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
2025-03-14 01:07:16 -07:00
Verma, Saurabh cffda33d3c Fixes for runtime errors reported in id_decode.hpp:set_dim_in_rec() by Mi300 UndefinedBehaviorSanitizer job (#114)
* Initial fix for runtime error in id_decode.hpp:set_dim_in_rec()

* actual fix: corrected the handling of case where dim==1 (ROCPROFILER_DIMENSION_NONE)

* removing magic numbers

* minor fix

* fix for invalid bool value at runtime

* clang format

* build fix

---------

Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-03-13 20:17:32 -07:00
Welton, Benjamin f7e94c1ee8 Add debug printing statement to packet submission (#212)
* Add debug printing statement to packet submission

Adds debug printing to packets being submitted to HSA Queue in device
counting mode.

* Minor change

* Small fix

* formatting

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
2025-03-10 14:02:30 -07:00
Kandula, Venkateshwar reddy 5cb4ad449f SWDEV-515574: Cache Number_Node static value. (#217)
* Cache Number_Node static value. To avoid value overwriting in consecutive dispatch callbacks.

* Format.

* tests for number_node evaluate.

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
2025-02-19 19:10:16 -06:00
Madsen, Jonathan 0d7ca72d84 SWDEV-514449: Fix missing thread pre/post callbacks (#204)
* SWDEV-514449: Fix missing thread pre/post callbacks

- invoke pre/post-callback around internal thread creation

* Update changelog

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-13 12:39:10 -06:00
Kandula, Venkateshwar reddy 143f84fe6b [BUG FIX] store dimensions in counter id when used reduce operator (#181)
* save other dimension in counter id.

* Formating

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
2025-02-11 13:05:57 -06:00
Madsen, Jonathan e743bf5a93 Undefined behavior warnings caught by ROCPROFILER_DEFAULT_FAIL_REGEX (#23)
* Add regex for undefined behavior to ROCPROFILER_DEFAULT_FAIL_REGEX

- add UBSAN_OPTIONS to setup-sanitizer-env.sh

* Improve ROCPROFILER_DEFAULT_FAIL_REGEX

* Use -fno-sanitize-recover=undefined flag

- this compiler flag causes all undefined behavior errors to exit

* Revert ROCPROFILER_DEFAULT_FAIL_REGEX

* fix for shift overflow

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Manjunath-Jakaraddi <manjunath.jakaraddi@amd.com>
2025-02-06 08:55:57 -06:00
Madsen, Jonathan 0fbe6cc7b6 SDK: No bg thread if no clients use SDK (#123)
* SDK: No bg thread if no clients use SDK

* Update CHANGELOG

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-06 08:34:56 -06:00
Welton, Benjamin 0c4a56c6bb [SWDEV-509876] Remove buffer requirement from device counting service (#132)
* [SWDEV-509876] Remove buffer requirement from device counting service

No longer require a buffer to be given when setting up device counting
service. This is to reduce performance overhead in cases where immediate
return of counting samples is being used (synchronous mode).

* Missed file

* Update source/include/rocprofiler-sdk/device_counting_service.h

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Update source/lib/rocprofiler-sdk/counters/controller.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Update source/lib/rocprofiler-sdk/counters/device_counting.cpp

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Fixes for build

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-02-04 06:06:03 -06:00
Kandula, Venkateshwar reddy 121901c321 add gfx12 for counter collection tests (#108)
* add gfx12 for counter def.

* Update continuous_integration.yml

* Update counter_defs.yaml

* commenting logging.

* Update ioctl.cpp

* add gfx12 to tests

* Update ioctl.cpp

* Add description to GFX12 GL2C_EA_RDREQ counter

* Updates from editor

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Kuricheti, Mythreya <Mythreya.Kuricheti@amd.com>
2025-01-30 15:16:48 -06:00
Kuricheti, Mythreya d43070bf08 Fix navi48 counter event IDs (#158)
* Initial fix for navi48 counters

* Add GL2C navi4x gfx12 counters
2025-01-30 13:40:25 -06:00
Rawat, Swati 97b7a6315d update copyright date to 2025 (#102)
* Update LICENSE

* Update conf.py

* Update copyright year

* [fix] Update copyright year

* Update copyright year "ROCm Developer Tools"

* Add license headers to c++ files

* Add license to *.py

* Update licenses in rocdecode sources

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Mythreya <mythreya.kuricheti@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-22 19:11:20 -06:00
Welton, Benjamin 3076660e60 Add gfx941/gfx942 to ValuPipeIssueUtil (#139) 2025-01-21 14:39:03 -08:00
Welton, Benjamin c6b52701c7 Add gfx940/gfx9 to ValuPipeIssueUtil (#138)
Was dropped, likely by mistake, in the transition to yaml
2025-01-21 12:12:39 -08:00
Welton, Benjamin 536fbba627 [SWDEV-509659] Skip rocprof device counting tests if lacking permissions (#125)
* [SWDEV-509659] Skip rocprof device counting tests if lacking permissions

Skips non-intercept test if proper permissions are not obtained
(SYS_PERFMON). This should be the only test that fails due to permission
issues (others do not require the IOCTL to pass).

Regex match sample: https://regexr.com/8b29s

* Update source/lib/rocprofiler-sdk/counters/tests/CMakeLists.txt

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* fix

* Update source/lib/rocprofiler-sdk/counters/tests/CMakeLists.txt

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2025-01-16 01:51:06 -06:00
Welton, Benjamin 71dc203b0c Small debug print fix in ioctl.cpp (#120)
* Small debug print fix in ioctl.cpp

Fix debug print statement to print agent id.

* formatting

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-15 18:00:04 -08:00
Baraldi, Giovanni 27266eb242 SWDEV-508485: Adding MFMA F8 metric (#112)
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-01-09 15:27:16 -06:00
Baraldi, Giovanni fddd8ac4aa SWDEV-490031: Adding new rdc ops metrics (#96)
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-01-06 11:02:23 +00:00
Baraldi, Giovanni 200a8624bc SWDEV-495749: Adding SIMD_UTILIZATION metric (#74)
* SWDEV-495749: Adding SIMD_UTILIZATION metric

* Fix mfmautil

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2024-12-20 14:35:18 +00:00
Welton, Benjamin c574881cdb Add support for device counter collection ioctl (#46)
Add support for device counter colleciton ioctl

Adds support for the device counter collection IOCTL. This IOCTL
allows for device wide counters to be collected even if the queue
is not intercepted by rocprofiler-sdk (required for system profilers).

A test is also included which checks this behavior by creating a queue
that does not have profiling enabled on it and checks to see if SQ
counters can be read from it. Note: this test will be skipped if the KFD
version does not contain this IOCTL.

Right now the check is "soft" in that if the IOCTL is present and there
is an error with permissions, rocprofiler will continue but will print
an error stating that system wide device profiling and collected counter
values may be degraded. This is primarily to avoid breaking existing
users (like PAPI) who may not need the IOCTL's capability and to give
them time to update.

Co-authored-by: Benjamin Welton <ben@amd.com>
2024-12-19 13:27:35 -08:00
Baraldi, Giovanni f4984f9dcc SWDEV-489158: Fix for exit thread safety (#61)
* SWDEV-489158: Fix for exit thread safety

* Fixed exit thread logic

* Force CI to rerun

* Remove .vscode

* Fix thread safety bug

* Addressed some comments

* Formatting

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2024-12-11 12:41:19 -06:00
Welton, Benjamin 253c9adfc1 [AFAR VII] rocprofiler_sample_device_counting_service return data as part of API call (#57)
---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Benjamin Welton <ben@amd.com>
2024-12-06 22:37:45 -08:00
Baraldi, Giovanni b7661bccfd SWDEV-489158: Adding consumer+producer model to AST evaluation (#13)
* Rebased optizations for rocprofv3 tool

* Fixing merge conflicts

* Formatting

* Open from within mutex

* Small name changes

* Added operator

* removed some parameters

* Optimizing counter collection

* Re-arrange code

* Adding back dimension query

* Formatting

* Update source/lib/rocprofiler-sdk/thread_trace/att_core.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Formatting 2

* Fix for test compilation

* Fix for yield

* Adding back check for zero

* Improved thread handling

* Formatting

* Remove automatic start

* Adding test

* Small fixes

* Adding lock for buffer callbacks

* Fix for race condition in AST

* Adding check for ptr

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-12-04 22:33:53 -06:00
Benjamin Welton 7ddc72ad45 Add rocprofiler_load_counter_definition (#1193)
Adds rocprofiler_load_counter_definition. This function allows a counter definition file to be supplied to rocprofiler-sdk directly. Takes in a string containing the counter definition YAML, its size (in bytes), and a flag value to state whether this is an append operation or not.

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: usrihari123 <srihari.u@amd.com>
2024-11-22 01:55:47 -08:00
venkat1361 472907a576 Dimension support for reduce operator (#1147)
* cache reference nodes

* evaluation based on dim args

* format

* add dimensions for reduce operator

* add dimensions for reduce operator

* add dimensions for reduce operator docs

* add dimensions for reduce operator.

* refactor switch cases

* Update CHANGELOG.md

* updated doc with data example

* updated doc with data example for reduce operation.

* added fallthrough in switch case sum.

* changelog.md

* format

* fix bug in constuct_test_data()
2024-11-11 18:37:28 +05:30
venkat1361 cc4811d27d SWDEV-477244: Select() Expression Dimension Support (#1091)
* add support for select function in derived counters

* formatting

* renaming select dims variable name from set to map

* format

* Update doc with select() for dimensions

* use : for defining range of values in select dims

* - update dimension for metric after select.
- make sure to raise runtime error if user provides range for a dimension.

* use map instead of unordered_map for select dim info

* new line EOF

* fix bug: select() operator.

* Update evaluate_ast.cpp

format

* added a check for dim value exceeds max.

* Update source/lib/rocprofiler-sdk/counters/evaluate_ast.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update source/lib/rocprofiler-sdk/counters/evaluate_ast.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* updated doc with data example for select operation.

* changelog.md

* Update CHANGELOG.md

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-11-11 15:14:22 +05:30
Benjamin Welton 89167f03a3 Counter definitions for GFX12 (#1038)
Co-authored-by: Benjamin Welton <ben@amd.com>
2024-11-08 08:27:15 -08:00
Giovanni Lenzi Baraldi 4acca76edb Update fetch_size metric (#1165)
Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
2024-10-29 21:44:27 -03:00
Benjamin Welton 4a5b1d98c2 SDK: counter collection serialization per device (#1157)
Migrates profiler_serializer class in QueueController to have an instance per-agent instead of one globally. Other changes in this commit are to allow for maps of the queues associated with each agent to be passed to profiler_serializer when it is turned on/off. Existing test cases cover whether or not the kernels are serialized (multistream app). New test case added to show that this serialization only occurs on a per device level with a kernel launched on one device waiting for a value to be set on the other.
2024-10-25 13:13:36 -07:00
Jonathan R. Madsen 74facf87a6 CMake: Consistently name CMake Targets (#1082)
* Change all rocprofiler-X target names to rocprofiler-sdk-X

* Update rocprofiler-sdk-config.cmake

- fix install tree target names
- simplify logic for using find w/ components and find w/o components

* Update rocprofiler-sdk-roctx-config.cmake

- simplify logic for using find w/ components and find w/o components

* Update samples/intercept_table/CMakeLists.txt

- demonstrate/test use of `find_package(rocprofiler-sdk ... COMPONENTS ...)`
2024-10-25 11:17:34 -05:00
venkat1361 9d3d5cc07c Max reduction operation bug fix & support reduction for derived counters (#1085)
* Max reduction operation bug fix & support reduction for derived counters

* CHANGELOG

* Added missing new line

---------

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
2024-10-22 22:49:51 +05:30
venkat1361 3f91d90bbc Check to force tools to initialize the ctx id to zero. (#1135)
* Check to force tool to initialize the ctx id to zero.

* initialize rocprofiler_context_id_t with 0 in units tests

* changelog

---------

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
2024-10-22 18:09:25 +05:30
venkat1361 22881148bd Parsing Select() expression dimension args to support range of values. (#1074)
* dim args for select token now supports range of values instead of single integer

* dim args for select token now supports range of values instead of single integer

* modified raw_ast fmt::format

* CHANGELOG.md

* varible rename for select dim from set to map in suffix

* use : for giving range of values

* Update source/lib/rocprofiler-sdk/counters/parser/scanner.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* format

* Update CHANGELOG.md

* using map instead of unordermap for select_dimension_map

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
2024-10-22 11:44:08 +05:30
Benjamin Welton 210762c69d Added agent_id to rocprofiler_record_counter_t (#1078)
Co-authored-by: Benjamin Welton <ben@amd.com>
2024-10-21 16:29:53 -07:00
Benjamin Welton c6ae3961f8 Test for AQL packet decoding for SQ Waves (#765)
Checks the decoding of AQL packets for SQ Waves by
launching a kernel, injecting the AQL packets, and
decoding the result. This does not use write interceptor
but does this check on a raw HSA stream with direct
injection.
2024-10-21 10:41:29 -07:00
Benjamin Welton 788e687167 Agent Profiling Fixes for Broken/Improper API Usage (#1122)
Prevent's multiple setups of agent profiling on the same agent.

Fixes agent read context to only read agents that were setup.

Prevent copy of agent profiling internal data struct and reset
hsa_signal on move to prevent inadvertant delete.
2024-10-18 15:48:22 -07:00
Benjamin Welton bb69467765 Renamed agent profiling service to device counting service (#1132)
* Renamed agent profiling service to device counting service

Name more aptly represents what agent profiling did (device wide
counter collection). Conversion of existing user code can be
performed by the following find/sed command:

find . -type f -exec sed -i 's/rocprofiler_agent_profile_callback_t/rocprofiler_device_counting_service_callback_t/g; s/rocprofiler_configure_agent_profile_counting_service/rocprofiler_configure_device_counting_service/g; s/agent_profile.h/device_counting_service.h/g; s/rocprofiler_sample_agent_profile_counting_service/rocprofiler_sample_device_counting_service/g' {} +

* Converted dispatch profile to dispatch counting service

* Debug for functioal counters test

* Minor changes for CI

* Minor fix

* More fixes for CI

* Update evaluate_ast.cpp

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
2024-10-18 14:14:11 +05:30
Giovanni Lenzi Baraldi 9652a6e738 SWDEV-487621: Fixes for metric definitions (#1118)
* Fixes for metric definitions

* Removing gfx8

* Update changelog

* Fixing unit tests

* Small fixes

* Fix for write size
2024-10-08 02:44:31 +05:30
Giovanni Lenzi Baraldi fa7fc7ec5d Fixing some accumulate metrics (#1089)
* Fixing some accumulate metrics

* Fixing some more accumulate metrics

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
2024-10-03 09:44:31 -07:00
Gopesh Bhardwaj fa91169479 Fixing missing counters for gfx900 (#1028) 2024-08-21 11:54:30 +05:30
Gopesh Bhardwaj 439025d421 Adding HW Block Information (#1021)
* Adding HW Block Information

* Addressed Review comments
2024-08-21 10:00:41 +05:30
Jonathan R. Madsen bb25376480 Misc API cleanup and consistency fixes (#1023)
- ROCPROFILER_API after function
- use rocprofiler_tracing_operation_t in lieu of uint32_t where appropriate
- rocprofiler_tracing_operation_t is not int32_t typedef (formerly uint32_t)
- use const T* instead of T* where appropriate
2024-08-20 01:06:12 -05:00
Jonathan R. Madsen b15e498945 Add kernel profiling time info to counter collection records (#1000)
* Add kernel profiling time info to counter collection records

- lib/rocprofiler-sdk/kernel_dispatch
  - added profiling_time.{hpp,cpp}
  - restructured tracing.cpp
- updated queue.cpp AsyncSignalHandler
  - gets kernel dispatch profiling time and passes to dispatch_complete and signal callbacks
- structured some header includes to reduce cyclic include probability
  - originally, including kernel_dispatch/tracing.hpp in hsa/queue.hpp created a lot of cyclic includes

* Fix kernel_dispatch.cpp includes

* Fix kernel_dispatch.cpp

- include <cstring>
- replace use of ROCPROFILER_HSA_AMD_EXT_API_ID_NONE with ROCPROFILER_KERNEL_DISPATCH_LAST
2024-08-19 20:05:04 -05:00
Benjamin Welton 27a408f5cc SQ Counter Documentation (#978)
* SQ Counter Documentation

Improve documentation of SQ counters. Attempts to
make what the counters are outputting (and where
applicable what the counter means in terms o
performance) more clear.

* pre-format

* Address comments + YAML formatting

* More definition fixes

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
2024-07-30 07:46:10 -05:00
Benjamin Welton 34897d318f Convert counter def format to YAML (#976)
* Convert counter def format to YAML

Converts counter definition format to YAML with the
following structure:

```yaml
COUNTER_NAME:
 architectures:
  gfxXX: // Can be more than one, / deliminated if they share idential data
    block: <Optional>
    event: <Optional>
    expression: <optional>
    description: <Optional> // In case per arch notes are needed
  gfxYY:
    ...
 description: General counter desctiption
```

All counters (derived and hardware) are now defined
in the same file for ease of future additions/subtractions.

Removes existing XML parser. Keeps the existing XML
definitions for now (since other tools still rely on
its presence).
2024-07-12 16:20:33 -07:00
Jonathan R. Madsen 1e49b43738 Miscellaneous updates (#959)
- missing-new-line CI job: ensures all source files end with new line
- logging updates
- add new line to the end of many files
- fix header include ordering is misc places
- transition to use hsa::get_core_table() and hsa::get_amd_ext_table() in various places instead of making copies
2024-07-08 16:50:32 -05:00
Giovanni Lenzi Baraldi 4e2144dbfa General fixes to ATT, packets and event ID retrieval (#960)
* General fixes to ATT, packets and event ID retrieval

* Update source/lib/rocprofiler-sdk/hsa/aql_packet.hpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-07-04 03:58:45 -03:00
Jonathan R. Madsen 64b8f8370e Fix kernel trace gaps (#961)
- source/lib/rocprofiler-sdk/hsa/queue.cpp
  - Optimize WriteInterceptor to eliminate extra barrier packets causing gaps between kernels in kernel tracing
  - increase timeout_hint in hsa_signal_wait in set_profiler_active_on_queue
  - misc logging improvements
- source/lib/rocprofiler-sdk/counters/agent_profiling.cpp
  - increase timeout_hint in hsa_signal_wait in set_profiler_active_on_queue
- tests/rocprofv3/hsa-queue-dependency/CMakeLists.txt
  - add TIMEOUT for rocprofv3-test-hsa-multiqueue-execute
2024-07-02 18:49:04 -05:00
Giovanni Lenzi Baraldi a78753d392 Accumulation metrics support and update counter collection API to aqlprofile_v2 (#915)
* Updating to v3 API

* General fixes

* Extending dimension bits to 54

* Disabling agent profiling tests

* Fixed unit test

* Adding accumulate metric support for parsing counters (#609)

* Adding accumulate metric support for parsing counters

* Adding metric flag

* Updating tests

* source formatting (clang-format v11) (#610)

Co-authored-by: Manjunath-Jakaraddi <21177428+Manjunath-Jakaraddi@users.noreply.github.com>

* source formatting (clang-format v11) (#614)

Co-authored-by: jrmadsen <6001865+jrmadsen@users.noreply.github.com>

* Adding evaluate ast test

* source formatting (clang-format v11) (#633)

Co-authored-by: Manjunath-Jakaraddi <21177428+Manjunath-Jakaraddi@users.noreply.github.com>

* Update scanner generated file

* Adding flags to events for aqlprofile

* Fix Mi200 failing test

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Manjunath-Jakaraddi <21177428+Manjunath-Jakaraddi@users.noreply.github.com>
Co-authored-by: jrmadsen <6001865+jrmadsen@users.noreply.github.com>

* Revert "Extending dimension bits to 54"

This reverts commit 3cd6628452484044a93e129f27974f996a0e4c08.

* Removing CU dimension

* Fixing merge conflicts

* Revert "Disabling agent profiling tests"

This reverts commit 7e01518ed8c51fbb0c3b2575e1e0b8f9ddfa8237.

* Fixing merge conflicts

* Fix parser tests

* Adding accumulate metric documentation

* Update counter_collection_services.md

* Update index.md

* fix nested expression use

* Update source/lib/rocprofiler-sdk/counters/evaluate_ast.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Doc update

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Manjunath P Jakaraddi <manjunath180397@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Manjunath-Jakaraddi <21177428+Manjunath-Jakaraddi@users.noreply.github.com>
Co-authored-by: jrmadsen <6001865+jrmadsen@users.noreply.github.com>
Co-authored-by: Manjunath-Jakaraddi <manjunath.jakaraddi@amd.com>
2024-07-01 21:56:41 -03:00
Benjamin Welton 81d1407565 Incremental Counter Profile Creation (#933)
* Incremental Counter Profile Creation

Adds support for incremental counter creation. How this functions is the
behavior of rocprofiler_create_profile_config has been changed.

rocprofiler_create_profile_config(rocprofiler_agent_id_t           agent_id,
                                  rocprofiler_counter_id_t*        counters_list,
                                  size_t                           counters_count,
                                  rocprofiler_profile_config_id_t* config_id)

The behavior of this function now allows an existing config_id to be
supplied via config_id. The counters contained in this config will be
copied over and used as a base for a new config along with any counters
supplied in counters_list. The new config id is returned via config_id
and can be used in future dispatch/agent counting sessions.

A new config is created over modifying an existing config since there
is no gaurentee that the existing config isn't already in use. While we
could add locks (or other mutual exclusion properties) to check if its
in use and reject an update, the benefit from doing so is minor in
comparison to just creating a new config. This also side steps a common
pattern a tool may use to add additional counters at some point later on
during execution. Now they can do that without destroying the existing
config.

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-06-19 00:11:03 -07:00