Commit Graph

257 Commits

Author SHA1 Message Date
Jonathan R. Madsen 7e2a6916df Fix rocprofiler-sdk-tool-kokkosp (#1094)
* Fix rocprofiler-sdk-tool-kokkosp

- missing symbol `rocprofiler::common::impl::get_env(std::basic_string_view<char, std::char_traits<char> >, bool)`

* cmake formatting

[ROCm/rocprofiler-sdk commit: 2f0ec02950]
2024-09-24 17:04:58 -05:00
Jonathan R. Madsen 120a358069 Update HIP tracing ABI (#1025)
* Update HIP ABI tracing

* Minor HIP abi.cpp updates

* Misc roctx updates (version.h + more)

* Common static thread-local template struct

- static_tl_object
- similar to static_object but with thread-local semantics

* rocprofiler-sdk/version.h updates

* Update for HIP_RUNTIME_API_TABLE_STEP_VERSION == {4,5,6}

* Fix roctx.cpp tweaks

[ROCm/rocprofiler-sdk commit: 7861dcc6c6]
2024-09-13 17:10:35 -05:00
venkat1361 b9d2b3f495 SWDEV-476852 - Check added for agent architecture counters support. (#1022)
* check added for agent arch support

* formatting issue

[ROCm/rocprofiler-sdk commit: bc82eccf4f]
2024-09-13 11:28:00 -07:00
Mythreya d008463d50 Enable queue interception with scratch reporting (#1069)
* Enable queue interception with scratch reporting

Scratch reporting reports agent ID in buffer and callback records, but
HSA runtime provides only queue ID in the scratch callback.

This change enables queue interception when scratch reporting is requested

* Validation test for rocprofv3 + scratch-memory-trace

* Simplify checks for whether context is tracing a domain

* Update changelog

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: efbe4ea0a2]
2024-09-12 18:26:34 -05:00
Jonathan R. Madsen db0f26f562 Prevent misaligned read from common::container::ring_buffer (#1076)
- causes undefined behavior

[ROCm/rocprofiler-sdk commit: 6098d52335]
2024-09-12 18:25:05 -05:00
Jonathan R. Madsen fcd6cc45bd Package RCCL headers to support adding RCCL support w/o installed headers (#1075)
- in ROCm CI, rocprofiler-sdk gets built before RCCL is installed; packaging the headers is a workaround for this issue

[ROCm/rocprofiler-sdk commit: 8c1382fceb]
2024-09-12 18:24:50 -05:00
Jonathan R. Madsen 3a5154c5ff rocprofv3 Kokkos-Tools Support (#1058)
[ROCm/rocprofiler-sdk commit: d5bcb63263]
2024-09-12 00:46:07 -05:00
Mythreya c47a128941 Add support for RCCL tracing (#1047)
* [Draft]: Add support for RCCL tracing

Address comments

* [Draft]: Add support for RCCL tracing

Address PR comments, changes from RCCL upstream

* Add RCCL library table registration

Working on adding support to rocprofiler-register

* Support compilation w/o <rccl/amd_detail/api_trace.h>

- dummy api_trace.h header
- return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED when RCCL does not have api_trace.h header

* RCCL API tracing tool support

- add to rocprofv3
- add to json-tool

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 2a146259c7]
2024-09-12 00:42:58 -05:00
Jonathan R. Madsen 4d01508d0b LD_PRELOAD librocprofiler-sdk-roctx.so when marker-trace enabled (#1057)
* LD_PRELOAD librocprofiler-sdk-roctx.so when marker-trace enabled

- this enables apps to link against old ROCTx (libroctx64.so) but get marker tracing in rocprofv3
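
A minimal Python sketch of the preload step described above. The helper name `with_roctx_preload` and its parameters are hypothetical and are not taken from rocprofv3.py; the sketch only illustrates appending a library to `LD_PRELOAD` without discarding entries the user already set.

```python
def with_roctx_preload(env, library="librocprofiler-sdk-roctx.so", marker_trace=True):
    """Return a copy of env with `library` appended to LD_PRELOAD.

    Hypothetical helper: the real rocprofv3 logic may differ.
    """
    updated = dict(env)
    if not marker_trace:
        # Marker tracing disabled: leave the environment untouched.
        return updated
    # Keep existing entries in order and drop empty fragments.
    entries = [item for item in updated.get("LD_PRELOAD", "").split(":") if item]
    if library not in entries:
        entries.append(library)
    updated["LD_PRELOAD"] = ":".join(entries)
    return updated
```

Appending rather than overwriting matters here because sanitizer runtimes and other tools may also rely on `LD_PRELOAD`.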

* Update CHANGELOG

* Validation test for app linked to old (roctracer) ROCTx library

* Tweak scope of tool_counter_info

- causing "signal-unsafe call inside of a signal" error for ThreadSanitizer on mi200

* Fix handling of missing transpose-roctracer-roctx

* Disable rocprofv3 aborted-app test (ThreadSanitizer)

- ThreadSanitizer + mi200/mi300 + aborted-app results in a signal-unsafe call inside a signal that cannot be specifically suppressed as usual via rocprofv3_error_signal_handler for some unknown reason

* Add UndefinedBehaviorSanitizer job

[ROCm/rocprofiler-sdk commit: 72cbcedc9e]
2024-09-11 15:27:35 -05:00
Giovanni Lenzi Baraldi a91eab65f8 Removing unnecessary barrier packet (#1017)
[ROCm/rocprofiler-sdk commit: 474b72b4fc]
2024-09-11 16:29:57 +05:30
Gopesh Bhardwaj 2a4591dcae Fix rocprofv3 output filename containing sub-directory (#1062)
* Fix -d option broken by hostname

* Fix rocprofv3 output filename containing directory

* Fix TID handling in Perfetto and OTF2 output

* Revert changes which removed hostname

* Revise tests/rocprofv3/tracing output filenames

- specify an output filename for tests which include a subdirectory

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: a4b3a57ecc]
2024-09-10 17:49:36 -05:00
Jonathan R. Madsen 34c35c26ba Fix misaligned stores in buffer (#1063)
* Fix misaligned read/write to buffer

- causes undefined behavior

* Update run-ci.py

- fix spurious CDash submission failure warning

* Improve run-ci.py support for UBSan

* Relax rocprofv3 summary stats count expectation

* Update CHANGELOG

[ROCm/rocprofiler-sdk commit: 37e0d7efce]
2024-09-10 17:08:57 -05:00
itrowbri fe6acf4a01 SWDEV-466452: incomplete gromacs pftrace (#1040)
* SWDEV-466452: Inserted `tracing_session->FlushBlocking()` after
TRACE_EVENT_END when recording trace events to resolve incomplete
pftrace bug

* Update source/lib/rocprofiler-sdk-tool/generatePerfetto.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Formatting

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>

[ROCm/rocprofiler-sdk commit: ceb7237652]
2024-09-09 11:45:57 -05:00
Jonathan R. Madsen 6743608258 rocprofv3: summary reports + more JSON metadata (#1029)
* Move include/rocprofiler-sdk/cxx/details/delimit.hpp to tokenize.hpp

* Update docs/how-to/using-rocprofv3.rst

- fix code block indents
- reorder rocprofv3 options, limit them to important options
- add docs for `--runtime-trace`

* Update rocprofv3.py

- parser argument groups
- new `--runtime-trace` option
- new `--summary` option
- new `--summary-per-domain` option
- new `--summary-groups` option
- new `--summary-output-file` option
- new `--summary-units` option

* Update lib/rocprofiler-sdk/hsa/async_copy.cpp

- fix async copy operation names: add "MEMORY_COPY_" prefix

* lib/rocprofiler-sdk-tool: update statistics.{hpp,cpp}

- statistics<>::get_percent function
- stats_entry_t struct
- stats_formatter struct
- percentage struct
- std::to_string(::rocprofiler::tool::percentage)
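
As a rough illustration of the statistics helpers listed above, the following Python sketch accumulates per-name durations and derives each entry's share of the total. The class and function names (`Stats`, `summarize`) are hypothetical; the actual C++ types in statistics.{hpp,cpp} may be structured differently.

```python
from dataclasses import dataclass


@dataclass
class Stats:
    """Running count/sum/min/max for one named operation (hypothetical layout)."""

    count: int = 0
    total: int = 0
    minimum: int = 0
    maximum: int = 0

    def add(self, value):
        # The first sample initializes min and max; later samples widen them.
        if self.count == 0:
            self.minimum = self.maximum = value
        else:
            self.minimum = min(self.minimum, value)
            self.maximum = max(self.maximum, value)
        self.count += 1
        self.total += value

    def mean(self):
        return self.total / self.count if self.count else 0.0

    def percent_of(self, overall_total):
        # Share of the overall duration, guarding against division by zero.
        return 100.0 * self.total / overall_total if overall_total else 0.0


def summarize(durations_by_name):
    """Return (name, Stats, percent) rows sorted by total duration, largest first."""
    per_name = {}
    for name, durations in durations_by_name.items():
        entry = per_name.setdefault(name, Stats())
        for duration in durations:
            entry.add(duration)
    overall = sum(stats.total for stats in per_name.values())
    rows = [(name, stats, stats.percent_of(overall)) for name, stats in per_name.items()]
    rows.sort(key=lambda row: row[1].total, reverse=True)
    return rows
```

Keeping accumulation separate from formatting mirrors the split this commit describes between writing CSV and accumulating statistics.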

* lib/rocprofiler-sdk-tool: update domain_type.{hpp,cpp}

- reorder domain_type enum values

* lib/rocprofiler-sdk-tool: update generateCSV.{hpp,cpp}

- separate writing CSV from accumulating statistics
- a lot of functionality was moved to statistics.{hpp,cpp}

* lib/rocprofiler-sdk-tool: update output_file.{hpp,cpp}

- output_stream_t struct
- get_output_stream(...) returns output_stream_t instance

* lib/rocprofiler-sdk-tool: update generateJSON.cpp

- update get_output_stream usage to output_stream_t

* lib/rocprofiler-sdk-tool: update generateOTF2.cpp

- header include order tweak

* lib/rocprofiler-sdk-tool: update buffered_output.hpp

- stats_data_t was renamed to stats_entry_t

* lib/rocprofiler-sdk-tool: update generatePerfetto.cpp

- header include tweak

* lib/rocprofiler-sdk-tool: update tmp_file_buffer.hpp

- emit warning message if write_ring_buffer fails after offloading instead of aborting
- prefer placement new instead of assignment in write_ring_buffer

* lib/rocprofiler-sdk-tool: add generateStats.{hpp,cpp}

- functions for accumulating statistics

* Update tests/rocprofv3/tracing-hip-in-libraries/CMakeLists.txt

- accommodate tweak to CSV output file name for HIP and HSA traces

* lib/rocprofiler-sdk-tool: update config.{hpp,cpp}

- new config variables
  - stats_summary
  - stats_summary_per_domain
  - summary_output
  - stats_summary_unit_value
  - stats_summary_unit
  - stats_summary_file
  - stats_summary_groups
- support output keys for hostname: %hostname% / %h

* lib/rocprofiler-sdk-tool: update tool.cpp

- support summary output

* Documentation fixes

* Test for summary output

* Update tests/bin/transpose to use more ROCTx

- also support building with the roctracer ROCTx

* Remove roctxMark from OTF2 + fix kernel-rename tests

- following more ROCTx calls in transpose, kernel-rename validation had to be updated

* JSON metadata + JSON summary

- add serialization support for config
- add serialization support for statistics
- additions to json spec
  - rocprofiler-sdk-tool/metadata/config
  - rocprofiler-sdk-tool/metadata/command
  - rocprofiler-sdk-tool/summary
- config output_keys support for NVIDIA %q{<ENV-VAR>} syntax
- config output_keys support keys within keys
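
The output keys mentioned in this commit message (`%hostname%` / `%h`, `%q{<ENV-VAR>}`, and keys within keys) could be expanded roughly as in the Python sketch below. The function name, the fixed-point loop, and the choice to expand a missing variable to an empty string are all assumptions made for illustration; the real implementation is in the C++ config.{hpp,cpp} and may behave differently.

```python
import os
import re
import socket

# Matches %q{NAME}; NAME may not contain braces or another unexpanded key.
_ENV_KEY = re.compile(r"%q\{([^{}%]+)\}")


def expand_output_keys(template, env=None, hostname=None, max_passes=8):
    """Expand %hostname%, %h and %q{VAR} in an output path template (sketch)."""
    env = os.environ if env is None else env
    hostname = socket.gethostname() if hostname is None else hostname
    result = template
    # Repeat until nothing changes so that keys nested inside other keys,
    # or keys introduced by an expanded value, are also resolved.
    for _ in range(max_passes):
        # Replace the long form first so %h does not match inside %hostname%.
        expanded = result.replace("%hostname%", hostname).replace("%h", hostname)
        expanded = _ENV_KEY.sub(lambda match: env.get(match.group(1), ""), expanded)
        if expanded == result:
            break
        result = expanded
    return result
```

For example, with the hostname `node01`, the template `out/%h/trace` would expand to `out/node01/trace` under these assumptions.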

* rocprofv3 --summary-groups warning if no domain matches

- emit warning if a regex for summary groups did not match any domain names
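
A small Python sketch of the check described above: report every summary-group regex that matches none of the known domain names, skipping empty patterns. The function name and the domain names used in the usage note are illustrative assumptions, as is the use of `re.search` rather than a full match.

```python
import re
import sys


def unmatched_summary_groups(patterns, domain_names):
    """Return the patterns that match no domain name (hypothetical helper)."""
    unmatched = []
    for pattern in patterns:
        if not pattern:
            # Guard against empty strings produced by tokenizing the option.
            continue
        regex = re.compile(pattern)
        if not any(regex.search(name) for name in domain_names):
            unmatched.append(pattern)
    return unmatched


def warn_unmatched(patterns, domain_names, stream=sys.stderr):
    """Print one warning line per pattern that matched nothing."""
    for pattern in unmatched_summary_groups(patterns, domain_names):
        print(f"warning: summary group '{pattern}' matched no domains", file=stream)
```

With domains such as `HIP_API`, `KERNEL_DISPATCH` and `MEMORY_COPY`, the group `KERNEL_DISPATCH|MEMORY_COPY` matches, while a group like `SCRATCH.*` would be reported.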

* Compile fix for lib/rocprofiler-sdk-tool/tool.cpp

- get_config().scratch_memory_trace
- pass contributions to write_json

* Update rocprofv3.py to preload rocprofiler-sdk-roctx

- appended to LD_PRELOAD when args.marker_trace is enabled

* Fix ReST link errors about subtitle underline being too short

* Patch tokenization of config::stats_summary_groups

- guard against array values of empty strings

* Tweak rocprofv3 summary test

- input-summary.yaml (used by rocprofv3-test-summary-inp-yaml-execute) only provides one summary group regex

* Disable LD_PRELOAD of librocprofiler-sdk-roctx.so

- this causes problems in the sanitizers and will be addressed in another PR

[ROCm/rocprofiler-sdk commit: 395f01b689]
2024-09-09 11:20:55 -05:00
Vladimir Indic 56e33c7543 More comprehensive IOCTL PC sampling checks (mi200 and mi300) (#1045)
* More comprehensive IOCTL PC sampling checks (mi200 and mi300)

* PC sampling tests: formatting

[ROCm/rocprofiler-sdk commit: 8bf2ce622c]
2024-09-06 09:45:05 -05:00
Vladimir Indic 3ed3489670 PC sampling: online partial PC sampling decoding (#1004)
* PC sampling: online partial PC sampling decoding

PC sampling service decodes a PC sample partially
by replacing the PC with an id of the loaded code object instance
containing PC and the offset of the PC within that code object instance.
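
The partial decoding described above can be sketched in Python as a lookup from a raw PC to a (code object id, offset) pair. Everything here is an assumption for illustration: the `CodeObject` fields, the sentinel value `0` for an unknown code object, and returning the raw PC as the offset when no code object contains it. The SDK's actual record layout is not shown in this log.

```python
import bisect
from typing import NamedTuple

NULL_CODE_OBJECT_ID = 0  # hypothetical sentinel for "no containing code object"


class CodeObject(NamedTuple):
    id: int
    load_base: int  # address where the code object instance was loaded
    load_size: int  # size in bytes of the loaded range


class PcDecoder:
    """Maps raw program counters to (code_object_id, offset) pairs."""

    def __init__(self, code_objects):
        # Sort by load address so a binary search finds the candidate range.
        self._objects = sorted(code_objects, key=lambda obj: obj.load_base)
        self._bases = [obj.load_base for obj in self._objects]

    def decode(self, pc):
        index = bisect.bisect_right(self._bases, pc) - 1
        if index >= 0:
            obj = self._objects[index]
            if pc < obj.load_base + obj.load_size:
                return obj.id, pc - obj.load_base
        # No loaded code object contains this PC.
        return NULL_CODE_OBJECT_ID, pc
```

Storing an id and an offset instead of an absolute address keeps a sample meaningful even after the code object is unloaded or another one is loaded at the same address.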

* PC sampling: marker records removed

* PC sampling parser: minor doc update in mock

* PC sampling: introducing rocprofiler_pc_t

* NULL value of the code object id introduced.

* Clarifying documentation related to PC offset.

* PC offset documentation improvement

* PC sampling parser benchmark: Reducing the number of samples to recreate half of performance.

[ROCm/rocprofiler-sdk commit: 93e82663d9]
2024-09-05 11:35:46 -05:00
Gopesh Bhardwaj 1a5dd2e7a6 Fixing missing counters for gfx900 (#1028)
[ROCm/rocprofiler-sdk commit: fa91169479]
2024-08-21 11:54:30 +05:30
Gopesh Bhardwaj a4136fde3d Adding HW Block Information (#1021)
* Adding HW Block Information

* Addressed Review comments

[ROCm/rocprofiler-sdk commit: 439025d421]
2024-08-21 10:00:41 +05:30
Jonathan R. Madsen 8ed4980b3f Update HSA ABI checks for tracing (#1027)
* Update HSA ABI checks for tracing

* Update lib/common/abi.hpp

- perform ABI versioning checks even when `ROCPROFILER_CI` is not defined (or ROCPROFILER_CI=0)

* Enforce versioning size for various HSA AmdExt step versions + hsa_amd_enable_logging support

* Minor HIP abi.cpp updates

[ROCm/rocprofiler-sdk commit: 7a639f3439]
2024-08-20 01:08:34 -05:00
Jonathan R. Madsen 4d3708a6fc Misc cleanup and stale code removal (#1026)
* Remove custom allocators

- remove unused lib/rocprofiler-sdk/allocator.*
- remove unused lib/rocprofiler-sdk/context/allocator.hpp

* Fix rocprofiler_strip_target (rocprofiler_utilities.cmake)

* Remove old HSA_TOOLS_LIB support

- remove OnLoad/OnUnload functions used by HSA_TOOLS_LIB env variable

* Fix linter warnings + specific NOLINT exceptions

- replace bare NOLINT with NOLINT(<warning-name>)

[ROCm/rocprofiler-sdk commit: 5d54682468]
2024-08-20 01:07:32 -05:00
Jonathan R. Madsen 264c48fa69 Misc API cleanup and consistency fixes (#1023)
- ROCPROFILER_API after function
- use rocprofiler_tracing_operation_t in lieu of uint32_t where appropriate
- rocprofiler_tracing_operation_t is now an int32_t typedef (formerly uint32_t)
- use const T* instead of T* where appropriate

[ROCm/rocprofiler-sdk commit: bb25376480]
2024-08-20 01:06:12 -05:00
Jonathan R. Madsen 82a089ac0a Add kernel profiling time info to counter collection records (#1000)
* Add kernel profiling time info to counter collection records

- lib/rocprofiler-sdk/kernel_dispatch
  - added profiling_time.{hpp,cpp}
  - restructured tracing.cpp
- updated queue.cpp AsyncSignalHandler
  - gets kernel dispatch profiling time and passes to dispatch_complete and signal callbacks
- structured some header includes to reduce cyclic include probability
  - originally, including kernel_dispatch/tracing.hpp in hsa/queue.hpp created a lot of cyclic includes

* Fix kernel_dispatch.cpp includes

* Fix kernel_dispatch.cpp

- include <cstring>
- replace use of ROCPROFILER_HSA_AMD_EXT_API_ID_NONE with ROCPROFILER_KERNEL_DISPATCH_LAST

[ROCm/rocprofiler-sdk commit: b15e498945]
2024-08-19 20:05:04 -05:00
Giovanni Lenzi Baraldi 11526c0f7c ATT Agent fixes and improvements (#1011)
* Tidying ATT dispatch API. ATT Agent to be initialized with rest of profiler. Removing read_index-based wait.

* Formatting

* Adding some input validation

* Add perf test for agent

* Removing async

[ROCm/rocprofiler-sdk commit: fa1b9e67ab]
2024-08-15 13:57:13 -03:00
SrirakshaNag 02ba6d0cdc fix range iteration test (#999)
* fix range iteration test

* misc fix

* fixing test fail

* fixing test

* fix yaml test

* add newline

---------

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: 0e0b37501c]
2024-08-12 14:30:08 -05:00
Jonathan R. Madsen 6c6adddb5c Reorganize thread trace codeobj headers (#1001)
* include/rocprofiler-sdk/cxx/codeobj

- Relocated from include/rocprofiler-sdk/amd_detail/rocprofiler-sdk-codeobj

* Update include/rocprofiler-sdk/cxx

- cmake updates
- correct namespace rocprofiler::codeobj to rocprofiler::sdk::codeobj

* Update codeobj tests and samples

[ROCm/rocprofiler-sdk commit: 20e07caad4]
2024-08-01 00:10:09 -05:00
SrirakshaNag 03ff04bbe3 Adding changes for handling abort signals (#979)
* Adding changes for handling abort signals

* Fix the test failure

* Fixing CMakeLists error

* Addressing review comments

* fixing warnings

* fixing execute test

* Fixing abort app test

* Address review comments

* Apply suggestions from code review

* Apply suggestions from code review

* Fixes for testing issues

* Adding kernel filtering test

* Removing text input file

* fix formatting issues

* misc fix

* Suppress signal-unsafe error in ThreadSanitizer

- rename signal handler to rocprofv3_error_signal_handler to ensure specific filtering

* Fix rocprofv3 aborted-app validation

---------

Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 94b5d9be3f]
2024-08-01 09:16:01 +05:30
SrirakshaNag 937b9344bb fix iteration range and add tests (#993)
* fix iteration range and add tests

* addressing review comments on tests

[ROCm/rocprofiler-sdk commit: 4d7b8ece80]
2024-07-31 09:10:44 +05:30
Jonathan R. Madsen ef22b7a484 rocprofv3 OTF2 Output Support (#995)
* CMake support for OTF2 library

* Preliminary OTF2 generation implementation

* Completed OTF2 Support

- HSA API
- HIP API
- Marker API
- Async Memory Copies
- Kernel Dispatch

* Update lib/rocprofiler-sdk-tool/generateOTF2.cpp

- fix location type for dispatches

* Testing for OTF2 output

* Add OTF2 to requirements.txt

* Update lib/rocprofiler-sdk-tool/generateOTF2.cpp

- fix getting kernel name

* OTF2 testing with rocprofv3/tracing-hip-in-libraries

* Format external/otf2/CMakeLists.txt

* Update external/otf2/CMakeLists.txt

- guard CMP0135 for cmake < 3.24

* Update lib/rocprofiler-sdk-tool/generateOTF2.cpp

- fix duplicate string ref issue

* Update lib/rocprofiler-sdk-tool/generateOTF2.cpp

- fix header includes

* Update CI workflow

- sudo install pypi requirements for core-rpm for $HOME/.local installs

* Update pytest_utils/otf2_reader.py

- modifications for reading trace

* Update pytest_utils/otf2_reader.py

- misc cleanup

* Update CI workflow

- fix installer artifact naming

* Update pytest_utils/otf2_reader.py

- handle slightly overlapping kernel timestamps for MI300

* OTF2 attributes for category

* Testing with OTF2Reader category attributes

* Fix memory leak in OTF2 generation

- leaking OTF2_AttributeList

[ROCm/rocprofiler-sdk commit: 16d535ef48]
2024-07-30 19:57:19 -05:00
Benjamin Welton 41ee6cc741 SQ Counter Documentation (#978)
* SQ Counter Documentation

Improve documentation of SQ counters. Attempts to
make clearer what the counters output (and, where
applicable, what each counter means in terms of
performance).

* pre-format

* Address comments + YAML formatting

* More definition fixes

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: 27a408f5cc]
2024-07-30 07:46:10 -05:00
Jonathan R. Madsen a728a4b4cd rocprofv3 kernel renaming support + misc rocprofv3 updates (#992)
* Increase rocprofv3 tool buffer size

- 32 pages instead of 1 page

* Improve rocprofv3 perfetto track labels

* Preliminary kernel renaming support + misc rocprofv3 updates

- add rocprofv3 option --kernel-rename
- add rocprofv3 options for perfetto settings (buffer size, etc.)
- add CSV columns for kernel trace
  - Thread_Id
  - Dispatch_Id
- add CSV column for counter_collection
 - Kernel_Id

[ROCm/rocprofiler-sdk commit: ebb021c59f]
2024-07-29 14:33:50 -05:00
SrirakshaNag 39c31e14fc kernel iteration filtering for counter collection (#911)
* kernel filtering for counter collection

* fixing trace tests

* removing print statements

* fix CI fail

* handling preload and updating docs

* minor fix

* misc fix

* misc fix

* Typo fix

* Update rocprofv3 + input schema

- "application_passes" -> "jobs"
- removed nesting in YAML/JSON inputs
- improved customAction (now booleanArgAction)
  - supports --<name> (defaults to true)
  - supports --<name>=<truth-value>
  - supports --<name> <truth-value>
- added --kernel-iteration-range to command-line
- automatically support new command-line options in YAML/JSON input
- standardized PMC return from text input to match PMC from YAML/JSON input
- added support for --log-level env
- updated various input*.(yml|json) to modified schema
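
The three flag forms listed above (`--<name>`, `--<name>=<truth-value>`, `--<name> <truth-value>`) can be handled by a single argparse action. The sketch below is not the rocprofv3 implementation: the class body and the accepted truth-value spellings are assumptions, and `--summary` is used only as an example option name.

```python
import argparse

# Accepted spellings are an assumption made for this sketch.
_TRUE = {"1", "true", "yes", "on", "y", "t"}
_FALSE = {"0", "false", "no", "off", "n", "f"}


class BooleanArgAction(argparse.Action):
    """Boolean option that accepts an optional truth value."""

    def __init__(self, option_strings, dest, **kwargs):
        kwargs.setdefault("nargs", "?")      # the value is optional
        kwargs.setdefault("const", True)     # bare --name means True
        kwargs.setdefault("default", False)  # option absent means False
        super().__init__(option_strings, dest, **kwargs)

    def __call__(self, parser, namespace, values, option_string=None):
        if isinstance(values, bool):
            # Bare flag: argparse passes the const value through unchanged.
            result = values
        else:
            text = str(values).strip().lower()
            if text in _TRUE:
                result = True
            elif text in _FALSE:
                result = False
            else:
                parser.error(f"{option_string}: invalid truth value '{values}'")
        setattr(namespace, self.dest, result)


def build_parser():
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument("--summary", action=BooleanArgAction)
    return parser
```

One consequence of `nargs="?"` is that a positional argument placed directly after the bare flag is read as its value, so a command line would typically separate the target application with `--`.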

* Update config.cpp

- added recommended code to get_kernel_filter_range

* Fixing iteration

* misc fix

* support only [-] for iteration

* bug fix

* Fix using-rocprofv3.rst

* Update config.cpp

- patch get_kernel_filter_range

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: ace34abd11]
2024-07-26 21:46:53 -05:00
Gopesh Bhardwaj 2393b2ee45 look for symbols in dynsym table (#990)
* look for symbols in dynsym table

* checking both symtab and dynsym

* Avoid symbol duplication in non stripped binaries

* clang-format

* Minor elf_utils.cpp updates

- use 'else if' instead of 'if'
- logging tweaks

* Update registration

- tweak logging

* Update testing

- strip the rocprofiler-sdk-c-tool library
- add test-c-tool-rocp-tool-lib-execute test which does NOT LD_PRELOAD the library (uses only ROCP_TOOL_LIBRARIES instead)

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: dc671497da]
2024-07-26 10:21:04 +05:30
Gopesh Bhardwaj 185094eee1 fixing core dump when no hipcc optimization (#989)
[ROCm/rocprofiler-sdk commit: ba35562729]
2024-07-24 10:22:07 +05:30
Giovanni Lenzi Baraldi 6512883b3f Adding barrier bit on packets after dispatch (#981)
Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: 04a38ce034]
2024-07-22 22:39:56 +05:30
Jonathan R. Madsen 65cf62cb40 Common string entry (#971)
* Common string entry

* Add lib/common/string_entry.cpp + return const string*

[ROCm/rocprofiler-sdk commit: 90accda152]
2024-07-16 23:07:49 -07:00
Benjamin Welton cc30e4830d Convert counter def format to YAML (#976)
* Convert counter def format to YAML

Converts counter definition format to YAML with the
following structure:

```yaml
COUNTER_NAME:
 architectures:
  gfxXX: # Can be more than one, "/"-delimited if they share identical data
    block: <Optional>
    event: <Optional>
    expression: <Optional>
    description: <Optional> # In case per-arch notes are needed
  gfxYY:
    ...
 description: General counter description
```

All counters (derived and hardware) are now defined
in the same file for ease of future additions/subtractions.
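
To illustrate how the structure above might be consumed, the Python sketch below resolves one counter for one architecture from an already-parsed definition (a plain dict, as a YAML loader would produce). The function name is hypothetical, and letting a per-architecture `description` override the general one is an assumption rather than documented behaviour.

```python
def lookup_counter(definitions, counter_name, arch):
    """Return the merged definition of `counter_name` for `arch`, or None.

    `definitions` maps counter names to dicts with the keys shown above.
    """
    counter = definitions.get(counter_name)
    if counter is None:
        return None
    for arch_key, arch_def in (counter.get("architectures") or {}).items():
        # A key such as "gfx90a/gfx942" covers every architecture it lists.
        if arch in arch_key.split("/"):
            merged = {"description": counter.get("description", "")}
            # Per-architecture fields take precedence when present.
            merged.update({k: v for k, v in (arch_def or {}).items() if v is not None})
            return merged
    return None
```

A hardware counter would typically carry `block` and `event`, while a derived counter would carry `expression`; the lookup treats both the same way.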

Removes existing XML parser. Keeps the existing XML
definitions for now (since other tools still rely on
its presence).


[ROCm/rocprofiler-sdk commit: 34897d318f]
2024-07-12 16:20:33 -07:00
Jonathan R. Madsen 40e85cadd8 Parse ELF format for rocprofiler_configure symbol (#970)
* Parse ELF format to search for rocprofiler_configure

* Use ELF parsing in registration

[ROCm/rocprofiler-sdk commit: 2be3543c7b]
2024-07-11 20:22:26 -05:00
Jonathan R. Madsen df787c8a5a Fix agent shutdown destructor errors (#969)
* Update lib/rocprofiler-sdk/agent.cpp

- use static_object wrapper for vector of agent_pair (rocp agent <-> hsa agent)

* Fix get_aql_handles() shutdown error

- use `static_object` wrapper for vector of `aqlprofile_agent_handle_t`

[ROCm/rocprofiler-sdk commit: 8b1b074b2a]
2024-07-08 17:53:02 -05:00
Jonathan R. Madsen 497668060a Update HIP API tracing (#958)
- support HipDispatchTable additions for HIP_RUNTIME_API_TABLE_STEP_VERSION 1 thru 4

[ROCm/rocprofiler-sdk commit: 60b1dbfb6f]
2024-07-08 17:12:53 -05:00
Jonathan R. Madsen 73c8841a54 Miscellaneous updates (#959)
- missing-new-line CI job: ensures all source files end with new line
- logging updates
- add new line to the end of many files
- fix header include ordering in misc places
- transition to use hsa::get_core_table() and hsa::get_amd_ext_table() in various places instead of making copies

[ROCm/rocprofiler-sdk commit: 1e49b43738]
2024-07-08 16:50:32 -05:00
Giovanni Lenzi Baraldi 0bb2f9a1bd Returning code object id information in code_printing.cpp:Instruction (#965)
* Returning code object id information in code_printing.cpp:Instruction

* Adding assertions

* Simplifying decoder library

[ROCm/rocprofiler-sdk commit: 78fd8cb379]
2024-07-08 16:59:40 -03:00
Giovanni Lenzi Baraldi 19329c3a97 General fixes to ATT, packets and event ID retrieval (#960)
* General fixes to ATT, packets and event ID retrieval

* Update source/lib/rocprofiler-sdk/hsa/aql_packet.hpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

[ROCm/rocprofiler-sdk commit: 4e2144dbfa]
2024-07-04 03:58:45 -03:00
Jonathan R. Madsen cd6a94dcb5 Fix kernel trace gaps (#961)
- source/lib/rocprofiler-sdk/hsa/queue.cpp
  - Optimize WriteInterceptor to eliminate extra barrier packets causing gaps between kernels in kernel tracing
  - increase timeout_hint in hsa_signal_wait in set_profiler_active_on_queue
  - misc logging improvements
- source/lib/rocprofiler-sdk/counters/agent_profiling.cpp
  - increase timeout_hint in hsa_signal_wait in set_profiler_active_on_queue
- tests/rocprofv3/hsa-queue-dependency/CMakeLists.txt
  - add TIMEOUT for rocprofv3-test-hsa-multiqueue-execute

[ROCm/rocprofiler-sdk commit: 64b8f8370e]
2024-07-02 18:49:04 -05:00
Giovanni Lenzi Baraldi ebad2abe3c Accumulation metrics support and update counter collection API to aqlprofile_v2 (#915)
* Updating to v3 API

* General fixes

* Extending dimension bits to 54

* Disabling agent profiling tests

* Fixed unit test

* Adding accumulate metric support for parsing counters (#609)

* Adding accumulate metric support for parsing counters

* Adding metric flag

* Updating tests

* source formatting (clang-format v11) (#610)

Co-authored-by: Manjunath-Jakaraddi <21177428+Manjunath-Jakaraddi@users.noreply.github.com>

* source formatting (clang-format v11) (#614)

Co-authored-by: jrmadsen <6001865+jrmadsen@users.noreply.github.com>

* Adding evaluate ast test

* source formatting (clang-format v11) (#633)

Co-authored-by: Manjunath-Jakaraddi <21177428+Manjunath-Jakaraddi@users.noreply.github.com>

* Update scanner generated file

* Adding flags to events for aqlprofile

* Fix Mi200 failing test

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Manjunath-Jakaraddi <21177428+Manjunath-Jakaraddi@users.noreply.github.com>
Co-authored-by: jrmadsen <6001865+jrmadsen@users.noreply.github.com>

* Revert "Extending dimension bits to 54"

This reverts commit 3cd6628452484044a93e129f27974f996a0e4c08.

* Removing CU dimension

* Fixing merge conflicts

* Revert "Disabling agent profiling tests"

This reverts commit 7e01518ed8c51fbb0c3b2575e1e0b8f9ddfa8237.

* Fixing merge conflicts

* Fix parser tests

* Adding accumulate metric documentation

* Update counter_collection_services.md

* Update index.md

* fix nested expression use

* Update source/lib/rocprofiler-sdk/counters/evaluate_ast.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Doc update

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Manjunath P Jakaraddi <manjunath180397@gmail.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Manjunath-Jakaraddi <21177428+Manjunath-Jakaraddi@users.noreply.github.com>
Co-authored-by: jrmadsen <6001865+jrmadsen@users.noreply.github.com>
Co-authored-by: Manjunath-Jakaraddi <manjunath.jakaraddi@amd.com>

[ROCm/rocprofiler-sdk commit: a78753d392]
2024-07-01 21:56:41 -03:00
Giovanni Lenzi Baraldi 3327f10d81 Adding wrappers on HSA for executable load/unload and allowing multiple agents per context on ATT (#951)
* Codeobj wrappers around HSA calls for ATT

* Formatting

* Bookkeeping

* Tidy

* Tidy

* Update source/lib/rocprofiler-sdk/thread_trace/code_object.hpp

Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>

* Update source/lib/rocprofiler-sdk/thread_trace/att_core.hpp

Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>

* Variable naming

---------

Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>

[ROCm/rocprofiler-sdk commit: 8da0c35079]
2024-06-25 13:46:13 -03:00
Jonathan R. Madsen 869a5693d1 Remove fatal error when callback and buffer tracing API in one context (#952)
- one context for callback and buffer tracing of same API produces erroneous fatal error -- this is a valid use case

[ROCm/rocprofiler-sdk commit: b62ba5f096]
2024-06-25 02:51:41 -05:00
Jonathan R. Madsen 0ce0771c23 Add logical_node_type_id field to rocprofiler_agent_t (#948)
* Add logical_node_type_id field to rocprofiler_agent_t

* Patch queue_controller

[ROCm/rocprofiler-sdk commit: af2f85ca93]
2024-06-24 23:18:58 -05:00
Jonathan R. Madsen d794084a45 Sync queue and async copy on client finalizer (#950)
[ROCm/rocprofiler-sdk commit: 62ec95eae6]
2024-06-24 20:38:34 -05:00
Gopesh Bhardwaj 3faed47bc2 Fixing OpenSuse build (#947)
[ROCm/rocprofiler-sdk commit: eeec089d6b]
2024-06-23 12:55:36 +05:30
Jonathan R. Madsen 0114bcad4b Add HSA tracing support for hsa_amd_vmem_address_reserve_align (#946)
* Add support for hsa_amd_vmem_address_reserve_align

* Update lib/rocprofiler-sdk/hsa/types.hpp

- support HSA_AMD_EXT_API_TABLE_STEP_VERSION == 0x2 for HSA v1.14.0

---------

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: 12785ad365]
2024-06-21 22:28:39 +05:30