Commit Graph

132 Commits

Author SHA1 Message Date
Baraldi, Giovanni e898079a13 Thread trace and Trace Decoder API tests and samples (#416)
* Adding test and samples to decoder

* Fix sample

* Formatting

* Fix multi test

* Disable sample

* Fix tests

* Format

* Version fix

* Locking the decoder

* Add atomic

* Review comments

* Format

* Adding readme

* merge conflict and adding PCS+ATT test

* Review comments

* Properly disable PCS test

* Update tests/rocprofv3/advanced-thread-trace/CMakeLists.txt

* Adding back env var test

* Name fix

* Preload sample

* Addressing review comments

* Update docs

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-07-22 20:08:12 -05:00
Gill, Harkirat e948034c83 Update output file fields docs to correctly define Grid_Size (#526)
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
2025-07-22 23:16:01 +05:30
Nagaraj, Sriraksha 2447a85215 [rocprofv3-avail] - Add sample data (#514)
* Add sample data for avail and remove color code for non terminal output

* review comments

* review comments

* add documentation

* test fix
2025-07-22 08:39:59 -07:00
Indic, Vladimir 650d35bdaa [Host-Trap PC Sampling] Host-Trap PC sampling can introduce an arbitrary sampling skid of [0, 2] instructions (#515)
* Arbitrary host-trap sampling skid (doc)

Host-trap PC sampling might introduce a skid of [0, 2]
instructions. We documented this and provide some advice
to application developers on how to find hot spots in the
profiles generated by host-trap sampling.
2025-07-17 17:59:46 +02:00
Nagaraj, Sriraksha 3aaffc42da [rocprofv3-avail] Documentation update and column formatting (#447)
* addressing issues

* doc fix

* test fix

* fix

* fix formatting issue and doc update

* fix column size

* fix

* fix formatting in output

* tests fix

* test fix

* add new line

* add new line

* fix new line

* fixing typo in using-rocprofv3-avail.rst
2025-07-10 11:41:12 -05:00
U, Srihari 6f2a5a9646 Add perfetto support for scratch memory (#303)
* Add perfetto support for scratch memory

* Updated tests and docs.

* Update docs data

* Added underflow check

* Record all free events to 0 bytes

* Add format

* Address review comment

* updated tests for scratch memory

* update scratch-memory tests.
2025-07-09 21:05:45 +05:30
Bhardwaj, Gopesh e7616c3aad Adding OpenMP usage with rocprofv3 (#472)
* Adding openmp usage with rocprofv3

* minor changes

* Fixing missing line
2025-07-02 12:25:24 +05:30
Baraldi, Giovanni c0c08b2f08 [rocprofv3] Fix ATT library path (#476)
* Fix library path

* Update docs

* Review comments

* Update source/bin/rocprofv3.py

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-07-01 22:08:29 +02:00
Verma, Saurabh f70f369d46 PC-Sampling doc updates - FW version (#455)
* Initial doc update

* addressed review comments

* addressed review comments - 2

* accept reviewer suggestions

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* accept reviewer suggestions-2

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* accept reviewer suggestions-3

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* accept reviewer suggestions-4

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update README.md

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update CHANGELOG.md as per reviewer suggestions

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* accept review suggestion

Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>

* accept reviewer suggestion

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* accept reviewer suggestions

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

---------

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>
Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
2025-06-25 13:11:18 +05:30
Baraldi, Giovanni 9dadbbace5 Adding doc links for trace decoder, aqlprofile and viewer (#464)
Adding interlinks for trace decoder, aqlprofile and viewer

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
2025-06-18 14:10:18 +02:00
Bhardwaj, Gopesh 3e43b1f019 Adding rocpd documentation (#449)
* Adding rocpd documentation

* rocpd format

* CHANGELOG update and indexing

* Fixing links

* format fixes

* fixing table

* major edits

* fixed logical error

* fixing rocprofv3 avail
2025-06-17 15:41:53 +05:30
Kandula, Venkateshwar reddy 1c91774c6a [DOCS] SWDEV-534589 Update docs with new info in kernel_trace csv output (#438)
* Update docs with new info in kernel_trace csv output and add flag for csv in docs.

* Misc.

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Vaddireddy, Sushma <Sushma.Vaddireddy@amd.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
2025-06-10 08:20:07 +05:30
Nagaraj, Sriraksha 80d60d8535 [rocprofv3-avail] Rework rocprofv3-avail tool (#312)
---------

Co-authored-by: vlaindic_amdeng <vladimir.indic@amd.com>
2025-06-06 11:51:37 -07:00
Kumar, Amit 7411640761 add binary link (#427)
* add binary link

* Update source/docs/how-to/using-thread-trace.rst

---------

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>
2025-05-30 11:52:31 -05:00
Baraldi, Giovanni eedfecd905 Adding Thread Trace API reference (#417)
* Adding Thread Trace API reference

* Doc fixes

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/_toc.yml.in

Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com>

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update source/docs/index.rst

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update source/docs/api-reference/thread_trace.rst

* Apply suggestions from code review

Co-authored-by: Paoletti, Leo <Leo.Paoletti@amd.com>

* Update source/docs/_toc.yml.in

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

* Update source/docs/api-reference/thread_trace.rst

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>
Co-authored-by: Paoletti, Leo <Leo.Paoletti@amd.com>
Co-authored-by: Xu, Alex <Alex.Xu@amd.com>
2025-05-30 11:51:46 -05:00
Baraldi, Giovanni b590612966 Adding using-thread-trace.rst (#408)
* Adding using-thread-trace.rst

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Apply suggestions from code review

* Add to index/toc

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update source/docs/how-to/using-thread-trace.rst

Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com>

* Update source/docs/how-to/using-thread-trace.rst

Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com>

* Update source/docs/how-to/using-thread-trace.rst

Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com>

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Paoletti, Leo <Leo.Paoletti@amd.com>

* Update source/docs/how-to/using-thread-trace.rst

* Update source/docs/how-to/using-thread-trace.rst

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>
Co-authored-by: Paoletti, Leo <Leo.Paoletti@amd.com>
2025-05-29 15:41:42 -05:00
Bhardwaj, Gopesh 7f7827fb30 SWDEV-533894 Documentation for python bindings (#404)
* SWDEV-533894 Documentation for python bindings

* Fixing missing-new line check

* Addressed Feedback
2025-05-27 22:39:21 -05:00
Rawat, Swati c255ec5b5c Doc review (#386)
* doc review

* more updates

* install title

* Update rocprofiler.h

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
2025-05-27 11:28:38 -05:00
Bhardwaj, Gopesh b48fa532bc Making ROCTx API doxygen generated document more readable (#385)
* Making ROCTx API doxygen generated document more readable

* fixing build

* Fix linking errors

* Fixing header

* Fixing Topics and Types

* doxygen configuration fixes

* Fixing build

* Fix unnecessary doc parsing warnings

* formatting and linting fixes

* rebasing SDK modular PR

* Fixing missing line

* Fixing ROCtx documentation after merge

* Removing flake changes

* changed back WARN_IF_DOC_ERROR to Yes
2025-05-22 18:08:55 -05:00
Welton, Benjamin 33e43e66d3 [SDK] Standardize rocprofiler-sdk counter definition YAML schema (#370)
* Convert YAML Format

Convert YAML format and reader to properly read the YAML.

Comparison between outputs from the YAML shows only changes in ordering
of architectures (and ids).

* Test fixes

* Add script for converting the YAML schema to source/scripts

* Update documentation

* Change the extra counter code block to YAML

* Add missing new line at EOF

* remove name issues

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-05-14 13:31:51 -05:00
Kandula, Venkateshwar reddy 6ec9526475 [docs] Improve readability of ROCprofiler-SDK API library documentation (#359)
* Use custom .rst to make api doc more readable.

* Update index.rst

* Misc docs updates

- doxygen source code fixes
- updated doxygen files
- fixed conf.py (does not generate code in source tree)

* Update source/docs/api-reference/rocprofiler-sdk_api_reference.rst

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update source/docs/api-reference/rocprofiler-sdk_api_reference.rst

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update source/docs/api-reference/rocprofiler-sdk_api/modules.rst

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Update source/docs/api-reference/rocprofiler-sdk_api/global_data_structures_topics_files.rst

Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>

* Duplicate

* test warnings

* Update CMakeLists.txt

* Update rocprofiler-sdk.dox.in

* Update update-docs.sh

* fix docs build failures by -q -T flags.

* set warn_as_error to NO.

* test -W to suppress warnings.

* remove -q flag from make.

* reduce dot graph depth to 100

* Update custom docs target

- docs target is now no longer part of the dependency list for the all target
- installation of docs requires explicitly building the docs target (i.e. OPTIONAL install of _build/html/ folder)

* add quiet and trace mode back.

* increase DOT_GRAPH_MAX_NODES to 500 back.

* Format.

---------

Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
Co-authored-by: Rawat, Swati <Swati.Rawat@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2025-05-14 11:17:51 +05:30
Elwazir, Ammar 6f17da7ade [rocprofv3-benchmark] SDK and rocprofv3 Benchmarking Suite (#157)
* Adding Benchmarking Stg1

* config fix

* reset

* add jpeg and decode traces in iteration

* address comments benchmark config files.

* address comments.

* address comments.

* address comments: revert cntrl ctx.

* address comments: revert csv output.

* resolve merge conflicts.

* format.

* build fix.

* fix hip runtime api traces.

* loop cb services.

* format.

* bug fix.

* Fix operator>

- public C++ comparison operator

* Update configuration options

- support selected regions (--selected-regions)
- support writing output config json (--output-config)
- update serialization data

* rocprofv3 tool library misc updates

- lambda for starting context
- support for writing config json

* Tool library updates

- Finished support for all benchmarking modes
- Added build spec support to config json

* Fix ROCPROFILER_SOVERSION

- this value should not be multiplied by 10,000

* Minor tweak to rocprofv3

* Benchmarking scripts

* formatting

* Fix duplicate include

* Add reproducible-dispatch-count test app

- used in benchmarking

* registration logging

- report number of registered contexts and active contexts after client initialization

* Serialize environment in rocprofv3 output config

* ROCPROFILER_BUILD_BENCHMARK CMake option

* Update benchmark SQL schema

- hash_id is text
- add md5sum to benchmarked_app
- remove app_id from benchmarked_sdk
- add sdk_id to benchmark_config
- separate hip_trace into hip_runtime_trace and hip_compiler_trace
- use INT instead of INTEGER for MySQL compatibility
- add count column in benchmark_statistics
- allow std_dev to be NULL in benchmark_statistics

* Update rocprofv3-benchmark.py

- use md5 instead of python hash (which includes random seed)
- use args.mysql_database
- compute md5sum of executable
- fix insert_benchmark_config
  - marker trace fixes
  - memory allocation fixes
  - split hip_trace into hip_{runtime,compiler}_trace
- remove app_id from benchmarked_sdk
- support warmup runs
- count field in benchmark_statistics
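
The move from Python's built-in hash to MD5 noted above matters because `hash()` of a string is salted per process, so it cannot serve as a stable key across benchmark runs. A minimal sketch of the idea follows. The helper names `stable_id` and `file_md5` are hypothetical and are not taken from rocprofv3-benchmark.py.

```python
import hashlib


def stable_id(*parts):
    """Return a hex digest that is identical across processes and machines.

    Unlike the built-in hash(), this does not depend on PYTHONHASHSEED.
    """
    # Join with a separator that is unlikely to appear in the parts themselves.
    joined = "\x1f".join(str(part) for part in parts)
    return hashlib.md5(joined.encode("utf-8")).hexdigest()


def file_md5(path, chunk_size=1 << 20):
    """Compute the MD5 checksum of a file (for example, the benchmarked executable)."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large binaries are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

MD5 is used here only as a fingerprint, not for security; any stable digest from `hashlib` would serve the same purpose.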

* Support launcher and environment in YAML

* Update reproducible-dispatch-count.cpp

- support mode which doesn't use hip event timing

* Misc rocprofv3-benchmark.py updates

- fix some MySQL support
- remove some unnecessary logging

* support mysql db.

* Format.

* Updated SQL input files

- moved benchmark_schema.sql to benchmark_table.sql
- added benchmark_views.sql
  - uses {{metric}} syntax for variable substitution
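
The `{{metric}}` substitution mentioned above can be sketched as a small template renderer. This is an illustration only: the function name `render_sql` and the sample statement are assumptions, not the contents of benchmark_views.sql.

```python
import re

# Matches {{name}} with optional whitespace inside the braces.
_PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render_sql(template, variables):
    """Replace every {{name}} placeholder with the matching value.

    Raises KeyError if the template references a name that was not supplied,
    so a typo in a placeholder is not silently left in the SQL.
    """
    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"no value supplied for placeholder '{name}'")
        return str(variables[name])

    return _PLACEHOLDER.sub(substitute, template)


# Hypothetical statement; the real views are not shown in this log.
EXAMPLE = "SELECT `count`, `{{metric}}` FROM `benchmark_statistics`;"
```

Substituted values are pasted into the statement verbatim, so this approach is only appropriate for trusted, fixed names such as column identifiers, not for user-supplied data.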

* cmake formatting

* update rocprofv3-benchmark.py

- benchmark config labels
- overhead views

* Encode rocprofv3-benchmark PID in rocprofv3 and timem output files

* Minor tweak to benchmark_views.sql

- include count
- reorder fields for readability

* split statements and use IS if value is NONE.

* use backticks instead of double quotes and add IS before NOT NULL.

* Adding Mandelbrot Benchmark App

* Adding Dockerfile example

* Update dockerfile

* Update dockerfile

* [SDK] rocprofiler_query_external_correlation_id_request_kind_name

* Execution-profile benchmark mode

* Execution profile SQL support

* Rename mandlebrot folder + misc clang-tidy

* [rocprofv3-benchmark] Execution profile support

* Update installation

* add work dir when setting git revision, useful when building outside src.

* Set FULL_VERSION_STRING and ROCPROFILER_SDK_GIT_REVISION

- when benchmark folder is top-level

* Remove unused python packages from requirements.txt

* Use ldd/pyelftools to include linked libs for md5sum

- also add --filter-benchmark and --filter-rocprofv3 options
- support labeling the rocprofv3 options
- use more argparse groups
- more generic application of filters
- support variable substitution in environment, e.g. PATH=/some/path:$PATH

* Environment improvements

- improve reproducibility when env set via input file vs. shell
- support "environment-ignore" to remove environment variables
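
The environment handling described above (variable substitution such as PATH=/some/path:$PATH, plus an "environment-ignore" list) could look roughly like the following. This is a sketch under stated assumptions: `build_environment` and its parameters are hypothetical names, and the real script may resolve variables differently.

```python
import os
from string import Template


def build_environment(overrides, ignore=(), base=None):
    """Build the environment for a benchmark run.

    overrides: mapping of variable name to value; values may reference
               existing variables with $NAME or ${NAME}.
    ignore:    names to remove from the resulting environment.
    base:      starting environment; defaults to a copy of os.environ.
    """
    env = dict(os.environ if base is None else base)
    for key, value in overrides.items():
        # safe_substitute leaves unknown $NAMES untouched instead of raising.
        env[key] = Template(value).safe_substitute(env)
    for key in ignore:
        env.pop(key, None)
    return env
```

Passing an explicit `base` rather than inheriting the caller's shell is one way to get the same result whether a variable is set in the input file or in the shell.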

* Misc formatting

* Misc. fix

* use backticks for defining new columns name

* Support shuffling the order of benchmark modes/rocprofv3 args

* Address review comments

* Update Dockerfile

- rename to Dockerfile
- reduce to one layer

* Support docker build arg BRANCH

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-05-13 16:18:23 -05:00
Bhardwaj, Gopesh fbab96e552 Replacing ROCm 6.5 mentions with ROCm 7.0 (#391)
2025-05-12 17:05:45 +05:30
Trowbridge, Ian e626df43eb Fix HIP Streams Duplication Error (#313)
* Fix stream duplication and fixed tests

* Added comments to explain stream.cpp code, change stream nullptr check to occur in update table to prevent readding null stream, simplified hip-streams bin file code, add destroyStreams to hip-streams bin file code

* Removed roctx from CMakeLists.txt

* Updated documentation

* Fix documentation

* Removed update_table for HIP compiler table and updated stream.cpp to remove support for HIP compiler table

* Added runtime initialization check for HIP

* Changed tool name, working on fixing memory management

* Added context for counter collection kernel rename combination

* Changed name from map to set and changed description

* Fix documentation description for group-by-queue

* Merged memory copy and kernel operations onto a single track when on the same stream

* Updated perfetto output to remove hardware information from track name to merge all memory copy and kernel operations on the same stream to the same track

* Most pr comments addressed

* Added filter for counter collection and removed kernel buffer tracing hack

* Added PR comment fixes

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2025-05-01 00:56:15 -05:00
Madsen, Jonathan d2bde3ce27 [rocprofv3] Use -P for collection period shorthand option (#356)
* [rocprofv3] Use -P for collection period option

- Reserve -p for profiler attachment

* Update changelog

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-04-27 20:18:26 -05:00
Bhardwaj, Gopesh bbe9eab53a doc improvements for 1.0.0 (#367)
* correcting rocprofiler_configure

* Fix indexing order

* doc feedback
2025-04-24 17:05:22 +05:30
Bhardwaj, Gopesh 024cf0e5e3 Using miniconda docker (#366)
* Using miniconda docker

* remove sudo

* Remove double install of rocprofiler-docs conda environment

* Fix building docs

* Fix build docs

- Additional system packages

* Using miniforge

* Fixing warning as errors build issue

* cmake formatting

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-04-23 23:52:03 -05:00
Bhardwaj, Gopesh 1f1c192a5e Copilot suggestions (#360)
* Copilot suggestions

* Fixing perfetto links

* correcting default value of agent-index
2025-04-22 20:52:37 +05:30
Bhardwaj, Gopesh 780b96ad3a Remove SDK as beta from docs (#351)
2025-04-21 21:31:14 +05:30
Nagaraj, Sriraksha 87badfbd15 [rocprofv3] signal handler fix (#332)
* rocprofv3: LD_PRELOAD for signal and sigaction

- wrappers around `signal` and `sigaction` to prevent applications that install signal handlers from replacing the rocprofv3 signal handlers
- minor tweaks to buffer sizes (use page_size instead of KiB)

* [DO NOT COMMIT] extra logging

* Switch git submodule url for perfetto

- use GitHub URL as this is more accessible

* Update ring_buffer<Tp>

- account for alignment padding

* Update buffered_output

- track number of bytes stored
- add nullptr checks

* Update tmp_file_buffer

- track number of bytes
- read_tmp_file does not create tmp file if it does not already exist

* Update tmp_file

- add exists member function for checking whether temporary file already exists
- tweak remove() implementation

* Update config.hpp

- add option to enable/disable signal handlers
- add option for minimum_output_bytes

* Make signal, sigaction functions visible

* rocprofv3 tool updates

- chained signals
- override the signal handler(s) installed by the application
- improve cleanup of temporary files
- support minimum output bytes
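
The chained-signal behavior listed above can be illustrated in Python. The commit itself works in native code by wrapping `signal` and `sigaction` via LD_PRELOAD; the sketch below only mirrors the idea, and every name in it (`tool_handler`, `install_tool_handler`, `guarded_signal`) is hypothetical.

```python
import signal

# Handlers the application asked for, keyed by signal number.
_app_handlers = {}
# Record of what ran, so the chaining order is observable.
events = []


def tool_handler(signum, frame):
    """Profiler-side handler: do the tool's work first, then chain."""
    events.append(("tool", signum))  # stand-in for flushing profiler output
    app_handler = _app_handlers.get(signum)
    if callable(app_handler):
        app_handler(signum, frame)


def install_tool_handler(signum):
    """Install the tool handler, remembering any callable handler already present."""
    previous = signal.signal(signum, tool_handler)
    if callable(previous) and previous is not tool_handler:
        _app_handlers[signum] = previous


def guarded_signal(signum, handler):
    """Stand-in for the wrapped signal(): record the application's handler
    instead of letting it replace the tool's handler."""
    previous = _app_handlers.get(signum, signal.SIG_DFL)
    _app_handlers[signum] = handler
    return previous
```

In a real process, `install_tool_handler(signal.SIGTERM)` would be called once at start-up from the main thread, and application code would go through `guarded_signal` so that the tool's handler always runs first.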

* Add commandline support

* fixing test

* minor fix

* minor fix

* fix clang issue

* fix

* Adding docs

* review comments

* review changes

* review

* YUV pulldown additions to rocdecode

* More rocdecode changes

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Jonathan R. Madsen <Jonathan.Madsen@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-04-17 21:10:52 -07:00
Indic, Vladimir 96a0ef244f MI300 Stochastic PC Sampling Documentation and Changelog (#336)
* MI300 Stochastic PC Sampling Documentation

* Stochastic PC sampling title renaming

---------

Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
2025-04-15 14:04:19 -07:00
Bhardwaj, Gopesh ca7cce9e81 doc improvements for 1.0.0 part 2 (#330)
* update installation steps

* Github Issue #50 Adding README's for samples

* Making name change to ROCprofiler-SDK for consistency

* Fix HIP trace documentation

* Fix HSA trace in docs

* Fix kernel trace in docs

* Fixing memory copy and memory allocation traces

* runtime trace and sys trace doc update

* Fix scratch memory doc

* kernel naming and filtering options

* Adding collection period in docs

* Perfetto configs update

* summary output file

* kernel trace format fix

* update CHANGELOG

* Agent index doc update

* rocm-smi output

* group by queue option

* Updated --group-by-queue description

* perfetto visualization

---------

Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com>
2025-04-15 13:30:07 -07:00
Trowbridge, Ian 077723337a rocDecode Buffer Tracing Support (#315)
* Added buffer tracing support for rocdecode and updated tests to work with buffer tracing

* Updated perfetto to output args individually rather than as a string list

* Updated docstrings and operation type, changed OTF2 code to remove warning due to change in operation type

* Updated tests for review comments

* Test args exist and return value

* Updated to use string entry

* Change function name

* Updated PR to reflect review comments

* Updated for PR review comments

* Change function name
2025-04-11 21:56:36 +00:00
Rawat, Swati 379d760fc1 Fixing broken link (#326)
* fixing broken link

* added metadata information

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
2025-04-03 18:12:34 +05:30
Meserve, Mark a1fcdf7f83 Additional 1.0.0 changes (#317)
* Additional 1.0.0 changes

- Update VERSION
- Add beta compatibility for rocprofiler_agent_set_profile_callback_t

* Fix location of deprecated typedef rocprofiler_agent_set_profile_callback_t

* rocprofiler_record_counter_t -> rocprofiler_counter_record_t

* Experimental + deprecated annotations

* rocprofiler_record_dimension_info_t -> rocprofiler_counter_record_dimension_info_t

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-03-26 02:12:03 -05:00
Bhardwaj, Gopesh 6d6eec230c doc improvements and fixes SWDEV-523395,SWDEV-516979 (#314)
* doc improvements and fixes SWDEV-523395,SWDEV-516979

* Adding changes from PR 231
2025-03-26 10:09:08 +05:30
Madsen, Jonathan 2061c52817 Updated source/docs/sphinx/requirements.txt (#310)
- Re-ran pip-compile on source/docs/sphinx/requirements.in

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-03-24 14:00:49 +05:30
Welton, Benjamin 4cd121e27b [SDK] Release 1.0 Public API Modifications (#277)
* Make sure all structs/enums can be forward declared

* Updates to counter collection

- consistency updates and cleanup

* Conversion of dimension information to info struct

* Added deprecated folder

* Testing changes

* merge changes

* Fix shadowed variable

* Source code formatting

* Fix shadowed variable

* Update rocprofiler_counter_info_v1_t member names

* Split version.h into version.h and ext_version.h

- ext_version.h contains external version info, e.g. ROCPROFILER_HSA_API_TABLE_MAJOR_VERSION, ROCPROFILER_HSA_RUNTIME_VERSION
- this reduces amount of recompilation after a commit since version.h gets updated with the git revision

* profile_config -> counter_config

* EOF new line

* [Samples] Reduce header includes + reorg counter collection samples

* Misc compilation fixes

- shadowed variables
- use of [[deprecated("...")]] in C code
- unused variables

* Minor misc modifications

- use common:: instead of rocprofiler::common:: when inside rocprofiler namespace
- counters.cpp
  - move local anon namespace functions into rocprofiler::counters:: anon namespace
  - use std::string_view for get_static_string
  - const ref for get_static_ptr
  - misc namespace shortening

* [Public API] rocprofiler_get_version_triplet + rocprofiler_version_triplet_t

- struct rocprofiler_version_triplet_t containing fields for the major, minor, and patch version
- public API function: rocprofiler_get_version_triplet
- define C++ operators for rocprofiler_version_triplet_t
- C++ function compute_version_triplet
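
To make the version triplet concrete, here is a small Python stand-in for the struct and the `compute_version_triplet` helper named above. The packed encoding used here (10000 * major + 100 * minor + patch) is an assumption made for illustration, and `VersionTriplet` is a hypothetical Python name, not part of the SDK.

```python
from typing import NamedTuple


class VersionTriplet(NamedTuple):
    """Python analogue of a struct with major, minor, and patch fields."""
    major: int
    minor: int
    patch: int


def compute_version_triplet(version):
    """Split a packed integer version into (major, minor, patch).

    Assumes the packing 10000 * major + 100 * minor + patch.
    """
    if version < 0:
        raise ValueError("version must be non-negative")
    major, remainder = divmod(version, 10000)
    minor, patch = divmod(remainder, 100)
    return VersionTriplet(major, minor, patch)
```

Because a named tuple compares field by field in order, the comparison operators that the commit defines explicitly in C++ come for free here, for example `compute_version_triplet(10000) > VersionTriplet(0, 7, 0)`.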

* [Tests] Improve async-copy-testing test

- relax constraints
- improve logging

* Update counter_config.h doxygen docs

* ROCPROFILER_SDK_BETA_COMPAT

- ppdef which helps with renaming when set to 1

* Remove spurious include

* Fix includes for cxx/version.hpp

* Doxygen fixes for rocprofiler_get_version and rocprofiler_get_version_triplet

* Public API Experimental Designation

- ROCPROFILER_SDK_EXPERIMENTAL added to experimental function
- "(experimental)" added to doxygen @brief entries

* Fix use of assert instead of static_assert in hip/stream.cpp

* Use typedef instead of define for rocprofiler_profile_config_id_t

* Use inline rocprofiler_{create,destroy}_profile_config instead of ppdef

- added <rocprofiler-sdk/deprecated/profile_config.h>

* Doxygen for rocprofiler_{create,destroy}_profile_config

* ROCPROFILER_SDK_DEPRECATED_WARNINGS

* Temporarily comment out ROCPROFILER_SDK_DEPRECATED_WARNINGS=1

* cmake formatting

* Misc variable renaming in samples and tests

* Fix declarations of types

* Fix hip stream tracing service struct name

- rocprofiler_callback_tracing_stream_handle_data_t renamed to rocprofiler_callback_tracing_hip_stream_api_data_t

* Rename "HIP_STREAM_API" to "HIP_STREAM"

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-03-24 12:07:33 +05:30
Madsen, Jonathan b01465303b [rocprofv3] Support negating aggregate tracing options (#251)
* Support negating aggregate tracing options

- E.g. --runtime-trace --scratch-memory-trace=False
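
One way to model this negation is with optional boolean values in argparse. The sketch below is not the rocprofv3 implementation: the set of options that `--runtime-trace` expands to is an illustrative subset chosen for the example, and `build_parser`, `resolve`, and `str2bool` are hypothetical names.

```python
import argparse

# Illustrative subset only: which options an aggregate enables is an assumption.
AGGREGATES = {
    "runtime_trace": ("kernel_trace", "memory_copy_trace", "scratch_memory_trace"),
}
OPTIONS = ("runtime_trace", "kernel_trace", "memory_copy_trace", "scratch_memory_trace")


def str2bool(value):
    """Parse the text after '=' into a boolean."""
    text = value.strip().lower()
    if text in ("1", "true", "yes", "on"):
        return True
    if text in ("0", "false", "no", "off"):
        return False
    raise argparse.ArgumentTypeError(f"expected a boolean, got {value!r}")


def build_parser():
    parser = argparse.ArgumentParser()
    for name in OPTIONS:
        # A bare flag means True; '--flag=False' negates; absent stays None.
        parser.add_argument("--" + name.replace("_", "-"), nargs="?",
                            const=True, default=None, type=str2bool)
    return parser


def resolve(namespace):
    """Expand aggregates, but never override an explicitly given value."""
    options = dict(vars(namespace))
    for aggregate, members in AGGREGATES.items():
        if options.get(aggregate):
            for member in members:
                if options[member] is None:
                    options[member] = True
    return {name: bool(value) for name, value in options.items()}
```

Keeping the default as `None` rather than `False` is what lets the resolver distinguish "not specified" from "explicitly disabled".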

* Add tests

* Update CHANGELOG

* rocprofv3 tweaks

* Added docs update

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Srihari Uttanur <srihari.u@amd.com>
2025-03-21 18:22:39 -05:00
Srihari Uttanur c9ca876b79 Add perfetto support for counter collection
Fix endtimestamp for counter tracks

Add fix for rocprofv3 counter collection tests

Fix formats and refactors

Added docs and addressed review comments

Address more review comments.
2025-03-21 01:41:19 +05:30
Bhardwaj, Gopesh 4735196fe4 changing markdown to rst format (#259)
* changing markdown extension to rst extension

* updating callback services

* updating all services, samples and installation

* Fix build

* More fixes

* more fixes

* minor fixes

* more fixes

* merging changes for SWDEV-510794 from pr 227
2025-03-20 11:09:53 -05:00
Baraldi, Giovanni 821918a512 SWDEV-516846: Fix serialization services conflicts and ATT counter streaming (#230)
* Update TT API

* Rework serialization

* update att_core

* Fix tests

* Fix tool

* Formatting

* Fix perfcounter

* Formatting

* Rename agent TT

* Format

* Workaround for codeQL alert

* Tidy fix

* Fix compiler error

* Tidy

* Fix some tests

* Fixing some tests

* formatting

* Fixing ATT serialization

* Format

* Fix test commandline

* Fixing init order

* Format

* Tidy fixes

* Removing unused sample

* Fix tests and schema

* Added ATT + PMC test

* Fix mode

* Fix file mode

* Review comments

* Fix typo

* Review comments

* Review comments

* Fix missing id inc after review comment

* Review comments

* Suggested Fixes

* Testing changes

* Test fix

* Build fixes

* Minor build fix

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
2025-03-14 18:11:10 -07:00
Trowbridge, Ian ccd1e54293 HIP Streams to Queues Translation (#235)
* rocprofiler_stream_id_t: opaque handle for a stream

- e.g. HIP stream
- the same HIP stream may map to different HSA queues at different points in the application
- added to:
  - rocprofiler_buffer_tracing_hip_api_record_t
  - rocprofiler_buffer_tracing_memory_copy_record_t
  - rocprofiler_callback_tracing_hip_api_data_t
  - rocprofiler_callback_tracing_memory_copy_data_t
---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Mark Meserve <mark.meserve@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Jakaraddi, Manjunath <Manjunath.Jakaraddi@amd.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
Co-authored-by: Nagaraj, Sriraksha <Sriraksha.Nagaraj@amd.com>
Co-authored-by: U, Srihari <Srihari.U@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-03-14 02:45:13 -07:00
Welton, Benjamin aa88dd44c7 [SWDEV-512693] Iteration based counter multiplexing (#272)
Adds iteration based multiplexing to counter collection. Counter groups can now be specified. These counter groups are collected on a device individually until a specified interval period is reached. When the interval is reached, the next counter group is set to be collected on subsequent kernel executions.

Supplies two new argument types that can be included in YAML/JSON inputs:

pmc_groups: an array of arrays containing the counter groups to run (i.e. [["SQ_WAVES", "GRBM_COUNT"], ["GRBM_GUI_ACTIVE"]])
pmc_group_interval: the number of kernel invocations on a GPU of a group before rotating to the next group

Note: the linked ticket originally proposed a random_seed_generator. It was not implemented because there are very few cases where you would want the groups to be selected randomly (and if you do, you can randomly generate the pattern and supply it as a large list of groups in pmc_groups).

All existing counter functionality should be preserved (selection of counters on specific devices only, profiling of only specific kernels, etc).
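
The rotation described in this commit message can be sketched as a pure function over dispatch indices. This models a single GPU only and is an illustration of the scheduling rule, not the SDK's code; `assign_counter_groups` is a hypothetical name, while `pmc_groups` and `pmc_group_interval` follow the input names given above.

```python
def assign_counter_groups(pmc_groups, pmc_group_interval, num_dispatches):
    """Return the counter group collected for each kernel dispatch on one GPU.

    Each group is collected for pmc_group_interval consecutive dispatches,
    after which collection rotates to the next group and eventually wraps.
    """
    if not pmc_groups:
        raise ValueError("pmc_groups must contain at least one group")
    if pmc_group_interval < 1:
        raise ValueError("pmc_group_interval must be at least 1")
    schedule = []
    for dispatch in range(num_dispatches):
        group_index = (dispatch // pmc_group_interval) % len(pmc_groups)
        schedule.append(pmc_groups[group_index])
    return schedule
```

With the two groups from the example above and an interval of 2, the first two dispatches collect the first group, the next two collect the second, and the fifth wraps back to the first.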

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-03-14 02:05:36 -07:00
Rawat, Swati 31b8f61c8e Documentation updates (#236)
* Documentation updates

* formatting

* Update using-rocprofv3.rst

* Update counter_collection_services.md

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
2025-02-28 10:10:26 +05:30
Trowbridge, Ian 31fe8858d1 rocJPEG API Tracing (#73)
* rocDecode API Tracing support

* Test bin file added to rocdecode. Need to add validate python methods

* Added option to not make rocDecode tests

* Added rocdecode and rocprofv3 tests

* Added csv test

* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependency. Changed cmake option to disable tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI

* Add option to avoid building rocdecode tests

* Added option to avoid building rocdecode bin file

* Support for rocJPEG API Trace

* Added newline to rocjpeg_version.h

* json-tool code added, initial test/bin commit

* Formatting

* Resolved rocjpeg bin test compilation errors

* Tests implemented. Perfetto module currently resulting in errors, so need to retest whenever it is fixed

* Formatting and compilation errors

* Minor fixes

* Copyright year update and minor fixes

* Doc update fix

* Added rocjpeg csv file in data

* Addresses review comments: Updated fixed Findroc.. and uses root directory as a hint, fixed documentation error, changed tables to use _CORE, minor style fixes

* Added rocdecode and rocjpeg to CI

* Removed rocdecode and rocjpeg from CI and added back build tests option

* Updated Cmake Files

* Added rocDecode and rocJPEG to CI

* Remove cmake line added in error

* Temporarily modified tests to pass if rocdecode or rocjpeg tracing are not supported for CI, cmake changes

* Added find_package for test

* Added back use of system rocDecode and rocJPEG, modifies system files to include prefix path

* Updated no-link to include INCLUDE_DIR/roc(decode|jpeg), added comments for tests

* Resolve merge conflicts and formatting

* Added regex find and replace instead of include for CI

* VAAPI package causing errors on Vega20

* Removed system rocjpeg and rocdecode use temporarily until cmake issues resolved

* Removed workflows regex

* Formatting and minor test modification

* Modified test for vega20

* Update rocDecode and rocJPEG cmake and tests

* Changelog

* Fix merge conflict

* Added back if-statements around add-tests since cmake-generator-expressions are resulting in errors when the packages are missing

* Removed if found statements, replaced with TARGET:EXISTS

* Skip json file for rocjpeg and rocdecode tests if not supported

* Add os import

---------

Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-21 13:43:49 -08:00
Bhardwaj, Gopesh 848242eb5c SWDEV-514046 documentation build fix (#208) 2025-02-13 09:25:30 -06:00
Kandula, Venkateshwar reddy 6427fbafc2 Accum_vgpr support in Rocprofv3 (#70)
* output accumulate vgpr count

* fix logic for computing accum_vgpr

* add accum_vgpr to csv.

* accumulation vgpr's docs and support for rocprofv3

* CHANGELOG.md

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
2025-02-12 10:47:46 -08:00
Bhardwaj, Gopesh 075d36eb82 output format envs doc update (#173) 2025-02-11 21:37:12 -06:00
Madsen, Jonathan 6246ec4040 SDK: Agent UUIDs, agent runtime visibility, kernel symbol address (#154)
* [DO NOT MERGE] Misc UUID updates

- this is WIP

* Agent visibility

- Support for ROCR_VISIBLE_DEVICES, HIP_VISIBLE_DEVICES, CUDA_VISIBLE_DEVICES, GPU_DEVICE_ORDINAL

* Update CHANGELOG

* tweak to rocprofiler_agent_runtime_visiblity_t

* Code object kernel address

- new fields in code_object_kernel_symbol_register_data_t
  - kernel_code_entry_byte_offset
  - kernel_address

* Support ROCR_VISIBLE_DEVICES reordering devices for HIP

* Addressed code review changes
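
A simplified sketch of how a visible-devices variable can both filter and reorder agents, not the SDK's logic. It assumes the value is a comma-separated list of integer indices and that parsing stops at the first invalid entry; real runtimes may also accept other forms such as UUIDs, which this sketch ignores. The function name `visible_agents` is hypothetical.

```python
def visible_agents(agents, env_value):
    """Return the agents a runtime would see, in the order the variable lists them.

    agents:    all agents in default enumeration order
    env_value: value of e.g. ROCR_VISIBLE_DEVICES, or None if unset
    """
    if env_value is None:
        return list(agents)  # variable unset: everything visible, default order
    selected = []
    for token in env_value.split(","):
        token = token.strip()
        # Assumption: an invalid entry ends the list; earlier entries stay visible.
        if not token.isdecimal() or int(token) >= len(agents):
            break
        selected.append(agents[int(token)])
    return selected


if __name__ == "__main__":
    print(visible_agents(["gpu0", "gpu1", "gpu2"], "2,0"))
```

Under these assumptions, a value of `2,0` exposes two devices with the third physical GPU enumerated first, which is the reordering effect the message refers to.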

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 14:36:23 -06:00