Commit Graph

79 Commits

Author SHA1 Message Date
Hui, Young 3954cedd25 [rocpd] Adding summary module to generate summaries from rocpd database + query submodule + rocpd command-line tools (#488)
* adding summary.py to generate tmp <category_region>_summary views

* migrating CSV summary to SDK method of writing CSVs

  - Add domain_view to summary.py
  - omit the C++ code of writing CSV because it gets revered later anyway

* Add summary subparser and write_sql_view_to_csv function

* adding all <>_summary views generation to summary.py

* add summary_per_rank feature

* add --summary-per-rank

* reconstruct generate_summary_view and create_domain_view

-introduce by_rank

* remove sqr and variance in summary views

* use RocpdImportData instead of connection

* two fixes on summary.py

--modify the generate_summary_view function to return a tuple with view name and sql code

add if_not_exits parameter to generete_summary_view

* Refactor summary.py to allow output path and filename args, and apply time_window
- clean up summary table column headers
- only generate by-rank views if that param is specified

* Add ProcessID to Hostname output and csv, so users can identify the system in the by-rank summaries

* Summary.py, just add hostname to by-rank summaries, instead of creating mapping table

* Summary - migrate csv writer to pandas, for more future flexibility

* Adding a few simple tests for summary.py

* Linting fixes

* add region_categories to summary options

  -  Automatically retrieve region categories from the database if argument is None

* add backticks for view_names

* fix tests after rebase

* Made code review changes
- fixed whitespace in CMakelists.txt
- adding query.py module & subparser in __main__.py
- refactor summary function to return query
- used query.py to output csv
- used query.py to also output summary to console
- provided new command line options to select summary output to csv or console

* Made fix to jinja template in query.py, as suggested by copilot

* Consolidated output calls to query in export_view function based on feedback
- refactored: helpers, query functions, create view functions
- extended formats to include what query supports (md, html, pdf, json)
- added json format to query, and changed orient=records
- adding jinja2 and reportlab to requirements.txt

* Add version_info for rocpd and roctx

* Add rocpd commandline tool

* Add executable permissions to source/bin/rocpd.py

* Removed rocpd2query, and cleaned up --help examples

---------

Co-authored-by: acanadas <acanadas@amd.com>
Co-authored-by: Jin Tao <jintao12@amd.com>
Co-authored-by: a-canadasruiz <Araceli.CanadasRuiz@amd.com>
Co-authored-by: Jonathan R. Madsen <Jonathan.Madsen@amd.com>
2025-07-24 16:12:06 -05:00
Nagaraj, Sriraksha 2447a85215 [rocprofv3-avail] - Add sample data (#514)
* Add sample data for avail and remove color code for non terminal output

* review comments

* review comments

* add documentation

* test fix
2025-07-22 08:39:59 -07:00
Nagaraj, Sriraksha 3aaffc42da [rocprofv3-avail] Documentation update and column formatting (#447)
* addressing issues

* doc fix

* test fix

* fix

* fix formatting issue and doc update

* fix column size

* fix

* fix formatting in output

* tests fix

* test fix

* add new line

* add new line

* fix new line

* fixing typo in using-rocprofv3-avail.rst
2025-07-10 11:41:12 -05:00
Baraldi, Giovanni c0c08b2f08 [rocprofv3] Fix ATT library path (#476)
* Fix library path

* Update docs

* Review comments

* Update source/bin/rocprofv3.py

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-07-01 22:08:29 +02:00
Elwazir, Ammar aeb1621c2b [SDK] Support CMake option for using internal RCCL tracing + (temporary) enable in CI (#457)
* Temp: disable RCCL tracing

* Update continuous_integration.yml

* Update continuous_integration.yml

* Update continuous_integration.yml

* Adding option to disable rccl tracing from CMake

* Update codeql.yml

* Misc updates

- ROCPROFILER_BUILD_RCCL -> ROCPROFILER_INTERNAL_RCCL_API_TRACE
- env.EXTRA_TEMP_CMAKE_OPTIONS -> env.GLOBAL_CMAKE_OPTIONS
- add (advanced) option ROCPROFILER_INTERNAL_RCCL_API_TRACE

* Fix rocprofiler::sdk::get_enum_label

- missing enum labels for HIP_RUNTIME_API_TABLE_STEP_VERSION > 8

* Update tests/rocprofv3/advanced-thread-trace/CMakeLists.txt

- improve various aspect of cmake -- particularly echoing where attdecoder_LIBRARY was found

* Use CMAKE_MESSAGE_INDENT

- add prefix to cmake messages to help indicate where messages are coming from
- make find_package(Python3 ...) QUIET for bindings

* Fix rocprofiler::sdk::get_enum_label

- handle HSA_AMD_EXT_API_TABLE_MAJOR_VERSION

* Fix rocprofv3 message for att library path

* Fix tests/rocprofv3/advanced-thread-trace/att_input.yml config

* Fix rocprofv3 check_att_capability + soversion/version library resolution

- Account for ROCPROF_ATT_LIBRARY_PATH in env in check_att_capability
- Add resolve_library_path
  - supports resolution of library names to SOVERSION and VERSION paths

* Fix python linting error (unused import)

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-06-17 08:32:54 -05:00
Baraldi, Giovanni 2fa95e6d6d Enable PC sampling to be run alongside ATT. Add ATT to changelog. (#445)
* Enable PC sampling to be run alongside ATT. Add ATT to changelog.

* Fix tests

* Review comments

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-06-10 20:19:00 +02:00
Nagaraj, Sriraksha 00f5f398e3 avail-fix (#444)
* avail-fix

* formating

* adding dimension info in rocprofv3

* review comments
2025-06-09 16:53:02 -05:00
Nagaraj, Sriraksha 80d60d8535 [rocprofv3-avail] Rework rocprofv3-avail tool (#312)
---------

Co-authored-by: vlaindic_amdeng <vladimir.indic@amd.com>
2025-06-06 11:51:37 -07:00
Madsen, Jonathan 7afedc63be [rocprofv3] SQLite3 database output (rocpd) support + rocprofiler-sdk-rocpd (#403)
* [rocprofv3] rocpd SQLite3 database output support

* Move counters xml and yaml to source/share/rocprofiler-sdk

- more representative of install hierarchy

* Add share/rocprofiler-sdk/rocpd SQL files

* Experimental rocprofiler-sdk SQL API

* rocprofv3 default output format is rocpd

* Fix rocpd event ids for counter collection w/o kernel dispatch

* Remove fktable entries from rocpd_tables.sql

* Fix rocpd schema path

* Fix install component for roctx python bindings

* rocprofiler-sdk-rocpd

- create include/rocprofiler-sdk-rocpd
- create rocprofiler-sdk-rocpd library, package, etc.
- default all "guid" fields to "{{guid}}" in tables
- remove "{{view_uuid}}" support (always unused)

* Migrate rocprofv3 to use rocprofiler-sdk-rocpd

* Fix missing foreign key reference

* Revert change

* Fix cmake comment

* Fix maybe-uninitialized compiler warning

* Fix maybe-uninitialized compiler warning

* Add logging to rocpd_sql_load_schema

* Improve string sanitization when inserting json strings

* Initialize rocpd logging on rocprofiler-sdk-rocpd library load

* Revert lib/output/generatePerfetto.cpp changes

* [temporary] Tweak rocprofv3-test-list-avail-trace-execute test log level

* Update get_install_path for lib/rocprofiler-sdk-rocpd/sql.cpp

- try to resolve issues on RHEL/SLES for dladdr

* Update lib/common/logging.cpp

- enable environ overrides

* dlsym for rocpd_sql_load_schema

* Make dl_info.dli_fname lexically normal

* Implement node_info alternatives if /etc/machine-id does not exist

* Misc include fixes

* SHA256 and UUIDv7 support

* Implement UUIDv7 in generateRocpd.cpp

* Support push/pop environment variables

* Minor tweak

* Fix glog segfaults when unsetting glog env

* Updated CHANGELOG

* Updates tests/pytest-packages

- rocpd_reader.py: RocpdReader

* Update tests / marker_views.sql

- add test_rocpd_data

* Update rocpd_tables.sql

- Use AUTOINCREMENT
- insert "uuid" and "guid" into rocpd_metadata

* Minor updates to generateRocpd.cpp

- don't quote GUID
- use sqlite3_open_v2
- use sqlite3_close_v2

* Update execute_raw_sql_statements_impl

- uses sqlite3_last_insert_rowid for autoincrement

* Update SQL deferred_transaction

- CI check for nullptr to connection

* Apply suggestions from code review

Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>

* Code review updates

- formatting
- replace if with switch
- remove loop for {{uuid}}

* Fix pmc_groups handling in rocprofv3

* Address code review feedback

- Include rocm_version in rocprofv3 version info
- Note `--version` option for `rocprofv3` in CHANGELOG.md
- remove commented out code

* Fix packaging dependencies

* Fix install package step of CI workflow

* Fix install package step of CI workflow

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
2025-05-30 00:13:19 -05:00
Baraldi, Giovanni 134ac2d9de [rocprofv3] Fix ATT library path (#425)
Fix rocprofv3 `--att-library-path` option 

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-05-29 15:34:07 -05:00
Madsen, Jonathan 14c2dc55ff [roctx] Python bindings for rocprofiler-sdk-roctx (#402)
* [roctx] Python bindings for rocprofiler-sdk-roctx

* Update CHANGELOG

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-05-19 20:02:51 -05:00
Baraldi, Giovanni 65786f619d [RSERP-1802] Add trace decoder to API (#398)
* Add trace decoder to API.

* Cleanup and activity

* Rename

* Minor fix

* Replace tt/TT with thread_trace/THREAD_TRACE

- public API types are not abbreviated

* Fix aliases

* Build system updates

- activate clang-tidy for all subfolders in lib
- fix addition of sources for att-tool

* Fix clang-tidy issues with lib/att-tool/counters.{hpp,cpp}

* Delete counters.cpp

* Formatting

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-05-17 12:08:33 -07:00
Elwazir, Ammar 6f17da7ade [rocprofv3-benchmark] SDK and rocprofv3 Benchmarking Suite (#157)
* Adding Benchmarking Stg1

* config fix

* reset

* add jpeg and decode traces in iteration

* address comments benchmark config files.

* address comments.

* address comments.

* address comments: revert cntrl ctx.

* address comments: revert csv output.

* resolve merge conflits.

* format.

* build fix.

* fix hip runtime api traces.

* loop cb services.

* format.

* bug fix.

* Fix operator>

- public C++ comparison operator

* Update configuration options

- support selected regions (--selected-regions)
- support writing output config json (--output-config)
- update serialization data

* rocprofv3 tool library misc updates

- lambda for starting context
- support for writing config json

* Tool library updates

- Finished support for all benchmarking modes
- Added build spec support to config json

* Fix ROCPROFILER_SOVERSION

- this value should not be multiplied by 10,000

* Minor tweak to rocprofv3

* Benchmarking scripts

* formatting

* Fix duplicate include

* Add reproducible-dispatch-count test app

- used in benchmarking

* registration logging

- report number of registered contexts and active contexts after client initialization

* Serialize environment in rocprofv3 output config

* ROCPROFILER_BUILD_BENCHMARK CMake option

* Update benchmark SQL schema

- hash_id is text
- add md5sum to benchmarked_app
- remove app_id from benchmarked_sdk
- add sdk_id to benchmark_config
- separate hip_trace into hip_runtime_trace and hip_compiler_trace
- use INT instead of INTEGER for MySQL compatibility
- add count column in benchmark_statistics
- allow std_dev to be NULL in benchmark_statistics

* Update rocprofv3-benchmark.py

- use md5 instead of python hash (which includes random seed)
- use args.mysql_database
- compute md5sum of executable
- fix insert_benchmark_config
  - marker trace fixes
  - memory allocation fixes
  - split hip_trace into hip_{runtime,compiler}_trace
- remove app_id from benchmarked_sdk
- support warmup runs
- count field in benchmark_statistics

* Support launcher and environment in YAML

* Update reproducible-dispatch-count.cpp

- support mode which doesn't use hip event timing

* Misc rocprofv3-benchmark.py updates

- fix some MySQL support
- remove some unnecessary logging

* support mysql db.

* Format.

* Updated SQL input files

- moved benchmark_schema.sql to benchmark_table.sql
- added benchmark_views.sql
  - uses {{metric}} syntax for variable substitution

* cmake formatting

* update rocprofv3-benchmark.py

- benchmark config labels
- overhead views

* Encode rocprofv3-benchmark PID in rocprofv3 and timem output files

* Minor tweak to benchmark_views.sql

- include count
- reorder fields for readability

* split statements and use IS if values is NONE.

* use backtick instead of double quotes and add IS before NOT NULL.:

* Adding Mandelbrot Benchmark App

* Adding Dockerfile example

* Update dockerfile

* Update dockerfile

* [SDK] rocprofiler_query_external_correlation_id_request_kind_name

* Execution-profile benchmark mode

* Execution profile SQL support

* Rename mandlebrot folder + misc clang-tidy

* [rocprofv3-benchmark] Execution profile support

* Update installation

* add work dir when setting git revision, useful when building outside src.

* Set FULL_VERSION_STRING and ROCPROFILER_SDK_GIT_REVISION

- when benchmark folder is top-level

* Remove unused python packages from requirements.txt

* Use ldd/pyelftools to include linked libs for md5sum

- also add --filter-benchmark and --filter-rocprofv3 options
- support labeling the rocprofv3 options
- use more argparse groups
- more generic application of filters
- support variable substitution in environment, e.g. PATH=/some/path:$PATH

* Environment improvements

- improve reproducibility when env set via input file vs. shell
- support "environment-ignore" to remove environment variables

* Misc formatting

* Misc. fix

* use backticks for defining new columns name

* Support shuffling the order of benchmark modes/rocprofv3 args

* Address review comments

* Update Dockerfile

- rename to Dockerfile
- reduce to one layer

* Support docker build arg BRANCH

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Venkateshwar Reddy Kandula <vkandula@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-05-13 16:18:23 -05:00
Madsen, Jonathan d2bde3ce27 [rocprofv3] Use -P for collection period shorthand option (#356)
* [rocprofv3] Use -P for collection period option

- Reserve -p for profiler attachment

* Update changelog

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-04-27 20:18:26 -05:00
Baraldi, Giovanni a8f3397069 SWDEV-528686: ATT fix for gfx12 s_wait_idle. Fixes for csv. Default to parse to trace. Fix for ROCR_VISIBLE_DEVICES. (#345)
* Fix for gfx12 s_wait_idle. Added wait field on att.csv

* Format and default to ATT to trace

* Update .mds

* No fatal error for invalid agent

* Tidy fixes

* Rename wait to idle, removed uneeded headers

* Remove unused traceID

* Tidy fix

* Fix csv output

* Formatting

* Fix tests

* Fix tests

* Fix for visible devices

* Review comment: Fix cmake

* Review suggestion

* Remove changelog/readme

* Review comments

* Review comment for CSV

* Formatting

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-04-25 11:49:16 -05:00
Nagaraj, Sriraksha 87badfbd15 [rocprofv3] signal handler fix (#332)
* rocprofv3: LD_PRELOAD for signal and sigaction

- wrappers around `signal` and `sigaction` to prevent applications which install signal handlers to replace the rocprofv3 signal handlers
- minor tweaks to buffer sizes (use page_size instead of
KiB)

* [DO NOT COMMIT] extra logging

* Switch git submodule url for perfetto

- use GitHub URL as this is more accessible

* Update ring_buffer<Tp>

- account for alignment padding

* Update buffered_output

- track number of bytes stored
- add nullptr checks

* Update tmp_file_buffer

- track number of bytes
- read_tmp_file does not create tmp file if it does not already exist

* Update tmp_file

- add exists member function for checking whether temporary file already exists
- tweak remove() implementation

* Update config.hpp

- add option to enable/disable signal handlers
- add option for minimum_output_bytes

* Make signal, sigaction functions visible

* rocprofv3 tool updates

- chained signals
- override the signal handler(s) installed by the application
- improve cleanup of temporary files
- support minimum output bytes

* Add commandline support

* fixing test

* minor fix

* minor fix

* fix clang issue

* fix

* Adding docs

* review comments

* review changes

* review

* YUV pulldown additions to rocdecode

* More rocdecode changes

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Jonathan R. Madsen <Jonathan.Madsen@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-04-17 21:10:52 -07:00
Bhardwaj, Gopesh ca7cce9e81 doc improvements for 1.0.0 part 2 (#330)
* update installation steps

* Github Issue #50 Adding README's for samples

* Making name change to ROCprofiler-SDK for consistency

* Fix HIP trace documentation

* Fix HSA trace in docs

* Fix kernel trace in docs

* Fixing memory copy and memory allocation traces

* runtime trace and sys trace doc update

* Fix scratch memory doc

* kernel naming and filtering options

* Adding collection period in docs

* Perfetto configs update

* summary output file

* kernel trace format fix

* update CHANGELOG

* Agent index doc update

* rocm-smi output

* group by queue option

* Updated --group-by-queue description

* perfetto visualization

---------

Co-authored-by: Ian Trowbridge <Ian.Trowbridge@amd.com>
2025-04-15 13:30:07 -07:00
Madsen, Jonathan b01465303b [rocprofv3] Support negating aggregate tracing options (#251)
* Support negating aggregate tracing options

- E.g. --runtime-trace --scratch-memory-trace=False

* Add tests

* Update CHANGELOG

* rocprofv3 tweaks

* Added docs update

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Srihari Uttanur <srihari.u@amd.com>
2025-03-21 18:22:39 -05:00
Baraldi, Giovanni 821918a512 SWDEV-516846: Fix serialization services conflicts and ATT counter streaming (#230)
* Update TT API

* Rework serialization

* update att_core

* Fix tests

* Fix tool

* Formatting

* Fix perfcounter

* Formatting

* Rename agent TT

* Format

* Workaround for codeQL alert

* Tidy fix

* Fix compiler error

* Tidy

* Fix some tests

* Fixing some tests

* formatting

* Fixing ATT serialization

* Format

* Fix test commandline

* Fixing init order

* Format

* Tidy fixes

* Removing unused sample

* Fix tests and schema

* Added ATT + PMC test

* Fix mode

* Fix file mode

* Review comments

* Fix typo

* Review comments

* Review comments

* Fix missing id inc after review comment

* Review comments

* Suggested Fixes

* Testing changes

* Test fix

* Build fixes

* Minor build fix

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
2025-03-14 18:11:10 -07:00
Trowbridge, Ian ccd1e54293 HIP Streams to Queues Translation (#235)
* rocprofiler_stream_id_t: opaque handle for a stream

- e.g. HIP stream
- the same HIP stream may map to different HSA queues at different points in the application
- added to:
  - rocprofiler_buffer_tracing_hip_api_record_t
  - rocprofiler_buffer_tracing_memory_copy_record_t
  - rocprofiler_callback_tracing_hip_api_data_t
  - rocprofiler_callback_tracing_memory_copy_data_t
---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Mark Meserve <mark.meserve@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Jakaraddi, Manjunath <Manjunath.Jakaraddi@amd.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
Co-authored-by: Nagaraj, Sriraksha <Sriraksha.Nagaraj@amd.com>
Co-authored-by: U, Srihari <Srihari.U@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Welton, Benjamin <Benjamin.Welton@amd.com>
Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>
Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-03-14 02:45:13 -07:00
Welton, Benjamin aa88dd44c7 [SWDEV-512693] Iteration based counter multiplexing (#272)
Adds iteration based multiplexing to counter collection. Counter groups can now be specified. These counter groups are collected on a device individually until a specified interval period is reached. When the interval is reached, the next counter group is set to be collected on subsequent kernel executions.

Supplies two new argument types that can be included in YAML/JSON inputs:

pmc_groups: an array of arrays containing the counter groups to run (i.e. [ ["SQ_WAVES", "GRBM_COUNT"], ["GRBM_GUI_ACTIVE"])
pmc_group_interval: the number of kernel invocations on a GPU of a group before rotating to the next group

Note: originally there was a random_seed_generator proposed in the linked ticket, that was not implemented since there are very few instances where you would want the selection of the groups to be randomly generated (and if you do, you can randomly generate the pattern and place it as a large list of groups in pmc_group).

All existing counter functionality should be preserved (selection of counters on specific devices only, profiling of only specific kernels, etc).

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-03-14 02:05:36 -07:00
Nagaraj, Sriraksha c30bb7cbda Adding agent-index (#189)
* Adding agent-index

* review changes

* review comments addressed

* minor fix

* fix CI failure

* review comments

* Fix agent index test and address review comments

* Build Fixes

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
2025-03-14 00:51:32 -07:00
Trowbridge, Ian 31fe8858d1 rocJPEG API Tracing (#73)
* rocDecode API Tracing support

* Test bin file added to rocdecode. Need to add validate python methods

* Added option to not make rocDecode tests

* Added rocdecode and rocprofv3 tests

* Added csv test

* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependancy. Changed cmake option to disbale tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI

* Add option to avoid building rocdecode tests

* Added option to avoid building rocdecode bin file

* Support for rocJPEG API Trace

* Added newline to rocjpeg_version.h

* json-tool code added, initial test/bin commit

* Formatting

* Resolved rocjpeg bin test compilation errors

* Tests implemented. Perfetto module currently resulting in errors, so need to retest whenever it is fixed

* Formatting and compilation errors

* Minor fixes

* Copyright year update and minor fixes

* Doc update fix

* Added rocjpeg csv file in data

* Addresses review comments: Updated fixed Findroc.. and uses root directory as a hint, fixed documentation error, changed tables to use _CORE, minor style fixes

* Added rocdecode and rocjpeg to CI

* Removed rocdecode and rocjpeg from CI and added back build tests option

* Updated Cmake Files

* Added rocDecode and rocJPEG to CI

* Remove cmake line added in error

* Temporarily modified tests to pass if rocdecode or rocjpeg tracing are not supported for CI, cmake changes

* Added find_package for test

* Added back use of system rocDecode and rocJPEG, modifies system files to include prefix path

* Updated no-link to include INCLUDE_DIR/roc(decode|jpeg), added comments for tests

* Resolve merge conflicts and formatting

* Added regex find and replace instead of include for CI

* VAAPI package causing errors on Vega20

* Removed system rocjpeg and rocdecode use temporarily until cmake issues resolved

* Removed workflows regex

* Formatting and minor test modification

* Modified test for vega20

* Update rocDecode and rocJPEG cmake and tests

* Changelog

* Fix merge conflict

* Added back if-statements around add-tests since cmake-generator-expressions are resulting in errors when the packages are missing

* Removed if found statements, replaced with TARGET:EXISTS

* Skip json file for rocjpeg and rocdecode tests if not supported

* Add os import

---------

Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-21 13:43:49 -08:00
Madsen, Jonathan 470f347e50 SDK: remove majority of exceptions (#176)
* SDK: remove majority of exceptions

- replace with ROCP_FATAL, ROCP_CI_LOG(WARNING), etc.
- improve logging of symbolic link
- add --readlink and --realpath (hidden options) to rocprofv3 to follow symlinks for preloaded libraries

* Add rocprofv3 --rocm-root argument

* Fix registration resolved_exists

* Fix rocprofv3_avail.py

* Update logging for rocprofiler_configure search

- relax failure conditions

* Misc clang-tidy fixes

* Fix merge

* Fix merge

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>
2025-02-18 10:44:37 -06:00
Nagaraj, Sriraksha 5d0b220c37 target cu to string input (#198)
* target cu to string input

* review comments

* review comments
2025-02-12 12:51:39 -06:00
Madsen, Jonathan 59b41ab5aa rocprofv3: Update rocprofv3 command line for ATT (#201)
* rocprofv3: suppress agent info when no data collected

* Update output config serialization

- full serialization of output configuration

* Update rocprofiler-sdk-att/tests

- add version and soversion
- change output directory
- generate libatt_decoder_summary
- disable tests instead of removing them

* Update rocprofv3 command-line

- make --att-library-path hidden by default
- simplify check_att_capability
- reorder pc sampling options
- add hidden --echo option
- remove ROCPROF_LIST_AVAIL_TOOL_LIBRARY from preload

* Add new rocprofv3 tests for specify the ATT library path

* Tweak to rocprofv3-test-hsa-multiqueue-att tests

* Update rocprofv3 tool to enable output with att

* Fix standalone test installation

* Revert to fetchcontent_makeavailable to fetchcontent_populate

* Revert tests/common/CMakeLists.txt

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 18:10:48 -06:00
Nagaraj, Sriraksha d4a51e4102 Adding att v3 support (#84)
* Adding att v3 support

* misc fix

* bug fix

* Python linting workflow and rules

* fix regex

* Adding temporary args

* fix temporary args

* fix format

* remove att_perfcounters from test input

* Review comments (#163)

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

* Revert "Review comments (#163)"

This reverts commit 9ef0f8e5a4489d5581255e1b70ced2aef5c1c1d0.

* Address review comments 2

* review changes

* review comments

* review

* cmake alias

* review

* review

* review

* review

* Enabling percounter in v3 script

* review

* formatting

* formatting

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com>
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-02-04 04:05:38 -06:00
Elwazir, Ammar 19a912d476 Fixing collection period rocprofv3 help message (#148)
Update rocprofv3.py
2025-01-24 08:39:40 -06:00
Rawat, Swati 97b7a6315d update copyright date to 2025 (#102)
* Update LICENSE

* Update conf.py

* Update copyright year

* [fix] Update copyright year

* Update copyright year "ROCm Developer Tools"

* Add license headers to c++ files

* Add license to *.py

* Update licenses in rocdecode sources

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Mythreya <mythreya.kuricheti@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-22 19:11:20 -06:00
Cheruvally, Aravindan 6bb60bf930 Enhance CMAKE install instructions with std install location/destination (#85)
* Enhancement - usage of package name flags commonly across for getting unique folder name

* Enhancements - updating libexec/pkg usage, avoid sbin

* CMAKE Format Update

* Python Format Update

* Revert "Enhancement - usage of package name flags commonly across for getting unique folder name"

This reverts commit 2dcd1ac5f22ab90112d90648e4b5dab5c54bc639.

* REview Comments - Revert PACKAGE_NAME usage

* Review Comments - Update source folders accordingly to new cmake install locations
2025-01-22 11:19:47 -06:00
Trowbridge, Ian e307b89ca4 rocDecode API Tracing Support (#49)
* rocDecode API Tracing support

* Test bin file added to rocdecode. Need to add validate python methods

* Added option to not make rocDecode tests

* Added rocdecode and rocprofv3 tests

* Added csv test

* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependancy. Changed cmake option to disbale tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI

* Add option to avoid building rocdecode tests

* Added option to avoid building rocdecode bin file

* Merge conflict error

* CMake files changed in response to review comments. Attempting to implement callbacks.

* Turned off test building for rocdecode

* Minor fixes for review comments

* Review comments

* Updated formatting

* Document changes and format.hpp reversion. Need to remove iterate args support for now for later update.

* Remove iterate args support

* Remove iterate-args

* enforce abi versioning in macro if

* Fix doc error

* removed spaces to fix indentation error

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2025-01-17 14:42:25 -08:00
Elwazir, Ammar 94474de480 rocprofv3: fix collection period unit handling (#103)
* Fixing Collection Period

* Fixing default value for collection period unit

* Formatting

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
2025-01-14 11:19:16 -06:00
Nagaraj, Sriraksha 554537f140 fix dimensions in avail output (#82)
* fix dimensions in avail output

* review comment addressed
2024-12-20 13:26:02 -06:00
Indic, Vladimir 9c21c49aa1 Remove numpy dependency from rocprofv3.py (#75)
Remove numpy dependency from rocprofv3.py
2024-12-19 20:52:42 +01:00
Nagaraj, Sriraksha 5556774c3a fix avail test (#50)
* fix avail test

* changing the regular expression

* Adding fatal error to avail script

* Revert "changing the regular expression"

This reverts commit e522143b5d9dccb870fd7f5667619ed32687d1e6.
2024-12-06 17:07:45 -06:00
Nagaraj, Sriraksha 17fdc33d05 --pc-sampling-beta-enable in ROCProfV3 (#56)
PC sampling must be explicitly enabled. 
Emit fatal error otherwise.

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

---------

Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2024-12-06 22:04:23 +01:00
Elwazir, Ammar a579c70b71 Adding --collection-period feature in rocprofv3 to match v1/v2 parity (#9)
* Adding Trace Period feature to rocprofv3

* Adding feature documentation

* Update source/bin/rocprofv3.py

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fixing format

* Moving to Collection Period and changing the input params

* Format Fixes

* Fixing rebasing issues

* Removing atomic include from the tool

* Adding more options for units, optimizing the code

* Fixing rocprofv3.py

* Fixing time conv & adding time controlled app

* Fixing format

* Changing to shared memory testing methodology

* use of shmem use

* Fix include headers for transpose-time-controlled.cpp

* Format upload-image-to-github.py

* Removing shmem and using only env var to dump timestamps from the tool

* Tool Fixes + Test Config

* Adding Tests

* Fixing Review comments

* Update trace period implementation

* Update trace period tests

* check between start and stop timestamps

* Merge Fix

* Update validate.py

* Improve safety of rocprofiler_stop_context after finalization

* Pass context id to collection_period_cntrl by value

* Adding 20 us error margin

* Ensure log level for collection-period test is not more than warning

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-12-06 02:17:24 +00:00
Nagaraj, Sriraksha c42bdc3128 rocprofv3: rocprofv3-avail tool (#15)
* support avail tool

Updating avail library and script

Listing on Std output incase the output folder is not given

Extending list metrics test

misc fix

misc fix

fixing memory leak

changing list-metrics to list-avail

fixing formatting issue

Fixing CMakeLists

Add test for list avil with trace

Fix test fail

clang tidy errors fixed

Removing build commands for rocprofv3-trigger-list

Addressing review changes

addressing review comment

moving avail to libexec

merge fix

Fix test failures

updating doc

Fix doc error

* updating legacy doc

* fix formatting issue

* Addressing review comments
2024-12-04 18:34:10 -06:00
Nagaraj, Sriraksha 50b185b9ac rocprofv3: PC Sampling Support (#14)
* Adding tool pc sampling support

Fixing merge issue

tool support on SDKupdates

link amd-comgr

Sanitizer failure fix

fix format

Addressing review comments

misc fix

Adding dispatch id to the CSV output

AddingCHANGELOG

[ROCProfV3][PC Sampling] Initial ROCProfV3 PC sampling tests for JSON and CSV formats (#17)

ROCProfV3 initial tests for JSON and CSV output.

Simple kernels that simplify the verification of samples to instruction decoding
has been introduced.

removing option to enable pc sampling explicitly

Adding documentation

no pc-sampling option in tests anymore

Addressing review comments

Updating docs

an option for choosing whether all units must be sampled

try ignoring PC sampling tests (#36)

* run pc-sampling tests on MI2xx runners
* use v_fmac_f32 instead of s_nop 0 in tests

* fixing docs
2024-12-04 18:32:48 -06:00
Benjamin Welton 7ddc72ad45 Add rocprofiler_load_counter_definition (#1193)
Adds rocprofiler_load_counter_definition. This function allows a counter definition file to be supplied to rocprofiler-sdk directly. Takes in a string containing the counter definition YAML, its size (in bytes), and a flag value to state whether this is an append operation or not.

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: usrihari123 <srihari.u@amd.com>
2024-11-22 01:55:47 -08:00
Gopesh Bhardwaj a976ed0783 Format rocporfv3 help (#1199)
* Format rocporfv3 help

* python formatter fix
2024-11-18 20:51:02 -06:00
itrowbri 3bd7773cf7 Memory Allocation Tracking (#1142)
* Initial commit: Need to implement wrapper function to collect data and test that wrapper function is correctly replacing core HSA functions

* Attempted to implement wrapper implementation for hsa memory allocation functions. Need to modify generate record files and test if implementation is working as expected

* Debugging and implementing generateCSV function

* Memory allocation size and starting address outputted to csv and json file formats

* Formatting

* Initial setup for OTF2 and Perfetto generation

* Collecting agent id for memory_allocation and formatting

* Modified memory_allocation.cpp to set up code for AMD_EXT commands

* Support for memory_pool_allocate added

* Removed accidently added file

* Made flag optional and added more OTF2 and Perfetto code. Needs testing to ensure perfetto and OTF2 works

* Formatting

* Fixed perfetto and otf2 output

* Fixed flag issue due to incorrect buffer use

* Updated documentation

* Small cleaning and comments

* Added test for HSA memory allocation tracing

* Fixed summary test validation errors due to allocation tracing. Added type to location_base to create unique event ids for allocation due to OTF2 trace error

* Decreased lower limit of hip calls for test

* Modified summary tests to vary number of allocate requests

* Minor fixes to address comments. Still need to address OTF2 comments

* Fix docs and changed OTF2 to use enum for type specified in location_base construction

* Fixed schema error

* Added vmem command tracking. Need to add test

* Updated test to work with vmem command and updated generateCSV to output int instead of hex string.

* OTF2 enum update and mispelling fix

* CI does not support Virtual Memory API. Removed vmem test. Will add back if CI is modifed to suport vmem API

* Update CMakeLists.txt for memory allocation test

* Updated summary test

* Minor fixes to address comments

* Moved domain_type.hpp enum to before LAST

* Fixed compile errors and formatting

* Fixed stats summary domain name error

* Added rocprofv3 test

* Page migration test fix

* Undo page migration test changes. Failures do not appear to have to do with memory allocation
2024-11-18 20:22:14 -06:00
Jonathan R. Madsen d93d4d9463 rocprofv3: support specifying HW counters via command line (#1130)
* rocprofv3: support specifying PMC counters via command line

- E.g. `rocprofv3 --pmc SQ_WAVES -- <app>`

* Update CHANGELOG

* updated rocprofv3 help and documentation

---------

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
2024-10-25 02:49:30 -05:00
Jonathan R. Madsen a92fa8f071 Miscellaneous Testing Fixes + CMake find_package include guard (#1106)
* Improve ROCPROFILER_DEFAULT_FAIL_REGEX

* Support find_package called twice

* Skip iteration of ROCPROFILER_AGENT_TYPE_CPU

* Relax tests/rocprofv3/summary

* Tweak to rocprofiler-sdk-tool/tool.cpp

* Move rocprofv3-trigger-list-metrics to libexec

* Remove PC_SAMPLING_TESTS_REGEX from CI workflow

* Update packaging

- core component depends on roctx component

* Increase verbosity for failing tests

* Fix RPATH of rocprofv3-trigger-list-metrics

---------

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
2024-10-22 16:24:16 +05:30
Gopesh Bhardwaj bfe80a4428 Removing rocprofv3.sh and rocprofv3 install path to sbin (#1143) 2024-10-22 14:16:36 +05:30
Gopesh Bhardwaj 320427b5f5 rocprofv3: docs and help menu updates (#1129)
* doc updates

* Correcting ROCtx information

* Making ROCTx string consistent

* missing occurence
2024-10-17 13:28:53 +05:30
Gopesh Bhardwaj 9b2ece76c3 Fixing installed pacakge tests in CI (#1119)
* Fixing installed pacakge tests in CI

* Formatted rocprofv3.py with black formatter
2024-10-07 20:55:35 +05:30
Gopesh Bhardwaj f2f267bcc0 updating rocprofv3 help options (#1113)
* updating rocprofv3 help options

* updating CHANGELOG
2024-10-05 12:40:27 +05:30
Jonathan R. Madsen d5bcb63263 rocprofv3 Kokkos-Tools Support (#1058) 2024-09-12 00:46:07 -05:00
Mythreya 2a146259c7 Add support for RCCL tracing (#1047)
* [Draft]: Add support for RCCL tracing

Address comments

* [Draft]: Add support for RCCL tracing

Address PR comments, changes from RCCL upstream

* Add RCCL library table registration

Working on adding support to rocprofiler-register

* Support compilation w/o <rccl/amd_detail/api_trace.h>

- dummy api_trace.h header
- return ROCPROFILER_STATUS_ERROR_NOT_IMPLEMENTED when RCCL does not have api_trace.h header

* RCCL API tracing tool support

- add to rocprofv3
- add to json-tool

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-09-12 00:42:58 -05:00