Graphe des révisions

88 Révisions

Auteur SHA1 Message Date
Rawat, Swati 31b8f61c8e Documentation updates (#236)
* Documentation updates

* formatting

* Update using-rocprofv3.rst

* Update counter_collection_services.md

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
2025-02-28 10:10:26 +05:30
Trowbridge, Ian 31fe8858d1 rocJPEG API Tracing (#73)
* rocDecode API Tracing support

* Test bin file added to rocdecode. Need to add validate python methods

* Added option to not make rocDecode tests

* Added rocdecode and rocprofv3 tests

* Added csv test

* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependancy. Changed cmake option to disbale tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI

* Add option to avoid building rocdecode tests

* Added option to avoid building rocdecode bin file

* Support for rocJPEG API Trace

* Added newline to rocjpeg_version.h

* json-tool code added, initial test/bin commit

* Formatting

* Resolved rocjpeg bin test compilation errors

* Tests implemented. Perfetto module currently resulting in errors, so need to retest whenever it is fixed

* Formatting and compilation errors

* Minor fixes

* Copyright year update and minor fixes

* Doc update fix

* Added rocjpeg csv file in data

* Addresses review comments: Updated fixed Findroc.. and uses root directory as a hint, fixed documentation error, changed tables to use _CORE, minor style fixes

* Added rocdecode and rocjpeg to CI

* Removed rocdecode and rocjpeg from CI and added back build tests option

* Updated Cmake Files

* Added rocDecode and rocJPEG to CI

* Remove cmake line added in error

* Temporarily modified tests to pass if rocdecode or rocjpeg tracing are not supported for CI, cmake changes

* Added find_package for test

* Added back use of system rocDecode and rocJPEG, modifies system files to include prefix path

* Updated no-link to include INCLUDE_DIR/roc(decode|jpeg), added comments for tests

* Resolve merge conflicts and formatting

* Added regex find and replace instead of include for CI

* VAAPI package causing errors on Vega20

* Removed system rocjpeg and rocdecode use temporarily until cmake issues resolved

* Removed workflows regex

* Formatting and minor test modification

* Modified test for vega20

* Update rocDecode and rocJPEG cmake and tests

* Changelog

* Fix merge conflict

* Added back if-statements around add-tests since cmake-generator-expressions are resulting in errors when the packages are missing

* Removed if found statements, replaced with TARGET:EXISTS

* Skip json file for rocjpeg and rocdecode tests if not supported

* Add os import

---------

Co-authored-by: Kandula, Venkateshwar reddy <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-21 13:43:49 -08:00
Bhardwaj, Gopesh 848242eb5c SWDEV-514046 documentation build fix (#208) 2025-02-13 09:25:30 -06:00
Kandula, Venkateshwar reddy 6427fbafc2 Accum_vgpr support in Rocprofv3 (#70)
* output accumulate vgpr count

* fix logic for computing accum_vgpr

* add accum_vgpr to csv.

* accumulation vgpr's docs and support for rocprofv3

* CHANGELOG.md

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
2025-02-12 10:47:46 -08:00
Bhardwaj, Gopesh 075d36eb82 output format envs doc update (#173) 2025-02-11 21:37:12 -06:00
Madsen, Jonathan 6246ec4040 SDK: Agent UUIDs, agent runtime visibility, kernel symbol address (#154)
* [DO NOT MERGE] Misc UUID updates

- this is WIP

* Agent visibility

- Support for ROCR_VISIBLE_DEVICES, HIP_VISIBLE_DEVICES, CUDA_VISIBLE_DEVICES, GPU_DEVICE_ORDINAL

* Update CHANGELOG

* tweak to rocprofiler_agent_runtime_visiblity_t

* Code object kernel address

- new fields in code_object_kernel_symbol_register_data_t
  - kernel_code_entry_byte_offset
  - kernel_address

* Support ROCR_VISIBLE_DEVICES reordering devices for HIP

* Addressed code review changes

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-02-11 14:36:23 -06:00
Bhardwaj, Gopesh cdf22eba7d Adding pc sampling how to guide (#160)
* Adding pc sampling how to guide

* doc update

* Fixing indentation

* updating index

* udpating doc

* updating doc

* Added field information

* Fixing Formatting

* fix formatting error

* Added json format for pc sampling

* feedback resolved

* formatting for text

* PC Sampling API doc

* Reformatted

* Note for shared systems

* update docs

* correcting relative path for cross-referencing

---------

Co-authored-by: vlaindic_amdeng <vladimir.indic@amd.com>
2025-02-10 20:33:05 -06:00
Bhardwaj, Gopesh 7821657d65 SWDEV-510794 Adding MPI usage with rocprofv3 (#183)
* swdev-510794 Adding MPI usage with rocprofv3

* update doc

* Fixed build issues

* updating doc

* doc update

* Fixed Typos

* csv format

* change format to shell
2025-02-07 12:01:31 +05:30
Bhardwaj, Gopesh 12508b9521 Adding ROCTx usage doc (#159)
* Adding Roctx usage doc

* updated CHANGELOG

* dpc update

* Fixing Related Pages issue

* updating doc

* updating docs

* Adding Resource naming section

* Fixed Formatting

* format fix

* format fix

* Fixing build due to incorrect indentation
2025-02-05 11:04:24 -06:00
Nagaraj, Sriraksha d4a51e4102 Adding att v3 support (#84)
* Adding att v3 support

* misc fix

* bug fix

* Python linting workflow and rules

* fix regex

* Adding temporary args

* fix temporary args

* fix format

* remove att_perfcounters from test input

* Review comments (#163)

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

* Revert "Review comments (#163)"

This reverts commit 9ef0f8e5a4489d5581255e1b70ced2aef5c1c1d0.

* Address review comments 2

* review changes

* review comments

* review

* cmake alias

* review

* review

* review

* review

* Enabling percounter in v3 script

* review

* formatting

* formatting

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
Co-authored-by: Baraldi, Giovanni <Giovanni.Baraldi@amd.com>
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>
2025-02-04 04:05:38 -06:00
Rawat, Swati 97b7a6315d update copyright date to 2025 (#102)
* Update LICENSE

* Update conf.py

* Update copyright year

* [fix] Update copyright year

* Update copyright year "ROCm Developer Tools"

* Add license headers to c++ files

* Add license to *.py

* Update licenses in rocdecode sources

---------

Co-authored-by: srawat <120587655+SwRaw@users.noreply.github.com>
Co-authored-by: Mythreya <mythreya.kuricheti@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-22 19:11:20 -06:00
Bhardwaj, Gopesh 73e7f8cfb1 ROCTx Documentation (#29)
* Add roctx doc

* Add roctx doxyfile input

* Update links and toc

* Build doxysphinx for both doxygen files

* Update scripts

* Generate roctx doxygen files

* Change doxygen path

to allow for 2 doxyfiles

* Make doxygen dir for script

* Call make _doxygen dir with p flag

* Create _doxygen dir in workfllow

* Create doc dirs for doxygen

* Run update docs as sudo

* Fix typo in mkdir command

* Include graphviz for dot

* Install dot for docs CI

* Install dot as sudo due to permission denied

* Install doxygen via sudo

* Install doxysphinx

* Add postcheckout step to RTD to config and gen doxygen docs

* On RTD, update doxygen after creating env

* update docs.yml

* update docs.yml

* fixing build-docs-from-source

* Fixing build docs from source

* update docs.yml

* trying to fix readthedocs

* trying to fix readthedocs

* update docs.yml

* improve mainpage documentation

* update docs

* clang-format fix

---------

Co-authored-by: Sam Wu <22262939+samjwu@users.noreply.github.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2025-01-22 11:34:21 -06:00
Trowbridge, Ian 73e72bb088 Documentation Update to reflect that memory allocation trace records null pointers for free operations (#127)
Update documentation to reflect that nullpointers can be recorded in free memory operations
2025-01-22 11:20:50 -06:00
Madsen, Jonathan 89cfb5317d Update docs jinja requirements (#118)
- Jinja < 3.1.5 has a sandbox breakout through malicious filenames
- Jinja < 3.1.5 has a sandbox breakout through indirect reference to format method

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-01-22 11:18:19 -06:00
Trowbridge, Ian e307b89ca4 rocDecode API Tracing Support (#49)
* rocDecode API Tracing support

* Test bin file added to rocdecode. Need to add validate python methods

* Added option to not make rocDecode tests

* Added rocdecode and rocprofv3 tests

* Added csv test

* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependancy. Changed cmake option to disbale tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI

* Add option to avoid building rocdecode tests

* Added option to avoid building rocdecode bin file

* Merge conflict error

* CMake files changed in response to review comments. Attempting to implement callbacks.

* Turned off test building for rocdecode

* Minor fixes for review comments

* Review comments

* Updated formatting

* Document changes and format.hpp reversion. Need to remove iterate args support for now for later update.

* Remove iterate args support

* Remove iterate-args

* enforce abi versioning in macro if

* Fix doc error

* removed spaces to fix indentation error

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2025-01-17 14:42:25 -08:00
Jakaraddi, Manjunath dfee6489b1 SWDEV-500520: Updated documentation for hang issue (#79)
* SWDEV-500520: Updated documentation for hang issue

* Avoid fatal error when invalid metric is found

* removing invalid metrics

* clang formatting
2025-01-16 02:14:22 -08:00
Bhardwaj, Gopesh 81f600e3ba miscellaneous doc updates (#86)
* miscellaneous doc updates

* updated deprecartion message

* Updated memory allocation tracking documentation

* Update comparing-with-legacy-tools.rst

Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>

* Update comparing-with-legacy-tools.rst

Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>

* Update comparing-with-legacy-tools.rst

---------

Co-authored-by: Ian Trowbridge <ian.trowbridge@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
2025-01-14 11:17:45 -06:00
Bhardwaj, Gopesh 3ee06ed747 gobhardw/docs logging (#10)
* reducing docs logging

* Addressing review comments

* exclude dirs

* maximize NUM_PROC_THREADS

* parallel build
2024-12-10 14:15:59 +05:30
Nagaraj, Sriraksha c509fe799d Updating rocprofv3 doc for pc sampling beta option (#59)
* Updating rocprofv3 doc for pc sampling beta option

* Update source/docs/rocprofv3_input_schema.json

* Update using-rocprofv3.rst

---------

Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
2024-12-06 17:41:28 -06:00
Jakaraddi, Manjunath 78d8f4b8ea SWDEV-492623: Hip Host Function to Device Symbols Mapping (#18)
* Adding changes to register and read symbols from the hip fat binary

* adding json output for host_functions

* added error handling

* adding json tool support

* Adding tests

* formatting changes

* Adding documentation

* refactoring as per amd-staging

* Adding intializers and changing macros

* Fix page-migration background thread on fork (#31)

* Fix page-migration background thread on fork

After falling off main in the forked child, all the children
try to join on on the parent's monitoring thread. This results
in a deadlock. Parent is waiting for the child to exit, but
the child is trying to join the parent's thread which is
signaled from the parent's static destructors.

Even with just one parent and child, due to copy-on-write
semantics, a child signalling the background thread to join
will still block (thread's updated state is not visible
in the child).

This fix creates background treads on fork per-child with a
pthread_atfork handler, ensuring that each child has its own
monitoring thread.

* Formatting fixes

* Detach page-migration background thread and update test timeout

* Attach files with ctest

* Update corr-id assert

* Tweak on-fork, simplify background thread

* Revert thread detach

* Adding --collection-period feature in rocprofv3 to match v1/v2 parity (#9)

* Adding Trace Period feature to rocprofv3

* Adding feature documentation

* Update source/bin/rocprofv3.py

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fixing format

* Moving to Collection Period and changing the input params

* Format Fixes

* Fixing rebasing issues

* Removing atomic include from the tool

* Adding more options for units, optimizing the code

* Fixing rocprofv3.py

* Fixing time conv & adding time controlled app

* Fixing format

* Changing to shared memory testing methodology

* use of shmem use

* Fix include headers for transpose-time-controlled.cpp

* Format upload-image-to-github.py

* Removing shmem and using only env var to dump timestamps from the tool

* Tool Fixes + Test Config

* Adding Tests

* Fixing Review comments

* Update trace period implementation

* Update trace period tests

* check between start and stop timestamps

* Merge Fix

* Update validate.py

* Improve safety of rocprofiler_stop_context after finalization

* Pass context id to collection_period_cntrl by value

* Adding 20 us error margin

* Ensure log level for collection-period test is not more than warning

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- move error code check macros to implementation
- fix macros which check error code
- use constexpr values instead of #define

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- debugging for error that cannot be locally reproduced

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- improve error handling and logging

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- tweak to non-fatal logging messages

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- cleanup of logging messages

* Update host kernel symbol register data fields

* Update source/lib/rocprofiler-sdk/code_object/hip/code_object.hpp

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Kuricheti, Mythreya <Mythreya.Kuricheti@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-12-06 11:42:37 +00:00
Trowbridge, Ian 79006bb896 SWDEV-492625 memory free functions (#11)
* SWDEV-492625: Track free memory HSA functions to help determine total amount of memory allocated on the system at any one time

* Minor fixes to address comments

* Update allocation size description

* Moved get function back to specialization, minor typo fixes

* Removed memory_operation_type field, removed memory_pool allocation enum, converted starting address to hex string for json format.

* Made conversion to hex_string a function, changed address to use union rocprofiler_address_t type, changed VMEM descriptors

* Removed as_hex from the global namespace

* Formatting

* Removed TRACK_EVENT for memory allocation, now TRACK_COUNTER for memory allocation is being performed

* Check if address was recorded before retrieving allocation size in generate Perfetto

* Formatting

* Update source/lib/output/generatePerfetto.cpp

* Explicitly disable app-abort tests

* Remove excluding app-abort test from workflow CI

- redundant bc these tests are explicitly marked as disabled now

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-12-06 00:05:30 -06:00
Elwazir, Ammar a579c70b71 Adding --collection-period feature in rocprofv3 to match v1/v2 parity (#9)
* Adding Trace Period feature to rocprofv3

* Adding feature documentation

* Update source/bin/rocprofv3.py

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fixing format

* Moving to Collection Period and changing the input params

* Format Fixes

* Fixing rebasing issues

* Removing atomic include from the tool

* Adding more options for units, optimizing the code

* Fixing rocprofv3.py

* Fixing time conv & adding time controlled app

* Fixing format

* Changing to shared memory testing methodology

* use of shmem use

* Fix include headers for transpose-time-controlled.cpp

* Format upload-image-to-github.py

* Removing shmem and using only env var to dump timestamps from the tool

* Tool Fixes + Test Config

* Adding Tests

* Fixing Review comments

* Update trace period implementation

* Update trace period tests

* check between start and stop timestamps

* Merge Fix

* Update validate.py

* Improve safety of rocprofiler_stop_context after finalization

* Pass context id to collection_period_cntrl by value

* Adding 20 us error margin

* Ensure log level for collection-period test is not more than warning

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-12-06 02:17:24 +00:00
Bhardwaj, Gopesh 6d2e70d8da updating roctx documentation for functions (#30)
updating roctx documentation for funcitons
2024-12-05 19:47:57 +05:30
Nagaraj, Sriraksha c42bdc3128 rocprofv3: rocprofv3-avail tool (#15)
* support avail tool

Updating avail library and script

Listing on Std output incase the output folder is not given

Extending list metrics test

misc fix

misc fix

fixing memory leak

changing list-metrics to list-avail

fixing formatting issue

Fixing CMakeLists

Add test for list avil with trace

Fix test fail

clang tidy errors fixed

Removing build commands for rocprofv3-trigger-list

Addressing review changes

addressing review comment

moving avail to libexec

merge fix

Fix test failures

updating doc

Fix doc error

* updating legacy doc

* fix formatting issue

* Addressing review comments
2024-12-04 18:34:10 -06:00
Nagaraj, Sriraksha 50b185b9ac rocprofv3: PC Sampling Support (#14)
* Adding tool pc sampling support

Fixing merge issue

tool support on SDKupdates

link amd-comgr

Sanitizer failure fix

fix format

Addressing review comments

misc fix

Adding dispatch id to the CSV output

AddingCHANGELOG

[ROCProfV3][PC Sampling] Initial ROCProfV3 PC sampling tests for JSON and CSV formats (#17)

ROCProfV3 initial tests for JSON and CSV output.

Simple kernels that simplify the verification of samples to instruction decoding
has been introduced.

removing option to enable pc sampling explicitly

Adding documentation

no pc-sampling option in tests anymore

Addressing review comments

Updating docs

an option for choosing whether all units must be sampled

try ignoring PC sampling tests (#36)

* run pc-sampling tests on MI2xx runners
* use v_fmac_f32 instead of s_nop 0 in tests

* fixing docs
2024-12-04 18:32:48 -06:00
Benjamin Welton 7ddc72ad45 Add rocprofiler_load_counter_definition (#1193)
Adds rocprofiler_load_counter_definition. This function allows a counter definition file to be supplied to rocprofiler-sdk directly. Takes in a string containing the counter definition YAML, its size (in bytes), and a flag value to state whether this is an append operation or not.

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: usrihari123 <srihari.u@amd.com>
2024-11-22 01:55:47 -08:00
Gopesh Bhardwaj 7ea9ced493 SDK doc updates (#1183)
* correcting usage example

* rccl trace

* Adding Navi power state limitation

* Addressed feedback

* kernel-rename

* kokkos trace

* more information on kookos tracing

* Corecting tool library hardcoding

* summary domains

* Updating domain stats file

* updating images

* rocprofv3 default behavior update

* Removing README from API documentation

* Added missing description in Topics

* Fixed wrong rendering of README in API document

* Fixing Topics in API docs

* Removing API doc for details/rccl.h

* Addressed review comments
2024-11-22 12:05:11 +05:30
itrowbri 3bd7773cf7 Memory Allocation Tracking (#1142)
* Initial commit: Need to implement wrapper function to collect data and test that wrapper function is correctly replacing core HSA functions

* Attempted to implement wrapper implementation for hsa memory allocation functions. Need to modify generate record files and test if implementation is working as expected

* Debugging and implementing generateCSV function

* Memory allocation size and starting address outputted to csv and json file formats

* Formatting

* Initial setup for OTF2 and Perfetto generation

* Collecting agent id for memory_allocation and formatting

* Modified memory_allocation.cpp to set up code for AMD_EXT commands

* Support for memory_pool_allocate added

* Removed accidently added file

* Made flag optional and added more OTF2 and Perfetto code. Needs testing to ensure perfetto and OTF2 works

* Formatting

* Fixed perfetto and otf2 output

* Fixed flag issue due to incorrect buffer use

* Updated documentation

* Small cleaning and comments

* Added test for HSA memory allocation tracing

* Fixed summary test validation errors due to allocation tracing. Added type to location_base to create unique event ids for allocation due to OTF2 trace error

* Decreased lower limit of hip calls for test

* Modified summary tests to vary number of allocate requests

* Minor fixes to address comments. Still need to address OTF2 comments

* Fix docs and changed OTF2 to use enum for type specified in location_base construction

* Fixed schema error

* Added vmem command tracking. Need to add test

* Updated test to work with vmem command and updated generateCSV to output int instead of hex string.

* OTF2 enum update and mispelling fix

* CI does not support Virtual Memory API. Removed vmem test. Will add back if CI is modifed to suport vmem API

* Update CMakeLists.txt for memory allocation test

* Updated summary test

* Minor fixes to address comments

* Moved domain_type.hpp enum to before LAST

* Fixed compile errors and formatting

* Fixed stats summary domain name error

* Added rocprofv3 test

* Page migration test fix

* Undo page migration test changes. Failures do not appear to have to do with memory allocation
2024-11-18 20:22:14 -06:00
venkat1361 472907a576 Dimension support for reduce operator (#1147)
* cache reference nodes

* evaluation based on dim args

* format

* add dimensions for reduce operator

* add dimensions for reduce operator

* add dimensions for reduce operator docs

* add dimensions for reduce operator.

* refactor switch cases

* Update CHANGELOG.md

* updated doc with data example

* updated doc with data example for reduce operation.

* added fallthrough in switch case sum.

* changelog.md

* format

* fix bug in constuct_test_data()
2024-11-11 18:37:28 +05:30
venkat1361 cc4811d27d SWDEV-477244: Select() Expression Dimension Support (#1091)
* add support for select function in derived counters

* formatting

* renaming select dims variable name from set to map

* format

* Update doc with select() for dimensions

* use : for defining range of values in select dims

* - update dimension for metric after select.
- make sure to raise runtime error if user provides range for a dimension.

* use map instead of unordered_map for select dim info

* new line EOF

* fix bug: select() operator.

* Update evaluate_ast.cpp

format

* added a check for dim value exceeds max.

* Update source/lib/rocprofiler-sdk/counters/evaluate_ast.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update source/lib/rocprofiler-sdk/counters/evaluate_ast.cpp

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* updated doc with data example for select operation.

* changelog.md

* Update CHANGELOG.md

---------

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2024-11-11 15:14:22 +05:30
Benjamin Welton 6564419357 Update comparing-with-legacy-tools.rst (#1187) 2024-11-06 08:56:32 -08:00
Benjamin Welton c491a5bc34 Timing documentation Update (#1168)
* Timing documentation Update

Documentation update for timing differences. Needs additional review from Joe Greathouse before landing.

* Update comparing-with-legacy-tools.rst
2024-11-06 09:28:41 -06:00
Benjamin Welton dc6a568ec5 Kernel Serialization Documentation (#1167)
* Kernel Serialization Documentation

Added docs for kernel serialization.

* Update counter_collection_services.md

* correcting counter collection mode names

* correcting counter collection modes naming

---------

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
2024-11-05 09:00:11 -06:00
srawat 4204042ac6 Refactor API reference docs (#1125)
* Refactor API reference docs

* refactor API ref docs

* corrections

* consistent naming

* updates

* Update CHANGELOG.md

* improving SEO

* improving SEO

* Update using-rocprofv3.rst

* Update counter_collection_services.md

* Update using-rocprofv3.rst

* Fixing doc build errors

* changelogs and some formatting issues

---------

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
2024-10-30 19:39:08 +05:30
Manjunath P Jakaraddi f087debe84 SWDEV-493402: Changing the header format for counter collection (#1148)
* SWDEV-493402: Changing the header format for counter collection

* Adding column widths
2024-10-29 10:11:29 +05:30
Jonathan R. Madsen d93d4d9463 rocprofv3: support specifying HW counters via command line (#1130)
* rocprofv3: support specifying PMC counters via command line

- E.g. `rocprofv3 --pmc SQ_WAVES -- <app>`

* Update CHANGELOG

* updated rocprofv3 help and documentation

---------

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
2024-10-25 02:49:30 -05:00
Sam Wu dd71131114 Documentation - Bump rocm-docs-core to 1.8.2 (#1116)
* Bump rocm-docs-core to 1.8.2

* changing certi back to previous version

---------

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
2024-10-23 17:24:12 +05:30
venkat1361 3f91d90bbc Check to force tools to initialize the ctx id to zero. (#1135)
* Check to force tool to initialize the ctx id to zero.

* initialize rocprofiler_context_id_t with 0 in units tests

* changelog

---------

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
2024-10-22 18:09:25 +05:30
Benjamin Welton bb69467765 Renamed agent profiling service to device counting service (#1132)
* Renamed agent profiling service to device counting service

Name more aptly represents what agent profiling did (device wide
counter collection). Conversion of existing user code can be
performed by the following find/sed command:

find . -type f -exec sed -i 's/rocprofiler_agent_profile_callback_t/rocprofiler_device_counting_service_callback_t/g; s/rocprofiler_configure_agent_profile_counting_service/rocprofiler_configure_device_counting_service/g; s/agent_profile.h/device_counting_service.h/g; s/rocprofiler_sample_agent_profile_counting_service/rocprofiler_sample_device_counting_service/g' {} +

* Converted dispatch profile to dispatch counting service

* Debug for functioal counters test

* Minor changes for CI

* Minor fix

* More fixes for CI

* Update evaluate_ast.cpp

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
2024-10-18 14:14:11 +05:30
Gopesh Bhardwaj 320427b5f5 rocprofv3: docs and help menu updates (#1129)
* doc updates

* Correcting ROCtx information

* Making ROCTx string consistent

* missing occurence
2024-10-17 13:28:53 +05:30
Manjunath P Jakaraddi 673c21c6db Adding start and end timestamp columns in csv (#1128)
* Adding start and end timestamp columns in csv

* Adding assert check for the counter timestamps

---------

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>
2024-10-17 11:31:00 +05:30
Gopesh Bhardwaj 81e93a6ddb mem copy direction field update (#1124) 2024-10-08 15:15:48 +05:30
Jeffrey Novotny 78fe20320f Correct link to rocprofiler-sdk GitHub (#1097) 2024-09-27 21:03:49 -06:00
Gopesh Bhardwaj 863b608ae3 comparing tool options in rocprof/rocprofv2/rocprofv3 (#1050)
* comparing tool options in rocprof/rocprofv2/rocprofv3

* Added more perfetto options

* Added new summary optipons

* Added Category for all options

* Apply suggestions from code review

Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>

* changes requested in SWDEV-484472

* Correct broken table formatting

* detailed explanation on json custom format

* Feedback Resolution

---------

Co-authored-by: Jonathan R. Madsen <jrmadsen@users.noreply.github.com>
Co-authored-by: Ammar ELWazir <ammar.elwazir@amd.com>
2024-09-16 20:08:11 +05:30
Gopesh Bhardwaj df939cbe96 SWDEV-48112 Counter header and OTF2 updates (#1042) 2024-09-11 08:15:03 -05:00
Gopesh Bhardwaj 0e7f9444e8 Update git clone URL + few documentation fixes (#1066) 2024-09-11 15:09:21 +05:30
dependabot[bot] 7a9af10833 Bump cryptography from 42.0.8 to 43.0.1 in /source/docs/sphinx (#1049)
Bumps [cryptography](https://github.com/pyca/cryptography) from 42.0.8 to 43.0.1.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/42.0.8...43.0.1)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-09-09 12:31:08 -05:00
Jonathan R. Madsen 395f01b689 rocprofv3: summary reports + more JSON metadata (#1029)
* Move include/rocprofiler-sdk/cxx/details/delimit.hpp to tokenize.hpp

* Update docs/how-to/using-rocprofv3.rst

- fix code block indents
- reorder rocprofv3 options, limit them to important options
- add docs for `--runtime-trace`

* Update rocprofv3.py

- parser argument groups
- new `--runtime-trace` option
- new `--summary` option
- new `--summary-per-domain` option
- new `--summary-groups` option
- new `--summary-output-file` option
- new `--summary-units` option

* Update lib/rocprofiler-sdk/hsa/async_copy.cpp

- fix async copy operation names: add "MEMORY_COPY_" prefix

* lib/rocprofiler-sdk-tool: update statistics.{hpp,cpp}

- statistics<>::get_percent function
- stats_entry_t struct
- stats_formatter struct
- percentage struct
- std::to_string(::rocprofiler::tool::percentage)

* lib/rocprofiler-sdk-tool: update domain_type.{hpp,cpp}

- reorder domain_type enum values

* lib/rocprofiler-sdk-tool: update generateCSV.{hpp,cpp}

- separate writing CSV from accumulating statistics
- a lot of functionality was moved to statistics.{hpp,cpp}

* lib/rocprofiler-sdk-tool: update output_file.{hpp,cpp}

- output_stream_t struct
- get_output_stream(...) returns output_stream_t instance

* lib/rocprofiler-sdk-tool: update generateJSON.cpp

- update get_output_stream usage to output_stream_t

* lib/rocprofiler-sdk-tool: update generateOTF2.cpp

- header include order tweak

* lib/rocprofiler-sdk-tool: update buffered_output.hpp

- stats_data_t was renamed to stats_entry_t

* lib/rocprofiler-sdk-tool: update generatePerfetto.cpp

- header include tweak

* lib/rocprofiler-sdk-tool: update tmp_file_buffer.hpp

- emit warning message if write_ring_buffer fails after offloading instead of aborting
- prefer placement new instead of assignment in write_ring_buffer

* lib/rocprofiler-sdk-tool: add generateStats.{hpp,cpp}

- functions for accumulating statistics

* Update tests/rocprofv3/tracing-hip-in-libraries/CMakeLists.txt

- accommodate tweak to CSV output file name for HIP and HSA traces

* lib/rocprofiler-sdk-tool: update config.{hpp,cpp}

- new config variables
  - stats_summary
  - stats_summary_per_domain
  - summary_output
  - stats_summary_unit_value
  - stats_summary_unit
  - stats_summary_file
  - stats_summary_groups
- support output keys for hostname: %hostname% / %h

* lib/rocprofiler-sdk-tool: update tool.cpp

- support summary output

* Documentation fixes

* Test for summary output

* Update tests/bin/transpose to use more ROCTx

- also support building with the roctracer ROCTx

* Remove roctxMark from OTF2 + fix kernel-rename tests

- following more ROCTx calls in transpose, kernel-rename validation had to be updated

* JSON metadata + JSON summary

- add serialization support for config
- add serialization support for statistics
- additions to json spec
  - rocprofiler-sdk-tool/metadata/config
  - rocprofiler-sdk-tool/metadata/command
  - rocprofiler-sdk-tool/summary
- config output_keys support for NVIDIA %q{<ENV-VAR>} syntax
- config output_keys support keys within keys

* rocprofv3 --summary-groups warning if no domain matches

- emit warning if a regex in for summary groups did not match any domain names

* Compile fix for lib/rocprofiler-sdk-tool/tool.cpp

- get_config().scratch_memory_trace
- pass contributions to write_json

* Update rocprofv3.py to preload rocprofiler-sdk-roctx

- appended to LD_PRELOAD when args.marker_trace is enabled

* Fix ReST link errors about subtitle underline being too short

* Patch tokenization of config::stats_summary_groups

- guard against array values of empty strings

* Tweak rocprofv3 summary test

- input-summary.yaml (used by rocprofv3-test-summary-inp-yaml-execute) only provides one summary group regex

* Disable LD_PRELOAD of librocprofiler-sdk-roctx.so

- this causes problems in the sanitizers, will be addressed in another PR
2024-09-09 11:20:55 -05:00
Gopesh Bhardwaj 5bec9f66b6 gobhardw/swdev 479013 (#1014)
* test pr for ROCm sdk tests

* SWDEV-479013 doc updates
2024-08-13 23:35:12 +05:30
SrirakshaNag 403ab6efb1 fixing rocprofv3 doc (#1007) 2024-08-06 12:39:09 -05:00