Граф коммитов

6 Коммитов

Автор SHA1 Сообщение Дата
Trowbridge, Ian e307b89ca4 rocDecode API Tracing Support (#49)
* rocDecode API Tracing support

* Test bin file added to rocdecode. Need to add validate python methods

* Added option to not make rocDecode tests

* Added rocdecode and rocprofv3 tests

* Added csv test

* Address PR comments. Changed tests to use built-in rocstreambit decoder to remove ffmpeg dependancy. Changed cmake option to disbale tests rather than not build them. Tests work locally, but will fail until rocDecode is built with tracing enabled on CI

* Add option to avoid building rocdecode tests

* Added option to avoid building rocdecode bin file

* Merge conflict error

* CMake files changed in response to review comments. Attempting to implement callbacks.

* Turned off test building for rocdecode

* Minor fixes for review comments

* Review comments

* Updated formatting

* Document changes and format.hpp reversion. Need to remove iterate args support for now for later update.

* Remove iterate args support

* Remove iterate-args

* enforce abi versioning in macro if

* Fix doc error

* removed spaces to fix indentation error

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2025-01-17 14:42:25 -08:00
Jakaraddi, Manjunath 78d8f4b8ea SWDEV-492623: Hip Host Function to Device Symbols Mapping (#18)
* Adding changes to register and read symbols from the hip fat binary

* adding json output for host_functions

* added error handling

* adding json tool support

* Adding tests

* formatting changes

* Adding documentation

* refactoring as per amd-staging

* Adding intializers and changing macros

* Fix page-migration background thread on fork (#31)

* Fix page-migration background thread on fork

After falling off main in the forked child, all the children
try to join on on the parent's monitoring thread. This results
in a deadlock. Parent is waiting for the child to exit, but
the child is trying to join the parent's thread which is
signaled from the parent's static destructors.

Even with just one parent and child, due to copy-on-write
semantics, a child signalling the background thread to join
will still block (thread's updated state is not visible
in the child).

This fix creates background treads on fork per-child with a
pthread_atfork handler, ensuring that each child has its own
monitoring thread.

* Formatting fixes

* Detach page-migration background thread and update test timeout

* Attach files with ctest

* Update corr-id assert

* Tweak on-fork, simplify background thread

* Revert thread detach

* Adding --collection-period feature in rocprofv3 to match v1/v2 parity (#9)

* Adding Trace Period feature to rocprofv3

* Adding feature documentation

* Update source/bin/rocprofv3.py

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fixing format

* Moving to Collection Period and changing the input params

* Format Fixes

* Fixing rebasing issues

* Removing atomic include from the tool

* Adding more options for units, optimizing the code

* Fixing rocprofv3.py

* Fixing time conv & adding time controlled app

* Fixing format

* Changing to shared memory testing methodology

* use of shmem use

* Fix include headers for transpose-time-controlled.cpp

* Format upload-image-to-github.py

* Removing shmem and using only env var to dump timestamps from the tool

* Tool Fixes + Test Config

* Adding Tests

* Fixing Review comments

* Update trace period implementation

* Update trace period tests

* check between start and stop timestamps

* Merge Fix

* Update validate.py

* Improve safety of rocprofiler_stop_context after finalization

* Pass context id to collection_period_cntrl by value

* Adding 20 us error margin

* Ensure log level for collection-period test is not more than warning

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- move error code check macros to implementation
- fix macros which check error code
- use constexpr values instead of #define

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- debugging for error that cannot be locally reproduced

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- improve error handling and logging

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- tweak to non-fatal logging messages

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- cleanup of logging messages

* Update host kernel symbol register data fields

* Update source/lib/rocprofiler-sdk/code_object/hip/code_object.hpp

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Kuricheti, Mythreya <Mythreya.Kuricheti@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-12-06 11:42:37 +00:00
Trowbridge, Ian 79006bb896 SWDEV-492625 memory free functions (#11)
* SWDEV-492625: Track free memory HSA functions to help determine total amount of memory allocated on the system at any one time

* Minor fixes to address comments

* Update allocation size description

* Moved get function back to specialization, minor typo fixes

* Removed memory_operation_type field, removed memory_pool allocation enum, converted starting address to hex string for json format.

* Made conversion to hex_string a function, changed address to use union rocprofiler_address_t type, changed VMEM descriptors

* Removed as_hex from the global namespace

* Formatting

* Removed TRACK_EVENT for memory allocation, now TRACK_COUNTER for memory allocation is being performed

* Check if address was recorded before retrieving allocation size in generate Perfetto

* Formatting

* Update source/lib/output/generatePerfetto.cpp

* Explicitly disable app-abort tests

* Remove excluding app-abort test from workflow CI

- redundant bc these tests are explicitly marked as disabled now

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-12-06 00:05:30 -06:00
Nagaraj, Sriraksha 50b185b9ac rocprofv3: PC Sampling Support (#14)
* Adding tool pc sampling support

Fixing merge issue

tool support on SDKupdates

link amd-comgr

Sanitizer failure fix

fix format

Addressing review comments

misc fix

Adding dispatch id to the CSV output

AddingCHANGELOG

[ROCProfV3][PC Sampling] Initial ROCProfV3 PC sampling tests for JSON and CSV formats (#17)

ROCProfV3 initial tests for JSON and CSV output.

Simple kernels that simplify the verification of samples to instruction decoding
has been introduced.

removing option to enable pc sampling explicitly

Adding documentation

no pc-sampling option in tests anymore

Addressing review comments

Updating docs

an option for choosing whether all units must be sampled

try ignoring PC sampling tests (#36)

* run pc-sampling tests on MI2xx runners
* use v_fmac_f32 instead of s_nop 0 in tests

* fixing docs
2024-12-04 18:32:48 -06:00
itrowbri 3bd7773cf7 Memory Allocation Tracking (#1142)
* Initial commit: Need to implement wrapper function to collect data and test that wrapper function is correctly replacing core HSA functions

* Attempted to implement wrapper implementation for hsa memory allocation functions. Need to modify generate record files and test if implementation is working as expected

* Debugging and implementing generateCSV function

* Memory allocation size and starting address outputted to csv and json file formats

* Formatting

* Initial setup for OTF2 and Perfetto generation

* Collecting agent id for memory_allocation and formatting

* Modified memory_allocation.cpp to set up code for AMD_EXT commands

* Support for memory_pool_allocate added

* Removed accidently added file

* Made flag optional and added more OTF2 and Perfetto code. Needs testing to ensure perfetto and OTF2 works

* Formatting

* Fixed perfetto and otf2 output

* Fixed flag issue due to incorrect buffer use

* Updated documentation

* Small cleaning and comments

* Added test for HSA memory allocation tracing

* Fixed summary test validation errors due to allocation tracing. Added type to location_base to create unique event ids for allocation due to OTF2 trace error

* Decreased lower limit of hip calls for test

* Modified summary tests to vary number of allocate requests

* Minor fixes to address comments. Still need to address OTF2 comments

* Fix docs and changed OTF2 to use enum for type specified in location_base construction

* Fixed schema error

* Added vmem command tracking. Need to add test

* Updated test to work with vmem command and updated generateCSV to output int instead of hex string.

* OTF2 enum update and mispelling fix

* CI does not support Virtual Memory API. Removed vmem test. Will add back if CI is modifed to suport vmem API

* Update CMakeLists.txt for memory allocation test

* Updated summary test

* Minor fixes to address comments

* Moved domain_type.hpp enum to before LAST

* Fixed compile errors and formatting

* Fixed stats summary domain name error

* Added rocprofv3 test

* Page migration test fix

* Undo page migration test changes. Failures do not appear to have to do with memory allocation
2024-11-18 20:22:14 -06:00
Jonathan R. Madsen b4dc4f8e92 JSON schema + JSON node vs. Perfetto category consistency (#869)
* Fix schema differences b/t JSON and perfetto

- "kernel_dispatches" (JSON) vs. "kernel_dispatch" (Perfetto) -> "kernel_dispatch"
- "scratch_api" (JSON) -> "scratch_memory"

* Update include/rocprofiler-sdk/cxx/serialization.hpp

- remove unnecessary includes causing circular dependency

* rocprofv3 docs

- Output formats
- JSON schema

* Spelling fix

* Update schema descriptions

* Improve assert log in test_perfetto_data
2024-05-22 20:36:27 -05:00