Commit graph

554 Commits

Author SHA1 Message Date
Mallya, Ameya Keshava 8641afe3fe Changed branch to mainline
[ROCm/rocprofiler-sdk commit: 720763daac]
2025-01-15 11:08:20 -08:00
Elwazir, Ammar 0611359850 rocprofv3: fix collection period unit handling (#103)
* Fixing Collection Period

* Fixing default value for collection period unit

* Formatting

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>

[ROCm/rocprofiler-sdk commit: 94474de480]
2025-01-14 11:19:16 -06:00
Bhardwaj, Gopesh a4aa2fd14e miscellaneous doc updates (#86)
* miscellaneous doc updates

* updated deprecation message

* Updated memory allocation tracking documentation

* Update comparing-with-legacy-tools.rst

Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>

* Update comparing-with-legacy-tools.rst

Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>

* Update comparing-with-legacy-tools.rst

---------

Co-authored-by: Ian Trowbridge <ian.trowbridge@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>

[ROCm/rocprofiler-sdk commit: 81f600e3ba]
2025-01-14 11:17:45 -06:00
Mallya, Ameya Keshava 1457d95d9f Added necessary variable for commenting
[ROCm/rocprofiler-sdk commit: 70bcd05fd4]
2025-01-13 20:26:23 -08:00
Mallya, Ameya Keshava 5ed2e2f54c Potential fix for KWS
[ROCm/rocprofiler-sdk commit: c7bf11e080]
2025-01-13 19:37:58 -08:00
Baraldi, Giovanni 8abb65b166 Adding source snapshot and partial serialization (#99)
* Adding source snapshot

* Adding option to serialize only on target kernel

* Fix for tidy

* Formatting

* Testing the new flag

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

[ROCm/rocprofiler-sdk commit: a2fa188e14]
2025-01-10 15:43:06 -08:00
Baraldi, Giovanni e226e2a11a SWDEV-508485: Adding MFMA F8 metric (#112)
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

[ROCm/rocprofiler-sdk commit: 27266eb242]
2025-01-09 15:27:16 -06:00
Galantsev, Dmitrii 207cb06783 OTF2 - Fix lib vs lib64 location on some systems (#68)
On my dev machine I use OpenSUSE Tumbleweed. For some reason OTF2 gets
installed into BUILD/external/otf2/lib64/, while cmake for the lib
searches BUILD/external/otf2/lib and cannot find it.

My fix sets the location to always match CMAKE_INSTALL_LIBDIR

Signed-off-by: Galantsev, Dmitrii <dmitrii.galantsev@amd.com>

[ROCm/rocprofiler-sdk commit: 271f017251]
2025-01-08 22:47:44 -06:00
Madsen, Jonathan 247ba0afa1 Download perfetto trace_processor_shell (#105)
* Download perfetto trace_processor_shell

* Upgrade to perfetto-trace-processor-shell v0.0.4

* Fix run-ci.py warning

- warning message:

CMake Warning (dev) at /.../build/CTestCustom.cmake:16:
  Syntax Warning in cmake code at column 77
  Argument not separated from preceding token by whitespace.

* Update tests/pytest-packages/pytest_utils/perfetto_reader.py

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 2c3bdeaed9]
2025-01-08 20:32:48 -06:00
Choudhary, Rahul 06ac2cdc68 Delete .github/workflows/force-sync.yml
[ROCm/rocprofiler-sdk commit: 67e00e63b8]
2025-01-07 10:29:31 -08:00
Baraldi, Giovanni 1a90147c48 SWDEV-490031: Adding new rdc ops metrics (#96)
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

[ROCm/rocprofiler-sdk commit: fddd8ac4aa]
2025-01-06 11:02:23 +00:00
Indic, Vladimir 5c0235ccc4 ROCProfV3 PC sampling tests: Initial multi-agents test (#72)
Testing multi-agent host-trap PC sampling support in ROCProfV3.

[ROCm/rocprofiler-sdk commit: 00d4c179c6]
2025-01-04 02:35:16 +01:00
Bhardwaj, Gopesh 93421ab066 update PR template (#90)
[ROCm/rocprofiler-sdk commit: 1ca289699e]
2024-12-26 11:24:58 +05:30
Indic, Vladimir c79b0a2eb6 Enable PC sampling on MI300A CI runners (#88)
Co-authored-by: Bhardwaj, Gopesh <Gopesh.Bhardwaj@amd.com>

[ROCm/rocprofiler-sdk commit: abf7280aae]
2024-12-25 10:39:22 +01:00
Bhardwaj, Gopesh d75abf3ae2 Disabling counter-collection-buffer test for MI325 (#89)
Disabling counter-collection-buffer for MI325

[ROCm/rocprofiler-sdk commit: 0b13a14014]
2024-12-25 10:44:21 +05:30
Baraldi, Giovanni d93586b57a SWDEV-492607: Fix for bvh (#87)
Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

[ROCm/rocprofiler-sdk commit: 0a8c31842d]
2024-12-24 15:47:13 +01:00
Indic, Vladimir a2f431fdd4 Renaming ROCProfV3 host-trap exec-mask-manipulation tests (#76)
Renaming ROCProfV3 host-trap exec-mask-manipulation tests

[ROCm/rocprofiler-sdk commit: 2d2430b94a]
2024-12-23 18:43:06 +01:00
Nagaraj, Sriraksha 9e379fe2fb fix abort-app CI fail (#39)
* fix abort-app CI fail

* Update source/lib/rocprofiler-sdk-tool/tool.cpp

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

[ROCm/rocprofiler-sdk commit: 202853d579]
2024-12-22 14:17:48 -06:00
Nagaraj, Sriraksha e5dde1b230 fix dimensions in avail output (#82)
* fix dimensions in avail output

* review comment addressed

[ROCm/rocprofiler-sdk commit: 554537f140]
2024-12-20 13:26:02 -06:00
Indic, Vladimir 47a66adf1f ROCProfV3: fatal message if PC sampling unsupported, but requested (#60)
If a user requests PC sampling on a system that does not support this feature,
report a fatal error message and stop executing the program.

[ROCm/rocprofiler-sdk commit: 0ce75c1043]
2024-12-20 08:04:16 -08:00
Baraldi, Giovanni d0d378897a SWDEV-495749: Adding SIMD_UTILIZATION metric (#74)
* SWDEV-495749: Adding SIMD_UTILIZATION metric

* Fix mfmautil

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

[ROCm/rocprofiler-sdk commit: 200a8624bc]
2024-12-20 14:35:18 +00:00
Welton, Benjamin 926c1ed153 Add support for device counter collection ioctl (#46)
Add support for device counter collection ioctl

Adds support for the device counter collection IOCTL. This IOCTL
allows for device wide counters to be collected even if the queue
is not intercepted by rocprofiler-sdk (required for system profilers).

A test is also included which checks this behavior by creating a queue
that does not have profiling enabled on it and checks to see if SQ
counters can be read from it. Note: this test will be skipped if the KFD
version does not contain this IOCTL.

Right now the check is "soft" in that if the IOCTL is present and there
is an error with permissions, rocprofiler will continue but will print
an error stating that system wide device profiling and collected counter
values may be degraded. This is primarily to avoid breaking existing
users (like PAPI) who may not need the IOCTL's capability and to give
them time to update.

Co-authored-by: Benjamin Welton <ben@amd.com>

[ROCm/rocprofiler-sdk commit: c574881cdb]
2024-12-19 13:27:35 -08:00
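The "soft" check described in the commit above can be sketched roughly as follows. This is a minimal Python illustration of the control flow only, not rocprofiler-sdk code: `enable_device_counters`, its arguments, and the log text are hypothetical names chosen for the example.

```python
import logging

log = logging.getLogger("soft_check_sketch")


def enable_device_counters(ioctl_present, try_enable):
    """Return True if device-wide counting was enabled, False otherwise.

    ioctl_present: whether the (hypothetical) kernel interface exists.
    try_enable: callable that issues the request; may raise PermissionError.
    """
    if not ioctl_present:
        # Older kernel without the interface: nothing to enable, keep going.
        return False
    try:
        try_enable()
        return True
    except PermissionError as err:
        # Soft failure: report the degradation but do not abort.
        log.error("device-wide counter collection may be degraded: %s", err)
        return False
```

A hard check would re-raise the exception instead; the soft variant presumably exists so that callers who never need the capability keep working unchanged.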
Indic, Vladimir 86417237a6 Remove numpy dependency from rocprofv3.py (#75)
Remove numpy dependency from rocprofv3.py

[ROCm/rocprofiler-sdk commit: 9c21c49aa1]
2024-12-19 20:52:42 +01:00
Baraldi, Giovanni f8442415f8 SWDEV-492607: Adding ATT wrapper (#40)
* Adding att parser wrapper

* Adding ATT tests as optional

* Adding decoder API for query capability

* Removed samples

* Formatting

* adding new line

* Removed perfetto and moved to static library

* using default search for lib

* Updated to SDK

* Namespace changes

* Added tests

* Small refactor

* Updated API to receive agent_id

* Fixing tests

* Tidy fixes

* Not write to file

* Switch to filesystem.hpp

* Compilation fixes

* Formatting

* Tidy fix

* Removed likely

* Adding tests

* Added gfx9 test

* Adding gfx12 tests

* Formatting

* Enable tidy

* Fix tests

* Fix deadlock on agent test

* Workaround ASAN

* Moving query outside class.

* Fix standalone tool

* Addressing comments

* Formatting

* Change query name

* Fixed some tests. Updated PR comments.

* Formatting

* Improved coverage

* Formatting

* Fix for comments

* Formatting

* Adding some description. Fix error type.

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

[ROCm/rocprofiler-sdk commit: 2c8e88a76b]
2024-12-18 18:53:32 -08:00
Trowbridge, Ian b2aa29eb0a Skip Building of OpenMP Samples and Tests (#77)
* Add option to disable openmp samples

* Skip building openmp tests and samples for now

[ROCm/rocprofiler-sdk commit: 9de284a568]
2024-12-18 11:37:51 -06:00
Elwazir, Ammar 31e086b2ab Changing Mi300 Names (#69)
* Changing Mi300 Names

Making Mi300 names more specific:
Adding multiple types to differentiate between Mi300X, Mi300A, Mi325X

* Enable Mi300A PC Sampling testing

[ROCm/rocprofiler-sdk commit: 590f2a1cd0]
2024-12-13 12:19:23 -06:00
Baraldi, Giovanni 661b608227 SWDEV-489158: Fix for exit thread safety (#61)
* SWDEV-489158: Fix for exit thread safety

* Fixed exit thread logic

* Force CI to rerun

* Remove .vscode

* Fix thread safety bug

* Addressed some comments

* Formatting

---------

Co-authored-by: Giovanni Baraldi <gbaraldi@amd.com>

[ROCm/rocprofiler-sdk commit: f4984f9dcc]
2024-12-11 12:41:19 -06:00
Mallya, Ameya Keshava 0157d4e7ff Reusable PSDB/OSDB (#65)
* Deleting redundant action

* Single reusable workflow for PSDB and OSDB

* fixed calling psdb for mainline

[ROCm/rocprofiler-sdk commit: f80480cc86]
2024-12-10 13:13:17 -08:00
Bhardwaj, Gopesh 681740b52b gobhardw/docs logging (#10)
* reducing docs logging

* Addressing review comments

* exclude dirs

* maximize NUM_PROC_THREADS

* parallel build

[ROCm/rocprofiler-sdk commit: 3ee06ed747]
2024-12-10 14:15:59 +05:30
Welton, Benjamin 1850de7ee1 [AFAR VII] rocprofiler_sample_device_counting_service return data as part of API call (#57)
---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>
Co-authored-by: Benjamin Welton <ben@amd.com>

[ROCm/rocprofiler-sdk commit: 253c9adfc1]
2024-12-06 22:37:45 -08:00
Madsen, Jonathan 22b4e6739d Fix code coverage comment (#58)
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: bd33176966]
2024-12-06 18:44:34 -06:00
Nagaraj, Sriraksha fd9da7dc43 Updating rocprofv3 doc for pc sampling beta option (#59)
* Updating rocprofv3 doc for pc sampling beta option

* Update source/docs/rocprofv3_input_schema.json

* Update using-rocprofv3.rst

---------

Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>

[ROCm/rocprofiler-sdk commit: c509fe799d]
2024-12-06 17:41:28 -06:00
Madsen, Jonathan 0ed4441ca7 rocprofv3: Updates to counter collection optimizations (#24)
* Updates to counter collection optimizations

* Fix logic error

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: a09eda05b2]
2024-12-06 23:29:12 +00:00
Nagaraj, Sriraksha 2cb3c6d84f fix avail test (#50)
* fix avail test

* changing the regular expression

* Adding fatal error to avail script

* Revert "changing the regular expression"

This reverts commit e522143b5d9dccb870fd7f5667619ed32687d1e6.

[ROCm/rocprofiler-sdk commit: 5556774c3a]
2024-12-06 17:07:45 -06:00
Choudhary, Rahul 32fe16606a Update PSDB.yml - removing synchronize events to avoid duplicate triggers
[ROCm/rocprofiler-sdk commit: 745fd143dd]
2024-12-06 14:17:25 -08:00
Nagaraj, Sriraksha 921f57bac3 --pc-sampling-beta-enable in ROCProfV3 (#56)
PC sampling must be explicitly enabled. 
Emit fatal error otherwise.

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

---------

Co-authored-by: Indic, Vladimir <Vladimir.Indic@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

[ROCm/rocprofiler-sdk commit: 17fdc33d05]
2024-12-06 22:04:23 +01:00
Choudhary, Rahul 117d2131d0 Update kws.yml fixing the file extension name
[ROCm/rocprofiler-sdk commit: 6880dd1257]
2024-12-06 11:44:05 -08:00
Indic, Vladimir 00b558c037 PC Sampling API: emit info logs instead of error (#53)
* PC Sampling API: emit info logs instead of error

Inside the PC sampling API, emit info logs instead of
error logs. The tests verify the status code of each
API call and decide when to skip, instead of relying
on messages in logs.

The samples_processing.cpp test has been removed as it's
not used.

[ROCm/rocprofiler-sdk commit: b4d7ee7887]
2024-12-06 20:40:30 +01:00
Madsen, Jonathan a70771f8dc Misc AFAR VII updates + clang-tidy-19 + bump version to 0.6.0 (#54)
* Misc AFAR VII updates + clang-tidy-19 + bump version to 0.6.0

- move tests/rocprofv3/trace-period to tests/rocprofv3/collection-period
- bump clang-tidy to v19
- fix misc clang-tidy errors

* Update the collection period test

- don't attach files on fail bc when test is disabled, it causes problems

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: bd447ab941]
2024-12-06 12:35:29 -06:00
Mallya, Ameya Keshava eb15bcdf4e Updated KWS action location to fix failure (#51)
[ROCm/rocprofiler-sdk commit: f18158f56a]
2024-12-06 11:02:00 -06:00
Indic, Vladimir f09ebc11c0 Temporarily disable sampled VM_IDs check (#55)
Temporarily disable sampled VM_IDs check

[ROCm/rocprofiler-sdk commit: 1d5ed0440d]
2024-12-06 14:45:33 +01:00
Jakaraddi, Manjunath 82261be227 SWDEV-492623: Hip Host Function to Device Symbols Mapping (#18)
* Adding changes to register and read symbols from the hip fat binary

* adding json output for host_functions

* added error handling

* adding json tool support

* Adding tests

* formatting changes

* Adding documentation

* refactoring as per amd-staging

* Adding initializers and changing macros

* Fix page-migration background thread on fork (#31)

* Fix page-migration background thread on fork

After falling off main in the forked child, all the children
try to join on the parent's monitoring thread. This results
in a deadlock. Parent is waiting for the child to exit, but
the child is trying to join the parent's thread which is
signaled from the parent's static destructors.

Even with just one parent and child, due to copy-on-write
semantics, a child signalling the background thread to join
will still block (thread's updated state is not visible
in the child).

This fix creates background threads on fork per-child with a
pthread_atfork handler, ensuring that each child has its own
monitoring thread.

* Formatting fixes

* Detach page-migration background thread and update test timeout

* Attach files with ctest

* Update corr-id assert

* Tweak on-fork, simplify background thread

* Revert thread detach

* Adding --collection-period feature in rocprofv3 to match v1/v2 parity (#9)

* Adding Trace Period feature to rocprofv3

* Adding feature documentation

* Update source/bin/rocprofv3.py

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fixing format

* Moving to Collection Period and changing the input params

* Format Fixes

* Fixing rebasing issues

* Removing atomic include from the tool

* Adding more options for units, optimizing the code

* Fixing rocprofv3.py

* Fixing time conv & adding time controlled app

* Fixing format

* Changing to shared memory testing methodology

* use of shmem use

* Fix include headers for transpose-time-controlled.cpp

* Format upload-image-to-github.py

* Removing shmem and using only env var to dump timestamps from the tool

* Tool Fixes + Test Config

* Adding Tests

* Fixing Review comments

* Update trace period implementation

* Update trace period tests

* check between start and stop timestamps

* Merge Fix

* Update validate.py

* Improve safety of rocprofiler_stop_context after finalization

* Pass context id to collection_period_cntrl by value

* Adding 20 us error margin

* Ensure log level for collection-period test is not more than warning

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- move error code check macros to implementation
- fix macros which check error code
- use constexpr values instead of #define

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- debugging for error that cannot be locally reproduced

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- improve error handling and logging

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- tweak to non-fatal logging messages

* Update lib/rocprofiler-sdk/code_object/hip/code_object.*

- cleanup of logging messages

* Update host kernel symbol register data fields

* Update source/lib/rocprofiler-sdk/code_object/hip/code_object.hpp

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Kuricheti, Mythreya <Mythreya.Kuricheti@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 78d8f4b8ea]
2024-12-06 11:42:37 +00:00
Indic, Vladimir a0a0a4cffe [AFAR VII] Using v_rcp_f32 instead of v_fmac_f32 in exec_mask_manipulation.cpp (#47)
use v_rcp_f32 instead of v_fmac_f32

[ROCm/rocprofiler-sdk commit: 61ce79c84d]
2024-12-05 23:21:00 -08:00
Trowbridge, Ian 792329fefd SWDEV-492625 memory free functions (#11)
* SWDEV-492625: Track free memory HSA functions to help determine total amount of memory allocated on the system at any one time

* Minor fixes to address comments

* Update allocation size description

* Moved get function back to specialization, minor typo fixes

* Removed memory_operation_type field, removed memory_pool allocation enum, converted starting address to hex string for json format.

* Made conversion to hex_string a function, changed address to use union rocprofiler_address_t type, changed VMEM descriptors

* Removed as_hex from the global namespace

* Formatting

* Removed TRACK_EVENT for memory allocation, now TRACK_COUNTER for memory allocation is being performed

* Check if address was recorded before retrieving allocation size in generate Perfetto

* Formatting

* Update source/lib/output/generatePerfetto.cpp

* Explicitly disable app-abort tests

* Remove excluding app-abort test from workflow CI

- redundant bc these tests are explicitly marked as disabled now

---------

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 79006bb896]
2024-12-06 00:05:30 -06:00
Madsen, Jonathan a79f8a0198 SDK: OMPT Support (#22)
* Ability to select alternative compiler per file

Implementation of ompt interface to rocprofiler SDK. task_create and task_schedule are not supported.

Misc updates

Update OpenMP target sample

- samples/ompt -> samples/openmp_target
- fix sample test of openmp-target
- reorganize files

Rework OpenMP implementation

Minor OpenMP implementation cleanup

Rename samples/openmp_target CMake targets

Add tests/bin/openmp

- OpenMP target test app in tests/bin/openmp/target

Format samples/openmp_target CMakeLists.txt

Misc lib/rocprofiler-sdk/openmp cleanup

- fix includes
- convert_arg

Update openmp.def.cpp

- tweak includes
- remove lots of temporary variables

Update samples

- common::get_callback_id_names() -> common::get_callback_tracing_names()
- add kernel dispatch, memory copy, scratch memory buffered tracing to openmp target sample

Fix code object operation names

- add "CODE_OBJECT_" prefix

Update include/rocprofiler-sdk/openmp/api_id.h

- remove spurious comment

Miscellaneous openmp updates

- similar API for openmp_begin and openmp_end
- move implementations of ompt callbacks to openmp.cpp
- ompt_{thread_begin,thread_end,parallel_begin,parallel_end}_callbacks are openmp_events

[SWDEV-484495] Fix int truncation in CSV output (#1098)

CSV output truncates doubles to ints when it shouldn't. Derived metrics
are (mostly) doubles and lose precision (or become worthless) if treated
as an int. Converted these to double to match the format we return from
rocprof-sdk.

Co-authored-by: Benjamin Welton <ben@amd.com>
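The precision loss described above is easy to reproduce. The following is a small Python sketch, not the tool's CSV writer: the metric name `EXAMPLE_RATIO`, its value, and the helper `write_counter_row` are made up for illustration.

```python
import csv
import io


def write_counter_row(name, value, as_int):
    """Write one CSV row, optionally truncating the value to an int."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow([name, int(value) if as_int else float(value)])
    return out.getvalue().strip()


# Hypothetical derived metric: a ratio between 0 and 1.
truncated = write_counter_row("EXAMPLE_RATIO", 0.75, as_int=True)   # "EXAMPLE_RATIO,0"
preserved = write_counter_row("EXAMPLE_RATIO", 0.75, as_int=False)  # "EXAMPLE_RATIO,0.75"
```

A ratio truncated to an int collapses to 0, which is why a derived metric can become worthless when written that way.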

Update limit for max counter records in rocprof-tool (#1073)

A fixed sized std::array is used to store counter records in rocprofiler SDK. This limit was breached in SWDEV-484742. Upping the limit to 512 to be less likely to reach this limit again.

adding proxy ompt_data_t * arguments

fixes for proxy pointers

- Implement proxy ompt_data_t* pointers for clients
- Add ompt_data_t* arguments back to callback API
- Modify openmp sample to illustrate use of proxy pointers

formatting

SWDEV-467350: Skipping tool counter iteration for unsupported hardware (#1083)

Fixing some accumulate metrics (#1089)

* Fixing some accumulate metrics

* Fixing some more accumulate metrics

---------

Co-authored-by: Benjamin Welton <bewelton@amd.com>

updating rocprofv3 help options (#1113)

* updating rocprofv3 help options

* updating CHANGELOG

Fixing installed package tests in CI (#1119)

* Fixing installed package tests in CI

* Formatted rocprofv3.py with black formatter

SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests. (#1112)

* SWDEV-488948: PC Sampling - Correlation class to provide some thread safety. Adding multithread tests.

* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp

Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>

* Update source/lib/rocprofiler-sdk/pc_sampling/parser/correlation.hpp

Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>

* Adding backlog for codeobj changes

* Formatting

* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp

Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>

* Update source/lib/rocprofiler-sdk/pc_sampling/code_object.hpp

Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>

---------

Co-authored-by: Vladimir Indic <139573562+vlaindic@users.noreply.github.com>

SWDEV-487621: Fixes for metric definitions (#1118)

* Fixes for metric definitions

* Removing gfx8

* Update changelog

* Fixing unit tests

* Small fixes

* Fix for write size

Fix PSDB change (#1120)

Reverts change to `source/include/rocprofiler-sdk/callback_tracing.h`
from commit c77e4d3b80

clang-18 build fix for RCCL (#1123)

Removes ambiguity on const usage, which clang-18 complains about
(preventing build with warn error).

mem copy direction field update (#1124)

Adding Node-id for debugging with log level trace (#1090)

fix botched rebase

Per Jonathan to remove -rdynamic warning so CI will continue

pedantic formatting

Correct the package name of rocprofiler-sdk (#1126)

* Correct the package name of rocprofiler-sdk

The ROCm version (for example, 60300) was missing from the package name.
Added it.

* Use cmake cache string while setting the variable for ROCm Version

* correct the cmake-format

---------

Co-authored-by: Ranjith Ramakrishnan <Ranjith.Ramakrishnan@amd.com>

Fixing kokkosp tool library packaging (#1121)

* Fixing kokkosp tool library packaging

* Update source/lib/rocprofiler-sdk-tool/kokkosp/CMakeLists.txt

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Update CMakeLists.txt

* Update CMakeLists.txt

* Component Requirement in CPack

* Adding package dependency

* Update CMakeLists.txt

* Update rocprofiler_config_packaging.cmake

* Fix rocprofiler-sdk-tool-kokkosp BUILD/INSTALL RPATH

- CMAKE_INSTALL_LIBDIR doesn't help

* Add BUILD/INSTALL RPATH to rocprofv3-trigger-list-metrics

- fixes packaging issues

* Update packaging

- core depends on rocprofiler-sdk-roctx
- add CPACK_DEBIAN_PACKAGE_SHLIBDEPS_PRIVATE_DIRS to resolve inter-package dependencies

* Fix package depends version format

* Improve tests/rocprofv3/summary/validate logging

* Update CI workflow

- prioritize roctx package in Install Packages step

* Remove setting <package-name>_VERSION in config.cmake.in

- this is automatically handled by existence of <package-name>-config-version.cmake

* Update rocprofiler-sdk-config.cmake

- relax find_package versioning requirements to same major and minor version

* Update rocprofiler-sdk-config.cmake

- relax find_package versioning requirements (remove EXACT, specify range)

* Tweak CI workflow

* Update perfetto_reader.py

- better handle failure to load trace processor

* Misc cleanup for config packaging

* Update config packaging

* Update config packaging

* Revert perfetto for core-rpm packages

* Revert perfetto for core-rpm packages

- perfetto < 0.9.0

* Tweak tests/rocprofv3/summary/validate.py

- reorder some checks

---------

Co-authored-by: Ammar Elwazir <aelwazir@useocpm2m-387-013.amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

Clang Warning Fixes (#1131)

Builds prevented on clang-18

Adding start and end timestamp columns in csv (#1128)

* Adding start and end timestamp columns in csv

* Adding assert check for the counter timestamps

---------

Co-authored-by: Gopesh Bhardwaj <gopesh.bhardwaj@amd.com>

rocprofv3: docs and help menu updates (#1129)

* doc updates

* Correcting ROCtx information

* Making ROCTx string consistent

* missing occurrence

Renamed agent profiling service to device counting service (#1132)

* Renamed agent profiling service to device counting service

Name more aptly represents what agent profiling did (device wide
counter collection). Conversion of existing user code can be
performed by the following find/sed command:

find . -type f -exec sed -i 's/rocprofiler_agent_profile_callback_t/rocprofiler_device_counting_service_callback_t/g; s/rocprofiler_configure_agent_profile_counting_service/rocprofiler_configure_device_counting_service/g; s/agent_profile.h/device_counting_service.h/g; s/rocprofiler_sample_agent_profile_counting_service/rocprofiler_sample_device_counting_service/g' {} +

* Converted dispatch profile to dispatch counting service

* Debug for functional counters test

* Minor changes for CI

* Minor fix

* More fixes for CI

* Update evaluate_ast.cpp

---------

Co-authored-by: Benjamin Welton <ben@amd.com>
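For projects that would rather script the migration than run the sed command above, the same four renames can be sketched in Python. The old and new names are taken from that command; the `migrate` helper is a hypothetical name, and unlike sed it matches `agent_profile.h` literally rather than treating `.` as a wildcard.

```python
# Old name -> new name, copied from the sed command in the commit message.
RENAMES = {
    "rocprofiler_agent_profile_callback_t":
        "rocprofiler_device_counting_service_callback_t",
    "rocprofiler_configure_agent_profile_counting_service":
        "rocprofiler_configure_device_counting_service",
    "agent_profile.h":
        "device_counting_service.h",
    "rocprofiler_sample_agent_profile_counting_service":
        "rocprofiler_sample_device_counting_service",
}


def migrate(source):
    """Apply every rename to a string of source code (literal matching)."""
    for old, new in RENAMES.items():
        source = source.replace(old, new)
    return source
```

For example, `migrate('#include <rocprofiler-sdk/agent_profile.h>')` yields the include line with `device_counting_service.h`.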

Testing updated RPM dockers (#1136)

* Testing updated RPM dockers

* Trying to fix PSDB for test package dependency

Agent Profiling Fixes for Broken/Improper API Usage (#1122)

Prevents multiple setups of agent profiling on the same agent.

Fixes agent read context to only read agents that were set up.

Prevent copy of agent profiling internal data struct and reset
hsa_signal on move to prevent inadvertent delete.

Simplifying PR template (#1139)

Implementation of ompt interface to rocprofiler SDK. task_create and task_schedule are not supported.

Fixing installed package tests in CI (#1119)

* Fixing installed package tests in CI

* Formatted rocprofv3.py with black formatter

Fix PSDB change (#1120)

Reverts change to `source/include/rocprofiler-sdk/callback_tracing.h`
from commit c77e4d3b80

delete unused files

added arguments to some OMPT buffer records

* Fix cmake issues

Remove rocprofiler_ompt_finalize_tool

- a public API function is not necessary: should just finalize rocprofiler-sdk

Fix duplicate ROCPROFILER_{BUFFER,CALLBACK}_TRACING_KIND_STRING

Add lib/rocprofiler-sdk/ompt.hpp

- declares rocprofiler::sdk::finalize_ompt

Remove change to tests/rocprofv3/summary/conftest.py

Add set_fini_status(1) back to registration.cpp

Deleted unneeded files

Incorporate OpenMP code and sample

Fix merge issues with amd-staging

Add push_correlation_id for OpenMP tasking; improve debuggability

fixup bad merge

* Suppress OpenMP data race

* Fix openmp_target sample

* Enum and struct name changes + source code reorg

- remove mix of ompt and openmp
  - opted for ompt
- changes made for consistency
  - ompt_api -> ompt
  - openmp_api -> ompt
  - OPENMP -> OMPT

* Update tests and more renaming

- dest_device_num -> dst_device_num
- src_addr -> src_address
- dest_addr -> dst_address
- remove info_type::begin
- require OMP_TARGET_OFFLOAD

* Update openmp-target test/sample env and labels

* Formatting

* Tweaks to cmake for openmp target

- Disable for thread sanitizers due to preloading issue

* OpenMP target cmake updates

- remove gfx1010 (fails on mi300)
- OPENMP_GPU_TARGETS

* Remove device_unload and target_map_emi support

- these are never supported by AMD OpenMP compilers

* Update CI workflow

- exclude openmp-target tests from navi3 and vega20

---------

Co-authored-by: Larry Meadows <Lawrence.Meadows@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: 00c46fd5e5]
2024-12-05 22:48:19 -06:00
Elwazir, Ammar 90e3a30627 Adding --collection-period feature in rocprofv3 to match v1/v2 parity (#9)
* Adding Trace Period feature to rocprofv3

* Adding feature documentation

* Update source/bin/rocprofv3.py

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>

* Fixing format

* Moving to Collection Period and changing the input params

* Format Fixes

* Fixing rebasing issues

* Removing atomic include from the tool

* Adding more options for units, optimizing the code

* Fixing rocprofv3.py

* Fixing time conv & adding time controlled app

* Fixing format

* Changing to shared memory testing methodology

* use of shmem use

* Fix include headers for transpose-time-controlled.cpp

* Format upload-image-to-github.py

* Removing shmem and using only env var to dump timestamps from the tool

* Tool Fixes + Test Config

* Adding Tests

* Fixing Review comments

* Update trace period implementation

* Update trace period tests

* check between start and stop timestamps

* Merge Fix

* Update validate.py

* Improve safety of rocprofiler_stop_context after finalization

* Pass context id to collection_period_cntrl by value

* Adding 20 us error margin

* Ensure log level for collection-period test is not more than warning

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>

[ROCm/rocprofiler-sdk commit: a579c70b71]
2024-12-06 02:17:24 +00:00
Kuricheti, Mythreya c2f9e2aca8 Fix page-migration background thread on fork (#31)
* Fix page-migration background thread on fork

After falling off main in the forked child, all the children
try to join on the parent's monitoring thread. This results
in a deadlock. Parent is waiting for the child to exit, but
the child is trying to join the parent's thread which is
signaled from the parent's static destructors.

Even with just one parent and child, due to copy-on-write
semantics, a child signalling the background thread to join
will still block (thread's updated state is not visible
in the child).

This fix creates background threads on fork per-child with a
pthread_atfork handler, ensuring that each child has its own
monitoring thread.

* Formatting fixes

* Detach page-migration background thread and update test timeout

* Attach files with ctest

* Update corr-id assert

* Tweak on-fork, simplify background thread

* Revert thread detach

[ROCm/rocprofiler-sdk commit: e7d45624d0]
2024-12-05 19:58:38 -06:00
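The per-child thread idea from the commit above can be sketched in Python, where `os.register_at_fork` plays a role similar to `pthread_atfork`. This is an illustrative analogue under stated assumptions, not the rocprofiler-sdk implementation: the `Monitor` class and its members are hypothetical, and the example assumes a POSIX system where `os.fork` is available.

```python
import os
import threading


class Monitor:
    """Hypothetical background monitor, standing in for the page-migration thread."""

    def __init__(self):
        self._start()
        # Re-create the thread in every forked child, so a child never
        # tries to join a thread that only exists in its parent.
        os.register_at_fork(after_in_child=self._start)

    def _start(self):
        self.stop_event = threading.Event()
        self.owner_pid = os.getpid()
        self.thread = threading.Thread(target=self._run, daemon=True)
        self.thread.start()

    def _run(self):
        # Placeholder for real monitoring work: block until asked to stop.
        self.stop_event.wait()

    def shutdown(self):
        """Signal this process's own thread and join it; True if it exited."""
        self.stop_event.set()
        self.thread.join(timeout=5)
        return not self.thread.is_alive()
```

After a fork, the child's `Monitor` refers to a fresh thread and event owned by the child, so `shutdown()` in the child only touches state that is visible there.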
Meserve, Mark f6c923e191 SWDEV-445864: SWDEV-445865: Update page migration events (#16)
* Update kfd ioctl header

- Adds new event for dropped events
- Mirrors kernel update by Philip Yang

* Add error code for page migration events

- Adds support for new error code field for page migration end events
  - Page migration end event is now generated for migration failure
  - Error code is zero for successful migration

* Add dropped event SMI event

- New event type indicates if events were dropped
  - Events are dropped if the buffer is full

[ROCm/rocprofiler-sdk commit: fc2513888f]
2024-12-05 20:44:10 +00:00
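The two behaviours described in the commit above (events dropped when the buffer is full, and an end event whose error code is zero on success) can be illustrated with a short Python sketch. The actual KFD event layout is not shown here, so `EventBuffer`, `migration_succeeded`, and the dictionary fields are all hypothetical.

```python
from collections import deque


class EventBuffer:
    """Hypothetical fixed-capacity buffer that counts events it had to drop."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.events = deque()
        self.dropped = 0

    def push(self, event):
        """Store the event, or count it as dropped if the buffer is full."""
        if len(self.events) >= self.capacity:
            self.dropped += 1
            return False
        self.events.append(event)
        return True

    def drain(self):
        """Return buffered events, plus a 'dropped' marker if any were lost."""
        out = list(self.events)
        if self.dropped:
            out.append({"type": "dropped", "count": self.dropped})
        self.events.clear()
        self.dropped = 0
        return out


def migration_succeeded(end_event):
    """An end event with error code zero denotes a successful migration."""
    return end_event["error_code"] == 0
```

A consumer that sees the marker knows its view of the event stream is incomplete, rather than silently missing records.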
Kandula, Venkateshwar reddy 1c25f3920a Rename csv output header for scratch memory trace from Alloc_flags to Alloc_Flags. (#12)
* rename csv output header for scratch memory trace from Alloc_flags to Alloc_Flags.

* csv output tests for scratch memory trace.

* Check output lengths

---------

Co-authored-by: Mythreya <mythreya.kuricheti@amd.com>

[ROCm/rocprofiler-sdk commit: e77db42d53]
2024-12-05 19:37:23 +00:00
Indic, Vladimir b2ee1ece8f Reducing workload in hammer test (#48)
Reducing the parser's workload in the hammer test

Reducing hammer test workload by 4 to prevent timeout on ThreadSanitizer job.

[ROCm/rocprofiler-sdk commit: 2dc3a5ae95]
2024-12-05 19:41:59 +01:00