Commit Graph

268 Commits

Author SHA1 Message Date
Sajina PK b0ff07b4fe Conditionally include backtraces in ROCPROFSYS_THROW based on verbosity (#272)
* Conditionally include backtraces in ROCPROFSYS_THROW based on verbosity

Modify ROCPROFSYS_THROW to only include backtraces when:
  debug mode is enabled, OR
  verbose level is >= 2, OR
  running in CI environment

* Fix formatting errors
2025-07-07 14:14:02 -04:00
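The backtrace rule described in commit b0ff07b4fe can be sketched roughly as follows. This is only an illustration of the three stated conditions: `throw_settings`, `include_backtrace`, and `make_throw_message` are hypothetical names, not the project's actual macro implementation.

```cpp
#include <cassert>
#include <string>

// Hypothetical settings snapshot; the real project reads these from its own config.
struct throw_settings
{
    bool debug   = false;  // debug mode enabled
    int  verbose = 0;      // verbosity level
    bool ci      = false;  // running in a CI environment
};

// True when any of the three conditions from the commit message holds.
inline bool include_backtrace(const throw_settings& s)
{
    return s.debug || s.verbose >= 2 || s.ci;
}

// Builds the exception text, appending the backtrace only when permitted.
inline std::string make_throw_message(const std::string&    msg,
                                      const std::string&    backtrace,
                                      const throw_settings& s)
{
    return include_backtrace(s) ? msg + "\n" + backtrace : msg;
}
```

With default settings the message is returned unchanged; raising `verbose` to 2 appends the backtrace.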
David Galiffi 122623a929 Use gersemi for CMake formatting (#257)
* Replace `cmake-format` with `gersemi`

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Remove .cmake-format.yaml

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update workflow to use gersemi

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update CONTRIBUTING.md

* Update helper scripts

* Don't include `*/external/*` in workflows

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2025-06-22 10:44:33 -04:00
David Galiffi 1e13b590e7 Use clang-format-18 for source formatting (#256)
* Updating clang-format to v18

- Updates the pre-commit-config
- Formats source files according to the utility

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update format source workflow

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Update CONTRIBUTING

* Update comment in .clang-format

* Update CONTRIBUTING.md

* Update helper script

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2025-06-22 08:48:08 -04:00
Sajina PK e3741f678b Show VCN and JPEG busy values where VCN/JPEG activity is not supported. (#232)
With AMD-SMI in ROCm 7.0, vcn_activity and jpeg_activity are not reported when the XCP (partition) stats, vcn_busy and jpeg_busy, are available. This causes the activity tracking to fail. The fix is to read the busy values when the activity values are not supported.

For issue: SWDEV-536439

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-06-19 16:23:30 -04:00
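The fallback described in commit e3741f678b might look something like the sketch below. It assumes C++17 and uses a hypothetical `engine_metrics` struct in place of the real AMD-SMI data structures, which are not shown in this log.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical per-device readings; an empty optional means "not reported".
struct engine_metrics
{
    std::optional<std::vector<int>> activity;  // e.g. vcn_activity
    std::optional<std::vector<int>> busy;      // e.g. vcn_busy from XCP stats
};

// Prefer activity values; fall back to busy values when activity is unsupported.
inline std::vector<int> select_engine_values(const engine_metrics& m)
{
    if(m.activity) return *m.activity;
    if(m.busy) return *m.busy;
    return {};  // neither metric is supported on this device
}
```

An empty result signals that the device reports neither metric, so a caller could skip the track entirely.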
David Galiffi 244c193a57 Unhandled enum in switch statement (#247)
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2025-06-17 09:57:27 -04:00
Kian Cossettini 0380cf58ba Improve rocprof-sys-avail to report VCN and JPEG metrics on supported devices (#226)
* SWDEV-535445: rocprof-sys-avail shows jpeg_activity even when unsupported

* Added vcn tracking

* jpeg and vcn description now includes supported gpus

* Add getter methods per device to check vcn and jpeg support

Add logic to check if vcn activity and vcn busy values are supported for each device.
Add logic to check if jpeg activity and jpeg busy values are supported for each device.

Co-authored-by: Sajina P Kandy <sputhala@amd.com>

* Add getter methods per device to check vcn and jpeg support (#228)

* Formatting

* Variable fix

* List of supported GPUs is now ordered

* Removed the ability to see which gpu supports jpeg and vcn activity to reduce clutter

* Formatting

* Testing for busy support

* jpeg and vcn only show if supported

* Removed commented code

* Formatting

* Applied amd_smi cpp/hpp fixes

* Added break condition for xcp loop

* Modified loops for efficiency

* Removed unnecessary macro

* Removed unnecessary includes

---------

Co-authored-by: Sajina Kandy <sputhala@amd.com>
Co-authored-by: Sajina PK <Sajina.PuthalathKandy@amd.com>
2025-06-09 16:14:53 -04:00
Pranjal Swarup 96df9b6d3e Update dyninst to v13 (#190)
Update Dyninst submodule
Refactoring of build scripts to build TBB, Boost, ElfUtils, and LibIberty, since Dyninst build scripts no longer do.
Workflows are now building Dyninst and its dependencies.

---------

Co-authored-by: marantic-amd <marantic@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-06-06 22:52:23 -04:00
David Galiffi af77d93f75 Use rocprofiler-sdk for RCCL-API tracing (#126)
- Add support for RCCL API tracing through rocprofiler-sdk.
- Refactored the comm_data code to use the SDK RCCL_API callbacks.
- Add a runtime version check for SDK to gate callback enablement, rather than just the compile-time check.
- Fixed: SAMPLING_TIMEOUT was not being handled correctly in add_test.
2025-06-06 11:36:17 -04:00
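The difference between a compile-time and a runtime version gate, as mentioned in commit af77d93f75, can be illustrated with a small sketch. The `sdk_version` struct, the `major * 10000 + minor * 100 + patch` encoding, and the function names are assumptions made for this example; the SDK's real version query is not shown in this log.

```cpp
#include <cassert>

// Hypothetical version triple.
struct sdk_version
{
    int major_num = 0;
    int minor_num = 0;
    int patch_num = 0;
};

// Assumed encoding: major * 10000 + minor * 100 + patch.
constexpr int encode(const sdk_version& v)
{
    return v.major_num * 10000 + v.minor_num * 100 + v.patch_num;
}

// Callbacks are enabled only if support was compiled in AND the library
// found at runtime is new enough.
inline bool enable_rccl_callbacks(bool               compiled_with_support,
                                  const sdk_version& runtime,
                                  const sdk_version& required)
{
    if(!compiled_with_support) return false;
    return encode(runtime) >= encode(required);
}
```

The runtime check matters when the binary is built against one SDK release but loads a different one at run time.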
habajpai-amd c5507e3740 SWDEV-507117: Unify OMP Target Offload Events into a Single Perfetto … (#230)
* SWDEV-507117: Unify OMP Target Offload Events into a Single Perfetto Timeline Row

* Fixed warning and format

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-06-06 11:52:30 +05:30
Sajina PK 4fcd8cc78d Enable MPI tracing for Fortran (#185)
- Move the MPI gotcha functionality from Timemory to the repo.
- Add the PMPI Fortran MPI functions to the existing mpi gotcha handle.
2025-06-04 18:06:18 -04:00
habajpai-amd abecaa8bf8 SWDEV-533856: Handle dynamic event for HIP api for perfetto (#225)
* SWDEV-533856: Handle dynamic event for HIP api for perfetto

* Refactor: Generalize function using template

* Format Source

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-06-04 15:11:26 +05:30
David Galiffi 650827c5ea Fix compilation failure in amd-smi 26.0.0 (#223)
- The parameter "year" was removed from amdsmi_version_t.
- For SWDEV-535858, SWDEV-535870
2025-06-02 18:22:13 -04:00
habajpai-amd 39090bfc54 Add corr_id for HIP Runtime API in Perfetto (#218)
for SWDEV-533883

---------

Co-authored-by: Aleksandar Djordjevic <aleksandar.djordjevic@amd.com>
2025-05-29 16:00:31 -04:00
Pranjal Swarup 4c7560c78c ROCPROFSYS_AMD_SMI_METRICS visibility (#208)
* Removed advanced category from ROCPROFSYS_AMD_SMI_METRICS to have this property visible with rocprof-sys-avail.
2025-05-15 13:31:40 -04:00
Sajina PK 90ad264447 Add new VA-API methods to the gotcha wrappers (#203)
Adds new VA-APIs to the gotcha wrapper for a new feature in rocJPEG.
2025-05-13 08:05:55 -04:00
David Galiffi adc66956b0 Fix path to post-processing merge script (#187)
- Path to merge script not found unless user explicitly sources "share/rocprofiler-systems/setup-env.sh" to setup PATHs.
- Instead, let's derive the path when the application loads and use it when executing the helper script
- Rename script to rocprof-sys-merge-output.sh.
- Change install folder to <prefix>/libexec/rocprofiler-systems based on dev-ops feedback.
- Updated PATH variable in the modulefile and source script.
- For SWDEV-528101
2025-05-02 16:52:54 -04:00
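Deriving the script location instead of relying on PATH, as commit adc66956b0 describes, could be sketched as below. This assumes C++17, that the loaded library's path is already known, and that the library sits in `<prefix>/lib`; the function name and the library filename used in the usage note are hypothetical.

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Given the path of the loaded library (assumed to live in <prefix>/lib),
// derive the helper script location under <prefix>/libexec/rocprofiler-systems.
inline fs::path derive_merge_script_path(const fs::path& loaded_library)
{
    // <prefix>/lib/<library> -> <prefix>
    fs::path prefix = loaded_library.parent_path().parent_path();
    return prefix / "libexec" / "rocprofiler-systems" / "rocprof-sys-merge-output.sh";
}
```

For example, an illustrative input of `/opt/rocm/lib/libexample.so` maps to `/opt/rocm/libexec/rocprofiler-systems/rocprof-sys-merge-output.sh`, with no dependency on the user's environment.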
anujshuk-amd ff109912c2 Reverting PR-154 Changes since VCN data not seen on Perfetto file (#191) 2025-05-02 16:19:43 -04:00
David Galiffi 0f16d45445 Conditionally include ROCPROFILER_BUFFER_TRACING_PAGE_MIGRATION (#193)
- Include only if ROCPROFILER_SDK_VERSION < 1.0.0, as it is being removed
- For SWDEV-530639
2025-05-02 15:05:27 -04:00
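A compile-time guard of the kind described in commit 0f16d45445 might be structured as in this sketch. `EXAMPLE_SDK_VERSION` is a hypothetical stand-in for the SDK's version macro (assumed here to encode `major * 10000 + minor * 100 + patch`), and the kind names are plain illustrative strings rather than the SDK's enumerators.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the SDK version macro; 600 would mean 0.6.0.
#ifndef EXAMPLE_SDK_VERSION
#    define EXAMPLE_SDK_VERSION 600
#endif

// Returns the tracing kinds to request; names are illustrative only.
inline std::vector<std::string> buffer_tracing_kinds()
{
    std::vector<std::string> kinds = { "KERNEL_DISPATCH", "MEMORY_COPY" };
#if EXAMPLE_SDK_VERSION < 10000
    // Only referenced for SDK versions below 1.0.0, where it still exists.
    kinds.emplace_back("PAGE_MIGRATION");
#endif
    return kinds;
}
```

Because the guard is evaluated by the preprocessor, code built against a newer SDK never mentions the removed symbol at all.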
Sajina PK 99a411fe52 Fix to overlapping VCN and JPEG tracks in perfetto (#192)
- Fix overlapping VCN and JPEG activity values in Perfetto output.
- Modify the storage of the activity values to be more efficient.
2025-05-01 19:40:49 -04:00
Luca Bruni 8ae6651357 Appropriately filter data based on -D and -H options (#163)
- Addresses concern that device metric tracks are still shown in Perfetto trace file even when only -H is specified to rocprof-sys-sample (and vice versa).
- Update sampling call-stack docs.
2025-04-30 09:50:51 -04:00
anujshuk-amd 8d48048bd3 Fix ROCPROFSYS_AMD_SMI_METRICS parsing (#178)
Fixes a bug where all the `ROCPROFSYS_AMD_SMI_METRICS` values were being recorded by default.
Fixes a bug where the 'all' and 'none' values caused an exception when specified for `ROCPROFSYS_AMD_SMI_METRICS`.

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-04-28 09:22:20 -04:00
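The handling of 'all' and 'none' fixed in commit 8d48048bd3 can be illustrated with a minimal parser sketch. The metric names are borrowed from another commit message in this log and may not match the current list; `known_metrics` and `parse_metrics` are hypothetical names, and whitespace trimming is omitted for brevity.

```cpp
#include <cassert>
#include <set>
#include <sstream>
#include <string>

// Assumed metric names; the real list may differ.
inline const std::set<std::string>& known_metrics()
{
    static const std::set<std::string> names = { "busy", "temp", "power", "mem_usage",
                                                 "vcn_activity" };
    return names;
}

// Parse a comma-separated setting. "all" selects every known metric,
// "none" selects nothing, and unknown names are ignored instead of throwing.
inline std::set<std::string> parse_metrics(const std::string& setting)
{
    if(setting == "all") return known_metrics();
    if(setting == "none") return {};

    std::set<std::string> enabled;
    std::istringstream    stream{ setting };
    std::string           token;
    while(std::getline(stream, token, ','))
    {
        if(known_metrics().count(token) > 0) enabled.insert(token);
    }
    return enabled;
}
```

Treating the two keywords before tokenizing keeps them from being looked up as ordinary metric names.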
Sohaib Nadeem 0e535daa93 Initialization fixes (#154)
- Remove tooling initialization from rocprofiler_configure:
when rocprofiler_configure is called from __hip_module_ctor
(which in turn is called as a global constructor when loading shared
libraries or before main in a hip program), initializing tooling
in it can cause problems because it is too early to do some of the tasks
that it involves (e.g. opening shared libraries, creating threads).
Instead, we rely on rocprofsys_main to initialize tooling later.

- Skip rocprofiler_configure if ROCPROFSYS_PRELOAD is not set since
preload is required for tooling (such as perfetto, which is used by
the rocprofiler callbacks) to be initialized.

- Revert RCCL initialization changes: These are no longer needed since rocprofsys_init_tooling_hidden will not
be called from rocprofiler_configure

- Force rocprofiler_configure in rocprofsys_init_tooling_hidden if it hasn't been
called through __hip_module_ctor global constructor
2025-04-21 17:04:24 -04:00
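The ordering described in commit 0e535daa93 can be modeled with a toy state machine. This is a simplified sketch under stated assumptions: `init_state`, `configure`, and `init_tooling` are hypothetical stand-ins for `rocprofiler_configure` and `rocprofsys_init_tooling_hidden`, and the preload check is reduced to a boolean parameter.

```cpp
#include <cassert>

// Hypothetical flags standing in for the project's internal bookkeeping.
struct init_state
{
    bool configured    = false;  // the early configure hook has run
    bool tooling_ready = false;  // tooling (e.g. perfetto) is initialized
};

// Early hook: may run from a global constructor, so it does no heavy work
// and is skipped entirely when preload is not active.
inline void configure(init_state& st, bool preload_set)
{
    if(!preload_set) return;
    st.configured = true;  // tooling is intentionally NOT initialized here
}

// Later hook: forces configuration if the early hook never ran,
// then initializes tooling at a point where it is safe to do so.
inline void init_tooling(init_state& st, bool preload_set)
{
    if(!st.configured) configure(st, preload_set);
    st.tooling_ready = true;
}
```

Either call order ends in the same state, which is the property the commit appears to be after.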
David Galiffi 169c9a0d49 Add rocm-6.4 to workflows (#165)
* Add rocm-6.4 to workflows

* Update containers.yml

* Update cpack.yml

* Update cpack.yml

* Disable OpenMP Target Examples on GitHub Runners

* Fix build warnings.

Switch statements with unhandled enums.

* Enable testing on 6.3 and 6.4

* Ubuntu 24 workflow. Build both ROCm 6.3 and 6.4

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2025-04-18 09:32:26 -04:00
David Galiffi 2680ccc3a7 Fix rocprofiler-sdk includes (#169)
For compatibility with recent rocprofiler-sdk change.
2025-04-16 21:18:06 -04:00
anujshuk-amd 807a622b04 Change the default value of ROCPROFSYS_SAMPLING_CPUS to "none" (#164) 2025-04-11 17:09:26 -04:00
Sajina PK fad3a0d341 RocJpeg cmake and document fixes (#157)
- Fix for rocjpeg sample cmake due to changes in the rocJPEG project
- Fix for rocprofiler-sdk version check - change the format
- Edits to docs for jpeg and vcn activity support - mention that these values may not be supported on all ASICs.
2025-04-09 16:20:02 -04:00
Ben Richard ee11f5b206 rocprof-sys-run: Change terminal color back to normal after printing usage (#155)
* Change terminal color back to normal after printing usage

* Format source

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-04-01 11:06:40 -04:00
David Galiffi 7bb45aba1c Additional AMD-SMI Updates (#149)
- Check AMDSMI header version to fix compilation failure with v2.0 header change
- Fix ROCM-SMI references in documentation and tests
- Check AMDSMI library version at runtime and output in logs
- Fix a possible exception occurring when an in-flight sample is outstanding while the component is shutting down.
2025-03-31 11:07:50 -04:00
David Galiffi b6b39af011 Fix "ROCPROFSYS_USE_ROCM" runtime config setting. (#144) 2025-03-27 16:03:46 -04:00
Aleksandar Djordjevic 2bad0e941b Disable RCCL, load libamdhip64.so (#150)
Disable RCCL and load libamdhip64.so as a fix for sw509497.
2025-03-27 16:02:17 +01:00
Wileam Phan 2805631ccd Fix rocprof-sys-instrument default linkage and visibility criteria (#95)
* Fix default linkage and visibility criteria
* Fix processing of linkage and visibility CLI flags
* Format source

Signed-off-by: Wileam Yonatan Phan <wileamyonatan.phan@amd.com>
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-03-27 18:33:35 +08:00
David Galiffi 85bbea4954 Reapply "Upgrade ROCm-SMI to AMD SMI (#86)" (#147)
* Reapply "Upgrade ROCm-SMI to AMD SMI (#86)"

This reverts commit b3eee295dd.

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-03-25 17:31:27 -04:00
ajanicijamd 26bb604215 Fix a RCCL initialization to avoid a deadlock (#136)
Also fixes: 

- crash while finalizing rocprof-sys-causal
2025-03-19 14:48:04 -04:00
David Galiffi 5fc495c1e7 Update libraries' SOVERSION to match other ROCm components (#98)
set SOVERSION to ${PROJECT_VERSION_MAJOR}

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2025-03-07 10:18:36 -05:00
David Galiffi eb0a969a9c Fix logging error (#130)
When we create profile config with rocprofiler we log the counters being registered. However, this log was being skipped in certain cases.
2025-03-06 14:30:45 -05:00
Sohaib Nadeem 42922ec851 Fix hardware counter summary files not being generated after profiling (#124)
- Register a cleanup function in tim::manager instance to write out data in
counter storages

- The counter_storage::write() calls in tool_fini happen after the storage is destroyed
which is too late for the write to happen.

- Adjust traits for counter_data_tracker

- Add MIN, MAX, VAR, STDDEV columns
- Remove DEPTH, UNITS, %SELF columns

- Update "add_validation_test" to test for the existence of output file(s).
- Added step to test perfetto output for `transpose-rocprofiler-sampling`
and `transpose-rocprofiler-binary-rewrite`

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-03-05 16:05:18 -05:00
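The idea in commit 42922ec851 of registering a cleanup function so that writes happen before storage teardown can be sketched as follows. `cleanup_registry` and `run_shutdown` are hypothetical names for illustration and do not reflect the `tim::manager` interface.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical registry that runs cleanup callbacks before teardown.
class cleanup_registry
{
public:
    void add_cleanup(std::function<void()> fn) { m_cleanups.push_back(std::move(fn)); }

    // Runs every registered callback once, in registration order.
    void finalize()
    {
        for(auto& fn : m_cleanups)
            fn();
        m_cleanups.clear();
    }

private:
    std::vector<std::function<void()>> m_cleanups;
};

// Records the order of shutdown steps to show that the write comes first.
inline std::vector<std::string> run_shutdown()
{
    std::vector<std::string> log;
    cleanup_registry         registry;
    registry.add_cleanup([&log] { log.push_back("write counters"); });
    registry.finalize();                // step 1: registered writes happen here
    log.push_back("destroy storage");   // step 2: storage teardown
    return log;
}
```

Tying the write to the registry's finalize step removes the dependence on a later hook that may run after the storage is already gone.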
Sohaib Nadeem 43f900d01e Fix an application crash when collecting performance counters with rocprofiler (#117)
* Add check to skip counter_storage::write() if internal storage field is destroyed.
* Output warning message if counter data is not available when trying to write out to Timemory

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-02-27 14:34:52 -05:00
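The guard added in commit 43f900d01e, skipping the write and warning when the storage no longer exists, might resemble this sketch. Using `std::weak_ptr` to detect a destroyed storage is an assumption made for the example; the commit does not say how the check is implemented, and `write_counters` is a hypothetical name.

```cpp
#include <cassert>
#include <iostream>
#include <memory>
#include <vector>

using storage_t = std::vector<double>;

// Returns true if data was written; warns and returns false if the storage is gone.
inline bool write_counters(const std::weak_ptr<storage_t>& storage, std::vector<double>& out)
{
    auto alive = storage.lock();  // empty once the owner has destroyed the storage
    if(!alive)
    {
        std::cerr << "warning: counter data is not available; skipping write\n";
        return false;
    }
    out.insert(out.end(), alive->begin(), alive->end());
    return true;
}
```

A late call after teardown then degrades to a warning instead of touching freed memory.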
Sajina PK 3bea1d8eac Add support for rocJPEG API tracing (#116)
- Add rocJPEG API tracing support using domain `rocjpeg_api` in ROCPROFSYS_ROCM_DOMAINS.
- Modify existing `videodecode` and `jpegdecode` ctests to verify API tracing
- Print Perfetto values for easy debugging in verbose mode
- Convert CMake error to a warning and skip building the "decode" examples if requirements are not found
2025-02-25 21:14:14 -05:00
Sajina PK 59d3399901 JPEG Activity tracing in Perfetto (#108)
- Add JPEG activity track in perfetto trace
- Add JPEG decode tests to the examples
- Change existing videodecode test to include JPEG testing
- Rename videodecode test file to decode to include jpeg tests too
- Fix a bug in the test which checks for total activity of 0
- Disable rocDecode and rocJPEG samples from the github image files
2025-02-21 10:25:01 -05:00
David Galiffi 3833c8d162 Fix hang in config file generation (#101)
- Updated Timemory module.
- Fixes a crash when running rocprof-sys-avail -G without explicitly providing -F <format>. The default value of "txt" was not being used.
- Define "choices" before "default" when defining the "--config-format" argument in the parser.
2025-02-11 17:36:31 -05:00
Sajina PK 697d1ac02f Add support for VA-API and rocDecode tracing (#92)
- VA API tracing using Timemory gotcha wrappers.
- rocDecode API tracing integration using callback to ROCPROFILER_CALLBACK_TRACING_ROCDECODE_API
- Updated videodecode ctest to validate rocDecode APIs in perfetto trace.
2025-02-11 13:08:23 -05:00
David Galiffi e437200e9e Remove remaining roctracer references (#82) 2025-02-07 23:27:58 -05:00
David Galiffi b3eee295dd Revert "Upgrade ROCm-SMI to AMD SMI (#86)" (#100)
This reverts commit 0c32dfd6bc.
2025-02-07 11:45:26 -05:00
ajanicijamd fc5b325979 Added libamd_comgr.so to internal modules and fix argument parsing module in Timemory (#96)
Updates Timemory submodule
2025-02-03 14:43:14 -05:00
cfallows-amd 0c32dfd6bc Upgrade ROCm-SMI to AMD SMI (#86)
* Integrating amd-smi into rocprofiler-systems due to rocm-smi deprecation.
* No functionality changes to users other than naming conventions.
* New tracks available in Perfetto: GPU busy percentage metrics now split gfx busy into separate gfx, umc, and mm engine measurements.

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-01-30 21:32:07 -05:00
Peter Park 0a15d355e0 Update copyright year to 2025 (#83) 2025-01-29 16:53:16 -05:00
Maarten Arnst 043a8010a9 Update to KOKKOS_TOOLS_LIBS env var (#69) 2025-01-29 16:53:15 -05:00
Pranjal Swarup 0263e951ff Merge proto files from multiprocess run into one file. (#63)
- Added script to merge multiprocess output automatically to one file when there are multiprocess proto files written into output folder
- Execute the merge multiprocess script from the rank 0 process
- Added the scripts folder path to env path, via setup-env.sh
- Installed merge_multiprocess_output.sh to /share/rocprofiler-systems/bin dir

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2024-12-18 17:34:02 -05:00
Sajina PK 3fa37c991e Enable VCN tracing in Perfetto output (#65)
Enable VCN activity tracing on different instances from the GPU metrics fetched using rsmi_dev_gpu_metrics_info_get in the ROCm-SMI library.

The tracing can be controlled with ROCPROFSYS_ROCM_SMI_METRICS by setting the value to vcn_activity. Currently this configuration accepts the following values: busy, temp, power, mem_usage, vcn_activity.
By default, all 5 values are enabled.

Signed-off-by: Sajina P Kandy <Sajina.PuthalathKandy@amd.com>
Co-authored-by: Sajina Kandy <sputhala-amd@amd.com>
2024-12-18 15:56:48 -05:00
David Galiffi 88aa2d3cbe Update to use rocprofiler-sdk (#55)
- Renames the CMake option "ROCPROFSYS_USE_HIP" to "ROCPROFSYS_USE_ROCM"
- Remove the "ROCPROFSYS_USE_ROCM_SMI" option. It is controlled with the "ROCPROFSYS_USE_ROCM" option instead.
   - Runtime configuration can still toggle ROCPROFSYS_USE_ROCM_SMI to disable the sampling.
- Rename ROCPROFSYS_HIP_VERSION macro to ROCPROFSYS_ROCM_VERSION and remove blocks for `ROCPROFSYS_ROCM_VERSION < 60000`
- Remove ROCPROFSYS_USE_ROCTRACER and ROCPROFSYS_USE_ROCPROFILER
- Update test cases
- Update docker files and workflows to install cmake 3.21, which is required for the rocprofiler-sdk findPackage script.
- Removed rocm-6.2 from workflows due to a rocprofiler-sdk API change.
2024-12-13 18:48:39 -05:00