214 Commits

Author SHA1 Message Date
vedithal-amd a838b0c07b [rocprofiler-compute] Fix test case for MI 308 (#2934)
* Fix test case for MI 308

* Use consistent naming of GPUs in comment
2026-01-29 18:54:52 -05:00
vedithal-amd 717cdde126 Update test_metric_validation.py to handle MI325X (#2866) 2026-01-27 16:12:05 -05:00
ggottipa-amd 77f7541755 [rocprofiler-compute] Adding --torch-trace option for SWDEV-559789 (#2089)
* Adding --torch-operator option in rocprof-compute. Creates csv file for
each operator that has gpu activity, showing operator to counter values
mapping.

* --torch-operators flag added to rocprofiler-sdk

* Adding ctest for --torch-operators.

* Adding pytest markers.

* Corrections in ctest and message logging.

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestion from @Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Adding a check for pytorch installation only when --torch-operators is passed.

* moving inject_roctx.py into src/utils.

* rebase

* Updating docs and changelog.

* Update projects/rocprofiler-compute/src/argparser.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update projects/rocprofiler-compute/src/utils/inject_roctx.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Removing special characters.

* Minor corrections.

* Setting default value for torch_operators_enabled.

* Updating the number of files according to the number of passes.

* Adding rocpd support.

* Adding a warning message to be shown when profiling a non-python workload.

* copilot suggestions, rocpd+native tool fix

* Fixed the incorrect usage of dispatch_id as event_id in the function update_rocpd_pmc_events()

* ruff format fix

* ruff formatting

* Deleting torch_trace.csvs after consolidating the operator data.

* Removing checks since *torch_trace.csv files are deleted.

* Fixing file deletion.

* Update projects/rocprofiler-compute/src/utils/inject_roctx.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update projects/rocprofiler-compute/src/rocprof_compute_profile/profiler_base.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update projects/rocprofiler-compute/src/utils/utils.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update projects/rocprofiler-compute/tests/test_profile_general.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Using default options in the testcase.

* Adding test for overhead measurement.

* Corrections in docs.

* doc updates.

* Update projects/rocprofiler-compute/src/utils/inject_roctx.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Handling potential empty frames.

* Corrected the test cases.

* Changing the flag to --torch-trace

* Fixed helper_app path issues

* Path issues

* process_torch_trace_output() now takes csv file paths as input + allows default usage.

* Replaced pandas with sqlite3

* Adding marker_trace extraction to rocpd_data.py

* Allowing all workloads to use --torch-trace option. Assuming the workload is user verified.

* Modified help section for the flag.

* Added difference in runtimes for longest running kernels in each profiling runs to overhead measurements.

* Update projects/rocprofiler-compute/src/rocprof_compute_profile/profiler_base.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Update projects/rocprofiler-compute/src/rocprof_compute_profile/profiler_base.py

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Removed the accesses to the tables.

* Ruff fixes.

* ruff

* Ruff Fixes

* Adding getattr for args.torch_trace to handle mock args.

* Fix for 'Missing guid in counter collection data - in csv mode'

* Sending output_format to process_torch_trace_output

* Warning for self contained binaries.

* Ruff

* Ruff

* Measuring longest_running_kernel_baseline instead of worst_kernel_increase, very small kernel runtimes are blowing up the worst_kernel_increase metric.

* Minor fixes in input arguments

* Ruff

* Logging PyTorch version

* Fix ruff formatting for PyTorch version logging

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-01-27 19:50:25 +05:30
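The `getattr` guard for `args.torch_trace` mentioned in #2089 above could look roughly like the sketch below. It is only an illustration, not the project's code: the helper name `torch_trace_enabled` and the `False` default are assumptions; only the `torch_trace` attribute name comes from the commit message.

```python
from argparse import Namespace
from types import SimpleNamespace


def torch_trace_enabled(args) -> bool:
    """Return True only if args carries a truthy torch_trace attribute.

    Hypothetical helper: getattr with a default avoids AttributeError when a
    test passes a mock or partial args object that lacks the attribute.
    """
    return bool(getattr(args, "torch_trace", False))


real_args = Namespace(torch_trace=True)
mock_args = SimpleNamespace()  # stands in for mock args without torch_trace
```

With this pattern, `torch_trace_enabled(mock_args)` evaluates to `False` instead of raising.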
vedithal-amd aa5dfb98f9 MI350 Fix L2 cache to HBM read counters/metrics (#2501)
* Fix rocprofiler-sdk metrics definition

* Use TCC_EA0_RDREQ_128B instead of TCC_BUBBLE counter for L2 cache to
  HBM counters and metrics

* Update MI350 counter definitions
    * FETCH_SIZE
    * BANDWIDTH_EA

* Update MI350 metrics definitions
    * System Speed of Light, L2-Fabric Read BW
    * Roofline Plot Points, AI (Arithmetic Intensity) HBM
    * Roofline Performance Rates, HBM Bandwidth

* Remove redundant definition for gfx950 and fix BANDWIDTH_EA definition

Test HBM bandwidth metric for memcopy workload

* Add memcopy.cpp workload

* Add metric validation test suite to validate HBM Bandwidth metric for
  memcopy workload

* Move gpu_soc() to test_utils.py for better re-usability

* Update TUI analysis config

* Fix hbm bandwidth formula for mi350 in calc_ai_profile

Co-authored-by: Alysa Liu <Alysa.Liu@amd.com>
2026-01-23 15:56:24 -05:00
vedithal-amd 809eca7616 [rocprofiler-compute] Pin dependencies and fix test configuration paths, remove setuptools dependency (#2821)
* Pin dependencies and fix test paths for package layout

- Pin all dependencies in requirements.txt to specific versions to ensure stability and reproducibility.
- Update test_autogen_config.py to correctly resolve source paths for both development and installed package layouts.
- Validated compatibility with Python 3.9, 3.10, 3.11, and 3.12.

* Remove setuptools dependency since we don't support pip install and instead use cmake
2026-01-23 15:46:58 -05:00
cfallows-amd 62dd4d114d [rocprofiler-compute] Fixes for roofline when used with iteration multiplexing (#2635)
* Added iteration_multiplex_impute_counters on pmc data - GUI dataframe did not implement this in the build_layout method previously
* Created a Workload() in profile mode post-processing for roofline html standalone plot to be generated - this will be removed once roofline plot is moved to analyze phase in future release
* Added iteration_multiplexing run parameter to roofline object init so that we can accurately parse dataframe if the option was used during profiling - this helps us to avoid reading nan values in certain dispatches that did not get imputed in calc_ai_profile
* Cleanup for unused legacy code, adjusted method parameters to assist in moving roofline plotting to analyze mode in future release
* Update iteration multiplexing data imputation algorithm to impute counters for ungrouped dispatches at the end based on the previous group. This however won't work if there are no dispatches that can be grouped (i.e. number of dispatches < number of counter buckets)

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: Vignesh Edithal <Vignesh.Edithal@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2026-01-23 11:10:46 -05:00
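The imputation idea described in #2635 above (trailing, ungrouped dispatches borrow counters from the previous group) might be sketched as follows. This is a simplified illustration under stated assumptions, not the project's algorithm: the function name `impute_counters`, the dict-based dispatch records, and the rule that each run of `num_buckets` consecutive dispatches covers every counter once are all hypothetical.

```python
from typing import Dict, List, Optional

Dispatch = Dict[str, Optional[float]]


def impute_counters(dispatches: List[Dispatch], num_buckets: int) -> List[Dispatch]:
    """Fill missing (None) counters using values from grouped dispatches.

    Assumption: dispatches arrive in order and every full group of
    `num_buckets` dispatches collectively observes each counter.
    """
    filled: List[Dispatch] = []
    previous_group: Dispatch = {}
    for start in range(0, len(dispatches), num_buckets):
        group = dispatches[start:start + num_buckets]
        if len(group) == num_buckets:
            # Full group: gather every counter value seen inside the group.
            source = {k: v for d in group for k, v in d.items() if v is not None}
            previous_group = source
        else:
            # Trailing ungrouped dispatches: borrow from the previous group.
            source = previous_group
        for d in group:
            filled.append(
                {k: (v if v is not None else source.get(k)) for k, v in d.items()}
            )
    return filled


example = [
    {"A": 1.0, "B": None},
    {"A": None, "B": 2.0},
    {"A": 3.0, "B": None},  # ungrouped: only one dispatch left for two buckets
]
result = impute_counters(example, num_buckets=2)
```

As the commit message notes, this kind of fallback cannot help when there are fewer dispatches than counter buckets: in the sketch there is then no previous group, and the missing values stay `None`.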
jamessiddeley-amd 69281bbcf4 [rocprofiler-compute] Threshold Based Clamping in Analyze Stage (#2565)
* add threshold clamping function + parse in parser.py (with I/O)

* implemented hybrid threshold solution

* update changelog

* removed absolute threshold hybrid approach; restored relative threshold + warn

* edited warning msg, threshold -> 1%

* update changelog

* added 2 test cases

* ran master workflow yaml config files

* added to FAQ

* Revert "ran master workflow yaml config files"

This reverts commit 75a670e14d6f1619ebbda0ec218755ccbe0d22b1.

* update FAQ

* update config hashes

* Broke down long functions into Class with sub-functions

* ruff format

* addressed comments
2026-01-23 00:54:54 -05:00
abchoudh-amd dd149d3957 [rocprofiler-compute] Support new attach/detach API (#2642)
* Removed attach tool library path

* Support new attach/detach API

* New attach/detach API was introduced in
  https://github.com/ROCm/rocm-systems/pull/1653

* Provide backward compatibility with old api

* Stabilize attach/detach tests by adding sleep to help workload get
  ready for attachment

* Fix typo in test name

---------

Co-authored-by: Vignesh Edithal <Vignesh.Edithal@amd.com>
Co-authored-by: Fei Zheng <44449748+feizheng10@users.noreply.github.com>
2026-01-19 16:00:14 -05:00
vedithal-amd 0254181f42 [rocprofiler-compute] Analysis Database Schema Improvements (v1.2.0) (#2526)
* Analysis database v1.2.0

* `pc_sampling` and `roofline_data` tables should relate to `kernel` table instead of `workload` table

* Remove `kernel_name` fields in `pc_sampling` and `roofline_data` table

* Add kernel existence check for roofline data to prevent KeyError (#2536)

* Initial plan

* Add kernel existence check for roofline data to prevent KeyError

Co-authored-by: vedithal-amd <191402304+vedithal-amd@users.noreply.github.com>

---------

Co-authored-by: copilot-swe-agent[bot] <198982749+Copilot@users.noreply.github.com>
Co-authored-by: vedithal-amd <191402304+vedithal-amd@users.noreply.github.com>

* Optimize analysis performance

* Refactor database schema: separate metric definitions from kernels

Reorganize the database ORM to decouple metric definitions from kernel
objects. This improves the schema design by:

- Rename Metric -> MetricDefinition and Value -> MetricValue for clarity
- Move metric definitions from kernel-level to workload-level, since
  metric definitions are shared across kernels
- Update relationships: MetricDefinition belongs to Workload,
  MetricValue references both MetricDefinition and Kernel
- Refactor metric_view to join through the new schema structure
- Update test fixtures to use renamed table and class names
- Update documentation with new example output using nbody workload
- Regenerate database schema and views diagrams

* Add min and max aggregation in kernel_view

* Add primary key id from tables into the view

---------

Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: vedithal-amd <191402304+vedithal-amd@users.noreply.github.com>
2026-01-19 15:25:43 -05:00
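The relationships described in #2526 above (MetricDefinition belongs to Workload; MetricValue references both MetricDefinition and Kernel; `metric_view` joins through them) can be illustrated with a small in-memory database. This is a sketch only: the commit refers to the project's ORM, whereas the example uses plain `sqlite3` DDL, and every column name, the `bodyForce` kernel name, and the sample values are assumptions made for illustration.

```python
import sqlite3

# Hypothetical schema mirroring the relationships named in the commit message.
SCHEMA = """
CREATE TABLE workload (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE kernel (
    id INTEGER PRIMARY KEY,
    workload_id INTEGER NOT NULL REFERENCES workload(id),
    kernel_name TEXT
);
CREATE TABLE metric_definition (
    id INTEGER PRIMARY KEY,
    workload_id INTEGER NOT NULL REFERENCES workload(id),
    name TEXT
);
CREATE TABLE metric_value (
    id INTEGER PRIMARY KEY,
    metric_definition_id INTEGER NOT NULL REFERENCES metric_definition(id),
    kernel_id INTEGER NOT NULL REFERENCES kernel(id),
    value REAL
);
CREATE VIEW metric_view AS
SELECT mv.id AS id, k.kernel_name, md.name AS metric_name, mv.value
FROM metric_value mv
JOIN metric_definition md ON md.id = mv.metric_definition_id
JOIN kernel k ON k.id = mv.kernel_id;
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO workload VALUES (1, 'nbody')")
conn.execute("INSERT INTO kernel VALUES (1, 1, 'bodyForce')")
conn.execute("INSERT INTO metric_definition VALUES (1, 1, 'HBM Bandwidth')")
conn.execute("INSERT INTO metric_value VALUES (1, 1, 1, 42.0)")
rows = conn.execute(
    "SELECT kernel_name, metric_name, value FROM metric_view"
).fetchall()
```

Storing the definition once per workload means each additional kernel adds only `metric_value` rows, which is the decoupling the commit message describes.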
xuchen-amd 71b9ea6ba0 [rocprofiler-compute] improve config management system (#2359) 2026-01-14 13:20:27 -05:00
vedithal-amd f073f1adf2 Fix test for data imputation for iteration multiplexing (#2564) 2026-01-09 12:42:14 -05:00
vedithal-amd 51ba3c3a53 [rocprofiler-compute] Standalone roofline should create HTML instead of PDF (#2535)
* Standalone roofline should create HTML instead of PDF

* Eliminate the dependency on kaleido and plotly_get_chrome by moving
  towards plotly native HTML image roofline chart generation

* Address review comments
2026-01-09 09:05:49 -05:00
vedithal-amd 769d3dd67a [rocprofiler-compute] Data imputation strategy for iteration multiplexing (#2468)
* Data imputation strategy for iteration multiplexing

* Implement data imputation methodology to handle missing counter values
  in case of iteration multiplexing

* Enable dispatch filtering with iteration multiplexing since we are no
  longer merging dispatches

* Bugfix to prevent check for missing counter values when using csv
  format when profiling with iteration multiplexing

* Move warning and info message in case of iteration multiplexing to
  sanitize function which comes earlier in analyze mode

* Address review comments

* Fix typo in documentation

* Move profiling config init. after path check in sanitize()

* Graceful handling of dispatches with all counters empty within data
  imputation logic

* Improve info message for iteration multiplexing based analysis

* Ensure proper error message when trying to run iteration multiplexing with attach/detach

* fix test case
2026-01-08 12:01:51 -05:00
vedithal-amd ca32193c84 Fix test cases (#2462) 2025-12-30 11:39:20 -05:00
vedithal-amd 61fd728fdb [rocprofiler-compute] Faster counter accuracy testing (#2420)
* Faster counter accuracy testing

* Better handle SPI_CSN_* metrics for lesser than MI350 series

* Use metric filtering to collect only relevant counters for comparison

* Ensure all workload folders are deleted after testing is completed

* Don't use clean_existing=False

* Add manual test for all counter accuracy
2025-12-23 13:13:53 -05:00
vedithal-amd d7302d6c1c [rocprofiler-compute] Test env. vars. in rocprofiler-sdk backend (#2414)
* Test env. vars. in rocprofiler-sdk backend

* Improve rocprofiler-sdk backend test case to check for env. vars. and
  ensure we do not overwrite irrelevant env. vars.

* Remove unnecessary usage of ROCPROF_INDIVIDUAL_XCC_MODE env. var.

* Formatting fixes

* Test fixes

* Remove redundant code in tests

* Remove usage of utils_mod and use utils instead, this prevents
  duplicate imports
2025-12-23 13:13:28 -05:00
vedithal-amd 588773f9bf [rocprofiler-compute] Fix for multi process workload profiling (#2418)
* Fix for multi process workload profiling

Native counter collection tool updates:
    * Do not dump empty counter data for a process
    * Use PID instead of UUID for dumped csv files to facilitate correlation
    * Handle merging multiple pairs of rocpd (from sdk tool) and csv (from
      native tool) files
    * Handle merging multiple pairs of csv (from sdk tool) and csv (from
      native tool) files

Rocpd output format updates:
    * Merge multiple rocpd databases into a single csv
    * Reset dispatch id and kernel id for unique dispatches and unique
      kernels respectively
    * Retain multiple rocpd databases per run for multi process workloads

* Add test case for multiprocess profiling using rocflop workload

* Add rocflop

* Fix native counter csv to rocprofv3 csv conversion

* Use kernel_id instead of dispatch_id to correlate native counter csv
  and kernel trace csv

* python formatting using ruff 0.14 instead of 0.13
2025-12-23 13:12:18 -05:00
abchoudh-amd 5b241f3e61 Fixed ctests (#2406) 2025-12-22 13:12:58 +05:30
vedithal-amd e4abee4f7d [rocprofiler-compute] Improve iteration multiplexing code and documentation (#2080)
* Improve Iteration multiplexing

* Improve iteration multiplexing documentation by adding usage note and
  listing caveats

* Bugfixes for iteration multiplexing
    * Use merge iteration multiplexing in analysis webui and db mode
    * Do not remove Dispatch_ID column in merge iteration multiplexing
      since it is needed for analysis of top dispatches based on
      duration

* Bugfixes for analysis logic
    * Graceful handling of missing counters in case of iteration
      multiplexing
    * Improved warnings when metrics could not be calculated due to
      missing counter data
    * Fix the check to prevent showing table when a column is full of
      N/A
    * Improve detection of empty values when metric evaluation fails
      due to missing counter data

* Bugfixes for profile logic
    * Fix kernel filtering during roofline benchmark phase

* Update changelog for bugfixes

* Remove unnecessary columns when merging dispatches for iteration multiplexing

* bugfix

* Better analysis warnings

* fix to_std() in parser

* Use median in merge iteration multiplex

* Address review comments

* Fix cmake formatting

* fix None handling of parser util functions

* Enable stochastic counter accuracy test

* fix cmake formatting
2025-12-18 11:51:21 -05:00
abchoudh-amd 6d9d880d31 [rocprofiler-compute] Counter accuracy tests and improvements for iteration multiplexing (#2011)
* Added laplace solver in samples

* Add laplace eqn in CMake

* Added counter accuracy test

* Add iteration CLI arg for laplace eq

* Unnest profile method

* Missing counter warning

* Updated insufficient kernel warning

* Added reference for laplace equation

* variable name change

* Added comments for data comparison

* Included scipy as test requirement

* Added line number for ref

* split stochastic and deterministic tests

* Added order cli option for laplace_eqn

* Install laplace eqn

* Missing counter warning

* Warn about missing kernels during analysis

* Update tests

* Split iteration multiplexing ctests

* Updated warning

* Incorporated copilot's suggestions
2025-12-17 18:26:39 +05:30
jamessiddeley-amd 706a8382a5 [rocprof-compute] added graceful exit with corrupt roofline.csv in profile and analyze mode (#1811)
* added graceful errors/exit in profile/analyze roofline.csv

* edit if statement truth

* restore if statement truth (roofline_csv needs at least 2 rows)

* addressed comments and skipped showing roof metrics when data invalid

* fix workload merge

* changed warning to error

* removed redundant variable definition

* added roofline csv validate check in TUI

* add test cases to test validation function

* ruff format

* simplified TUI roofline handling
2025-12-12 17:06:37 -05:00
vedithal-amd 793732a04e [rocprofiler-compute] Improve amdsmi interface (#2245)
* Improve amdsmi interface

* Fix issue where max mem clock was being set as max gfx clock

* Handle the case when all device handles might not be usable due to
  devices being hidden by ROCR and HIP environment variables

* Fix get gpu vram size to return str in KB

* Improve testing of amdsmi interface functions
2025-12-12 09:02:37 -05:00
jamessiddeley-amd d27bd37042 [rocprof-compute] Fix roofline "test_roof_plot_modes" test case (#2217)
* fix roof test to be isolated file paths

* fix typo

* addressed comments

* fix typos
2025-12-10 10:01:49 -05:00
ywang103-amd 092ca13f4f [rocprofiler-compute] add try catch to ensure subprocess killed if test of attach/detach fails (#2139)
* add try catch to ensure subprocess killed if test of attach/detach fails

* remove unnecessary comments

* remove duplicated cleanup
2025-12-04 15:49:03 -08:00
vedithal-amd 7a2df64b59 [rocprofiler-compute] Enable running tests from installation only for TheRock setup (#2067)
* Enable running tests from installation only

* Use cmake option -DTEST_FROM_INSTALL=ON to enable running tests from installation folder only
    * It is not possible to run tests from build folder in this case
    * This option prevents changing working directory to source folder

* Fix SourceFileLoader to import rocprof-compute main module correctly

* Install sample executables in the test folder

* fix num_xcds_cli_output test

* Fix tests

* Skip autogen. config. test and add a TODO task for re-design of this
  test

* Add flexible import of source code in test_gpu_specs.py

* Update cmake to install tests/workloads folder when INSTALL_TESTS=ON

* Fix sys.argv[0] for tests

* fix live attach detach test
2025-12-04 10:12:38 -05:00
Ben Richard 2bfa9a4d4c Integrate roofline benchmark into rocprof-compute (#2015)
---------

Co-authored-by: Fei Zheng <44449748+feizheng10@users.noreply.github.com>
2025-12-03 10:51:46 -05:00
vedithal-amd 3f2fbc18e9 [rocprofiler-compute] Only depend on amdsmi in profile phase (#2044)
* Only depend on amdsmi in profile phase

* amdsmi interface tests should have common prefix for easier testing
2025-11-28 11:32:00 -05:00
jamessiddeley-amd 833425577f fix roofline kernel_names test case (#1954) 2025-11-21 15:04:08 -05:00
abchoudh-amd 433af908a6 Iteration multiplexing in rocprof-compute (#1533)
* Profile with multiple input files

* Iteration multiplexing kernel option

* Iteration multiplexing data

* Iteration multiplexing

* Counter profile caching

* Counter dispatch info

* Sanitize CLI args

* Formatting and removed unused header file

* Formatting

* Changed CLI args

* Merge counters for analysis

* Iteration multiplexing log while analysis

* Formatting

* Log

* Guard against incomplete profiling

* Fixed merge counter

* Tests

* Update doc

* Test update

* Fixed formatting

* Test fix

* Merge conflict commit

* Fix tests

* Added comment for counter definition file

* Do not allow dispatch filtering with iteration multiplexing

* Fixed formatting

* Doc indentation update
2025-11-19 21:09:16 +05:30
abchoudh-amd 76ea35787d Split roofline tests, and fix none outputs (#1913)
* Split roofline tests

* Use N/A for missing values

* Test eval_expression for no valid data

* Fixed tests

* Updated Changelog for N/A

* Fixed platform specific test failure
2025-11-19 15:36:08 +05:30
vedithal-amd ae8f72fa79 [rocprofiler-compute] Use native tool for counter collection (#1212)
* Use native tool for counter collection

* Add native counter collection tool which uses rocprofiler-sdk C++
  library public API to get counter collection data
    * This is enabled by default, unless --no-native-tool option is
      provided or ROCPROF=rocprofv3 env. var. is provided
    * This tool is only supported for ROCm version >=7.x.x
    * This tool is not supported for attach/detach scenario
* Build native tool shared object during build time
* If using rocprof-compute without building then runtime compilation of
  the native tool shared object is performed
* rocprofiler-sdk tool is still used for services other than counter
  collection and data collected by native tool is merged into the
  rocpd/csv output of rocprofiler-sdk tool

* Make `rocpd` choice the default choice for `--format-rocprof-output`
  option
    * If `rocpd` public API from rocprofiler-sdk library is not present,
      then fallback to `csv` choice
    * In this case only `pmc_perf.csv` is written in workload folder
      instead of multiple `csv` files for each profiling run
* Remove `json` choice from `--format-rocprof-output` option since it
  functions identical to `csv` option

* Rename option `--rocprofiler-sdk-library-path` to
  `--rocprofiler-sdk-tool-path` since we LD_PRELOAD the
  rocprofiler-sdk tool shared object and not the rocprofiler-sdk library
  shared object

* Fix the meaning of `--dispatch` option in `profile` mode to mention
  dispatch iteration filtering instead of dispatch id filtering
    * --dispatch option in analyze mode does dispatch id filtering

* Move standalone binary creation logic from cmake file to docker file

* fix native counter collection tool during attach/detach

* improve logging

* fix attach detach with native tool

* fix attach detach with native tool

* do not support attach/detach in native tool

* Update changelog

* add standalone binary creation functionality in cmake

* address review comments

* address review comments

* fix formatting

* address review comments

* Adding paths for cmake to search. Also updated min. cmake requirement to 3.21 as this was when hip was supported.

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Update hip compiler ID check, sometimes comes up as Clang, sometimes ROCMClang- depends on setup.
Updated formatting.

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* RHEL8.10 unable to compile due to defaulting to old c++ version, need to force c++17

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Updating changelog per docs team recommendations

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Apply suggestions from code review to changelog

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* Do not require HIP compiler to build native counter collection tool

* fix cmake

* gersemi formatting on latest cmake change

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* ex ci updated dependencies to include rocprofiler-sdk, but cmake was still not capturing the path - there was a commit that added to the cmake_prefix_path entry that specified rocprof-sdk's cmake location but was too specific for the search paths in find_package's config mode.
removing the cmake_prefix_path var and adding hints to find_package call instead, and specifying config mode so it knows how to construct the search paths

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* gersemi run for formatting

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Still need prefix path, should not have been removed in last commit but does need to be shortened to just the rocm path to allow for find_package config mode to do the job

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* include cstdint for uint32_t

* Run formatting on helper.cpp

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Remove rocm 7.2 release stuff from version and changelog and handle it in separate pr

* fix version

* fix changelog

* fix changelog

* run ruff formatter

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* fix rocprofiler-sdk attach so path

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
2025-11-18 23:34:38 -05:00
jamessiddeley-amd d49e2e35fd [rocprof-compute] Automate ctest coverage and test cases on runners with CDash (#1481)
* Add nightly coverage workflow

* ruff formatting

* temp workflow testing

* restore workflow file

* add workflow condition

* update workflow file

* update workflow file

* fix typo in run-ci.py

* edit run-ci.py

* add python deps install

* add python deps install

* add python deps install

* add python deps install

* check if enable coverage is on when using workflow

* remove github CI breakdown and fix enable coverage

* set cache variables must be set before dashboard starts

* Update run-ci.py

* Update run-ci.py to fix ctest cache

* Update rocprofiler-compute-code-coverage.yml to install tests

* Update rocprofiler-compute-code-coverage.yml

* Restore workflow file

* Update run-ci.py

* Simplify workflow build command

* Update run-ci.py to build tests

* edited run-ci script

* edit ctest configure commands

* edit ctest configure commands to be on one line

* edit ctest configure command to include path to amdclang++

* update clang check in tests/cmakelists.txt

* update rocm

* update rocm

* update rocm version 7.0.2

* update tests/CMakeLists.txt

* use tarball instead for rocm install

* apt install rocm-dev instead for 7.0.0 release

* workflow tweaks

* update to use new 'tools' dir

* install rocm-dev

* add CMAKE_CXX_COMPILER as clang

* update tests/cmakelists.txt

* update cdash site and build names

* remove run automatically on pull requests

* ruff format

* increased timeouts for tests

* add back reruns for workflow testing

* fix typo

* rename workflow "nightly" -> "code"

* added tracks to keep track of gpu (325 vs 355)

* remove test_db_connector.py

* revert build names and tracking

* update workflow pushes

* CMake format

* changed parallel level back to 1
2025-11-17 09:24:24 -05:00
jamessiddeley-amd 42cc721a4b [rocprof-compute] remove references to --kernel-names (#1543)
* remove references to --kernel-names

* ruff format

* remove redundant comments

* update docs and roofline image

* added two output lines to docs
2025-11-10 11:47:39 -05:00
vedithal-amd bb5fd1d4ae [rocprofiler-compute] Update analysis db for visualizer integration (#1548)
* Analysis db changes for visualizer

* Add support for per kernel analysis metrics

* Add support for dispatch timeline visualization

* Show median instead of mean of dispatch duration in kernel view

* Add test case to validate analysis db schema

* Analysis db schema update
    * Add Kernel table and make Metric and Dispatch table its children
    * Kernel table is a child of Workload table
    * Update metric_view to show kernel_name column
    * Add dispatch timestamps to Dispatch table for dispatch timeline
      visualization
    * Update kernel_view to show duration_ns_median instead of mean
      duration

* Add mean duration in kernel view

* update changelog

---------

Co-authored-by: Fei Zheng <44449748+feizheng10@users.noreply.github.com>
2025-11-03 09:25:12 -05:00
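The per-kernel aggregates mentioned in #1548 above (median alongside mean dispatch duration) can be illustrated with the standard library. This is a sketch with made-up data, not the project's view definition: only the `duration_ns_median` name appears in the commit message; `duration_ns_mean`, the kernel names, and the durations are assumptions.

```python
from collections import defaultdict
from statistics import mean, median

# (kernel_name, duration_ns) pairs; values are invented for illustration.
dispatches = [
    ("kernel_a", 100),
    ("kernel_a", 110),
    ("kernel_a", 900),  # one slow outlier
    ("kernel_b", 50),
    ("kernel_b", 70),
]

# Group dispatch durations by kernel name.
durations = defaultdict(list)
for kernel_name, duration_ns in dispatches:
    durations[kernel_name].append(duration_ns)

# One row per kernel, similar in spirit to a kernel-level view.
kernel_view = {
    name: {
        "duration_ns_median": median(values),
        "duration_ns_mean": mean(values),
    }
    for name, values in durations.items()
}
```

For `kernel_a` the single outlier pulls the mean to 370 ns while the median stays at 110 ns, which is one plausible reason to show the median by default.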
xuchen-amd b774f28181 [rocprofiler-compute] Remove grafana and mongodb integration (#978)
* Remove grafana and mongodb integration

* Remove grafana documentation assets

* clarify changelog

---------

Co-authored-by: Vignesh Edithal <Vignesh.Edithal@amd.com>
2025-10-29 11:32:06 -04:00
ywang103-amd 99183ffd92 fix failure of pc sampling and unit tests (#1526) 2025-10-28 11:30:32 -04:00
abchoudh-amd a7bbe0c5d2 Use amd-smi Python API instead of CLI (#1334)
* Use amd-smi Python API instead of CLI

Formatting fix

python path

* Update CHANGELOG

* Create amdsmi interface

* Added amdsmi tests

* Removed run

* Prioritize rocm's amdsmi python API

* address review comments

* update changelog

* fix ruff formatting

---------

Co-authored-by: Vignesh Edithal <Vignesh.Edithal@amd.com>
2025-10-24 11:11:33 +05:30
xuchen-amd 578589d363 [rocprofiler-compute] metrics generator (#1199) 2025-10-22 15:17:43 -04:00
jamessiddeley-amd 64375c23d0 [rocprof-compute] Improve standalone roofline plot generation (#1298)
* ruff formatting

* Update roofline.py function descriptions

* Update height calculation

* Add back cache level filtering in gui_analysis

* Update roofline_calc.py to take in ai_data for ceiling length calc

* format roofline.py

* update roof test cases

* update roofline legend plot table

* fix pdf generate cutoff

---------

Co-authored-by: cfallows-amd <Carrie.Fallows@amd.com>
2025-10-10 14:23:23 -04:00
vedithal-amd 4870b2b881 Fix tests (#1213) 2025-10-03 09:52:38 -07:00
ywang103-amd eeeaa06159 attach/detach: change workload of unit test to accommodate SDK's current limitation (#1169)
* add double mode of workload dynamic_share with on remove sleeping and
set ROCP_TOOL_ATTACH=1 for running workload

* add comment in dynamic_shared.hip to explain how to use argv

* refactor the attach/detach profiling time in unit tests
2025-09-30 13:16:43 -07:00
vedithal-amd bd7a1de879 Remove rocprofv1/v2 in favour of rocprofiler-sdk (#673)
* Set default rocprof interface as rocprofiler-sdk

* Remove rocprofv1 and rocprofv2 interfaces

* Remove deprecation notice for rocprof v1/v2/v3 interfaces
  * Make rocprofiler-sdk the default interface and make rocprofv3 interface opt-in using ROCPROF=rocprofv3

* Add deprecation notice for rocprofv3
2025-09-24 10:37:01 -04:00
vedithal-amd f5505b5989 Use ROCM_PATH for sdk library path (#1097) 2025-09-24 10:31:20 -04:00
systems-assistant[bot] 872f0aed0c Live attach/detach and its unit tests (#53) 2025-09-23 13:17:08 -04:00
cfallows-amd 9819e1cbfc Refactor roofline binary detection (#933)
* Simplify the roofline binary pickup process by determining which base distribution the system OS is based off of, and select the correct binary.
* Add more OS distribution support to roofline by modifying the detection parameters and adding an AZL binary
* Update changelog to include roofline support additions

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-09-22 12:04:20 -04:00
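Picking a binary by the base distribution, as described in #933 above, is commonly done by reading the `ID` and `ID_LIKE` fields of `/etc/os-release`. The sketch below shows that general approach only; it is not the project's detection code, and the function names, the `FAMILY_ALIASES` mapping, and the family labels are hypothetical.

```python
from typing import Dict, Optional


def parse_os_release(text: str) -> Dict[str, str]:
    """Parse os-release style KEY=value lines into a dict."""
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        fields[key] = value.strip().strip('"').strip("'")
    return fields


# Hypothetical mapping from distribution identifiers to a base family label.
FAMILY_ALIASES = {
    "ubuntu": "debian", "debian": "debian",
    "rhel": "rhel", "fedora": "rhel", "centos": "rhel",
    "sles": "suse", "suse": "suse", "opensuse": "suse",
    "azurelinux": "azl",
}


def base_family(fields: Dict[str, str]) -> Optional[str]:
    """Return the first known family among ID and the ID_LIKE entries."""
    candidates = [fields.get("ID", "")] + fields.get("ID_LIKE", "").split()
    for name in candidates:
        family = FAMILY_ALIASES.get(name.lower())
        if family:
            return family
    return None


rocky = parse_os_release('NAME="Rocky Linux"\nID="rocky"\nID_LIKE="rhel centos fedora"\n')
family = base_family(rocky)
```

Checking `ID_LIKE` after `ID` lets derivative distributions resolve to their parent family without listing each one explicitly.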
abchoudh-amd a927f246f6 Fix test failures (#1059)
* Test fix

* Added path not exists check
2025-09-22 18:07:13 +05:30
ywang103-amd 775ac73d25 change interval for host_trap in unit test to adapt to single kernel (#1064) 2025-09-19 17:21:02 -04:00
Jason Bonnell eebf5ead8c Replace cmake-format with gersemi in rocprofiler-compute-formatting.yml (#1053)
* Replace cmake-format with gersemi in rocprofiler-compute-formatting.yml

* Run gersemi formatting on CMakeLists.txt files

* Remove .cmake-format.yaml, add .gersemirc file

* Add more options to .gersemirc

* Add new line to .gersemirc

* Add new line to CMakeLists.txt

* Run gersemi again with new options
2025-09-19 08:42:40 -04:00
ywang103-amd 97f8b7b1ec change to single-kernel workload for pc_sampling tests (#955) 2025-09-16 10:17:23 -04:00
xuchen-amd 7ed6000e32 [rocprofiler-compute] Refactor to add type annotation and misc (#787) 2025-09-12 13:53:24 -04:00