125 Commits

Author SHA1 Message Date
vedithal-amd aa5dfb98f9 MI350 Fix L2 cache to HBM read counters/metrics (#2501)
* Fix rocprofiler-sdk metrics definition

* Use TCC_EA0_RDREQ_128B instead of TCC_BUBBLE counter for L2 cache to
  HBM counters and metrics

* Update MI350 counter definitions
    * FETCH_SIZE
    * BANDWIDTH_EA

* Update MI350 metrics definitions
    * System Speed of Light, L2-Fabric Read BW
    * Roofline Plot Points, AI (Arithmetic Intensity) HBM
    * Roofline Performance Rates, HBM Bandwidth

* Remove redundant definition for gfx950 and fix BANDWIDTH_EA definition

Test HBM bandwidth metric for memcopy workload

* Add memcopy.cpp workload

* Add metric validation test suite to validate HBM Bandwidth metric for
  memcopy workload

* Move gpu_soc() to test_utils.py for better re-usability

* Update TUI analysis config

* Fix hbm bandwidth formula for mi350 in calc_ai_profile

Co-authored-by: Alysa Liu <Alysa.Liu@amd.com>
2026-01-23 15:56:24 -05:00
vedithal-amd c5bfb37289 Improve documentation for standalone binary creation (#2446)
* Add cmake based instructions to create standalone binary

* Specify standalone binary extraction path in doc.

* Add documentation to explain how to specify self-extraction path
  when building the standalone binary where contents of the binary
  are extracted during execution

* Pin Nuitka to version 2.6 for consistency in building standalone binary
2026-01-09 17:40:47 -05:00
vedithal-amd 050e88ee71 Remove unused python packages (#2437)
* Remove dependency on following unused python packages by updating
  requirements.txt, LICENSE, standalone binary requirements, cmake and
  docker requirements
    * matplotlib
    * kaleido
    * pymongo
    * colorlover
    * tqdm

* Remove unused code from src/utils/gui.py

* Reformat python using ruff
2026-01-07 09:03:49 -05:00
vedithal-amd 588773f9bf [rocprofiler-compute] Fix for multi process workload profiling (#2418)
* Fix for multi process workload profiling

Native counter collection tool updates:
    * Do not dump empty counter data for a process
    * Use PID instead of UUID for dumped csv files to facilitate correlation
    * Handle merging multiple pairs of rocpd (from sdk tool) and csv (from
      native tool) files
    * Handle merging multiple pairs of csv (from sdk tool) and csv (from
      native tool) files

Rocpd output format updates:
    * Merge multiple rocpd databases into a single csv
    * Reset dispatch id and kernel id for unique dispatches and unique
      kernels respectively
    * Retain multiple rocpd databases per run for multi process workloads

* Add test case for multiprocess profiling using rocflop workload

* Add rocflop

* Fix native counter csv to rocprofv3 csv conversion

* Use kernel_id instead of dispatch_id to correlate native counter csv
  and kernel trace csv

* python formatting using ruff 0.14 instead of 0.13
2025-12-23 13:12:18 -05:00
abchoudh-amd 5b241f3e61 Fixed ctests (#2406) 2025-12-22 13:12:58 +05:30
Jason Bonnell 112b4fd413 [rocprofiler-compute] Add SDK dependency to rocprofiler-compute-tarball.yml workflow (#2329)
* Install rocm-dev in rocprofiler-compute-tarball.yml workflow

* Update paths for push and PR for rocprofiler-compute-tarball.yml

* Add ROCm dependencies to disttest job

* cmake fix binary link creation and fix format

* Use python3 instead of python3.9 in RHEL 8 and RHEL 9 workflows

* set default python3 to python3.9 in rhel8

* Try alternatives setup for python3 in RHEL8 env

* Add pip install cmake to debug RHEL8 issue

* Remove python3.11 in RHEL8 workflow

* Add back comment regarding RHEL8

---------

Co-authored-by: Vignesh Edithal <Vignesh.Edithal@amd.com>
2025-12-18 11:56:23 -05:00
vedithal-amd e4abee4f7d [rocprofiler-compute] Improve iteration multiplexing code and documentation (#2080)
* Improve Iteration multiplexing

* Improve iteration multiplexing documentation by adding usage note and
  listing caveats

* Bugfixes for iteration multiplexing
    * Use merge iteration multiplexing in analysis webui and db mode
    * Do not remove Dispatch_ID column in merge iteration multiplexing
      since it is needed for analysis of top dispatches based on
      duration

* Bugfixes for analysis logic
    * Graceful handling of missing counters in case of iteration
      multiplexing
    * Improved warnings when metrics could not be calculated due to
      missing counter data
    * Fix the check to prevent showing table when a column is full of
      N/A
    * Improve detection of empty values when metric evaluation fails
      due to missing counter data
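
The "column is full of N/A" check mentioned above could look roughly like the following. This is a minimal sketch, not the project's actual code: the function name `has_all_na_column`, the list-of-dicts table shape, and the `"N/A"` marker string are assumptions made for illustration.

```python
def has_all_na_column(table, na_value="N/A"):
    """Return True if any column of the table holds only the N/A marker.

    `table` is a hypothetical list of row dicts; the real analysis code
    may use a different data structure.
    """
    if not table:
        return False
    columns = table[0].keys()
    return any(
        all(row.get(col) == na_value for row in table)
        for col in columns
    )


# A table whose "Value" column could not be computed for any row.
empty_table = [
    {"Metric": "L2-Fabric Read BW", "Value": "N/A"},
    {"Metric": "HBM Bandwidth", "Value": "N/A"},
]
# A table where at least one row has data in every column.
partial_table = [
    {"Metric": "L2-Fabric Read BW", "Value": "N/A"},
    {"Metric": "HBM Bandwidth", "Value": 12.5},
]
```

A caller could skip rendering a table when the function returns True and emit a missing-counter warning instead.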

* Bugfixes for profile logic
    * Fix kernel filtering during roofline benchmark phase

* Update changelog for bugfixes

* Remove unnecessary columns when merging dispatches for iteration multiplexing

* bugfix

* Better analysis warnings

* fix to_std() in parser

* Use median in merge iteration multiplex

* Address review comments

* Fix cmake formatting

* fix None handling of parser util functions

* Enable stochastic counter accuracy test

* fix cmake formatting
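
As an illustration of the median-based merge mentioned above, counter values collected over several iterations of the same kernel can be reduced to one value per counter. This is a sketch under stated assumptions, not the actual implementation: the function name `merge_iterations`, the `Kernel_Name` key, and the sample counter values are hypothetical, and counters missing in an iteration are represented as `None`.

```python
from collections import defaultdict
from statistics import median


def merge_iterations(rows):
    """Merge per-iteration counter rows into one row per kernel.

    Each counter is reduced with the median of the values that were
    actually collected; None marks a counter missing in that iteration.
    """
    grouped = defaultdict(lambda: defaultdict(list))
    for row in rows:
        kernel = row["Kernel_Name"]
        for counter, value in row.items():
            if counter == "Kernel_Name" or value is None:
                continue
            grouped[kernel][counter].append(value)
    return {
        kernel: {name: median(values) for name, values in counters.items()}
        for kernel, counters in grouped.items()
    }


# Hypothetical data: three iterations of one kernel, one counter
# not collected in the first iteration.
rows = [
    {"Kernel_Name": "laplace", "SQ_WAVES": 100, "GRBM_GUI_ACTIVE": None},
    {"Kernel_Name": "laplace", "SQ_WAVES": 104, "GRBM_GUI_ACTIVE": 2000},
    {"Kernel_Name": "laplace", "SQ_WAVES": 400, "GRBM_GUI_ACTIVE": 2100},
]
merged = merge_iterations(rows)
```

The median is less sensitive than the mean to a single outlier iteration, such as the 400 in the sample data.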
2025-12-18 11:51:21 -05:00
abchoudh-amd 6d9d880d31 [rocprofiler-compute] Counter accuracy tests and improvements for iteration multiplexing (#2011)
* Added laplace solver in samples

* Add laplace eqn in CMake

* Added counter accuracy test

* Add iteration CLI arg for laplace eq

* Unnest profile method

* Missing counter warning

* Updated insufficient kernel warning

* Added reference for laplace equation

* variable name change

* Added comments for data comparison

* Included scipy as test requirement

* Added line number for ref

* split stochastic and deterministic tests

* Added order cli option for laplace_eqn

* Install laplace eqn

* Missing counter warning

* Warn about missing kernels during analysis

* Update tests

* Split iteration multiplexing ctests

* Updated warning

* Incorporated copilot's suggestions
2025-12-17 18:26:39 +05:30
vedithal-amd 4870725a62 Do not use absolute python path when adding tests (#2282) 2025-12-12 10:57:19 -05:00
cfallows-amd 9d34098350 [rocprofiler-compute] Roofline runtime compilation patch (#2232)
* Add install into CMakeLists.txt file- resolves 'no hip module' issues.
* Re-add printout line for peak VALU during benchmarking that was removed by accident in a different commit.
* Add CHANGELOG entry for commit 2bfa9a4 ("Integrate roofline benchmark into rocprof-compute (#2015)")

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Run formatter checks on rocprof-compute to clear PR checks

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Update benchmark.py link in changelog

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions to CHANGELOG from code review

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
2025-12-10 01:44:28 -05:00
vedithal-amd d8a8a3ef30 [rocprofiler-compute] Add exception handling for native tool path search (#2159)
* Add exception handling for native tool path search

* Fix formatting in roofline benchmark code

* Fix detection of .so files

* include hip code and native tool code in standalone binary

* add fallback path for ROCM_PATH
2025-12-04 10:29:49 -05:00
vedithal-amd ac640c13d6 [rocprofiler-compute] Allow to specify path for standalone binary extraction (#2162)
* Allow to specify path for standalone binary extraction

* Add cmake option -D STANDALONEBINARY_EXTRACT_DIR=<path> to specify extraction dir. for binary

* fix formatting

---------

Co-authored-by: Fei Zheng <44449748+feizheng10@users.noreply.github.com>
2025-12-04 10:13:18 -05:00
vedithal-amd 7a2df64b59 [rocprofiler-compute] Enable running tests from installation only for TheRock setup (#2067)
* Enable running tests from installation only

* Use cmake option -DTEST_FROM_INSTALL=ON to enable running tests from installation folder only
    * It is not possible to run tests from build folder in this case
    * This option prevents changing working directory to source folder

* Fix SourceFileLoader to import rocprof-compute main module correctly

* Install sample executables in the test folder

* fix num_xcds_cli_output test

* Fix tests

* Skip autogen. config. test and add a TODO task for re-design of this
  test

* Add flexible import of source code in test_gpu_specs.py

* Update cmake to install tests/workloads folder when INSTALL_TESTS=ON

* Fix sys.argv[0] for tests

* fix live attach detach test
2025-12-04 10:12:38 -05:00
Jason Bonnell e68873c170 Gersemi formatting for rocprofiler-compute (#1997)
* Run gersemi formatting on cmake files in compute

* Run gersemi again but on updated version
2025-11-25 09:49:16 -05:00
jonatluu 6b8aae3796 Enable Lintian Support rocm-systems (#1578)
* draft testing fix for no copyright file and no changelog

* test fix no-changelog no-copyright

* changelog copyright fix

* remove utils.cmake

* rocr lintian

* lintian overrides, copyright, changelog install

* fix lintian overrides install

* comp_type static fix and remove debug logs

* syntax error

* update static build check

* update file permissions to 0755 to fix error control-file-has-bad-permissions 0664 != 0755

* fix lintian errors in rdc and remove logs from roctracer

* lintian error fix rocprofiler

* fix lintian error

* move lintian overrides install

* lintian errors fix

* move lintian overrides install

* use changelog already provided by rdc

* fix formatting use existing changelog if provided

* fix formatting use changelog in rocprofiler

* remove overrides. Use existing changelog and copyright

* resolve merge conflict

* update license for hsa-rocr. Use NCSA license

* install license
2025-11-20 11:38:39 -05:00
abchoudh-amd 433af908a6 Iteration multiplexing in rocprof-compute (#1533)
* Profile with multiple input files

* Iteration multiplexing kernel option

* Iteration multiplexing data

* Iteration multiplexing

* Counter profile caching

* Counter dispatch info

* Sanitize CLI args

* Formatting and removed unused header file

* Formatting

* Changed CLI args

* Merge counters for analysis

* Iteration multiplexing log while analysis

* Formatting

* Log

* Guard against incomplete profiling

* Fixed merge counter

* Tests

* Update doc

* Test update

* Fixed formatting

* Test fix

* Merge conflict commit

* Fix tests

* Added comment for counter definition file

* Do not allow dispatch filtering with iteration multiplexing

* Fixed formatting

* Doc indentation update
2025-11-19 21:09:16 +05:30
abchoudh-amd 76ea35787d Split roofline tests, and fix none outputs (#1913)
* Split roofline tests

* Use N/A for missing values

* Test eval_expression for no valid data

* Fixed tests

* Updated Changelog for N/A

* Fixed platform specific test failure
2025-11-19 15:36:08 +05:30
vedithal-amd ae8f72fa79 [rocprofiler-compute] Use native tool for counter collection (#1212)
* Use native tool for counter collection

* Add native counter collection tool which uses rocprofiler-sdk C++
  library public API to get counter collection data
    * This is enabled by default, unless --no-native-tool option is
      provided or ROCPROF=rocprofv3 env. var. is provided
    * This tool is only supported for ROCm version >=7.x.x
    * This tool is not supported for attach/detach scenario
* Build native tool shared object during build time
* If using rocprof-compute without building then runtime compilation of
  the native tool shared object is performed
* rocprofiler-sdk tool is still used for services other than counter
  collection and data collected by native tool is merged into the
  rocpd/csv output of rocprofiler-sdk tool

* Make `rocpd` choice the default choice for `--format-rocprof-output`
  option
    * If `rocpd` public API from rocprofiler-sdk library is not present,
      then fallback to `csv` choice
    * In this case only `pmc_perf.csv` is written in workload folder
      instead of multiple `csv` files for each profiling run
* Remove `json` choice from `--format-rocprof-output` option since it
  functions identically to `csv` option

* Rename option `--rocprofiler-sdk-library-path` to
  `--rocprofiler-sdk-tool-path` since we LD_PRELOAD the
  rocprofiler-sdk tool shared object and not the rocprofiler-sdk library
  shared object

* Fix the meaning of `--dispatch` option in `profile` mode to mention
  dispatch iteration filtering instead of dispatch id filtering
    * --dispatch option in analyze mode does dispatch id filtering

* Move standalone binary creation logic from cmake file to docker file

* fix native counter collection tool during attach/detach

* improve logging

* fix attach detach with native tool

* fix attach detach with native tool

* do not support attach/detach in native tool

* Update changelog

* add standalone binary creation functionality in cmake

* address review comments

* address review comments

* fix formatting

* address review comments

* Adding paths for cmake to search. Also updated min. cmake requirement to 3.21 as this was when hip was supported.

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Update hip compiler ID check, sometimes comes up as Clang, sometimes ROCMClang- depends on setup.
Updated formatting.

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* RHEL8.10 unable to compile due to defaulting to old c++ version, need to force c++17

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Updating changelog per docs team recommendations

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Apply suggestions from code review to changelog

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

* Do not require HIP compiler to build native counter collection tool

* fix cmake

* gersemi formatting on latest cmake change

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* ex ci updated dependencies to include rocprofiler-sdk, but cmake was still not capturing the path- there was a commit that added to the cmake_prefix_path entry that specified rocprof-sdk's cmake location but was too specific for the search paths in find_package's config mode.
removing the cmake_prefix_path var and adding hints to find_package call instead, and specifying config mode so it knows how to construct the search paths

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* gersemi run for formatting

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Still need prefix path, should not have been removed in last commit but does need to be shortened to just the rocm path to allow for find_package config mode to do the job

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* include cstdint for uint32_t

* Run formatting on helper.cpp

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Remove rocm 7.2 release stuff from version and changelog and handle it in separate pr

* fix version

* fix changelog

* fix changelog

* run ruff formatter

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* fix rocprofiler-sdk attach so path

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
2025-11-18 23:34:38 -05:00
jamessiddeley-amd d49e2e35fd [rocprof-compute] Automate ctest coverage and test cases on runners with CDash (#1481)
* Add nightly coverage workflow

* ruff formatting

* temp workflow testing

* restore workflow file

* add workflow condition

* update workflow file

* update workflow file

* fix typo in run-ci.py

* edit run-ci.py

* add python deps install

* add python deps install

* add python deps install

* add python deps install

* check if enable coverage is on when using workflow

* remove github CI breakdown and fix enable coverage

* set cache variables must be set before dashboard starts

* Update run-ci.py

* Update run-ci.py to fix ctest cache

* Update rocprofiler-compute-code-coverage.yml to install tests

* Update rocprofiler-compute-code-coverage.yml

* Restore workflow file

* Update run-ci.py

* Simplify workflow build command

* Update run-ci.py to build tests

* edited run-ci script

* edit ctest configure commands

* edit ctest configure commands to be on one line

* edit ctest configure command to include path to amdclang++

* update clang check in tests/cmakelists.txt

* update rocm

* update rocm

* update rocm version 7.0.2

* update tests/CMakeLists.txt

* use tarball instead for rocm install

* apt install rocm-dev instead for 7.0.0 release

* workflow tweaks

* update to use new 'tools' dir

* install rocm-dev

* add CMAKE_CXX_COMPILER as clang

* update tests/cmakelists.txt

* update cdash site and build names

* remove run automatically on pull requests

* ruff format

* increased timeouts for tests

* add back reruns for workflow testing

* fix typo

* rename workflow "nightly" -> "code"

* added tracks to keep track of gpu (325 vs 355)

* remove test_db_connector.py

* revert build names and tracking

* update workflow pushes

* CMake format

* changed parallel level back to 1
2025-11-17 09:24:24 -05:00
xuchen-amd b774f28181 [rocprofiler-compute] Remove grafana and mongodb integration (#978)
* Remove grafana and mongodb integration

* Remove grafana documentation assets

* clarify changelog

---------

Co-authored-by: Vignesh Edithal <Vignesh.Edithal@amd.com>
2025-10-29 11:32:06 -04:00
xuchen-amd 578589d363 [rocprofiler-compute] metrics generator (#1199) 2025-10-22 15:17:43 -04:00
Fei Zheng 2c59a82fe1 Fix rocprof-compute TUI build err with python 39 (#303)
* Upgrade min python version from 3.8 to 3.9

* Set min version for textual-fspicker for TUI support

* Update workflows to use python 3.9 instead of 3.8

* fix formatting

* fix bug

---------

Co-authored-by: Vignesh Edithal <Vignesh.Edithal@amd.com>
2025-10-21 00:27:35 -04:00
Young Hui - AMD 02bf0a8492 [rocprofiler-compute] Source files updated to reference super-repo URL (#1330)
* source files updated to reference super-repo URL
2025-10-15 15:35:11 -04:00
vedithal-amd 5f12d9b789 Fix instructions to build standalone binary (#1116) 2025-09-24 16:31:08 -04:00
systems-assistant[bot] 872f0aed0c Live attach/detach and its unit tests (#53) 2025-09-23 13:17:08 -04:00
ywang103-amd 775ac73d25 change interval for host_trap in unit test to adapt to single kernel (#1064) 2025-09-19 17:21:02 -04:00
Jason Bonnell eebf5ead8c Replace cmake-format with gersemi in rocprofiler-compute-formatting.yml (#1053)
* Replace cmake-format with gersemi in rocprofiler-compute-formatting.yml

* Run gersemi formatting on CMakeLists.txt files

* Remove .cmake-format.yaml, add .gersemirc file

* Add more options to .gersemirc

* Add new line to .gersemirc

* Add new line to CMakeLists.txt

* Run gersemi again with new options
2025-09-19 08:42:40 -04:00
abchoudh-amd 7d847dde3f Split tests (#952) 2025-09-12 12:29:48 +05:30
Ammar ELWazir 2a9700fcd7 [ROCProfiler-Register/Systems/Compute] Fixing License file name usage (#927)
ROCProfiler-Register/Systems/Compute: The license file name in the CMake install module and other locations was originally LICENSE, but it was recently changed to LICENSE.md, requiring an update to the CMake install module and all other relevant locations.
2025-09-10 15:46:39 -04:00
jamessiddeley-amd f3a2bb07a4 [rocprofiler-compute] added ctest coverage and cdash submission (#366)
* added cdash automatic CI upload

* added cdash automatic CI upload

* tweaked wording

* changed nightly to continuous

* removed unnecessary dry-run arg

* updated README.md

* edited workflow description

* update coverage

* formatted cmakelists.txt

* ruff formatting and update coverage
2025-09-02 11:21:40 -04:00
vedithal-amd 748c9b74d9 Update standalone binary to use python 3.9 (#725)
* Update standalone docker to python 3.9

* Add TUI files

* Fix docker files to work with monorepo

* Update standalone binary documentation
2025-08-25 07:57:08 -04:00
ywang103-amd 2a216ecbc1 pc sampling unit tests (#194) 2025-08-23 10:13:22 -04:00
cfallows-amd 3258c69b60 Fix cmake formatting (#222)
Fix formatting of CMakeLists.txt for cmake-format check

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-08-11 16:40:09 -04:00
vedithal-amd 97d9f35033 Fix ctest and docker to work with monorepo (#181)
* Remove .git folder and git command check in cmake

* Update docker container to work in monorepo
    * Update docker container to mount the top level folder in monorepo
2025-08-08 10:26:57 -04:00
vedithal-amd 2444c20172 Fix cmake to skip textual_fspicker check (#121) 2025-08-07 11:43:22 -04:00
xuchen-amd 34dd26fb07 Enable single pass counter collection (#833)
[ROCm/rocprofiler-compute commit: 6a77d241ed]
2025-08-06 10:35:05 -04:00
vedithal-amd 354fe5f52c Unified configuration for metrics (#726)
* Show description of metrics during analysis
    * Use --include-cols Description to show the Description column in analyze mode (this is hidden by default)
    * Remove tips field from analysis config

* Align metric names in analysis config and documentation

* Add unified config utils/unified_config.yaml

* Add python script utils/split_config.py to auto generate analysis configuration and documentation metrics description
   * Add test case to ensure unified config is older than auto-generated config
   * Auto generate analysis config and documentation metrics description

* Update CONTRIBUTING.md to add instructions to build documentation assets
    * Add docker image and compose file to build documentation

* Update CHANGELOG and Documentation

* Use jinja template instead of hardcoding metric tables in documentation

[ROCm/rocprofiler-compute commit: bb44e90b2d]
2025-07-25 14:01:34 -04:00
vedithal-amd 1cf98deedf Fix rocprofv3 supported counters not being detected (#832)
* Fix rocprofv3 supported counters not being detected

* Fix rocprof interface deprecation warning appearing twice

[ROCm/rocprofiler-compute commit: dbcaccb9de]
2025-07-24 11:50:07 -04:00
vedithal-amd fc2037870f fix build (#823)
[ROCm/rocprofiler-compute commit: c4d129def5]
2025-07-22 13:02:14 -04:00
vedithal-amd 46ae3d36d9 Remove hardware IP block based filtering (#820)
* Analysis report block based filtering is the default now

* Update documentation

* Update CHANGELOG

* Fix tests
    * Replace hardware block based filtering tests with report block
      based filtering tests

[ROCm/rocprofiler-compute commit: 98bb0f4237]
2025-07-21 09:37:35 -04:00
vedithal-amd ce73a5ef74 Fix roofline and TUI bugs (#803)
* Fix roofline rocm version bug
* Fix utils bug
* Remove unnecessary tests
* Do not check textual-fspicker package in cmake build
* Use rocprofv3 to test MI 100 and fix tests

[ROCm/rocprofiler-compute commit: 000fd4f5b2]
2025-07-09 19:15:46 -04:00
xuchen-amd bac7fde4f4 Add tui cmake install. (#794)
[ROCm/rocprofiler-compute commit: 60a50e681b]
2025-07-08 11:18:26 -04:00
jamessiddeley-amd 94ea0fbf2f additional-code-coverage-compute (#763)
* added additional functions to test_utils.py

* added code coverage for db_connector.py

* Update test_profile_general.py

Added additional roofline test cases

Signed-off-by: jamessiddeley-amd <James.Siddeley@amd.com>

* updated coverage mi_gpu_spec.py 73% -> 94%

* added parser.py coverage

* removed redundant comments

* added test_utils and test_db_connector

---------

Signed-off-by: jamessiddeley-amd <James.Siddeley@amd.com>

[ROCm/rocprofiler-compute commit: a6463f5e98]
2025-07-02 13:29:10 -04:00
Fei Zheng 2ffaf5b453 Documentation update for FP8 on MI300 (#766)
[ROCm/rocprofiler-compute commit: f5bc717fe1]
2025-06-26 13:35:36 -06:00
David Galiffi 3a703cec00 Provide a version for RPM Obsoletes attribute (#670)
Fix RPM generation warning

[ROCm/rocprofiler-compute commit: 1903e8e748]
2025-06-25 12:36:47 -04:00
Kunal Malviya fba643793b Adding verbose and changing threads (#771)
Co-authored-by: rocm <rocm@rocm-System-Kunal.amd.com>

[ROCm/rocprofiler-compute commit: 661de1d483]
2025-06-25 18:49:46 +05:30
xuchen-amd af114a1539 Add test for gfx942 number of xcds. (#674)
* Add test for gfx942 number of xcds.

* Improve the structure of mi gpu specs, add num_xcds_spec_class test.

* Add to ctest.

---------

Signed-off-by: xuchen-amd <xuchen@amd.com>

[ROCm/rocprofiler-compute commit: 85bfa73e2c]
2025-04-28 11:29:14 -04:00
vedithal-amd 27585a8a2b Support MI 350 profiling (#632)
* Add MI 350 hardware information

* Refactor MI GPU YAML file and corresponding interface

* Add SoC file for gfx950 architecture

* Add analysis report configs for MI 350 containing existing metrics

* Add placeholder None valued metrics for previous architectures to make
  baseline comparison work

* Enable testing on MI 350

* Analysis config metric changes
    - SPI changes
        - Update metric formula for default SPI pipe counter
            - Use efficiently collected pipe wise SPI counters
        - Add SPI Wave Occupancy
        - Add Scheduler-Pipe Wave Utilization
        - Update formula for VGPR Writes
        - Add Scheduler-Pipe FIFO Full Rate
    - CPC changes
        - Add CPC SYNC FIFO Full Rate
        - Add CPC CANE Stall Rate
        - Add CPC ADC Utilization
    - SQ changes
        - Add VALU co-issue efficiency
        - Add F6F4 datatype metrics
        - Update formula for total FLOPs by adding F6F4 counters
        - Add LDS STORE / LOAD / ATOMIC metrics
        - Add LDS STORE / LOAD / ATOMIC bandwidth
        - Add LDS FIFO and TA ADDR / CMD / DATA FIFO full rates

* Collect TCP_TCP_LATENCY_sum only for gfx950 (MI 350)

* Do not inject SQ_ACCUM_PREV_HIRES unnecessarily

* Do not hardcode memory and shader clock speeds

* Write num_hbm_channels to sysinfo.csv instead of hbm_bw while profiling

* Move sysinfo.csv generation to the pre-processing step of profiling

* Add warnings to use --specs-correction for missing sysinfo.csv values during analysis phase

* Update CHANGELOG

* Analysis phase warning to use --specs-correction when needed

[ROCm/rocprofiler-compute commit: f9aa7be97c]
2025-04-03 02:21:18 -04:00
xuchen-amd 08e083cc25 Add mi300 TCP counter tests (#644)
* Add new sample applications.

* Generalize py test launcher for additional apps.

* Add TCP pytest, and add to ctest.

* Update licensing.

* Disable for non-mi300 machines.

[ROCm/rocprofiler-compute commit: 591632dd69]
2025-04-02 20:32:13 -04:00
vedithal-amd 0f4b5e91bd Standalone binary no self execute fix (#603)
* Fix nuitka command

[ROCm/rocprofiler-compute commit: 15edbf475e]
2025-03-11 13:34:37 -04:00