Commit Graph

1177 Commits

Author SHA1 Message Date
cfallows-amd bbe2e17b80 Rename roofline bins (#717)
Rename roofline bins, remove rocm version in naming. Change method for binary search.

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-05-28 14:55:51 -04:00
anisha-amd 783193c75f adding L2 model with updated legend and removal of large images (#718)
* adding L2 model with updated legend and removal of large images

* changed image name to perf_model
2025-05-28 14:03:47 -04:00
cfallows-amd cb2d928ecf Add F4 F6 to roofline for MI350 series (#709)
Add roofline bins with FP4 FP6 datatypes enabled for gfx950 arch

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-05-26 18:36:31 -04:00
jamessiddeley-amd 09b6ef4508 Fixed duplicate keys in analysis_configs yamls (#707)
* fixed duplicate keys in analysis_configs yamls

* Fix: removed TODO comment

Signed-off-by: James Siddeley <james.siddeley@amd.com>

---------

Signed-off-by: James Siddeley <james.siddeley@amd.com>
2025-05-20 13:12:46 -04:00
Ben Richard 41dd4aab90 Update illegal character check for profile name (#703) 2025-05-16 15:45:16 -04:00
cfallows-amd 43dbf38b27 Check mode during soc init for roofline (#705)
Check mode before creating roofline object; skip if only printing specs

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-05-16 12:54:53 -04:00
vedithal-amd 7b755fcd86 Fix MI350 tests (#706)
- TCC counter collection tests are still failing due to recent
  rocprofiler-sdk change
2025-05-15 13:37:55 -04:00
Vignesh Edithal 6522fe954b Add James Siddeley to code reviewers 2025-05-15 12:14:39 -04:00
vedithal-amd 5cb86e31fc Implement interface to rocprofiler sdk (#695)
* Setting ROCPROF=rocprofiler-sdk environment variable will use rocprofiler-sdk C++ library instead of rocprofv3 python script

* Add runtime option --rocprofiler-sdk-library-path to use custom version of rocprofiler sdk library
    * Add --rocprofiler-sdk-library-path conftest option for tests

* Setup appropriate environment variables to inject rocprofiler sdk code to user command
    * Add env. vars. for counter collection and filtering
    * Add env. vars. for pc sampling

* Use python bindings to list counters supported by rocprofiler sdk
2025-05-13 10:48:21 -04:00
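The `ROCPROF` switch described in commit 5cb86e31fc can be illustrated with a minimal sketch. The function names below are hypothetical and are not taken from the repository. The `rocprofv3` default is an assumption based on commit 3e09f038e5 further down this log.

```python
import os
from typing import Mapping, Optional

# Assumption: rocprofv3 is the default when ROCPROF is unset (see commit 3e09f038e5).
DEFAULT_BACKEND = "rocprofv3"


def select_profiler_backend(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the backend named by ROCPROF, or the default if it is unset or empty."""
    env = os.environ if env is None else env
    value = env.get("ROCPROF", "").strip()
    return value or DEFAULT_BACKEND


def uses_sdk_library(env: Optional[Mapping[str, str]] = None) -> bool:
    """True when ROCPROF selects the rocprofiler-sdk library path (hypothetical helper)."""
    return select_profiler_backend(env) == "rocprofiler-sdk"
```

Passing the environment as a mapping keeps the helper easy to exercise in tests without modifying `os.environ`.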
cfallows-amd d527d77337 Fix setting roofline-data-type option in both profile and analyze modes (#702)
Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-05-12 23:02:47 -04:00
xuchen-amd 4e24b2c60a Improve gpu spec tests using chip ids. (#701) 2025-05-09 11:48:25 -04:00
vedithal-amd dbb7f4d493 Use gpu model series instead of gpu model name for testing (#696) 2025-05-06 18:23:08 -04:00
vedithal-amd abd500593b Fix PC sampling analysis config issue (#697) 2025-05-06 18:22:15 -04:00
Ben Richard 35493f440c Avoid crash when profiling data not generated (#694)
* Avoid crash when profiling data not generated

-Handle case where program has no kernel launches
-Improve error messages
-Avoid roofline when profiling data is missing

Signed-off-by: benrichard-amd <ben.richard@amd.com>

* Update other soc_gfx files to catch missing pmc_perf.csv

* Fix formatting

* Fix incorrectly ordered imports

---------

Signed-off-by: benrichard-amd <ben.richard@amd.com>
2025-05-05 16:09:48 -04:00
cfallows-amd 41e73650d5 Enable roofline for MI350 series (#677)
Rework of roofline binaries generated from rocm-amdgpu-bench
- removed arch identifier in bin name
- removed rocm5 bins altogether

Updated required distros for roofline
- updated distro checks and bin naming
- moved up ubuntu20.04->22.04 and sles15.3->15.6 per rocm support

Enabled ctests for mi350 for test_roof_*
- removed mi350 series check to skip these specific tests

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-04-28 16:08:23 -04:00
cfallows-amd ad17c4d587 Update CODEOWNERS (#680)
Add rp-compute technical writer directly for any documentation review.
Remove existing packaging review requests for single user; every repo owner should be notified.

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-04-28 16:05:51 -04:00
Daniel Su b77fcf575e Set rocprofV3 agent-index to absolute (#675)
Signed-off-by: Daniel Su <danielsu@amd.com>
2025-04-28 15:38:07 -04:00
xuchen-amd 85bfa73e2c Add test for gfx942 number of xcds. (#674)
* Add test for gfx942 number of xcds.

* Improve the structure of mi gpu specs, add num_xcds_spec_class test.

* Add to ctest.

---------

Signed-off-by: xuchen-amd <xuchen@amd.com>
2025-04-28 11:29:14 -04:00
xuchen-amd ee73c2a119 process hip trace output. (#654)
Signed-off-by: xuchen-amd <xuchen@amd.com>
2025-04-22 18:31:47 -04:00
xuchen-amd f145f89e30 Patch in new rocprofv3 metrics. (#679) 2025-04-22 18:30:26 -04:00
cfallows-amd 346c7e452a Update runner distro in Formatting workflow (#678)
Update formatting workflow to use 22.04. 20.04 deprecated last week.

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-04-22 16:14:15 -04:00
David Galiffi a50e44ec25 Bump VERSION to 3.2.0 2025-04-16 15:23:27 -06:00
ywang103-amd 3e09f038e5 change default rocprof version to v3 when not setting env variable (#673) 2025-04-16 12:38:20 -04:00
ywang103-amd fe2035d166 configure rocprofv3 as default for unit test (#668) 2025-04-11 19:30:18 -04:00
cfallows-amd c056a39db4 Add roofline support for rhel10 (#667)
-add check for rhel10 (platform:el10), force use rhel roof binary
-update changelog in 'unreleased- added' section

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-04-11 17:45:53 -04:00
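The `platform:el10` check mentioned in commit c056a39db4 might look roughly like the sketch below. It assumes the value comes from the `PLATFORM_ID` field of `/etc/os-release`. The helper names are hypothetical, and the text is passed in as a string so the logic can be tried without reading the real file.

```python
def parse_os_release(text: str) -> dict:
    """Parse KEY=VALUE lines in os-release format into a dictionary."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Values may be wrapped in double or single quotes.
        fields[key] = value.strip().strip('"').strip("'")
    return fields


def is_el10(os_release_text: str) -> bool:
    """True when PLATFORM_ID identifies an Enterprise Linux 10 platform."""
    return parse_os_release(os_release_text).get("PLATFORM_ID") == "platform:el10"
```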
cfallows-amd 03732d3719 Fix rpath checks during RPM generation on RHEL10 (#669)
Invalid rpath on roofline binaries reported during build testing for new RHEL10 addition, removed rpaths to prevent rpath check failures.

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-04-11 17:40:59 -04:00
Daniel Su 36aa7fb7a9 External CI: add parallel mainline checks for develop and staging branches (#666) 2025-04-11 15:34:18 -04:00
dependabot[bot] 550212a886 Bump rocm-docs-core from 1.18.1 to 1.18.2 in /docs/sphinx (#657)
Bumps [rocm-docs-core](https://github.com/ROCm/rocm-docs-core) from 1.18.1 to 1.18.2.
- [Release notes](https://github.com/ROCm/rocm-docs-core/releases)
- [Changelog](https://github.com/ROCm/rocm-docs-core/blob/develop/CHANGELOG.md)
- [Commits](https://github.com/ROCm/rocm-docs-core/compare/v1.18.1...v1.18.2)

---
updated-dependencies:
- dependency-name: rocm-docs-core
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-04-08 17:09:01 -06:00
Pratik Basyal 3b67a00bc9 Broken link and reference text updated (#664) 2025-04-08 12:44:09 -04:00
xuchen-amd e7a7af539a Add mi325 specs. (#663) 2025-04-07 17:03:40 -04:00
cfallows-amd c45e20f325 Fixes for roofline datatype plot outputs (#659)
Profile mode:
Fix roofline plots for datatypes that have peakVALU only. Check for highest roofline to plot the bandwidth lines to proper height, don't rely on existence of peakMFMA for every datatype.
Analyze mode:
Add roofline-data-type option for viewing pdfs in standalone gui. Default is same as profile mode, FP32.

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-04-07 12:10:37 -04:00
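The profile-mode fix in commit c45e20f325 (use the highest available roofline, not `peakMFMA` specifically) can be sketched as follows. This is an illustration only: the function name, the dictionary layout, and the units are assumptions, not the project's plotting code.

```python
from typing import Mapping, Optional, Tuple


def bandwidth_line_end(
    bandwidth: float, peaks: Mapping[str, Optional[float]]
) -> Tuple[float, float]:
    """Return the (arithmetic intensity, performance) point where a bandwidth
    line meets the highest available compute peak.

    `peaks` maps names such as "peakVALU" or "peakMFMA" to a value, or to None
    when that peak does not exist for the datatype.
    """
    available = [value for value in peaks.values() if value is not None]
    if not available:
        raise ValueError("no compute peak available for this datatype")
    top = max(available)
    # A bandwidth ceiling follows performance = bandwidth * intensity,
    # so it reaches `top` at intensity = top / bandwidth.
    return top / bandwidth, top
```

With only `peakVALU` present, the line still extends to that ceiling and does not fail on the missing `peakMFMA` entry.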
vedithal-amd f9aa7be97c Support MI 350 profiling (#632)
* Add MI 350 hardware information

* Refactor MI GPU YAML file and corresponding interface

* Add SoC file for gfx950 architecture

* Add analysis report configs for MI 350 containing existing metrics

* Add placeholder None valued metrics for previous architectures to make
  baseline comparison work

* Enable testing on MI 350

* Analysis config metric changes
    - SPI changes
        - Update metric formula for default SPI pipe counter
            - Use efficiently collected pipe wise SPI counters
        - Add SPI Wave Occupancy
        - Add Scheduler-Pipe Wave Utilization
        - Update formula for VGPR Writes
        - Add Scheduler-Pipe FIFO Full Rate
    - CPC changes
        - Add CPC SYNC FIFO Full Rate
        - Add CPC CANE Stall Rate
        - Add CPC ADC Utilization
    - SQ changes
        - Add VALU co-issue efficiency
        - Add F6F4 datatype metrics
        - Update formula for total FLOPs by adding F6F4 counters
        - Add LDS STORE / LOAD / ATOMIC metrics
        - Add LDS STORE / LOAD / ATOMIC bandwidth
        - Add LDS FIFO and TA ADDR / CMD / DATA FIFO full rates

* Collect TCP_TCP_LATENCY_sum only for gfx950 (MI 350)

* Do not inject SQ_ACCUM_PREV_HIRES unnecessarily

* Do not hardcode memory and shader clock speeds

* Write num_hbm_channels to sysinfo.csv instead of hbm_bw while profiling

* Move generate sysinfo.csv to pre processing step of profiling

* Add warnings to use --specs-correction for missing sysinfo.csv values during analysis phase

* Update CHANGELOG

* Analysis phase warning to use --specs-correction when needed
2025-04-03 02:21:18 -04:00
xuchen-amd f3736778f4 Add mi350 ta td tcp tcc counters (#653)
* Add mi350 TA and TD metrics.

* Add mi350 TCC metrics, and separate write and atomic metrics.

* Add mi350 TCP metrics.

* Add none values for non-gfx950 socs, remove missing metrics in rocprofv3.

---------

Signed-off-by: xuchen-amd <xuchen@amd.com>
2025-04-02 21:25:47 -04:00
xuchen-amd 591632dd69 Add mi300 TCP counter tests (#644)
* Add new sample applications.

* Generalize py test launcher for additional apps.

* Add TCP pytest, and add to ctest.

* Update licensing.

* Disable for non-mi300 machines.
2025-04-02 20:32:13 -04:00
xuchen-amd c7202923b0 remove flask debug msg (#655)
* Suppress Flask warning message in quiet mode.

* Init args.gui if it does not exist.
2025-04-02 20:29:39 -04:00
xuchen-amd dce75f4afa Enable tuned performance counters for gfx950 (#652)
* Enable non-functional performance counters for gfx950.

* Update changelog.

* Add none value metrics for non-gfx950 socs

* Remove rocprofv3 missing metrics.
2025-04-02 14:43:12 -04:00
raramakr df2296529b SWDEV-521636 - Add dependent script path to system path in rocprof-compute (#651)
In a wheel environment, rocprof-compute in the bin folder is not a soft link. To execute rocprof-compute from the bin folder, the system path must also include the dependency script paths, so these were added.
2025-04-02 09:41:02 -07:00
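One way to picture the change in commit df2296529b is the sketch below. It assumes "system path" refers to Python's module search path (`sys.path`). The helper name and the directory passed to it are hypothetical.

```python
import sys
from pathlib import Path


def ensure_on_sys_path(directory) -> str:
    """Put `directory` at the front of sys.path unless it is already listed.

    Returns the resolved path string that was checked or inserted.
    """
    resolved = str(Path(directory).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved
```

Checking membership first keeps repeated calls from adding duplicate entries.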
vedithal-amd a7ebbbd41e Weekly rebase liangdin-test on top of amd-mainline (#650) 2025-04-01 14:18:29 -04:00
xuchen-amd e77dd1a1ab Improve chip id logic (#648)
* Improve chip id logic, add missing physical and virtual chip ids.
2025-04-01 12:18:07 -04:00
ywang103-amd 7b38766caa re-write function that detects whether v1 is in use to avoid false negative result when ROCPROF is not set (#647) 2025-03-31 16:40:53 -04:00
Fei Zheng 9bacad0876 Support host-trap PC Sampling on CLI (beta version) 2025-03-28 16:51:49 -06:00
Ben Richard 9bd45f5135 Read Accum_VGPR_Count from rocprof output if provided (#645) 2025-03-28 10:43:24 -04:00
ywang103-amd 7c1f14123a fix the wrong number of channels of TCC counters to put in pmc txt file (#633) 2025-03-27 18:15:41 -04:00
ywang103-amd cdb93b7a4c fix ip block test by changing ways of extracting agent id (#639) 2025-03-27 16:28:00 -04:00
vedithal-amd af76525baa Inject SQ_ACCUM_PREV_HIRES for LEVEL counters only (#641) 2025-03-27 10:24:21 -04:00
cfallows-amd 6cb5bcdbe9 Add datatypes for roofline profiling (#642)
Rebuild of rocm-amdgpu-bench roofline binaries for MI200/MI300 systems with rocm6.
Added datatype options to roofline feature.

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
2025-03-26 21:07:48 -04:00
Cole Ramos 04f92b72a9 Fix incorrect logging in mi_gpu_spec.py (#626)
* Move console logging to logger function to avoid circular dependency in utils module

Signed-off-by: coleramos425 <colramos@amd.com>

* Apply python formatting

Signed-off-by: coleramos425 <colramos@amd.com>

* Remove the default StreamHandler before adding the custom

 If you are not explicitly removing this default handler, it could be causing duplicate outputs.

Signed-off-by: coleramos425 <colramos@amd.com>

* Fix lingering bugs from merge conflict resolution

Signed-off-by: coleramos425 <colramos@amd.com>

* Comply to python formatting and update pre-commit hook helper

Signed-off-by: coleramos425 <colramos@amd.com>

* Removing redundant console_log call as the get_mi300_num_xcds() call, otherwise ALL Mi200 profiling runs will print this message

Signed-off-by: coleramos425 <colramos@amd.com>

---------

Signed-off-by: coleramos425 <colramos@amd.com>
2025-03-25 17:06:37 -05:00
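The duplicate-output issue noted in commit 04f92b72a9 (a default `StreamHandler` left in place next to a custom one) can be reproduced and avoided with standard `logging` calls. The sketch below is not the repository's logger setup: the function name and the format string are assumptions.

```python
import logging
import sys


def install_console_handler(logger: logging.Logger) -> logging.Handler:
    """Replace any plain StreamHandler on `logger` with a single custom one."""
    for handler in list(logger.handlers):
        # Match StreamHandler exactly so subclasses such as FileHandler are kept.
        if type(handler) is logging.StreamHandler:
            logger.removeHandler(handler)
    custom = logging.StreamHandler(sys.stdout)
    custom.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
    logger.addHandler(custom)
    return custom
```

Calling the function more than once leaves exactly one console handler attached, so each message is printed once.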
xuchen-amd 3294c495f5 Improve readability. (#628) 2025-03-25 17:49:42 -04:00
Cole Ramos 38c7dce84a Generalize locale checker to support more UTF-8 types (#623)
Signed-off-by: coleramos425 <colramos@amd.com>
2025-03-25 16:39:02 -05:00
ywang103-amd 983f902fa0 fix the crash related to agent id in rocprofv3 (#631) 2025-03-25 16:33:12 -04:00