Commit Graph

64291 Commits

Author SHA1 Message Date
ammallya 5818c3e062 Adding debug statement 2025-08-28 10:44:36 -07:00
ammallya 2e1b6063a7 Change failing checkout logic 2025-08-28 10:25:27 -07:00
ammallya a6e1f61c45 Adding Auth 2025-08-28 10:03:48 -07:00
ammallya cba3213f41 Fix rocm_ci_caller 2025-08-28 09:55:52 -07:00
ammallya f40c04be62 Selective trigger of PSDB jobs (#783)
* Selective trigger of PSDB jobs
2025-08-28 09:50:11 -07:00
gabrpham 94e194eba2 [SWDEV-540377] Fixed segfault in --showevent command (#649)
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-08-28 11:49:36 -05:00
Sam Ruscica 9018e0fc7b SWDEV-546639 monorepo fix for nvidia hip runtime api (#746)
* SWDEV-546639 monorepo fix for nvidia hip runtime api

* Added back hipSetValidDevices.
2025-08-28 09:03:37 -07:00
usrihari123 2449bfd483 Update the scratch memory docs with the new allocation_size field (#685)
* Update the scratch memory docs with the new allocation_size field

* Address review comment

---------

Co-authored-by: Srihari <srihariu1@gmail.com>
2025-08-28 17:37:06 +05:30
Ioannis Assiouras 1017532916 SWDEV-546631 - Fix hipLaunchHostFunction in stream capture for windows (#654) 2025-08-28 07:51:50 +01:00
itrowbri 4d98a0169f Handle special cases when stream value is hipStreamLegacy (0x01) or hipStreamPerThread (0x02) (#343)
* Updated stream code to handle special cases when stream value is 0x01 or 0x02

* Removed extra definitions and updated tests to account for special case

* Modified stream.cpp so that each thread is assigned a unique stream ID when hipStreamPerThread is used as the stream value. Modified tests to check that threads are assigned unique, repeated values when hipStreamPerThread is called

* Updated idx_offset, stream_map, and thread counter to be in one struct.

* Update stream.cpp to only use add_stream() and update tests for separate unit test for hipStreamPerThread

* Remove unnecessary comment

* Removed unnecessary line

* Updated tests and stream.cpp to update stream ID correctly

* Updated test structure
2025-08-27 20:04:13 -05:00
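
The per-thread behavior described in commit 4d98a0169f above can be illustrated with a minimal Python sketch. It is not the code in stream.cpp: the class name `StreamIdRegistry`, its method, and the ID numbering are hypothetical, chosen only to show the idea that `hipStreamPerThread` (0x02) resolves to a different stream ID per calling thread, while repeated calls from the same thread return the same ID. Only the two constant values come from the commit message.

```python
import threading

# Special stream handle values named in the commit message.
HIP_STREAM_LEGACY = 0x01
HIP_STREAM_PER_THREAD = 0x02


class StreamIdRegistry:
    """Hypothetical registry; not the actual stream.cpp data structure."""

    def __init__(self):
        self._lock = threading.Lock()
        self._ids = {}      # lookup key -> assigned stream ID
        self._next_id = 0   # next unused stream ID

    def stream_id(self, stream_handle):
        # For hipStreamPerThread the key includes the calling thread, so each
        # thread gets its own ID. Every other handle maps to a single ID.
        key = stream_handle
        if stream_handle == HIP_STREAM_PER_THREAD:
            key = (HIP_STREAM_PER_THREAD, threading.get_ident())
        with self._lock:
            if key not in self._ids:
                self._ids[key] = self._next_id
                self._next_id += 1
            return self._ids[key]
```

Keeping the map and the counter behind one lock in a single object loosely mirrors the commit's note that the stream map and thread counter were moved into one struct, but the details here are assumptions.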
Julia Jiang 9aaad2017b SWDEV-525231 - Update changelog for 7.0 (#768) 2025-08-27 16:10:31 -04:00
Ioannis Assiouras 5f525ee934 SWDEV-550882 - Expect HSA_EXT_POINTER_TYPE_RESERVED_ADDR pointer type from hsa_amd_pointer_info for hmm (#733) 2025-08-27 19:42:13 +01:00
Karthik Jayaprakash 89070536c0 SWDEV-552141 - Fix handle/fd type passed from app to align with spec. (#759)
* SWDEV-552141 - Fix handle/fd type passed from app to align with spec.

* SWDEV-552141 - Fix handle/fd type passed from app to align with spec.
2025-08-27 14:28:53 -04:00
jonatluu 6bc1ea966f fix lintian warning (#696)
* fix lintian warning

* fix lintian warning
2025-08-27 13:53:54 -04:00
cfallows-amd c68ba44e72 Add single kernel filtering to roofline plots (#757)
* Add single kernel filtering for roofline
* Add --kernel to documentation
* Add kernel labels to roofline pdfs

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Add test cases

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Add autodetect for mode (profile or analyze) during roof validate and filter
Prevent --kernel from affecting roofline in GUI mode, although this may be broken in the develop branch anyway

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Add note about roof-only usage checking for existing profiling files in the dir. If roof-only is not provided, rocprof-compute currently assumes it has to profile in full regardless. Will look into this another day.

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Update CHANGELOG.md

Add line in resolved issues section to highlight that kernel filtering is now working for roofline plots

* Apply changes suggested by docs team

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Update projects/rocprofiler-compute/CHANGELOG.md

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
2025-08-27 13:41:07 -04:00
systems-assistant[bot] 2e50d88fe6 [ROCProfiler SDK] Removing regex from the tool and output libraries (#170)
* Removing regex from the tool

* Adding alternative for regex regarding  handling

* Adding ROCpd

* Removing regex include

* Apply suggestion from @jomadsen_amdeng

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Apply suggestion from @jomadsen_amdeng

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Apply suggestion from @jomadsen_amdeng

Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>

* Adding Standalone Regex Header File

* Fixing Regex to handle grouping and

* Fixing Regex to handle grouping and

* Fixing Regex to handle grouping and

* Formatting Fix

* Update rocprofiler-sdk-restrictions.yml

* Separating regex.hpp to source and header & Adding Tests for parity with std::regex

* Update regex.cpp

* Using snake_case for naming and addressing some comments

* Adding more tests & README for regex implementation

* Updating rocprofiler sdk restrictions workflow

* Updating more tests & README for regex implementation

* Update README_regex.md

* Rename README_regex.md to README.md

---------

Co-authored-by: Ammar ELWazir <aelwazir@amd.com>
Co-authored-by: Elwazir, Ammar <Ammar.Elwazir@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
2025-08-27 12:30:12 -05:00
Daniel Su 7055fbfc7f [Ex CI] add hip-clr pipeline ID, add status badge to README (#782) 2025-08-27 13:25:47 -04:00
jamessiddeley-amd f0955f5a83 [rocprof-compute] added f4f6 description and VALU FLOPS split for empirical peaks (#739)
* added f4f6 description and VALU FLOPS split

* changed peak ammolite vars to local

* reverted to dict peak initialization

* ruff check format

* updated VALU descriptions

* updated VALU descriptions

* Update parser.py

* Update parser.py

Added graceful NameError handling
Moved globals() update to init_metric_evaluation with ammolite__ vars and raw pmc_df

* update formatting
2025-08-27 12:46:41 -04:00
Sourabh U Betigeri 5201efe050 SWDEV-545259 - Adds clang-cl support and missing dispatch table entries (#760) 2025-08-27 09:35:58 -07:00
Pratik Basyal cfd3fee0e2 ROCm Compute Profiler changelog for 7.0 updated (#740)
* ROCm Compute Profiler changelog for 7.0 updated

* Roofline limited support for MI350 removed

* Update CHANGELOG.md
2025-08-27 11:24:49 -04:00
Daniel Su 875d7724d9 create hip-clr combined pipeline trigger (#693) 2025-08-27 10:28:33 -04:00
systems-assistant[bot] 0a1a419191 SWDEV-534394 - Add test for kernel launch stream on different device (#565)
Co-authored-by: Pengda Xie <pengda.xie@amd.com>
2025-08-26 23:24:57 -07:00
David Galiffi 872846bcdc Updated Group By HIP Stream documentation (#717)
Based on feedback from https://github.com/ROCm/rocprofiler-systems/pull/306
2025-08-26 22:03:51 -04:00
Julia Jiang 17ffa13035 SWDEV-538999 - Make correction in porting guide for launch_bounds (#646) 2025-08-26 16:55:52 -04:00
Kian Cossettini 07a7b9b845 Use rocprofiler-SDK for OMPT tracing (#702)
Switch to using SDK for OMPT tracing and remove older OMPT code path
2025-08-26 16:54:01 -04:00
systems-assistant[bot] 5f4e0dc889 SWDEV-538789 - Add multi stream kernel dispatch perf test (#556)
Co-authored-by: Pengda Xie <pengda.xie@amd.com>
2025-08-26 13:42:11 -07:00
Milan Radosavljevic df7b9d559f Fix collection of stream IDs for rocpd (#751) 2025-08-26 16:17:42 -04:00
systems-assistant[bot] ded5b86e83 SWDEV-540609 - capture of MIOpen OCL kernels needs remainder globalWorkSize (#431)
Co-authored-by: Rakesh Roy <rakesh.roy@amd.com>
2025-08-26 16:11:31 -04:00
Jason Bonnell 296a4021f9 [rocprofiler-compute] Fix rocprofiler-compute workflows (#761)
* add working-directory to ver_check step in rocprofiler-compute-packaging.yml

* Remove compute mi-rhel9 workflow badge since workflow is no longer in develop

* Update actions to v5 in rocprofiler-compute-docs

* Add working directory to steps in rocprofiler-compute-docs.yml

* Revert back to v4 pages

* Remove rocprofiler-compute-docs.yml workflow

* Remove docs workflow badge from rocprofiler-compute in README.md

* Remove rocprofiler-compute-packaging.yml, update README.md badges
2025-08-26 14:29:15 -04:00
vedithal-amd 323d06c79c [rocprofiler-compute] Add database output format to analyze mode (#748)
Analysis data dump

* Add `--output-format` and `--output-name` options to analyze mode

* Remove `--output` and `-save-dfs` options from analyze mode

* Add documentation on `rocpd` output format and analysis database file

* Create sqlite3 database using object relation mapping (ORM) provided
  by sqlalchemy library

* Fix metrics config to remove metrics marked as `null`, fix `Unit` header, add
  missing `title`

* Add test cases to ensure analysis data dump works
2025-08-26 14:15:05 -04:00
Satyanvesh Dittakavi 09cfa97156 SWDEV-551218 - Fix hip on nvidia build failures (#642)
* Rebase and address merge conflicts

* SWDEV-551218 - Fix hip on nvidia build failures
2025-08-26 23:40:35 +05:30
Milan Radosavljevic 96a46962ad Change amd_smi and cpu_freq modules to use trace cache for rocpd (#690)
* Move amd-smi to use caching mechanism

* Add VCN and JPEG activity to rocpd

* Switch cpu_freq to use caching mechanism

* Different approach with xcp activity & applied suggestions from code review

* Applied suggestions from code review

* Fix shadowing

* Applied suggestions from code review
2025-08-26 14:00:04 -04:00
systems-assistant[bot] 7601798fa7 SWDEV-545953 - Add Implementation for hipStreamGetId (#434)
Authored-by: Satyanvesh Dittakavi <Satyanvesh.Dittakavi@amd.com>
2025-08-26 22:47:55 +05:30
systems-assistant[bot] 832af6d472 SWDEV-545953 - Add Nvidia mapping for hipStreamGetId (#456)
Co-authored-by: Satyanvesh Dittakavi <Satyanvesh.Dittakavi@amd.com>
2025-08-26 21:35:17 +05:30
systems-assistant[bot] 3e62d0d2e6 SWDEV-545953 - Add hipStreamGetId API header (#428)
Authored-by: Satyanvesh Dittakavi <Satyanvesh.Dittakavi@amd.com>
2025-08-26 21:33:26 +05:30
Jimbo c03048d68e Implement hipMemAllocationTypeUncached in hipMemCreate (#747)
* Revert "SWDEV-547589 - Add hipDeviceMallocUncached to hipMemCreate (#815)"

This reverts commit 5ce7103555.

* Revert "SWDEV-547589 - comment for flag hipDeviceMallocUncached in hipMemcreate (#339)"

This reverts commit 04dac5eae3.

* SWDEV-551942 - implement hipMemAllocationTypeUncached in hipMemCreate
2025-08-26 11:34:49 -04:00
Julia Jiang 202aa7ff8c SWDEV-525231 - Remove Memory Manager support in 7.0 (#741) 2025-08-26 11:13:36 -04:00
MachineTom f1ed57e54d SWDEV-550626 - Make atomics test pass with new compiler (#731)
Change pinned host memory to device memory so that
atomics Min/Max tests can pass with new compiler patch
in integer types.
2025-08-25 22:30:55 -04:00
xuchen-amd e8081bd91a Update mi350 output files for unit tests. (#744) 2025-08-25 21:27:10 -04:00
shwetakhatri-amd 79400a1f23 rocr: GFX12+ - Fix trap handler to process SW trap ID correctly (#736)
When stochastic sampling is not active, the trap handler is incorrectly
branching to .check_exceptions, bypassing the software trap ID checks
and in turn not advancing the PC. Fixed the issue to always check software
traps regardless of PC sampling state.

Co-authored-by: Shweta Khatri <shweta.khatri@amd.com>
2025-08-25 19:20:37 -04:00
SaleelK ddba20579d SWDEV-551080 - Fix hipMemcpyDeviceToDeviceNoCU path (#683)
* hipMemcpyDeviceToDeviceNoCU should always take SDMA path as per the
  flag usage
2025-08-25 15:13:02 -07:00
xuchen-amd 5c8b34ddf5 [rocprofiler-compute][TUI] Add interactive metric description (#718) 2025-08-25 15:53:55 -04:00
vedithal-amd 9a02dae75f [rocprofiler-compute] [Bugfix] Fix analysis not working with rocpd (#704)
* fix rocpd roofline

* Improve rocpd test by using dynamic workload folder

* bugfix

* fix ruff format
2025-08-25 11:46:55 -04:00
systems-assistant[bot] b645010655 Using semaphore to sync with all peer processes in finalization stage (#169)
* Using semaphore to sync with all peer processes in finalization stage

[rocprofv3] Implement synchronization using POSIX semaphore in finalization

* clang format code

* clang 11 format code

* Add process sync option for rocprofv3

* Default value of process sync is false

* Update source/lib/rocprofiler-sdk-tool/tool.cpp

Apply suggestion by Copilot

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* update according to comments

* add new line to helper.hpp

---------

Co-authored-by: Huanran Wang <huanrwan@amd.com>
Co-authored-by: Huanran Wang <huanran.wang@amd.com>
Co-authored-by: Madsen, Jonathan <Jonathan.Madsen@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-08-25 08:57:41 -05:00
vedithal-amd 748c9b74d9 Update standalone binary to use python 3.9 (#725)
* Update standalone docker to python 3.9

* Add TUI files

* Fix docker files to work with monorepo

* Update standalone binary documentation
2025-08-25 07:57:08 -04:00
cfreeamd a013e141b7 Revert "rocr: river interface changes" (#724)
This commit reverts the following related commits which cause
test failures:

6d15779b3e rocr/driver: add PC sampling support to driver interface
56cb9390ff rocr/driver: add PC sampling support to driver interface
76bf829f09 rocr/driver: add ASAN header page management to Driver class
a47c060d6a rocr/driver: add ASAN header page management to Driver class
02d7eaf3b7 rocr: add memory sharing call to Driver interface
9312468655 rocr: add memory sharing call to Driver interface
2025-08-25 12:44:26 +05:30
Ashutosh Mishra f2f7f03d61 Fix buffer overrun (#655)
Assigning a null terminator at
the end of the string wrote
past the end of the allocated
buffer. This patch corrects that.

Signed-off-by: Ashutosh Mishra <ashutosh.mishra@amd.com>
2025-08-25 09:41:25 +05:30
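
The off-by-one described in commit f2f7f03d61 above can be sketched in Python as follows. This is an illustration of the general bug class only, not the patched code: both function names are hypothetical, and a `bytearray` stands in for the allocated buffer. In Python the out-of-range write raises `IndexError`; in C or C++ the same write would silently land past the end of the allocation.

```python
def copy_with_terminator_buggy(text: bytes) -> bytearray:
    """Buggy variant: the buffer has no room for the terminator."""
    buf = bytearray(len(text))   # BUG: sized for the characters only
    buf[:len(text)] = text
    buf[len(text)] = 0           # index == len(buf): one past the end
    return buf


def copy_with_terminator(text: bytes) -> bytearray:
    """Fixed variant: reserve one extra byte for the trailing NUL."""
    buf = bytearray(len(text) + 1)
    buf[:len(text)] = text
    buf[len(text)] = 0           # last valid index, inside the buffer
    return buf
```

Calling `copy_with_terminator(b"abc")` yields a 4-byte buffer ending in a zero byte, while the buggy variant fails on the terminator write.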
ywang103-amd 2a216ecbc1 pc sampling unit tests (#194) 2025-08-23 10:13:22 -04:00
Taylor Ding b5c8c8bcb1 Eval metrics performance optimizations (#435)
Post-analysis eval metrics performance optimizations.
2025-08-22 16:35:48 -04:00
jamessiddeley-amd 5deeea71df [rocprof-compute] Update Formatting (#671)
* updated rocprof-compute formatting

* fixed ammolite peak variables in parser.py

* format parser.py

* update formatting rocprof_compute_base
2025-08-22 12:22:17 -04:00