Commit graph

70968 commits

Author SHA1 Message Date
Jason Bonnell 3b732edb0b Move submodule definitions from projects/rccl/.gitmodules to root .gitmodules (#2298) 2025-12-15 09:27:32 -05:00
Ioannis Assiouras 13561fb8cd Bump hash for theRock to 2025-12-12 commit (#2289) 2025-12-13 23:56:08 +00:00
jamessiddeley-amd 706a8382a5 [rocprof-compute] added graceful exit with corrupt roofline.csv in profile and analyze mode (#1811)
* added graceful errors/exit in profile/analyze roofline.csv

* edit if statement truth

* restore if statement truth (roofline_csv needs at least 2 rows)

* addressed comments and skipped showing roof metrics when data invalid

* fix workload merge

* changed warning to error

* removed redundant variable definition

* added roofline csv validate check in TUI

* add test cases to test validation function

* ruff format

* simplified TUI roofline handling
2025-12-12 17:06:37 -05:00
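The validation described in this commit can be pictured with a minimal sketch. It is not the rocprof-compute implementation: the function name `roofline_csv_is_valid` is hypothetical, and it assumes that "at least 2 rows" means two non-empty rows in total (for example a header plus one data row).

```python
import csv
import io


def roofline_csv_is_valid(text: str) -> bool:
    """Hypothetical check: the CSV parses and has at least 2 non-empty rows."""
    try:
        rows = [row for row in csv.reader(io.StringIO(text)) if row]
    except csv.Error:
        # Treat anything the csv module cannot parse as corrupt.
        return False
    return len(rows) >= 2
```

A caller could then report an error and exit cleanly when the check fails, instead of raising later while plotting roofline metrics.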
jamessiddeley-amd 81720183ad [rocprof-compute] Merge CDash Nightly and Continuous workflow files (#2279)
* merged code-coverage and continuous workflow files

* fixed runner typos and added build mode

* add actor name to Continuous build

* improve error handling and remove redundant verbose

* fixed workflow file log output

* revert logs output in run_ci.py

* ruff format
2025-12-12 17:04:56 -05:00
Dominic Widdows 9a8ed9f45d Doc updates updating internal links from deprecated repos to rocm-systems project locations (#2294)
* Update README documentation links for clarity and consistency across projects

- Changed links in the README files for `clr`, `hipother`, and `hip-tests` to use relative paths instead of absolute URLs, improving navigation within the repository.

* Update CONTRIBUTING documentation to use relative links for improved navigation

- Changed absolute URLs to relative paths in the CONTRIBUTING.md files for the hip and hipother projects, enhancing consistency and ease of access within the repository.
2025-12-12 13:21:42 -08:00
Eiden Yoshida a9de523e0d Add rccl and rccl-tests to auto-labeler yaml (#2286) 2025-12-12 12:47:20 -07:00
Ajay GunaShekar 0bb5638481 Rock: hip-tests installation path to remain same for linux/windows (#2187)
* Rock: hip-tests installation path remains same for linux and windows

On TheRock, the installation path remains the same on Linux and Windows:
share/hip/catch_tests

On internal Windows builds, hip-tests is installed to catch_tests;
a flag passed internally controls the path.
2025-12-12 08:45:28 -08:00
Dominic Widdows 2073cf2172 Skip running TheRock CI on docs-only changes (#2246)
Following the pattern from ROCm/rocm-libraries#2679, add logic to skip
CI builds when only documentation files are modified.

Changes:
- Add SKIPPABLE_PATH_PATTERNS for docs, markdown, and .gitignore files
- Return empty projects list when only skippable paths are modified
- No workflow changes needed - existing projects != '[]' check handles it
- Add unit tests for doc-filtering logic
- Fix existing tests with proper subprocess mocking

Reference: https://github.com/ROCm/rocm-libraries/pull/2679
2025-12-12 08:30:59 -08:00
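The skip logic listed in this commit might look roughly like the sketch below. Only the name `SKIPPABLE_PATH_PATTERNS` comes from the commit message. The pattern values and the helpers `is_skippable` and `projects_to_build` are assumptions made for illustration, not the repository's actual code.

```python
import fnmatch

# Assumed pattern values; the real list in the repository may differ.
SKIPPABLE_PATH_PATTERNS = ["docs/*", "*/docs/*", "*.md", ".gitignore", "*/.gitignore"]


def is_skippable(path):
    """True when a changed path matches any docs-only pattern."""
    return any(fnmatch.fnmatchcase(path, pattern) for pattern in SKIPPABLE_PATH_PATTERNS)


def projects_to_build(changed_paths, candidate_projects):
    """Return an empty list when every changed path is skippable."""
    if changed_paths and all(is_skippable(path) for path in changed_paths):
        return []
    return list(candidate_projects)
```

Returning an empty list fits the commit's note that the existing `projects != '[]'` check in the workflow already handles the skip, so no workflow change is needed.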
vedithal-amd 4870725a62 Do not use absolute python path when adding tests (#2282) 2025-12-12 10:57:19 -05:00
vedithal-amd 793732a04e [rocprofiler-compute] Improve amdsmi interface (#2245)
* Improve amdsmi interface

* Fix issue where max mem clock was being set as max gfx clock

* Handle the case when all device handles might not be usable due to
  devices being hidden by ROCR and HIP environment variables

* Fix get gpu vram size to return str in KB

* Improve testing of amdsmi interface functions
2025-12-12 09:02:37 -05:00
Jan Stephan ca0f3a6b5a Fix code highlighting (#2254)
Signed-off-by: Jan Stephan <jan.stephan@amd.com>
2025-12-12 14:06:53 +01:00
Jan Stephan 4cc8ff3c54 HIPRTC: Fix CDNA CU description (#2252)
Signed-off-by: Jan Stephan <jan.stephan@amd.com>
2025-12-12 14:06:16 +01:00
Todd tiantuo Li 0eccbf0534 SWDEV-554372 - cuda mappings for GetProcAddress API and flags (#1089) 2025-12-11 23:59:51 -08:00
SaleelK 840301e12d clr: Minor fixes for error return (#2153) 2025-12-11 16:59:56 -08:00
Mario Limonciello b106e6f175 Run pre-commit's whitespace related hooks on projects/rocminfo (#2115)
In order for pre-commit to be useful, everything needs to meet a common
baseline.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-12-11 15:41:47 -06:00
Mario Limonciello bfb13f2b43 Run pre-commit's whitespace related hooks on projects/rocm-smi-lib (#2117)
* Run pre-commit's whitespace related hooks on projects/rocm-smi-lib

In order for pre-commit to be useful, everything needs to meet a common
baseline.

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>

* Added Changelog Spaces for formatting

Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>

---------

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-12-11 15:41:24 -06:00
Ameya Keshava Mallya 56fd780e32 Add 'projects/rccl-tests/' from commit '6405c76e6826663bbb67bd40aeee8c70aa5d3094'
git-subtree-dir: projects/rccl-tests
git-subtree-mainline: 42d84317cf
git-subtree-split: 6405c76e68
2025-12-11 20:46:38 +00:00
Ameya Keshava Mallya 42d84317cf Add 'projects/rccl/' from commit '1f2f9f33bac3e8ecfd84c69af6063d7352c362fc'
git-subtree-dir: projects/rccl
git-subtree-mainline: 3fd8a0d393
git-subtree-split: 1f2f9f33ba
2025-12-11 20:46:05 +00:00
Thomas Huber 1f2f9f33ba Update gfx950 tuner conf to include broadcast (#2065)
Signed-off-by: Thomas Huber <thomas.huber@amd.com>
2025-12-11 14:36:03 -05:00
vedithal-amd 3fd8a0d393 [rocprofiler-compute] Remove hip dependency during analysis (#2276)
* Remove hip dependency during analysis

* don't dynamically import when roofline is skipped
2025-12-11 14:33:43 -05:00
jamessiddeley-amd 8f452d29df [rocprof-compute] Update Docs 7.2 + Dual Issue Detection (#2160)
* modified changelog for docs updates 7.2

* update documentation for 7.2

* update FAQ wording

* Update projects/rocprofiler-compute/docs/reference/faq.rst

Co-authored-by: cfallows-amd <Carrie.Fallows@amd.com>

* addressed comments

* fixed header for 'On MI350 and newer platforms'

* Update projects/rocprofiler-compute/src/rocprof_compute_soc/analysis_configs/gfx950/1100_compute_units_compute_pipeline.yaml

Co-authored-by: cfallows-amd <Carrie.Fallows@amd.com>

* ruff format

---------

Co-authored-by: cfallows-amd <Carrie.Fallows@amd.com>
2025-12-11 14:23:34 -05:00
SakaSitharammurthy 9de72d438d Updated amd-smi.h documentation (#2031)
Signed-off-by: Saka, Sitharam Murthy <SitharamMurthy.Saka@amd.com>
2025-12-11 11:42:23 -06:00
systems-assistant[bot] c72b0558a4 [SWDEV-555654] Enable Driver reload on SRIOV (#1898)
Enabled the reload argument. Reload is supported
on SRIOV systems.

Fixes:
sudo amd-smi reset -g all
AttributeError: 'Namespace' object has no attribute 'reload_driver'

Change-Id: Ib75ba043e29ae6e668c18451b93e766a7528739f

Signed-off-by: Charis Poag <Charis.Poag@amd.com>
Co-authored-by: Charis Poag <Charis.Poag@amd.com>
2025-12-11 11:38:40 -06:00
David Galiffi fbaeb74107 [rocprof-sys] Update nightly CI workflow (#2263)
Update ROCm version to 7.1.0

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2025-12-11 12:23:26 -05:00
Rahul Manocha dd4bee33ff SWDEV-558848 - Update thunk interface signature for vmm enablement (#2259)
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
2025-12-11 08:43:28 -08:00
Mustafa Abduljabbar 2cf6a9bb19 Add WAIT_PEER NPKIT event (#2100) 2025-12-11 11:18:41 -05:00
Aleksei Tumakaev 186cdd63c9 [rocpd] Improve summary categories (#2000)
* Improve summary categories
2025-12-11 16:55:47 +01:00
Marko Crnobrnja Maletić 7b2f68e798 Handle cpu name having colons (#2155)
* Handle cpu name having colons
* Adding tests to verify
* clang-format fix

---------

Co-authored-by: bgopesh <gopesh.bhardwaj@amd.com>
2025-12-11 16:36:01 +01:00
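The kind of parsing problem this commit title points at can be shown with a short sketch. It assumes a `key: value` text line such as those in `/proc/cpuinfo`. The commit does not name the input format, the actual fix is in C++, and `parse_key_value_line` is a hypothetical helper.

```python
def parse_key_value_line(line):
    """Split on the first colon only, so colons inside the value survive."""
    key, sep, value = line.partition(":")
    if not sep:
        # No colon at all: not a key/value line.
        return None
    return key.strip(), value.strip()
```

Splitting on every colon would truncate a CPU name such as `Example CPU: 8-Core`, whereas `partition` keeps everything after the first separator as the value.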
Mario Limonciello 0c4d08f38d Revert correcting the VGPR size for GFX 11.5.1 (#2268)
Although the value is correct, there is no source of truth between
kernel and userspace.  This leads to problems if the kernel has strict
restrictions (such as kernel 6.17 or earlier). The restrictions were
lifted in 6.17.9 and 6.18, but there is no guarantee userspace is
using this.

So short term this value will be wrong.  But on newer kernels the kernel
will communicate the right size and rocr-runtime will be adjusted to
use that.

Link: https://github.com/ROCm/TheRock/pull/2505

Signed-off-by: Mario Limonciello (AMD) <superm1@kernel.org>
2025-12-11 07:59:19 -06:00
habajpai-amd 6b45657493 update build rccl-tests infrastructure and add getAlgoProtoChannels support (#2212) 2025-12-11 18:29:06 +05:30
SaleelK 10635483ad clr: Fix packet batch write logic (#2236)
* When writing bulk packets, always invalidate packet headers. It's
  possible that the CP fetcher can have multiple packets in flight. In
  such cases we may end up with a malformed packet because the writes are
  not complete, yet CP finds a valid header.
2025-12-11 04:26:41 -08:00
Matt Arsenault a495d1137e SWDEV-548892 - Make declaration of __ockl_fdot2 always available (#2229) 2025-12-11 11:53:11 +01:00
Adel Johar 256dd1963a [hip] Docs: Overhaul HW implementation page (#1994)
* [hip] Docs: Overhaul HW implementation page
* Update hardware implementation and glossary
* Update programming model
* Add performance optimization
* Split into how-to and understanding

---------

Signed-off-by: Jan Stephan <jan.stephan@amd.com>
Co-authored-by: Jan Stephan <jan.stephan@amd.com>
Co-authored-by: Julia Jiang <julia.jiang@amd.com>
2025-12-11 10:52:34 +01:00
koushikbillakanti-amd 9e06ea8f79 [SWDEV-564696] Structure size mismatch in SOC pstate/XGMI PLPD (#2207)
* Address PR feedback: consolidate switch cases, move CSV formatting, use direct API calls for error messages
* csv output flattening changes

---------

Signed-off-by: Billakanti, Koushik <Koushik.Billakanti@amd.com>
2025-12-10 23:37:36 -06:00
SakaSitharammurthy caecbb4d01 [SWDEV-354749] Added CPU Performance Tests (#2173)
* CPU Performance testcases
  
---------

Signed-off-by: Saka, Sitharam Murthy <SitharamMurthy.Saka@amd.com>
2025-12-10 21:57:47 -06:00
systems-assistant[bot] e39fe03bcf [SWDEV-488296] Implemented API Performance test case (#1903)
Add API performance testing and execution script

---------

Signed-off-by: Sumanth Gavini <sumanth.gavini@amd.com>
Co-authored-by: Sumanth Gavini <sumanth.gavini@amd.com>
2025-12-10 21:33:44 -06:00
shwetakhatri-amd 0835f2e75a rocrtst: Updated CMakeFiles to find_package instead of hardcoded (#2095)
* rocrtst: Updated CMakeFiles to find_package instead of hardcoded

This is to support TheROCK build environment

* rocrtst: Fix CMake to use find_package() instead of hardcoded ENV paths

Fixed CMake style issues from the previous first commit's code review

* rocrtst: Fix rocrtst NUMA dependency detection to use find_package

Also added handling of missing headers

* rocrtst: Fix NUMA and hwloc detection for cross-platform builds

---------

Co-authored-by: Shweta Khatri <shweta.khatri@amd.com>
2025-12-10 16:16:25 -05:00
Geo Min 5384a8abb2 Correct runner name (#2098) 2025-12-10 11:44:48 -08:00
David Galiffi 70562eb854 Add ROCm 7.1 to workflows (#2256)
Signed-off-by: David Galiffi <David.Galiffi@amd.com>
2025-12-10 13:41:22 -05:00
German Andryeyev 3895aadba6 SWDEV-558849 - Make ROCR path in Windows more stable (#2181) 2025-12-10 12:37:10 -05:00
Pengda Xie 1d6b26f829 SWDEV-556684 - HSAIL Cleanup re-apply commit 4abdfe5: (#2024)
Removed some options

-xnack, -force-wgp-mode, -force-wave-size-32, -round-trip-spirv,
-fe-gen-spirv, -lower-pipe-builtins=0|1, -lower-atomics=0|1,
-set-lds=<value>, -set-scalar-registers=<value>,
-set-vector-registers=<value>, -limit-scalar-registers=<value>,
-limit-vector-registers=<value>, -sc-xnack-iommu,
-faa-for-barrier/-fno-a-for-barrier, -sc-dev-format, -verify-lwspir,
-verify-hwspir, -ffma-enable/-fno-fma-enable,
-fmad-enable/-fno-mad-enable, -fdisable-avx/-fno-disable-avx,
-fforce-llvm/-fno-force-llvm, -print-compile-phases,
-kernel-cache-enforce-miss, -kernel-cache-wipe, -kernel-cache,
-sc[=<filename>]/--load-sc-dll[=<filename>],
-be[=<filename>]/--load-be-dll[=<filename>],
-cg[=<filename>]/--load-cg-dll[=<filename>],
-link[=<filename>]/--load-link-dll[=<filename>],
-opt[=<filename>]/--load-opt-dll[=<filename>],
-fe[=<filename>]/--load-fe-dll[=<filename>],
-cl[=<filename>]/--load-cl-dll[=<filename>], -just-kernel=<kernel-name>,
-use-debugil, -fmulti-level-call/-fno-multi-level-call,
-fdebug-call/-fno-debug-call, -fmacro-call/-fno-macro-call,
-fstack-uav/-fno-stack-uav, -fdef-res-id/-fno-def-res-id,
-wokth=int/--waves-opt-kernel-threshold,
-ilkth=int/--inline-kernel-size-threshold,
-ilsth=int/--inline-size-threshold, -ilcth=int/--inline-cost-threshold,
-scopt=int/--sc-opt-level, -flib-no-inline/-fno-lib-no-inline,
-fuser-no-inline/-fno-user-no-inline,
-scras=int/--sc-si-opt-reg-alloc-strategy, -fsc-post-ra-sched,
-fsc-live-sched/-fno-sc-live-sched, -fsc-use-buffer-for-hsa-global,
-fsc-schedule-no-reorder, -fsc-min-reg-schedule,
-fsc-bias-schedule-to-minimize-insts,
-fsc-bias-schedule-to-minimize-regs, -fsc-disable-merge-memory,
-fsc-disable-loop-unroll, -fsc-use-mubuf/-fno-sc-use-mubuf,
-fsc-selective-inline/-fno-sc-selective-inline,
-fsc-keep-calls/-fno-sc-keep-calls, -slc=0|1/--simplifylibcall,
-stack-alignment=<n>, -fdiv2fmul=0|1, -prt-opt-liveness=0|1,
-liveness=0|1, -SRAE-threshold=<value>, -memcombine-max-vec-gen=<value>,
-small-global-objects, -fast-fmaf, -fast-fma, -bfo=0|1, -ebb=0|1, -aa,
-mem2reg=0|1, -licm=0|1, -unroll-allow-partial,
-unroll-threshold=<positive integer>, -unroll-count=<positive integer>,
-apt/--ap-threshold=<positive integer>, -srt/--sr-threshold=<positive
integer>, -fdebug-linker/-fno-debug-linker, -fbin-gpu64/-fno-bin-gpu64,
-fbin-disasm/-fno-bin-disasm, -fbin-bif30, -fbin-hsail/-fno-bin-hsail,
-fbin-amdil/-fno-bin-amdil, -fbin-spir/-fno-bin-spir, -fonly-bin-source,
-fper-pointer-uav/-fno-per-pointer-uav

Co-authored-by: Konstantin Zhuravlyov <kzhuravl_dev@outlook.com>
2025-12-10 09:09:12 -08:00
corey-derochie-amd 18e9ad913b Fixed unit-test env var list parsing and improved filtered test run speed (#1626)
* Fixed parsing of env var lists which were overwriting the mutable env var string and polluting future parses.

* Fixed all tests to obey UT_DATATYPES and UT_REDOPS filters.

* Allow tests to bail early via `GTEST_SKIP` if UT_DATATYPES or UT_REDOPS filters give a test size of zero. This allows tests to run much faster with filters on.

* Wrapped the support checks in helper functions on `TestBed`.
2025-12-10 10:06:44 -07:00
Fábio Mestre d4fe3f1cc3 [hip-tests] Update API coverage report generator (#1932)
* [hip-tests] Update API coverage report generator

Updates the HIP API coverage tool. It now takes
extra arguments for the location of the catch test folder
and for the working directory. This avoids issues where the output
of the executable is dependent on the path where it is being
executed from.

Also updates CmakeLists.txt to integrate seamlessly with the
hip-tests project and avoid using commands which rely on
relative paths.

* Remove double new line

* Remove Cmake option to generate coverage

Removes Cmake option to generate coverage. Instead, explicitly removes
the gen_coverage target from all (this is already the default but
doing it explicitly prevents confusion).
2025-12-10 17:53:47 +01:00
Rahul Manocha 0c1f87a7f6 SWDEV-558848 - vmm api support for rocr on windows (#1761)
* SWDEV-558848 - vmm api support for rocr on windows

* Fixes to VMM handle Map/Unmap Set/Get Access

* Fix GetShareableHandle to use pointer for shareable handle

* Update os specific map/unmap memory calls

* clang format update

* Minor syntax fixes from code review

Co-authored-by: Yiannis Papadopoulos <102817138+ypapadop-amd@users.noreply.github.com>

---------

Co-authored-by: Rahul Manocha <rmanocha@amd.com>
Co-authored-by: Yiannis Papadopoulos <102817138+ypapadop-amd@users.noreply.github.com>
2025-12-10 08:39:51 -08:00
Venkateshwar Reddy Kandula 465633d707 [AQLProfile] Fix sqtt legacy tests due to command buffer size underestimating (#2194) 2025-12-10 09:25:38 -06:00
jamessiddeley-amd d27bd37042 [rocprof-compute] Fix roofline "test_roof_plot_modes" test case (#2217)
* fix roof test to be isolated file paths

* fix typo

* addressed comments

* fix typos
2025-12-10 10:01:49 -05:00
vedithal-amd 252a5e8146 [rocprofiler-compute] Remove TCP_TCP_LATENCY_sum counter for MI300 (#2174)
* Remove TCP_TCP_LATENCY_sum counter for MI300

* Remove TCP_TCP_LATENCY_sum counter which is unsupported for MI300 per register specification

* Remove VL1 Lat metric from memory chart section (block 3) for MI 300
  since it uses TCP_TCP_LATENCY_sum counter which is unsupported

* Remove references to TCP_TCP_LATENCY_sum

* Update CHANGELOG

* reword changelog
2025-12-10 09:41:46 -05:00
cfallows-amd 9d34098350 [rocprofiler-compute] Roofline runtime compilation patch (#2232)
* Add install into CMakeLists.txt file; resolves 'no hip module' issues.
* Re-add the printout line for peak VALU during benchmarking that was removed by accident in a different commit.
* Add CHANGELOG entry for commit 2bfa9a4 ("Integrate roofline benchmark into rocprof-compute (#2015)")

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Run formatter checks on rocprof-compute to clear PR checks

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Update benchmark.py link in changelog

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

* Apply suggestions to CHANGELOG from code review

Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Pratik Basyal <pratik.basyal@amd.com>
2025-12-10 01:44:28 -05:00
Mario Limonciello 73778bf83c Adjust policy for memory display on APUs (#1967)
* Read the ids_flags when fetching GPU info

The ids_flags contains the flags that can help identify if a GPU
is a dGPU or an APU.

* Show correct memory pool for APUs

The kernel policy for APUs will be to choose the bigger pool of
memory (GTT or VRAM) for KFD work.  Adjust the policy for the monitor
and default commands to show the right memory pool when using an APU.
2025-12-09 21:49:06 -06:00
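The display policy described in this commit can be sketched as follows. The function name, its parameters, and the tie-breaking choice (VRAM when both pools are equal) are assumptions for illustration. The commit only states that APUs use the bigger of GTT and VRAM.

```python
def memory_pool_to_show(is_apu, vram_bytes, gtt_bytes):
    """Pick the memory pool to display for a device (hypothetical helper)."""
    if not is_apu:
        # Discrete GPUs keep reporting VRAM.
        return "VRAM"
    # On APUs, follow the policy of choosing the bigger pool.
    return "GTT" if gtt_bytes > vram_bytes else "VRAM"
```

In the real tool the dGPU/APU distinction would come from the `ids_flags` field mentioned in the commit. Here it is passed in as a plain boolean.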
Geo Min 6af9087b0c [ci] Bumping TheRock CI commit hash (#2097)
* Bumping TheRock CI commit hash

* fixing artifact group
2025-12-09 16:25:57 -08:00