Commit graph

68501 commits

Author SHA1 Message Date
amilanov-amd da9bb4efae SWDEV-503089 - Fix and enable disabled HIP tests from math group (#1319)
* SWDEV-503089 - Fix and enable disabled HIP tests from math group

* SWDEV-503089 - Move single precision reduced run to a common function
2025-11-26 10:34:05 +01:00
Todd tiantuo Li ee48f6221d SWDEV-562708 - change default maximum SVM size to 256GB (#1731) 2025-11-25 23:59:39 -08:00
Matt Arsenault 9fbb062505 SWDEV-548892 - Stop using ocml isinf wrapper (#1854) 2025-11-25 22:21:37 -05:00
Karthik Jayaprakash 740a06d567 SWDEV-559267 - Use CLPrint to DevLogPrintf with Log Level - detail debug. (#1160) 2025-11-25 19:25:32 -05:00
German Andryeyev 93682f2f75 SWDEV-567852 - Clean-up hip::init() (#1948) 2025-11-25 19:05:41 -05:00
cadolphe-amd cce94f6ee0 SWDEV-557412 - Incorporate proper chunk offset when remapping virtual memory (#1848)
* SWDEV-557412 - Incorporate proper offset when remapping virtual memory

* Fix condition to check if VMHeap allocation address matches a chunk address

* Move offset calculation outside if/else block

---------

Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
2025-11-25 18:05:25 -05:00
marantic-amd daf8596ce9 [rocprof-sys] Process all information regarding agents and store them as extdata in rocpd database (#1880)
## Motivation

Resolved: SWDEV-566226

The current agent implementation in rocprof-systems keeps only the minimal set of information required to populate the `info_agent` table in the rocpd database. A considerable amount of data is left out of the database; this change stores the additional agent information as an `extdata` row in the `info_agent` table.

## Technical Details

This PR introduces an additional field in the `agent` structure that holds a JSON-formatted string with all the additional information we can acquire about a particular agent. This data is processed and added during the initial fetching of agents, and afterwards pushed to the database.

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-11-25 17:33:12 -05:00
itrowbri 304c2b82b0 Updated rocprofv3.py to ignore old attach duration msec value (#1980) 2025-11-25 16:30:54 -06:00
Victor Zhang ede71ca3b0 SWDEV-567829 - populateFormatStringHashMap: relax printf hash collisi… (#1944)
* SWDEV-567829 - populateFormatStringHashMap: relax printf hash collision check for duplicate format strings

* function optimized by ai
2025-11-25 17:19:27 -05:00
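
As a rough illustration of the relaxed check described in SWDEV-567829 above, the sketch below treats a repeated hash as a collision only when the stored format string differs from the new one. The function name mirrors the commit title, but the signature, the `hash_fn` parameter, the default CRC32 hash, and the error type are all assumptions made for illustration, not the runtime's actual code.

```python
import zlib


def populate_format_string_hash_map(format_strings, hash_fn=None):
    """Build {hash: format_string}, tolerating exact duplicate strings."""
    if hash_fn is None:
        # Hypothetical stand-in hash; the hash used by the runtime is not
        # shown in the commit log.
        def hash_fn(s):
            return zlib.crc32(s.encode("utf-8"))

    table = {}
    for fmt in format_strings:
        key = hash_fn(fmt)
        if key not in table:
            table[key] = fmt
        elif table[key] != fmt:
            # Same hash, different string: a genuine collision.
            raise ValueError(f"hash collision between {table[key]!r} and {fmt!r}")
        # Same hash and same string: a duplicate, which is accepted silently.
    return table
```

With this shape, registering the same format string twice is harmless, while two distinct strings that share a hash are still reported.
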
anujshuk-amd 85b5c03f36 [rocprof-sys] Fix test build failure on RHEL 10 (#1955)
## Motivation

To solve: SWDEV-566076 
FFmpeg versions >= 58.134 no longer expose read_seek and read_seek2 function pointers in AVInputFormat,
requiring alternative seek detection methods. This pull request updates the `VideoDemuxer` class to improve compatibility with newer versions of FFmpeg. The main change is how the code determines whether the input file is seekable, addressing differences in FFmpeg API versions.


## Technical Details

In `video_demuxer.h`, added a conditional check for `USE_AVCODEC_GREATER_THAN_58_134` to set `is_seekable_` to `true` for newer FFmpeg versions, since `read_seek` and `read_seek2` are no longer exposed in `AVInputFormat`. For older versions, the previous method of checking these fields remains in place. The conditional compilation now assumes seek capability is available for newer FFmpeg versions.
2025-11-25 15:25:05 -05:00
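
The version-gated decision described in the RHEL 10 fix above can be modelled in a few lines. This is only a sketch of the logic: the function name, the tuple representation of the library version, and the boolean parameters are hypothetical, and the `>=` comparison follows the wording of the Motivation section rather than the actual preprocessor condition, which is not shown in the log.

```python
def is_seekable(lib_version, has_read_seek=False, has_read_seek2=False):
    """Mimic the seek-detection rule from the commit message.

    lib_version is a (major, minor) tuple; 58.134 is the cutoff named
    in the commit message.
    """
    if lib_version >= (58, 134):
        # Newer FFmpeg: the callbacks are not visible to callers, so
        # seek capability is assumed.
        return True
    # Older FFmpeg: seekable only if the demuxer exposes a seek callback.
    return has_read_seek or has_read_seek2
```

For example, `is_seekable((58, 91))` is false unless one of the callback flags is set, while any version at or above the cutoff is treated as seekable.
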
Bindhiya Kanangot Balakrishnan e8c3b22734 [SWDEV-556483] Fix runtime PM suspend causing test failures (#1931)
Added runtime PM detection and DRM ioctl-based device wake
to handle GPUs in BACO state. Modified tests to wake
suspended devices before reading sysfs files.

---------

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-11-25 13:36:45 -06:00
usrihari123 47e53ec6f3 Update rocpd docs (#1276) 2025-11-25 22:33:12 +05:30
German Andryeyev 2c5754844f SWDEV-465041 - Enable direct dispatch under Linux by default. (#1934) 2025-11-25 11:30:32 -05:00
Victor Zhang 92fcc928b6 SWDEV-526773 - Modify LaunchDelayKernel to set a hard coded WallClock… (#1911)
* SWDEV-526773 - Modify LaunchDelayKernel to set a hard coded WallClock value when it's not available

* Change the hard-coded clock rate to units of kHz.
2025-11-25 11:21:03 -05:00
Ethan Trinh 2042191e23 Suppress deprecated-declaration warnings (#1817)
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
2025-11-25 10:31:30 -05:00
Ethan Trinh bef946de1c SWDEV-555551 - Remove hip-test warnings in linux (#1031)
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
2025-11-25 10:31:15 -05:00
Jason Bonnell e68873c170 Gersemi formatting for rocprofiler-compute (#1997)
* Run gersemi formatting on cmake files in compute

* Run gersemi again but on updated version
2025-11-25 09:49:16 -05:00
Gerardo Hernandez 8abfee9f26 SWDEV-541351 - fix use of uninitialized memory in Unit___hip_atomic_compare_exchange tests (#1976) 2025-11-25 11:02:14 +00:00
solaiys 3466ec5458 Added PCIE Atomic Operations enable check. (#1746)
* Added PCIE Atomic Operations enable check.

Tests if atomic operations are enabled for GPU devices.
Displays the Atomic routing capability via Link capability and status.

Signed-off-by: Saravanan Solaiyappan <saravanan.solaiyappan@amd.com>
2025-11-25 14:29:30 +05:30
Gerardo Hernandez c87014a54c SWDEV-534207 fix order of kernel launch parameters when calling notifiedKernel in some tests: kernel<<<gridDim, blockDim>>> instead of kernel<<<blockDim, gridDim>>>. This was causing out of bounds accesses (#1860) 2025-11-25 06:37:47 +00:00
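
To see why swapping the two launch parameters can lead to out-of-bounds accesses, consider a kernel in which each block writes to one slot of a buffer sized for the intended grid. The affected tests are not shown in the log, so this is one plausible failure mode under stated assumptions, not a reproduction of the actual bug; the function and parameter names are hypothetical.

```python
def simulate_launch(grid_dim, block_dim, num_slots):
    """Simulate a 1D launch in which thread 0 of each block writes out[block_idx].

    Returns the output buffer and the block indices that would have
    written past its end.
    """
    out = [0] * num_slots
    out_of_bounds = []
    for block_idx in range(grid_dim):
        if block_idx < num_slots:
            out[block_idx] = block_dim  # record the block's thread count
        else:
            out_of_bounds.append(block_idx)  # would be an invalid write on a GPU
    return out, out_of_bounds


# Intended launch: 2 blocks of 8 threads, buffer sized for 2 blocks.
intended = simulate_launch(grid_dim=2, block_dim=8, num_slots=2)
# Swapped launch: 8 blocks of 2 threads against the same buffer.
swapped = simulate_launch(grid_dim=8, block_dim=2, num_slots=2)
```

The total number of threads is identical in both launches, which is why the mistake can go unnoticed until a per-block index runs past a buffer.
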
Pengda Xie 6c31785eaf SWDEV-562761 - Cleanup static fatbin on runtime teardown (#1873) 2025-11-24 21:57:46 -08:00
darren-amd 16e7ee32e6 [rocm-smi-lib] Add iomanip include to frequencies_read (#1797) 2025-11-24 16:38:21 -05:00
Young Hui - AMD a4f533fa92 [rocpd] Fix rocpd convenience scripts to accept --automerge-limit parameter (#1926)
* remove double RocpdImportData calls from execute() in each module

* formatting fix
2025-11-24 14:50:27 -05:00
Maisam Arif 1f7fc8d8a7 Fixed wrapper to respect symlink pathing (#1984)
Signed-off-by: Maisam Arif <Maisam.Arif@amd.com>
2025-11-24 13:14:46 -06:00
systems-assistant[bot] c404fbd851 [SWDEV-560235] Add gpu_board and base_board temperatures to monitor (#1906)
* Add helpers for gpu_board and base_board temperatures
* Added gpu_board and base_board temperatures arguments for non-default monitor subcommand

Signed-off-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
Co-authored-by: Bindhiya Kanangot Balakrishnan <Bindhiya.KanangotBalakrishnan@amd.com>
2025-11-24 13:12:09 -06:00
Marius Brehler 2dc32d645b Explicitly load versioned libamdhip64.so (#1872)
* Explicitly load versioned libamdhip64.so

* Fix syntax errors

* Fix when patching happens in Windows workflow

---------

Co-authored-by: Joseph Macaranas <145489236+jayhawk-commits@users.noreply.github.com>
Co-authored-by: ammallya <ameyakeshava.mallya@amd.com>
2025-11-24 10:05:05 -08:00
sluzynsk-amd 2cf9faa93f SWDEV-563777 - fix warnings related to inconsistent overrides (#1625)
This patch adds missing override keywords. Fixes this class of warnings.

Signed-off-by: Sebastian Luzynski <Sebastian.Luzynski@amd.com>
2025-11-24 18:50:07 +01:00
habajpai-amd 1a3564a51a [rocprof-sys] Fix fork() handling for GPU profiling and AMD SMI (#1930)
- Fix fork() handling for GPU profiling and AMD SMI
- Add hipMallocConcurrency test for CI with GPU
2025-11-24 09:21:27 -05:00
marantic-amd ebd55d2ce0 Track process_sampler state for CPU sampling (#1993) 2025-11-24 15:03:08 +01:00
Aleksandar Djordjevic a5d554b85a [rocprofiler-systems] Implement GTest/GMock integration for unit testing (#1777)
* googletest project set up

---------

Co-authored-by: Aleksandar Djordjevic <adjordje@amd.com>
Co-authored-by: Milan Radosavljevic <milan.radosavljevic@amd.com>
2025-11-24 11:49:30 +01:00
AidanBeltonS 0580e2053c SWDEV-533546, SWDEV-540027 - Add e8m0 conversions and testing (#987)
* SWDEV-533546 - Add conversion functions for e8m0

* SWDEV-533546 - remove whitespace

* Add testing

* Update based on feedback

* Copilot suggestions

---------

Co-authored-by: systems-assistant[bot] <systems-assistant[bot]@users.noreply.github.com>
2025-11-24 09:14:03 +00:00
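
For readers unfamiliar with e8m0, the format is an 8-bit, exponent-only encoding (no sign, no mantissa) with a bias of 127, where `0xFF` is reserved for NaN, as defined for scale factors in the OCP Microscaling formats. The sketch below shows a decode and a simple encode; it is not the HIP implementation from this commit. The function names are hypothetical, and the encoder's round-toward-lower-power-of-two behavior and its error handling for non-representable inputs are assumptions.

```python
import math

E8M0_BIAS = 127
E8M0_NAN = 0xFF


def e8m0_to_float(bits):
    """Decode an e8m0 byte to a Python float: 2**(bits - 127), or NaN for 0xFF."""
    if not 0 <= bits <= 0xFF:
        raise ValueError("e8m0 values are a single byte")
    if bits == E8M0_NAN:
        return math.nan
    return math.ldexp(1.0, bits - E8M0_BIAS)


def float_to_e8m0(value):
    """Encode a positive finite float, rounding down to a power of two (assumed mode)."""
    if math.isnan(value):
        return E8M0_NAN
    if value <= 0 or math.isinf(value):
        raise ValueError("e8m0 cannot represent zero, negatives, or infinity")
    _, exp = math.frexp(value)  # value == m * 2**exp with 0.5 <= m < 1
    unbiased = exp - 1          # floor(log2(value))
    # Saturate to the representable exponent range [0x00, 0xFE].
    return max(0, min(0xFE, unbiased + E8M0_BIAS))
```

Every non-NaN byte round-trips exactly through these two functions, since each encodes a single power of two.
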
Ioannis Assiouras 36029ea1a8 SWDEV-559166 - Fix race condition in getDemangledName (#1868) 2025-11-23 08:45:45 +00:00
Ioannis Assiouras 7313c3752a SWDEV-567475 - Fix failures in graph tests due to GraphExec destroy h… (#1917) 2025-11-22 23:01:47 +00:00
Ioannis Assiouras 75de915725 hip-tests: fix runpath in hipSquareGenericTargetOnly[Compressed] (#1965)
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
2025-11-21 16:31:41 -08:00
jamessiddeley-amd 833425577f fix roofline kernel_names test case (#1954) 2025-11-21 15:04:08 -05:00
ammallya 822b38b743 Migrating amdsmi (#1922) 2025-11-21 11:12:34 -08:00
jokim-amd 770f30bc4c hsakmt: bump vgpr count for gfx1151 (#1807)
GFX1151 has 1.5x VGPR memory compared to the rest of GFX11.
2025-11-21 09:53:32 -08:00
gabrpham 6b1e6187f6 [SWDEV-560681] Allowed GPU enumeration to continue with non-contiguous render nodes (#1609)
* Fix uninitialized variable in GPU enumeration loop (#1643)
* Initialize node_to_gpu_id to prevent undefined behavior

---------

Signed-off-by: gabrpham <Gabriel.Pham@amd.com>
Co-authored-by: Allan Xavier <axavier@digitalocean.com>
2025-11-21 10:01:10 -06:00
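
The behavior named in the SWDEV-560681 title, continuing enumeration across gaps in render node numbering, can be sketched as a scan that skips missing nodes instead of stopping at the first one. Linux DRM render nodes are conventionally numbered from `renderD128`; the function name, the `max_nodes` bound, and the set-based input are hypothetical choices for this illustration.

```python
def enumerate_render_nodes(present_minors, first_minor=128, max_nodes=64):
    """Return render node names for every present minor, tolerating gaps."""
    found = []
    for minor in range(first_minor, first_minor + max_nodes):
        if minor not in present_minors:
            # A gap in the numbering: keep scanning rather than stopping.
            continue
        found.append(f"renderD{minor}")
    return found
```

A loop that used `break` at the first missing minor would report only `renderD128` and `renderD129` for the set `{128, 129, 131}`, silently dropping the last device.
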
vedithal-amd 6540155c9d Bugfixes (#1971)
* Implement AMDGPU driver info and GPU VRAM attributes in the system info
  section of the analysis report.

* Backward compatibility for rocprofiler-sdk avail module path migration

* Fix roofline calculation where AI data points are N/A
2025-11-21 10:54:25 -05:00
Jonathan R. Madsen a2288eb50b [rocprofiler-sdk] Install unit tests and helper functions for integration tests (#921)
* [rocprofiler-sdk] Install unit tests and helper functions for integration tests

* Fix rocprofiler-sdk-tests-target export

* Fix handling of cmake policy CMP0174

* Remove -vv from new pytest.ini files

* add unit tests and integration tests.

* add path to ci workflow.

* misc. fixes.

* pc sampling tests.

* bug fixes.

* pc sampling tests fix.

* misc.

* Update CMakeLists.txt

* Update rocprofiler_config_install_tests.cmake, correct license name

* fix units tests install issues.

* fix counters_def file path.

* fix bug, arg shifting.

* vendor pytest-cmake.

* cmake config fix. missing endfunction()

* disable tests, 1.rocprofv3-trace-hip-libs. 2.kernel-tracing. 3.external_correlation 4.rocpd.

* disable buffered-tracing test and remove pytest-cmake from requirements.txt.

* disable hip-graph-tracing test.

* fix building standalone tests to load rocprofiler-sdk cmake package first and then find rocprofiler_sdk_pytest module.

* addressed comments: 1.add local bin path to code cov workflow. 2.add to cmake prefix path local bin. 3.use ROCPROFILER_MEMCHECK_PRELOAD_ENV_VALUE 4.misc. fix

* re-enabled tests api_buffered, external_correlation_id, hip-graph, kernel-tracing, rocpd, tracing-hip-in-libraries, and misc. fixes (formatting, extra fixtures for agent-index tests).

* cpack to use llvm bin for .hsaco debug symbols.

* psdb tests fixes.

* EOL.

* misc. fixes and Disable api_buffered_tracing, external_correlation_id, hip-graph-tracing, kernel-tracing, rocpd, summary, tracing-hip-libraries, tracing-plus-counter-collection.

* fix incorrect cmakelists file.

* strip smallkernel.bin

* format.

* revert disabled tests commit.

* misc. fix in counter tests.

* misc.

* search codeobj unit test assets in curr bin and install bin.

* refactor newly added rocpd tests.

* modify tests for newly added hip-host-tracing.

* add LD LIB path to units, psdb is failing due to libs not being found.

---------

Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com>
Co-authored-by: Venkateshwar Reddy Kandula <Venkateshwarreddy.Kandula@amd.com>
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
2025-11-21 08:06:56 -06:00
Jason Bonnell 66ea1cdff2 Add workflow to remove old untagged rocprofiler GHCR Docker Images (#1959)
* Add WIP workflow step to delete untagged images older than 1 week

* Formatting fix for rocprofiler-systems-ghcr.yml

* Move step to new workflow

* Remove needs parameter from cleanup-rocprofiler-images

* Remove expand-packages option

* Expand cleanup for every OS

* Revert spacing change to rocprofiler-systems-ghcr.yml

* Turn off dry-run to do an initial clean

* Switch dry-run to be only on PR

* Added comment about schedule
2025-11-21 08:49:29 -05:00
Sajina PK d77b245730 [Rocprofiler-systems] : Refactor papi enumeration to fix a hang on Intel systems (#1672)
* Refactor papi enumeration to fix a hang on Intel systems

- Add an exclude argument to available_events_info() for
  perf_event_uncore, which caused a hang-like condition on Intel systems
  with a large number of uncore events.
- Enumerate papi available events only when papi events are specified by
  users inside early initialization logic.
- Move the papi available event query for the
  ROCPROFSYS_SAMPLING_OVERFLOW_EVENT config setting to the avail
  component, to move the heavy logic outside initialization.
- Make the category option for rocprof-sys-avail -H -c case insensitive.
- Provide a new option to query available overflow events that can be
  specified for ROCPROFSYS_SAMPLING_OVERFLOW_EVENT, using the new command
  option rocprof-sys-avail -H -c overflow.

* Update projects/rocprofiler-systems/source/bin/rocprof-sys-avail/common.cpp

Co-authored-by: Milan Radosavljevic <milan.radosavljevic@amd.com>

* Update timemory submodule pointer

Signed-off-by: David Galiffi <David.Galiffi@amd.com>

* Fix errors on compile

* Change 1: Optimization for the category matching lambda

Optimization changes.

* Modify the rocprof-sys-avail -c option for overflow

Overflow should not be displayed as a device in rocprof-sys-avail -H -c CPU

Users can instead do regex on summary where overflow is appended in description

User can do rocprof-sys-avail -H -c CPU -d -r overflow

* Revert change to column width

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: Milan Radosavljevic <milan.radosavljevic@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-11-21 00:19:58 -05:00
Kunal Malviya 4f4352acd0 Enable ROCPROF_KERNEL_TRACE for PC sampling test by setting value to 1 (#1952) 2025-11-20 22:24:15 -05:00
cfreeamd 24c2a84e3f rocr: GPU core file location support (#1732)
* rocr: WIP Support dump of GPU core file

* WIP new core dump tests compile

* WIP: anony namespaces, test updates, progress

Added disabled Fault test. Other non-disabled coredump tests don't work.

* WIP: address code review feedback

* WIP: gpu core dump rocrtst works; combined

* WIP: remove rocrtst changes for this commit
2025-11-20 18:50:51 -08:00
amd-hsivasun adf6a5ec3b [Ex CI] amdsmi monorepo enablement (#1943)
* [Ex CI] amdsmi monorepo enablement

* [Ex CI] Add amdsmi pipeline to monorepo
2025-11-20 14:19:02 -05:00
Milan Radosavljevic 4d670099fa [rocprof-sys] Refactor trace_cache architecture with improved type erasure and processing pipeline (#1710)
- Redesigned buffer_storage with a flush_worker pattern for better thread management and resource cleanup
- Introduced type-safe abstractions through new components: cacheable.hpp, cache_type_traits.hpp, sample_processor.hpp, and type_registry.hpp
- Optimized type erasure implementation in sample processor to reduce runtime overhead
- Renamed rocpd_post_processing to rocpd_processor and restructured the processing pipeline
- Removed storage_parser.cpp and integrated functionality into header-based template implementation
- Enhanced cache_manager with improved processing workflow and better separation of concerns
2025-11-20 14:18:13 -05:00
Istvan Kiss 2f6fb89c51 Add GPU programming patterns tutorials (#1918)
Update projects/hip/docs/tutorial/programming-patterns/atomic_operations_histogram.rst


WIP

Co-authored-by: Julia Jiang <56359287+jujiang-del@users.noreply.github.com>
2025-11-20 10:03:22 -08:00
jonatluu 6b8aae3796 Enable Lintian Support rocm-systems (#1578)
* draft testing fix for no copyright file and no changelog

* test fix no-changelog no-copyright

* changelog copyright fix

* remove utils.cmake

* rocr lintian

* lintian overrides, copyright, changelog install

* fix lintian overrides install

* comp_type static fix and remove debug logs

* syntax error

* update static build check

* update file permissions to 0755 to fix error control-file-has-bad-permissions 0664 != 0755

* fix lintian errors in rdc and remove logs from roctracer

* lintian error fix rocprofiler

* fix lintian error

* move lintian overrides install

* lintian errors fix

* move lintian overrides install

* use changelog already provided by rdc

* fix formatting use existing changelog if provided

* fix formatting use changelog in rocprofiler

* draft testing fix for no copyright file and no changelog

* test fix no-changelog no-copyright

* changelog copyright fix

* lintian overrides, copyright, changelog install

* fix lintian overrides install

* comp_type static fix and remove debug logs

* fix lintian errors in rdc and remove logs from roctracer

* lintian error fix rocprofiler

* fix lintian error

* move lintian overrides install

* lintian errors fix

* move lintian overrides install

* use changelog already provided by rdc

* fix formatting use existing changelog if provided

* fix formatting use changelog in rocprofiler

* remove overrides. Use existing changelog and copyright

* resolve merge conflict

* update license for hsa-rocr. Use NCSA license

* install license

* install license
2025-11-20 11:38:39 -05:00
Sajina PK 124c23e2ff Fix Sampling freq for xgmi tests (#1888)
The low sampling frequency collected too few samples, causing test
validation to fail on some systems.
2025-11-20 09:49:09 -05:00
ggottipa-amd bf521c996b Correcting peak VALU Roofline profiling and analysis by removing FP8 VALU and BF16 VALU benchmarking. (#1442)
* Removing FP8 from peak VALU datatypes - PEAK_OPS_DATATYPES.

* Similar change for BF16.

* Roofline binaries from rocm-amdgpu-bench generated 10/22.
https://github.com/ROCm/rocm-amdgpu-bench/commit/2113ef1f5eada8a4a6e44e6d07fd6abac9b0a3f8
Bins include change that removes FP8 and BF16 peak VALU benchmarks.
Built and tested on rhel8, azl3, ubuntu22.04, sles15sp6.

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Re-committing the bins

accidentally copied over bins from the wrong folder earlier, caught by Gowthami during testing.

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

* Updated changelog

* gersemi fix

* Changelog corrected.

* Changelog fix.

* Adding this to the 7.2.0 section to be picked up in an RC build.

* Moving changelog entry into unreleased section - team reconfirmed cutoff date after I requested this change so I am just quickly correcting my mistake in my ask.

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>

---------

Signed-off-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: Carrie Fallows <Carrie.Fallows@amd.com>
Co-authored-by: JeniferC99 <150404595+JeniferC99@users.noreply.github.com>
2025-11-20 12:46:48 +05:30