Commit graph

64422 Commits

Author SHA1 Message Date
amd-hsivasun fa0d88f475 Update rocprofiler-sdk.yml (#1036) 2025-09-17 13:17:35 -04:00
Julia Jiang 5db71b8e4c SWDEV-551652 - Adding one change in 7.0 changelog (#960)
Co-authored-by: Istvan Kiss <istvan.kiss@amd.com>
2025-09-17 09:22:26 -07:00
systems-assistant[bot] 0018a4e70c SWDEV-541623 - cuda parity hipLaunchCooperativeKernelMultiDevice and hipExtLaunchMultiKernelMultiDevice (#415)
* SWDEV-541623 - cuda parity hipLaunchCooperativeKernelMultiDevice and hipExtLaunchMultiKernelMultiDevice

numDevices does not match the system devices

* SWDEV-541623 - enable Unit_hipExtLaunchMultiKernelMultiDevice_Negative_MultiKernelSameDevice

---------

Co-authored-by: agunashe <ajay.gunashekar@amd.com>
2025-09-17 08:33:59 -07:00
Swati Rawat e655bb37a7 Update installation.rst (#1034) 2025-09-17 11:10:55 -04:00
cfreeamd 7ca8881862 rocminfo: move header comment after opening # line (#1025) 2025-09-16 22:19:26 -07:00
Julia Jiang 7ab2e49c57 SWDEV-554072 - Update description for hipModuleLoadData (#929) 2025-09-16 17:10:06 -04:00
systems-assistant[bot] 605be4bebc SWDEV-505930 - Avoid static initialization of ModuleGuard (#604)
This is to prevent calling catch2 macros from outside catch2 TEST_CASE
that can lead to undefined behavior. This change also disables
hipGetProcAddress tests that are not supported on static build.

Co-authored-by: Ioannis Assiouras <Ioannis.Assiouras@amd.com>
2025-09-16 21:06:24 +01:00
SaleelK ec5e9673ad clr: Use current device copy engine for inter-dev copy (#945)
* For inter-device copies always use the SDMA engine of current device
* ROCr uses srcAgent SDMA engine, and it could be a remote device
2025-09-16 12:56:07 -07:00
systems-assistant[bot] ce9fe34c92 SWDEV-549705 - Fixed memleak in Unit_hipExtLaunchMultiKernelMultiDevice_Functional (#521)
Co-authored-by: Ioannis Assiouras <Ioannis.Assiouras@amd.com>
2025-09-16 12:18:59 -07:00
systems-assistant[bot] 5f2ef0fc4f SWDEV-549707 - Fix for mem leak in Unit_hipMemImportFromShareableHandle_Positive_Basic (#523)
Co-authored-by: Ioannis Assiouras <Ioannis.Assiouras@amd.com>
2025-09-16 12:18:34 -07:00
systems-assistant[bot] d5fc1b3703 SWDEV-548838 Add local and global fence support for barrier function (#437)
* SWDEV-548838 Add local and global fence support for barrier function

The original barrier function didn't distinguish between local and global scope. There was only __CLK_LOCAL_MEM_FENCE, which triggered both a local and a global fence. This commit introduces __CLK_LOCAL_MEM_FENCE and __CLK_GLOBAL_MEM_FENCE that properly distinguish the scopes.

---------

Co-authored-by: Tim <Tim.Gu@Amd.com>
Co-authored-by: systems-assistant[bot] <systems-assistant[bot]@users.noreply.github.com>
Co-authored-by: Tim Gu <timgu102@amd.com>
2025-09-16 14:20:57 -04:00
xuchen-amd 288bca17ea [rocprofiler-compute] refactor bug fixes (#994) 2025-09-16 14:20:33 -04:00
Jessey Harrymanoharan 05dc14934a add skip for windows ci (#965) 2025-09-16 13:11:18 -04:00
JC b2e611a874 [CI] Add pre/post cleanup for windows GPU test jobs (TheRock PR#1361) (#1008) 2025-09-16 12:50:14 -04:00
JC 89f9ab1270 [CI] Add 30 minute timeout to Fetch sources and use 12 jobs for Windows (#1001) 2025-09-16 12:48:52 -04:00
AidanBeltonS bf662640ee SWDEV-539805, SWDEV-553860 - Resolve GCC clang ABI mismatch and check vector alignment (#909)
* SWDEV-539805 - Add checks for vector alignment and size

* SWDEV-553860 - Alter alignment for gcc

* SWDEV-553860 - Align fallback method

* SWDEV-553860 - Alter alignment requirement
2025-09-16 17:10:14 +01:00
systems-assistant[bot] 857e5ef3ce chore: unset executable permission (#213)
Co-authored-by: Eisuke Kawashima <e-kwsm@users.noreply.github.com>
Co-authored-by: systems-assistant[bot] <systems-assistant[bot]@users.noreply.github.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
2025-09-16 11:06:54 -05:00
systems-assistant[bot] 88201d2b79 [SWDEV-544729] Updated CLI error handling (#216)
Updated: rocm_smi.py
- Remove all else: clauses from functions where rsmi_ret_ok is part of the if clause, as requested.
- The rsmi_ret_ok() function already handles unsuccessful return codes gracefully.
- Updated check_runtime_status() function to sweep through /sys/class/drm to find active runtime_status.
- Updated the message to 'AMD GPU device(s) is/are in a low-power state. Check power control/runtime_status'
- This clarifies the status of the GPU and tells them where to check for more info.

Signed-off-by: Juan Castillo <juan.castillo@amd.com>
Co-authored-by: Juan Castillo <juan.castillo@amd.com>
Co-authored-by: Maisam Arif <Maisam.Arif@amd.com>
Co-authored-by: gabrpham <Gabriel.Pham@amd.com>
2025-09-16 10:56:03 -05:00
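
The runtime_status sweep mentioned in the commit above could look roughly like the sketch below. This is an illustration only, not the code from rocm_smi.py: the helper names `find_runtime_statuses` and `low_power_cards`, the `drm_root` parameter, and the assumed layout `card*/device/power/runtime_status` under /sys/class/drm are all assumptions made for the example.

```python
from pathlib import Path


def find_runtime_statuses(drm_root="/sys/class/drm"):
    """Return {card_name: status} for every card exposing runtime_status.

    Hypothetical helper: the card*/device/power/runtime_status layout is an
    assumption about the sysfs tree, not taken from rocm_smi.py.
    """
    statuses = {}
    pattern = "card*/device/power/runtime_status"
    for status_file in sorted(Path(drm_root).glob(pattern)):
        card = status_file.parts[-4]  # .../<card>/device/power/runtime_status
        try:
            statuses[card] = status_file.read_text().strip()
        except OSError:
            # Unreadable entries are skipped rather than treated as fatal.
            continue
    return statuses


def low_power_cards(statuses):
    """List the cards whose runtime_status is anything other than 'active'."""
    return [card for card, status in statuses.items() if status != "active"]
```

Taking the root directory as a parameter keeps the sweep testable against a temporary directory tree instead of the live sysfs.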
itrowbri 8ba7120b63 [rocprofiler-sdk] Verify there are callbacks for every kernel dispatch when syncing (#321)
* Added check in Queue::sync to verify that there is a callback for every dispatch

* Removed new atomic, using get_balanced_signal_slots() atomic with initial value of NUM_SIGNALS to verify dispatches complete
2025-09-16 10:35:16 -05:00
systems-assistant[bot] 3b5467b746 [DOC] single pass counter collection (#95) 2025-09-16 11:00:11 -04:00
Sunday Clement db63d4c38b hsakmt: Update udmabuf.h License Identifier Header (#873)
Fix typos, and update the license header to include SPDX license
identifier.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
2025-09-16 10:36:02 -04:00
ywang103-amd 97f8b7b1ec change to single-kernel workload for pc_sampling tests (#955) 2025-09-16 10:17:23 -04:00
Matt Williams af2f2c1345 Update index.rst (#1014) 2025-09-16 09:59:04 -04:00
systems-assistant[bot] f1fabcfd64 rocr: Error Handling Issues (#264)
* rocr: Fix Incorrect Assertion Check

The wrong variable is used in the assertion statement, should be error
checking for the value of paramEndLoc after it is modified by the call
to find().

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>

* rocr: Fix Potential Undefined Behaviour

In the event that the SvmProfileControl destructor is called and
event == -1 is true then the call to close(event) is effectively
close(-1) which is undefined behaviour. This has been changed to only
call close() on valid file descriptors.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>

* rocr: Add Error Check on Bytes Read

In the case that there is an incomplete read the call to copyTo() will
now return an error.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>

* rocr: Fix Exception Error

Destructors are implicitly noexcept(true) by default, so unless the
destructor is explicitly marked noexcept(false), any exception thrown by
it or by the functions it calls will cause the program to terminate.

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>

---------

Signed-off-by: Sunday Clement <Sunday.Clement@amd.com>
Co-authored-by: Sunday Clement <Sunday.Clement@amd.com>
2025-09-16 09:43:45 -04:00
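
Two of the fixes in the commit above (closing only valid file descriptors, and treating an incomplete read as an error) follow a common pattern. The sketch below is a Python analogue of that pattern, not the ROCr code, which is C++. The names `EventHandle` and `read_exact` are hypothetical and exist only for this example.

```python
import os


class EventHandle:
    """Owns a file descriptor; -1 means 'no descriptor held'."""

    def __init__(self, fd=-1):
        self.fd = fd

    def close(self):
        # Only close descriptors that were actually opened.
        if self.fd >= 0:
            os.close(self.fd)
            self.fd = -1


def read_exact(fd, size):
    """Read exactly `size` bytes from fd, raising EOFError on a short read."""
    chunks = []
    remaining = size
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:
            # End of stream before the requested amount arrived.
            raise EOFError(f"expected {size} bytes, got {size - remaining}")
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)
```

Resetting the descriptor to -1 after closing makes a second `close()` call a harmless no-op.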
xuchen-amd a442766d26 [rocprofiler-compute] improve profile options (#999) 2025-09-15 18:21:45 -04:00
Aleksei Tumakaev 646e4d211a [rocpd] Use SQL queries instead of views in summary generator (#311)
* Use queries instead of views in summary.py

* Export queries when created

* Remove HIP and HSA from output

* Fix domain query

* Export summary queries in the main function

* Fix comments and variable names

* Change syntax for old python versions

---------

Co-authored-by: Young Hui <young.hui@amd.com>
2025-09-15 17:13:06 -04:00
harkgill-amd 902ec4d3ad Fix documentation to match function signature (#990)
Co-authored-by: ammallya <ameyakeshava.mallya@amd.com>
2025-09-15 11:19:21 -07:00
Alysa Liu 7277ecc9a3 rocminfo: Add copyright for new files (#888)
Legal Requirements:

For AMD software being released as open source, add copyright at the top of each new file.

Signed-off-by: Alysa Liu <Alysa.Liu@amd.com>
2025-09-15 11:36:49 -04:00
Danylo Lytovchenko c0e7091b9f Fix syntax error in azure subtree script (#988) 2025-09-15 11:22:03 -04:00
vstojilj f24e2ca676 SWDEV-546865 - Disable core dumps when running tests (#880)
* SWDEV-546865 - Disable core dumps when running tests

* SWDEV-546865 - Disable core dumps only for tests that require it
2025-09-15 15:58:41 +02:00
harkgill-amd d1b2b5ed44 Fix grid_group::group_dim to return grid_dim and not block_dim (#823)
* Fix grid_group::group_dim to return grid_dim and not block_dim

* Add unit test for grid_group.group_dim()

* Fix unit test errors

* Skip group_dim() assertions for base_type test
2025-09-15 09:42:55 -04:00
systems-assistant[bot] 2f7e9591be SWDEV-541096 - add hipEventWaitDefault and hipEventWaitExternal (#453)
Co-authored-by: Li, Todd tiantuo <Toddtiantuo.Li@amd.com>
2025-09-13 10:33:00 -07:00
Dmitrii 8abe24d3b0 rdc: Add CPU support and CPU metrics infrastructure (#770) 2025-09-12 16:14:38 -05:00
xuchen-amd eb46160a8f update proj toml (#974) 2025-09-12 16:24:44 -04:00
Venkateshwar Reddy Kandula 4daf25944d add gotcha to rocpd cpack component. (#904)
Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com>
2025-09-12 15:21:44 -05:00
Julian Jose 8157437273 [Palamida scan] SWDEV-553054 Adding missing copyrights information (#900)
* Add missing copyright headers in rocprofiler-systems
* Update python-tests
* Update causal test

---------

Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-09-12 14:17:58 -04:00
xuchen-amd 7ed6000e32 [rocprofiler-compute] Refactor to add type annotation and misc (#787) 2025-09-12 13:53:24 -04:00
ammallya 37f8da676a Change depth to 250 for large PRs (#972) 2025-09-12 09:18:02 -07:00
amd-hsivasun 2b68ac750e Add rocprofiler-systems project dependency (#915) 2025-09-12 12:16:54 -04:00
marandje 3a37389f6a SWDEV-547554 - Resolve memory leaks in hiprtc tests (#967) 2025-09-12 18:12:15 +02:00
Kian Cossettini 5d582fcd37 [rocprofiler-systems] Add Fortran OpenMP CTests (#874)
* Added Fortran (amdflang) openmp tests using the openmp-vv project

---------

Signed-off-by: David Galiffi <David.Galiffi@amd.com>
Co-authored-by: David Galiffi <David.Galiffi@amd.com>
2025-09-12 09:52:16 -04:00
habajpai-amd 1c7293e6d0 Add bounds checks in transpose_a for both load and store so edge tiles don't read/write past MxN (#950) 2025-09-12 17:32:30 +05:30
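
The idea behind the bounds checks in the commit above can be sketched in plain Python. This is not the transpose_a kernel itself: the function `transpose_tiled`, the row-major flat-list layout, and the `tile` parameter are assumptions chosen for illustration.

```python
def transpose_tiled(a, m, n, tile=4):
    """Transpose a row-major M x N flat list into an N x M flat list by tiles."""
    out = [0] * (m * n)
    for row0 in range(0, m, tile):
        for col0 in range(0, n, tile):
            for r in range(row0, row0 + tile):
                for c in range(col0, col0 + tile):
                    # Edge tiles overhang the matrix when M or N is not a
                    # multiple of the tile size; skip out-of-range elements
                    # so neither the load nor the store goes past M x N.
                    if r < m and c < n:
                        out[c * m + r] = a[r * n + c]
    return out
```

Without the `r < m and c < n` guard, a tile at the right or bottom edge would index elements that belong to another row or lie outside the buffer.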
Venkateshwar Reddy Kandula 26e7c4231e [rocprofiler-sdk] Add derived metrics for Navi4 (#238)
* add more derived metrics for navi4.

* addr comments

* addr comments, and add more derived counters.

* EOF.

* misc.

* remove duplicate counter.

* misc.

* Remove gfx12 architecture definition for ldslatency

* remove extra architectures for gfx12.

* use wgp for normalization

* move these changes to another PR.

---------

Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com>
2025-09-12 02:30:10 -05:00
abchoudh-amd 7d847dde3f Split tests (#952) 2025-09-12 12:29:48 +05:30
Venkateshwar Reddy Kandula aa3313aa99 add dl lib to utility_tests (#961) 2025-09-12 10:28:52 +05:30
Marius Brehler 01828d1375 Force gzip to overwrite an existing changelog (#665)
If a compressed changelog exists from a previous build, reconfiguring
the project fails with
```
[rocm-core configure] CMake Error at utils.cmake:213 (message):
[rocm-core configure]   Failed to compress: gzip:
[rocm-core configure]   /home/ben/src/TheRock/build/base/rocm-core/build/DEBIAN/changelog.Debian.gz
[rocm-core configure]   already exists; not overwritten
```

Add `-f` to force overwriting.
2025-09-11 16:34:37 -07:00
systems-assistant[bot] c85200fc42 SWDEV-541096 - add hipEventWaitDefault and hipEventWaitExternal flags (#507)
Co-authored-by: Li, Todd tiantuo <Toddtiantuo.Li@amd.com>
2025-09-11 14:50:55 -07:00
Jatin Chaudhary 3742814d82 SWDEV-553757 - add __HIP__ and __clang__ check for __shfl functions (#872) 2025-09-11 21:57:39 +01:00
amd-hsivasun 892a56cb54 [Ex CI] Enable hip-tests (#957)
* [Ex CI] Enable hip-tests

* Add Pipeline Id

* Fixed typo

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
2025-09-11 16:40:07 -04:00
amd-hsivasun 6b923ee1ac [Ex CI] Enable rocr-runtime (#925) 2025-09-11 16:25:09 -04:00