1511 Commits

Author SHA1 Message Date
Shadi Dashmiz f1e5612e26 SWDEV-572439: make assert_fail constexpr in the hip headers (#2392)
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
2026-01-30 15:41:17 -05:00
Shadi Dashmiz e1844f6a59 SWDEV-573004 - fix shfl_sync for compiler init value (#2533)
- add attribute for maybe undef

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
2026-01-30 15:39:42 -05:00
Alexandra Sidorova 8800e03058 [CLR] Added missed ostream include to amd_hip_bfloat16.h (#2960) 2026-01-30 07:42:38 -05:00
vstojilj 9a8942a89c SWDEV-558836, SWDEV-558837 - Add hipMemSetMemPool and hipMemGetMemPoo… (#1349)
* SWDEV-558836, SWDEV-558837 - Add hipMemSetMemPool and hipMemGetMemPool implementation

* Add managed allocation type for mem pools

* Update rocprofiler-sdk with API declarations
2026-01-27 18:45:28 +01:00
Sam Ruscica 5daeb14582 SWDEV-547291 - Interop for OpenGL (#2350)
Updated to convert flags correctly

Added ObjectRegistry to track registered and mapped resources and incorporated it into hip_gl.

Added mip level check

Made functions static inline

Reworked validation to be clearer.
2026-01-21 09:08:55 -08:00
Karthik Jayaprakash 99c3a06f4e SWDEV-549518 - Enable logging dynamically through HIP APIS. (#1079)
* SWDEV-549518 - Enable logging dynamically through HIP APIS.

* SWDEV-549518 - Adding ROCProfiler related new API changes.

* rocprofiler-sdk changes for hip api additions.

---------

Co-authored-by: Venkateshwar Reddy Kandula <venkateshwar.kandula1306@gmail.com>
Co-authored-by: jainprad <92369414+jainprad@users.noreply.github.com>
2026-01-19 16:16:14 -05:00
systems-assistant[bot] 88f07baa92 SWDEV-493792 - add split barriers for grid_group (#508)
* SWDEV-493792 - add split barriers for grid_group

* add tests

* Update change log

* Add Navi4 split barrier

* Update docs

* Use new Catch2 Approx macro

* Update split_barrier.cc to check for coop groups

---------

Co-authored-by: Jatin Chaudhary <jatchaud@amd.com>
Co-authored-by: Jatin Chaudhary <51944368+cjatin@users.noreply.github.com>
2026-01-19 09:17:00 -08:00
Fábio Mestre e6236417f7 SWDEV-571222 - Fix bf16 headers on gcc (#2260)
GCC does not support anonymous structs with members that have non-trivial constructors. This commit changes the header to remove the union when compiling with gcc. This should be a non-breaking change for other compilers.
2026-01-16 15:02:48 +00:00
AidanBeltonS 607d66e87c Add messages to static asserts to prevent warnings (#1011) 2026-01-13 14:02:36 +00:00
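The message-less form of `static_assert` is only standard from C++17 onward, so compilers targeting an earlier standard may warn about it. A minimal sketch of the pattern this commit title describes, using a hypothetical `Vec4` type and `vec4_size` helper that are not taken from the HIP headers:

```cpp
#include <cstddef>

// Hypothetical type for illustration; not taken from the HIP headers.
struct Vec4 {
  float x, y, z, w;
};

// Message-less form: standard only since C++17, may warn under C++11/14.
// static_assert(sizeof(Vec4) == 4 * sizeof(float));

// Form with a message: valid since C++11 and self-documenting on failure.
static_assert(sizeof(Vec4) == 4 * sizeof(float),
              "Vec4 must be tightly packed");

// Hypothetical helper so the checked size can be reused elsewhere.
constexpr std::size_t vec4_size() { return sizeof(Vec4); }
```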
Fábio Mestre 09a01ee11c Replace usages of __ockl_clz with builtins (#2234) 2026-01-13 11:15:46 +01:00
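As a rough illustration of relying on a compiler builtin instead of a device-library call, the sketch below wraps `__builtin_clz`, which GCC and Clang provide. The helper name `count_leading_zeros` is hypothetical and the code does not reproduce the header's actual change. The builtin's result is undefined for an input of 0, so the sketch handles that case explicitly and assumes a 32-bit `unsigned int`.

```cpp
#include <cstdint>

// Hypothetical helper, not the HIP header's code.
// __builtin_clz is undefined for 0, so return the full bit width in that case.
constexpr int count_leading_zeros(std::uint32_t v) {
  return v == 0 ? 32 : __builtin_clz(v);
}
```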
Alexandra Sidorova 38a359f5f3 [CLR] prevent compilation errors for non-HIP compilers in amd_hip_mx_common.h and amd_hip_ocp_types.h (#2448)
Co-authored-by: Andrei Kochin <andrei.kochin@amd.com>
2026-01-08 17:49:13 +04:00
Jimbo a59d46ffbf SWDEV-567545 - Implement block_rank in co-op grid groups (#2182)
* SWDEV-567545 - Implement block_rank in co-op grid groups
2025-12-29 11:39:23 -05:00
Matt Arsenault 0c0d8dc974 SWDEV-548892 - Stop using __ockl_lane_id (#2186)
__lane_id already exists and is identical.
2025-12-19 20:34:55 +01:00
Shadi Dashmiz 96f6b6e251 SWDEV-571304 : Fix the constructor for __half (#2240)
- comply with cuda

- Fix usecase for constexpr

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
2025-12-17 11:15:20 -05:00
Matt Arsenault 49565f9d9f SWDEV-548892 - Always declare used ocml and ockl device libs functions (#2230)
Ignore __CLANG_HIP_RUNTIME_WRAPPER_INCLUDED__. This should not be relying
on declarations from the clang builtin headers. There is no issue declaring
the same intrinsics multiple times. This will enable removal of declarations
from the clang builtin headers.
2025-12-15 17:23:33 +01:00
Fábio Mestre 447beeb00b Replace usages of __ockl_gws_init with __builtin_amdgcn_ds_gws_init (#2235) 2025-12-15 16:56:14 +01:00
Matt Arsenault a495d1137e SWDEV-548892 - Make declaration of __ockl_fdot2 always available (#2229) 2025-12-11 11:53:11 +01:00
systems-assistant[bot] 06a3a5ca10 SWDEV-546110 - Fix encoding for certain types (#446) 2025-12-05 13:16:14 +00:00
Matt Arsenault d75d0bc1c9 SWDEV-548892 - Stop using ocml exp and exp2 functions (#2032) 2025-12-02 13:39:09 -05:00
Matt Arsenault f089217e6a SWDEV-548892 - Stop using ockl steadyctr function (#1882)
Directly use the builtin
2025-11-26 09:29:06 -05:00
Matt Arsenault 9fbb062505 SWDEV-548892 - Stop using ocml isinf wrapper (#1854) 2025-11-25 22:21:37 -05:00
AidanBeltonS 0580e2053c SWDEV-533546, SWDEV-540027 - Add e8m0 conversions and testing (#987)
* SWDEV-533546 - Add conversion functions for e8m0

* SWDEV-533546 - remove whitespace

* Add testing

* Update based on feedback

* Copilot suggestions

---------

Co-authored-by: systems-assistant[bot] <systems-assistant[bot]@users.noreply.github.com>
2025-11-24 09:14:03 +00:00
Matt Arsenault 4830979f0e SWDEV-548892 - Stop using ocml fma wrappers (#1702)
Directly use elementwise builtin
2025-11-13 16:20:27 -08:00
Matt Arsenault 42e91b8934 SWDEV-548892 - Stop using ocml sqrt wrappers (#1716) 2025-11-13 16:19:44 -08:00
Satyanvesh Dittakavi 07dd4c85e7 SWDEV-546308 - Implement hipKernelGetParamInfo API (#1783) 2025-11-12 14:09:26 +05:30
systems-assistant[bot] a66ca8809b SWDEV-511239 - Remove and and use && for preprocessors (#506)
This shows up as warning in msvc.

Co-authored-by: Jatin Chaudhary <JatinJaikishan.Chaudhary@amd.com>
2025-11-11 09:43:57 -08:00
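A minimal sketch of the kind of change the commit above describes: replacing the alternative token `and` with `&&` in a preprocessor condition. The `DEMO_` macros and `kBothFeatures` are hypothetical names chosen for illustration, not identifiers from the HIP headers.

```cpp
// Hypothetical feature macros for illustration only.
#define DEMO_HAS_FEATURE_A 1
#define DEMO_HAS_FEATURE_B 1

// Before: alternative token, which the commit message says MSVC warns about.
// #if DEMO_HAS_FEATURE_A and DEMO_HAS_FEATURE_B

// After: the operator spelling, accepted without complaint by all compilers.
#if DEMO_HAS_FEATURE_A && DEMO_HAS_FEATURE_B
constexpr bool kBothFeatures = true;
#else
constexpr bool kBothFeatures = false;
#endif
```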
Todd tiantuo Li cf536a8c1a SWDEV-554372 - Add 3 HIP_GET_PROC_ADDRESS_xxx flags (#1771) 2025-11-10 23:29:40 -08:00
Scott Todd fdbafd7757 Revert "SWDEV-554372 - Add 3 HIP_GET_PROC_ADDRESS_xxx flags (#1057)" (#1690)
Reverts ROCm/rocm-systems#1057

Suspected of breaking the build, see https://github.com/ROCm/rocm-systems/pull/1057#issuecomment-3487715129

Logs: https://github.com/ROCm/rocm-systems/actions/runs/19062134668/job/54444052479#step:12:315
```
[rocprofiler-sdk] FAILED: source/lib/rocprofiler-sdk/CMakeFiles/rocprofiler-sdk-object-library.dir/hip/abi.cpp.o 
[rocprofiler-sdk] ccache /opt/rh/gcc-toolset-12/root/usr/bin/c++ -DAMD_INTERNAL_BUILD=1 -DGLOG_USE_GLOG_EXPORT -DROCPROFILER_DL=1 -DROCPROFILER_HAS_GHC_LIB_FILESYSTEM=1 -DROCPROFILER_SDK_USE_SYSTEM_RCCL=0 -DROCPROFILER_SDK_USE_SYSTEM_ROCDECODE=0 -DROCPROFILER_SDK_USE_SYSTEM_ROCJPEG=0 -DUSE_PROF_API=1 -DYAML_CPP_STATIC_DEFINE -D__HIP_PLATFORM_AMD__=1 -Drocprofiler_EXPORTS=1 -I/__w/rocm-systems/rocm-systems/TheRock/build/profiler/rocprofiler-sdk/build/source/include -I/__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/source/include -I/__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/source -I/__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/external/yaml-cpp/include -I/__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/external/ptl/source -I/__w/rocm-systems/rocm-systems/TheRock/build/profiler/rocprofiler-sdk/build/external/ptl/source -isystem /__w/rocm-systems/rocm-systems/TheRock/build/core/clr/dist/include -isystem /__w/rocm-systems/rocm-systems/TheRock/build/core/ROCR-Runtime/dist/include -isystem /__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/external/filesystem/include -isystem /__w/rocm-systems/rocm-systems/TheRock/build/profiler/rocprofiler-sdk/build/external/glog -isystem /__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/external/glog/src -isystem /__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/external/fmt/include -isystem /__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/external/elfio -isystem /__w/rocm-systems/rocm-systems/TheRock/build/compiler/amd-comgr-stub/dist/include -isystem /__w/rocm-systems/rocm-systems/TheRock/build/third-party/sysdeps/linux/libdrm/build/stage/lib/rocm_sysdeps/lib/pkgconfig/../../include -isystem /__w/rocm-systems/rocm-systems/TheRock/build/third-party/sysdeps/linux/libdrm/build/stage/lib/rocm_sysdeps/lib/pkgconfig/../../include/libdrm -isystem /__w/rocm-systems/rocm-systems/TheRock/build/third-party/sysdeps/linux/elfutils/build/dist/lib/rocm_sysdeps/include -O3 -DNDEBUG 
-std=c++17 -fPIC -fvisibility=hidden -fvisibility-inlines-hidden -W -Wall -Wno-unknown-pragmas -faligned-new -rdynamic -fstack-protector-strong -Wstack-protector -MD -MT source/lib/rocprofiler-sdk/CMakeFiles/rocprofiler-sdk-object-library.dir/hip/abi.cpp.o -MF source/lib/rocprofiler-sdk/CMakeFiles/rocprofiler-sdk-object-library.dir/hip/abi.cpp.o.d -o source/lib/rocprofiler-sdk/CMakeFiles/rocprofiler-sdk-object-library.dir/hip/abi.cpp.o -c /__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/source/lib/rocprofiler-sdk/hip/abi.cpp
[rocprofiler-sdk] In file included from /__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/source/lib/rocprofiler-sdk/hip/abi.cpp:26:
[rocprofiler-sdk] /__w/rocm-systems/rocm-systems/projects/rocprofiler-sdk/source/lib/common/abi.hpp:62:27: error: static assertion failed: size of the API table struct has changed. Update the STEP_VERSION number (or in rare cases, the MAJOR_VERSION number)
[rocprofiler-sdk]    62 |             sizeof(TABLE) == ::rocprofiler::common::abi::compute_table_offset(NUM),                \
```
2025-11-04 14:29:58 -08:00
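The build failure quoted above comes from a compile-time check that ties the size of an API dispatch table to the number of entries the consumer expects. The sketch below shows the general idea only. `DemoApiTable`, `demo_table_offset`, and `kDemoNumFunctions` are hypothetical stand-ins, not the rocprofiler-sdk definitions, and the layout (a size field followed by function pointers) is an assumption.

```cpp
#include <cstddef>

// Hypothetical dispatch table: a size field followed by function pointers.
struct DemoApiTable {
  std::size_t size;
  void (*fn_a)();
  void (*fn_b)();
};

using DemoFnPtr = void (*)();

// Hypothetical stand-in for an offset helper: the expected table size
// for a given number of function-pointer entries.
constexpr std::size_t demo_table_offset(std::size_t num_functions) {
  return sizeof(std::size_t) + num_functions * sizeof(DemoFnPtr);
}

// The consumer records how many entries it was built against.
constexpr std::size_t kDemoNumFunctions = 2;

// Adding a third pointer to DemoApiTable without updating the count
// (and the version that goes with it) would fail right here.
static_assert(sizeof(DemoApiTable) == demo_table_offset(kDemoNumFunctions),
              "size of the API table struct has changed; update the version");
```

Adding an entry to the table therefore forces a matching update on the consumer side, which is why the new flags broke the rocprofiler-sdk build until the step version was updated.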
Todd tiantuo Li 7573fa168d SWDEV-554372 - Add 3 HIP_GET_PROC_ADDRESS_xxx flags (#1057) 2025-11-04 00:16:12 -08:00
dsicarov-amd 4915496bf9 SWDEV-533237 Add hipOccupancyAvailableDynamicSMemPerBlock API (#899)
* SWDEV-533237 Add initial support for hipOccupancyAvailableDynamicSMemPerBlock API

* SWDEV-533237 Add hipOccupancyAvailableDynamicSMemPerBlock wrapper for nvidia

* SWDEV-533237 Add implementation of hipOccupancyAvailableDynamicSMemPerBlock API

* SWDEV-533237 Add LDSAlignment field in Isa table

---------

Co-authored-by: Rahul Manocha <rmanocha@amd.com>
2025-10-29 10:58:42 +01:00
Rahul Manocha f5d901f016 SWDEV-546311 - implement hipKernelGetLibrary & hipLibraryEnumerateKer… (#1143)
* SWDEV-546311 - implement hipKernelGetLibrary & hipLibraryEnumerateKernels API

* Fix for LibraryEnumerateKernel and KernelGetName

* Update Enumerate Kernels to handle 0 numKernels

* Minor fixes to function names

* fix error checking in internal function

* Update changelog for new apis

---------

Co-authored-by: Rahul Manocha <rmanocha@amd.com>
2025-10-27 14:13:17 -07:00
Ajay GunaShekar 4a1a4aa472 SWDEV-560725 - remove final from amd_warp_functions (#1395)
final word used as a literal over a keyword.
2025-10-20 12:27:59 -07:00
Satyanvesh Dittakavi 9d32badcb7 SWDEV-545950 - Update indentation in hip_prof_str.h for hipStreamCopyAttributes (#1352) 2025-10-14 17:35:17 +05:30
Satyanvesh Dittakavi 46e683d41a SWDEV-545950 - Add hipStreamCopyAttributes API Implementation (#914)
* SWDEV-545950 - Add hipStreamCopyAttributes API Implementation

* Add unit test for hipStreamCopyAttributes API

* Add ChangeLog and nvidia mapping for the API

* Update rocprofiler-sdk with new HIP API details

* [rocprofiler-sdk] handle hipStreamCopyAttributes in stream tracing service

- this new HIP function has multiple stream arguments and needs to be skipped because it does not have an explicit create/destroy/set functionality

* Update HIP_RUNTIME_API_TABLE_STEP_VERSION in clr and rocprofiler-sdk

* Resolve merge conflicts

---------

Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2025-10-12 19:57:05 +05:30
Satyanvesh Dittakavi 6b85dcf227 SWDEV-557093 - Add nested tiled partition in HIP cooperative groups (#1166) 2025-10-10 00:21:44 +05:30
Rahul Manocha 27ec19116d SWDEV-557828 - fix hip-tests on cuda (#1152)
Co-authored-by: Rahul Manocha <rmanocha@amd.com>
2025-10-07 08:28:56 -07:00
Sam Ruscica 135c38b41c SWDEV-553436 Created wrapper functions for file read and file write (#935) 2025-10-07 09:42:22 -04:00
MachineTom 4a31affb76 Users/taosang/SWDEV-510994 - Refactor atomics header and tests (#902)
* SWDEV-550626 - Refactor atomics header and tests

1. Introduce __HIP_ATOMIC_BACKWARD_COMPAT.
By default we define __HIP_ATOMIC_BACKWARD_COMPAT=1 so that
HIP atomic functions keep the old assumptions. Users who
want the new behavior, which by default assumes
no fine-grained and no remote memory, can define
__HIP_ATOMIC_BACKWARD_COMPAT=0.

2. Use __HIP_ATOMIC_BACKWARD_COMPAT_MEMORY to replace the
original __HIP_FINE_GRAINED_MEMORY in the atomic header,
and apply __HIP_FINE_GRAINED_MEMORY to all
atomicXXX_system() functions to prevent failures on memory
allocated by hipHostMalloc().

3. Replace HIP_TEST_FINE_GRAINED_MEMORY with
HIP_TEST_ATOMIC_BACKWARD_COMPAT_MEMORY in hip-tests.

4. Fix negative test errors.
    Fix managed memory test error on memory order.
    Some other minor changes.
    As a result, all originally disabled tests are enabled.

5. Add more atomics tests in some cases.

6. Reduce test time in each case.
    Reduce the iteration count to 1 for tests that take too long.

7. Put common code into hip_test_common.hh
2025-09-25 10:58:59 -04:00
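The opt-out mechanism in item 1 of the commit above follows the usual default-macro pattern. A minimal sketch, assuming the header defines the macro only when the build has not already set it. The helper `uses_legacy_atomic_assumptions` is hypothetical and exists only to make the selected mode observable.

```cpp
// Default to the legacy behavior unless the build overrides it, for example
// with -D__HIP_ATOMIC_BACKWARD_COMPAT=0 on the compiler command line.
// The macro name comes from the commit message; the guard is an assumption.
#ifndef __HIP_ATOMIC_BACKWARD_COMPAT
#define __HIP_ATOMIC_BACKWARD_COMPAT 1
#endif

// Hypothetical helper reporting which mode was selected at compile time.
constexpr bool uses_legacy_atomic_assumptions() {
#if __HIP_ATOMIC_BACKWARD_COMPAT
  return true;   // old assumptions are kept
#else
  return false;  // new default: no fine-grained, no remote memory
#endif
}
```

With no override the helper returns true. Building with the macro set to 0 selects the new behavior.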
Jatin Chaudhary e79eaaa8a5 SWDEV-546287 - Implement hipLibrary load/unload (#975) 2025-09-19 22:23:49 +01:00
systems-assistant[bot] d5fc1b3703 SWDEV-548838 Add local and global fence support for barrier function (#437)
* SWDEV-548838 Add local and global fence support for barrier function

The original barrier function did not distinguish between local and global scope. There was only __CLK_LOCAL_MEM_FENCE, which triggered both a local and a global fence. This commit introduces __CLK_LOCAL_MEM_FENCE and __CLK_GLOBAL_MEM_FENCE, which properly distinguish the scopes.

---------

Co-authored-by: Tim <Tim.Gu@Amd.com>
Co-authored-by: systems-assistant[bot] <systems-assistant[bot]@users.noreply.github.com>
Co-authored-by: Tim Gu <timgu102@amd.com>
2025-09-16 14:20:57 -04:00
AidanBeltonS bf662640ee SWDEV-539805, SWDEV-553860 - Resolve GCC clang ABI mismatch and check vector alignment (#909)
* SWDEV-539805 - Add checks for vector alignment and size

* SWDEV-553860 - Alter alignment for gcc

* SWDEV-553860 - Align fallback method

* SWDEV-553860 - Alter alignment requirement
2025-09-16 17:10:14 +01:00
harkgill-amd d1b2b5ed44 Fix grid_group::group_dim to return grid_dim and not block_dim (#823)
* Fix grid_group::group_dim to return grid_dim and not block_dim

* Add unit test for grid_group.group_dim()

* Fix unit test errors

* Skip group_dim() assertions for base_type test
2025-09-15 09:42:55 -04:00
Jatin Chaudhary 3742814d82 SWDEV-553757 - add __HIP__ and __clang__ check for __shfl functions (#872) 2025-09-11 21:57:39 +01:00
Todd tiantuo Li c8ecf77a94 Update dispatch table to move 7.1 new APIs under HIP_RUNTIME_API_TABLE_STEP_VERSION 14 (#790) 2025-09-05 14:14:43 -07:00
Ioannis Assiouras 7bf7110ae8 SWDEV-550667 - Correct the check for availability of __hip_atomic_fetch_add (#818) 2025-09-04 15:15:34 +01:00
systems-assistant[bot] 7601798fa7 SWDEV-545953 - Add Implementation for hipStreamGetId (#434)
Authored-by: Satyanvesh Dittakavi <Satyanvesh.Dittakavi@amd.com>
2025-08-26 22:47:55 +05:30
Danylo Lytovchenko 2ff2316227 Adjust clang format to the new versions, revert broken macro layout (#714) 2025-08-22 17:23:22 +02:00
Danylo Lytovchenko f7338717ae SWDEV-470698 - fix formatting, add format check workflow (#657) 2025-08-20 19:58:06 +05:30
Betigeri, Sourabh 35e48d1eaf SWDEV-546293 - hipMemPrefetchAsync_v2 and hipMemAdvise_v2 implementation (#869)
SWDEV-546293 - hipMemPrefetchAsync hipMemAdvise_v2

[ROCm/clr commit: cbee74a80e]
2025-08-15 22:40:04 -07:00
Li, Todd tiantuo ad9eb56dd4 SWDEV-546285 - add hipGetDriverEntryPoint (#855)
[ROCm/clr commit: 789e2029ca]
2025-08-15 20:08:21 -07:00