Commit Graph

13474 Commits

Author SHA1 Message Date
Chaudhary, Jatin Jaikishan fcaefe97b8 SWDEV-509213 - make cmake_minimum_required consistent across clr (#51)
Change-Id: Ib0b1df7af8984a37d6bf7ca68ec99597d5978821
2025-04-15 15:23:41 +05:30
Chaudhary, Jatin Jaikishan 588cf0fc69 SWDEV-520627 - include warp functions header for warpSize (#177)
Change-Id: Id3fff8f2722d521071ef0ff71b09fc365ef6fa82
2025-04-15 14:40:27 +05:30
Chaudhary, Jatin Jaikishan 07e57a1f0d SWDEV-517941 - use device bitcode before spirv (#95)
Also add flag: HIP_FORCE_SPIRV_CODEOBJECT to allow override to force use
SPIRV.

* use cache for already compiled code objects

* address review comments and use the two spirv isa names
2025-04-14 23:40:52 +01:00
Six, Lancelot 7b72c1b786 SWDEV-517078 - Update 2nd level trap handlers (#148)
* SWDEV-517078 - Maintain the trap handler ABI version in CLR

The trap handler ABI version is communicated to the debugger using
the r_version field in the r_debug structure.  This structure is
an external dependency, which makes it complicated to keep the trap
handler source (in CLR) and the ABI version number (external dependency)
in sync.

This patch proposes to patch the trap handler ABI version number in
_amdgpu_r_debug before communicating it to the debugger.

We can't directly include sc's executable.hpp file in CLR as it relies
on conflicting definitions of ELF-related types, so instead we need to
rely on a priori knowledge of the r_debug structure.  Fortunately, this
structure is part of a stable ABI, so its layout is guaranteed to be
kept stable.

Update the 2nd level trap handler to follow updates from the
ROCr-runtime.  The trap handlers are stripped from parts dedicated to
architectures unsupported by CLR.

Bump the r_debug.r_version to track the ABI changes in the trap handler.
2025-04-11 18:59:54 +01:00
Milanov, Aleksandar c4fa3ef927 SWDEV-526208 - Fix miscalculation of coalesced tiled partition mask (#162) 2025-04-11 19:40:26 +02:00
Hernandez, Gerardo 66496258b4 SWDEV-521920 - Fix compilation issues introduced by the reduce sync operations - 2 (#167)
Fix PyTorch 2.5 issues by defining reduce sync operations for type __half in amd_hip_fp16.h rather than in
amd_warp_sync_functions.h, which is problematic when __half is not included before that header.
Only define types not supported by CUDA if HIP_ENABLE_EXTRA_WARP_SYNC_TYPES is defined, to avoid portability issues.
2025-04-11 17:00:59 +05:30
Yao, Longlong 0de73eeaf8 SWDEV-518966 - Avoid creating Arena Memobj for VMM pointer (#39)
Change-Id: I69c6c0a1464d01e674ac929de34ab10047012f1a

Signed-off-by: Longlong Yao <Longlong.Yao@amd.com>
2025-04-11 16:55:53 +05:30
Haehnle, Nicolai 199b0f1086 Report null stream creation failure (#152)
Explicitly nulling the pointer causes us to report the error below
instead of keeping a dangling pointer around that will most likely lead
to a subsequent segfault.
2025-04-10 11:40:05 -07:00
Jiang, Julia b44f5f9992 SWDEV-525231 - Update changelog for 6.5 feature implementations (#150) 2025-04-10 14:17:32 -04:00
Sang, Tao 6d10577761 SWDEV-521083 - Fix atomicMin/Max issues (#151)
Fix atomicMin/Max(), atomicMin/Max_system() issue on
float types.
2025-04-10 12:30:55 -04:00
Chaudhary, Jatin Jaikishan 628777b73d SWDEV-461087 - fp4/fp6/fp8 ocp headers (#41)
This now has host conversions too, which is directly from Christopher's
work on fcbx.

Signed-off-by: Christopher M. Riedl

* add const to func parameter

* do not depend on builtins, use gfx950 detection
2025-04-10 17:22:15 +01:00
Andryeyev, German 4c363df3bf SWDEV-517481 - Add more restrictions to the queue management (#168) 2025-04-10 21:51:45 +05:30
Xie, Jiabao(Jimbo) 0d6e554d92 SWDEV-524188 - Check for VRam and system RAM properly (#122)
Currently, we check whether there is enough system RAM even when we don't allocate on the host device, which is incorrect.
We should not check for this size on Windows because PAL checks for memory allocation. See SWDEV-467263.

Co-authored-by: Jimbo Xie <jiabaxie@amd.com>
2025-04-10 21:50:48 +05:30
Chaudhary, Jatin Jaikishan 5c030840d6 SWDEV-520627 - include sync_warp header instead of warp function header (#18)
Change-Id: Ic3f54b0f5bfee8565a8bbb6218fb0ccdb900c9ea
2025-04-10 21:50:25 +05:30
Andryeyev, German 94cd9bc4f7 SWDEV-525725 - Enable resource cache for SVM (#156)
- Make sure reserved_va_ is updated before svmPtr is overwritten
2025-04-10 10:54:28 -04:00
Patel, Jaydeepkumar 997519fa94 SWDEV-521262 - Adding MSVC compiler options to fix the conflict with SC module while building hip in debug. (#24) 2025-04-10 15:25:58 +05:30
Patel, Jaydeepkumar 8531cd3bbe SWDEV-508973 - If total # of threads/block is more than HW capacity, it's invalid config issue and should return invalid config error. (#25) 2025-04-10 15:16:16 +05:30
Stojiljkovic, Vladana e91cb4f320 SWDEV-505795 - Return the same ptr from hipIpcOpenMemHandle if it is called multiple times (#93)
* SWDEV-505795 - Return the same ptr from hipIpcOpenMemHandle if it is called multiple times

* Move initialization outside of if statement
2025-04-10 11:20:36 +02:00
Chaudhary, Jatin Jaikishan 5214d1ca07 SWDEV-525969 - add gfx950 to use ocp type for fp8 (#157)
* do not use __gfx94plus_clr__ macro in fp8 header
2025-04-09 16:21:39 +01:00
Stojiljkovic, Vladana bc474ea5af Set maxTexture2DLinear fields in deviceProp (#89) 2025-04-09 17:13:49 +02:00
Sang, Tao 18d191fd1d SWDEV-523824 - Fix data validation issue of rocFFT (#154)
Fix a rocFFT data validation issue when dynamic queues are on.
ReleaseHwQueue() can be called only when there are no commands in HostQueue.
The check of that condition needs to be protected by a lock.
2025-04-08 20:30:06 -04:00
Brzak, Branislav b006380ff6 SWDEV-525653 - Make hipGetDeviceProperties and hipChooseDevice use the new API (#159) 2025-04-08 18:54:05 +02:00
Patel, Jaydeepkumar 9e7248aa36 SWDEV-521011 - Allow max stack size as per ISA. (#73) 2025-04-08 10:15:38 +05:30
Andryeyev, German e974f7fde1 SWDEV-497841 - Add VmHeapArray support (#76)
Add VmHeapArray class to reduce the pressure on VA reservation, since
multiple memory pools can be active at the same time.
2025-04-03 21:04:18 +05:30
Andryeyev, German 3514f45544 SWDEV-524849 - Fix HIP error returned during capture (#141)
Always use the latest dependent nodes during hipEventRecord capture
2025-04-03 20:08:25 +05:30
Betigeri, Sourabh 8c6b90996e SWDEV-523281 - [clr] Implementation of hipLaunchKernelExC and hipDrvLaunchKernelEx API with support for cooperative launch (#92) 2025-04-03 20:10:05 +09:00
Arandjelovic, Marko 8fcaa1ca93 SWDEV-517867 - Remove invalid assert (#55)
* Remove invalid assert

* Retrigger CI

* Rebase
2025-04-03 11:14:32 +02:00
Patel, Jaydeepkumar b5c9cbc236 SWDEV-508632 - Align address to 2 MBs for hidden heap allocation. (#29) 2025-04-02 16:33:29 +05:30
Mallya, Ameya Keshava 98f1db181c fixed syntax to mainline 2025-04-01 09:51:41 -07:00
Mallya, Ameya Keshava ae1d0ef8a1 !verify functionality 2025-03-31 13:14:08 -07:00
Mallya, Ameya Keshava 24184e151c Adding KWS check for amd-mainline 2025-03-28 08:05:47 -07:00
MartinezFernandez, Juan f580632174 Remove PCH code: the code related to PCH is dead and not used (#66)
cherry-pick of compute/ec/clr/+/1184122

Co-authored-by: Juan Manuel Martinez Caamaño <juamarti@amd.com>
2025-03-28 10:36:19 +01:00
Sang, Tao 8d90b44a1b SWDEV-508863 - Support generic target in compressed fatbin (#44) 2025-03-27 20:13:51 +05:30
GunaShekar, Ajay 686dd56a4e SWDEV-523853 - Use RecordRenderOps instead of RecordRenderOp (#97) 2025-03-26 09:28:40 +05:30
Belton-Schure, Aidan ded41058a0 SWDEV-515426 - Use RAII classes for comgr (#28)
Change-Id: I9f6005542cc88f1e16e22741dcc0ce904fdaa2b0
2025-03-25 20:10:44 +05:30
Dittakavi, Satyanvesh 376f23b86a SWDEV-516595 - Add __shfl functions with __hip_bfloat16 datatype (#42)
Also removes asserts in cooperative groups shfl functions since
__hip_bfloat16 shfl is present now

Change-Id: I57578b6e68dccc10c2ddcd194e9cc18bc7732ce1
2025-03-25 15:38:01 +05:30
Gupta, Maneesh d9abcdd999 Update CODEOWNERS (#77) 2025-03-20 15:40:50 +05:30
Arandjelovic, Marko e7ada4effe Revert SWDEV-512344 - Unmap all subbuffers (#26)
This reverts commit 0b69120cfcb5b4689d9f2037b1a01e274d85c20f.
2025-03-19 21:17:36 +05:30
Godavarthy Surya, Anusha 2259a8c01c Revert "SWDEV-492049 - Remove the handle of Phy Mem from Memobj" (#72)
This reverts commit 231b2410a0.
2025-03-19 21:16:51 +05:30
Andryeyev, German 28967982b2 SWDEV-517481 - Add dynamic queue management (#37)
Enabled by default. DEBUG_HIP_DYNAMIC_QUEUES controls the feature.
2025-03-19 11:22:50 -04:00
Andryeyev, German 392ed53c3c SWDEV-497841 - Avoid access to the null stream on mempool alloc
Null stream isn't created during the device creation
2025-03-17 11:40:14 -04:00
Mallya, Ameya Keshava cde722ad71 Added KWS check 2025-03-12 10:12:06 -07:00
Mallya, Ameya Keshava 35dcd43c59 Added rocm-ci-caller 2025-03-12 10:05:57 -07:00
Gerardo Hernandez 340d6bb69f SWDEV-420237 - Add __reduce_add_sync()
Change-Id: Ic8e4fab6b7aeb879d40b2c1419b30d1355a2bbdc
2025-03-12 03:20:49 -04:00
agunashe f1b8ee7b7f SWDEV-513810 - APU: memory allocations threshold 0.75-->1
Needs further debugging but for now can test the change

Need to verify whether this fixes all of the issues below:
SWDEV-512754, SWDEV-511675, SWDEV-511055, SWDEV-504085, SWDEV-499503
Also verify original issues
SWDEV-471863, SWDEV-490991

Change-Id: Ic845f851de1b98e8ed9aa0f07afddec3858119e9
2025-03-11 05:30:43 -04:00
Saleel Kudchadker 78d0ff2dbc SWDEV-519596 - Avoid passing dep signal to SDMA
- For D2H cases avoid passing dependent signals to SDMA, the signals
  take a while to resolve on SDMA engine

Change-Id: I569635228af977847f201c82ca897002f8f2f4a8
2025-03-07 17:37:21 -05:00
Pengda Xie b02b1858c0 SWDEV-497619 - Ensure suballocSize is integer multiple of 4096
Change-Id: Iefc452d73566f58cfb63391a68c836f30d77dd6c
2025-03-07 15:36:57 -05:00
Rakesh Roy 5da8ce45ab Revert "SWDEV-508982 - [6.4 Preview] - Handle hipMemPoolCreate, hipMemPoolDestroy & hipDeviceSetMemPool during stream capture."
This reverts commit 57df1b348f.

Reason for revert: 6.4 Preview changes need not be merged to amd-staging as of now

Change-Id: I86452adfed14655f72d90440a486089743cc6587
2025-03-07 06:43:24 -05:00
Rakesh Roy 4206405514 Revert "SWDEV-510271 - [6.4 Preview] fix hipCreateSurfaceObject & hipDestroySurfaceObject during stream capture"
This reverts commit c07468e53c.

Reason for revert: 6.4 Preview changes need not be merged to amd-staging as of now

Change-Id: Ifba0c8a248bc40deaa9c59b7f2901531300e5ea4
2025-03-07 06:42:12 -05:00
Rakesh Roy 3fa6049c46 Revert "SWDEV-508980 - [6.4 Preview] fix hipDeviceSetCacheConfig during stream capture"
This reverts commit 9faaf20aae.

Reason for revert: 6.4 Preview changes need not be merged to amd-staging as of now

Change-Id: I04af8603053338f08c396e78ff8a6715e641ca19
2025-03-07 06:40:53 -05:00