Commit graph

6933 commits

Author SHA1 Message Date
Chaudhary, Jatin Jaikishan fcaefe97b8 SWDEV-509213 - make cmake_minimum_required consistent across clr (#51)
Change-Id: Ib0b1df7af8984a37d6bf7ca68ec99597d5978821
2025-04-15 15:23:41 +05:30
Chaudhary, Jatin Jaikishan 588cf0fc69 SWDEV-520627 - include warp functions header for warpSize (#177)
Change-Id: Id3fff8f2722d521071ef0ff71b09fc365ef6fa82
2025-04-15 14:40:27 +05:30
Chaudhary, Jatin Jaikishan 07e57a1f0d SWDEV-517941 - use device bitcode before spirv (#95)
Also add flag: HIP_FORCE_SPIRV_CODEOBJECT to allow override to force use
SPIRV.

* use cache for already compiled code objects

* address review comments and use the two spirv isa names
2025-04-14 23:40:52 +01:00
Milanov, Aleksandar c4fa3ef927 SWDEV-526208 - Fix miscalculation of coalesced tiled partition mask (#162) 2025-04-11 19:40:26 +02:00
Hernandez, Gerardo 66496258b4 SWDEV-521920 - Fix compilation issues introduced by the reduce sync operations - 2 (#167)
Fix pytorch 2.5 issues, by defining reduce sync operations for type __half in amd_hip_fp16.h and not in
amd_warp_sync_functions.h which is problematic in case __half does not get included before that header.
Only define types not supported by cuda if HIP_ENABLE_EXTRA_WARP_SYNC_TYPES is defined, to avoid portability issues
2025-04-11 17:00:59 +05:30
Haehnle, Nicolai 199b0f1086 Report null stream creation failure (#152)
Explicitly nulling the pointer causes us to report the error below
instead of keeping a dangling pointer around that will most likely lead
to a subsequent segfault.
2025-04-10 11:40:05 -07:00
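The pattern described in this commit message (resetting the pointer after a failed creation so that a later null check reports the error) can be sketched in standalone C++. This is only an illustration: `Stream`, `Device`, `Create`, and `NullStream` are hypothetical stand-ins, not the actual clr classes or the code touched by the commit.

```cpp
#include <cstdio>

// Hypothetical stand-in for a runtime stream object; not the real clr type.
struct Stream {
  bool fail_create;
  explicit Stream(bool fail) : fail_create(fail) {}
  // Returns false to simulate a creation failure.
  bool Create() const { return !fail_create; }
};

// Hypothetical device that lazily owns a null (default) stream.
struct Device {
  Stream* null_stream = nullptr;

  // Returns the null stream, or nullptr if creation failed.
  Stream* NullStream(bool simulate_failure) {
    if (null_stream == nullptr) {
      null_stream = new Stream(simulate_failure);
      if (!null_stream->Create()) {
        delete null_stream;
        // Explicitly null the pointer so the check below reports the
        // failure instead of leaving a dangling pointer behind.
        null_stream = nullptr;
      }
    }
    if (null_stream == nullptr) {
      std::fprintf(stderr, "null stream creation failed\n");
    }
    return null_stream;
  }

  ~Device() { delete null_stream; }
};
```

Without the reset, `null_stream` would still hold the freed address, and the next caller would skip creation and dereference it.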
Sang, Tao 6d10577761 SWDEV-521083 - Fix atomicMin/Max issues (#151)
Fix atomicMin/Max(), atomicMin/Max_system() issue on
float types.
2025-04-10 12:30:55 -04:00
Chaudhary, Jatin Jaikishan 628777b73d SWDEV-461087 - fp4/fp6/fp8 ocp headers (#41)
This now has host conversions too, which is directly from Christopher's
work on fcbx.

Signed-off-by: Christopher M. Riedl

* add const to func parameter

* do not depend on builtins, use gfx950 detection
2025-04-10 17:22:15 +01:00
Xie, Jiabao(Jimbo) 0d6e554d92 SWDEV-524188 - Check for VRam and system RAM properly (#122)
Currently, we check if there's enough system RAM even if we don't allocate on host device. This is incorrect logic.
We should not check for this size on windows because PAL checks for memory allocation. See SWDEV-467263.

Co-authored-by: Jimbo Xie <jiabaxie@amd.com>
2025-04-10 21:50:48 +05:30
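One possible reading of the logic this commit message describes is sketched below: the system RAM check applies only to host allocations, and the size check is skipped where a lower layer already validates the allocation. All names (`AllocTarget`, `HasEnoughMemory`, the parameters) are hypothetical, and skipping the check entirely on Windows is an assumption about what "this size" refers to.

```cpp
#include <cstdint>

// Hypothetical allocation target; not a clr type.
enum class AllocTarget { Device, Host };

// Returns true if an allocation of `size` bytes should be attempted.
bool HasEnoughMemory(AllocTarget target, std::uint64_t size,
                     std::uint64_t free_vram, std::uint64_t free_system_ram,
                     bool lower_layer_validates) {
  // Assumption: when a lower layer (PAL on Windows, per the commit message)
  // validates allocations itself, the runtime skips its own size check.
  if (lower_layer_validates) return true;
  // Only consult system RAM when the allocation actually lands on the host.
  if (target == AllocTarget::Host) return size <= free_system_ram;
  return size <= free_vram;
}
```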
Chaudhary, Jatin Jaikishan 5c030840d6 SWDEV-520627 - include sync_warp header instead of warp function header (#18)
Change-Id: Ic3f54b0f5bfee8565a8bbb6218fb0ccdb900c9ea
2025-04-10 21:50:25 +05:30
Patel, Jaydeepkumar 8531cd3bbe SWDEV-508973 - If total # of threads/block is more than HW capacity, it's invalid config issue and should return invalid config error. (#25) 2025-04-10 15:16:16 +05:30
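The check named in this commit title can be sketched as a small standalone function. `LaunchStatus`, `Dim3`, and `ValidateBlock` are hypothetical names used for illustration; the real runtime reports the condition through its own error codes and device limits.

```cpp
#include <cstdint>

// Hypothetical status codes; the names only echo the commit title.
enum class LaunchStatus { Success, InvalidConfiguration };

// Hypothetical block dimensions, analogous to a dim3.
struct Dim3 {
  std::uint32_t x, y, z;
};

// Rejects a block whose total thread count exceeds the hardware limit.
LaunchStatus ValidateBlock(const Dim3& block,
                           std::uint32_t max_threads_per_block) {
  // Multiply in two steps with 64-bit intermediates so that very large
  // dimensions cannot wrap around and slip past the check.
  const std::uint64_t xy = std::uint64_t{block.x} * block.y;
  if (xy > max_threads_per_block) return LaunchStatus::InvalidConfiguration;
  const std::uint64_t total = xy * block.z;
  if (total > max_threads_per_block) return LaunchStatus::InvalidConfiguration;
  return LaunchStatus::Success;
}
```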
Stojiljkovic, Vladana e91cb4f320 SWDEV-505795 - Return the same ptr from hipIpcOpenMemHandle if it is called multiple times (#93)
* SWDEV-505795 - Return the same ptr from hipIpcOpenMemHandle if it is called multiple times

* Move initialization outside of if statement
2025-04-10 11:20:36 +02:00
Chaudhary, Jatin Jaikishan 5214d1ca07 SWDEV-525969 - add gfx950 to use ocp type for fp8 (#157)
* do not use __gfx94plus_clr__ macro in fp8 header
2025-04-09 16:21:39 +01:00
Stojiljkovic, Vladana bc474ea5af Set maxTexture2DLinear fields in deviceProp (#89) 2025-04-09 17:13:49 +02:00
Brzak, Branislav b006380ff6 SWDEV-525653 - Make hipGetDeviceProperties and hipChooseDevice use the new API (#159) 2025-04-08 18:54:05 +02:00
Andryeyev, German e974f7fde1 SWDEV-497841 - Add VmHeapArray support (#76)
Add VmHeapArray class to reduce the pressure on VA reservation, since
multiple memory pools can be active at the same time.
2025-04-03 21:04:18 +05:30
Andryeyev, German 3514f45544 SWDEV-524849 - Fix HIP error returned during capture (#141)
Always use the latest dependent nodes during hipEventRecord capture
2025-04-03 20:08:25 +05:30
Betigeri, Sourabh 8c6b90996e SWDEV-523281 - [clr] Implementation of hipLaunchKernelExC and hipDrvLaunchKernelEx API with support for cooperative launch (#92) 2025-04-03 20:10:05 +09:00
Sang, Tao 8d90b44a1b SWDEV-508863 - Support generic target in compressed fatbin (#44) 2025-03-27 20:13:51 +05:30
Belton-Schure, Aidan ded41058a0 SWDEV-515426 - Use RAII classes for comgr (#28)
Change-Id: I9f6005542cc88f1e16e22741dcc0ce904fdaa2b0
2025-03-25 20:10:44 +05:30
Dittakavi, Satyanvesh 376f23b86a SWDEV-516595 - Add __shfl functions with __hip_bfloat16 datatype (#42)
Also removes asserts in cooperative groups shfl functions since
__hip_bfloat16 shfl is present now

Change-Id: I57578b6e68dccc10c2ddcd194e9cc18bc7732ce1
2025-03-25 15:38:01 +05:30
Arandjelovic, Marko e7ada4effe Revert SWDEV-512344 - Unmap all subbuffers (#26)
This reverts commit 0b69120cfcb5b4689d9f2037b1a01e274d85c20f.
2025-03-19 21:17:36 +05:30
Godavarthy Surya, Anusha 2259a8c01c Revert "SWDEV-492049 - Remove the handle of Phy Mem from Memobj" (#72)
This reverts commit 231b2410a0.
2025-03-19 21:16:51 +05:30
Andryeyev, German 392ed53c3c SWDEV-497841 - Avoid access to the null stream on mempool alloc
Null stream isn't created during the device creation
2025-03-17 11:40:14 -04:00
Gerardo Hernandez 340d6bb69f SWDEV-420237 - Add __reduce_add_sync()
Change-Id: Ic8e4fab6b7aeb879d40b2c1419b30d1355a2bbdc
2025-03-12 03:20:49 -04:00
Rakesh Roy 5da8ce45ab Revert "SWDEV-508982 - [6.4 Preview] - Handle hipMemPoolCreate, hipMemPoolDestory & hipDeviceSetMemPool during stream capture."
This reverts commit 57df1b348f.

Reason for revert: 6.4 Preview changes need not be merged to amd-staging as of now

Change-Id: I86452adfed14655f72d90440a486089743cc6587
2025-03-07 06:43:24 -05:00
Rakesh Roy 4206405514 Revert "SWDEV-510271 - [6.4 Preview] fix hipCreateSurfaceObject & hipDestroySurfaceObject during stream capture"
This reverts commit c07468e53c.

Reason for revert: 6.4 Preview changes need not be merged to amd-staging as of now

Change-Id: Ifba0c8a248bc40deaa9c59b7f2901531300e5ea4
2025-03-07 06:42:12 -05:00
Rakesh Roy 3fa6049c46 Revert "SWDEV-508980 - [6.4 Preview] fix hipDeviceSetCacheConfig during stream capture"
This reverts commit 9faaf20aae.

Reason for revert: 6.4 Preview changes need not be merged to amd-staging as of now

Change-Id: I04af8603053338f08c396e78ff8a6715e641ca19
2025-03-07 06:40:53 -05:00
Ioannis Assiouras 8f54aeb765 SWDEV-511813 - Fix linkage of hipRTC-header.o into libhiprtc.a
Using target_link_libraries does not properly link the hipRTC-header.o
into libhiprtc for static build. Change to use target_sources instead.
This does not affect the linkage in the shared build.

Change-Id: I626f9eacc1637b792a50e7ddddb5db09e704ac4a
2025-03-06 16:29:57 -05:00
Jacob Lambert 2e2b6b3592 SWDEV-518221 Fix major/minor Comgr version check
Change-Id: I2210aadafcae984dafc68c3fe16508bb2b409077
2025-03-06 13:02:34 -05:00
Julia Jiang e5425393b4 SWDEV - 508961 - Update requestedHanleTye in CLR repos
Change-Id: I6949a36c5b0bb8e88a2a33ed13ae8f278a5b19c7
2025-03-06 11:37:31 -05:00
taosang2 d91e1f19d0 SWDEV-512613 - Improve device atomics functions
Also part of SWDEV-510994.
1. Fix atomicMin/Max_system() for float and double.
2. Remove logic for gfx941, which isn't supported.

Change-Id: Iacfdc1bc13e8da2f5df8751bb315b37d33cea667
2025-03-06 10:05:59 -05:00
Ioannis Assiouras e963d30b5d SWDEV-517715 - Remove dependency on non-static hipcc from hip-static-devel
Change-Id: I1184680949fa73d7dc0957062292e6682179b203
2025-03-06 10:01:58 -05:00
Saleel Kudchadker 940347ad42 SWDEV-508004 - Improve hipStreamWaitEvent & Fix typo
- hipStreamWaitEvent may not resolve streams
- Correct usage of flag passed to streamWait function

Change-Id: I2ee163615d303b98937c1035d60da283cce6f677
2025-03-05 11:56:01 -05:00
Pengda Xie ae3b053ddf SWDEV-518317 - Don't attempt to remove managedVars when map is empty
Change-Id: I25c33487dc08f96c087b6acc1abe42a4a666a609
2025-03-05 11:53:18 -05:00
Branislav Brzak c2d1776ebd SWDEV-516564 - SWDEV-512817 - Remove mentions of gfx940 and gfx941
Change-Id: Ia069fcb9c6948c3fc9a00961593c9dcc59609375
2025-03-05 04:26:07 -05:00
Saleel Kudchadker e03e4f3b5d SWDEV-502365 - Track last used command
- This change tries to save extra synchronization packets we may insert
  as we didn't track the completion signals for every command. We track
  the current enqueued command until it exits the enqueue stage. We also
  record the exit scope to know if we flushed the caches
- Handle correct release scopes and store completion signal as HW events
- Use a new finishCommand implementation to only wait for the command
  passed as the argument

Change-Id: Ie4350c5dd24f5d48dfa6ccbabd892f0544caadcc
2025-03-04 16:05:02 -05:00
German Andryeyev cece301fd4 SWDEV-518474 - Add comgr debug mask
Move prints from CO processing under COMGR debug mask.

Change-Id: I2a417e42a1f4e2922a34eb104c69e4db10b5f1c6
2025-03-04 14:37:08 -05:00
Julia Jiang 81db54d3f9 SWDEV-509855 - Update hipDeviceAttributePciDomainID in CLR
Change-Id: I79939b333ef6114b97009ca4bfb67f63a9a22784
2025-03-04 14:08:08 -05:00
Marko Arandjelovic 3ec1d2d2f1 SWDEV-512344 - Unmap all subbuffers
Since hipMemMap can be called for multiple device handles on the same virtual memory, the same is true for hipMemUnmap, meaning that virtual memory can be "partially unmapped".

This means that the unmap function can be called for a specific part of the reserved address, meaning that only the designated subbuffer should be released. If unmap is called on the entire reserved memory, then all subbuffers should be released.

The main point is that for every hsa_amd_vmem_map, there should be a corresponding hsa_amd_vmem_unmap. Otherwise, if entire memory is unmapped by a single unmap call, then HSA will report the memory as "in use" if an attempt is made to delete it.

Change-Id: I039308eafb820decfb1c09f60347f26cdad1a362
2025-03-02 13:41:48 -05:00
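The bookkeeping rule in this commit message (one unmap per earlier map, whether a single subbuffer or the whole reserved range is unmapped) can be sketched as follows. `ReservedRange` and its methods are hypothetical and only model the counting; the real code would issue hsa_amd_vmem_map and hsa_amd_vmem_unmap calls where the comments indicate. The sketch assumes the subbuffers do not overlap.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical bookkeeping for subbuffers mapped inside one reserved
// virtual address range. Not the clr implementation.
class ReservedRange {
 public:
  // Records one mapping (stands in for one hsa_amd_vmem_map call).
  void Map(std::uintptr_t addr, std::size_t size) { mapped_[addr] = size; }

  // Releases every subbuffer lying fully inside [addr, addr + size) and
  // returns how many per-subbuffer unmaps were issued.
  std::size_t Unmap(std::uintptr_t addr, std::size_t size) {
    std::size_t released = 0;
    auto it = mapped_.lower_bound(addr);
    while (it != mapped_.end() && it->first + it->second <= addr + size) {
      // One unmap per earlier map (stands in for hsa_amd_vmem_unmap).
      it = mapped_.erase(it);
      ++released;
    }
    return released;
  }

  // True while any subbuffer is still mapped, in which case freeing the
  // reserved range would be reported as "in use".
  bool InUse() const { return !mapped_.empty(); }

 private:
  std::map<std::uintptr_t, std::size_t> mapped_;  // start address -> size
};
```

Unmapping a single subbuffer leaves the others mapped, while unmapping the whole reserved range walks all remaining subbuffers, so `InUse()` only becomes false once every map has been paired with an unmap.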
Ioannis Assiouras e9b33af45a SWDEV-509788 - Code cleanups in Event class
Change-Id: I4163ce6c1dabeaab92de13b51b6a46b7be83e2bd
2025-02-27 17:16:50 -05:00
Pengda Xie ade704dd2f SWDEV-512044 - Fix logic error in texture size validation
Change-Id: I6aefcfed25b099c17bf0856d621081c0a5ce46c5
2025-02-26 11:20:58 -05:00
Ioannis Assiouras a8f309049d SWDEV-516994 - Fix race condition in the implementation of graph AutoFreeOnLaunch on Windows
Change-Id: I3c98d0d4bffe2a9e0aa5cfa24b6c8e9a8087da29
2025-02-26 02:36:31 -05:00
Rahul Manocha 5930f047bb SWDEV-489106 - Linker API addition to runtime
1) Add Linker APIs to runtime to support SPIRV linking
2) Migrate Internal implementations to runtime and share with rtc
3) Add Support to bundled and unbundled SPIRV Code object linking.

Change-Id: Ic1fd4431f842a208a2468e8aec54a65b5fa6b0e3
2025-02-22 13:39:23 -05:00
Li, Todd tiantuo 9faaf20aae SWDEV-508980 - [6.4 Preview] fix hipDeviceSetCacheConfig during stream capture
Change-Id: I8e89774a8163fdc120155f742606ee2c0aa7103b
2025-02-22 01:05:28 -05:00
Li, Todd tiantuo c07468e53c SWDEV-510271 - [6.4 Preview] fix hipCreateSurfaceObject & hipDestroySurfaceObject during stream capture
Change-Id: I19e149549c271d847f52b72e04cb2427ca194b24
2025-02-22 01:04:35 -05:00
Ioannis Assiouras 721c5800ca SWDEV-509788 - Use stream memory operation in hipStreamWaitEvent
This change removes the stream callback from hipStreamWaitEvent and
uses a stream memory wait operation instead. This allows the
hipStreamWaitEvent to be non-blocking on the host.

Change-Id: Ie5530febda5a5bcb5daa0db8a01249d6b137fd43
2025-02-21 11:46:09 -05:00
German Andryeyev 296dce5570 SWDEV-497841 - Add virtual memory heap
Add initial implementation of virtual memory heap with
dynamic virtual memory mapping support for memory pools.
DEBUG_HIP_MEM_POOL_VMHEAP controls the new method.

Change-Id: I8dc5be2e0f34ab472f1800f43bb6243639a5e500
2025-02-20 10:55:49 -05:00
Jimbo Xie 7a4a22d454 SWDEV-477219 - implement hipEventRecordWithFlags
Change-Id: Icf07e85fc8c15f921f6e7c9fbd31dd3856dc988b
2025-02-19 13:53:00 -05:00
Jatin Chaudhary c3f49c8788 SWDEV-511239 - make fp8 standalone host compileable
- Use correct header in device_library_decl
- use std:: instead of __hip_internal:: for host compilation
- hide device specific stuff behind __clang__ and __HIP__ check

Change-Id: I2f3647e00555ed0e79f9954a459c41394c3cd49b
2025-02-18 19:07:45 -05:00