rocm-systems

Author	SHA1	Message	Date
Sang, Tao	45b75013ec	SWDEV-516050 - Fix monitor hang in OCL (#75 ) Fix monitor hang in cts integer_ops. Improve notify(). Won't affect notifyAll() and Hip in direct dispatch mode. Change-Id: I95a458358e1cab9c76aefde117db09cdbd1fd3af [ROCm/clr commit: `78f92901d8`]	2025-04-23 14:34:53 -04:00
Xie, Jiabao(Jimbo)	c7737558a4	SWDEV-441487 - add gfx1150/1 support to amd-staging clr (#182 ) Co-authored-by: Jimbo Xie <jiabaxie@amd.com> [ROCm/clr commit: `9a8c9e70b2`]	2025-04-23 20:43:03 +05:30
GunaShekar, Ajay	34d3f7022b	SWDEV-523281 - CHANGELOG.md and negative test return values : hipLaunchKernelEx, hipLaunchKernelExC, hipDrvLaunchKernelEx (#155 ) [ROCm/clr commit: `64d6f5714a`]	2025-04-22 21:47:37 +05:30
Andryeyev, German	f8344154a0	SWDEV-497841 - Enable memory manager by default (#149 ) [ROCm/clr commit: `a5c860f3b0`]	2025-04-22 21:20:37 +05:30
Andryeyev, German	0569c7713c	SWDEV-523300 - Add the new option to build HIP (#179 ) Add the new cmake option AMD_COMPUTE_WIN to build HIP on Windows from the public github. AMD_COMPUTE_WIN should point to a special repo with the PAL static libs [ROCm/clr commit: `a3effa16f1`]	2025-04-22 21:05:04 +05:30
Hernandez, Gerardo	ba5a9a5395	SWDEV-420237 - Fix reduce sync operations when masks are divergent (#181 ) Do not use __ockl_activelane_u32() to calculate the index of the lane within the mask, as that would not work with divergent masks that have other bits on before the associated lane. [ROCm/clr commit: `1a8d766836`]	2025-04-22 19:47:58 +05:30
Godavarthy Surya, Anusha	41c4bea0f5	SWDEV-508538 - Optimize mem access and pack structure (#71 ) Change-Id: Ib05b8891a6d228fc3266918a000d332fddc7438b [ROCm/clr commit: `bf28bbd9ab`]	2025-04-21 13:43:25 +05:30
Brzak, Branislav	5df4734f71	SWDEV-526612 - Add missing copyright notices (#201 ) [ROCm/clr commit: `99142c3dd9`]	2025-04-18 20:54:27 +05:30
Ramirez, Lucas	0a45aa85c5	SWDEV-524612 - Consider "1" a truthy value for WGPMode (#187 ) The compiler currently serializes the workgroup_processor_mode COMGR metadata boolean field as "0"/"1" instead of "false"/"true". Consider "1" a truthy value during parsing. [ROCm/clr commit: `d020598a0f`]	2025-04-17 11:50:07 +02:00
Brzak, Branislav	6ae4c7278e	SWDEV-525423 - In COMGR Loader don't open file if image is already mapped (#193 ) [ROCm/clr commit: `d00b2a0953`]	2025-04-16 11:00:54 +02:00
Arandjelovic, Marko	963de868ae	SWDEV-523137 - function ptrs should match across all devices (#171 ) [ROCm/clr commit: `5fe080fd67`]	2025-04-16 10:35:48 +02:00
Andryeyev, German	a9df586812	SWDEV-459758 - Pass workgroup size explicitly (#185 ) It's easier for compiler to move explicit kernel arguments into user SGPRs [ROCm/clr commit: `3fd7650fe3`]	2025-04-15 15:22:15 -04:00
Chaudhary, Jatin Jaikishan	7257b705ce	SWDEV-512924 - add fp4 API (#52 ) * Remove C-style include guard * clean up issues in the PR [ROCm/clr commit: `5d638d831c`]	2025-04-15 17:53:50 +01:00
Xie, Pengda	9c1d61fbb0	SWDEV-518317 - Remove Redundant Error Message in removeFatBinary (#164 ) [ROCm/clr commit: `e92ea151b2`]	2025-04-15 09:00:39 -07:00
Andryeyev, German	db3c5f87ea	SWDEV-526836 - Switch PAL backend to CmdReleaseThenAcquire() (#175 ) [ROCm/clr commit: `f6c804edc0`]	2025-04-15 11:49:53 -04:00
Chaudhary, Jatin Jaikishan	c72604a2af	SWDEV-509213 - make cmake_minimum_required consistent across clr (#51 ) Change-Id: Ib0b1df7af8984a37d6bf7ca68ec99597d5978821 [ROCm/clr commit: `fcaefe97b8`]	2025-04-15 15:23:41 +05:30
Chaudhary, Jatin Jaikishan	7c3dcb707e	SWDEV-520627 - include warp functions header for warpSize (#177 ) Change-Id: Id3fff8f2722d521071ef0ff71b09fc365ef6fa82 [ROCm/clr commit: `588cf0fc69`]	2025-04-15 14:40:27 +05:30
Chaudhary, Jatin Jaikishan	e9e207d7b0	SWDEV-517941 - use device bitcode before spirv (#95 ) Also add flag: HIP_FORCE_SPIRV_CODEOBJECT to allow override to force use SPIRV. * use cache for already compiled code objects * address review comments and use the two spirv isa names [ROCm/clr commit: `07e57a1f0d`]	2025-04-14 23:40:52 +01:00
Six, Lancelot	b389f97d73	SWDEV-517078 - Update 2nd level trap handlers (#148 ) * SWDEV-517078 - Maintain the trap handler ABI version in CLR The trap handler ABI version is communicated to the debugger using the r_version field in the r_debug structure. This structure is an external dependency, which makes it complicated to keep the trap handler source (in CLR) and the ABI version number (external dependency) in sync. This patch proposes to patch the trap handler ABI version number in _amdgpu_r_debug before communicating it to the debugger. We can't directly include sc's executable.hpp file in CLR as it relies on conflicting definition of ELF related types, so instead we need to rely on a-priori knowledge on the r_debug structure. Fortunately, this structure is part of a stable ABI, so its layout is guaranteed to be kept stable. Update the 2nd level trap handler to follow updates from the ROCr-runtime. The trap handlers are stripped from parts dedicated to architectures unsupported by CLR. Bump the r_debug.r_version to track the ABI changes in the trap handler. [ROCm/clr commit: `7b72c1b786`]	2025-04-11 18:59:54 +01:00
Milanov, Aleksandar	c83df8a653	SWDEV-526208 - Fix miscalculation of coalesced tiled partition mask (#162 ) [ROCm/clr commit: `c4fa3ef927`]	2025-04-11 19:40:26 +02:00
Hernandez, Gerardo	9264e97cbb	SWDEV-521920 - Fix compilation issues introduced by the reduce sync operations - 2 (#167 ) Fix pytorch 2.5 issues, by defining reduce sync operations for type __half in amd_hip_fp16.h and not in amd_warp_sync_functions.h which is problematic in case __half does not get included before that header. Only define types not supported by cuda if HIP_ENABLE_EXTRA_WARP_SYNC_TYPES is defined, to avoid portability issues [ROCm/clr commit: `66496258b4`]	2025-04-11 17:00:59 +05:30
Yao, Longlong	6015dda120	SWDEV-518966 - Avoid creating Arena Memobj for VMM pointer (#39 ) Change-Id: I69c6c0a1464d01e674ac929de34ab10047012f1a Signed-off-by: Longlong Yao <Longlong.Yao@amd.com> [ROCm/clr commit: `0de73eeaf8`]	2025-04-11 16:55:53 +05:30
Haehnle, Nicolai	b3d1ac7232	Report null stream creation failure (#152 ) Explicitly nulling the pointer causes us to report the error below instead of keeping a dangling pointer around that will most likely lead to a subsequent segfault. [ROCm/clr commit: `199b0f1086`]	2025-04-10 11:40:05 -07:00
Jiang, Julia	2a39c6a782	SWDEV-525231 - Update changelog for 6.5 feature implementations (#150 ) [ROCm/clr commit: `b44f5f9992`]	2025-04-10 14:17:32 -04:00
Sang, Tao	929209b988	SWDEV-521083 - Fix atomicMin/Max issues (#151 ) Fix atomicMin/Max(), atomicMin/Max_system() issue on float types. [ROCm/clr commit: `6d10577761`]	2025-04-10 12:30:55 -04:00
Chaudhary, Jatin Jaikishan	665e88008b	SWDEV-461087 - fp4/fp6/fp8 ocp headers (#41 ) This now has host conversions too, which is directly from Christopher's work on fcbx. Signed-off-by: Christopher M. Riedl * add const to func parameter * do not depend on builtins, use gfx950 detection [ROCm/clr commit: `628777b73d`]	2025-04-10 17:22:15 +01:00
Andryeyev, German	c50f85df20	SWDEV-517481 - Add more restrictions to the queue management (#168 ) [ROCm/clr commit: `4c363df3bf`]	2025-04-10 21:51:45 +05:30
Xie, Jiabao(Jimbo)	88841d1dee	SWDEV-524188 - Check for VRam and system RAM properly (#122 ) Currently, we check if there's enough system RAM even if we don't allocate on host device. This is incorrect logic. We should not check for this size on windows because PAL checks for memory allocation. See SWDEV-467263. Co-authored-by: Jimbo Xie <jiabaxie@amd.com> [ROCm/clr commit: `0d6e554d92`]	2025-04-10 21:50:48 +05:30
Chaudhary, Jatin Jaikishan	07c7d3e860	SWDEV-520627 - include sync_warp header instead of warp function header (#18 ) Change-Id: Ic3f54b0f5bfee8565a8bbb6218fb0ccdb900c9ea [ROCm/clr commit: `5c030840d6`]	2025-04-10 21:50:25 +05:30
Andryeyev, German	90e3d2619a	SWDEV-525725 - Enable resource cache for SVM (#156 ) - Make sure reserved_va_ updated before svmPtr overwrite [ROCm/clr commit: `94cd9bc4f7`]	2025-04-10 10:54:28 -04:00
Patel, Jaydeepkumar	c2ed585737	SWDEV-521262 - Adding MSVC compiler options to fix the conflict with SC module while building hip in debug. (#24 ) [ROCm/clr commit: `997519fa94`]	2025-04-10 15:25:58 +05:30
Patel, Jaydeepkumar	dea880f9da	SWDEV-508973 - If total # of threads/block is more than HW capacity, it's invalid config issue and should return invalid config error. (#25 ) [ROCm/clr commit: `8531cd3bbe`]	2025-04-10 15:16:16 +05:30
Stojiljkovic, Vladana	81a566e397	SWDEV-505795 - Return the same ptr from hipIpcOpenMemHandle if it is called multiple times (#93 ) * SWDEV-505795 - Return the same ptr from hipIpcOpenMemHandle if it is called multiple times * Move initialization outside of if statement [ROCm/clr commit: `e91cb4f320`]	2025-04-10 11:20:36 +02:00
Chaudhary, Jatin Jaikishan	b0d9ea7454	SWDEV-525969 - add gfx950 to use ocp type for fp8 (#157 ) * do not use __gfx94plus_clr__ macro in fp8 header [ROCm/clr commit: `5214d1ca07`]	2025-04-09 16:21:39 +01:00
Stojiljkovic, Vladana	ef52738425	Set maxTexture2DLinear fields in deviceProp (#89 ) [ROCm/clr commit: `bc474ea5af`]	2025-04-09 17:13:49 +02:00
Sang, Tao	60a1e6dbc1	SWDEV-523824 - Fix data validation issue of rocFFT (#154 ) Fix data validation issue of rocFFT when dynamic queue on. ReleaseHwQueue() can be called only when no command in HostQueue. The checking condition need be protected by lock. [ROCm/clr commit: `18d191fd1d`]	2025-04-08 20:30:06 -04:00
Brzak, Branislav	d4275741ba	SWDEV-525653 - Make hipGetDeviceProperties and hipChooseDevice use the new API (#159 ) [ROCm/clr commit: `b006380ff6`]	2025-04-08 18:54:05 +02:00
Patel, Jaydeepkumar	2f3bc7f01c	SWDEV-521011 - Allow max stack size as per ISA. (#73 ) [ROCm/clr commit: `9e7248aa36`]	2025-04-08 10:15:38 +05:30
Andryeyev, German	4c9cc6ba30	SWDEV-497841 - Add VmHeapArray support (#76 ) Add VmHeapArray class to reduce the pressure on VA reservation, since multiple memory pools can be active at the same time. [ROCm/clr commit: `e974f7fde1`]	2025-04-03 21:04:18 +05:30
Andryeyev, German	3ceab5ba02	SWDEV-524849 - Fix HIP error returned during capture (#141 ) Always use the latest dependent nodes during hipEventRecord capture [ROCm/clr commit: `3514f45544`]	2025-04-03 20:08:25 +05:30
Betigeri, Sourabh	487ede31a9	SWDEV-523281 - [clr] Implementation of hipLaunchKernelExC and hipDrvLaunchKernelEx API with support for cooperative launch (#92 ) [ROCm/clr commit: `8c6b90996e`]	2025-04-03 20:10:05 +09:00
Arandjelovic, Marko	1c83314659	SWDEV-517867 - Remove invalid assert (#55 ) * Remove invalid assert * Retrigger CI * Rebase [ROCm/clr commit: `8fcaa1ca93`]	2025-04-03 11:14:32 +02:00
Patel, Jaydeepkumar	b217d3a4e6	SWDEV-508632 - Align address to 2 MBs for hidden heap allocation. (#29 ) [ROCm/clr commit: `b5c9cbc236`]	2025-04-02 16:33:29 +05:30
Mallya, Ameya Keshava	29be7230eb	fixed syntax to mainline [ROCm/clr commit: `98f1db181c`]	2025-04-01 09:51:41 -07:00
Mallya, Ameya Keshava	f117699bef	!verify functionality [ROCm/clr commit: `ae1d0ef8a1`]	2025-03-31 13:14:08 -07:00
Mallya, Ameya Keshava	594c7e6704	Adding KWS check for amd-mainline [ROCm/clr commit: `24184e151c`]	2025-03-28 08:05:47 -07:00
MartinezFernandez, Juan	966157cd5b	Remove PCH code: the code related to PCH is dead and not used (#66 ) cherry-pick of compute/ec/clr/+/1184122 Co-authored-by: Juan Manuel Martinez Caamaño <juamarti@amd.com> [ROCm/clr commit: `f580632174`]	2025-03-28 10:36:19 +01:00
Sang, Tao	d49a2a51d6	SWDEV-508863 - Support generic target in compressed fatbin (#44 ) [ROCm/clr commit: `8d90b44a1b`]	2025-03-27 20:13:51 +05:30
GunaShekar, Ajay	aaba454bfc	SWDEV-523853 - Use RecordRenderOps instead of RecordRenderOp (#97 ) [ROCm/clr commit: `686dd56a4e`]	2025-03-26 09:28:40 +05:30
Belton-Schure, Aidan	e27e3eb66a	SWDEV-515426 - Use RAII classes for comgr (#28 ) Change-Id: I9f6005542cc88f1e16e22741dcc0ce904fdaa2b0 [ROCm/clr commit: `ded41058a0`]	2025-03-25 20:10:44 +05:30


12771 Commits