rocm-systems

Author	SHA1	Message	Date
Li, Todd tiantuo	04dc7ca51f	SWDEV-508980 - [6.4 Preview] fix hipDeviceSetCacheConfig during stream capture Change-Id: I8e89774a8163fdc120155f742606ee2c0aa7103b [ROCm/clr commit: `9faaf20aae`]	2025-02-22 01:05:28 -05:00
Li, Todd tiantuo	82f78ce187	SWDEV-510271 - [6.4 Preview] fix hipCreateSurfaceObject & hipDestroySurfaceObject during stream capture Change-Id: I19e149549c271d847f52b72e04cb2427ca194b24 [ROCm/clr commit: `c07468e53c`]	2025-02-22 01:04:35 -05:00
Ioannis Assiouras	8d29fb9e6d	SWDEV-509788 - Use stream memory operation in hipStreamWaitEvent This change removes the stream callback from hipStreamWaitEvent and uses a stream memory wait operation instead. This allows the hipStreamWaitEvent to be non-blocking on the host. Change-Id: Ie5530febda5a5bcb5daa0db8a01249d6b137fd43 [ROCm/clr commit: `721c5800ca`]	2025-02-21 11:46:09 -05:00
Julia Jiang	1495cc77eb	SWDEV-513294 - fix regression on SVM sub-test failure in Conformance Change-Id: Ic2449dd34a9cd2b623d5f8fbe89fd042566a56e3 [ROCm/clr commit: `b7eaec76fc`]	2025-02-20 15:40:23 -05:00
kjayapra-amd	010253430f	SWDEV-516303 - Remove SDMA retainer logic to select the engine. Change-Id: I818129444131825cdb87e06cb495afa3e5cdb683 [ROCm/clr commit: `1f583a6870`]	2025-02-20 11:34:38 -05:00
German Andryeyev	a7f3ad7867	SWDEV-515356 - Make the round-robin queue selection - Add custom compare to the map of queues, which will help with the round-robin selection Change-Id: Ie67a820bfb1a5b484a1b3edced967eed94228bb8 [ROCm/clr commit: `ba8e740be4`]	2025-02-20 11:09:54 -05:00
German Andryeyev	f9d9b2c441	SWDEV-497841 - Add virtual memory heap Add initial implementation of virtual memory heap with dynamic virtual memory mapping support for memory pools. DEBUG_HIP_MEM_POOL_VMHEAP controls the new method. Change-Id: I8dc5be2e0f34ab472f1800f43bb6243639a5e500 [ROCm/clr commit: `296dce5570`]	2025-02-20 10:55:49 -05:00
German Andryeyev	6f2a603277	SWDEV-497619 - Allocate extra space in CB Compute doesn't support IB chaining, but RGP may collect perf counters, which require more space in CB. Increase CB size if RGP is enabled. Change-Id: Iaa0a620ead8541a679b0dfe5e5711af5afdba545 [ROCm/clr commit: `63cf3057ba`]	2025-02-20 10:40:09 -05:00
Jimbo Xie	8a42a52d0f	SWDEV-477219 - implement hipEventRecordWithFlags Change-Id: Icf07e85fc8c15f921f6e7c9fbd31dd3856dc988b [ROCm/clr commit: `7a4a22d454`]	2025-02-19 13:53:00 -05:00
Jatin Chaudhary	16f9dbff6c	SWDEV-511239 - make fp8 standalone host compileable - Use correct header in device_library_decl - use std:: instead of __hip_internal:: for host compilation - hide device specific stuff behind __clang__ and __HIP__ check Change-Id: I2f3647e00555ed0e79f9954a459c41394c3cd49b [ROCm/clr commit: `c3f49c8788`]	2025-02-18 19:07:45 -05:00
Jatin Chaudhary	508d043176	SWDEV-515255 - do not free bitcode object before code gen - Also add a cache, which allows compiled code objects to be reused instead of compiling again. This should improve performance on multigpu systems. Change-Id: Ib135d616c076b77f8aaf28de275d408b38021d89 [ROCm/clr commit: `0391aec14a`]	2025-02-18 12:39:31 -05:00
Tim Gu	8fcbc2acfe	SWDEV-502248 - Parse file path with space characters Signed-off-by: Tim Gu <Tim.Gu@amd.com> Change-Id: I67fb9cf5559c9c06f24627a1b25fec3e89b2d1cf [ROCm/clr commit: `84a867fb73`]	2025-02-18 10:31:21 -05:00
agunashe	52a1f5dbf7	SWDEV-507967 - Deprecate gfx9, gfx8, gfx7 on Windows PAL_CLIENT_INTERFACE_MAJOR_VERSION from 872 --> 910 Change-Id: I03dfa2924ccdae4c2f13f09d5f34ee58298e1343 [ROCm/clr commit: `ea804e16f8`]	2025-02-17 02:59:41 -05:00
Anusha GodavarthySurya	c6bea0ea59	SWDEV-469422 - hipgraph remove static typecast to parent Change-Id: I339250cfd26a7c04543722a82301acbb41c7d5d7 [ROCm/clr commit: `199e464402`]	2025-02-14 11:09:32 -05:00
David Salinas	e2da5772ff	Deprecate roc-obj* tooling - make Perl packages RECOMMENDS/SUGGESTS for hip-dev - update CHANGE log SWDEV-511528 - TECH Remove ROCM Perl dependency - hip-dev SWDEV-333176 - Shift functionality of 'roc-obj-*' perl scripts into llvm-objdump Change-Id: Iec3ba245848781f95c825f0d37aff4b4fb54f5e4 [ROCm/clr commit: `c942833b34`]	2025-02-13 11:42:57 -05:00
Vladana Stojiljkovic	7078aab436	SWDEV-510059 - Format CU mask properly Change-Id: I80e94b4f3ea25f6988fc06d83aeb398e81ccddd1 [ROCm/clr commit: `061c5d877f`]	2025-02-13 11:02:56 -05:00
harkgill-amd	cac2e94141	Specify C++ language mode for warning post amdgpu-arch failure Change-Id: I55bf6734a1e8dc06dd0a1ee12086b7667332206f [ROCm/clr commit: `935b538261`]	2025-02-13 09:40:13 -05:00
Aidan Belton-Schure	4b4a35b86b	SWDEV-508279 - Improve HIP event profiling There are 2 functional changes to this patch: * Use GPU timing for internal markers for HIP. * Measure CPU time closer to GPU timer, to reduce delta between GPU/CPU timestamp measurements. There are some smaller non-functional updates: * waifForFence -> waitForFence typo * Remove unused drmProfiling Change-Id: I4c5fa600a842ab60e454888779edcac8449a902a [ROCm/clr commit: `179801a750`]	2025-02-13 04:15:40 -05:00
Jatin Chaudhary	5725b99619	SWDEV-474146 - use __bf16 to do operations Change-Id: I568dfa97238fd760f5362a8e560c33402f96cff3 [ROCm/clr commit: `c23913f6e7`]	2025-02-12 07:03:05 -05:00
Jatin Chaudhary	db2a3214c4	SWDEV-504769 - Allow hipEvent_t to record on hipStreamLegacy Change-Id: Ib86412255adad172598620ea81214e5eb56020ea [ROCm/clr commit: `e560d94d2c`]	2025-02-12 07:02:35 -05:00
Ioannis Assiouras	a349b23474	SWDEV-514686 - Fixed hipEventSynchronize/hipStreamWaitEvent for IPC events Resolved an issue where hipEventSynchronize and hipStreamWaitEvent APIs did not function correctly for events created with the hipEventInterprocess flag. The bug caused the event to be incorrectly marked as "recorded," leading to these APIs failing to wait for the event as expected. Change-Id: Ic9fdfaab2393beb93d6e0b83661545e902a63499 [ROCm/clr commit: `1cdfbfd270`]	2025-02-11 18:43:06 -05:00
kjayapra-amd	1f648c7d94	SWDEV-511672 - Special case the Remote USWC memory usage for HIP, if the alloc size is large. Change-Id: I524c1402b249cedfd58b56f494caa2ac057e1623 [ROCm/clr commit: `cf6aabb823`]	2025-02-11 06:42:18 -05:00
Saleel Kudchadker	71e1a0b10d	SWDEV-504494 - Further copy improvements - Fix regression for D2H pinned copies which adds systemscope release. - Skip cpu wait for D2H unpinned copies as we can pass the signal of the barrier to rocr copy. - Fix an old bug in sdmaEngineRetainCount_ logic - Improve logging Change-Id: If074bddb05564b15949b0d5f9bf12acd3692174e [ROCm/clr commit: `4c95ee5e1e`]	2025-02-11 00:55:52 -05:00
victzhan	7cd780c1cb	SWDEV-485042 - Remove -I option passed into comgr when file type is not FILE_TYPE_ASM_TEXT Change-Id: If8e469f881651f7b3dae364e8182ef1ba6f3a0d1 [ROCm/clr commit: `ca35d93672`]	2025-02-10 11:47:04 -05:00
Ioannis Assiouras	eb77b9aba6	SWDEV-508435 - Use the stream of the src/dst image memory object in A2H and H2A commands Change-Id: I9b776a54760a4633d5f84cf7b467d2d3ba8cbdde [ROCm/clr commit: `a8edb8d467`]	2025-02-07 13:38:31 -05:00
taosang2	f84a8e62d3	SWDEV-446880 - Make ocltst MemoryInfo pass in EMU Make ocltst -m tests/ocltst/liboclruntime.so -t OCLMemoryInfo pass in emu where GPU memory is very big. Cherry pick https://gerrit-git.amd.com/c/compute/ec/clr/+/1014858 Change-Id: I0228c5e87ce7c366983fd4af71c25e7f8161c2c7 [ROCm/clr commit: `de83d7a6ae`]	2025-02-07 09:16:24 -05:00
Satyanvesh Dittakavi	8daab29f7f	SWDEV-477584 - hipExtGetLastError should return the immediate previous API error hipGetLastError should return the error by any of the previous APIs in the same host thread to match the CUDA behavior, whereas hipExtGetLastError will return the error by the immediate previous API. This Ext API was added earlier to facilitate the existing HIP apps which are following the current behavior of hipGetLastError Change-Id: I61e95b1fc136cc761e2434e02187b7ed2598b733 [ROCm/clr commit: `4b443f8133`]	2025-02-06 23:30:48 -05:00
Ioannis Assiouras	6a00aa8d61	SWDEV-508435 - Added a fix for double free of hsaImageObject Change-Id: I9397f7c9dbbad7c249b359155df312cb920eba6c [ROCm/clr commit: `d05ecea253`]	2025-02-05 22:21:24 +00:00
Ioannis Assiouras	c0b728fcad	SWDEV-513323 - Fix for BatchMemOp on devices with no image support BatchMemop should be positioned before the image support kernels because the total number of kernels is determined by BlitLinearTotal, when there is no image support on the device. Change-Id: I8e53caf744ba54259ac04bad1762eef21806f3f2 [ROCm/clr commit: `3e01da3dac`]	2025-02-05 04:45:22 -05:00
Anusha GodavarthySurya	5535f15104	SWDEV-469422 - hipGraph move to classes from structs Change-Id: I0f9c8ef1161c0c92ebe0cce6844b2feacfee83f5 [ROCm/clr commit: `32e5b00c30`]	2025-02-05 00:33:41 -05:00
taosang2	27e87ccca6	SWDEV-513458 - Add gfx950 target ID Add gfx950 target ID Cherry-picked https://gerrit-git.amd.com/c/compute/ec/clr/+/997678 https://gerrit-git.amd.com/c/compute/ec/clr/+/1063519 Change-Id: I0228c5e87ceec366983fd4afb1c25e7f8161c2c2 [ROCm/clr commit: `29cc394510`]	2025-02-04 18:30:23 -05:00
Steven Chung	5513df58eb	SWDEV-496674 - Convert non-templated typedefs to templates for consistent mangling Change-Id: I952d15f20afc85c0118403f82e75360197049ef5 [ROCm/clr commit: `782976f5c2`]	2025-02-04 16:37:00 -05:00
kjayapra-amd	892d7bb064	SWDEV-488290 - Remove Stream to Engine logic and rely on engine query status HSA API. Change-Id: I469ab6679360c8ee8d4ee515678a8aa8d4578ebf [ROCm/clr commit: `cc62a82347`]	2025-02-04 13:00:16 -05:00
Ajay	cb281e23cd	SWDEV-485453 - add hipcc dependency to hip-dev Change-Id: I607fc7c3b3a2137835cb2fb8eeb23d3daed51c91 [ROCm/clr commit: `25572c2efc`]	2025-02-04 11:29:59 -05:00
Rahul Manocha	4cbfbe2112	SWDEV-511855 - Fix hipMemcpyPeer to support stream capture checks Change-Id: I7797f069b3ed4240b6785e82da7494a97b4843c6 [ROCm/clr commit: `81051f3520`]	2025-02-04 11:22:35 -05:00
Aidan Belton-Schure	33b4f178c0	SWDEV-443561 - Add tools dispatch table Change-Id: I3445554e486ab7b94592571f52c1530cb918d021 [ROCm/clr commit: `152cee3737`]	2025-02-04 04:57:38 -05:00
Juan Manuel Martinez Caamaño	5356f13902	SWDEV-132637: Remove OpenCL cl_khr_depth_images workaround that is not needed anymore The cl_khr_depth_images associated macro definition is defined twice in the compiler: in opencl-c.h and automatically by the compiler deduced from the cl-ext list. These two co-exist and there is no need to remove cl_khr_depth_images from the cl-ext list. If we remove cl_khr_depth_images from the cl-ext list, and we do not include opencl-c.h the macro is not defined. This fixes conformance test ./test_compiler compiler_defines_for_extensions when using Comgr with -include opencl-c-base.h -fdeclare-opencl-builtins without including opencl-c.h. Before we got the error `ERROR: Supported extension cl_khr_depth_images not defined in kernel` This change is needed to eventually get rid of the opencl-c.pch that is embedded in comgr, and that makes implementing a compilation cache in comgr hard. Change-Id: I76497874ebe7163966420d4ac23a0788b93a36fd [ROCm/clr commit: `8c9e6d0fa5`]	2025-02-04 03:14:31 -05:00
Jacob Lambert	2bd527c676	SWDEV-387063 - Use clang default for C++ version Instead of enforcing c++14 here, we can instead use the current clang default Change-Id: Ib0a178a53c1377f2910edf6fab82b2bac6567ac7 [ROCm/clr commit: `33e48b9629`]	2025-02-03 11:07:52 -05:00
Jimbo Xie	cc229f251f	SWDEV-504383 - Cleaned up kForcedTimeout10us and removed IsHwEventReadyForcedWait Also removed active_wait_timeout Change-Id: I7a429f003c09a4df267b5c0983050704260094c6 [ROCm/clr commit: `4872b420c9`]	2025-01-31 14:40:18 -05:00
taosang2	40df900647	SWDEV-501963 - Add missing codes for gfx950 Cherry-pick https://gerrit-git.amd.com/c/compute/ec/clr/+/1162997 Change-Id: I6b3c6bf55c61cffd43cd6f17b75998f751b75723 [ROCm/clr commit: `32daa8f384`]	2025-01-31 14:34:49 -05:00
taosang2	af99b5d52d	FEAT-56803 - Fix ocltst slow issues Fix very slow issues of two ocltst test cases. Cherry pick https://gerrit-git.amd.com/c/compute/ec/clr/+/1009383 Change-Id: I0228c5e87cdec366993fd4afb1c25e7f8161c2c5 [ROCm/clr commit: `4ec274c7d4`]	2025-01-31 10:45:43 -05:00
Anusha GodavarthySurya	837f7ca08c	SWDEV-489084 - Avoid creating internal stream when graph has single branch Change-Id: I9371d44481257069bb51c0217a57f97d803589c4 [ROCm/clr commit: `b385992f94`]	2025-01-31 00:16:57 -05:00
kjayapra-amd	712987ed08	SWDEV-509280 - Combine multiple definitions of callbackQueue into a single function. Change-Id: Ibbb56136bec2beed71c202d75e8aec9e82640a4e [ROCm/clr commit: `0324014710`]	2025-01-30 15:58:11 -05:00
Jatin Chaudhary	f8421ce480	SWDEV-508617 - There is no NaN for E4M3 and FNUZ Change-Id: I330b041019990231c098073f94d9d40a3c13ba76 [ROCm/clr commit: `1fdbf35d14`]	2025-01-30 11:48:34 -05:00
Saleel Kudchadker	d0656c944b	SWDEV-504494 - Resolve signal dependencies - Resolve signal dependencies for barrier value packet if there are > 1 dependent signals. Barrier Value packet accounts for only 1 dep signal - Better log Change-Id: Ia506ad5d80b91d598f92e7b539f41756e9b4b64b [ROCm/clr commit: `2d450e8b06`]	2025-01-29 19:49:02 +00:00
Jatin Chaudhary	992b5fd009	SWDEV-507817 - fix the return type of one of the atomicMin variants Change-Id: I9915eb174d5677e21adbabae5819c9e306338ab3 [ROCm/clr commit: `e6fb89190a`]	2025-01-29 11:52:19 -05:00
Jimbo Xie	0a30936c67	SWDEV-510869 - add gfx1153 id Change-Id: I36d39a1db2392990ad9b01d70676c3c986435707 [ROCm/clr commit: `4abedf2a0e`]	2025-01-28 18:15:46 -05:00
Saleel Kudchadker	21ae9ef25e	SWDEV-508225 - Improve fat binary handling Change-Id: I78a9951f2f4c4c743c1205b1e40aac215054e27d [ROCm/clr commit: `08af3eb484`]	2025-01-28 14:38:21 -05:00
German Andryeyev	ae379965dd	SWDEV-459826 - Add a crash dump for a failed queue The logic can analyze the AQL queue state and find a failed AQL packet with the kernel's name Change-Id: I1a478fa2c25462cd07a194784958bdf22454b897 [ROCm/clr commit: `ea0b092af8`]	2025-01-28 14:27:46 -05:00
Tao Sang	7803594aea	SWDEV-458943 - Add fast path in wait() wait() is redesigned with two paths: fast path: Use spinlock to wait for notify signal. If the signal hasn't been received for some loops, go to slow path. slow path: Use condition_variable's wait(). Improve monitor wrapper for better performance. Fix some bugs left from name removing patch. Change-Id: I893a8353121a25d11e37c8e631caf31cc1fc1f24 [ROCm/clr commit: `f2ff56af9c`]	2025-01-28 12:19:55 -05:00
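
The two-path wait() described in commit 7803594aea (spin briefly, then fall back to a condition variable) can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the clr monitor code: the class name `TwoPathSignal`, the `spin_limit` parameter, and the boolean return value are all hypothetical and exist only to make the two paths visible.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical illustration only; not the actual clr Monitor implementation.
class TwoPathSignal {
 public:
  // Publish the flag under the lock, then wake any blocked waiter.
  void notify() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      signaled_.store(true, std::memory_order_release);
    }
    cv_.notify_all();
  }

  // Returns true if the fast path observed the signal,
  // false if the call had to fall back to the blocking slow path.
  bool wait(int spin_limit = 1000) {
    // Fast path: poll the flag for a bounded number of iterations.
    for (int i = 0; i < spin_limit; ++i) {
      if (signaled_.load(std::memory_order_acquire)) return true;
      std::this_thread::yield();
    }
    // Slow path: block on the condition variable until the flag is set.
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return signaled_.load(std::memory_order_acquire); });
    return false;
  }

 private:
  std::atomic<bool> signaled_{false};
  std::mutex mutex_;
  std::condition_variable cv_;
};
```

The bounded spin avoids the cost of blocking when the signal arrives quickly, while the condition variable keeps a long wait from burning CPU. Setting the flag while holding the mutex ensures a waiter entering the slow path cannot miss the notification.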

13402 Commits