Commit Graph

13309 commits

Author SHA1 Message Date
Jatin Chaudhary a6f2a2c2af SWDEV-505971 - change setArgument arg from uint32_t to uint64_t
We are passing this arg as an address, and memcpy complains about
overreading (8 bytes instead of 4).

Change-Id: Ica9207f6c5f6056a4bfc968280c76e779ded13ae
2025-01-06 08:16:59 -05:00
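The overread described in a6f2a2c2af can be illustrated with a minimal sketch. It assumes the argument is copied with a fixed 8-byte memcpy; the names ArgBuffer, copyArgument, setAddressArgument and readAddressArgument are hypothetical and are not taken from the CLR sources.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for a kernel-argument buffer (not the CLR type).
struct ArgBuffer {
  unsigned char bytes[16] = {};
};

// Copies `size` bytes from `value` into the buffer at `offset`.
// The caller must guarantee that `value` points to at least `size` bytes.
inline void copyArgument(ArgBuffer& buf, std::size_t offset, const void* value,
                         std::size_t size) {
  assert(offset + size <= sizeof(buf.bytes));
  std::memcpy(buf.bytes + offset, value, size);
}

// Stores an address-sized argument. Because the parameter is a uint64_t, the
// 8-byte copy stays inside the source object. With a uint32_t parameter the
// same 8-byte copy would read 4 bytes past the end of the object.
inline void setAddressArgument(ArgBuffer& buf, std::size_t offset,
                               std::uint64_t address) {
  static_assert(sizeof(address) == 8, "address argument must be 8 bytes wide");
  copyArgument(buf, offset, &address, sizeof(address));
}

// Reads an 8-byte argument back out of the buffer.
inline std::uint64_t readAddressArgument(const ArgBuffer& buf, std::size_t offset) {
  std::uint64_t out = 0;
  std::memcpy(&out, buf.bytes + offset, sizeof(out));
  return out;
}
```

Writing an address with setAddressArgument and reading it back at the same offset returns the full 64-bit value, which is the property the wider parameter type is meant to guarantee.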
Pengda Xie 8155943c5f SWDEV-505833 - Provide functionality to avoid L2 flush for CPX mode for dispatch packets
- Added DEBUG_CLR_SKIP_RELEASE_SCOPE flag to force release scope to
   SCOPE_NONE in AQL packet header

Change-Id: Ife02cddb9d5cd4749103ce585d3d5fe9024c6868
2025-01-03 17:28:21 -05:00
Marko Arandjelovic 7e152bb0f3 SWDEV-506234 - Refactor validation in hip_memory
Change-Id: I9d69695e4b6668e6de00f1f6b060862872358340
2024-12-31 00:35:25 +02:00
zichguan-amd b8ba4ccf9c SWDEV-495789 - revert Fix ambiguity of fma for _Float16 for libc++ (#1976)
Change-Id: I45ae4711a047f4484a018b9409c9f6ecf09720ce
2024-12-29 10:56:32 -05:00
Amit Pandey 11cd37ce0b SWDEV-490256 - Fix uri_decode logic to handle Memory URI.
Uri decoder logic currently silently ignores processing of memory uri.
This patch enables the existing logic to handle the processing of offset
and size related to loaded code-object having memory URI.

Change-Id: If03579cefb11d91f667410464dc89404df9270a3
2024-12-25 11:07:16 -05:00
Jaydeep Patel dff8197b1d SWDEV-505276 - Parent graph of orig node and original graph of cloned node should be same.
Change-Id: I6ebc21cc42e41ad5d952a69fb3b3cb095f32cffb
2024-12-24 04:32:14 -05:00
taosang2 d82d6a78cf SWDEV-479958 - Support different address mode
Support different address modes in X, Y, Z directions

Change-Id: If1db5a8af33c92dd14b48968c3e8eceb97daea6c
2024-12-23 16:39:54 -05:00
Julia Jiang 1af639ea44 SWDEV-499281 - Update changelog with new format
Change-Id: I9a764ac99cd03d0a18ebc99cdd0313301e35565b
2024-12-23 10:33:58 -05:00
Ioannis Assiouras a55118f63d SWDEV-497636 - Updated CHANGELOG
Updated CHANGELOG to include the performance fix for
kernel launch latency with increasing number of idle streams.

Change-Id: I509e14cb8f8cd3abe61c6ede78808e96ef8f06e1
2024-12-20 19:09:16 -05:00
Ioannis Assiouras 158b6a29e0 SWDEV-505504 - Disable vectorization in GetHipDispatchTable
Change-Id: Id33144623555a5d25e029ca644f6274610dcd0ad
2024-12-20 17:47:07 -05:00
sonadeem caa10572cb SWDEV-503436 - Fix incorrect OptionGroup for FSanitize option
NOPTION is meant for component options or alias runtime options so
the option group must not be OA_RUNTIME or OA_MISC_ALIAS must be set,
otherwise we incorrectly assume that it has an option variable and
attempting to write to it causes corruption of OptionVariables.

Change-Id: Iafb5a8f743e5ed0f87be36061c44578178f6cfde
2024-12-20 10:14:51 -05:00
Jaydeep Patel a05a02e527 SWDEV-505205 - Fix hipStreamLegacy segfault with hipStreamWaitEvent.
Change-Id: I17fdaf7ac323507f99a7c071066944296537489c
2024-12-20 04:18:21 -05:00
German Andryeyev 0640d36019 SWDEV-504658 - Reduce the lock scope for kernel look-up
The vector with all kernels is preallocated on the executable init.
Thus, reduce the scope of global lock to the binary creation only.

Change-Id: I73035013a6562175069137e895bba815f466ee35
2024-12-18 17:04:51 -05:00
Sourabh Betigeri cd9db5a2fa SWDEV-505277 - Adds hipStreamBatchMemOp in the enum of hcc_map
Change-Id: I6e58dfbe4ba13db8717edc36020fefabc9ddbe23
2024-12-18 05:38:58 -05:00
Saleel Kudchadker fa63919a63 SWDEV-504340 - Move cast of cl_mem inside the condition
Change-Id: I9c91f5d945a8d8bd2b2f55e3d11ede66afe4eef7
2024-12-17 12:58:12 -05:00
German Andryeyev e3efce20be SWDEV-504650 - Switch to shared_mutex for events
Use shared mutex for events validation

Change-Id: Iff291c758d9edd65717c506150f3b9d39e5306ba
2024-12-17 11:04:58 -05:00
Jatin Chaudhary fccf0fa2f0 SWDEV-377518 - Fix bf16/fp8 header to be compileable with hiprtc
Change-Id: I2093a39d79a46da7e102266c04c2a71e03dcb88e
2024-12-17 08:57:15 -05:00
Ioannis Assiouras e8b2fdab96 SWDEV-483134 - Remove hipExtHostAlloc API
Change-Id: I60777ef5c56b60dd8100d0d794ca10fb3b96a555
2024-12-16 17:13:49 -05:00
Pengda Xie 078fe7e5de SWDEV-503764 - Add wptr and rptr to ClPrint for dispatch barrier methods
- added wptr and rptr to ClPrint in dispatchBarrierPacket and dispatchBarrierValuePacket

Change-Id: I8a62289deb23c9f657a9b0ac6138bb55eafecba2
2024-12-16 16:45:30 -05:00
Ioannis Assiouras a808c4b23a SWDEV-489255 - Update stack size limit in rocvirtual
Change-Id: I2aac9d211f64b3d6c121d8b010d215dcbdeac3aa
2024-12-16 09:30:39 -05:00
Anusha GodavarthySurya 13e2e797c0 SWDEV-469422 - Derive GraphExec from Graph and ChildGraphNode from GraphExec
Change-Id: I54d67a1665355579bc249d8ff4f9806e9ee14588
2024-12-16 00:43:57 -05:00
Istvan Kiss 3c863dad91 SWDEV-502543 - Update doxygen to surface functions and Coop Group API
Change-Id: Id4df63b8ae64a1113f85d89aa250ac9f7cc8b9bb
2024-12-14 14:11:37 -05:00
Todd tiantuo Li 8ffb1430dd SWDEV-496037 - add Strix and Strix Halo to ocltst runtime test
Change-Id: Ia21afddf5223ecd132a06f37bb430961fb7a9341
2024-12-13 20:19:48 -05:00
Sourabh Betigeri f1c05e9026 SWDEV-421020 - Adds hipGraphAddBatchMemOp, SetGetParams and execSetParams APIs
Change-Id: Ieccecfe6173cc68fd3c01f86c99f7cc09fe194a3
2024-12-13 06:23:39 +00:00
Saleel Kudchadker 93f1e8ff60 SWDEV-301667 - Clear dispatch indicator signal flag
Change-Id: I9028df0bb73289791d169e7f064a1d0f615236a5
2024-12-12 21:20:05 -05:00
Sourabh Betigeri 03dbcd8ca7 SWDEV-440866 - [hip-roclr] Adds support to batch memory operations APIs
Change-Id: I5ac63a6626af8c2b4ac382c52dfe1aaf0b3716b8
2024-12-12 19:29:24 -05:00
Tao Sang 3ad8f1b811 SWDEV-496667 - Support gfx9-4-generic target
Support gfx9-4-generic target to cover mi3XX.
Support features sramecc and xnack in generic target.
Improve some code formats.
Add more log on compiler.

Change-Id: I6b3c6af55c60cffd43ce6f17b75998f751b75713
2024-12-12 14:43:39 -05:00
Saleel Kudchadker 537f2fffc9 SWDEV-301667 - Improve logging for hip_memory
Change-Id: Id624b2c91e6b701bc0ee561a0c193f2c66654890
2024-12-12 14:42:00 -05:00
Michael Xie cfcc743824 SWDEV-499997 - Unify ManagedBuffer and KernelArg buffer implementation
Change-Id: I95421c87904dd62d7ee214539a57c7bda1097ff4
2024-12-12 12:56:23 -05:00
Anusha GodavarthySurya 28cbf2bc4f SWDEV-469422 - Refactor childgraph node
Remove static functions in graph

Change-Id: I4df94915f81f250acaea60398aea32ef0ed658e2
2024-12-12 12:38:24 -05:00
Jaydeep Patel c8afd7109d SWDEV-496544 - Sync with stream if it is different than srcMemory stream.
Change-Id: I9c0f94a9531555278a51202ec7203961e1344c2e
2024-12-12 06:17:39 -05:00
Saleel Kudchadker 23c21d5181 SWDEV-503761 - Reintroduce save-temps path for OpenCL
Change-Id: I5e111047242aed7d982b7a25c11ab52293af639d
2024-12-11 13:01:29 -05:00
Maneesh Gupta 072f94c204 Revert "SWDEV-491314 - enable _sync() functions with 64-bit mask argument"
This reverts commit 4e2fd192eb.

Reason for revert: Introduces regression SWDEV-503319

Change-Id: I888c5b95d904146e4782e8c57d736878fcdde678
2024-12-11 11:16:48 -05:00
German Andryeyev 14f58fc74d SWDEV-501757 - Clean-up signal creation
Use hsa_amd_signal_create() if settings.system_scope_signal_ is true.

Change-Id: I6d440155dfbcd5bf03658583a93827cb1c56537c
2024-12-11 09:57:50 -05:00
Jatin Chaudhary 98b33886cd SWDEV-503299 - Do not use operator to check for nan
Some libs use __HIP_NO_HALF_OPERATORS__ and __HIP_NO_HALF_CONVERSIONS__
which results in operators being hidden and can cause errors.

Change-Id: I83c194d7d727cba30b46d7c296f7d396549f5fca
2024-12-11 00:33:44 -05:00
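The commit message for 98b33886cd does not say how the NaN check is done once operators are avoided. One possible approach, sketched here as an assumption, is to test the raw IEEE 754 binary16 bit pattern, which needs no operator on a half type. The helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// IEEE 754 binary16 layout: 1 sign bit, 5 exponent bits, 10 mantissa bits.
constexpr std::uint16_t kHalfExponentMask = 0x7C00;
constexpr std::uint16_t kHalfMantissaMask = 0x03FF;

// Hypothetical helper: true when the raw bits encode a NaN, i.e. the exponent
// field is all ones and the mantissa is non-zero. Only integer operations are
// used, so hidden half operators or conversions do not matter.
constexpr bool isNanHalfBits(std::uint16_t bits) {
  return (bits & kHalfExponentMask) == kHalfExponentMask &&
         (bits & kHalfMantissaMask) != 0;
}

// Checks a few well-known bit patterns.
inline void checkKnownPatterns() {
  assert(isNanHalfBits(0x7E00));   // quiet NaN
  assert(isNanHalfBits(0xFE00));   // quiet NaN with the sign bit set
  assert(!isNanHalfBits(0x7C00));  // +infinity
  assert(!isNanHalfBits(0xFC00));  // -infinity
  assert(!isNanHalfBits(0x3C00));  // 1.0
}
```

A bit-level test like this behaves the same whether or not __HIP_NO_HALF_OPERATORS__ and __HIP_NO_HALF_CONVERSIONS__ are defined, since it never touches the half type's overloaded operators.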
Todd tiantuo Li def369a010 SWDEV-489099 - Deprecate AMDGPU_TARGETS in lieu of supporting GPU_TARGETS
Change-Id: I184cd18e47b1618bcad0fadc8984de54d2a00a9b
2024-12-10 12:38:37 -05:00
Gerardo Hernandez 03e4057fce SWDEV-496392 - Remove references to ASIC Polaris22 in pal::Settings and pal::Device (it is being retired)
Change-Id: I6318abf3e46ed250d087a3d2266d2ae3d4c8c000
2024-12-10 06:01:32 -05:00
Ranjith Ramakrishnan 7c9c7a6332 SWDEV-498728 - Add backward compatibility for deprecated package rocm-opencl-icd-loader
Change-Id: Ic659639e3bb55bd90bd50acf28d8079ff7b084bc
2024-12-09 16:27:43 -05:00
German Andryeyev f4b9d3b7bd SWDEV-501757 - Use signals without interrupts
In active wait mode use signals without interrupts by default and switch
to the interrupts only if a callback is required.

Change-Id: Ibcde8f7d44c70f8fb8fa5e0a7fdd8b08a2982a8e
2024-12-09 15:16:15 -05:00
Ajay 7d763fb803 SWDEV-498474 - APU: when total allocated is greater than 75% of invisible
Only when memory type is Local and the invisible memory is +ve
Should also fix SWDEV-490991

Change-Id: I78a4925a234ba90c63909bde5b7dc217568b4de3
2024-12-09 12:06:57 -05:00
Branislav Brzak 89dfdc4dbe SWDEV-490860 - Do signal_is_required detection post graph schedule
Change-Id: Iaf1067a811aeac3d16c08de954036e219b545e07
2024-12-09 03:57:44 -05:00
Jaydeep Patel 0d4823ff88 SWDEV-502532 - Exit graph launch in case of empty graph.
Change-Id: Ifb6ab14ca6810cbc1c9e38c59d1d9e7d367358d9
2024-12-07 12:27:53 -05:00
Ioannis Assiouras e80442fdbf SWDEV-497759 - Fix memObj offset computation for hipHostRegister on Windows mgpu
On Windows, hipHostRegister may add a single object in the MemObjMap
that maps to memory that is allocated on different devices.
This change ensures that the offset that is returned from
getMemoryObject() is computed relative to the memory that is allocated
on the current device.

Change-Id: I5fd3af200bf6f4926fdeaea12dcb9d0154d3a843
2024-12-05 16:18:10 -05:00
Marko Arandjelovic 78f62d3230 SWDEV-495609 - Change include path for rocclr/utils
- Header files inside rocclr/utils when included from hipamd or opencl should be included as #include "rocclr/utils/xxx.h" instead of "utils/xxx.h"

Change-Id: Ic0760c33b9d091f5620dec67e5482c9698d22093
2024-12-05 11:44:20 -05:00
Jatin Chaudhary 5122b8c999 SWDEV-501779 - add correct function qualifier for fp16 functions
Some functions were __device__ only, but should be __host__ and
__device__, changed them to __HOST_DEVICE__.
Some functions were __HOST_DEVICE__ but were using ockl functions,
changed them to __device__ only.

Change-Id: Ife9e7abe60415bda68f5f9a101e6e7c39ad51064
2024-12-05 10:17:18 -05:00
Branislav Brzak f2dba978f5 SWDEV-490860 - Include signal_is_required in dot file dumps
Change-Id: Iec4b433b11fbecb71a4ce68beb7d6f681d25b8e6
2024-12-05 04:51:59 -05:00
Shane Xiao 231b2410a0 SWDEV-492049 - Remove the handle of Phy Mem from Memobj
The hipGraph will use VMM by default when allocating memory.
However, the handle of Phy mem has been added to Memobj by default.
Since the Memobj will track the whole address range from handle to
handle + size, this needs the system to reserve the whole address
range. If that range has not been reserved by the system, then
clr may find the Memobj incorrectly.

This patch removes the handle from the Memobj to fix this potential
issue.

Change-Id: I2da38e6b2d11d0d48e1afe66c46899500c290624
2024-12-04 19:39:52 -05:00
Saleel Kudchadker 7863eb92dc SWDEV-497145 - Use rocr copyOnEngine API for staged copies
- Refactor blit code and clean ASAN instrumentation
- Use unified function for rocr copy
- Enable shader copy path for unpinned writeBuffer/readBuffer paths
- Set GPU_FORCE_BLIT_COPY_SIZE=16 which means we will use BLIT copy for
  pinned copies or unpinned H2D/D2H copies < 16KB

Change-Id: I42045cca79234b340dbf53dafb93044199736ae4
2024-12-04 13:38:13 -05:00
German Andryeyev efd3ea4b30 SWDEV-501403 - Switch to std::shared_mutex for streamSetLock
A shared mutex allows multiple threads to access the list of streams
at the same time.

Change-Id: Ibee64b846cde03321d5b17dbee2829c0bab7e7d6
2024-12-04 12:06:51 -05:00
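The reader/writer pattern behind efd3ea4b30 can be sketched with std::shared_mutex (C++17). This is a minimal illustration under assumptions: the StreamSet class and its integer stream ids are hypothetical and do not reflect the actual streamSetLock code.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <shared_mutex>
#include <unordered_set>

// Hypothetical container of stream ids guarded by a shared mutex.
class StreamSet {
 public:
  // Writers take an exclusive lock, so they never overlap with readers.
  void add(int id) {
    std::unique_lock<std::shared_mutex> lock(mutex_);
    streams_.insert(id);
  }

  void remove(int id) {
    std::unique_lock<std::shared_mutex> lock(mutex_);
    streams_.erase(id);
  }

  // Readers take a shared lock, so several threads can query at once.
  bool contains(int id) const {
    std::shared_lock<std::shared_mutex> lock(mutex_);
    return streams_.count(id) != 0;
  }

  std::size_t size() const {
    std::shared_lock<std::shared_mutex> lock(mutex_);
    return streams_.size();
  }

 private:
  mutable std::shared_mutex mutex_;  // mutable so const readers can lock it
  std::unordered_set<int> streams_;
};

// Small single-threaded walk-through of the interface.
inline void exampleUsage() {
  StreamSet set;
  set.add(1);
  set.add(2);
  assert(set.contains(1));
  assert(!set.contains(3));
  set.remove(1);
  assert(set.size() == 1);
}
```

The trade-off is that lookups, which are typically far more frequent than stream creation or destruction, no longer serialize against each other, while insertions and removals still get exclusive access.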
Jatin Chaudhary 063f7ef32a SWDEV-485945 - use union to convert values
this shows up in some compilers as warnings.

Change-Id: I862cd6baf2edb8161757adc54abb787530489481
2024-12-04 11:15:03 -05:00