Commit graph

12614 Commits

Author SHA1 Message Date
Vladana Stojiljkovic 4b786b5207 SWDEV-498061 - Add capture support for hipModuleLaunchCooperativeKernel
Change-Id: I5ed188e046c680c2785b3952391f59ed1d0c21b8


[ROCm/clr commit: 30cb2d0e67]
2025-01-16 10:54:30 -05:00
Marko Arandjelovic cf8eeabfe2 SWDEV-489619 - Fix memcpy tests with capture stream enabled
- Added missing validation, as a graph node should not be created
  if parameters are invalid
- Fix conversion of input params to graphNode params

Change-Id: I37ab04942b5fb2eb07386850cb7dbbf26f9ca967


[ROCm/clr commit: db8527f655]
2025-01-16 10:31:04 -05:00
Marko Arandjelovic 8647bb483b SWDEV-504084 - Make hipModuleGetFunction use the device the module is loaded on
If a module is loaded on one device, hipModuleGetFunction and other similar APIs should be able to run successfully from another device.

Change-Id: I96084cbd6c6dcf2a81019779a6ab1842ef2f35d1


[ROCm/clr commit: c46f843b99]
2025-01-16 10:16:42 -05:00
Ioannis Assiouras 67c93c3bad SWDEV-505503 - Use internal device synchronize function in __hipUnregisterFatBinary
This is to avoid calling the HIP_INIT macro during the shutdown process.

Change-Id: I2e65f6e10491918a17445ee1e8ddd08286070358


[ROCm/clr commit: 5e3a29078d]
2025-01-15 18:57:34 -05:00
Sourabh Betigeri c95bb38b2b SWDEV-507960 - Return an error code when attempting to destroy a stream of type hipStreamLegacy
Change-Id: Iee7ada6a5a905b44360a7e4049fc8b1a45c80db0


[ROCm/clr commit: 9d8d35ae40]
2025-01-13 18:17:37 -05:00
Ioannis Assiouras 161d590405 SWDEV-503760 - Only consider allocations that are less than X% larger in a mempool request
Change-Id: I94acbca606fd4c575e2e1a9e34959ce650571867


[ROCm/clr commit: 44b6b6813d]
2025-01-13 16:57:26 -05:00
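The size filter described in the commit above can be illustrated with a minimal sketch. It is not the CLR implementation: the function name `FindReusableBlock`, the flat vector of free block sizes, and the choice to always accept an exact fit are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <optional>
#include <vector>

// Hypothetical helper (not CLR code): returns the index of the smallest free
// block that can serve `requested` bytes while being less than
// `max_slack_percent` percent larger than the request. An exact fit is always
// accepted. Overflow of the multiplications is ignored in this sketch.
std::optional<std::size_t> FindReusableBlock(
    const std::vector<std::size_t>& free_sizes, std::size_t requested,
    std::size_t max_slack_percent) {
  std::optional<std::size_t> best;
  for (std::size_t i = 0; i < free_sizes.size(); ++i) {
    const std::size_t size = free_sizes[i];
    if (size < requested) continue;  // too small to serve the request
    const std::size_t excess = size - requested;
    // Reject blocks that would waste max_slack_percent or more of the request.
    if (excess != 0 && excess * 100 >= requested * max_slack_percent) continue;
    if (!best || size < free_sizes[*best]) best = i;  // keep the tightest fit
  }
  return best;
}
```

With free blocks of 4096, 1024, 1200, and 1100 bytes, a 1000-byte request and a 15% limit would pick the 1024-byte block; the 1200-byte and 4096-byte blocks are rejected as too wasteful, so a request with no acceptable block falls through to a fresh allocation.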
Rahul Manocha 0a9315f5e0 SWDEV-497288 - Add validation checks to hipGraphExecNodeSetParams
Change-Id: If8cde47bc8e62414333768e01064298d8a3d80ee


[ROCm/clr commit: 2b32d9aada]
2025-01-13 16:49:45 -05:00
Rahul Manocha 17ad1834d4 SWDEV-492165 - Add support for 16 bit atomicAdd and atomicCAS
Change-Id: I7f1e4876fe2960dff1775d27cb6443a89e146c86


[ROCm/clr commit: 93cff75928]
2025-01-13 13:35:40 -05:00
Jacob Lambert aa918756e8 SWDEV-477039 - Fix spelling typos
Change-Id: Ia46e07ec0a001b7e19bec999597028acd4ee7077


[ROCm/clr commit: 805462a77b]
2025-01-13 12:17:46 -05:00
Aidan Belton-Schure dc2fa93f37 SWDEV-482851 - Do not release last suballocator chunk
Change-Id: Ib28dc9df68e454ee0c0c699c1ff17588fd55f802


[ROCm/clr commit: 451b0ce768]
2025-01-13 10:14:40 -05:00
Daniel Livingston 50387f6eb5 SWDEV-489003 - [Ubertrace] OCL/HIP profiles are missing event instrumentation
Adds UberTrace support for pre-dispatch markers and barrier begin/end markers.

Moves shared definitions out of palgpuopen.hpp into shared header
palcapturemgr.hpp.

Change-Id: I9f464c689e7ff12c54eca043fc1ad65e1836a64f


[ROCm/clr commit: 541c449ce2]
2025-01-10 11:28:52 -05:00
Julia Jiang 60f9ab6fcd SWDEV-507699 - Update CLR license date
Change-Id: I51b641c58b1e9b8c84637af2d22f905bcdab8f56


[ROCm/clr commit: c6e25b2be7]
2025-01-10 11:17:19 -05:00
Julia Jiang 8c070f11f9 SWDEV-497634 - Update change log for hipMalloc allocation fix on Windows
Change-Id: If4351cb4f75141661538e1d26c96e600df3d0b39


[ROCm/clr commit: bdf48bdbf9]
2025-01-10 10:12:39 -05:00
Anusha GodavarthySurya 08c92f4793 SWDEV-480209 - Make internal callbacks non-blocking
Change-Id: Ic918d08f341abfd9a7c167d09f9c723cdc43157f


[ROCm/clr commit: 683a942364]
2025-01-10 02:16:11 -05:00
Saleel Kudchadker 16f14e4b00 SWDEV-504494 - Use system scope for D2H
- When using shader copy, make sure to use release scope for the AQL
  packet. This is a potential bug, but it is hidden because hipMemcpyAsync
  always needs synchronization (which inserts a barrier with release
  scope), and for hipMemcpy we use a barrier packet to make sure it is
  blocking. Either way a barrier is always used, which masks the
  potential bug.

Change-Id: I57fb7f769c3179e76d712471c0905104c801d7ba


[ROCm/clr commit: c9dd95bf6c]
2025-01-10 00:34:08 -05:00
Saleel Kudchadker 6a416008aa SWDEV-508004 - Improve hipEventRecord
- Resolve stream once for event record. We should avoid calling
  getStream again in addMarker

Change-Id: I78448c4f151ae10a5c8e8c248b2f4078b84191cb


[ROCm/clr commit: a22c45d635]
2025-01-09 16:47:46 -05:00
German Andryeyev 5a76761960 SWDEV-507019 - Change the function lock to the module lock
Multiple functions can be located in the same module.

Change-Id: Ia4ca3db64fe5b0822584059d3770c91103665c63


[ROCm/clr commit: 45a12208b6]
2025-01-09 12:35:29 -05:00
Evgenii Averin ef2a812d0a SWDEV-505769 - Fix typo
Change-Id: I2d3f65ed68157718c4439a9da7d2dcdfcbb9f93d


[ROCm/clr commit: b62995ce1a]
2025-01-08 21:31:17 -05:00
Sourabh Betigeri 36f3d7647c SWDEV-505971 - Fix size mismatch of count type to uint32_t
Change-Id: Ie526f828f816e6681ef1735d5edb2db895dace57


[ROCm/clr commit: f5b2516f5d]
2025-01-08 12:47:36 -05:00
Saleel Kudchadker d4594531ef SWDEV-506251 - Disable blit copy threshold for OpenCL
Change-Id: Id0ca43b13d5792791a42da263f6aa4496382cea6


[ROCm/clr commit: 39801b5750]
2025-01-08 02:46:01 +00:00
Rahul Manocha b1ef5972d6 SWDEV-504215 - fix rocalution perf drop by disabling cpu wait
Change-Id: I878f3420073b05cc6241f524ac428e47c0ce823d


[ROCm/clr commit: 05baf9ff22]
2025-01-07 17:02:24 -05:00
Saleel Kudchadker a18f2c549c SWDEV-504494 - Flush to systemscope when copying non-coherent mem
- When we use blit (compute) copies, two subsequent copies may read from
  the same source buffer. The host may modify the buffer in between, and
  if the source buffer was allocated with the non-coherent flag, the
  device may simply use a stale value from the previous cacheline fetch.
  This is a corner case.

Change-Id: I2ce261c6f6fa4e5bb608f116548e5cc711ae6f3c


[ROCm/clr commit: b63005d550]
2025-01-07 12:49:22 -05:00
Jatin Jaikishan Chaudhary 8b1d0cff83 Revert "SWDEV-505971 - change setArgument arg from uint32_t to uint64_t"
This reverts commit 0830d95f6d.

Reason for revert: There needs to be a memcpy size change

Change-Id: If4f51769731e54743ac705b19b4f81b2d5925d5a


[ROCm/clr commit: 446ed661a0]
2025-01-06 18:03:23 -05:00
Jatin Chaudhary 0830d95f6d SWDEV-505971 - change setArgument arg from uint32_t to uint64_t
We are passing this arg as an address, and memcpy complains about
overreading (8 bytes instead of 4).

Change-Id: Ica9207f6c5f6056a4bfc968280c76e779ded13ae


[ROCm/clr commit: a6f2a2c2af]
2025-01-06 08:16:59 -05:00
Pengda Xie 612ae28524 SWDEV-505833 - Provide functionality to avoid L2 flush for CPX mode for dispatch packets
- Added DEBUG_CLR_SKIP_RELEASE_SCOPE flag to force release scope to
   SCOPE_NONE in AQL packet header

Change-Id: Ife02cddb9d5cd4749103ce585d3d5fe9024c6868


[ROCm/clr commit: 8155943c5f]
2025-01-03 17:28:21 -05:00
Marko Arandjelovic 44ff3ba1cc SWDEV-506234 - Refactor validation in hip_memory
Change-Id: I9d69695e4b6668e6de00f1f6b060862872358340


[ROCm/clr commit: 7e152bb0f3]
2024-12-31 00:35:25 +02:00
zichguan-amd a255533afd SWDEV-495789 - revert Fix ambiguity of fma for _Float16 for libc++ (#1976)
Change-Id: I45ae4711a047f4484a018b9409c9f6ecf09720ce


[ROCm/clr commit: b8ba4ccf9c]
2024-12-29 10:56:32 -05:00
Amit Pandey 81a25f9614 SWDEV-490256 - Fix uri_decode logic to handle Memory URI.
The URI decoder currently silently skips memory URIs.
This patch enables the existing logic to handle the offset and size
of a loaded code object that has a memory URI.

Change-Id: If03579cefb11d91f667410464dc89404df9270a3


[ROCm/clr commit: 11cd37ce0b]
2024-12-25 11:07:16 -05:00
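As a rough illustration of what decoding a memory URI involves, the sketch below parses the process ID, offset, and size out of a string. It assumes the `memory://<pid>#offset=<n>&size=<n>` shape described in the LLVM AMDGPU usage documentation; the struct `MemoryUri` and the function `ParseMemoryUri` are hypothetical names and do not reflect the actual uri_decode code in CLR.

```cpp
#include <cstddef>
#include <cstdint>
#include <exception>
#include <optional>
#include <string>

// Hypothetical result type for this sketch (not a CLR type).
struct MemoryUri {
  std::uint64_t pid;
  std::uint64_t offset;
  std::uint64_t size;
};

// Parses "memory://<pid>#offset=<n>&size=<n>" (a '?' separator is also
// accepted). Offset and size may be decimal or 0x-prefixed hex. Parsing is
// lenient: trailing characters after a number are ignored.
std::optional<MemoryUri> ParseMemoryUri(const std::string& uri) {
  const std::string scheme = "memory://";
  const std::string offset_key = "offset=";
  const std::string size_key = "&size=";
  if (uri.compare(0, scheme.size(), scheme) != 0) return std::nullopt;
  const std::size_t sep = uri.find_first_of("#?", scheme.size());
  if (sep == std::string::npos) return std::nullopt;
  const std::size_t off_pos = sep + 1;
  if (uri.compare(off_pos, offset_key.size(), offset_key) != 0) {
    return std::nullopt;
  }
  const std::size_t size_pos = uri.find(size_key, off_pos);
  if (size_pos == std::string::npos) return std::nullopt;
  try {
    MemoryUri out{};
    out.pid = std::stoull(uri.substr(scheme.size(), sep - scheme.size()),
                          nullptr, 10);
    const std::size_t off_begin = off_pos + offset_key.size();
    out.offset =
        std::stoull(uri.substr(off_begin, size_pos - off_begin), nullptr, 0);
    out.size = std::stoull(uri.substr(size_pos + size_key.size()), nullptr, 0);
    return out;
  } catch (const std::exception&) {
    return std::nullopt;  // a numeric field was missing or out of range
  }
}
```

Returning `std::nullopt` for anything that is not a well-formed memory URI makes the failure explicit to the caller rather than silently ignoring the input.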
Jaydeep Patel f20f399915 SWDEV-505276 - Parent graph of original node and original graph of cloned node should be the same.
Change-Id: I6ebc21cc42e41ad5d952a69fb3b3cb095f32cffb


[ROCm/clr commit: dff8197b1d]
2024-12-24 04:32:14 -05:00
taosang2 da613fbbeb SWDEV-479958 - Support different address mode
Support different address modes in X, Y, Z directions

Change-Id: If1db5a8af33c92dd14b48968c3e8eceb97daea6c


[ROCm/clr commit: d82d6a78cf]
2024-12-23 16:39:54 -05:00
Julia Jiang 56ccdaf1dc SWDEV-499281 - Update changelog with new format
Change-Id: I9a764ac99cd03d0a18ebc99cdd0313301e35565b


[ROCm/clr commit: 1af639ea44]
2024-12-23 10:33:58 -05:00
Ioannis Assiouras 7051efcc44 SWDEV-497636 - Updated CHANGELOG
Updated CHANGELOG to include the performance fix for
kernel launch latency with increasing number of idle streams.

Change-Id: I509e14cb8f8cd3abe61c6ede78808e96ef8f06e1


[ROCm/clr commit: a55118f63d]
2024-12-20 19:09:16 -05:00
Ioannis Assiouras b4019892c9 SWDEV-505504 - Disable vectorization in GetHipDispatchTable
Change-Id: Id33144623555a5d25e029ca644f6274610dcd0ad


[ROCm/clr commit: 158b6a29e0]
2024-12-20 17:47:07 -05:00
sonadeem ad3d2b0679 SWDEV-503436 - Fix incorrect OptionGroup for FSanitize option
NOPTION is meant for component options or alias runtime options, so
either the option group must not be OA_RUNTIME or OA_MISC_ALIAS must be
set. Otherwise we incorrectly assume that it has an option variable, and
attempting to write to it corrupts OptionVariables.

Change-Id: Iafb5a8f743e5ed0f87be36061c44578178f6cfde


[ROCm/clr commit: caa10572cb]
2024-12-20 10:14:51 -05:00
Jaydeep Patel 12ed697705 SWDEV-505205 - Fix hipStreamLegacy segfault with hipStreamWaitEvent.
Change-Id: I17fdaf7ac323507f99a7c071066944296537489c


[ROCm/clr commit: a05a02e527]
2024-12-20 04:18:21 -05:00
German Andryeyev 55d4a75016 SWDEV-504658 - Reduce the lock scope for kernel look-up
The vector with all kernels is preallocated on the executable init.
Thus, reduce the scope of the global lock to the binary creation only.

Change-Id: I73035013a6562175069137e895bba815f466ee35


[ROCm/clr commit: 0640d36019]
2024-12-18 17:04:51 -05:00
Sourabh Betigeri 02c203bf9c SWDEV-505277 - Adds hipStreamBatchMemOp in the enum of hcc_map
Change-Id: I6e58dfbe4ba13db8717edc36020fefabc9ddbe23


[ROCm/clr commit: cd9db5a2fa]
2024-12-18 05:38:58 -05:00
Saleel Kudchadker fda4ff1f9d SWDEV-504340 - Move cast of cl_mem inside the condition
Change-Id: I9c91f5d945a8d8bd2b2f55e3d11ede66afe4eef7


[ROCm/clr commit: fa63919a63]
2024-12-17 12:58:12 -05:00
German Andryeyev d5b3b0830a SWDEV-504650 - Switch to shared_mutex for events
Use shared mutex for events validation

Change-Id: Iff291c758d9edd65717c506150f3b9d39e5306ba


[ROCm/clr commit: e3efce20be]
2024-12-17 11:04:58 -05:00
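The idea behind the switch to a shared mutex can be sketched in standard C++17: validation only reads the set of live events, so many threads can validate concurrently under a shared lock, while creation and destruction take an exclusive lock. The class `EventRegistry` and its methods are hypothetical and are not the CLR data structures.

```cpp
#include <mutex>
#include <shared_mutex>
#include <unordered_set>

// Hypothetical registry of live event handles (illustration only).
class EventRegistry {
 public:
  // Writers take an exclusive lock because they modify the set.
  void Add(const void* event) {
    std::unique_lock<std::shared_mutex> lock(mutex_);
    events_.insert(event);
  }

  void Remove(const void* event) {
    std::unique_lock<std::shared_mutex> lock(mutex_);
    events_.erase(event);
  }

  // Validation only reads, so concurrent callers share the lock.
  bool IsValid(const void* event) const {
    std::shared_lock<std::shared_mutex> lock(mutex_);
    return events_.count(event) != 0;
  }

 private:
  mutable std::shared_mutex mutex_;
  std::unordered_set<const void*> events_;
};
```

This trade-off pays off when validation calls greatly outnumber event creation and destruction, which is the usual pattern for a lookup performed on every API call that takes an event.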
Jatin Chaudhary 11f9d84c34 SWDEV-377518 - Fix bf16/fp8 header to be compilable with hiprtc
Change-Id: I2093a39d79a46da7e102266c04c2a71e03dcb88e


[ROCm/clr commit: fccf0fa2f0]
2024-12-17 08:57:15 -05:00
Ioannis Assiouras 2c8805e536 SWDEV-483134 - Remove hipExtHostAlloc API
Change-Id: I60777ef5c56b60dd8100d0d794ca10fb3b96a555


[ROCm/clr commit: e8b2fdab96]
2024-12-16 17:13:49 -05:00
Pengda Xie 17c7b3b270 SWDEV-503764 - Add wptr and rptr to ClPrint for dispatch barrier methods
- added wptr and rptr to ClPrint in dispatchBarrierPacket and dispatchBarrierValuePacket

Change-Id: I8a62289deb23c9f657a9b0ac6138bb55eafecba2


[ROCm/clr commit: 078fe7e5de]
2024-12-16 16:45:30 -05:00
Ioannis Assiouras 7670376748 SWDEV-489255 - Update stack size limit in rocvirtual
Change-Id: I2aac9d211f64b3d6c121d8b010d215dcbdeac3aa


[ROCm/clr commit: a808c4b23a]
2024-12-16 09:30:39 -05:00
Anusha GodavarthySurya 4288640f69 SWDEV-469422 - Derive GraphExec from Graph and ChildGraphNode from GraphExec
Change-Id: I54d67a1665355579bc249d8ff4f9806e9ee14588


[ROCm/clr commit: 13e2e797c0]
2024-12-16 00:43:57 -05:00
Istvan Kiss 349dacc1d9 SWDEV-502543 - Update doxygen to surface functions and Coop Group API
Change-Id: Id4df63b8ae64a1113f85d89aa250ac9f7cc8b9bb


[ROCm/clr commit: 3c863dad91]
2024-12-14 14:11:37 -05:00
Todd tiantuo Li 7081be99ae SWDEV-496037 - add Strix and Strix Halo to ocltst runtime test
Change-Id: Ia21afddf5223ecd132a06f37bb430961fb7a9341


[ROCm/clr commit: 8ffb1430dd]
2024-12-13 20:19:48 -05:00
Sourabh Betigeri 750313dfed SWDEV-421020 - Adds hipGraphAddBatchMemOp, SetGetParams and execSetParams APIs
Change-Id: Ieccecfe6173cc68fd3c01f86c99f7cc09fe194a3


[ROCm/clr commit: f1c05e9026]
2024-12-13 06:23:39 +00:00
Saleel Kudchadker 5255fd1fa3 SWDEV-301667 - Clear dispatch indicator signal flag
Change-Id: I9028df0bb73289791d169e7f064a1d0f615236a5


[ROCm/clr commit: 93f1e8ff60]
2024-12-12 21:20:05 -05:00
Sourabh Betigeri 7261404002 SWDEV-440866 - [hip-roclr] Adds support to batch memory operations APIs
Change-Id: I5ac63a6626af8c2b4ac382c52dfe1aaf0b3716b8


[ROCm/clr commit: 03dbcd8ca7]
2024-12-12 19:29:24 -05:00
Tao Sang fb76b9620c SWDEV-496667 - Support gfx9-4-generic target
Support gfx9-4-generic target to cover mi3XX.
Support features sramecc and xnack in generic target.
Improve code formatting.
Add more compiler logging.

Change-Id: I6b3c6af55c60cffd43ce6f17b75998f751b75713


[ROCm/clr commit: 3ad8f1b811]
2024-12-12 14:43:39 -05:00