Commit Graph

12380 Commits

Author SHA1 Message Date
Anusha GodavarthySurya fc014587f8 SWDEV-477324 - Graph Capture memcpy D2D
Change-Id: Ifaa4d78854c03b3150233142df187c9bbf731cab


[ROCm/clr commit: e98179d924]
2024-08-28 23:36:51 -04:00
Julia Jiang 71d97112cc SWDEV-476623 - correct the format on the fix for clCopyImage
Change-Id: I3a3fb2eaa338ff4e298a43e583fcf94ec7cabdf6


[ROCm/clr commit: 417d3279f9]
2024-08-28 16:16:24 -04:00
Julia Jiang 049a21f3da SWDEV-476623 - Fix test failures for clCopyImage
Change-Id: I971c5be98304bdbef0feec73e15ebd61a131b12f


[ROCm/clr commit: c3c41dae0d]
2024-08-27 11:43:12 -04:00
Tao Sang eaa7fd41cf SWDEV-474989 - Fix issues of texture tests
Change-Id: Ie1d874742b804f82ceda68864fa54f5d59c092b8


[ROCm/clr commit: 4b211f7272]
2024-08-27 11:29:43 -04:00
kjayapra-amd afd72f9ad0 SWDEV-478099 - Fix multiple mapping case on PAL/Windows backend.
Change-Id: Id1fe7939fbf90649cda1848890b3b4ca9a1fcd00


[ROCm/clr commit: 2a9cb89228]
2024-08-27 11:19:39 -04:00
Ioannis Assiouras a00f071579 SWDEV-470372 - Added hipExtHostAlloc API
This change adds a new HIP API `hipExtHostAlloc` which preserves
the functionality of `hipHostMalloc`.

Change-Id: I13504c6fc13465ddd7aed329795bb4f2fef1baff


[ROCm/clr commit: 2c84211b58]
2024-08-27 08:26:03 -04:00
Jatin Chaudhary 5e3fe9bd1f SWDEV-480489 - fix unsafeAtomicAdd
Integration into PyTorch exposed value-narrowing issues; to fix them
we now use unions. Also removed the check for the -munsafe*
compiler flag. The check now relies only on builtin detection.

Change-Id: I49364503fa429bd862952f9b29879072afa6d553


[ROCm/clr commit: bb52d9ed62]
2024-08-27 06:29:11 -04:00
Vladana Stojiljkovic e8b3e9e5b6 SWDEV-478207 - Return hipSuccess on the end of hipTexRefGetMaxAnisotropy
Change-Id: I0c4d6d13a178af8449853c87e62a1868eb17f87d


[ROCm/clr commit: f5e6e27fe1]
2024-08-27 05:30:36 -04:00
ksankisa 3bcd901f06 [SWDEV-469495] Compile blit kernels with -fsanitize=address when asan is enabled.
Change-Id: I96e1abef43317cd58329c4a159f807878bc48cf4


[ROCm/clr commit: e76bf653fb]
2024-08-27 01:27:31 -04:00
Sameer Sahasrabuddhe aacf75f480 SWDEV-480725: missing __ockl_wfall __ockl_wfany in amd_hip_bf16.h
Change-Id: Iff4aeec411bfeaf4cc187c515e2da3d5898f89cb


[ROCm/clr commit: 6df2da65cd]
2024-08-25 22:49:14 -04:00
kjayapra-amd f370cced08 SWDEV-479620 - Change argument type to size_t from uint64_t in nonTemporalMemcpy function.
Change-Id: I31f8a2b00685789b027d78be40a9f82c235f51b9


[ROCm/clr commit: 00eb038eec]
2024-08-24 07:42:37 -04:00
Ajay c774a3470d SWDEV-478881 - Fix log AMD_LOG file corruption
hiprtc and HIP APIs use the same log file.
Append to the file instead of writing from the start of the file.

Change-Id: I2703f9bb67f0c51b557a058daab129679a0b5dd9


[ROCm/clr commit: e07172ff57]
2024-08-23 11:19:48 -04:00
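The append-instead-of-overwrite behavior described in SWDEV-478881 can be illustrated with a minimal, standalone sketch. It is not the runtime's logging code; `appendLogLine` and `readAll` are hypothetical helpers, and the sketch only assumes that two components write to the same path.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <string>

// Hypothetical helper: append one line to a log file shared by several writers.
// std::ios::app positions every write at the current end of the file, so a
// second writer opening the same path does not overwrite earlier entries.
inline bool appendLogLine(const std::string& path, const std::string& line) {
  std::ofstream out(path, std::ios::out | std::ios::app);
  if (!out.is_open()) return false;
  out << line << '\n';
  out.flush();
  return out.good();
}

// Hypothetical helper: read the whole file back so the result can be checked.
inline std::string readAll(const std::string& path) {
  std::ifstream in(path);
  return std::string(std::istreambuf_iterator<char>(in),
                     std::istreambuf_iterator<char>());
}
```

Calling `appendLogLine` once for a HIP entry and once for a hiprtc entry leaves both lines in the file, in call order.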
kjayapra-amd 2af78c954b SWDEV-478097 - Check for parents size in case of VA Mem object.
Change-Id: Icfdeabeb178c0dcc8c3a4bc48eec40067985794e


[ROCm/clr commit: d7b097c994]
2024-08-22 14:18:51 -04:00
Julia Jiang eac092348a SWDEV-479940 - Update the changelog for 6.3
Change-Id: I2b465d297466b9c4884e30649bd2ea12a4c4229c


[ROCm/clr commit: 6576be5602]
2024-08-22 11:28:46 -04:00
Shane Xiao 06912065d9 [SWDEV-479204] Fix the hipGraph AQL package fill issue
This patch fixes a potential issue where the AQL header is filled before
the AQL body. The HSA spec specifies "Packet processors may
process AQL packets after the packet format field is updated, but
before the doorbell is signaled."
However, the hipGraph AQL packet was given a valid header before its
body was filled, so the CP could receive an invalid AQL body.

Change-Id: I84af798c19ee2b8805ba19732b0eabdea2958a96


[ROCm/clr commit: 3959b5be1e]
2024-08-21 21:49:11 -04:00
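The ordering problem described in SWDEV-479204 can be sketched on the host with a simplified packet slot. This is an illustration only: `PacketSlot`, `publishPacket`, and `tryConsume` are hypothetical names, the layout is not the real HSA AQL packet, and the header values are assumptions chosen for the example.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

constexpr std::uint16_t kInvalidHeader = 0;  // consumer must skip the slot
constexpr std::uint16_t kValidHeader = 1;    // body is ready to be read

// Hypothetical, simplified packet slot; not the real AQL packet layout.
struct PacketSlot {
  std::atomic<std::uint16_t> header{kInvalidHeader};
  std::uint32_t body[4] = {0, 0, 0, 0};
};

// Producer: fill the body first, then publish the header with release
// ordering so the body writes are visible before the header turns valid.
inline void publishPacket(PacketSlot& slot, const std::uint32_t (&body)[4]) {
  for (int i = 0; i < 4; ++i) slot.body[i] = body[i];
  slot.header.store(kValidHeader, std::memory_order_release);
}

// Consumer: read the body only after a valid header has been observed.
inline bool tryConsume(const PacketSlot& slot, std::uint32_t (&out)[4]) {
  if (slot.header.load(std::memory_order_acquire) != kValidHeader) return false;
  for (int i = 0; i < 4; ++i) out[i] = slot.body[i];
  return true;
}
```

Writing the header last means a consumer that polls the header before any doorbell never sees a valid header paired with a partially written body.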
Rahul Manocha 1b14058283 SWDEV-474617 SWDEV-464679 - Fix segfault in palvirtual due to peer memory access
Change-Id: Ib8b641712d78acf8bc073ca5705dea97af6f944a


[ROCm/clr commit: 432bdd7bf2]
2024-08-21 11:34:15 -04:00
Sourabh Betigeri d1d6c448c9 SWDEV-462192 SWDEV-459056 - Fixes corruption
SPT is destroyed with hipDeviceReset(). If a
stream is created right after the reset, the same
object ID can be reused. The SPT destructor then
incorrectly treats the stream as valid because it
checks the reused object ID, which causes the
corruption.

Change-Id: I3b1f7ffdf8bab874dca7b8fde22318162997b8f6


[ROCm/clr commit: f6a68b3c2e]
2024-08-21 11:33:44 -04:00
Ioannis Assiouras b5acdd6fdc SWDEV-470612 - Added fixes in optimized multistream path for graph execution
This change adds fixes in the optimized multistream path for childGraph use cases.

1) For childgraph nodes, rely on runNodes() only to process
   the childgraph and skip calls to createCommand and enqueueCommands.
   This ensures that the start/end markers are enqueued correctly
   with respect to the childGraph commands.
   In addition, the runNodes() for the childgraph should be called after
   the dependency walkthrough to make sure that the subgraph is executed once.

2) Nodes with no outgoing edges should be marked
   as leaves regardless of which stream they are assigned to.
   This is to ensure that marker dependencies from nodes
   that run on non-zero stream to subgraph leafs that run on zero stream
   are still set up correctly.

Change-Id: I4a5f4f3b0e0d01e515cdcb045b46c2798f291255


[ROCm/clr commit: 464b99373b]
2024-08-21 10:11:24 -04:00
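Point 2 of SWDEV-470612 (leaf detection that ignores the stream assignment) can be sketched as follows. `NodeSketch` and `findLeaves` are hypothetical names for illustration and do not reflect the runtime's actual graph classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical graph node: the stream it was assigned to and its outgoing edges.
struct NodeSketch {
  int streamId;
  std::vector<std::size_t> outgoing;  // indices of dependent nodes
};

// A node is a leaf when it has no outgoing edges; streamId is deliberately
// not consulted, so leaves on non-zero streams are reported as well.
inline std::vector<std::size_t> findLeaves(const std::vector<NodeSketch>& nodes) {
  std::vector<std::size_t> leaves;
  for (std::size_t i = 0; i < nodes.size(); ++i) {
    if (nodes[i].outgoing.empty()) leaves.push_back(i);
  }
  return leaves;
}
```

With a root on stream 0 that feeds one node on stream 0 and one on stream 1, both dependents come back as leaves, so marker dependencies can be set up for each of them.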
Anusha GodavarthySurya c2a4062392 SWDEV-470612 - Add stream id to DOT print when DEBUG_HIP_GRAPH_DOT_PRINT is enabled
Change-Id: Iec3630ba6fb2206925653ea939770bb9820d7c52


[ROCm/clr commit: 19bf971134]
2024-08-21 00:37:41 -04:00
taosang2 785d6e7d01 SWDEV-475144 - Fix random language string
Fix a random language string that leads to a compile failure
of the trap handler and a TDR in hipMemset() on a VM in release
mode of hip-rt

Change-Id: Ie1d874742b804f62ceda68064fa54f5d39c092b8


[ROCm/clr commit: 857d0d60b9]
2024-08-20 17:42:31 -04:00
kjayapra-amd 457e46551d SWDEV-439234 - set access for vmm memory on graph/mempool path.
Change-Id: Idfb740dcfe6c7fe0f18231de3074a81d06e6886e


[ROCm/clr commit: e72d5a4443]
2024-08-19 13:16:30 -04:00
kjayapra-amd 30609c2e65 SWDEV-465509 - Save the handle type during import function.
Change-Id: If069abb6cd474a7b071617757041402b53575414


[ROCm/clr commit: 8ddb023512]
2024-08-16 10:55:36 -04:00
Ioannis Assiouras 1110a2f345 SWDEV-470372 - Un-deprecate hipHostAlloc, comply with cuda and introduce hipHostAlloc flags
Change-Id: I8165342825dfe07b6e9edc492d0166d0a03be62d


[ROCm/clr commit: 1e4c60f286]
2024-08-15 18:25:22 -04:00
Ioannis Assiouras 3cc948278e SWDEV-449052 - Fix hipMemcpyParam2D when source or destination pitch is set to zero
When the source or destination pitch is set to zero in the hip_Memcpy2D struct,
it should default to WidthInBytes + [src/dst]XInBytes.

Change-Id: Id57b53cab40ba72ced231258da9356554c4868c3


[ROCm/clr commit: 7a1e818c82]
2024-08-14 04:46:41 -04:00
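The pitch fallback stated in SWDEV-449052 amounts to a one-line rule. The helper below is a hypothetical illustration of that rule only; `effectivePitch` is not a HIP API, and the parameter names simply echo the fields named in the commit message.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: a pitch of zero falls back to WidthInBytes plus the
// X offset in bytes for that side of the copy (source or destination).
inline std::size_t effectivePitch(std::size_t pitch,
                                  std::size_t widthInBytes,
                                  std::size_t xInBytes) {
  return pitch != 0 ? pitch : widthInBytes + xInBytes;
}
```

For a copy 256 bytes wide with a 16-byte X offset, a zero pitch resolves to 272 bytes, while an explicit pitch such as 512 is used unchanged.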
amd-jmacaran 5dac731f29 SWDEV-458516 - External CI: Align with branch naming convention.
Change-Id: Ie1d874742b804f02ceda68064fa54f5d59c092b7


[ROCm/clr commit: cd4ed0916b]
2024-08-13 11:47:11 -04:00
Satyanvesh Dittakavi 6907974f90 SWDEV-473942 - SWDEV-431367 - Correct atomicMax(_system) and atomicMin(_system)
- Fixes the -0.0 and +0.0 comparison. For atomicMax, if the value at the
address is -0.0 and val is +0.0, gfx90a's unsafe atomics will swap
them. The CAS loop should behave consistently with this.

- The _system variants of atomicMax and atomicMin produced
incorrect output. Updated them to use an implementation similar to
atomicMax and atomicMin.

Change-Id: I20df36ee29ae0434a6b564f2ba71193fe41cfa59


[ROCm/clr commit: d69cc35750]
2024-08-13 10:38:50 -04:00
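The signed-zero case from SWDEV-473942 can be illustrated with a host-side CAS loop. This is a sketch under assumptions, not the device implementation: `greaterWithSignedZero` and `atomicMaxSketch` are hypothetical names, `std::atomic<float>` stands in for device memory, and NaN handling is omitted.

```cpp
#include <atomic>
#include <cassert>
#include <cmath>

// True when `a` should replace `b` in a max operation. A plain `a > b` is
// false for +0.0 versus -0.0 because they compare equal, so the sign bit
// is used as a tie-break that ranks +0.0 above -0.0.
inline bool greaterWithSignedZero(float a, float b) {
  if (a == b) return !std::signbit(a) && std::signbit(b);
  return a > b;
}

// Host-side CAS loop; returns the value that was stored before the call.
inline float atomicMaxSketch(std::atomic<float>& target, float val) {
  float old = target.load();
  while (greaterWithSignedZero(val, old) &&
         !target.compare_exchange_weak(old, val)) {
    // compare_exchange_weak reloads `old` on failure, so the comparison
    // is re-evaluated against the latest stored value.
  }
  return old;
}
```

Starting from -0.0 and applying `atomicMaxSketch` with +0.0 replaces the stored value, which matches the swap behavior the commit message describes for the hardware atomics.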
Satyanvesh Dittakavi 6bd51db0b1 SWDEV-475185 - Handle device id for hipStreamLegacy
Change-Id: Ib56e6edb77a923f3f9738df64cb9d9ef0b4ba564


[ROCm/clr commit: aa6d07518f]
2024-08-12 09:59:17 -04:00
Ajay 3d22e51806 SWDEV-471863 - APU: device allocation greater than invisible memory
Change-Id: I37f1769873ac7dcbb3cfa51fd815ee1e2123aeae


[ROCm/clr commit: ec0971dd08]
2024-08-09 14:29:18 -04:00
Rahul Manocha 436271e407 SWDEV-468039 - FP8 host only conversion support on mi200
Change-Id: I0891f42d1b7c0d94d099fe26df5db3eff64ba564


[ROCm/clr commit: 39bbc0341d]
2024-08-07 20:51:00 -04:00
Jaydeep Patel c1f83df84c SWDEV-474937 - Fix race condition between main and work thread on windows.
Change-Id: I4d6b9de41d0e5a39094eb3babe47dffde72e0587


[ROCm/clr commit: 912de7ab44]
2024-08-07 14:29:14 -04:00
Alex Xie a381538161 SWDEV-444098 remove "rocm-ocl-icd" package
This is the first step to remove rocm-ocl-icd.
We don't build amd icd after this commit.
We still need to remove header files usage in future steps.

Change-Id: Ic4ac5476180f9ef2ce87b62891c08b28d6c9bfd2


[ROCm/clr commit: 5f775b8b7f]
2024-08-07 11:29:41 -04:00
Jaydeep Patel 12eea11370 SWDEV-457316 - Release graph exec before stream gets deleted.
Release the graph exec after the wait completes and before deleting the hip::stream object
during stream destroy.

Change-Id: I1d68aa8d844f7d3af330c6d09c44af07f8553551


[ROCm/clr commit: 8e80429b87]
2024-08-06 00:39:37 -04:00
Jaydeep Patel 82474ca1db SWDEV-465220 - Validate stream on which Kernel is planned to be launched.
Change-Id: I34c679bd888c275584c11ad3e8346d4d542976f9


[ROCm/clr commit: b0047d690a]
2024-08-06 00:31:22 -04:00
Jaydeep Patel 2aafd5a30c SWDEV-457316 - Multiple graph exec can be for given stream.
Change-Id: I0f1b184eb63e0432119d62f094637d375a3d4e55


[ROCm/clr commit: d954eb64db]
2024-08-06 00:31:04 -04:00
Jaydeep Patel c51153f759 SWDEV-470886 - Add maybe_undef attribute for shfl device function due to not all lanes of wave define var and compiler needs to know about this.
Change-Id: I3a683887e033305ac55362f356838b491a6d50f2


[ROCm/clr commit: 6344ddb2f3]
2024-08-05 00:53:13 -04:00
German Andryeyev 35c7a87014 SWDEV-470612 - Add the optimized multistream path
- Added the optimized multistream path in graph execution. It uses a fixed number of async streams in the execution
- Optimize the launch latency, where command
creation and execution are done at the same time
- Optimize the scheduling to use fewer barriers and waiting signals if
the same queue can be detected
- The new path is controlled by the DEBUG_HIP_FORCE_GRAPH_QUEUES
environment variable, where 0 will use the original path and any other
value will force the number of asynchronous queues for execution
- DEBUG_HIP_FORCE_ASYNC_QUEUE can force single queue async
execution in graphs (applicable to Navi families only)

Change-Id: I7eb40bc15c45f508d6911868a6f6d4c3598d380e


[ROCm/clr commit: 9db52f9a46]
2024-08-02 14:19:44 -04:00
Anusha GodavarthySurya 31927fefd6 SWDEV-468424 - Add support to capture multiple AQL Packets
=> Added support to capture multiple AQL Packets.
=> Added Interface to callback to hip runtime from rocclr to allocate
kernel args from the graph kernel arg pool.
=> Enabled Support to capture memset node.

Change-Id: I7e1c2ba06927459e024653058af142bd82192c43


[ROCm/clr commit: bd3a35bde1]
2024-08-01 23:55:51 -04:00
Julia Jiang 6fa0563ed6 SWDEV-436608 - Un-deprecate hipHostAlloc()
Change-Id: I71393e6f536ba84b4e172acf54ba4f72350e2ae8


[ROCm/clr commit: e988e5e448]
2024-08-01 11:45:20 -04:00
Marko Arandjelovic 7cd2515908 SWDEV-465204 - Fix hipModuleLaunchKernel data validation
Change-Id: I129f265a5eb79d0a13da4f12e78e06ba307b17ee


[ROCm/clr commit: be5f097e8e]
2024-08-01 05:09:23 -04:00
Vladana Stojiljkovic 91800f18ea SWDEV-465142 - Copy memAllocNodePtrs_ when cloning graph
Change-Id: I5a0907e59397e71b44db59c44b551b74a6e59ba0


[ROCm/clr commit: d62c1dea72]
2024-08-01 11:05:06 +02:00
Vladana Stojiljkovic 6fcb2c655f SWDEV-475127 - Check if hipBindTextureToArray parameters are null before dereferencing them
Change-Id: Id0173faff0a385d1665194c9033083ef9b2c48b5


[ROCm/clr commit: d7b07b94a0]
2024-08-01 05:01:55 -04:00
Ioannis Assiouras 42e8d3c894 SWDEV-476460 - Fix for a race condition in SysmemPool::Alloc
Change-Id: Ia94709e68b236c9460589963c0f09ec1f481c306


[ROCm/clr commit: 8e137e8702]
2024-08-01 04:22:26 -04:00
Marko Arandjelovic 92ddb8e242 SWDEV-475114 - Prevent segfaults in hipBindTexture
Change-Id: I050f36a5c74a5d4542155040ccce043fee6b73ad


[ROCm/clr commit: b3153a5f41]
2024-07-31 16:57:57 -04:00
Marko Arandjelovic 662ba4701c SWDEV-461791 make memcpy synchronous for D2D if src&dst ptrs have SYNC_MEMOPS attribute
Change-Id: I603081d21e5eb3c73111845e350d8fa2ba5a7733


[ROCm/clr commit: 7d0ff387e9]
2024-07-30 11:46:55 -04:00
Ioannis Assiouras 19d16561a4 SWDEV-472309 - Ensure static maps are destroyed after __hipUnregisterFatBinary
hipDeviceSynchronize called from __hipUnregisterFatBinary
accesses static maps and monitors. This change ensures these objects
are not destroyed before __hipUnregisterFatBinary is called.
Additionally, it disables the teardown process for static builds.

Change-Id: I46b58641d60efcf6637a8e99cdd786ffe9e2c77d


[ROCm/clr commit: 9b33db9b24]
2024-07-30 10:26:59 -04:00
Sourabh Betigeri f518648183 SWDEV-365151 - Fix the fns32 to do 32 bit computation and adds a wrapper to ease porting from CUDA
Change-Id: I0b5a9ca11c98f8c1c40cfba7f4e057bfda2d756e


[ROCm/clr commit: 7298b80112]
2024-07-26 11:20:14 -04:00
Saleel Kudchadker abfe135e4f SWDEV-475341 - Fix stream resolution for graphs launches
This issue was caused by incorrect usage of the getStream call.
If we get the null stream first, typecast it, and call
getStream again, we lose the advantage of simply passing "nullptr" to
indicate the NULL stream. As a result, we enter the waitActiveStream call and add
barriers to sync across streams.

Change-Id: I94dc4e3ec927295b9e1ab6dee4b37d7d3e00b0cc


[ROCm/clr commit: cda4b7db1c]
2024-07-25 19:38:23 -04:00
Sourabh Betigeri 5b0cc86295 SWDEV-475394 - Fix for the return type to be in-line with CUDA
Change-Id: I7c833571d47b4e86a86e4a0095b61947d16ecab6


[ROCm/clr commit: 4fbd7abbb2]
2024-07-25 16:33:33 -04:00
Saleel Kudchadker 16920809d7 SWDEV-301667 - Refactor Blit force env var
Change-Id: I5344ac2e6442cd8f526118e688f1b1412cc5b45a


[ROCm/clr commit: d379f4efd0]
2024-07-25 15:15:10 -04:00
Rahul Manocha de67a2a1dc SWDEV-468039 - FP8 OCP headers
Change-Id: Iecd32c5a0357781da07395d32f894415954b7b22


[ROCm/clr commit: 353f15afa6]
2024-07-25 12:42:23 -04:00