Commit Graph

215 Commits

Author SHA1 Message Date
Branislav Brzak 89dfdc4dbe SWDEV-490860 - Do signal_is_required detection post graph schedule
Change-Id: Iaf1067a811aeac3d16c08de954036e219b545e07
2024-12-09 03:57:44 -05:00
Jaydeep Patel 0d4823ff88 SWDEV-502532 - Exit graph launch in case of empty graph.
Change-Id: Ifb6ab14ca6810cbc1c9e38c59d1d9e7d367358d9
2024-12-07 12:27:53 -05:00
Anusha GodavarthySurya b89977d518 SWDEV-469422 - Always schedule graph nodes
Change-Id: Icc636527fa19e7bf3eb111bc4b1bb9a5f9acff73
2024-12-03 23:44:23 -05:00
Marko Arandjelovic 08aee16573 SWDEV-499794 - Update AQL packet after updating GraphNode
Change-Id: I332d70bdf42a276894a548a02d636e370c2ca08c
2024-12-02 12:29:35 -05:00
Anusha GodavarthySurya c47f9dda58 SWDEV-469422 - Clean up graph code: remove parallellists and nodewaitlists
Change-Id: I00c7b2894333bd13d47b913d3fcdd6e1ffcb741f
2024-11-30 04:40:51 -05:00
Vladana Stojiljkovic b75b0d9a53 SWDEV-494612 - Add capture support for hipLaunchCooperativeKernel
Change-Id: I6b3c6af55c60cffd43ce6f47b75998f750b75703
2024-11-29 08:17:41 -05:00
Rahul Manocha e0c11624e5 SWDEV-497288 - Enable hipGraphExecSetParams for Ext SemWait and SemSignal Nodes
Change-Id: I7184a3a04ac17d3d841222ae1559db66d73a429c
2024-11-26 11:34:18 -05:00
Anusha GodavarthySurya 25a893658a SWDEV-491643 - Enable capture path when DEBUG_HIP_FORCE_GRAPH_QUEUES is 1
Change-Id: Ibddd50592232b090bf5eab8395fe78a36bb3a14a
2024-11-25 05:21:10 -05:00
Satyanvesh Dittakavi ba2ebb3b99 SWDEV-489570 - Update AQL packet in hipDrvGraphExecMemsetNodeSetParams
After setting the new params in hipDrvGraphExecMemsetNodeSetParams, we
need to update the AQL packet as well, otherwise during the graph launch
it still dispatches the packet which has the original params and not the
updated one.

Change-Id: Ie49a641ba3f66c8085a29f92d88ac6ea6a1c0534
2024-11-01 07:01:10 -04:00
Vladana Stojiljkovic e08df57502 SWDEV-493526 - Create kernel node when hipLaunchByPtr is captured
Change-Id: Id3493485dfdb468436ab33e6d7cb19b6b0066fd4
2024-10-31 12:41:31 -04:00
Vladana Stojiljkovic ec60bb1aed SWDEV-489571 - Fix ihipGraphAddMemsetNode to allow memset of 3d portions of an array
* When hipMemset3dAsync is captured, a 3D extent (depth > 1) can be set as a parameter. That worked on NVIDIA, but on AMD the wrong portion of the array was filled: when creating the Memset3D command, the extent dimensions were used to create pitchedPtr instead of the original array width and height.
* Also, when capturing hipMemset3dAsync, NVIDIA allows any of the extent dimensions to be 0, and in that case no work should be done.

Change-Id: I46a605bf9ae801cd3348e98d528c21263a8eefce
2024-10-31 10:29:54 -04:00
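The pitch issue described in ec60bb1aed can be illustrated with a small host-side sketch. It is not the runtime's code: `ArrayDims`, `Extent3D` and `memset3dSketch` are hypothetical names, and a plain byte vector stands in for the array. The point it shows is that row and slice strides come from the array's own width and height, while the extent only bounds how much is written, and that a zero in any extent dimension means no work.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-ins for illustration; these are not HIP runtime types.
struct ArrayDims { std::size_t width, height, depth; };  // width in bytes
struct Extent3D  { std::size_t width, height, depth; };  // width in bytes

// Fill a sub-box that starts at the array origin and return the bytes written.
// Strides are derived from the ARRAY dimensions, not from the extent.
std::size_t memset3dSketch(std::vector<unsigned char>& array, const ArrayDims& dims,
                           const Extent3D& extent, unsigned char value) {
  // A zero in any extent dimension means there is nothing to do.
  if (extent.width == 0 || extent.height == 0 || extent.depth == 0) return 0;
  assert(extent.width <= dims.width && extent.height <= dims.height &&
         extent.depth <= dims.depth);
  assert(array.size() >= dims.width * dims.height * dims.depth);

  const std::size_t rowPitch = dims.width;                  // bytes per array row
  const std::size_t slicePitch = dims.width * dims.height;  // bytes per 2D slice
  std::size_t written = 0;
  for (std::size_t z = 0; z < extent.depth; ++z) {
    for (std::size_t y = 0; y < extent.height; ++y) {
      std::memset(array.data() + z * slicePitch + y * rowPitch, value, extent.width);
      written += extent.width;
    }
  }
  return written;
}
```

Had `rowPitch` and `slicePitch` been computed from the extent, the second row and second slice would land at the wrong offsets, which matches the symptom the commit message describes.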
Anusha GodavarthySurya f9f995c6d0 SWDEV-480209 - Handle GraphExec object release
=> If a GraphExec instance is destroyed before an async launch completes,
defer the destroy until all pending graph launches have finished
=> Remove the GraphExec destroy at the next sync point (hipStreamSync,
hipDeviceSync, etc.)

Change-Id: I4df682aae5787fd6e5240a7be936ce50361345d0
2024-10-22 12:30:46 -04:00
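One way to picture the deferred release described in f9f995c6d0 is a pending-launch counter. The sketch below is an assumption about the general technique, not the runtime's implementation: `ExecLifetimeSketch` and its methods are hypothetical, and it is single-threaded for brevity (a real runtime would need atomics or a lock).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical lifetime tracker for a graph exec object (illustration only).
class ExecLifetimeSketch {
 public:
  // Called when a launch is enqueued.
  void launchBegin() { ++pending_; }

  // Called when a launch completes. Returns true if the object should be
  // freed now, i.e. destroy was requested and this was the last pending launch.
  bool launchEnd() {
    assert(pending_ > 0);
    --pending_;
    return destroyRequested_ && pending_ == 0;
  }

  // Called when the user destroys the object. Returns true if it can be
  // freed immediately because no launch is in flight.
  bool requestDestroy() {
    destroyRequested_ = true;
    return pending_ == 0;
  }

  std::size_t pending() const { return pending_; }

 private:
  std::size_t pending_ = 0;
  bool destroyRequested_ = false;
};
```

With this shape the release happens on the completion of the last launch rather than at a later synchronization call.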
Vladana Stojiljkovic 6f2bad3998 SWDEV-489823 - Fix hipStreamEndCapture leak when capture is invalidated
Change-Id: If8f5163d70e04d34a75fd0a7ba6c0a15ea59bb8b
2024-10-10 04:38:06 -04:00
Jaydeep Patel e74ac6f580 SWDEV-482692, SWDEV-485802, SWDEV-485489 - Handle refcounts owned by graph for user objects.
Change-Id: Ic739ab1ec5d3dc3143e3ae70f9591922bc0e3d9f
2024-10-08 03:44:44 -04:00
kjayapra-amd 12a39fbf22 SWDEV-480772 - Remove name variable from amd::Monitor class.
Change-Id: Ie2a4fa44f485786227230f8a892e090e718aa30e
2024-09-19 11:55:01 -04:00
Anusha GodavarthySurya e98179d924 SWDEV-477324 - Graph Capture memcpy D2D
Change-Id: Ifaa4d78854c03b3150233142df187c9bbf731cab
2024-08-28 23:36:51 -04:00
Anusha GodavarthySurya 19bf971134 SWDEV-470612 - Add stream id to DOT print when DEBUG_HIP_GRAPH_DOT_PRINT is enabled
Change-Id: Iec3630ba6fb2206925653ea939770bb9820d7c52
2024-08-21 00:37:41 -04:00
German Andryeyev 9db52f9a46 SWDEV-470612 - Add the optimized multistream path
- Added the optimized multistream path in graph execution. It uses a fixed number of async streams in the execution
- Optimize the launch latency by creating and executing commands
at the same time
- Optimize the scheduling to use fewer barriers and waiting signals if
the same queue can be detected
- The new path is controlled by the DEBUG_HIP_FORCE_GRAPH_QUEUES
environment variable, where 0 will use the original path and any other
value will force the number of asynchronous queues for execution
- DEBUG_HIP_FORCE_ASYNC_QUEUE can force single-queue async
execution in graphs (applicable to Navi families only)

Change-Id: I7eb40bc15c45f508d6911868a6f6d4c3598d380e
2024-08-02 14:19:44 -04:00
Anusha GodavarthySurya bd3a35bde1 SWDEV-468424 - Add support to capture multiple AQL Packets
=> Added support to capture multiple AQL Packets.
=> Added Interface to callback to hip runtime from rocclr to allocate
kernel args from the graph kernel arg pool.
=> Enabled Support to capture memset node.

Change-Id: I7e1c2ba06927459e024653058af142bd82192c43
2024-08-01 23:55:51 -04:00
Vladana Stojiljkovic d62c1dea72 SWDEV-465142 - Copy memAllocNodePtrs_ when cloning graph
Change-Id: I5a0907e59397e71b44db59c44b551b74a6e59ba0
2024-08-01 11:05:06 +02:00
Anusha GodavarthySurya 35079e834e SWDEV-468424 - Refactor kernel arg
To refactor childGraph so that it has its own graphExec,
kernelArgs need to be separated from the graphExec object.
All the childNodes that are part of a graph should share the same kernelArg pool.
Otherwise we end up creating multiple device kernel arg memory chunks
for a single graphExec.

Change-Id: I4029a46ebc1fa112d87df64ab1fecbf288fabe5e
2024-07-16 08:38:44 -04:00
sdashmiz 57e79802cd SWDEV-421021 - Add APIs cuMemcpyNodeGet/Set params
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: I24bc0da56aad34c9d5876a3d83b59515f11dc3ea
2024-07-10 12:53:01 -04:00
Jaydeep Patel cf0320a0b9 SWDEV-457316 - Return invalid val from add mem free node if corresponding mem alloc node is missing.
Change-Id: Ib0c346a439fc38ebfd106bcbdf75bd10bfd2f090
2024-07-04 13:09:46 -04:00
Anusha GodavarthySurya 9ad7e79e50 SWDEV-469331 - Fix issue of graph sync.
If a graph has multiple branches, an End command is enqueued on the launch stream, which
makes sure all the internal parallel streams are finished.

When a node is removed from the graph, indegree and outdegree are not updated correctly for its parent and child nodes,
so endNode ends up without dependencies on the parallel commands. This results in graph sync issues.

Change-Id: I33cc2f21220e1c017d88099b29b542e05b683f73
2024-06-28 02:11:44 -04:00
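The bookkeeping problem described in 9ad7e79e50 can be sketched with a minimal graph structure. All names here (`Node`, `addEdge`, `removeNode`, `leaves`) are hypothetical and do not come from the runtime. The sketch derives in-degree and out-degree from the edge lists, so removing a node must update both sides of every edge for the set of leaf nodes (those an end marker would depend on) to stay correct.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical minimal graph node (illustration only).
struct Node {
  std::vector<Node*> parents;   // incoming edges
  std::vector<Node*> children;  // outgoing edges
  std::size_t inDegree() const { return parents.size(); }
  std::size_t outDegree() const { return children.size(); }
};

void addEdge(Node& from, Node& to) {
  from.children.push_back(&to);
  to.parents.push_back(&from);
}

// Detach a node, fixing up the edge lists of its parents and children so
// that their degrees reflect the removal.
void removeNode(Node& node) {
  for (Node* p : node.parents) {
    auto& c = p->children;
    c.erase(std::remove(c.begin(), c.end(), &node), c.end());
  }
  for (Node* ch : node.children) {
    auto& ps = ch->parents;
    ps.erase(std::remove(ps.begin(), ps.end(), &node), ps.end());
  }
  node.parents.clear();
  node.children.clear();
}

// Nodes with out-degree 0: the ones a final end marker has to wait on.
std::vector<const Node*> leaves(const std::vector<Node*>& all) {
  std::vector<const Node*> out;
  for (const Node* n : all) {
    if (n->outDegree() == 0) out.push_back(n);
  }
  return out;
}
```

If `removeNode` skipped the parent update, a parent whose only child was removed would still report a non-zero out-degree and would be missing from `leaves`, which is the kind of missed dependency the commit message describes.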
Jaydeep Patel 5c77e30b18 SWDEV-457316 - Another graph can free a mem alloc node; return invalid val only if there is a double mem free node across all captured graphs.
Change-Id: Icf12164bf0ecd171a4673ff4f384528e7671f944
2024-06-12 00:44:50 -04:00
Jaydeep Patel ca3c2ac185 SWDEV-457316 - Some validations related to Graph Node.
Free node should be added in same graph and once.
Graph clone containing mem alloc/mem free node not supported.
Destroy mem alloc/mem free node is not supported if already added in graph.

Change-Id: I40459e66d7dd84f3b5298617990313b41458c804
2024-05-28 06:31:10 +00:00
Anusha GodavarthySurya 243dad92c9 SWDEV-461072 - Extend AQL Optimization for child graph nodes
Change-Id: I6baf906add7240b29ea653020a9a0b56206ee2a7
2024-05-28 06:31:10 +00:00
sdashmiz 627ccfa502 SWDEV-429053 - Add check for StreamLegacy
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: I402185a3b81935aaa1c8c4963407b8de21c72d8a
2024-05-28 06:31:10 +00:00
Vladana Stojiljkovic d543ab6a0b SWDEV-454681 - Add nullptr check for memsetParams in hipDrvGraphAddMemsetNode
Change-Id: Ife8784b23179c5613c29cde27dd2975cb729aaae
2024-05-28 06:31:10 +00:00
shadi e705e5e0d9 SWDEV-421027 - Add more Graph APIs
Signed-off-by: shadi <shadi.dashmiz@amd.com>
Change-Id: I0a1fc284e48317a49ca88d4ed4e3a10e752efd58
2024-05-28 06:28:17 +00:00
Anusha GodavarthySurya bf4d10ff61 SWDEV-460770 - Handle Graph Exec release
Handle the cases where a GraphExec instance is destroyed:
before an async launch completes,
after an async launch completes,
without a launch.

Change-Id: I45a7c82295fea916c7559bd8f796df710513aea1
2024-05-28 06:28:17 +00:00
Anusha GodavarthySurya de95625f09 SWDEV-454247 - Fix graph multi threading issue
Change-Id: I565889da6f7091030b7f6a2d6234b82c389358e3
2024-05-28 06:28:17 +00:00
shadi f2b01782ac SWDEV-420016 - Add more driver side graph APIs
Signed-off-by: shadi <shadi.dashmiz@amd.com>
Change-Id: Iff3ee7dcbcd24836f227fdc9bd5ff4b554ac914f
2024-04-25 12:50:43 -04:00
Rahul Manocha 880963346d [SWDEV-454661][SWDEV-454653] - GraphExecMemcpyNodeSetParam to return error on memcpy direction change
Change-Id: I2c8f5ea394caeaaa6895003e63cd62a052c491f8
2024-04-23 12:56:30 -04:00
Jaydeep Patel 12e0bdcd32 SWDEV-453535 - Capture hipMemset3DAsync.
Change-Id: I517c2557573db258b3e3e353f02f6a56652b0fde
2024-04-18 00:05:45 -04:00
Anusha GodavarthySurya ea4f09e8c0 SWDEV-452787 - correct hipDrvGraphAddMemcpyNode check
Change-Id: Id58f982edd4f17d675f7a0f61a9b4dea0baebd9b
2024-03-29 00:56:12 -04:00
Anusha GodavarthySurya 19b4660cbb SWDEV-443567 - SWDEV-436126 - Fix Prohibited and Unhandled Operations during capture
=> hipDeviceSynchronize is not allowed during capture.
=> hipEventSynchronize during capture should return the hipErrorCapturedEvent error.
=> hipEventQuery during capture should return the hipErrorCapturedEvent error.
For hipStreamSynchronize, hipEventSynchronize, hipStreamWaitEvent and hipStreamQuery:
For a side stream (a stream that is not currently under capture):
=> If the current thread is capturing in relaxed mode, calls are allowed.
=> If any stream in the current or a concurrent thread is capturing in global mode, calls are not allowed.
=> If any stream in the current thread is capturing in ThreadLocal mode, calls are not allowed.
For a stream that is currently under capture:
=> Calls are not allowed.
=> Any call that is not allowed during capture invalidates the capture sequence.
=> It is invalid to call synchronous APIs during capture. Synchronous APIs,
such as hipMemcpy(), enqueue work to the legacy stream and synchronize it before returning.

Change-Id: I201c6e63e1a5d93fd416a3b520264c0fdbe31237
2024-03-28 22:10:31 -04:00
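The side-stream rules listed in 19b4660cbb can be written down as a small decision function. This is a sketch under stated assumptions, not the runtime's check: `CaptureMode`, `CaptureState` and `syncCallAllowed` are hypothetical names, and the precedence (a global-mode capture on any thread wins over a relaxed capture on the current thread) is an assumption, since the commit message does not spell out that combination.

```cpp
#include <cassert>

// Hypothetical capture modes, mirroring the three modes named in the commit message.
enum class CaptureMode { None, Global, ThreadLocal, Relaxed };

// Hypothetical snapshot of capture state at the time of the call.
struct CaptureState {
  bool targetStreamCapturing;     // the stream passed to the call is under capture
  CaptureMode currentThreadMode;  // mode of a capture in progress on this thread, if any
  bool otherThreadGlobalCapture;  // a concurrent thread has a global-mode capture
};

// Returns true if a synchronizing or querying call may proceed.
bool syncCallAllowed(const CaptureState& s) {
  if (s.targetStreamCapturing) return false;  // never allowed on a capturing stream
  if (s.otherThreadGlobalCapture) return false;
  if (s.currentThreadMode == CaptureMode::Global) return false;
  if (s.currentThreadMode == CaptureMode::ThreadLocal) return false;
  return true;  // relaxed mode on this thread, or no capture at all
}
```

A caller that gets `false` here would, per the commit message, also invalidate the capture sequence; that side effect is left out of the sketch.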
Jaydeep Patel 0be92b8f09 SWDEV-452299 - Pass dst pitch while capturing hipMemcpyParam2DAsync & elementSize should be 1 as width is in bytes while capturing hipMemset2DAsync.
Change-Id: I8f9122a30cba0a07c097dfd7609432090caab142
2024-03-21 12:49:34 -04:00
Anusha GodavarthySurya 4feb1f9337 SWDEV-448586 - Added implementation for new API hipStreamBeginCaptureToGraph
Change-Id: I1ce802102cef2b66c92d3375f769983841de793f
2024-03-07 05:24:49 +00:00
Rahul Manocha 2c0fa829b4 [SWDEV-448077][SWDEV-448067] - Changes to Kernel Node Attribute related APIs
Change-Id: Ibbf773fd5f134a62b7ce04f6956b10c1086b1782
2024-02-28 18:14:41 -05:00
Anusha GodavarthySurya 4b4ec7fc52 SWDEV-445981 - Handle hipGraphExecUpdate to update graph kernel node params with graph performance optimizations
Change-Id: I3b05c6bfc83404152bcae9b31cfdf56af7cc61a4
2024-02-27 15:20:23 -05:00
Anusha GodavarthySurya 7d09e1abed SWDEV-444767 - Fix graph tests for context change between Inst & launch with DEBUG_CLR_GRAPH_PACKET_CAPTURE
When a graph is instantiated on device 0 and launched on device 1, switch to command creation and enqueue during launch.

Change-Id: Ied34dc99b2a776130d1354ed3830c6ccab9068e4
2024-02-14 17:02:36 +00:00
Anusha GodavarthySurya d6bc40e822 SWDEV-445084 - Add DEBUG_CLR_GRAPH_PACKET_CAPTURE support for hipGraphInstantiateWithFlags/Params
Change-Id: I5096b4c8d73d1faf972dfd23ab86a53d888946c4
2024-02-08 04:55:53 -05:00
Rahul Manocha f964975db0 SWDEV-421025 - Graph Instantiate with Params API Update
Change-Id: I3ed821ced02420858d360e8dab5e1e931c350c7e
2024-02-07 11:35:21 -05:00
Anusha GodavarthySurya a1b2cbe44e SWDEV-439637 - Updated to compile with clang compiler
Change-Id: Ib0a8e1cc007f083fb1d1f4363cf89ba76ad3c4f2
2024-02-06 23:57:13 -05:00
Rahul Manocha 1a3901fa49 SWDEV-421025 - hipGraphInstantiateWithParams API changes
Change-Id: Ib07d4dd1698220b68ed27f91d58d3bd315a8804c
2024-02-05 05:08:11 +00:00
Anusha GodavarthySurya e9957151f3 SWDEV-439628 - hipGraphExecKernelNodeSetParams to update graph kernel node params with graph performance optimizations.
During hipGraphExecKernelNodeSetParams, the kernel function can also be updated.
Hence the size required for kernel parameters may differ from what was allocated during graph instantiation.
So, create a new 128KB kernel pool and allocate kernel args from that pool.
If the pool is full, create another 128KB pool. Release the kernel pools when the graph exec object is destroyed.

Change-Id: I9567946d63400c79cbfd4c5439c654c92557ceae
2024-02-05 05:08:11 +00:00
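The pooling scheme described in e9957151f3 resembles a chained bump allocator. The sketch below is a host-memory illustration under assumptions: `KernArgPoolSketch` is a hypothetical class, host buffers stand in for device kernel arg memory, and the 16-byte default alignment (relative to the start of each pool) is chosen for the example. Only the 128KB pool size and the "new pool when full, release all on destroy" behavior come from the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical chained bump allocator for kernel arguments (illustration only).
class KernArgPoolSketch {
 public:
  static constexpr std::size_t kPoolSize = 128 * 1024;  // 128KB, per the commit message

  // Returns `size` bytes at an offset aligned to `align` (a power of two),
  // or nullptr if the request is empty or cannot fit in a single pool.
  void* allocate(std::size_t size, std::size_t align = 16) {
    assert(align != 0 && (align & (align - 1)) == 0);
    if (size == 0 || size > kPoolSize) return nullptr;
    if (!pools_.empty()) {
      const std::size_t start = (offset_ + align - 1) & ~(align - 1);
      if (start + size <= kPoolSize) {
        offset_ = start + size;
        return pools_.back().get() + start;
      }
    }
    // Current pool is missing or full: chain a fresh 128KB pool.
    pools_.push_back(std::make_unique<unsigned char[]>(kPoolSize));
    offset_ = size;
    return pools_.back().get();
  }

  std::size_t poolCount() const { return pools_.size(); }

 private:
  // All pools are released together when this object is destroyed.
  std::vector<std::unique_ptr<unsigned char[]>> pools_;
  std::size_t offset_ = 0;  // bump offset within the newest pool
};
```

Tying the pools' lifetime to the owning object means individual kernel arg allocations never need to be freed one by one, at the cost of some unused space at the tail of each full pool.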
Anusha GodavarthySurya 0a055f874b SWDEV-422207 - Added debug env to dump graph during Instantiation
Change-Id: Ibde2ae5b8d240f3986bcd168facc513a319c0f17
2024-02-05 05:08:11 +00:00
Rahul Manocha f40c380cdb SWDEV-421025 - Graph Kernel Node priority Attribute Set/Get
Change-Id: I5c422728aa694c8dabb5cf9bade441101512a249
2024-01-17 12:44:35 -05:00
sdashmiz d23835ffbe SWDEV-421027 - Add hipGraphAddNode
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: Ic8cf293ff483ee2547b52d2975062bcb9a6f5d17
2024-01-12 11:36:30 -05:00