Граф коммитов

82 Коммитов

Автор SHA1 Сообщение Дата
Saleel Kudchadker 72d23a02c5 SWDEV-301667 - Better log
- Print kernelname for graph launches, its hard to correlate packets
otherwise
- Print correlation_id if any

Change-Id: Ib8db7a00e4e7c98f570e71029e61d86f5dccc2ed
2024-05-28 06:31:10 +00:00
Saleel Kudchadker 1ba74c3ce3 SWDEV-451594 - Fix HDP reg readback
Change-Id: I478a968330f85c3b60ff39fb40bf3cd91acd610e
2024-05-28 06:31:10 +00:00
Saleel Kudchadker badf2b0880 SWDEV-301667 - Refactor graph code
- Remove Last graph node optimization and instead submit a barrier NOP
packet always. This simplifies the code.

Change-Id: Ied443173ba47a08b6df148ac7e3ead712acda11c
2024-05-28 06:28:17 +00:00
Anusha GodavarthySurya bf4d10ff61 SWDEV-460770 - Handle Graph Exec release
Handle GraphExec instance is destroyed before async launch completes
GraphExec instance is destroyed after async launch completes
GraphExec instance is destroyed without a launch

Change-Id: I45a7c82295fea916c7559bd8f796df710513aea1
2024-05-28 06:28:17 +00:00
Ioannis Assiouras 6cb7b6ec6b SWDEV-451594 - Change device kernel args to use HDP flush by default
The Readback and Avoid HDP Flush memory ordering workaround is
used as a fallback solution only when HDP flush register is invalid

Change-Id: Ic284eba1f95ed22b0270d3abeb904fb902015b1a
2024-05-02 19:35:13 +00:00
Ioannis Assiouras bf74ef4025 SWDEV-451594 - Implement Readback and Avoid HDP Flush workaround for device kernel args
Change-Id: I6d41a089a17f55306e7ff402588a1e831b20a7a7
2024-04-19 09:29:20 -04:00
Ioannis Assiouras 96f5c44851 SWDEV-451166 - Disable kernel args for non-XGMI if HDP flush register is invalid
Change-Id: I227e046e2b9cb25476a50240f5d070adbd558f21
2024-03-15 05:27:52 -04:00
Anusha GodavarthySurya e0e63eb04d SWDEV-447545 - Fix Enable/Disable node with hipGraph
Node can be enabled/disabled only for kernel, memcpy and memset nodes.
If the node is disabled it becomes empty node.
To maintain ordering just enqueue marker with respective node dependencies.

Change-Id: I710f3e88ab4e76c81f6f86a40a7dc61fd4c7e440
2024-02-28 17:34:03 -05:00
Sourabh Betigeri 3fdd46ae59 SWDEV-425640 - An instantiated graphExec should retain a copy of every reference in the source graph
Change-Id: Idf6b224449ca642af2860b33dc739f51a6248e4c
2024-02-28 12:04:53 -05:00
Anusha GodavarthySurya 2dc6ec68a5 SWDEV-444988 - Fix __amd_rocclr_initHeap sync with DEBUG_CLR_GRAPH_PACKET_CAPTURE
When kernel does device side malloc, initial heap is allocated with __amd_rocclr_initHeap.
During graph launch kernel __amd_rocclr_initHeap is enqueued followed by actual kernel . So kernel will execute after initHeap kernel.

But with graph optimizations during capture initHeap gets enqueued on device null stream and actual kernel on graph launch stream.
So no proper synchronization. Switch to command creation and enqueue during launch for kernel node with hidden heap.

Change-Id: Iaf600251faef9a448853f19429023c118aa760b9
2024-02-27 13:11:31 -05:00
Saleel Kudchadker f138e0d113 SWDEV-443760 - Enable device kern args
- Implement workaround to ensure HDP writes are done by writing and
reading the HDP MMIO register.
- Implement the same workaround for graphs, we no longer need sentinel
write/readback

Change-Id: I0d3027b46a1f61131ec62e3c8c669ff5184fa6b2
2024-02-20 02:03:14 -05:00
Saleel Kudchadker 81b8598af9 SWDEV-301667 - Cleanup code and better log
Change-Id: Ie2345264e84026156a9f81b421eed3cf4aeeeffc
2024-02-19 05:42:47 -05:00
Anusha GodavarthySurya 7d09e1abed SWDEV-444767 - Fix graph tests for context change between Inst & launch with DEBUG_CLR_GRAPH_PACKET_CAPTURE
When graph is Instantiate on device 0 graph and launch on device1 switch to command creation and enqueue during launch.

Change-Id: Ied34dc99b2a776130d1354ed3830c6ccab9068e4
2024-02-14 17:02:36 +00:00
Anusha GodavarthySurya 853abeb75e SWDEV-445013 - During CaptureAQLPackets correct sentinal value to copy integer size bytes
Read and write int bytes sentinal value to dev_ptr or PCIE connected devices at the tail end of the kernarg surface.

Change-Id: I993d552ac872b3cd56aef4746c4d1d92c58d38b4
2024-02-13 07:05:57 +00:00
Anusha GodavarthySurya d6bc40e822 SWDEV-445084 - Add DEBUG_CLR_GRAPH_PACKET_CAPTURE support for hipGraphInstantiateWithFlags/Params
Change-Id: I5096b4c8d73d1faf972dfd23ab86a53d888946c4
2024-02-08 04:55:53 -05:00
Anusha GodavarthySurya ca0b50c9ca SWDEV-444558 - SWDEV-444418 - Fix capturing of AQL packets when kernel arg size is 0
When graph doesn't have kernel nodes.

Change-Id: I6b3b476654d7eedc9ff0cec4b7269168aa115360
2024-02-08 06:12:16 +00:00
Anusha GodavarthySurya ae0368d12d SWDEV-422207 - Enable DEBUG_CLR_GRAPH_PACKET_CAPTURE environiment variable
Change-Id: I9bf72b9c1a56980352109bd4d42b54ecb2d1b8f9
2024-02-05 05:08:11 +00:00
Anusha GodavarthySurya e9957151f3 SWDEV-439628 - hipGraphExecKernelNodeSetParams to update graph kernel node params with graph performance optimizations.
During hipGraphExecKernelNodeSetParams kernel function can also be updated.
Hence size required for kernel parameters differs from what is allocated during graphInstantiation.
So, create new 128KB kernel pool and allocate kernel args from the pool.
If the pool is full create new 128KB pool. Release kernel pools when graph exec object is destroyed.

Change-Id: I9567946d63400c79cbfd4c5439c654c92557ceae
2024-02-05 05:08:11 +00:00
Anusha GodavarthySurya 2bb2446d8f SWDEV-422207 - Fix graph catch tests with graph optimizations(DEBUG_CLR_GRAPH_PACKET_CAPTURE enabled)
Change-Id: I16297e0ddde286bf1798c90f2bf846e69819010d
2023-12-14 01:27:08 -05:00
Saleel Kudchadker 058b2702db SWDEV-301667 - Logging refactor
- Remove newline from logging as log function internally inserts a new
line

Change-Id: I25eb2242a1f1e87cf811bcc373d1d485b2e027a8
2023-12-07 12:12:57 -05:00
Saleel Kudchadker b056686607 SWDEV-422207 - Report kernel names for activity profiling
- Report kernel names for optimized graph path
- Refactor code so that we store profiling info in Accumulate command

Change-Id: Ib97735a0239aeb9fc3a50a4bb7126dd0bcadc8af
2023-11-15 14:38:07 -05:00
Saleel Kudchadker c3bd229f4f SWDEV-422207 - Optimize graph end detection
- Do not use extra barrier to detect graph end. If its a kernel node we
can use a completion signal for the last packet. Saves roughly 6us for
Phantom testcase per graph launch.

Change-Id: I5e0c2479d9964fbeda86ed97533f6718f49a7f91
2023-11-10 11:57:02 -05:00
Saleel Kudchadker 9fdee05aee SWDEV-422207 - Workaround HDP register query bug
Change-Id: Ib886a3166b555fbd6b8e5a249f993f47afd00166
2023-11-08 12:12:15 -05:00
Saleel Kudchadker 40f41f4d0b SWDEV-422207 - Track commands for capture
- Track all captured commands under a new AccumulateCommand
- Add begin() and end() methods to capture commands
- Explicit TS object now passed to certain methods because
profilingBegin() and profilingEnd() now happen separately and thus can
run into threading issues

Change-Id: I171106bdcad72b057836cb2f3fc398db3533119f
2023-11-03 05:09:04 +00:00
sdashmiz 9b567e1799 SWDEV-417075 - add hipDrvAddMemCpyNode
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: Ie631d7b1788f10171a29d463759a3cba3b2b2007

SWDEV-417075 - add hipDrvGraphAddMemcpyNode

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: I6bab3310919643e119cd0004276907e223641cfb
2023-10-31 09:55:42 -04:00
Anusha GodavarthySurya 5fb7536586 SWDEV-422207 - Remove L2 flush when kernelArgs are in device memory
Change-Id: I7b5625cb6d55e83689bff7bbb45be9c517ec4a8d
2023-10-26 19:14:58 +00:00
Anusha GodavarthySurya 38d2c56784 SWDEV-422207 - Handle nonkernel nodes for graph opt
- Support graph with different types of nodes with single
branch when DEBUG_CLR_GRAPH_PACKET_CAPTURE flag is enabled

Change-Id: I149a8629769cd0d5849ffefb04f1352668a685b6
2023-10-24 18:36:06 +00:00
taosang2 5a0085e516 SWDEV-364236 - Fix layered Image issue
Fix wrong logic to get layer index;
Make layered image's layout match cuda spec;
Fix wrong comparision of element size.
Remove amd::BufferRect from ihipMemcpyAtoHCommand()
and ihipMemcpyHtoACommand().
Change-Id: Icc6a4233fbce2e9b2dc6feb79e6bfbd761684c7d
2023-10-19 16:06:20 -04:00
Anusha GodavarthySurya e63c280d4d SWDEV-422207 - Capture AQL Packets for graph Kernel nodes during graph Inst. And enqueue AQL packet during launch
Change-Id: I1e5f7f9e2a70bd500d190193cb6ba0867f5a63e7
2023-10-05 00:34:29 -04:00
Anusha GodavarthySurya 530dc6de2a SWDEV-301667 - Optimize performance when graph has single branch
Three for loops iterate over all graph nodes for UpdateStream, FillCommands and
EnqueueCommands has performance drop for large graphs.

Change-Id: I077accf3a4680d5d944b73200fd6498a7a48f25c
2023-09-07 23:35:36 -04:00
Saleel Kudchadker e1e5d071ba SWDEV-301667 - Port optimization to save extra packet to graphs
Change-Id: Ibaf64a4efe070c42620e6e153c1862a4a0b15664
2023-08-23 16:58:21 -04:00
Anusha GodavarthySurya f76a40c26d SWDEV-415772, SWDEV-414682 - Fix childgraph node execution
Change-Id: If9ffc08d98a57b8daa5f131f72ef1bf2317f29e1
2023-08-18 00:45:00 -04:00
Anusha GodavarthySurya fd97dde1e6 SWDEV-407568 - Move graph implementation to hip namespace
Change-Id: I7023f202a7e3eb25b17db6d3e361205594ae81a5
2023-07-26 06:52:45 +00:00
Anusha GodavarthySurya b0e6f99ad7 SWDEV-392732 - Initial commit for graph doorbell optimization(AQL Buffering)
Change-Id: I451725006c54c249dc530c55d2af2a31594bf49b
2023-07-16 07:56:00 -04:00
Jaydeep Patel a8164d3e12 SWDEV-401781 - Auto Clean removes from map so check before remove while submit mem alloc node command.
Change-Id: Id004f75b307c2c769dee556c3d18e781830bcae1
2023-06-29 02:29:01 -04:00
sdashmiz 8578da8a3d SWDEV-367877 - Detect cycle in graph
- detect cycle when graph is instantiated

- remove level calculation from add/remove node

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: I3f7432f91f70aec8e4fd866b2766256f8a9a0cfe

graph-cycle-corrections

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: I8a3cec9a5a503aac6ea1e85ff3dd2b972790fb1d
2023-05-18 09:44:39 -04:00
German 04b696abee SWDEV-353281 - VM support in mempool for graphs
The change enables VM support in graphs on Windows. That allows
to avoid caching of all allocations at the cost of map/unmap
overhead during memory create/destroy.

Change-Id: I792be00fba099e5e5d3cd44a963e1dfd6976a86d
2023-05-05 15:31:26 -04:00
German 1e88d2c52f SWDEV-380703 - sync all streams individually
Avoid syncing blocking streams with the default stream,
since that introduces extra command dependencies and
doesn't allow to destroy memory after last submission

Change-Id: I618e9bd2091c4cf9157125612d8c4759030c5a80
2023-04-05 16:37:49 +00:00
Ioannis Assiouras 9c04e21b68 SWDEV-388661 - Fixed regression in hipMemCpyParam3D when offset is applied
Change-Id: I31273d643aac05f394f505235734c7f098497051
2023-04-05 10:54:34 +00:00
sdashmiz d2ea4d3dd4 SWDEV-385255 - fixes for graph exec with mem pools
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: I6f127090904a6c7985992bfaaab928d8b03013f5
2023-04-05 10:52:04 +00:00
Sourabh Betigeri ec8ab9b29c SWDEV-385089 - Fixes hipGraphLaunch functionality accounting for the existence of MemFree node or already freed memory when the same graph is launched multiple times
Change-Id: I49beb49ad4e6db4a2dd5b8c8cc8ed11ff0e4e132
2023-03-09 21:10:13 -05:00
Saleel Kudchadker dfefc97178 SWDEV-384658 - Optimize D2D memcpy
- Intra device memcpy does not need to perform host side synchronization
- Check alloc flags when determining memory type

Change-Id: Ieff28bd8d62756ffe82905354c4a91e9717e6bd4
2023-03-09 04:47:11 -05:00
Saleel Kudchadker 8028d327e9 SWDEV-345213 - Use the right accessor
- Use correct accessor to fetch memory objects. This checks the svm map
and arena maps

Change-Id: I84515330bb530cfe2b39abf30e1e659938f06806
2023-03-06 19:35:40 -05:00
Ioannis Assiouras 06927fd3c1 SWDEV-381402 - Remove unused getNullStream() from device. Make stream destructor private.
Change-Id: Idde30a8bfe97a525bd9f9fb50698a5cb14b798fc
2023-02-24 10:42:46 +00:00
Ioannis Assiouras e3633dc8f4 SWDEV-381402 - Derive hip::Stream from amd::HostQueue
Change-Id: I6c1aca5eb350c32d974ae4ffcc725705355956d8
2023-02-21 18:12:03 -05:00
sdashmiz 5c5b220561 SWDEV-374378 - correct setparam for memcpy node
- params should be valid when used for default flag since we support
  unified virtual address space

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: I75d40e437b12ee58e72e423bb4818b484ce35b66
2023-02-09 09:39:20 -05:00
German Andryeyev 0fe03b6494 SWDEV-353281 - Switch Graph to operate with stream
MemPool was designed to use hip::Stream, but graph implementation uses
amd::HostQueue. Hence switch graph to hip::Stream management.

Change-Id: Ia319389de45e4c3c6043d17473279a6f27a13140
2023-02-06 10:11:55 -05:00
Jaydeep Patel f9e27bcdd4 SWDEV-377804 - Initial commit to support hipGraphInstantiateFlagAutoFreeOnLaunch
Change-Id: I7a35becb6c98a6ff70264e141317d98be7457a37
2023-02-01 11:51:39 -05:00
German Andryeyev eef47ca24a SWDEV-353281 - Initial support of memalloc in graph
Add memory allocation support in graph. Current implementation uses
cache from mempool  to hold the allocations which belong to the graph.
Also the resource tracking is disabled at this moment because mempool
operates with hip::Stream objects, but graph has execution with
amd::HostQueue objects.

Change-Id: I54fe3250126d24f5a26ada975f37d429bb4ef17b
2023-01-13 13:06:59 -05:00
Anusha GodavarthySurya 16b31b0c54 SWDEV-325711 - Correct formatting
Change-Id: Ie26159e0bb3315cf7c3de1eb682f23ef343df0f2
2022-11-30 05:15:01 +00:00