Grafik Komit

74 Melakukan

Penulis SHA1 Pesan Tanggal
Sourabh Betigeri 3fdd46ae59 SWDEV-425640 - An instantiated graphExec should retain a copy of every reference in the source graph
Change-Id: Idf6b224449ca642af2860b33dc739f51a6248e4c
2024-02-28 12:04:53 -05:00
Anusha GodavarthySurya 2dc6ec68a5 SWDEV-444988 - Fix __amd_rocclr_initHeap sync with DEBUG_CLR_GRAPH_PACKET_CAPTURE
When kernel does device side malloc, initial heap is allocated with __amd_rocclr_initHeap.
During graph launch kernel __amd_rocclr_initHeap is enqueued followed by actual kernel . So kernel will execute after initHeap kernel.

But with graph optimizations during capture initHeap gets enqueued on device null stream and actual kernel on graph launch stream.
So no proper synchronization. Switch to command creation and enqueue during launch for kernel node with hidden heap.

Change-Id: Iaf600251faef9a448853f19429023c118aa760b9
2024-02-27 13:11:31 -05:00
Saleel Kudchadker f138e0d113 SWDEV-443760 - Enable device kern args
- Implement workaround to ensure HDP writes are done by writing and
reading the HDP MMIO register.
- Implement the same workaround for graphs, we no longer need sentinel
write/readback

Change-Id: I0d3027b46a1f61131ec62e3c8c669ff5184fa6b2
2024-02-20 02:03:14 -05:00
Saleel Kudchadker 81b8598af9 SWDEV-301667 - Cleanup code and better log
Change-Id: Ie2345264e84026156a9f81b421eed3cf4aeeeffc
2024-02-19 05:42:47 -05:00
Anusha GodavarthySurya 7d09e1abed SWDEV-444767 - Fix graph tests for context change between Inst & launch with DEBUG_CLR_GRAPH_PACKET_CAPTURE
When graph is Instantiate on device 0 graph and launch on device1 switch to command creation and enqueue during launch.

Change-Id: Ied34dc99b2a776130d1354ed3830c6ccab9068e4
2024-02-14 17:02:36 +00:00
Anusha GodavarthySurya 853abeb75e SWDEV-445013 - During CaptureAQLPackets correct sentinal value to copy integer size bytes
Read and write int bytes sentinal value to dev_ptr or PCIE connected devices at the tail end of the kernarg surface.

Change-Id: I993d552ac872b3cd56aef4746c4d1d92c58d38b4
2024-02-13 07:05:57 +00:00
Anusha GodavarthySurya d6bc40e822 SWDEV-445084 - Add DEBUG_CLR_GRAPH_PACKET_CAPTURE support for hipGraphInstantiateWithFlags/Params
Change-Id: I5096b4c8d73d1faf972dfd23ab86a53d888946c4
2024-02-08 04:55:53 -05:00
Anusha GodavarthySurya ca0b50c9ca SWDEV-444558 - SWDEV-444418 - Fix capturing of AQL packets when kernel arg size is 0
When graph doesn't have kernel nodes.

Change-Id: I6b3b476654d7eedc9ff0cec4b7269168aa115360
2024-02-08 06:12:16 +00:00
Anusha GodavarthySurya ae0368d12d SWDEV-422207 - Enable DEBUG_CLR_GRAPH_PACKET_CAPTURE environiment variable
Change-Id: I9bf72b9c1a56980352109bd4d42b54ecb2d1b8f9
2024-02-05 05:08:11 +00:00
Anusha GodavarthySurya e9957151f3 SWDEV-439628 - hipGraphExecKernelNodeSetParams to update graph kernel node params with graph performance optimizations.
During hipGraphExecKernelNodeSetParams kernel function can also be updated.
Hence size required for kernel parameters differs from what is allocated during graphInstantiation.
So, create new 128KB kernel pool and allocate kernel args from the pool.
If the pool is full create new 128KB pool. Release kernel pools when graph exec object is destroyed.

Change-Id: I9567946d63400c79cbfd4c5439c654c92557ceae
2024-02-05 05:08:11 +00:00
Anusha GodavarthySurya 2bb2446d8f SWDEV-422207 - Fix graph catch tests with graph optimizations(DEBUG_CLR_GRAPH_PACKET_CAPTURE enabled)
Change-Id: I16297e0ddde286bf1798c90f2bf846e69819010d
2023-12-14 01:27:08 -05:00
Saleel Kudchadker 058b2702db SWDEV-301667 - Logging refactor
- Remove newline from logging as log function internally inserts a new
line

Change-Id: I25eb2242a1f1e87cf811bcc373d1d485b2e027a8
2023-12-07 12:12:57 -05:00
Saleel Kudchadker b056686607 SWDEV-422207 - Report kernel names for activity profiling
- Report kernel names for optimized graph path
- Refactor code so that we store profiling info in Accumulate command

Change-Id: Ib97735a0239aeb9fc3a50a4bb7126dd0bcadc8af
2023-11-15 14:38:07 -05:00
Saleel Kudchadker c3bd229f4f SWDEV-422207 - Optimize graph end detection
- Do not use extra barrier to detect graph end. If its a kernel node we
can use a completion signal for the last packet. Saves roughly 6us for
Phantom testcase per graph launch.

Change-Id: I5e0c2479d9964fbeda86ed97533f6718f49a7f91
2023-11-10 11:57:02 -05:00
Saleel Kudchadker 9fdee05aee SWDEV-422207 - Workaround HDP register query bug
Change-Id: Ib886a3166b555fbd6b8e5a249f993f47afd00166
2023-11-08 12:12:15 -05:00
Saleel Kudchadker 40f41f4d0b SWDEV-422207 - Track commands for capture
- Track all captured commands under a new AccumulateCommand
- Add begin() and end() methods to capture commands
- Explicit TS object now passed to certain methods because
profilingBegin() and profilingEnd() now happen separately and thus can
run into threading issues

Change-Id: I171106bdcad72b057836cb2f3fc398db3533119f
2023-11-03 05:09:04 +00:00
sdashmiz 9b567e1799 SWDEV-417075 - add hipDrvAddMemCpyNode
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: Ie631d7b1788f10171a29d463759a3cba3b2b2007

SWDEV-417075 - add hipDrvGraphAddMemcpyNode

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: I6bab3310919643e119cd0004276907e223641cfb
2023-10-31 09:55:42 -04:00
Anusha GodavarthySurya 5fb7536586 SWDEV-422207 - Remove L2 flush when kernelArgs are in device memory
Change-Id: I7b5625cb6d55e83689bff7bbb45be9c517ec4a8d
2023-10-26 19:14:58 +00:00
Anusha GodavarthySurya 38d2c56784 SWDEV-422207 - Handle nonkernel nodes for graph opt
- Support graph with different types of nodes with single
branch when DEBUG_CLR_GRAPH_PACKET_CAPTURE flag is enabled

Change-Id: I149a8629769cd0d5849ffefb04f1352668a685b6
2023-10-24 18:36:06 +00:00
taosang2 5a0085e516 SWDEV-364236 - Fix layered Image issue
Fix wrong logic to get layer index;
Make layered image's layout match cuda spec;
Fix wrong comparision of element size.
Remove amd::BufferRect from ihipMemcpyAtoHCommand()
and ihipMemcpyHtoACommand().
Change-Id: Icc6a4233fbce2e9b2dc6feb79e6bfbd761684c7d
2023-10-19 16:06:20 -04:00
Anusha GodavarthySurya e63c280d4d SWDEV-422207 - Capture AQL Packets for graph Kernel nodes during graph Inst. And enqueue AQL packet during launch
Change-Id: I1e5f7f9e2a70bd500d190193cb6ba0867f5a63e7
2023-10-05 00:34:29 -04:00
Anusha GodavarthySurya 530dc6de2a SWDEV-301667 - Optimize performance when graph has single branch
Three for loops iterate over all graph nodes for UpdateStream, FillCommands and
EnqueueCommands has performance drop for large graphs.

Change-Id: I077accf3a4680d5d944b73200fd6498a7a48f25c
2023-09-07 23:35:36 -04:00
Saleel Kudchadker e1e5d071ba SWDEV-301667 - Port optimization to save extra packet to graphs
Change-Id: Ibaf64a4efe070c42620e6e153c1862a4a0b15664
2023-08-23 16:58:21 -04:00
Anusha GodavarthySurya f76a40c26d SWDEV-415772, SWDEV-414682 - Fix childgraph node execution
Change-Id: If9ffc08d98a57b8daa5f131f72ef1bf2317f29e1
2023-08-18 00:45:00 -04:00
Anusha GodavarthySurya fd97dde1e6 SWDEV-407568 - Move graph implementation to hip namespace
Change-Id: I7023f202a7e3eb25b17db6d3e361205594ae81a5
2023-07-26 06:52:45 +00:00
Anusha GodavarthySurya b0e6f99ad7 SWDEV-392732 - Initial commit for graph doorbell optimization(AQL Buffering)
Change-Id: I451725006c54c249dc530c55d2af2a31594bf49b
2023-07-16 07:56:00 -04:00
Jaydeep Patel a8164d3e12 SWDEV-401781 - Auto Clean removes from map so check before remove while submit mem alloc node command.
Change-Id: Id004f75b307c2c769dee556c3d18e781830bcae1
2023-06-29 02:29:01 -04:00
sdashmiz 8578da8a3d SWDEV-367877 - Detect cycle in graph
- detect cycle when graph is instantiated

- remove level calculation from add/remove node

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: I3f7432f91f70aec8e4fd866b2766256f8a9a0cfe

graph-cycle-corrections

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: I8a3cec9a5a503aac6ea1e85ff3dd2b972790fb1d
2023-05-18 09:44:39 -04:00
German 04b696abee SWDEV-353281 - VM support in mempool for graphs
The change enables VM support in graphs on Windows. That allows
to avoid caching of all allocations at the cost of map/unmap
overhead during memory create/destroy.

Change-Id: I792be00fba099e5e5d3cd44a963e1dfd6976a86d
2023-05-05 15:31:26 -04:00
German 1e88d2c52f SWDEV-380703 - sync all streams individually
Avoid syncing blocking streams with the default stream,
since that introduces extra command dependencies and
doesn't allow to destroy memory after last submission

Change-Id: I618e9bd2091c4cf9157125612d8c4759030c5a80
2023-04-05 16:37:49 +00:00
Ioannis Assiouras 9c04e21b68 SWDEV-388661 - Fixed regression in hipMemCpyParam3D when offset is applied
Change-Id: I31273d643aac05f394f505235734c7f098497051
2023-04-05 10:54:34 +00:00
sdashmiz d2ea4d3dd4 SWDEV-385255 - fixes for graph exec with mem pools
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: I6f127090904a6c7985992bfaaab928d8b03013f5
2023-04-05 10:52:04 +00:00
Sourabh Betigeri ec8ab9b29c SWDEV-385089 - Fixes hipGraphLaunch functionality accounting for the existence of MemFree node or already freed memory when the same graph is launched multiple times
Change-Id: I49beb49ad4e6db4a2dd5b8c8cc8ed11ff0e4e132
2023-03-09 21:10:13 -05:00
Saleel Kudchadker dfefc97178 SWDEV-384658 - Optimize D2D memcpy
- Intra device memcpy does not need to perform host side synchronization
- Check alloc flags when determining memory type

Change-Id: Ieff28bd8d62756ffe82905354c4a91e9717e6bd4
2023-03-09 04:47:11 -05:00
Saleel Kudchadker 8028d327e9 SWDEV-345213 - Use the right accessor
- Use correct accessor to fetch memory objects. This checks the svm map
and arena maps

Change-Id: I84515330bb530cfe2b39abf30e1e659938f06806
2023-03-06 19:35:40 -05:00
Ioannis Assiouras 06927fd3c1 SWDEV-381402 - Remove unused getNullStream() from device. Make stream destructor private.
Change-Id: Idde30a8bfe97a525bd9f9fb50698a5cb14b798fc
2023-02-24 10:42:46 +00:00
Ioannis Assiouras e3633dc8f4 SWDEV-381402 - Derive hip::Stream from amd::HostQueue
Change-Id: I6c1aca5eb350c32d974ae4ffcc725705355956d8
2023-02-21 18:12:03 -05:00
sdashmiz 5c5b220561 SWDEV-374378 - correct setparam for memcpy node
- params should be valid when used for default flag since we support
  unified virtual address space

Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: I75d40e437b12ee58e72e423bb4818b484ce35b66
2023-02-09 09:39:20 -05:00
German Andryeyev 0fe03b6494 SWDEV-353281 - Switch Graph to operate with stream
MemPool was designed to use hip::Stream, but graph implementation uses
amd::HostQueue. Hence switch graph to hip::Stream management.

Change-Id: Ia319389de45e4c3c6043d17473279a6f27a13140
2023-02-06 10:11:55 -05:00
Jaydeep Patel f9e27bcdd4 SWDEV-377804 - Initial commit to support hipGraphInstantiateFlagAutoFreeOnLaunch
Change-Id: I7a35becb6c98a6ff70264e141317d98be7457a37
2023-02-01 11:51:39 -05:00
German Andryeyev eef47ca24a SWDEV-353281 - Initial support of memalloc in graph
Add memory allocation support in graph. Current implementation uses
cache from mempool  to hold the allocations which belong to the graph.
Also the resource tracking is disabled at this moment because mempool
operates with hip::Stream objects, but graph has execution with
amd::HostQueue objects.

Change-Id: I54fe3250126d24f5a26ada975f37d429bb4ef17b
2023-01-13 13:06:59 -05:00
Anusha GodavarthySurya 16b31b0c54 SWDEV-325711 - Correct formatting
Change-Id: Ie26159e0bb3315cf7c3de1eb682f23ef343df0f2
2022-11-30 05:15:01 +00:00
Anusha Godavarthy Surya 08c4619fab SWDEV-366653 - Added Implemention of DOT file generation for graph
Change-Id: I5ab6a58e49451b5e04f2e93bf594b985ac58cc8d
2022-11-24 11:02:21 -05:00
agunashe 47ae1f1fff SWDEV-337331 - Windows graph fix
Unit_hipGraphNodeGetDependentNodes_Functional
Unit_hipGraphNodeGetDependencies_Functional
Unit_hipGraphAddEventRecordNode_Functional_WithoutFlags
Unit_hipGraphMemcpyNodeSetParams_Functional
Unit_hipGraphExecChildGraphNodeSetParams_ChildTopology

Change-Id: I762776d33f27197bcc012951a1828d3d1d2b3e2e
2022-10-28 14:46:04 -04:00
Anusha GodavarthySurya 039e26ee0f SWDEV-357759, SWDEV-360041, SWDEV-361145 Fix Stream end capture on forked streams
Change-Id: If0dc6242d2d3ca680e37e14a5dea5cf68dc295df
2022-10-12 13:00:05 -04:00
Tao Sang 56e7c8b3a0 SWDEV-318349 - Remove sync for null stream
Remove sync for null stream in  hipGraphExec::Run()

Change-Id: Ieaaed1c15b4d258193d8341d4b17d9f03a9e4783
2022-10-12 09:46:18 -04:00
Ajay be80bf5406 SWDEV-353548 - assert with different behavior for Release vs Debug
assert statement behavior
-Debug: crashes tests with SIGABORT
-Release: continues silently without

Change-Id: I7578eb16a7391ff7f9d68f1cae3bcea7f8225579
2022-10-11 14:00:28 -04:00
Tao Sang f83ba8cd23 SWDEV-318349 - Fix hipGraphKernelNode and hipGraphMemcpyNode
For hipGraphKernelNode, remove func_;
and reorganize functions to naturely support mGPU;
For hipGraphMemcpyNode, make EnqueueCommands() support different
queues' sync
Change-Id: I22708923f454adf4456ff99d25559daffed8c20d
2022-10-07 09:07:56 -04:00
Laurent Morichetti 82bce811ee SWDEV-351980 - Consolidate registration tables in the roctracer library
Remove the api_callbacks_table_t that was holding the API activities and
user callbacks. Instead use a single roctracer callback (TracerCallback)
used to report both API activities and callbacks.

Remove the hipInitActivityCallback that was setting the ROCtracer
callback and memory pool for asynchronous activities as it did not
allow disctinct pools to be used for each activity.  Instead, use
hipRegisterTracerCallback to set the single roctracer callback.

Change-Id: I4c10f04f29a6e4cce8caf15db3016c3f72c86b04
2022-09-21 02:41:39 -04:00
pghafari f1d8a02122 SWDEV-356582 - set the parent node in Graph Clone
Change-Id: I9b685ca9827b0a0dddc3ef6a0394b298d1031f04
2022-09-20 16:42:29 -04:00