Commit Graph

2136 Commits

Author SHA1 Message Date
Jaydeep Patel 8e80429b87 SWDEV-457316 - Release graph exec before stream gets deleted.
Release the graph exec after the wait completes and before deleting the hip::stream object
during stream destroy.

Change-Id: I1d68aa8d844f7d3af330c6d09c44af07f8553551
2024-08-06 00:39:37 -04:00
Jaydeep Patel b0047d690a SWDEV-465220 - Validate stream on which Kernel is planned to be launched.
Change-Id: I34c679bd888c275584c11ad3e8346d4d542976f9
2024-08-06 00:31:22 -04:00
Jaydeep Patel d954eb64db SWDEV-457316 - Multiple graph execs can exist for a given stream.
Change-Id: I0f1b184eb63e0432119d62f094637d375a3d4e55
2024-08-06 00:31:04 -04:00
German Andryeyev 9db52f9a46 SWDEV-470612 - Add the optimized multistream path
- Added the optimized multistream path in graph execution. It uses a fixed number of async streams in the execution
- Optimize the launch latency, where command
creation and execution are done at the same time
- Optimize the scheduling to use fewer barriers and waiting signals if
the same queue can be detected
- The new path is controlled by the DEBUG_HIP_FORCE_GRAPH_QUEUES
environment variable, where 0 will use the original path and any other
value will force the number of asynchronous queues for execution
- DEBUG_HIP_FORCE_ASYNC_QUEUE can force single-queue async
execution in graphs (applicable for Navi families only)

Change-Id: I7eb40bc15c45f508d6911868a6f6d4c3598d380e
2024-08-02 14:19:44 -04:00
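
The flag semantics described in the commit above (0 keeps the original path, any other value forces that many asynchronous queues) can be sketched as follows. This is only an illustration: `GraphQueueChoice` and `interpretForceGraphQueues` are hypothetical names, and how the runtime actually reads or defaults the variable is not stated in this log.

```cpp
#include <cstdlib>

// Hypothetical result type: which path to take and how many queues to force.
struct GraphQueueChoice {
  bool use_optimized_path;
  unsigned long forced_queues;
};

// Hypothetical helper: interprets the textual value of
// DEBUG_HIP_FORCE_GRAPH_QUEUES as the commit message describes it.
// `text` must be a non-null, NUL-terminated string; non-numeric text parses as 0.
inline GraphQueueChoice interpretForceGraphQueues(const char* text) {
  const unsigned long value = std::strtoul(text, nullptr, 10);
  if (value == 0) {
    return {false, 0};  // 0 selects the original path
  }
  return {true, value};  // any other value forces that number of async queues
}
```

A caller could pass the result of `std::getenv("DEBUG_HIP_FORCE_GRAPH_QUEUES")` after checking it for null; what happens when the variable is unset is left open here.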
Anusha GodavarthySurya bd3a35bde1 SWDEV-468424 - Add support to capture multiple AQL Packets
=> Added support to capture multiple AQL Packets.
=> Added Interface to callback to hip runtime from rocclr to allocate
kernel args from the graph kernel arg pool.
=> Enabled Support to capture memset node.

Change-Id: I7e1c2ba06927459e024653058af142bd82192c43
2024-08-01 23:55:51 -04:00
Julia Jiang e988e5e448 SWDEV-436608 - Un-deprecate hipHostAlloc()
Change-Id: I71393e6f536ba84b4e172acf54ba4f72350e2ae8
2024-08-01 11:45:20 -04:00
Marko Arandjelovic be5f097e8e SWDEV-465204 - Fix hipModuleLaunchKernel data validation
Change-Id: I129f265a5eb79d0a13da4f12e78e06ba307b17ee
2024-08-01 05:09:23 -04:00
Vladana Stojiljkovic d62c1dea72 SWDEV-465142 - Copy memAllocNodePtrs_ when cloning graph
Change-Id: I5a0907e59397e71b44db59c44b551b74a6e59ba0
2024-08-01 11:05:06 +02:00
Vladana Stojiljkovic d7b07b94a0 SWDEV-475127 - Check if hipBindTextureToArray parameters are null before dereferencing them
Change-Id: Id0173faff0a385d1665194c9033083ef9b2c48b5
2024-08-01 05:01:55 -04:00
Marko Arandjelovic b3153a5f41 SWDEV-475114 - Prevent segfaults in hipBindTexture
Change-Id: I050f36a5c74a5d4542155040ccce043fee6b73ad
2024-07-31 16:57:57 -04:00
Marko Arandjelovic 7d0ff387e9 SWDEV-461791 make memcpy synchronous for D2D if src&dst ptrs have SYNC_MEMOPS attribute
Change-Id: I603081d21e5eb3c73111845e350d8fa2ba5a7733
2024-07-30 11:46:55 -04:00
Ioannis Assiouras 9b33db9b24 SWDEV-472309 - Ensure static maps are destroyed after __hipUnregisterFatBinary
hipDeviceSynchronize, called from __hipUnregisterFatBinary,
accesses static maps and monitors. This change ensures these objects
are not destroyed before __hipUnregisterFatBinary is called.
Additionally, it disables the teardown process for static builds.

Change-Id: I46b58641d60efcf6637a8e99cdd786ffe9e2c77d
2024-07-30 10:26:59 -04:00
Saleel Kudchadker cda4b7db1c SWDEV-475341 - Fix stream resolution for graph launches
This issue was caused by incorrect usage of the getStream call.
If we get the null stream first, typecast it, and then call
getStream again, we lose the advantage of simply passing "nullptr" to
indicate the NULL stream. As a result, we enter the waitActiveStream call and add
barriers to sync across streams.

Change-Id: I94dc4e3ec927295b9e1ab6dee4b37d7d3e00b0cc
2024-07-25 19:38:23 -04:00
German Andryeyev c6b8d69158 SWDEV-470612 - Avoid copying a vector on the stack
Change-Id: Ia5fc7d1f77d2519dedeedb2c82c26efebb03d1d3
2024-07-25 10:09:19 -04:00
taosang2 e3ef19e22a SWDEV-474091 - Fix sporadic crash in streamcallback test
Also in the scope of SWDEV-467540.
Fix a sporadic crash in Unit_hipStreamAddCallback_MultipleThreads by
deferring release() of block_command.
The test invokes 1000 threads on the same stream, so there
is a chance of freeing block_command too early in the original code.
Deferring release() of block_command ensures that block_command
is always valid while block_command->notifyCmdQueue() is being called.

Change-Id: I31555ee18e6958e34b89f04181867fa4e932a38c
2024-07-23 10:24:10 -04:00
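
The ordering described in the commit above (notify first, release afterwards) can be illustrated with a minimal reference-counted object. This is a sketch under assumptions, not the runtime's code: the `BlockCommand` class, its logging, and `callbackDone` are hypothetical; only the names `release()` and `notifyCmdQueue()` come from the commit message.

```cpp
#include <atomic>
#include <string>
#include <vector>

// Hypothetical stand-in for a reference-counted command object.
class BlockCommand {
 public:
  explicit BlockCommand(std::vector<std::string>& log) : log_(log) {}

  void retain() { refs_.fetch_add(1, std::memory_order_relaxed); }

  // Drops one reference; the object deletes itself when the last one goes away.
  void release() {
    if (refs_.fetch_sub(1, std::memory_order_acq_rel) == 1) {
      log_.push_back("destroyed");
      delete this;
    }
  }

  void notifyCmdQueue() { log_.push_back("notified"); }

 private:
  ~BlockCommand() = default;  // destruction only through release()

  std::atomic<int> refs_{1};
  std::vector<std::string>& log_;
};

// Deferred release: the reference is dropped only after the last use,
// so the object is guaranteed to be alive during notifyCmdQueue().
inline void callbackDone(BlockCommand* cmd) {
  cmd->notifyCmdQueue();
  cmd->release();
}
```

Calling `release()` before `notifyCmdQueue()` in `callbackDone` would allow the object to be destroyed while it is still in use, which is the kind of early free the commit message describes.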
kjayapra-amd 9c03f85f46 SWDEV-460948 - Remove dflock, since kernel arguments are part of command now.
Change-Id: I6b5a229307b41bd24ffa0bc172c64ad1154df474
2024-07-22 16:02:01 -04:00
Anusha GodavarthySurya 6ae5d6896c SWDEV-468424 - Fix kernelArg mgr release and clear commands after capture
Creating a ReferenceCountedObject increases the reference count by 1.
Clear the commands from the Node after capture so that they won't be referenced later.

Change-Id: I1cc4085939cf65218ec2aa2e25ab6d737f7cacd3
2024-07-22 05:16:12 +00:00
Anusha GodavarthySurya 346da4bb40 SWDEV-468424 - hipgraph capture memset node
Capture AQL packets during GraphInstantiation and enqueue AQL packets during graph launch.

Added support to capture a single graph memset node.
Capture support for the memset node is currently disabled.
Memset capture will be enabled when capture of multiple packets is supported.

Change-Id: I14dfbc41731025cc3a548a730558915def3fa384
2024-07-19 23:52:50 -04:00
Jatin Chaudhary 4b95e7bc87 SWDEV-467414 - add sharedMemPerBlockOptin = sharedMemPerBlock
On some platforms, a user can request extended shared memory for a
particular kernel. This feature does not exist in HIP at
the moment, so sharedMemPerBlockOptin is set to sharedMemPerBlock, which is the
maximum users can expect for their kernels.

Change-Id: I81005cf0d1c9fb941e77d34fb8385241ffe5bdd0
2024-07-16 11:00:29 -04:00
kjayapra-amd a5664fc93f SWDEV-460113 - Remove the ufd print.
Change-Id: If0d64ea4b6662493784c040aa1ceffafc8efa1c3
2024-07-16 10:39:16 -04:00
kjayapra-amd e7a7feb273 SWDEV-464828 - Initial implementation of VMM IPC on PAL/Windows.
Change-Id: I3d5e148fad9105704db6724b00df06bef4fc9d2f
2024-07-16 10:38:35 -04:00
Anusha GodavarthySurya 35079e834e SWDEV-468424 - Refactor kernel arg
To refactor childGraph so that it has its own graphExec,
kernelArgs need to be separated from the graphExec object.
All the childNodes that are part of a graph should share the same kernelArg pool.
Otherwise we end up creating multiple device kernel arg memory chunks
for a single graphExec.

Change-Id: I4029a46ebc1fa112d87df64ab1fecbf288fabe5e
2024-07-16 08:38:44 -04:00
Marko Arandjelovic 7d3c0c5e10 SWDEV-441296 - Align hipTexObjectCreate error handling to CUDA
Change-Id: I9ff01c22f14344e0e82e473104d6930e9fa5ff77
2024-07-15 15:51:41 -04:00
Julia Jiang dd30e0e893 SWDEV-472710 - Add gitattributes and remove trailing spaces
Change-Id: Ic8ad2071745f0ffe6a2e120bfebb6d90bf270f87
2024-07-15 12:39:56 -04:00
sdashmiz 57e79802cd SWDEV-421021 - Add APIs cuMemcpyNodeGet/Set params
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: I24bc0da56aad34c9d5876a3d83b59515f11dc3ea
2024-07-10 12:53:01 -04:00
Ioannis Assiouras ea50d2c0c2 SWDEV-469825 - Modified the kernel argument readback to use a pointer to volatile
This change modifies the readback mechanism to use a pointer to volatile
instead of a volatile pointer. This ensures that the compiler does not
optimize away the read operation.

Change-Id: I79ff925d615aa8cc4f950e8ff4b7e608fcb179a4
2024-07-09 17:28:47 -04:00
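
The distinction the commit above relies on is standard C++: in `volatile T*` the pointed-to data is volatile, so every read through the pointer must actually be performed, while in `T* volatile` only the pointer variable itself is volatile. A small sketch follows; the `readback` helper is hypothetical and is not the runtime's actual readback code.

```cpp
#include <cstdint>
#include <type_traits>

// Pointer to volatile data: accesses to the pointee cannot be optimized away.
using PtrToVolatile = volatile std::uint32_t*;
// Volatile pointer to ordinary data: only the pointer variable is volatile.
using VolatilePtr = std::uint32_t* volatile;

static_assert(std::is_volatile<std::remove_pointer_t<PtrToVolatile>>::value,
              "pointee is volatile");
static_assert(!std::is_volatile<std::remove_pointer_t<VolatilePtr>>::value,
              "pointee is not volatile");
static_assert(std::is_volatile<VolatilePtr>::value,
              "the pointer itself is volatile");

// Hypothetical readback helper: reads one word through a pointer to volatile,
// so the compiler must emit the load.
inline std::uint32_t readback(const void* addr) {
  const volatile std::uint32_t* p =
      static_cast<const volatile std::uint32_t*>(addr);
  return *p;
}
```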
sdashmiz 7257f56c60 SWDEV-429053 - Add more checks for hipStreamLegacy
Signed-off-by: sdashmiz <shadi.dashmiz@amd.com>
Change-Id: Iaf802372b160b09e5e8451074af1731e4f1d410a
2024-07-08 13:55:54 -04:00
Jaydeep Patel cf0320a0b9 SWDEV-457316 - Return invalid val from add mem free node if corresponding mem alloc node is missing.
Change-Id: Ib0c346a439fc38ebfd106bcbdf75bd10bfd2f090
2024-07-04 13:09:46 -04:00
Saleel Kudchadker 6ac67afdd5 SWDEV-465602 - Fix random segfault
- Introduce a lock when checking isUserObjectValid. We need a lock
here because another thread (T2) can remove the userObject, leading to a buffer
overflow when thread T1 is checking ranges.

Change-Id: I058144b8cc463c90ab6bf5cf96bf937897742917
2024-07-03 18:57:23 -04:00
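
The race described in the commit above (one thread removing an object while another validates it) is typically closed by guarding both operations with the same mutex. The following is a minimal sketch, assuming a hypothetical `UserObjectRegistry` class; only the name `isUserObjectValid` comes from the commit message, and the runtime's real data structure is not shown in this log.

```cpp
#include <mutex>
#include <unordered_set>

// Hypothetical registry of live user objects.
class UserObjectRegistry {
 public:
  void add(const void* obj) {
    std::lock_guard<std::mutex> guard(lock_);
    objects_.insert(obj);
  }

  void remove(const void* obj) {
    std::lock_guard<std::mutex> guard(lock_);
    objects_.erase(obj);
  }

  // Takes the same lock as remove(), so the container cannot be
  // modified by another thread while it is being searched.
  bool isUserObjectValid(const void* obj) const {
    std::lock_guard<std::mutex> guard(lock_);
    return objects_.count(obj) != 0;
  }

 private:
  mutable std::mutex lock_;
  std::unordered_set<const void*> objects_;
};
```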
Jaydeep Patel 7d7db316b3 SWDEV-465088 - Bypass memcpy kind check for H2H memcpy if XNACK is enabled.
Change-Id: I3e9b23dfb1aedeaf5ea0f26668caddb277ead809
2024-07-01 01:03:42 -04:00
Anusha GodavarthySurya 9ad7e79e50 SWDEV-469331 - Fix issue of graph sync.
If a graph has multiple branches, an End command is enqueued on the launch stream to
make sure all the internal parallel streams have finished.

When a node was removed from the graph, indegree and outdegree were not updated correctly for its parent and child nodes,
so the endNode had no dependencies on the parallel commands, which resulted in graph sync issues.

Change-Id: I33cc2f21220e1c017d88099b29b542e05b683f73
2024-06-28 02:11:44 -04:00
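
The bookkeeping described in the commit above can be sketched with a small graph structure in which removing a node also updates the degree counters of its neighbours. `GraphNode`, `addEdge`, and `removeNode` are hypothetical names used for illustration; the runtime's actual node classes are not shown in this log.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical graph node with explicit degree counters.
struct GraphNode {
  std::vector<GraphNode*> parents;
  std::vector<GraphNode*> children;
  std::size_t indegree = 0;
  std::size_t outdegree = 0;
};

inline void addEdge(GraphNode& from, GraphNode& to) {
  from.children.push_back(&to);
  ++from.outdegree;
  to.parents.push_back(&from);
  ++to.indegree;
}

// Detaches `node` and keeps the counters of its parents and children in sync.
inline void removeNode(GraphNode& node) {
  for (GraphNode* parent : node.parents) {
    auto& c = parent->children;
    c.erase(std::remove(c.begin(), c.end(), &node), c.end());
    assert(parent->outdegree > 0);
    --parent->outdegree;  // the parent lost one outgoing edge
  }
  for (GraphNode* child : node.children) {
    auto& p = child->parents;
    p.erase(std::remove(p.begin(), p.end(), &node), p.end());
    assert(child->indegree > 0);
    --child->indegree;  // the child lost one incoming edge
  }
  node.parents.clear();
  node.children.clear();
  node.indegree = 0;
  node.outdegree = 0;
}
```

With correct counters, a node whose outdegree drops to zero is again recognizable as a leaf, which is what an end node needs in order to depend on every parallel branch.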
Ioannis Assiouras 1c6b92627d SWDEV-468381 - Fixed use of vaddr_sub_obj in GraphMemFreeNode
Resolved an issue where a freed virtual buffer was incorrectly
added to the global mapping, causing an assertion error during
the teardown process.

Change-Id: I4801157a28603ce9be1ca0131982b700ff884f7a
2024-06-27 16:20:47 -04:00
Saleel Kudchadker 17313ec99d SWDEV-465602 - Refactor kernel arg pool allocation for graphs
- Allocate additional argument space to accommodate kernel node
param updates

Change-Id: I2d4ea8bddd716f1191f3cbea807920d0248f8c4e
2024-06-25 18:28:03 -04:00
Rahul Manocha f309d49b32 [SWDEV-468553] - Add stream validation checks for memcpy APIs
Change-Id: Ic4495d10c8b2d2ac90f7093a08209d9cb373d2a6
2024-06-24 12:47:31 -04:00
Ioannis Assiouras 6b9e89fe0c SWDEV-469138 - Added fix for find_package(LLVM)
Changed find_package call to prioritize the package that is
found under the rocm installation over other system locations

Change-Id: Ice93c94bbb9cdebd467d3e88bb2e4bfb7a1e76d9
2024-06-20 11:03:08 -04:00
Ioannis Assiouras 7b0259c4b7 SWDEV-465236 - Changed RTCProgram::findIsa to not dlopen amdhip64 for static build
Change-Id: I322ef4ca96ea426a0953f1234e60db6cebb09886
2024-06-20 10:55:57 -04:00
Ioannis Assiouras 2aed4cf401 SWDEV-468133 - Fixed hipDeviceGetLimit for hipLimitMallocHeapSize
Change-Id: I91bede414ebe46831509cbd24ffb53cf129d6a40
2024-06-20 10:55:15 -04:00
taosang2 1566ff7639 SWDEV-465162 - Fix some issues with image support
Fix some small issues regarding image and mipmap support

Change-Id: I8e64223d44f37c2dbb115cbb343441a48021ba7b
2024-06-18 16:38:24 -04:00
Anusha GodavarthySurya 57156c524d SWDEV-467102 - Hidden heap init for graph capture
If the graph has kernels that do device-side allocation, the heap is allocated during packet
capture, because the heap pointer has to be added to the AQL packet, and it is initialized during
graph launch.

Handle the race with wait when 2 kernels with device heap are enqueued on multiple streams.

Change-Id: I45933b77fcaf7bc8fdf1bc906462e32b5d8d3688
2024-06-17 02:07:25 -04:00
Branislav Brzak f014124527 SWDEV-465203 - Treat 0 elf length images as invalid
This addresses:
SWDEV-465203
SWDEV-465202

Change-Id: I49fcdd537fd07585e25c5fdef37cd10815466f79
2024-06-14 04:56:43 -04:00
Marko Arandjelovic d12af175af SWDEV-441296 - Fixes related to hipTexObjectCreate unit test
- Avoid potential division by zero
- Nullptr check

Change-Id: Ic857eb4fe968173c852eb7a67934e33fc74c055f
2024-06-14 03:58:34 -04:00
Ioannis Assiouras d44f44a5b1 SWDEV-467069 - Added safety check in activity prof for accumulate command
Adding a safety check prevents an invalid memory access
if timestamps and kernelNames vectors are of different size.

The patch also moves the addKernelNames for the accumulate command
into dispatchAqlPacket function.

Change-Id: Iea0927e1253800403a1ae3f3d72de1e7d96476c3
2024-06-12 21:53:03 +01:00
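
One way to guard against the mismatch described in the commit above is to bound the iteration by the shorter of the two vectors. This is a sketch of the general technique only; `pairKernelTimes` is a hypothetical helper, and the actual patch may handle the mismatch differently (for example by bailing out).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: pairs each kernel name with its timestamp without
// ever indexing past the end of the shorter vector.
inline std::vector<std::pair<std::string, std::uint64_t>> pairKernelTimes(
    const std::vector<std::string>& kernelNames,
    const std::vector<std::uint64_t>& timestamps) {
  std::vector<std::pair<std::string, std::uint64_t>> paired;
  const std::size_t count = std::min(kernelNames.size(), timestamps.size());
  paired.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    paired.emplace_back(kernelNames[i], timestamps[i]);
  }
  return paired;
}
```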
Ioannis Assiouras 3edf1501cc SWDEV-463865 - namespace changes to prevent symbol conflicts in static builds
Change-Id: I09ceb5962b7aa19156909f47167c87d6887c9cd1
2024-06-12 16:22:27 -04:00
Anusha GodavarthySurya 3a5cbb91b9 SWDEV-461072 - Add reference to function parameter
Change-Id: I9ad5dafc6d697d12fbd1675f19f88f83ad2d7b9c
2024-06-12 01:20:28 -04:00
Jaydeep Patel 5c77e30b18 SWDEV-457316 - Another graph can free a mem alloc node; return invalid val only if there is a double mem free node across all captured graphs.
Change-Id: Icf12164bf0ecd171a4673ff4f384528e7671f944
2024-06-12 00:44:50 -04:00
Ioannis Assiouras 055e05a12a SWDEV-466601 - Fix invalid mem access in kernarg readback path
Change-Id: I4654ae592adc8cf9c687136d45eb1b28d99c7ae1
2024-06-10 15:13:05 +01:00
Satyanvesh Dittakavi 1815fc808d SWDEV-464927 - Update the Get by PCI BusId logic and Hop count
- Update the intra socket weight for partitions within single socket as
it is changed to 13 by the driver.
- Use the PCIe function to distinguish the partitions of the same device
such as TPX mode in gfx942.

Change-Id: I8e64023d44e37c2dbb105cbb343441a48021ba7b
2024-06-10 04:46:50 -04:00
Ioannis Assiouras 8f42ad6aa3 SWDEV-464648 - code and comment cleanups
Change-Id: I5ba3f1bff500b3cd5903c2f441017735e688f83f
2024-06-07 22:38:09 +01:00
kjayapra-amd 892071aeb2 SWDEV-460948 - Changes to alloc, set, capture under single function.
Change-Id: I7b2d40e99e812b97c53535c5e63c41ad64a8f543
2024-06-06 16:57:53 -04:00
Ioannis Assiouras b8c2ac4de4 SWDEV-463865 - symbol renamings to prevent conflicts in static build
Change-Id: Id7fbb638c1088c23df52fee877cd790d637b1ffb
2024-06-06 04:05:55 -04:00