Commit graph

1076 commits

Author SHA1 Message Date
Alex Xie f8c56f6bac SWDEV-489468 - make resource cache bigger for APU
Change-Id: I065c712acd06c273a0b194fe792ec4f876fa9c46
2024-10-31 09:55:01 -04:00
Tao Sang 82dff9a67d SWDEV-492563 - Fix Ocl issues
1. Fix LDSSize type to be uint32_t.
2. Prevent clWaitForEvents running on complete events whose
   HostQueue has been destructed.

Change-Id: I829e915f56b37db2ba76bb876c9656166534f154
2024-10-30 19:15:59 -04:00
Saleel Kudchadker e23ff0520b SWDEV-491375 - Improve MemObjMap perf
- Create bins, each with its own map and lock. This helps in cases
where the hash of one VA differs from that of another, so the two fall
in different bins and there is no lock contention
- Use STL shared mutexes, so that map updates take a unique_lock
while simple reads can use a shared_lock

Change-Id: I118818be65c6373700f5e511045babb6a398938a
2024-10-30 05:37:33 +00:00
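
The binning idea in the commit above can be sketched roughly as follows. This is a minimal illustration, not the actual MemObjMap code: the class name `BinnedAddressMap`, the bin count, the 4 KiB shift used to spread addresses across bins, and the exact-match lookup are all assumptions made for the example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>

// Hypothetical sketch: a VA -> object map split into independently locked bins.
template <typename T, std::size_t kBins = 16>
class BinnedAddressMap {
 public:
  // Writers take an exclusive lock, but only on the bin that owns the address.
  void insert(std::uintptr_t va, T* obj) {
    assert(obj != nullptr);
    Bin& bin = bins_[binIndex(va)];
    std::unique_lock<std::shared_mutex> lock(bin.mutex);
    bin.map[va] = obj;
  }

  // Returns true if an entry was removed.
  bool erase(std::uintptr_t va) {
    Bin& bin = bins_[binIndex(va)];
    std::unique_lock<std::shared_mutex> lock(bin.mutex);
    return bin.map.erase(va) > 0;
  }

  // Readers share the bin lock, so concurrent lookups do not block each other.
  T* find(std::uintptr_t va) const {
    const Bin& bin = bins_[binIndex(va)];
    std::shared_lock<std::shared_mutex> lock(bin.mutex);
    auto it = bin.map.find(va);
    return it == bin.map.end() ? nullptr : it->second;
  }

 private:
  struct Bin {
    mutable std::shared_mutex mutex;
    std::unordered_map<std::uintptr_t, T*> map;
  };

  // Assumption: drop the low 12 bits so page-aligned addresses spread over bins.
  static std::size_t binIndex(std::uintptr_t va) {
    return static_cast<std::size_t>(va >> 12) % kBins;
  }

  std::array<Bin, kBins> bins_;
};
```

Two threads touching addresses that land in different bins never contend for the same mutex, and readers of the same bin proceed in parallel; only an update blocks other users of that one bin.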
German Andryeyev 403f624bf8 SWDEV-486602 - Add tracking of HSA handlers
Add an atomic counter to track the outstanding HSA handlers.
Wait on CPU for the callbacks if the number exceeds the value
in DEBUG_HIP_BLOCK_SYNC env variable.

Change-Id: I95dc8c4bf0258c7e59411b7504220709ed6898c5
2024-10-25 15:20:50 -04:00
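
A rough sketch of the throttling described in the commit above, under stated assumptions: the class `HandlerTracker` and its method names are hypothetical, the limit is passed in directly rather than read from DEBUG_HIP_BLOCK_SYNC, and the CPU wait is shown as a simple yield loop, which may differ from what the runtime does.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <thread>

// Hypothetical sketch: count outstanding async handlers and stall the
// submitting thread on the CPU once too many are in flight.
class HandlerTracker {
 public:
  explicit HandlerTracker(std::uint32_t limit) : limit_(limit) {
    assert(limit_ > 0);
  }

  // Called by the submitting thread before it registers a new handler.
  // Assumes a single submitter; with several, the limit is only approximate.
  void onSubmit() {
    while (outstanding_.load(std::memory_order_acquire) >= limit_) {
      std::this_thread::yield();  // wait on the CPU for callbacks to finish
    }
    outstanding_.fetch_add(1, std::memory_order_acq_rel);
  }

  // Called from the handler (callback) once it has finished its work.
  void onHandlerDone() {
    outstanding_.fetch_sub(1, std::memory_order_acq_rel);
  }

  std::uint32_t outstanding() const {
    return outstanding_.load(std::memory_order_acquire);
  }

 private:
  const std::uint32_t limit_;
  std::atomic<std::uint32_t> outstanding_{0};
};
```

The counter is atomic because callbacks typically run on a different thread than the one submitting work, so increments and decrements can race without it.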
German Andryeyev dceb320ba7 SWDEV-440746 - Fix a typo with GPU_PINNED_XFER_SIZE
Change-Id: I8fdbfb4e1c6b1274206c28a529eee9ebeaaa26fb
2024-10-24 18:33:14 -04:00
Anusha GodavarthySurya b498103f9b SWDEV-485904 - propagate hsa_amd_vmem_address_free error to hip API
Unit_hipMemSetAccess_GrowVMM test fails with
HSA_STATUS_ERROR_RESOURCE_FREE silently

Change-Id: I7a78410e432de4a2e877062782abf8761645f392
2024-10-21 10:12:32 -04:00
German Andryeyev 364dfb0ed1 SWDEV-486602 - Optimize HSA callback performance
- Don't generate callbacks for HIP events
- Don't process profiling info in the callback for HIP events
- Wait for CPU status update of the submitted commands
every 50 calls. This allows the commands to drain and the
HSA signals to be destroyed.

Change-Id: Ib601a350e7e7c2b6c6209a172385389baccf73a9
2024-10-11 14:50:25 -04:00
Todd tiantuo Li 41dc4545fc SWDEV-472357 - support Rect copy with staging buffer for 2D & 3D memcpy in PAL
Change-Id: Ie32f3e5a6fa077f6b2db20fc1ab1e2e0da8344cb
2024-10-10 18:00:19 -04:00
Jaydeep Patel 5ccc140e1b SWDEV-485866 - Return OOM if stream creation fails due to insufficient memory.
Change-Id: I4e57ecc81921bde274bb6a4e0890f0fc6a17955a
2024-10-10 00:44:54 -04:00
Saleel Kudchadker e36666e536 SWDEV-301667 - Enable ROCr logging
- Use AMD_LOG_LEVEL=5 to dump AQL packets in ROCr

Change-Id: I2c044a5304c4eaf3d3af20e62d1f54c98d4fbaa4
2024-10-04 19:22:12 -04:00
Saleel Kudchadker d3d0ca5fc6 SWDEV-478065 - Revert "SWDEV-478065 - Embed host thread in shared_ptr"
This reverts commit 4b03017e8a.

Reason for revert: This blocks multithreaded callbacks

Change-Id: I9944417e4fb63c9eea2b286c828c7dfa621c4fe8
2024-10-04 19:19:28 -04:00
Jaydeep Patel 292842ad28 SWDEV-471422 - Free memory being double deducted on APUs because system_total_alloced var holds local memory.
Change-Id: I3fbbc8f8aaa156881ff95cad6a4f82fd3df651d1
2024-10-04 04:49:20 -04:00
Rahul Manocha 9da90fe848 SWDEV-487903 - Fix for Empty Kernel Segfault in PAL
Change-Id: Ia1c19cf4ea24188cdb2d374b07f975f794e02dba
2024-09-30 13:00:15 -04:00
Anusha GodavarthySurya 742b0210d3 SWDEV-477324 - Capture Memcpy1D pinned H2D D2H
Change-Id: I1f4744f20a9caeed005ec68da44e5fde737e09f7
2024-09-30 01:01:30 -04:00
Vladana Stojiljkovic da5f1a6146 SWDEV-482086 - Fix hipGraphInstantiate leak
* In a scenario where a kernel is launched with hipExtLaunchKernelGGL and a stop event is used, hipGraphInstantiate leaks. Since a stop event is used, profiling is enabled and a Timestamp (ReferencedCountedObject) is created, but it doesn't get released.
* The idea behind this solution is that profiling should be disabled when a command is captured, hence the timestamp should not be created. Because information about capturing isn't available when the kernel command is created, the packet capturing state is used to determine whether to create a timestamp or not.

Change-Id: Ia23adac4592ded4fb5e236acf99e12e729f63692
2024-09-29 11:36:53 -04:00
Ajay 7a288ea8bf SWDEV-486816 - RenderOpDispatch usage in pal client
Change-Id: I11cae3e625b287b998c9500c547efdacf1034a2b
2024-09-24 14:28:16 -04:00
German Andryeyev 29cc678d8d SWDEV-483586 - Unblock staging H2D transfers
Although unpinned copies require synchronizations
in HIP, runtime can avoid syncs for H2D copies with
a staging buffer

Change-Id: If2203c6bc0cbd89742823688dc8e89e9acd873b2
2024-09-21 10:25:27 -04:00
Maneesh Gupta 2d1c6ee23e SWDEV-485179 - Revert "SWDEV-459254 - Overwrite cacheline size to 256 for gfx12, as it is used for kernarg alignment."
This reverts commit 1f63650bf96e01e48f879aa58b80e2130dd4a567.

Reason for revert: <INSERT REASONING HERE>

Change-Id: I6d7ed87c09d9b77116548dce1f30ac4711c2c09d
2024-09-20 11:33:34 -04:00
Anusha GodavarthySurya 870842201d SWDEV-485904 - Fix virtual,physical mem obj leaks
Change-Id: Ie0456b5dcfec206ae54a6aabfc2a15a620cac693
2024-09-19 23:04:20 -04:00
kjayapra-amd 12a39fbf22 SWDEV-480772 - Remove name variable from amd::Monitor class.
Change-Id: Ie2a4fa44f485786227230f8a892e090e718aa30e
2024-09-19 11:55:01 -04:00
Rahul Manocha 07261002b1 SWDEV-439234 - Fix for Segfault in ValidateMemAccess
Change-Id: I251d277eb5af16ba5c0de85ffd142a5f64fa469d
2024-09-18 10:52:32 -04:00
Daniel Livingston e550032d25 SWDEV-77148 - Add UberTrace support to PAL device
This PR adds UberTrace-based tracing support to ROCclr's PAL device class.
Legacy RGP-based tracing is still available and is the default.
If UberTrace support is enabled tool-side, this new code path will activate.

Change-Id: I268b2dcef70e850a50e2caef8355f38bf51d4641
2024-09-17 16:06:37 -04:00
Jatin Chaudhary 4b03017e8a SWDEV-478065 - Embed host thread in shared_ptr
This shows up in some valgrind runs. Make sure the resources are
released.

Change-Id: I34c25c00370a221585895655744831215136d5f4
2024-09-17 09:53:51 -04:00
Ioannis Assiouras bcc545e6b8 SWDEV-476929 - Introduce an activeQueues set
The new set tracks only the queues that have a command
submitted to them. This allows for fast iteration
in waitActiveStreams.

Change-Id: I2c832eefa01280d9a87a5f57874d36d2e9441de7
2024-09-16 15:53:49 -04:00
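
The active-set idea from the commit above can be illustrated with a small sketch. The `Queue` and `QueueRegistry` types are hypothetical stand-ins, `finish()` merely simulates draining a queue, and locking is omitted for brevity; the real runtime would need synchronization around both sets.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Hypothetical stand-in for a device queue.
struct Queue {
  int pending = 0;
  void submit() { ++pending; }
  void finish() { pending = 0; }  // simulate waiting until the queue drains
};

// Hypothetical sketch: track all queues, plus the subset with submitted work.
class QueueRegistry {
 public:
  void add(Queue* q) { all_.insert(q); }

  // Submitting a command marks the queue as active.
  void submit(Queue* q) {
    assert(all_.count(q) == 1);
    q->submit();
    active_.insert(q);
  }

  // Waits only on queues that received work; returns how many were visited.
  std::size_t waitActive() {
    std::size_t visited = active_.size();
    for (Queue* q : active_) q->finish();
    active_.clear();  // everything is idle again
    return visited;
  }

 private:
  std::unordered_set<Queue*> all_;
  std::unordered_set<Queue*> active_;
};
```

With many idle queues and few busy ones, the wait loop touches only the busy ones instead of scanning every queue ever created.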
kjayapra-amd d81c5d3d7f SWDEV-484188 - Change a few std::array-style globals to C style to avoid optimization on static libs.
Change-Id: Iab6d3c040b8d088013daa08093898be99dd3a971
2024-09-16 09:46:56 -04:00
kjayapra-amd 4ecd77df5e SWDEV-484188 - Moving std::maps into struct const and into amd::Kernel class.
Change-Id: Ie4d5a64511412fdb498b045aaffb52c3a1286de6
2024-09-15 09:14:51 -04:00
Ajay c9955a1cea SWDEV-465215 - hipFuncSetAttribute hipFuncGetAttributes fixes
Change-Id: I2151e4470d63918ff6b809a8fdeaae5bea5cc899
2024-09-13 14:05:25 -04:00
Saleel Kudchadker 9de6d4d46c SWDEV-478624 - Use readback workaround to ensure kernel arg coherence
Use env var DEBUG_CLR_KERNARG_HDP_FLUSH_WA=1 to fall back to HDP flush
workaround. The default is 0

Change-Id: I7bdb9be61da60c30d15ac9991b7cd27351e1831c
2024-09-11 14:53:15 -04:00
Ajay 5a810f789a SWDEV-471863 - avoid copy of invisibleHeap
Change-Id: Ieb0aa22ac6d0d01cb9ca7fbf1305df03a1ab3cdf
2024-09-11 13:24:31 -04:00
Jaydeep Patel 9c90bc43a5 SWDEV-475938 - Update dynamic stack in submit kernel internal.
Change-Id: I816bf9cfe8aaac5486ff3b719dbdc4f4d6134e01
2024-09-11 00:59:45 -04:00
Saleel Kudchadker abc80fcc2f SWDEV-301667 - Improve kernel logging
Change-Id: I4b2b1950e3ab7124fd41af9a92a677c48d6da5eb
2024-09-10 13:43:58 -04:00
Saleel Kudchadker 62a7fed90d SWDEV-481974 - Clear dependent signal bit for barrier value
Change-Id: I3ffda051fa8538970fbb1964beb1f538fce0782c
2024-09-10 13:43:04 -04:00
kjayapra-amd 6211037f63 SWDEV-439234 - Access check before memcpy and kernel operations.
Change-Id: I7057125c03460db205409e19980145298c190fe2
2024-09-06 14:30:00 -04:00
Jimbo Xie 3bdbc1eaf3 SWDEV-403363 - add gfx1152 runtime support
Change-Id: I2f59ddb38a98d9f8edec5d1548232d4d826b7d04
(cherry picked from commit 5e94656f744e315ee7ae1285d3e6dd515f9d66a8)
2024-09-03 17:12:24 -04:00
Rahul Manocha ddbd7039b0 SWDEV-478921 - Destroy Queue created by Coop Launch
Change-Id: I7f31ce05421479ff1de138cae26aafa071e956e2
2024-09-02 02:35:08 -04:00
Julia Jiang 417d3279f9 SWDEV-476623 - correct the format on the fix for clCopyImage
Change-Id: I3a3fb2eaa338ff4e298a43e583fcf94ec7cabdf6
2024-08-28 16:16:24 -04:00
Julia Jiang c3c41dae0d SWDEV-476623 - Fix test failures for clCopyImage
Change-Id: I971c5be98304bdbef0feec73e15ebd61a131b12f
2024-08-27 11:43:12 -04:00
kjayapra-amd 2a9cb89228 SWDEV-478099 - Fix multiple mapping case on PAL/Windows backend.
Change-Id: Id1fe7939fbf90649cda1848890b3b4ca9a1fcd00
2024-08-27 11:19:39 -04:00
ksankisa e76bf653fb [SWDEV-469495] Compile blit kernels with -fsanitize=address when asan is enabled.
Change-Id: I96e1abef43317cd58329c4a159f807878bc48cf4
2024-08-27 01:27:31 -04:00
kjayapra-amd 00eb038eec SWDEV-479620 - Change argument type to size_t from uint64_t in nonTemporalMemcpy function.
Change-Id: I31f8a2b00685789b027d78be40a9f82c235f51b9
2024-08-24 07:42:37 -04:00
kjayapra-amd d7b097c994 SWDEV-478097 - Check for parents size in case of VA Mem object.
Change-Id: Icfdeabeb178c0dcc8c3a4bc48eec40067985794e
2024-08-22 14:18:51 -04:00
Shane Xiao 3959b5be1e [SWDEV-479204] Fix the hipGraph AQL packet fill issue
This patch fixes a potential issue caused by filling the AQL header
before filling the AQL body. The HSA spec specifies "Packet processors may
process AQL packets after the packet format field is updated, but
before the doorbell is signaled."
However, the hipGraph AQL packet was given a valid header before its
body was filled, so the CP could potentially receive an invalid
AQL body.

Change-Id: I84af798c19ee2b8805ba19732b0eabdea2958a96
2024-08-21 21:49:11 -04:00
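
The ordering rule behind the fix above (body first, header last) can be sketched as below. This is only an illustration: `Packet`, its layout, and the header constants are hypothetical and do not reproduce the real AQL packet definition or its enum values.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical header values, not the real HSA constants.
constexpr std::uint16_t kInvalidHeader = 0;
constexpr std::uint16_t kDispatchHeader = 1;

// Hypothetical queue slot: a header word followed by the packet body.
struct Packet {
  std::atomic<std::uint16_t> header{kInvalidHeader};
  std::uint32_t body[15] = {};
};

// Copies the body into the slot, then publishes it by writing the header.
// The release store orders the body writes before the header becomes visible,
// so a consumer that observes a valid header also observes a complete body.
inline void publishPacket(Packet& slot, const std::uint32_t* body,
                          std::size_t words, std::uint16_t header) {
  assert(words <= sizeof(slot.body) / sizeof(slot.body[0]));
  std::memcpy(slot.body, body, words * sizeof(std::uint32_t));
  slot.header.store(header, std::memory_order_release);
}
```

A consumer polling the slot would pair this with an acquire load of `header`; until that load returns a valid value, it must not read the body.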
Rahul Manocha 432bdd7bf2 SWDEV-474617 SWDEV-464679 - Fix segfault in palvirtual due to peer memory access
Change-Id: Ib8b641712d78acf8bc073ca5705dea97af6f944a
2024-08-21 11:34:15 -04:00
Ajay ec0971dd08 SWDEV-471863 - APU: device allocation greater than invisible memory
Change-Id: I37f1769873ac7dcbb3cfa51fd815ee1e2123aeae
2024-08-09 14:29:18 -04:00
German Andryeyev 9db52f9a46 SWDEV-470612 - Add the optimized multistream path
- Added the optimized multistream path in graph execution. It uses a fixed number of async streams in the execution
- Optimize the launch latency, where command
creation and execution are done at the same time
- Optimize the scheduling to use fewer barriers and waiting signals if
the same queue can be detected
- The new path is controlled by the DEBUG_HIP_FORCE_GRAPH_QUEUES
environment variable, where 0 will use the original path and any other
value will force the number of asynchronous queues for execution
- DEBUG_HIP_FORCE_ASYNC_QUEUE can force single queue async
execution in graphs (applicable for Navi families only)

Change-Id: I7eb40bc15c45f508d6911868a6f6d4c3598d380e
2024-08-02 14:19:44 -04:00
Anusha GodavarthySurya bd3a35bde1 SWDEV-468424 - Add support to capture multiple AQL Packets
=> Added support to capture multiple AQL Packets.
=> Added Interface to callback to hip runtime from rocclr to allocate
kernel args from the graph kernel arg pool.
=> Enabled Support to capture memset node.

Change-Id: I7e1c2ba06927459e024653058af142bd82192c43
2024-08-01 23:55:51 -04:00
Saleel Kudchadker d379f4efd0 SWDEV-301667 - Refactor Blit force env var
Change-Id: I5344ac2e6442cd8f526118e688f1b1412cc5b45a
2024-07-25 15:15:10 -04:00
German Andryeyev 18187cd8fe SWDEV-470612 - Avoid processing internal signals
If only external signals were provided, then just process them
without adding internal signals

Change-Id: Iaefd65d0f8b0a64b9f6a864a9bd73de20a29dfa4
2024-07-25 10:08:16 -04:00
German Andryeyev 1bac09ea20 SWDEV-469602 - Force unaligned memory mode with RDP
Change-Id: I770f3dc8dde49d8e4ecdf5c38819e44df3960bce
2024-07-23 18:31:52 -04:00
Anusha GodavarthySurya 346da4bb40 SWDEV-468424 - hipgraph capture memset node
Capture AQL packets during GraphInstantiation and enqueue AQL packets during graph launch.

Added support to capture single graph memset node.
Capture support for memset node is currently disabled.
Memset capture will be enabled when capture for multiple packets is supported.

Change-Id: I14dfbc41731025cc3a548a730558915def3fa384
2024-07-19 23:52:50 -04:00