Commit Graph

81 commits

Author SHA1 Message Date
kjayapra-amd 6f5277c701 SWDEV-408473 - Add wait time of 10 us if the waiting signal copy was < 24K.
Change-Id: I438ec9eb07e5034042a4a9a5e6e51d74daba2c83
2023-08-23 10:46:33 -04:00
victzhan cb426df1bd SWDEV-416580 - Add condition when memory has direct access, only use host fill if image is small
Change-Id: I3509c4aa21f6413adad3b46273ec650f5c577ddd
2023-08-17 17:23:49 -04:00
Jaydeep Patel 289535e805 SWDEV-412393 - Force alloc memory to avoid another hsa image creation.
Change-Id: Ia3cd99eb736231e6dfe013ebae6c41fd4cc657bc
2023-08-17 05:18:43 +00:00
Saleel Kudchadker aa6eb555e2 SWDEV-384557 - Enable SDMA query
Change-Id: Ibb0a8d131f799985a4d4adbf753261e58c04157f
2023-08-01 18:41:23 -04:00
Saleel Kudchadker 5447cf8872 SWDEV-301667 - Disable HostBlit copy for HIP
Change-Id: I46333ff42e8c1d402ece97e3ead7b539a27c3f82
2023-07-17 17:49:11 -04:00
Saleel Kudchadker 770b2a4711 SWDEV-384557 - Rename env var
- Rename HIP_USE_SDMA_QUERY to DEBUG_CLR_USE_SDMA_QUERY as this is
supposed to be a temporary env var for debug purposes only.

Change-Id: If6ebd52ab87624375a3df24ceccdcc05c60a65af
2023-06-29 13:54:55 -04:00
German Andryeyev d29755452b SWDEV-396088 - Add image view cache
Blit manager requires an image view to reduce the amount
of copy kernels. Creation/destruction of a view in ROCr is
an expensive operation. Thus, runtime can cache views for fast access.

Change-Id: Ia67d775b481cc8326d91215ca22d4a73c1dddb59
2023-06-28 09:44:05 -04:00
Saleel Kudchadker 0a3d4bd4d4 SWDEV-408180 - Remove largeBar memcpy
- Remove large bar memcpy path. Since we end up waiting for a barrier,
it defeats the true intent of the copy. Also, memcpy over PCIe/XGMI
introduces variability in perf for HPC apps like GROMACS.

Change-Id: I3b5c9d9ce93333959c39023bf4f703e2ccb6e3af
2023-06-27 18:15:26 -04:00
Saleel Kudchadker 8d193c32bb SWDEV-384557 - Use toggle for SDMA query
- Use HIP_USE_SDMA_QUERY env var toggle for new API use. Env var is 0 by
default

Change-Id: If725a0c41e15f78a1a6c3f47942954fe9240b4db
2023-06-15 01:02:24 -04:00
Saleel Kudchadker 60d9a4ebab SWDEV-384557 - Do not fall back to compute
- Use regular copy API if we exhaust free SDMA engines and do not fall back
to compute copy. Falling back to compute is affecting performance for
numerous apps that are GPU bound

Change-Id: I75c767eff0b9f5ada324301c5c327fe2c23a9806
2023-05-22 11:23:23 -04:00
Saleel Kudchadker 0b475284e9 SWDEV-398151 - Partly relax static engine allocation
Change-Id: I4903b51a34b597a2e84d771b52cf629f877dba05
2023-05-11 00:52:18 -04:00
taosang2 7624a48de9 SWDEV-366528 - Fix image memory format updating issue
Add dstMemory format updating.
Separate format updating for srcMemory and dstMemory.

Change-Id: I1692b92d417bbd742d562679f218ebf8ca532e92
2023-05-08 21:43:42 -04:00
Saleel Kudchadker 5865c642d4 SWDEV-384557 - Fix engine status query
- Maintain a map of SDMA engine# to stream allocated following a greedy
approach
- Anything past that will query SDMA engine status always and go with a
SDMA or Blit copy path

Change-Id: Ibfaed7f951ab84d80cb0430596a4d11b5aec9202
2023-04-21 00:57:26 -04:00
Saleel Kudchadker 20ca8b8116 SWDEV-384557 - Leverage SDMA engine status query
Change-Id: I5f386f2965de24a229ea43b6c4da82099692f91f
2023-04-05 07:50:53 +00:00
Jaydeep Patel ad78c5c4a5 SWDEV-382553 - Remove use of useCopyHint.
Change-Id: I82eb5d7569a2a78d7709af9397d4f29c8274d781
2023-03-01 23:20:02 -05:00
jatang b798c85272 SWDEV-380792 - Fix floating point exception when maxEngineClockFrequency_ is 0
Change-Id: Ic443ceae586c4c84995ed2abef9bd7f32f8b60f9
2023-02-07 11:43:10 -05:00
German Andryeyev b23c759746 SWDEV-372790 - Copy AQL packet from runtime setup
Scheduler in device queue requires relaunching itself. Make sure
scheduler uses exactly the same AQL packet as the host launch.

Change-Id: I4eb03c4c91bf2408a6d4607731f081a2e2c2c8ae
2023-01-24 10:25:45 -05:00
Jaydeep Patel 1e4a4162ff SWDEV-378157 - Correct log message
Change-Id: I6297693f67ae78a8874b976ac03353a81b728b1d
2023-01-23 12:06:18 -05:00
Saleel Kudchadker 033d4c0463 SWDEV-345213 - Fix staged line-by-line copy path
- Address an old bug in offset calculation that was causing out-of-bounds
access.
- Improve logging

Change-Id: Iebdf34dddaa5e987cc72184a2152918adc6a96e0
2023-01-16 11:04:30 -05:00
Anusha GodavarthySurya 274f2de391 SWDEV-364576 - initialize device malloc heap state using blit kernel
Change-Id: I5d0172aff7d2c04b322a4d828b8a2b438158b80f
2023-01-07 06:53:53 +00:00
Jaydeep Patel 070ae4e6d4 SWDEV-374370 - Propagate element size to blit kernel.
Change-Id: I06d1ae6feebd238e9a63c617eb4c4dcf485d9ee0
2022-12-26 09:33:50 +00:00
Saleel Kudchadker e0384f9f6b SWDEV-373334 - Use copyMetadata for blit decisions
- Check isAsync flag for small host copies on large bar as it synchronizes
- Use CopyEngine Preference hint if HMM is enabled.

Change-Id: I1ffc4b2604ed03cf5979cdc454178648c5ae5cba
2022-12-15 17:09:02 -05:00
Ioannis Assiouras 72b45e2a1f SWDEV-369581 - Convey copy API metadata to ROCclr
Change-Id: I569462d6d268700d419510255e201bf7d80d6714
2022-12-09 00:27:15 -05:00
Saleel Kudchadker feca11d5e3 SWDEV-301667 - Improve logging
Change-Id: Ifa6da876b85cb503967cf09aac6d477b10db8e63
2022-11-04 18:23:18 -04:00
Saleel Kudchadker 175ad024d3 SWDEV-260345 - Manage constant buffer for blit
- Leverage managed buffer that would use chunks for fill pattern. Use a
different chunk for the next fill to avoid wait

Change-Id: I254483c867e112f66564ffd8f55e0a605d8896c9
2022-07-12 12:41:02 -04:00
Saleel Kudchadker faaa41aab8 SWDEV-335626 - Use ROCr copy for IPC
Detect IPC buffer and use ROCr copy api instead of blit

Change-Id: Ie6bdd6fc45dbd7457611011d81570b53d5fd5276
2022-07-08 13:32:19 -04:00
Ajay d2f837d25f SWDEV-332522 - streamOpsWrite & streamOpsWait to accept memory offset
Change-Id: I4b6ecb4d80c093d038d86616a637c4bb465ae24e
2022-04-25 14:59:36 -04:00
Jason Tang ed7737564e SWDEV-324411 - Use blit kernel for copyBufferRect if atomic is not supported
Change-Id: I2e110fd3418117ee9c7ede379244d2c6c4f248b7
2022-04-24 11:41:16 -04:00
kjayapra-amd 7fb80a027a SWDEV-305527 - Changes to handle memset blit kernel that takes width, height and depth. This also fixes SWDEV-317261.
Change-Id: Ic85f63a95d9d8f48884fc8c7fd95cbb496dfbbca
2022-03-31 09:02:33 -04:00
Satyanvesh Dittakavi c1b95b09bf SWDEV-326397 - P2P copies to take SDMA path if there is no pending dispatch
Change-Id: I50cfb8d77f7882151a20a1de7aaf5219b1695b7d
2022-03-29 14:59:11 +00:00
German Andryeyev 3fd4a67670 SWDEV-316824 - Fix P2P compute copy path
Use device memory object for the GPU VA address look-up.

Change-Id: I76bf58b29205f7b3ba1bf68e9fcca69421267203
2022-02-15 13:20:13 -05:00
Satyanvesh Dittakavi e20dd61932 SWDEV-306939 - Fix vdi errors/warnings by CppCheck
Change-Id: I56d910f8363787f1050d5d7e8064ed553c5827fd
2022-01-12 00:22:16 -05:00
German Andryeyev 008133cf41 SWDEV-305016 - Improve MGPU scaling in Tensorflow
Add a threshold for ROCR/SDMA P2P transfers. ROCR copy path
requires extra barriers in compute for synchronization. That costs
extra performance with tiny transfers.
Reduce active wait time to 10us. Tensorflow uses extra thread
per GPU with constant hipEventQuery() calls. Longer active waits
in ROCr affect CPU performance.

Change-Id: I9020358438615fa2d4617f862f00a562f0a588e7
2021-12-08 11:59:37 -05:00
kjayapra-amd d4ad981c0c SWDEV-312822 - Fix the globalWorkSize to number of sizeof(var) instead of bytes.
Change-Id: Ic6b2bbb2e8d4cb6aa8d906d4b93cd06a176160d8
2021-11-29 17:36:11 -05:00
kjayapra-amd 2e9bc8f793 SWDEV-312822 - Revert "SWDEV-310187 - Change flag to keep track of aligned sizes instead of expanded patterns."
This reverts commit 8307886644.

Change-Id: I022c2a8375f9929e9723cec66e1e0b960263fc39
2021-11-28 23:39:40 -05:00
German Andryeyev 6f2e7c3199 SWDEV-313126 - Use data() method for the base array address
Reference for the first element can trigger an assert with
_GLIBCXX_ASSERTIONS build

Change-Id: I59c63c052831307edfe5dcc6384798a43e9596dd
2021-11-26 09:51:57 -05:00
kjayapra-amd 8307886644 SWDEV-310187 - Change flag to keep track of aligned sizes instead of expanded patterns.
Change-Id: I763feda8688bb1b7b11033a2a8cba0f69f07167d
2021-11-19 10:32:40 -05:00
Bing Ma 02f939a40d SWDEV-306602 - [SANITIZER_AMDGPU] Force copyBuffer to use ROCr functions when ASAN is ON
Change-Id: I04a4cdd5ab8c5543f2a0f08c139c45ac7aebe64a
2021-10-14 12:55:27 -04:00
kjayapra-amd 88ed58735d SWDEV-232903 - Move hipmemset Dword optimization to ROCclr.
Change-Id: I3eae61720cbc6364f1aaac4865bfd8b6ded08097
2021-10-13 11:32:15 -04:00
Jason Tang 55a0cf0b0c SWDEV-306697 - Fix OCLGlobalOffset segfaults
If we don't create the __amd_rocclr_gwsInit kernel, we still want
to create the rest of the image related blit kernels.

Change-Id: I8bc4645f9f9116eeecbb8b22e981ac4d520f3121
2021-10-12 15:13:28 -04:00
kjayapra-amd 7413b7f79b SWDEV-294420 - Ignore Image blit kernels if image instructions are not supported.
Change-Id: I145172672b0b032aa722649b0c4ca9267e3e5c85
2021-10-05 18:12:44 -04:00
Sourabh cbb8d82bdb SWDEV-292525 - [vdi] Path to streamOps shaders
Implementation to use a blit kernel to perform
a hipStreamWait/write instead of an AQL packet.

Change-Id: I462671ed5cec37144dfe97ff66439249196117c1
2021-09-27 13:59:35 -04:00
Saleel Kudchadker 21ba34d0fe SWDEV-297448 - Add 64bit and 16bit write support
For the fillBuffer shader, if there are two 32bit writes to a MMIO
register, it can get dropped. It has to be a single 64bit write.
Add optimization to fillBuffer to write 64bit and 16bit writes.

Change-Id: I3aa78e027898f8ae01e9c8f09004615673720c2b
2021-09-08 12:30:04 -04:00
Sarbojit Sarkar 42d33029dc SWDEV-300655 - Added thread ID to hip trace
Change-Id: I9234d4ec93e7687cd0a5d1bd930bd4f80936311b
2021-09-06 00:22:42 -04:00
agunashe d96481fb36 SWDEV-293742 - Update copyright end year VDI repo
Change-Id: I69d2fea4a7a43adf96ccea794270e4af991c5261
2021-08-22 23:56:07 -07:00
German Andryeyev fa2e154a8b SWDEV-278894 - Use GPU waits for HIP events
Save HW events in amd::Event.
Use HW events for synchronization

Change-Id: I98cf9c2d0ec3c7fcaf254b749ac6c568d7270ae0
2021-05-25 13:41:15 -04:00
Saleel Kudchadker 42b8236f93 SWDEV-280773 - Additional logging for signals
Cleanup new lines in debug log

Change-Id: I6862c332eb9457b51e23cf4e9db9ba3f870d0c39
2021-04-30 15:05:57 -07:00
Saleel Kudchadker aa38af8c96 SWDEV-276120 - Remove support for barrier sync
ROC_BARRIER_SYNC will not work with direct dispatch.
Remove and cleanup.

Change-Id: I81368b2e65039477bd0343bb92708dab48867db6
2021-04-07 17:08:39 -04:00
Ravi C Akkenapally 0a5f9a3b10 SWDEV-179105 - Stream Operations: Add support for Wait and Write
Change-Id: Ibffa1d6d573826b64763da280074a77271d66808
2021-02-15 17:02:38 -08:00
Payam a2e0b0495c SWDEV-257937 - Updated fix for ROC_BARRIER_SYNC=0
Change-Id: I7e28e541b654db57fb0890d7dbb7519cfb2d93db
2021-02-11 14:01:45 -05:00