kjayapra-amd
6f5277c701
SWDEV-408473 - Add wait time of 10 us if the waiting signal copy was < 24K.
...
Change-Id: I438ec9eb07e5034042a4a9a5e6e51d74daba2c83
2023-08-23 10:46:33 -04:00
victzhan
cb426df1bd
SWDEV-416580 - When memory has direct access, only use host fill if the image is small
...
Change-Id: I3509c4aa21f6413adad3b46273ec650f5c577ddd
2023-08-17 17:23:49 -04:00
Jaydeep Patel
289535e805
SWDEV-412393 - Force alloc memory to avoid another hsa image creation.
...
Change-Id: Ia3cd99eb736231e6dfe013ebae6c41fd4cc657bc
2023-08-17 05:18:43 +00:00
Saleel Kudchadker
aa6eb555e2
SWDEV-384557 - Enable SDMA query
...
Change-Id: Ibb0a8d131f799985a4d4adbf753261e58c04157f
2023-08-01 18:41:23 -04:00
Saleel Kudchadker
5447cf8872
SWDEV-301667 - Disable HostBlit copy for HIP
...
Change-Id: I46333ff42e8c1d402ece97e3ead7b539a27c3f82
2023-07-17 17:49:11 -04:00
Saleel Kudchadker
770b2a4711
SWDEV-384557 - Rename env var
...
- Rename HIP_USE_SDMA_QUERY to DEBUG_CLR_USE_SDMA_QUERY as this is
supposed to be a temporary env var for debug purposes only.
Change-Id: If6ebd52ab87624375a3df24ceccdcc05c60a65af
2023-06-29 13:54:55 -04:00
German Andryeyev
d29755452b
SWDEV-396088 - Add image view cache
...
Blit manager requires an image view to reduce the number
of copy kernels. Creation/destruction of a view in ROCr is
an expensive operation. Thus, runtime can cache views for fast access.
Change-Id: Ia67d775b481cc8326d91215ca22d4a73c1dddb59
2023-06-28 09:44:05 -04:00
Saleel Kudchadker
0a3d4bd4d4
SWDEV-408180 - Remove largeBar memcpy
...
- Remove large bar memcpy path. Since we end up waiting for a barrier,
it defeats the true intent of the copy. Also, memcpy over PCIe/XGMI
introduces variability in perf for HPC apps like GROMACS
Change-Id: I3b5c9d9ce93333959c39023bf4f703e2ccb6e3af
2023-06-27 18:15:26 -04:00
Saleel Kudchadker
8d193c32bb
SWDEV-384557 - Use toggle for SDMA query
...
- Use HIP_USE_SDMA_QUERY env var toggle for new API use. Env var is 0 by
default
Change-Id: If725a0c41e15f78a1a6c3f47942954fe9240b4db
2023-06-15 01:02:24 -04:00
Saleel Kudchadker
60d9a4ebab
SWDEV-384557 - Do not fall back to compute
...
- Use regular copy API if we exhaust free SDMA engines and do not fall back
to compute copy. Falling back to compute is affecting performance for
numerous apps that are GPU bound
Change-Id: I75c767eff0b9f5ada324301c5c327fe2c23a9806
2023-05-22 11:23:23 -04:00
Saleel Kudchadker
0b475284e9
SWDEV-398151 - Partly relax static engine allocation
...
Change-Id: I4903b51a34b597a2e84d771b52cf629f877dba05
2023-05-11 00:52:18 -04:00
taosang2
7624a48de9
SWDEV-366528 - Fix image memory format updating issue
...
Add dstMemory format updating.
Separate format updating for srcMemory and dstMemory.
Change-Id: I1692b92d417bbd742d562679f218ebf8ca532e92
2023-05-08 21:43:42 -04:00
Saleel Kudchadker
5865c642d4
SWDEV-384557 - Fix engine status query
...
- Maintain a map of SDMA engine# to stream allocated following a greedy
approach
- Anything past that will always query SDMA engine status and go with an
SDMA or Blit copy path
Change-Id: Ibfaed7f951ab84d80cb0430596a4d11b5aec9202
2023-04-21 00:57:26 -04:00
Saleel Kudchadker
20ca8b8116
SWDEV-384557 - Leverage SDMA engine status query
...
Change-Id: I5f386f2965de24a229ea43b6c4da82099692f91f
2023-04-05 07:50:53 +00:00
Jaydeep Patel
ad78c5c4a5
SWDEV-382553 - Remove use of useCopyHint.
...
Change-Id: I82eb5d7569a2a78d7709af9397d4f29c8274d781
2023-03-01 23:20:02 -05:00
jatang
b798c85272
SWDEV-380792 - Fix floating point exception when maxEngineClockFrequency_ is 0
...
Change-Id: Ic443ceae586c4c84995ed2abef9bd7f32f8b60f9
2023-02-07 11:43:10 -05:00
German Andryeyev
b23c759746
SWDEV-372790 - Copy AQL packet from runtime setup
...
Scheduler in device queue requires relaunching itself. Make sure
scheduler uses exactly the same AQL packet as the host launch.
Change-Id: I4eb03c4c91bf2408a6d4607731f081a2e2c2c8ae
2023-01-24 10:25:45 -05:00
Jaydeep Patel
1e4a4162ff
SWDEV-378157 - Correct log message
...
Change-Id: I6297693f67ae78a8874b976ac03353a81b728b1d
2023-01-23 12:06:18 -05:00
Saleel Kudchadker
033d4c0463
SWDEV-345213 - Fix staged line-by-line copy path
...
- Address an old bug in offset calculation that was causing out-of-bounds
access.
- Improve logging
Change-Id: Iebdf34dddaa5e987cc72184a2152918adc6a96e0
2023-01-16 11:04:30 -05:00
Anusha GodavarthySurya
274f2de391
SWDEV-364576 - initialize device malloc heap state using blit kernel
...
Change-Id: I5d0172aff7d2c04b322a4d828b8a2b438158b80f
2023-01-07 06:53:53 +00:00
Jaydeep Patel
070ae4e6d4
SWDEV-374370 - Propagate element size to blit kernel.
...
Change-Id: I06d1ae6feebd238e9a63c617eb4c4dcf485d9ee0
2022-12-26 09:33:50 +00:00
Saleel Kudchadker
e0384f9f6b
SWDEV-373334 - Use copyMetadata for blit decisions
...
- Check isAsync flag for small host copies on large bar as it synchronizes
- Use CopyEngine Preference hint if HMM is enabled.
Change-Id: I1ffc4b2604ed03cf5979cdc454178648c5ae5cba
2022-12-15 17:09:02 -05:00
Ioannis Assiouras
72b45e2a1f
SWDEV-369581 - Convey copy API metadata to ROCclr
...
Change-Id: I569462d6d268700d419510255e201bf7d80d6714
2022-12-09 00:27:15 -05:00
Saleel Kudchadker
feca11d5e3
SWDEV-301667 - Improve logging
...
Change-Id: Ifa6da876b85cb503967cf09aac6d477b10db8e63
2022-11-04 18:23:18 -04:00
Saleel Kudchadker
175ad024d3
SWDEV-260345 - Manage constant buffer for blit
...
- Leverage managed buffer that would use chunks for fill pattern. Use a
different chunk for the next fill to avoid wait
Change-Id: I254483c867e112f66564ffd8f55e0a605d8896c9
2022-07-12 12:41:02 -04:00
Saleel Kudchadker
faaa41aab8
SWDEV-335626 - Use ROCr copy for IPC
...
Detect IPC buffer and use ROCr copy api instead of blit
Change-Id: Ie6bdd6fc45dbd7457611011d81570b53d5fd5276
2022-07-08 13:32:19 -04:00
Ajay
d2f837d25f
SWDEV-332522 - streamOpsWrite & streamOpsWait to accept memory offset
...
Change-Id: I4b6ecb4d80c093d038d86616a637c4bb465ae24e
2022-04-25 14:59:36 -04:00
Jason Tang
ed7737564e
SWDEV-324411 - Use blit kernel for copyBufferRect if atomic is not supported
...
Change-Id: I2e110fd3418117ee9c7ede379244d2c6c4f248b7
2022-04-24 11:41:16 -04:00
kjayapra-amd
7fb80a027a
SWDEV-305527 - Changes to handle memset blit kernel that takes width, height and depth. This also fixes SWDEV-317261.
...
Change-Id: Ic85f63a95d9d8f48884fc8c7fd95cbb496dfbbca
2022-03-31 09:02:33 -04:00
Satyanvesh Dittakavi
c1b95b09bf
SWDEV-326397 - P2P copies to take SDMA path if there is no pending dispatch
...
Change-Id: I50cfb8d77f7882151a20a1de7aaf5219b1695b7d
2022-03-29 14:59:11 +00:00
German Andryeyev
3fd4a67670
SWDEV-316824 - Fix P2P compute copy path
...
Use device memory object for the GPU VA address look-up.
Change-Id: I76bf58b29205f7b3ba1bf68e9fcca69421267203
2022-02-15 13:20:13 -05:00
Satyanvesh Dittakavi
e20dd61932
SWDEV-306939 - Fix vdi errors/warnings by CppCheck
...
Change-Id: I56d910f8363787f1050d5d7e8064ed553c5827fd
2022-01-12 00:22:16 -05:00
German Andryeyev
008133cf41
SWDEV-305016 - Improve MGPU scaling in Tensorflow
...
Add a threshold for ROCR/SDMA P2P transfers. ROCR copy path
requires extra barriers in compute for synchronization. That costs
extra performance with tiny transfers.
Reduce active wait time to 10us. Tensorflow uses extra thread
per GPU with constant hipEventQuery() calls. Longer active waits
in ROCr affect CPU performance.
Change-Id: I9020358438615fa2d4617f862f00a562f0a588e7
2021-12-08 11:59:37 -05:00
kjayapra-amd
d4ad981c0c
SWDEV-312822 - Fix the globalWorkSize to number of sizeof(var) instead of bytes.
...
Change-Id: Ic6b2bbb2e8d4cb6aa8d906d4b93cd06a176160d8
2021-11-29 17:36:11 -05:00
kjayapra-amd
2e9bc8f793
SWDEV-312822 - Revert "SWDEV-310187 - Change flag to keep track of aligned sizes instead of expanded patterns."
...
This reverts commit 8307886644.
Change-Id: I022c2a8375f9929e9723cec66e1e0b960263fc39
2021-11-28 23:39:40 -05:00
German Andryeyev
6f2e7c3199
SWDEV-313126 - Use data() method for the base array address
...
Reference for the first element can trigger an assert with
_GLIBCXX_ASSERTIONS build
Change-Id: I59c63c052831307edfe5dcc6384798a43e9596dd
2021-11-26 09:51:57 -05:00
kjayapra-amd
8307886644
SWDEV-310187 - Change flag to keep track of aligned sizes instead of expanded patterns.
...
Change-Id: I763feda8688bb1b7b11033a2a8cba0f69f07167d
2021-11-19 10:32:40 -05:00
Bing Ma
02f939a40d
SWDEV-306602 - [SANITIZER_AMDGPU] Force copyBuffer to use ROCr functions when ASAN is ON
...
Change-Id: I04a4cdd5ab8c5543f2a0f08c139c45ac7aebe64a
2021-10-14 12:55:27 -04:00
kjayapra-amd
88ed58735d
SWDEV-232903 - Move hipmemset Dword optimization to ROCclr.
...
Change-Id: I3eae61720cbc6364f1aaac4865bfd8b6ded08097
2021-10-13 11:32:15 -04:00
Jason Tang
55a0cf0b0c
SWDEV-306697 - Fix OCLGlobalOffset segfaults
...
If we don't create the __amd_rocclr_gwsInit kernel, we still want
to create the rest of the image related blit kernels.
Change-Id: I8bc4645f9f9116eeecbb8b22e981ac4d520f3121
2021-10-12 15:13:28 -04:00
kjayapra-amd
7413b7f79b
SWDEV-294420 - Ignore Image blit kernels if image instructions are not supported.
...
Change-Id: I145172672b0b032aa722649b0c4ca9267e3e5c85
2021-10-05 18:12:44 -04:00
Sourabh
cbb8d82bdb
SWDEV-292525 - [vdi] Path to streamOps shaders
...
Implementation to use a blit kernel to perform
a hipStreamWait/write instead of an AQL packet.
Change-Id: I462671ed5cec37144dfe97ff66439249196117c1
2021-09-27 13:59:35 -04:00
Saleel Kudchadker
21ba34d0fe
SWDEV-297448 - Add 64bit and 16bit write support
...
For the fillBuffer shader, if there are two 32-bit writes to an MMIO
register, it can get dropped. It has to be a single 64-bit write.
Add optimization to fillBuffer to issue 64-bit and 16-bit writes.
Change-Id: I3aa78e027898f8ae01e9c8f09004615673720c2b
2021-09-08 12:30:04 -04:00
Sarbojit Sarkar
42d33029dc
SWDEV-300655 - Added thread ID to hip trace
...
Change-Id: I9234d4ec93e7687cd0a5d1bd930bd4f80936311b
2021-09-06 00:22:42 -04:00
agunashe
d96481fb36
SWDEV-293742 - Update copyright end year VDI repo
...
Change-Id: I69d2fea4a7a43adf96ccea794270e4af991c5261
2021-08-22 23:56:07 -07:00
German Andryeyev
fa2e154a8b
SWDEV-278894 - Use GPU waits for HIP events
...
Save HW events in amd::Event.
Use HW events for synchronization
Change-Id: I98cf9c2d0ec3c7fcaf254b749ac6c568d7270ae0
2021-05-25 13:41:15 -04:00
Saleel Kudchadker
42b8236f93
SWDEV-280773 - Additional logging for signals
...
Cleanup new lines in debug log
Change-Id: I6862c332eb9457b51e23cf4e9db9ba3f870d0c39
2021-04-30 15:05:57 -07:00
Saleel Kudchadker
aa38af8c96
SWDEV-276120 - Remove support for barrier sync
...
ROC_BARRIER_SYNC will not work with direct dispatch.
Remove and cleanup.
Change-Id: I81368b2e65039477bd0343bb92708dab48867db6
2021-04-07 17:08:39 -04:00
Ravi C Akkenapally
0a5f9a3b10
SWDEV-179105 - Stream Operations: Add support for Wait and Write
...
Change-Id: Ibffa1d6d573826b64763da280074a77271d66808
2021-02-15 17:02:38 -08:00
Payam
a2e0b0495c
SWDEV-257937 - Updated fix for ROC_BARRIER_SYNC=0
...
Change-Id: I7e28e541b654db57fb0890d7dbb7519cfb2d93db
2021-02-11 14:01:45 -05:00