24 commits

Author SHA1 Message Date
Payam f134b90199 SWDEV-257937 - ROC_BARRIER_SYNC fix for missing SDMA flush
Change-Id: I93e8902bfcb16bac8ea594e16ea397b1ceafbd79
2020-12-15 00:54:33 -05:00
German Andryeyev 18a821acde Add L2 flush/invalidate after CPU copy
CPU read updates L2 with the latest values and requires
invalidation after, because SDMA doesn't use L2 and data can become
out of sync.

Change-Id: I98d1c91ca78a103fa5409e638f97485d62d5b11e
2020-12-11 23:05:49 -05:00
German Andryeyev bd340d8cbf Correct reported info in ROC profiler
OCL can't distinguish different copy types, but ROC profiler
expects SDMA transfer visibility. Add extra code to detect
a transfer with the host memory and substitute OCL command

Change-Id: I5290acd0e10bc082e00c1d4ae1474a075de7f165
2020-10-23 18:29:48 -04:00
German Andryeyev d9397590de Add option to skip AQL barrier
The change reuses HSA signals for dispatches as a wait signal.
Skipping the barrier requires disabling the L2 cache for sysmem
allocations and extra tracking for HDP access with the large bar.
ROC_BARRIER_SYNC=0 activates the new logic. Barrier sync is
still used by default.
ROC_ACTIVE_WAIT=1 enables unconditional active wait in ROCr.
The change also consolidated the ROCr wait logic under a single function.

Change-Id: I6bd1be30aa88258da1b1f9de319ef5a45852afd8
2020-10-06 08:37:12 -04:00
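
The two switches named in this commit message can be pictured as boolean environment flags. The following is only a minimal sketch, not the code from the commit: `parseFlag`, `SyncSettings` and `readSyncSettings` are hypothetical names, and the "off" default for ROC_ACTIVE_WAIT is an assumption (the message only states that barrier sync remains the default).

```cpp
#include <cstdlib>
#include <cstring>

// Hypothetical helper: interprets an environment value as a boolean flag.
// An unset (null) or empty value falls back to the supplied default;
// the literal "0" means off, anything else means on.
bool parseFlag(const char* value, bool defaultValue) {
  if (value == nullptr || *value == '\0') return defaultValue;
  return std::strcmp(value, "0") != 0;
}

// Hypothetical settings holder for the two variables named in the message.
struct SyncSettings {
  bool barrierSync;  // ROC_BARRIER_SYNC: on by default, per the commit message
  bool activeWait;   // ROC_ACTIVE_WAIT: default of "off" is an assumption
};

// Reads both flags from the process environment.
SyncSettings readSyncSettings() {
  return SyncSettings{
      parseFlag(std::getenv("ROC_BARRIER_SYNC"), true),
      parseFlag(std::getenv("ROC_ACTIVE_WAIT"), false),
  };
}
```

Under these assumptions, launching a process with ROC_BARRIER_SYNC=0 in its environment would make `readSyncSettings().barrierSync` false, which corresponds to the new barrier-free path the message describes.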
Alex Xie 7e8f7b5927 SWDEV-249516 - [Lnx][Navi][rocm]conformance image read write tests data error
Change-Id: Ie1c4fda953198b49ed66fea9da23e62c686d9cea
2020-09-01 17:20:58 -04:00
Tao Sang fdef6f722f Apply constexpr on global constant variables
When HIP_ENABLE_DEFERRED_LOADING=0, many global variables are
referenced before they have been initialized. The patch uses
constexpr to initialize global constant variables at compile
time.

Change-Id: I9d538b7abc6a0ce700ec3332b97fc144db5fc1ef
2020-07-22 22:14:13 -04:00
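
The initialization-order problem this message describes can be illustrated with a small sketch. None of the names below (`kDefaultAlignment`, `alignUp`, `EarlyUser`) come from the actual sources; they are hypothetical and only show why a `constexpr` global is safe to read from code that runs during static initialization.

```cpp
#include <cstddef>

// Hypothetical constant. Because it is constexpr, its value is fixed at
// compile time and it needs no dynamic initialization at program startup.
constexpr std::size_t kDefaultAlignment = 256;

// Rounds a size up to the next multiple of the alignment.
constexpr std::size_t alignUp(std::size_t size,
                              std::size_t alignment = kDefaultAlignment) {
  return (size + alignment - 1) / alignment * alignment;
}

// A global object whose constructor runs during static initialization,
// possibly before dynamically initialized globals in other translation units.
struct EarlyUser {
  std::size_t aligned;
  EarlyUser() : aligned(alignUp(100)) {}
};

// Safe regardless of initialization order: the constant read by the
// constructor is a compile-time value.
EarlyUser gEarlyUser;

// The compiler can check the computation itself.
static_assert(alignUp(100) == 256, "alignUp is usable in constant expressions");
```

Had `kDefaultAlignment` been a global that required a run-time initializer, an early reader in another translation unit could have observed it before that initializer ran.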
Jatin Chaudhary cd1e364911 Replacing deprecated HSA API calls with newer ones
Change-Id: Iebe2c00e717ab0e47c61611752b717966c719994
2020-07-08 00:32:24 -04:00
Vlad Sytchenko 5b9af8f28d Fix some -Wunused-but-set-variable warnings
Change-Id: I281583b5abdfc09d5dd8b7dfb20b8821581db193
2020-06-15 17:51:01 -04:00
German Andryeyev 2ce6bbebc4 Fix async mem clear
Optimization for the fence release removed a sync for mem fill.
Add simple const buffer management for the filled pattern to avoid
pattern overwriting with the async fills.

Change-Id: I63773ac09ceec31d5396d24570e4647ff096326b
2020-05-20 11:13:41 -04:00
Jason Tang cd2a713d63 Add major/minor/stepping to device layer
Change-Id: If82ea55a46b166b243a98089a6e9c40ccfdb479f
2020-05-17 12:57:34 -04:00
Christophe Paquot 6a5af4056e Use system scope for packet following sdma copies
SWDEV-234947
SWDEV-236298
Instead of forcing a barrier packet, just inject system scope on the next packet.

Change-Id: If9bcee23e08dfe5db731235e2fcb30582cbd4c1c
2020-05-15 12:20:06 -04:00
Christophe Paquot 2a02026696 Add gpu().hasPendingDispatch() in the SDMA path
SWDEV-234947

Change-Id: I8aa501f8755d136708b0d12ee3c30229c238660d
2020-05-08 18:19:51 -04:00
Michael LIAO 503ef06555 Clear executable permission.
Change-Id: Ia0d363b1ba89d7947e5b5a55cb67edba86f0515e
2020-05-07 10:38:58 -04:00
Alex Xie bfbc8cd09b SWDEV-234684 - hipmemcpy optimization does not work in tests
Change-Id: I899d172c5b2af88c796fe9a36f97d15ac45caf94
2020-05-05 15:58:03 -04:00
German Andryeyev 7302ebcfbc Optimize synch operations
- Stall the queue only for HSA copy operations

Change-Id: Ia3debcc0f36284c5f8cd2776d31674f3aeed04ea
2020-04-30 11:17:48 -04:00
Alex Xie 6c5a42b33c SWDEV-232894 Port hipMemcpy optimizations from HCC to VDI
Apply the optimization to OpenCL too.
Clean up some unnecessary checks.

Change-Id: I840261fe35baeeadeba7388e86779d482f509aad
2020-04-30 11:06:28 -04:00
Saleel Kudchadker 5f64e6e7ad Add a threshold for forcing ROCr to take blit path
This workaround avoids the performance penalty of the SDMA engine
taking a while to clock up from a lower DPM state. Add the env var
GPU_FORCE_BLIT_COPY_SIZE (in KB; 1024 by default for HIP). Forcing
the Src and Dst agents to be amdgpu makes ROCr take the blit copy
path for what would otherwise have been an SDMA copy.

Change-Id: I222f687155f86000d17d66d25182e490b6710463
2020-04-28 17:11:24 -04:00
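
The size threshold described in this message could be sketched as a simple decision function. This is an illustration under stated assumptions, not the code from the commit: `CopyPath` and `chooseCopyPath` are hypothetical names, and treating a copy of exactly the threshold size as a blit copy is an assumption, since the message does not say which side the boundary falls on.

```cpp
#include <cstddef>

// Hypothetical labels for the two copy paths named in the message.
enum class CopyPath { Blit, Sdma };

// Picks a copy path from the transfer size in bytes and a threshold in KB
// (the unit the message gives for GPU_FORCE_BLIT_COPY_SIZE).
// Assumption: sizes at or below the threshold are forced onto the blit path,
// so small copies do not wait for the SDMA engine to clock up.
CopyPath chooseCopyPath(std::size_t bytes, std::size_t thresholdKB) {
  const std::size_t thresholdBytes = thresholdKB * 1024;
  return (bytes <= thresholdBytes) ? CopyPath::Blit : CopyPath::Sdma;
}
```

With the HIP default of 1024 KB, a 4 KB copy would take the blit path in this sketch, while a 2 MB copy would still go to SDMA.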
Alex Xie 009d0b5f55 SWDEV-232894 Port hipMemcpy optimizations from HCC to VDI
Change-Id: I6bebe9ac503a9f80d067aeea8a848409ad210338
2020-04-27 14:53:58 -04:00
German Andryeyev 89133a7301 SWDEV-232807
[ROCm][TCT][HIP] cooperative stream test case is failing.

Make sure lockXfer() in the blit manager returns a valid value.
Port the latest PAL backend logic into the ROCr backend.
This change doesn't fix the issue reported in the ticket.

Change-Id: I54101a824f49a2dcfbbf5414cb5b3af41745306d
2020-04-23 15:01:02 -04:00
kjayapra-amd 7458bf9964 SWDEV-229840 - Improve error messages on ROCCLR Layer.
Change-Id: Iab7d9156cdc206db86385aa05023a0095ed40f92
2020-04-19 20:01:49 -04:00
Alex Xie 3e247d2afd SWDEV-229731 - [Lnx][ROCm][Navi]Support images in full OpenCL conformance tests
Duplicate similar blit logic from PAL path

Tests:
1D Array image read/write tests and copy image tests passed

Change-Id: I838bbde252ad0108bfeb82c0c2b669881747c0af
2020-04-04 09:28:37 -04:00
Laurent Morichetti d9d9c69399 Replace cl_* integral types with standard types.
cl_bool -> bool
cl_int -> int32_t
cl_uint -> uint32_t
cl_long -> int64_t
cl_ulong -> uint64_t
cl_float -> float
cl_double -> double
cl_bitfield -> uint64_t

Change-Id: I840c8993b55f98f5b745d21e27f5f28233647a58
2020-02-12 13:16:06 -08:00
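
To make the type mapping in this message concrete, here is a minimal sketch of what a struct might look like after such a replacement. `DeviceInfoSketch` and its field names are hypothetical and do not come from the actual sources; the comments note the OpenCL type each field would have had before.

```cpp
#include <cstdint>

// Hypothetical device-info struct written with standard types, following
// the mapping listed in the commit message.
struct DeviceInfoSketch {
  bool hostUnifiedMemory;         // was cl_bool
  std::uint32_t maxComputeUnits;  // was cl_uint
  std::uint64_t globalMemSize;    // was cl_ulong
  std::uint64_t queueProperties;  // was cl_bitfield
};

// The fixed-width replacements keep the 32- and 64-bit sizes of the
// OpenCL integer types they stand in for.
static_assert(sizeof(std::uint32_t) == 4, "cl_uint is a 32-bit type");
static_assert(sizeof(std::uint64_t) == 8, "cl_ulong is a 64-bit type");
```

One point worth checking in a change like this: in the Khronos headers `cl_bool` is a typedef of `cl_uint`, so a field changed to `bool` will usually shrink from four bytes to one. That matters wherever such a value is copied out to API callers by size.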
Laurent Morichetti b4c6143a2f Update copyright info
Change-Id: Ia4f9ff0f5f873b4223a8cca154188bb0d2f1abba
2020-02-04 09:26:14 -08:00
Laurent Morichetti 20c7173849 Merge branch 'origin/pghafari/vdi-prototype' into lmoriche/amd-master
Change-Id: Id3b833d405596735becb3346f3b08c6da57033fe
2020-01-30 20:12:13 -08:00