Commit graph

247 Commits

Author SHA1 Message Date
Vladislav Sytchenko f45cea29b7 [PAL] Clamp max image buffer size...
to the maximum size we can possibly create.

Change-Id: Iade51d84fdada4ae1299d9b2410d373a46357c66
2020-12-15 12:14:09 -05:00
Payam f134b90199 SWDEV-257937 - ROC_BARRIER_SYNC fix for missing SDMA flush
Change-Id: I93e8902bfcb16bac8ea594e16ea397b1ceafbd79
2020-12-15 00:54:33 -05:00
Jason Tang d4316141b7 SWDEV-263539 - More Target ID fix in GSL path
On behalf of Tony Tye. The bootleg passes CQE Baffin/Ellesmere test.

Change-Id: I4c21d21b3aaba360682ef15b8a4dda239f8af276
2020-12-14 16:48:55 -05:00
German Andryeyev 18a821acde Add L2 flush/invalidate after CPU copy
CPU read updates L2 with the latest values and requires
invalidation after, because SDMA doesn't use L2 and data can become
out of sync.

Change-Id: I98d1c91ca78a103fa5409e638f97485d62d5b11e
2020-12-11 23:05:49 -05:00
Jason Tang eef8405041 SWDEV-263539 - Support new Target ID in GSL path
Change-Id: I6827de93b10b312a1b78b69f5cf7d5b3d5bb1e31
2020-12-11 15:56:54 -05:00
Alex Xie 2505d68eba SWDEV-256126 - Linux pro Nuke app crash with "Out of memory"
Out of memory while running RIP plugin test

Change-Id: I8d6859a45b871f96ac027f8c7274f716e8524a3c
2020-12-10 11:44:54 -05:00
German Andryeyev 1fde842703 Fix a deadlock in ROCr backend
When the OCL ROCr backend performs CL_MEM_COPY_HOST_PTR, it may attempt
to access the amd::Memory object it is currently creating,
which is not ready yet. The logic creates a temporary dummy object
to perform a copy transfer. The new change makes sure the runtime
skips allocating the same device::Memory object a second time.

Change-Id: I14c6a00a3941fdcaa6aea299e9f096e4c3f5cadf
2020-12-09 13:23:17 -05:00
Jason Tang b9520ce4cd SWDEV-263435 - Get code object version the correct way
Change-Id: I18877c116e2f013ec9d04411258c0df8cc0159b3
2020-12-05 15:51:26 -05:00
Saleel Kudchadker 59c6cb0268 Use barrier packets for event profiling
Use barrier packets for every profile marker that gets submitted
and use the completion signal to get the GPU timestamp. This gives the
most accurate dispatch time. Combine cache flushes with the profile marker
if there is a pending dispatch that needs a cache flush. This optimization
saves an extra barrier and reduces wall time.

Change-Id: Ib62d6d7aabf4743827b561be6c9c5afa813203da
2020-12-03 13:45:14 -05:00
Jason Tang 054f256589 SWDEV-260632 - [PAL] Report correct Target ID
Change-Id: Ia39395e2c02e7c95b3df93be1f8030b4fa734583
2020-12-01 18:33:25 -05:00
German Andryeyev 4af8b53846 Enable GPU memory in HMM by default
Change-Id: Ifec4733dc7a932163d921ebe1ae9fbd594ea1ef2
2020-11-30 12:39:18 -05:00
German Andryeyev 08b846ae12 Remove obsolete terminate() method
Change-Id: I66b4a74f17977f1af320f402402a2f1b602e9911
2020-11-30 11:46:09 -05:00
Jason Tang 0c62d3bf1c SWDEV-260632 - [PAL] Use new Target ID format
Change-Id: Icd2d95b9c3f5adbd295fb2272bf453ccb9f09678
2020-11-24 17:38:13 -05:00
Alex Xie 6327dbc4cc SWDEV-258808 - OCLSeparateCompile subtest of oclcompiler error
[PAL to KFD/ROCr][ROCr_Runtime][Vega10] OCLSeparateCompile subtest of
oclcompiler from ocltst test package is encountering clLinkProgram()
failed (chksum 0x00000001) error

If the runtime does not provide a dump file name to the ELF library,
the ELF library uses a temp file in the current folder.
The current folder may not be writable for several reasons:
1. The application's current folder might be a system folder for which
  the user does not have write permission.
2. The current folder is on a read-only file system. This happens for
embedded customers.

Tested in VEGA10. Issue was fixed.

Change-Id: Ic0e9f040b7c7583914301673cce237ab28b0c0cb
2020-11-24 15:08:12 -05:00
Aryan Salmanpour 72277c29b0 don't update maxComputeUnits_ if any exception occurs during conversion of global CU mask string
Change-Id: I7664809fe84d7422b18b1272ffeb642e03a39f1a
2020-11-23 09:51:19 -05:00
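The commit above keeps maxComputeUnits_ unchanged when converting the global CU mask string throws. The following is a minimal sketch of that pattern, not the runtime's code: the Settings struct, the ApplyGlobalCuMask function, the hexadecimal mask format, and the default of 64 units are all assumptions made for illustration. Only the field name maxComputeUnits_ comes from the commit title.

```cpp
#include <bitset>
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>

// Hypothetical settings holder; only the field name follows the commit title.
struct Settings {
  std::uint32_t maxComputeUnits_ = 64;  // assumed default, for illustration
};

// Parses a hexadecimal mask string (assumed format). Only when parsing
// succeeds is maxComputeUnits_ lowered to the number of set bits.
// Returns false and leaves the settings untouched on any failure.
inline bool ApplyGlobalCuMask(const std::string& text, Settings& s) {
  try {
    std::size_t used = 0;
    const unsigned long long mask = std::stoull(text, &used, 16);
    if (used != text.size() || mask == 0) return false;  // trailing junk or empty mask
    const auto enabled =
        static_cast<std::uint32_t>(std::bitset<64>(mask).count());
    assert(enabled > 0);  // follows from mask != 0
    if (enabled < s.maxComputeUnits_) s.maxComputeUnits_ = enabled;
    return true;
  } catch (const std::invalid_argument&) {
    return false;  // not a number at all
  } catch (const std::out_of_range&) {
    return false;  // value does not fit in 64 bits
  }
}
```

The update happens only after every step that can throw has finished, so a malformed string cannot leave a partially updated value behind.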
Jason Tang 3351b9c993 SWDEV-260632 - [PAL] Simplify NullDevice::init()
Change-Id: I9d44162f38806e3742c18da48e382baafeb7060f
2020-11-21 10:00:35 -05:00
Aryan Salmanpour d03ee6eff6 Add an environment variable for setting a global CU mask
Change-Id: I773b152023c7b8e1e679a42015748f9b23fd946d
2020-11-20 10:05:09 -05:00
Vladislav Sytchenko b4e212a0f9 [PAL] Force large buffer mappings to use pinned memory
PAL doesn't perform chunking for system memory allocations, hence we
should fall back to using pinned memory for mapping large buffers.

Change-Id: I1b472616b72d12ed0105fb65532acacdb98ac7b3
2020-11-18 17:12:32 -05:00
Vladislav Sytchenko ec130a5a28 Disable branch-fold optimization temporarily for some Adobe apps
Change-Id: I8b4af4decb6b3ba4b856167ffb0ae8200b21a835
2020-11-17 12:51:33 -05:00
Vladislav Sytchenko 026baec57b [PAL] Navi23 support
Change-Id: I10bb0653746060bd83ca7feda10fdafc07ced845
2020-11-13 15:08:04 -05:00
Vladislav Sytchenko 5e60e06a50 [PAL] Navi22 support
Change-Id: I9f1741898b4afaa0e787d8053d8f006ee3d17017
2020-11-13 15:00:57 -05:00
Vladislav Sytchenko 353a018bce [PAL] Report actual HW limits for max image buffer size
Change-Id: I62aa3f1e9709b91ba223af0abf8bf6395fe8ec59
2020-11-13 14:59:50 -05:00
Jason Tang 2ee2392f63 SWDEV-260376 - [PAL] Fix Windows build
Change-Id: I788198b5980a46981de4b2e7aaa6a495e6e98cad
2020-11-13 09:51:43 -05:00
Jason Tang b1d75637bd SWDEV-260376 - [PAL] Use Pal::AsicRevision to match device
A device's offset in Pal::AsicRevision can change from time to time, while the current implementation assumes the offset never changes.

Change-Id: Id993512aa0da6e0b2356f594d5e58f76d1f97f16
2020-11-12 09:49:48 -05:00
German Andryeyev 234a94f838 Add SPM support for RGP
RGP protocol supports SPM collection. Enable it in the PAL backend.

Change-Id: I0fa17334addad037ba6689d11fff0993f7899e66
2020-11-11 13:10:23 -05:00
Konstantin Zhuravlyov ee6b0d9294 SWDEV-198415 - Implement Target ID Proposal
Changes from Jason Tang, Tony Tye

Change-Id: Idb9b6923f12dfb61a5773c9aa3d3fbeb1327ec47
2020-11-10 13:22:58 -05:00
Alex Xie f5ce682ac3 SWDEV-258132 [LNX][Navi23] Segmentation fault when run clinfo
Change-Id: Ic8833726214b32f70a35f3922baf2afae87b25af
2020-11-03 14:40:31 -05:00
Tao Sang 6a6faf1d58 Fix crash in delete of TempWrapper
OCLTST crashes at oclruntime.OCLKernelBinary on
Tahiti because the TempWrapper destructor deletes a
pointer as an array when it is actually a single pointer.
The fix corrects the delete in the TempWrapper destructor.

Change-Id: Ic5a1387a426c102b085a4ef8ff8ff05e6a870cba
2020-10-26 16:04:39 -04:00
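The TempWrapper fix above appears to concern a mismatch between how an object was allocated and how it was deleted. That reading is an assumption based on the commit message alone. The sketch below shows the general rule in C++: memory obtained with scalar new must be released with scalar delete, since using delete[] on it is undefined behavior. SingleOwner and Tracked are hypothetical names and do not reflect the runtime's TempWrapper.

```cpp
#include <cassert>

// Hypothetical element type that counts destructor calls.
struct Tracked {
  static int destroyed;
  ~Tracked() { ++destroyed; }
};
int Tracked::destroyed = 0;

// Hypothetical owner of a single heap object (not the runtime's TempWrapper).
template <typename T>
class SingleOwner {
 public:
  explicit SingleOwner(T* p) : ptr_(p) {}
  SingleOwner(const SingleOwner&) = delete;
  SingleOwner& operator=(const SingleOwner&) = delete;
  // Scalar delete matches scalar new; delete[] here would be undefined behavior.
  ~SingleOwner() { delete ptr_; }

 private:
  T* ptr_;
};

// Usage sketch: exactly one destructor runs when the owner leaves scope.
inline void Demo() {
  const int before = Tracked::destroyed;
  { SingleOwner<Tracked> owner(new Tracked()); }
  assert(Tracked::destroyed == before + 1);
}
```

In new code, std::unique_ptr<T> and std::unique_ptr<T[]> encode the same distinction in the type, so the matching delete form is chosen automatically.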
Jason Tang 1da0fe4263 SWDEV-254181 - Fix ocl min_max_image_buffer_size regression.
ROCr is now reporting the actual HW addressing limits for HIP, so OpenCL has to impose a lower limit.

Change-Id: I60c2ce27ed1d1f45f16fb76438965a236ba872c6
2020-10-26 09:15:31 -04:00
German Andryeyev bd340d8cbf Correct reported info in ROC profiler
OCL can't distinguish different copy types, but ROC profiler
expects SDMA transfer visibility. Add extra code to detect
a transfer with the host memory and substitute the OCL command.

Change-Id: I5290acd0e10bc082e00c1d4ae1474a075de7f165
2020-10-23 18:29:48 -04:00
Alex Xie e5588f188c SWDEV-256126 - Linux pro Nuke app crash with "Out of memory" while running Rip plugin test
We unmap a memory with a different pointer.
ROCr runtime might be confused and silently ignore the unmap request

Change-Id: Ic5a1387a426cf02a985a4ef8ff8ff05e6a870cbf
2020-10-21 11:33:42 -04:00
Vladislav Sytchenko be66e29e94 [PAL] Change free mem tracking logic to use PAL size
PAL may internally align up the allocation size to the page size
reported by KMD. This will cause a mismatch in size between OCL and PAL.

To avoid this, use PAL size when updating the free memory counter on
both alloc and free.

Change-Id: Ic6e8c861a52170476474fb70a769eef93be3261f
2020-10-20 12:10:14 -04:00
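The free-memory change above can be illustrated with a small counter that charges and credits the same aligned size. This is only a sketch under stated assumptions: FreeMemTracker and AlignedSize are hypothetical names, and rounding up to a power-of-two page size stands in for whatever alignment PAL actually applies.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the size the lower layer really reserves:
// the request rounded up to a power-of-two page size.
inline std::uint64_t AlignedSize(std::uint64_t requested,
                                 std::uint64_t pageSize) {
  assert(pageSize != 0 && (pageSize & (pageSize - 1)) == 0);
  return (requested + pageSize - 1) & ~(pageSize - 1);
}

// Hypothetical free-memory counter; not the runtime's real class.
class FreeMemTracker {
 public:
  FreeMemTracker(std::uint64_t total, std::uint64_t pageSize)
      : free_(total), pageSize_(pageSize) {}

  // Charges the aligned size and returns it so the caller can pass the
  // same value to Free(). Returns 0 if not enough memory is left.
  std::uint64_t Alloc(std::uint64_t requested) {
    const std::uint64_t actual = AlignedSize(requested, pageSize_);
    if (actual > free_) return 0;
    free_ -= actual;
    return actual;
  }

  // Credits back exactly the size that Alloc() charged.
  void Free(std::uint64_t actual) { free_ += actual; }

  std::uint64_t FreeBytes() const { return free_; }

 private:
  std::uint64_t free_;
  std::uint64_t pageSize_;
};
```

If the counter were decremented by the requested size but incremented by the aligned size (or the reverse), it would drift with every allocation. Using one size on both paths keeps it consistent.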
Jason Tang 25cc965c76 Change file mode 755 back to 644
Change-Id: I4ba5d66997ffd3331c56674d4bf805160dcdf049
2020-10-19 15:09:32 -04:00
German Andryeyev cb4e6bd264 Upgrade PAL interface version to 632
Change-Id: I1b8b936cc9bc59ff80296cc6bf5137c3af398c5d
2020-10-16 13:12:16 -04:00
German Andryeyev a5661192b6 Reduce the number of allocated signals
Enable this optimization when the barrier is disabled, since
reuse requires a signal wait.
Use the size of pending AQL signals as the size of signal pool.

Change-Id: I2754a0f8b67e19d2601c58945e10fdf0e8be1624
2020-10-15 16:39:33 -04:00
Alex Xie e4e6c46356 SWDEV-251360 - Add tracing for memory allocation/free.
This can be used to debug VM faults.

Change-Id: I7685485b0450ea84d10b710639ad7b6c5ec2fcf3
2020-10-15 15:38:55 -04:00
Rahul Garg a224dc2a7b Fix PCI bus domain ID
SWDEV-256338

Change-Id: I09afdca4f1a08f99ce662a4c4ed8a51d85500699
2020-10-15 14:59:57 -04:00
Rahul Garg c0f8b52f06 Add image1DMaxWidth_ for maxTexture1D property
Needed by SWDEV-254068

Change-Id: Ic650dfb6e5b38d7544ba647c53de52deda39b92d
2020-10-14 17:30:43 -04:00
Vladislav Sytchenko 330b674821 [PAL] Allow more heaps for non-SVM suballocations
On ReBar systems the invisible heap is not present, so in theory we should
fail creating the suballocation chunk, however PAL doesn't report any
errors.

To make sure we never fail, allow creating the allocation in the visible
heap and system memory.

Change-Id: Iea9cc68d98b9cb396a2b7a37398b98b66274083b
2020-10-14 16:56:12 -04:00
Vladislav Sytchenko 2ec5a47c88 [PAL] Allow for embedding debug info into IBs
Change-Id: I4473b9c5aa36370d9af37f22a78f4414eaa21e01
2020-10-14 15:54:48 -04:00
Jason Tang b38317cb3c Add COMGR_DYN_DLL to device/rocm/CMakeLists.txt
Now rocm/rocdevice.cpp also includes comgrctx.hpp, and we don't want to statically link against comgr when building shared libs.

Change-Id: Ic330bd860559b3e07b776c951afe6126b0f43f7d
2020-10-14 09:28:37 -04:00
Vladislav Sytchenko 26d1b28b16 [PAL] Allow overriding reported asic revision
This is helpful when debugging issues on low-end ASICs. Navi14 can be emulated as Navi10, and Navi22 can be emulated as Navi21.

Change-Id: I693ffd45a5b03657822afdc872781901bc69b65c
2020-10-13 09:36:15 -04:00
agodavar 92f1ce41dc SWDEV-254185 : hiprtc headers - handle empty headerIncludeNames
Change-Id: Ie06278c18b62cef1bdfbb8ac82728ed5667b2047
2020-10-13 09:23:43 -04:00
German Andryeyev c05de4f2f9 Add ROCr queries for HMM support
Change-Id: I2b5508bf0faf8f48dd7348d6a5202fb28de09876
2020-10-13 09:12:00 -04:00
agodavar ac72e50adc SWDEV-254185 - Added support to pass include headers to hipRTC
Change-Id: Ic7f2957b04e518c57e2fd3fc9d839de07232405e
2020-10-12 03:46:04 -04:00
Vladislav Sytchenko fd09a7a23c [PAL] Skip extra calls to MakeResident
With the PAL_ALWAYS_RESIDENT flag, memory objects are resident at allocation time, so there is no need to make them resident again before submit.

Also, we should never evict anything with this setting, or we'll generate a VM fault.

Change-Id: Ieacc6af88ab4e09c20efd94100e148b2502e1d70
2020-10-09 14:13:32 -04:00
German Andryeyev d9397590de Add option to skip AQL barrier
The change reuses HSA signals for dispatches as a wait signal.
Skipping the barrier requires disabling the L2 cache for sysmem
allocations and extra tracking for HDP access with the large bar.
ROC_BARRIER_SYNC=0 activates the new logic. Barrier sync is
still used by default.
ROC_ACTIVE_WAIT=1 enables unconditional active wait in ROCr.
The change also consolidates ROCr wait logic under a single function.

Change-Id: I6bd1be30aa88258da1b1f9de319ef5a45852afd8
2020-10-06 08:37:12 -04:00
Sarbojit Sarkar 4a025e1a87 [perf]hipMalloc performance optimization
Change-Id: I6e8a918cc1c4cafad197b09e10755cd180e11ead
2020-10-06 03:19:41 -04:00
Sarbojit Sarkar 8ab8fac173 SWDEV-253548 : clean up gfx-arch macros
Change-Id: I8deb2ea44f556260bb78d24f68b04b0c730ed4d8
2020-10-06 03:17:09 -04:00
German Andryeyev df63547906 Add support for CPU device in advise
Change-Id: I7250e0183580c14cd3d6050ef85f9ce26e36f4a8
2020-10-05 12:52:46 -04:00