22 Commits

Author SHA1 Message Date
foreman b750057405 P4 to Git Change 1311385 by gandryey@gera-w8 on 2016/09/06 16:51:05
SWDEV-101448 - [CQE OCL][Brahma][PERF][QR] ~21% perf drop is observed with lulesh-cl subtest of ComputeApps tests : Faulty CL # 1306133
	- Use the logic for transfer size before CL#1306133

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#124 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#10 edit
2016-09-06 17:00:06 -04:00
foreman 57043d662d P4 to Git Change 1309866 by gandryey@gera-w8 on 2016/09/01 13:50:12
SWDEV-79445 - OCL generic changes and code clean-up
	- Improve image fill performance with multiple writes in a single thread. The current split has 3 regions

Affected files ...

... //depot/stg/opencl/drivers/opencl/library/common.hsa/src/blitKernels.cl#4 edit
... //depot/stg/opencl/drivers/opencl/library/common/src/blitKernels.cl#4 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#123 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.hpp#40 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#8 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.hpp#4 edit
2016-09-01 14:01:08 -04:00
foreman cd7727d007 P4 to Git Change 1308294 by gandryey@gera-w8 on 2016/08/29 18:22:03
SWDEV-101206 - [CQE OCL][Perf][G][QR] Up to ~9% Performance drop observed while running Video Composition subtest of Compubench; Faulty CL#1306133
	- Use the original logic without DMA flush. Flush on staging write helps with a blocking op only, but currently VDI doesn't have that information.

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#122 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#7 edit
2016-08-29 18:31:20 -04:00
foreman 862e3a1a79 P4 to Git Change 1306133 by gandryey@gera-w8 on 2016/08/23 14:00:09
SWDEV-79445 - OCL generic changes and code clean-up
	- Update staging copy path with a flush so CPU copy and SDMA transfer could run asynchronously.
	- Tune chunk size for transfers

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#121 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#6 edit
2016-08-23 14:12:24 -04:00
foreman 922e14c46d P4 to Git Change 1201783 by gandryey@gera-w8 on 2015/10/20 18:03:34
SWDEV-79151 - clEnqueueReadImage is slow when using a pinned buffer and a row_pitch != 0
	- Add a check if the provided rowPitch is equal to the actual transfer width. SDMA doesn't support row/slice pitches, so the runtime still has to fall back to compute in other cases

Affected files ...

... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_memobj.cpp#78 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#120 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/memory.cpp#122 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/memory.hpp#92 edit
2015-10-20 18:37:35 -04:00
foreman 4b23814a4d P4 to Git Change 1195141 by gandryey@gera-dev-w7 on 2015/09/28 15:09:34
SWDEV-77522 - Remove direct references to the Resource object
	- In non-VM mode Resource was used as a memory object outside of SW heap

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#119 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.hpp#39 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuconstbuf.cpp#9 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuconstbuf.hpp#6 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#528 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#153 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#298 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#117 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.hpp#49 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.cpp#38 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.hpp#14 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#208 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.hpp#60 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#228 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.hpp#84 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#383 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.hpp#135 edit
2015-09-28 17:41:36 -04:00
foreman bc5a50bf7b P4 to Git Change 1191682 by gandryey@gera-dev-w7 on 2015/09/17 11:14:23
ECR #304775 - Remove EG/NI support
	- Remove the heap emulation (non-vm)

Affected files ...

... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_memobj.cpp#77 edit
... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_svm.cpp#12 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpusettings.cpp#31 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#186 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#253 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#118 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#523 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#148 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuheap.cpp#28 delete
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuheap.hpp#16 delete
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#297 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#116 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.cpp#122 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.hpp#48 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#227 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.hpp#83 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#329 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.hpp#94 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#379 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#143 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.h#57 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsasettings.cpp#38 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsasettings.cpp#9 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#242 edit
2015-09-17 11:24:31 -04:00
foreman 10b19089fe P4 to Git Change 1191418 by gandryey@gera-dev-w7 on 2015/09/16 16:13:13
ECR #304775 - Remove EG/NI specific features

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#185 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#251 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#117 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#522 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#147 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#296 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#115 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#226 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.hpp#82 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuscr800.cpp#11 delete
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuscsi.cpp#34 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#326 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.hpp#93 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#378 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.hpp#134 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.cpp#79 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.h#51 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#142 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.h#56 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/program.cpp#66 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#241 edit
2015-09-16 16:26:46 -04:00
foreman 5632ebd275 P4 to Git Change 1185139 by fdaniil@spb_fdaniil_amd_hsa_brigvar_test on 2015/08/27 08:31:20
ECR #304775 - prepare to build with MSVC 18, part 3:
	changes in runtime/ugl

	testing done: smoke, precheckin
	reviewers: German Andryeyev, Bart Crane

	http://ocltc.amd.com/reviews/r/8338/

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpucommand.cpp#65 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpudevice.cpp#274 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpumapping.cpp#4 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpumapping.hpp#3 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpuvirtual.cpp#25 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#183 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#116 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#521 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#295 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.cpp#37 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#204 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#375 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsadevice.cpp#93 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsakernel.cpp#26 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsadevice.cpp#37 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsakernel.cpp#9 edit
... //depot/stg/opencl/drivers/opencl/runtime/os/os_posix.cpp#40 edit
... //depot/stg/opencl/drivers/opencl/runtime/os/os_win32.cpp#45 edit
2015-08-27 08:40:14 -04:00
foreman 1386191b6c P4 to Git Change 1179663 by gandryey@gera-dev-w7 on 2015/08/12 13:14:46
EPR #419072 - [OpenCL2.0] Enable 16MB large on device queues
	- Enable device queue creation up to 12MB. That should allow running the Intel SDK sample from the EPR, which requires only a 6MB queue.
	- Currently a queue larger than 12.5MB suffers a significant performance degradation, so the current maximum is 12MB. In general it's preferable to use a queue size suited to the task rather than the maximum possible.

Affected files ...

... //depot/stg/opencl/drivers/opencl/library/hsa/hsail/src/devenq/schedule.cl#10 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#115 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.hpp#38 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudefs.hpp#123 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#517 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusched.hpp#17 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#372 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.hpp#131 edit
2015-08-12 13:37:08 -04:00
foreman 637492a7dd P4 to Git Change 1128337 by rili@rili_opencl_stg on 2015/03/06 14:37:45
EPR #415638 - Improve APU performance
	- Force remote allocation of local and persistent memory to Remote from RemoteUSWC
	- Use gpu copy for remote/pinned image/buffer.

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#114 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#211 edit
2015-03-06 15:58:00 -05:00
foreman ae9e6d1a92 P4 to Git Change 1128279 by gandryey@gera-w8 on 2015/03/06 12:37:59
ECR #304775 - Mip levels implementation
	- Initial change. Update the runtime interfaces to allow a mipmap allocation.

Affected files ...

... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_memobj.cpp#74 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#240 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#113 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#499 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#138 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.cpp#119 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.hpp#47 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#210 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.hpp#79 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#305 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#110 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/memory.cpp#118 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/memory.hpp#89 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#225 edit
2015-03-06 13:13:39 -05:00
foreman dc8a3205ce P4 to Git Change 1097200 by gandryey@gera-dev-w7 on 2014/11/14 13:59:46
ECR #304775 - Optimize oclBandwidthTest from nVidia SDK
	- Cache pinned memory, since the benchmark sends the same transfer in a single batch. Thus we could avoid pin/unpin
	- Swap SDMA engine allocation order. Blit manager allocates a queue on device, thus the first app queue was getting the paging second SDMA.

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#112 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.hpp#37 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#339 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.hpp#121 edit
2014-11-14 14:07:55 -05:00
foreman bfc41a18dd P4 to Git Change 1083967 by gandryey@gera-dev-w7 on 2014/10/03 11:20:24
ECR #304775 - Fix for BUG#10330.
	- Add an optimized version for unaligned buffer copy

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/blitcl.cpp#7 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#111 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsablit.cpp#9 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsablit.cpp#5 edit
2014-10-03 12:04:15 -04:00
foreman b672b6c4da P4 to Git Change 1077444 by gandryey@gera-dev-w7 on 2014/09/16 14:31:35
ECR #304775 - Add capability to enable large allocations >4GB
	- Update the blit kernels to consider a buffer size >4GB

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/blitcl.cpp#4 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#110 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#280 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsablit.cpp#8 edit
... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#214 edit
2014-09-16 14:43:17 -04:00
foreman 5efe63df44 P4 to Git Change 1069927 by skudchad@skudchad_test_win_opencl2 on 2014/08/25 14:51:55
ECR #304775 - Optimization for rectangular copies (Part 2). Due to the HW restriction of 14 bits for src and dst pitch, it's advantageous to choose an optimal bpp. The higher the bpp, the larger the byte pitch. This indirectly helps to reduce the number of packets for buffer copy (line by line vs a single sub_win raw packet)

	ReviewBoardURL = http://ocltc.amd.com/reviews/r/5605/diff/

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#109 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#191 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.hpp#76 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.cpp#64 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.h#38 edit
2014-08-25 15:09:01 -04:00
foreman a5e788c9f8 P4 to Git Change 1067573 by skudchad@skudchad_opencl_win_2 on 2014/08/18 16:38:03
ECR #304775 - Refactor code to do line by line copies for read/write Rect. This avoids taking the blit copy path, which may be even slower.

	ReviewBoardURL = http://ocltc.amd.com/reviews/r/5567/

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#108 edit
2014-08-18 16:46:45 -04:00
foreman 1681dd142f P4 to Git Change 1058007 by rili@rili_opencl_stg_01 on 2014/07/22 17:28:41
EPR #399808 - Fixed wrong conversion of sRGBA when using host copy instead of blit kernel transfer

Affected files ...

... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_memobj.cpp#68 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/blit.cpp#3 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/blit.hpp#2 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#107 edit
2014-07-22 17:42:44 -04:00
foreman d2b905f18e P4 to Git Change 1057998 by gandryey@gera-dev-w7 on 2014/07/22 17:15:58
ECR #304775 - Device enqueuing
	- Use atomic fetch for enqueue flags
	- Switch to a multithreaded scheduler
	- Add a workaround for Linux host_multi_queue failures. Linux has only 2 queues, but the test allocates multiple host queues and the same HW ring can be used

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#106 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#449 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#127 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuschedcl.cpp#22 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#325 edit
2014-07-22 17:30:56 -04:00
foreman 1b9e65b27b P4 to Git Change 1057445 by rili@rili_opencl_stg on 2014/07/21 14:11:34
EPR #399808 - Add CL_RGB, CL_UNORM_INT_101010 support

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#105 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudefs.hpp#111 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#186 edit
... //depot/stg/opencl/drivers/opencl/runtime/platform/memory.cpp#106 edit
2014-07-21 14:27:24 -04:00
foreman 6314b334ba P4 to Git Change 1055054 by gandryey@gera-dev-w7 on 2014/07/14 20:18:53
ECR #304775 - Device enqueuing
	- Switch to the single-threaded scheduler for now (the current version isn't friendly for single thread). Hopefully this is a temporary solution until the synchronization issue with the multithreaded scheduler is identified.

Affected files ...

... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#104 edit
... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuschedcl.cpp#20 edit
2014-07-14 20:24:58 -04:00
foreman 3694ab2ce8 initial commit 2014-07-04 16:17:05 -04:00