rocm-systems

Author	SHA1	Message	Date
foreman	b750057405	P4 to Git Change 1311385 by gandryey@gera-w8 on 2016/09/06 16:51:05 SWDEV-101448 - [CQE OCL][Brahma][PERF][QR] ~21% perf drop is observed with lulesh-cl subtest of ComputeApps tests : Faulty CL # 1306133 - Use the logic for transfer size before CL#1306133 Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#124 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#10 edit	2016-09-06 17:00:06 -04:00
foreman	57043d662d	P4 to Git Change 1309866 by gandryey@gera-w8 on 2016/09/01 13:50:12 SWDEV-79445 - OCL generic changes and code clean-up - Improve image fill performance with multiple writes in a single thread. The current split has 3 regions Affected files ... ... //depot/stg/opencl/drivers/opencl/library/common.hsa/src/blitKernels.cl#4 edit ... //depot/stg/opencl/drivers/opencl/library/common/src/blitKernels.cl#4 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#123 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.hpp#40 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#8 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.hpp#4 edit	2016-09-01 14:01:08 -04:00
foreman	cd7727d007	P4 to Git Change 1308294 by gandryey@gera-w8 on 2016/08/29 18:22:03 SWDEV-101206 - [CQE OCL][Perf][G][QR] Upto ~9% Performance drop observed while running Video Composition subtest of Compubench; Faulty CL#1306133 - Use the original logic without DMA flush. Flush on staging write helps with a blocking op only, but currently VDI doesn't have that information. Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#122 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#7 edit	2016-08-29 18:31:20 -04:00
foreman	862e3a1a79	P4 to Git Change 1306133 by gandryey@gera-w8 on 2016/08/23 14:00:09 SWDEV-79445 - OCL generic changes and code clean-up - Update staging copy path with a flush so CPU copy and SDMA transfer could run asynchronously. - Tune chunk size for transfers Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#121 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/pal/palblit.cpp#6 edit	2016-08-23 14:12:24 -04:00
foreman	922e14c46d	P4 to Git Change 1201783 by gandryey@gera-w8 on 2015/10/20 18:03:34 SWDEV-79151 - clenqueuereadImage is slow when using a pinned buffer and a row_picth!0 - Add a check if the provided rowPitch is equal to the actual transfer width. SDMA doesn't support row/slice pitches, thus runtime still has to fall back to compute in other cases Affected files ... ... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_memobj.cpp#78 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#120 edit ... //depot/stg/opencl/drivers/opencl/runtime/platform/memory.cpp#122 edit ... //depot/stg/opencl/drivers/opencl/runtime/platform/memory.hpp#92 edit	2015-10-20 18:37:35 -04:00
foreman	4b23814a4d	P4 to Git Change 1195141 by gandryey@gera-dev-w7 on 2015/09/28 15:09:34 SWDEV-77522 - Remove direct references to the Resource object - In non-VM mode Resource was used as a memory object outside of SW heap Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#119 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.hpp#39 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuconstbuf.cpp#9 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuconstbuf.hpp#6 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#528 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#153 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#298 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#117 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.hpp#49 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.cpp#38 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.hpp#14 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#208 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.hpp#60 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#228 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.hpp#84 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#383 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.hpp#135 edit	2015-09-28 17:41:36 -04:00
foreman	bc5a50bf7b	P4 to Git Change 1191682 by gandryey@gera-dev-w7 on 2015/09/17 11:14:23 ECR #304775 - Remove EG/NI support - Remove the heap emulation (non-vm) Affected files ... ... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_memobj.cpp#77 edit ... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_svm.cpp#12 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpusettings.cpp#31 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#186 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#253 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#118 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#523 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#148 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuheap.cpp#28 delete ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuheap.hpp#16 delete ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#297 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#116 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.cpp#122 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.hpp#48 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#227 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.hpp#83 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#329 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.hpp#94 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#379 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#143 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.h#57 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsasettings.cpp#38 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsasettings.cpp#9 edit ... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#242 edit	2015-09-17 11:24:31 -04:00
foreman	10b19089fe	P4 to Git Change 1191418 by gandryey@gera-dev-w7 on 2015/09/16 16:13:13 ECR #304775 - Remove EG/NI specific features Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#185 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#251 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#117 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#522 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#147 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#296 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.hpp#115 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#226 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.hpp#82 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuscr800.cpp#11 delete ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuscsi.cpp#34 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#326 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.hpp#93 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#378 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.hpp#134 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.cpp#79 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.h#51 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#142 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.h#56 edit ... //depot/stg/opencl/drivers/opencl/runtime/platform/program.cpp#66 edit ... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#241 edit	2015-09-16 16:26:46 -04:00
foreman	5632ebd275	P4 to Git Change 1185139 by fdaniil@spb_fdaniil_amd_hsa_brigvar_test on 2015/08/27 08:31:20 ECR #304775 - prepare to build with MSVC 18, part 3: changes in runtime/ugl testing done: smoke, precheckin reviewers: German Andryeyev, Bart Crane http://ocltc.amd.com/reviews/r/8338/ Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpucommand.cpp#65 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpudevice.cpp#274 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpumapping.cpp#4 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpumapping.hpp#3 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/cpu/cpuvirtual.cpp#25 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/device.cpp#183 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#116 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#521 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpukernel.cpp#295 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprintf.cpp#37 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuprogram.cpp#204 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#375 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsadevice.cpp#93 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsakernel.cpp#26 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsadevice.cpp#37 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsakernel.cpp#9 edit ... //depot/stg/opencl/drivers/opencl/runtime/os/os_posix.cpp#40 edit ... //depot/stg/opencl/drivers/opencl/runtime/os/os_win32.cpp#45 edit	2015-08-27 08:40:14 -04:00
foreman	1386191b6c	P4 to Git Change 1179663 by gandryey@gera-dev-w7 on 2015/08/12 13:14:46 EPR #419072 - [OpenCL2.0] Enable 16MB large on device queues - Enable device queue creation up to 12MB. That should allow to run Intel SDK sample from the EPR that requires 6MB queue only. - Currently a queue with >12.5MB size has a significant performance degradation. Thus the current max possible is 12MB. In general it's preferable to use the queue size more suitable for the task, rather than max possible. Affected files ... ... //depot/stg/opencl/drivers/opencl/library/hsa/hsail/src/devenq/schedule.cl#10 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#115 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.hpp#38 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudefs.hpp#123 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#517 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusched.hpp#17 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#372 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.hpp#131 edit	2015-08-12 13:37:08 -04:00
foreman	637492a7dd	P4 to Git Change 1128337 by rili@rili_opencl_stg on 2015/03/06 14:37:45 EPR #415638 - Improve APU performance - Force remote allocation of local and persistent memory to Remote from RemoteUSWC: - Use gpu copy for remote/pinned image/buffer. Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#114 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#211 edit	2015-03-06 15:58:00 -05:00
foreman	ae9e6d1a92	P4 to Git Change 1128279 by gandryey@gera-w8 on 2015/03/06 12:37:59 ECR #304775 - Mip levels implementation - Initial change. Update the runtime interfaces to allow a mipmap allocation. Affected files ... ... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_memobj.cpp#74 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/device.hpp#240 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#113 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#499 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#138 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.cpp#119 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpumemory.hpp#47 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#210 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.hpp#79 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#305 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLDevice.cpp#110 edit ... //depot/stg/opencl/drivers/opencl/runtime/platform/memory.cpp#118 edit ... //depot/stg/opencl/drivers/opencl/runtime/platform/memory.hpp#89 edit ... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#225 edit	2015-03-06 13:13:39 -05:00
foreman	dc8a3205ce	P4 to Git Change 1097200 by gandryey@gera-dev-w7 on 2014/11/14 13:59:46 ECR #304775 - Optimize oclBandwidthTest from nVidia SDK - Cache pinned memory, since the benchmark sends the same transfer in a single batch. Thus we could avoid pin/unpin - Swap SDMA engine allocation order. Blit manager allocates a queue on device, thus the first app queue was getting the paging second SDMA. Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#112 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.hpp#37 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#339 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.hpp#121 edit	2014-11-14 14:07:55 -05:00
foreman	bfc41a18dd	P4 to Git Change 1083967 by gandryey@gera-dev-w7 on 2014/10/03 11:20:24 ECR #304775 - Fix for BUG#10330. - Add an optimized version for unaligned buffer copy Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/blitcl.cpp#7 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#111 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsablit.cpp#9 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/hsa_foundation/hsablit.cpp#5 edit	2014-10-03 12:04:15 -04:00
foreman	b672b6c4da	P4 to Git Change 1077444 by gandryey@gera-dev-w7 on 2014/09/16 14:31:35 ECR #304775 - Add capability to enable large allocations >4GB - Update the blit kernels to consider a buffer size >4GB Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/blitcl.cpp#4 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#110 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpusettings.cpp#280 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/hsa/hsablit.cpp#8 edit ... //depot/stg/opencl/drivers/opencl/runtime/utils/flags.hpp#214 edit	2014-09-16 14:43:17 -04:00
foreman	5efe63df44	P4 to Git Change 1069927 by skudchad@skudchad_test_win_opencl2 on 2014/08/25 14:51:55 ECR #304775 - Optimization for rectangular copies(Part2). Due to HW restriction of 14bits for src and dst pitch, its advantageous to choose optimal bpp. Higher the bpp the larger the byte pitch. This indirectly helps to reduce the number of packets for buffer copy(line by line vs a single sub_win raw packet) ReviewBoardURL = http://ocltc.amd.com/reviews/r/5605/diff/ Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#109 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#191 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.hpp#76 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.cpp#64 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gslbe/src/rt/GSLContext.h#38 edit	2014-08-25 15:09:01 -04:00
foreman	a5e788c9f8	P4 to Git Change 1067573 by skudchad@skudchad_opencl_win_2 on 2014/08/18 16:38:03 ECR #304775 - Refactor code to do line by line copies for read\write Rect. This avoids taking the blit copy path which may be even slower. ReviewBoardURL = http://ocltc.amd.com/reviews/r/5567/ Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#108 edit	2014-08-18 16:46:45 -04:00
foreman	1681dd142f	P4 to Git Change 1058007 by rili@rili_opencl_stg_01 on 2014/07/22 17:28:41 EPR #399808 - Fixed wrong conversion of sRGBA when using host copy instead of blit kernel transfer Affected files ... ... //depot/stg/opencl/drivers/opencl/api/opencl/amdocl/cl_memobj.cpp#68 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/blit.cpp#3 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/blit.hpp#2 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#107 edit	2014-07-22 17:42:44 -04:00
foreman	d2b905f18e	P4 to Git Change 1057998 by gandryey@gera-dev-w7 on 2014/07/22 17:15:58 ECR #304775 - Device enqueuing - Use atomic fetch for enqueue flags - Switch to a multithreaded scheduler - Add a workaround for Linux host_multi_queue failures. Linux has only 2 queues, but the test allocates multiple host queues and the same HW ring can be used Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#106 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.cpp#449 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudevice.hpp#127 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuschedcl.cpp#22 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuvirtual.cpp#325 edit	2014-07-22 17:30:56 -04:00
foreman	1b9e65b27b	P4 to Git Change 1057445 by rili@rili_opencl_stg on 2014/07/21 14:11:34 EPR #399808 - Add CL_RGB, CL_UNORM_INT_101010 support Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#105 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpudefs.hpp#111 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuresource.cpp#186 edit ... //depot/stg/opencl/drivers/opencl/runtime/platform/memory.cpp#106 edit	2014-07-21 14:27:24 -04:00
foreman	6314b334ba	P4 to Git Change 1055054 by gandryey@gera-dev-w7 on 2014/07/14 20:18:53 ECR #304775 - Device enqueuing - Switch to the single thread scheduler for now(the current version isn't friendly for single thread). Hopefully it's a temporary solution until synchronization issue with multithreaded scheduler will be identified. Affected files ... ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpublit.cpp#104 edit ... //depot/stg/opencl/drivers/opencl/runtime/device/gpu/gpuschedcl.cpp#20 edit	2014-07-14 20:24:58 -04:00
foreman	3694ab2ce8	initial commit	2014-07-04 16:17:05 -04:00

22 commits