rocm-systems

Author	SHA1	Message	Date
Jatin	2d517fdcc6	Adding changes for hipExtLaunchKernel for rocCLR Change-Id: Iba52bc3bde7c37f3fb375a55ba0947e87b3cdc9b	2020-06-02 14:16:41 -04:00
Payam	c5f76c3de3	name change vdi to rocclr Change-Id: I06d198bbb4a499e153b290b73a92afed3553b252	2020-05-06 09:14:30 -04:00
Alex Xie	d890d77da4	SWDEV-221166 - Detect support for large bar access through HIP runtime API Change-Id: Iaa9756c1b5e40c1ab5afb38e44a6699fa5f6c13f	2020-05-01 20:39:52 -04:00
Maneesh Gupta	a0b5dfd625	Merge in the rocclr based hip runtime (#2032 ) * Merge master-next changes in master (include vdi development in master branch)	2020-04-23 09:12:06 -07:00
Jeff Daily	ef596cd088	add IPC event support (#1996 )	2020-04-17 10:31:22 +05:30
Yaxun (Sam) Liu	8d83e95457	Disable device side malloc (#2009 ) * Disable device side malloc Currently device side malloc is not working and takes excessive device memory. Disable it for now until a working malloc is implemented. Change-Id: I1ad908c1c53a83752383b4be96688a848642c699	2020-04-14 16:07:14 +05:30
Aryan Salmanpour	cf8589b8c8	[HIP] add support for NoPreSync/NoPostSync flags for Cooperative MultiDevice launch API (#1990 )	2020-04-13 14:02:52 +05:30
Rahul Garg	ba8a556ea9	Rename hipDrvOccupancy to hipModuleOccupancy and match CUDA syntax (#1943 )	2020-04-07 14:02:52 +05:30
satyanveshd	17862812b4	fix hipIpcOpenMemHandle (#1998 )	2020-04-06 15:39:49 +05:30
Rahul Garg	a12cc8b031	use hsa_executable_get_symbol_by_name in find_kernel_by_name (#1994 )	2020-04-06 15:39:30 +05:30
Rahul Garg	59afcb1091	Bump version to 3.5 (#1993 ) * Switch CI testing from rocm-3.1.x to rocm-3.3.x * Update hcc workweek for cooperative view * bump version to 3.5	2020-04-06 15:39:10 +05:30
Sarbojit2019	b80a2c3966	hipEventElapsedTime should respect device (#1992 ) Fixes SWDEV-228636. Also added a unit test to verify this.	2020-04-06 15:38:25 +05:30
lmoriche	9de5e90ab5	Don't duplicate embedded code objects (#1991 ) If the code object is embedded in an already mapped file, and the lifetime of the mapped file exceeds the lifetime of the executable, we do not need to make a copy of the binary. This allows the ROCR to present the code object URI as file:///path/to/file#offset=X&size=Y.	2020-04-06 15:37:35 +05:30
Jatin Chaudhary	6358e40a76	Removing header size from formula (#1988 ) Fixed a bug in the elf file size computation.	2020-04-06 15:37:07 +05:30
Rahul Garg	6c65fc04d1	Fix 2D and 3D memset (#1987 )	2020-04-06 15:35:59 +05:30
ansurya	50ef250a3b	tex1Dfetch behaviour for different address mode and filter mode (#1772 ) Fixes github issue: #1754 - When ResourceDesc::resType is hipResourceTypeLinear ignore address mode and filter mode. - When textureDesc::normalizedCoords is set to zero, AddressModeWrap and AddressModeMirror won't be supported and will be switched to AddressModeClamp.	2020-04-01 12:10:17 +05:30
Sarbojit2019	eba596c87a	Fix for segfault seen in hipMemcpyDtoD (#1964 ) * Fixes SWDEV-227444.	2020-03-28 17:29:49 +05:30
satyanveshd	351d39e6aa	[dtests] Added few Negative tests (#1735 )	2020-03-27 14:10:12 +05:30
Siu Chi Chan	43abf84f54	don't expose symbols from code_object_bundle (#1971 ) Change-Id: I56479485aad42c3d517fe6d9055be1cd846eeb00	2020-03-27 14:09:07 +05:30
Sarbojit2019	f1b028b93e	Fix few memory leaks in HIP (#1969 )	2020-03-27 14:08:30 +05:30
Aryan Salmanpour	c8ca2355ae	[hip] fix a build error when building hip with latest hcc (#1977 ) There is a build error when building HIP with the latest HCC from GitHub after PR #1935 was merged into the HIP master branch. That PR changed blockDimX to blockDim, and the two lines that missed this change are updated in the current PR.	2020-03-26 17:10:42 +05:30
Siu Chi Chan	8fefda2bb9	Initialize all undef symbols with a magic poison (#1962 )	2020-03-26 17:06:09 +05:30
Sarbojit2019	3e363047d5	Fix for segfault seen if invalid kind is passed to hipMemcpy (#1937 ) Fixes SWDEV-224941	2020-03-26 17:04:43 +05:30
Joseph Greathouse	f61b79d9a3	Fix cooperative launch APIs to set hipGetLastError (#1935 ) * Fix cooperative launch APIs to set hipGetLastError Previously, the cooperative launch APIs did not properly log their errors in the global hipGetLastError variable before returning back to the user. As such, the APIs would leave hipSuccess in the last error, which would break some use cases. This fixes that problem by making a trampoline function that does the HIP_INIT_API and ihipLogStatus. * Add missing flag to the log of multi-GPU launch	2020-03-25 14:39:24 -07:00
Jeff Daily	01d661b159	fix hipStreamAddCallback, block future work on stream (#1934 )	2020-03-19 16:16:04 +05:30
Aryan Salmanpour	4acb0ea038	[HIP] use markers to sync cooperative and normal queues (#1948 )	2020-03-18 11:20:43 +05:30
jglaser	b5e683a35d	Implement accurate max block size in hipFuncGetAttributes() (#1676 ) This PR ensures that the maxThreadsPerBlock returned by hipFuncGetAttributes is a multiple of the warp size and that the register usage of the maximum block does not exceed the number of available registers. Fixes #1662	2020-03-18 11:20:06 +05:30
zhaozhangjian	7c8b8d24ef	fix a bug when initializing a vector of hipFunction_t (#1949 )	2020-03-17 14:05:07 +05:30
Joseph Greathouse	18e6c529bc	Fix detection of support for cooperative groups (#1932 ) Query ROCr to see if we have the proper lower-level support for cooperative groups -- GWS support through the firmware, driver, thunk, and ROCr. ROCr does these checks for us, and presents a query that allows us to see if GWS entries are available for use. If so, then we have all the lower-level technologies needed, and we should enable cooperative groups support for HIP.	2020-03-17 14:01:44 +05:30
Joseph Greathouse	55e55e78bb	Fix maxSharedMemoryPerMultiProcessor attribute (#1927 ) The maxSharedMemoryPerMultiProcessor attribute is meant to describe the number of bytes of shared memory (LDS space in AMD terminology) in each SM (CU in AMD terminology). For instance, on AMD GPUs this is often 64KB per CU, and on some Nvidia GPUs it's 96KB per SM. This shared memory is a different address space from the normal global memory. However, the current HIP-HCC properties fill this in with a size that matches the totalGlobalMem property. This gives a drastically too-high calculation for the amount of LDS space that each CU has -- tens of GBs vs. tens of KBs. This patch fixes this by pulling the maxSharedMemoryPerMultiProcessor property from the HSA pool that describes how much workgroup-local space is available on each CU. The HSA runtime eventually pulls this from the topology information about LDSSizeInKB, defined as "Size of Local Data Store in Kilobytes per SIMD". Previously, this HSA query was used to fill in the value of the sharedMemPerBlock property. On today's AMD GPUs, we know that the amount of LDS available to the workgroup is identical to the amount of LDS space in the CU. However, in the future this may differ. As such, this patch changes around the order and fills in the "PerMultiProcessor" property from the HSA query (since that's what the query is defined to return), and then separately fills in the "PerBlock" property as we know it.	2020-03-17 14:00:51 +05:30
Joseph Greathouse	bf04d7380a	Fix errors in occupancy calculation function (#1926 ) Fix two errors in hipOccupancyMaxActiveBlocksPerMultiprocessor. 1) Fix a possible segfault if the user passed in a null pointer for the numBlocks value. 2) Handle the situation when the user is asking for a block size that is larger than what the target device can hold within a single block.	2020-03-17 14:00:38 +05:30
Evgeny Mankov	821c60a3d9	Merge pull request #1916 from asalmanp/refactor_cooperative_APIs [HIP] Refactor cooperative APIs	2020-03-12 19:12:50 +03:00
Evgeny Mankov	70f5646f8a	Merge pull request #1908 from asalmanp/prop_mulit_coop [HIP] add hip specific properties for cooperative kernel multi device	2020-03-12 19:12:11 +03:00
srinivamd	65a790bc08	return hipSuccess when count is zero (#1900 )	2020-03-11 14:32:54 +05:30
Aryan Salmanpour	b663fccf0b	[HIP] return an error if blockDim exceeds maxThreadsPerBlock	2020-03-10 15:26:53 -04:00
Aryan Salmanpour	5494f5b247	[HIP] fix formatting/code clean up and fix a bug	2020-03-09 16:03:59 -04:00
Aryan Salmanpour	4844fbdf0a	[HIP] Refactor cooperative APIs	2020-03-06 18:30:12 -05:00
Aryan Salmanpour	03797ae986	[HIP] add hip specific properties for cooperative kernel multi device	2020-03-03 13:25:36 -05:00
Siu Chi Chan	57edf48191	improve code object loading error message (#1889 )	2020-02-28 16:47:40 +05:30
saleelk	3e1f41c165	Fix HIPRTC headers to export C style symbols (#1879 )	2020-02-28 16:47:29 +05:30
Rahul Garg	6c5fa32815	Remove deprecated HIP markers (#1876 )	2020-02-28 16:47:15 +05:30
Rahul Garg	edc97f3073	Add hipDrvOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags] (#1854 ) Equivalent to cuOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags].	2020-02-28 16:46:55 +05:30
Alex Voicu	d830dad3be	Address post-staging issues in #1809 (#1894 ) Fixes SWDEV-223910 and SWDEV-223663	2020-02-27 16:21:12 +05:30
Alex Voicu	9b4f39e1d8	Tweak synchronous memcpy implementation (#1809 ) The existing one can have issues on certain systems, therefore this limits use of direct memcpy via largeBAR to sizes where it is unequivocally better. Also addresses SWDEV-220030 and SWDEV-222237.	2020-02-18 20:50:27 +05:30
Rahul Garg	8c5e5e435b	Fix hipMemcpy3D (#1798 ) Fixes #1790 and #1791. hipMemcpy3D still requires further refactoring for different input and output combinations.	2020-02-17 19:35:35 +05:30
Maneesh Gupta	e7120dd876	Use deque instead of vector for code readers so that the iterators and references will be stable (#1851 ) * Use deque instead of vector for code readers so that the iterators and references will be stable * Fix compile error * Assign the iterator * Add multithreaded test * Make threads a multiple of hardware concurrency * Output on failure * Add setDevice to try and initialize the context on cuda * Create context for cuda * Set context on each thread * Reduce threads on cuda * Skip test on cuda * Try to initialize the primary context on cuda * Push ctx to the stack as current * Revert "Push ctx to the stack as current" This reverts commit `bff8cbe950`. * Revert "Try to initialize the primary context on cuda" This reverts commit `fd98514113`. * updated test for nvidia path * Add c++11 option for nvcc Co-authored-by: satyanveshd <53337087+satyanveshd@users.noreply.github.com>	2020-02-15 09:51:24 +05:30
Jeff Daily	03bb658721	missing break statement in hipDeviceGetAttribute (#1865 ) The break is missing for hipDeviceAttributeMaxTexture3DDepth.	2020-02-13 14:22:56 +05:30
Sarbojit2019	1109cbff83	[hip] Fix for bug introduced in #1770 when blockSize is non-power of 2 (#1864 ) Fixes SWDEV-222161	2020-02-13 14:22:46 +05:30
Sarbojit2019	fc5256fd28	ihipEnablePeerAccess return error if peer is not accessible (#1858 ) hipDeviceEnablePeerAccess returns success and adds the peer to the list even if it is not accessible, which creates a problem in hipMalloc when it tries to share the pointer with the peer device. The proposed change is to check the access status before updating the peer list and to update it only when the peer is accessible.	2020-02-13 14:22:11 +05:30
ansurya	8c6934223b	Reduce GPU copying based on arch it runs on (#1751 ) Implements SWDEV-213230.	2020-02-13 14:21:51 +05:30

1113 commits