Commit Graph

1113 Commits

Author SHA1 Message Date
Jatin 2d517fdcc6 Adding changes for hipExtLaunchKernel for rocCLR
Change-Id: Iba52bc3bde7c37f3fb375a55ba0947e87b3cdc9b
2020-06-02 14:16:41 -04:00
Payam c5f76c3de3 name change vdi to rocclr
Change-Id: I06d198bbb4a499e153b290b73a92afed3553b252
2020-05-06 09:14:30 -04:00
Alex Xie d890d77da4 SWDEV-221166 - Detect support for large bar access through HIP runtime API
Change-Id: Iaa9756c1b5e40c1ab5afb38e44a6699fa5f6c13f
2020-05-01 20:39:52 -04:00
Maneesh Gupta a0b5dfd625 Merge in the rocclr based hip runtime (#2032)
* Merge master-next changes in master (include vdi development in master branch)
2020-04-23 09:12:06 -07:00
Jeff Daily ef596cd088 add IPC event support (#1996) 2020-04-17 10:31:22 +05:30
Yaxun (Sam) Liu 8d83e95457 Disable device side malloc (#2009)
* Disable device side malloc

Currently device side malloc is not working and takes excessive
device memory.

Disable it for now until a working malloc is implemented.

Change-Id: I1ad908c1c53a83752383b4be96688a848642c699
2020-04-14 16:07:14 +05:30
Aryan Salmanpour cf8589b8c8 [HIP] add support for NoPreSync/NoPostSync flags for Cooperative MultiDevice launch API (#1990) 2020-04-13 14:02:52 +05:30
Rahul Garg ba8a556ea9 Rename hipDrvOccupancy to hipModuleOccupancy and match CUDA syntax (#1943) 2020-04-07 14:02:52 +05:30
satyanveshd 17862812b4 fix hipIpcOpenMemHandle (#1998) 2020-04-06 15:39:49 +05:30
Rahul Garg a12cc8b031 use hsa_executable_get_symbol_by_name in find_kernel_by_name (#1994) 2020-04-06 15:39:30 +05:30
Rahul Garg 59afcb1091 Bump version to 3.5 (#1993)
* Switch CI testing from rocm-3.1.x to rocm-3.3.x
* Update hcc workweek for cooperative view
* bump version to 3.5
2020-04-06 15:39:10 +05:30
Sarbojit2019 b80a2c3966 hipEventElapsedTime should respect device (#1992)
Fixes SWDEV-228636.
Also added a unit test to verify this.
2020-04-06 15:38:25 +05:30
lmoriche 9de5e90ab5 Don't duplicate embedded code objects (#1991)
If the code object is embedded in an already mapped file, and the
lifetime of the mapped file exceeds the lifetime of the executable,
we do not need to make a copy of the binary.

This allows the ROCR to present the code object URI as
file:///path/to/file#offset=X&size=Y.
2020-04-06 15:37:35 +05:30
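The URI shape quoted in commit 9de5e90ab5 above can be illustrated with a small sketch. The helper name `codeObjectUri` and its parameters are hypothetical and are not taken from ROCR or HIP; only the `file:///path/to/file#offset=X&size=Y` layout comes from the commit message.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a ROCR/HIP API): formats a code object URI in the
// layout quoted by the commit message, file:///path/to/file#offset=X&size=Y.
// `absolutePath` is assumed to begin with '/'.
std::string codeObjectUri(const std::string& absolutePath,
                          std::size_t offset, std::size_t size) {
    return "file://" + absolutePath +
           "#offset=" + std::to_string(offset) +
           "&size=" + std::to_string(size);
}
```

Because the URI points into the already mapped file, a consumer can locate the embedded code object without the runtime holding a separate copy of the binary.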
Jatin Chaudhary 6358e40a76 Removing header size from formula (#1988)
Fixed a bug in the elf file size computation.
2020-04-06 15:37:07 +05:30
Rahul Garg 6c65fc04d1 Fix 2D and 3D memset (#1987) 2020-04-06 15:35:59 +05:30
ansurya 50ef250a3b tex1Dfetch behaviour for different address mode and filter mode (#1772)
Fixes github issue: #1754

- When ResourceDesc::resType is hipResourceTypeLinear ignore address mode and filter mode.
- When textureDesc::normalizedCoords is set to zero, AddressModeWrap and AddressModeMirror won't be supported and will be switched to AddressModeClamp.
2020-04-01 12:10:17 +05:30
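The two rules listed in commit 50ef250a3b above can be sketched as a small decision function. The enums and `resolveSampling` are hypothetical stand-ins for the HIP texture types, written only to show the rules, not the actual implementation.

```cpp
// Hypothetical stand-ins for the HIP resource type and address mode enums.
enum class ResourceType { Array, Linear };
enum class AddressMode { Wrap, Clamp, Mirror, Border };

struct SamplingState {
    bool usesAddressAndFilterMode;  // false: both modes are ignored
    AddressMode addressMode;        // effective mode when they are used
};

// Applies the two rules from the commit message.
SamplingState resolveSampling(ResourceType resType, bool normalizedCoords,
                              AddressMode requested) {
    // Rule 1: linear resources ignore address mode and filter mode.
    if (resType == ResourceType::Linear) {
        return {false, requested};
    }
    // Rule 2: without normalized coordinates, Wrap and Mirror are not
    // supported and are switched to Clamp.
    if (!normalizedCoords && (requested == AddressMode::Wrap ||
                              requested == AddressMode::Mirror)) {
        return {true, AddressMode::Clamp};
    }
    return {true, requested};
}
```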
Sarbojit2019 eba596c87a Fix for segfault seen in hipMemcpyDtoD (#1964)
* Fixes SWDEV-227444.
2020-03-28 17:29:49 +05:30
satyanveshd 351d39e6aa [dtests] Added few Negative tests (#1735) 2020-03-27 14:10:12 +05:30
Siu Chi Chan 43abf84f54 don't expose symbols from code_object_bundle (#1971)
Change-Id: I56479485aad42c3d517fe6d9055be1cd846eeb00
2020-03-27 14:09:07 +05:30
Sarbojit2019 f1b028b93e Fix few memory leaks in HIP (#1969) 2020-03-27 14:08:30 +05:30
Aryan Salmanpour c8ca2355ae [hip] fix a build error when building hip with latest hcc (#1977)
There is a build error when building HIP with the latest HCC from GitHub after PR #1935 was merged into the HIP master branch. That PR changed blockDimX to blockDim, but two lines missed the change; the current PR updates them.
2020-03-26 17:10:42 +05:30
Siu Chi Chan 8fefda2bb9 Initialize all undef symbols with a magic poison (#1962) 2020-03-26 17:06:09 +05:30
Sarbojit2019 3e363047d5 Fix for segfault seen if invalid kind is passed to hipMemcpy (#1937)
Fixes SWDEV-224941
2020-03-26 17:04:43 +05:30
Joseph Greathouse f61b79d9a3 Fix cooperative launch APIs to set hipGetLastError (#1935)
* Fix cooperative launch APIs to set hipGetLastError

Previously, the cooperative launch APIs did not properly log their
errors in the global hipGetLastError variable before returning back
to the user. As such, the APIs would leave hipSuccess in the
last error, which would break some use cases.

This fixes that problem by making a trampoline function that does
the HIP_INIT_API and ihipLogStatus.

* Add missing flag to the log of multi-GPU launch
2020-03-25 14:39:24 -07:00
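The trampoline idea described in commit f61b79d9a3 above can be sketched as follows. This is a minimal illustration, not the HIP source: `LaunchStatus`, `g_lastError`, `getLastError`, `withLoggedStatus` and `launchImpl` are all hypothetical stand-ins for hipError_t, the last-error variable, hipGetLastError, the HIP_INIT_API / ihipLogStatus pair and the internal launch routine. The sketch assumes that every returned status is recorded and that reading the last error resets it.

```cpp
#include <utility>

// Hypothetical stand-in for hipError_t.
enum class LaunchStatus { Success, ErrorInvalidValue };

// Hypothetical stand-in for the per-thread last-error variable.
thread_local LaunchStatus g_lastError = LaunchStatus::Success;

// Stand-in for hipGetLastError: returns the recorded status and resets it.
LaunchStatus getLastError() {
    LaunchStatus e = g_lastError;
    g_lastError = LaunchStatus::Success;
    return e;
}

// Internal implementation that only returns a status and records nothing.
LaunchStatus launchImpl(int numDevices) {
    return numDevices > 0 ? LaunchStatus::Success
                          : LaunchStatus::ErrorInvalidValue;
}

// Trampoline: calls the implementation and records its status before
// handing the status back to the caller.
template <typename Fn, typename... Args>
LaunchStatus withLoggedStatus(Fn&& impl, Args&&... args) {
    LaunchStatus s = std::forward<Fn>(impl)(std::forward<Args>(args)...);
    g_lastError = s;
    return s;
}

// Public entry point routed through the trampoline.
LaunchStatus launchCooperative(int numDevices) {
    return withLoggedStatus(launchImpl, numDevices);
}
```

Routing every public entry point through one wrapper keeps the bookkeeping in a single place, so a new API cannot forget to record its error.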
Jeff Daily 01d661b159 fix hipStreamAddCallback, block future work on stream (#1934) 2020-03-19 16:16:04 +05:30
Aryan Salmanpour 4acb0ea038 [HIP] use markers to sync cooperative and normal queues (#1948) 2020-03-18 11:20:43 +05:30
jglaser b5e683a35d Implement accurate max block size in hipFuncGetAttributes() (#1676)
This PR ensures that the maxThreadsPerBlock returned by hipFuncGetAttributes is a multiple of the warp size and that the register usage of the maximum block does not exceed the number of available registers.

Fixes #1662
2020-03-18 11:20:06 +05:30
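The two constraints named in commit b5e683a35d above can be sketched with a simplified model. The function and parameter names are hypothetical, and the model assumes registers are allocated per thread with no allocation granularity, which real hardware may not match.

```cpp
#include <algorithm>

// Hypothetical, simplified model of the constraint in the commit message:
// the result is a multiple of the warp size, and regsPerThread * result
// never exceeds regsAvailable.
unsigned maxThreadsPerBlock(unsigned deviceMaxThreads, unsigned warpSize,
                            unsigned regsPerThread, unsigned regsAvailable) {
    if (warpSize == 0) {
        return 0;  // no meaningful answer without a warp size
    }
    unsigned limit = deviceMaxThreads;
    if (regsPerThread > 0) {
        // Largest thread count whose total register use still fits.
        limit = std::min(limit, regsAvailable / regsPerThread);
    }
    // Round down to a whole number of warps.
    return (limit / warpSize) * warpSize;
}
```

For example, with a device limit of 1024 threads, a warp size of 64, 100 registers per thread and 65536 registers available, the register budget allows 655 threads, which rounds down to 640.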
zhaozhangjian 7c8b8d24ef fix a bug when initializing a vector of hipFunction_t (#1949) 2020-03-17 14:05:07 +05:30
Joseph Greathouse 18e6c529bc Fix detection of support for cooperative groups (#1932)
Query ROCr to see if we have the proper lower-level support for
cooperative groups -- GWS support through the firmware, driver,
thunk, and ROCr. ROCr does these checks for us, and presents a
query that allows us to see if GWS entries are available for use.
If so, then we have all the lower-level technologies needed, and
we should enable cooperative groups support for HIP.
2020-03-17 14:01:44 +05:30
Joseph Greathouse 55e55e78bb Fix maxSharedMemoryPerMultiProcessor attribute (#1927)
The maxSharedMemoryPerMultiProcessor attribute is meant to describe
the number of bytes of shared memory (LDS space in AMD terminology)
in each SM (CU in AMD terminology). For instance, on AMD GPUs this
is often 64KB per CU, and on some Nvidia GPUs it's 96KB per SM.

This shared memory is a different address space from the normal
global memory. However, the current HIP-HCC properties fill this
in with a size that matches the totalGlobalMem property. This gives
a drastically too-high calculation for the amount of LDS space that
each CU has -- tens of GBs vs. 10s of KBs.

This patch fixes this by pulling the maxSharedMemoryPerMultiProcessor
property from the HSA pool that describes how much workgroup-local
space is available on each CU. The HSA runtime eventually pulls
this from the topology information about LDSSizeInKB, defined as
"Size of Local Data Store in Kilobytes per SIMD".

Previously, this HSA query was used to fill in the value of the
sharedMemPerBlock property. On today's AMD GPUs, we know that
the amount of LDS available to the workgroup is identical to the
amount of LDS space in the CU. However, in the future this may
differ. As such, this patch changes around the order and fills
in the "PerMultiProcessor" property from the HSA query (since
that's what the query is defined to return), and then separately
fills in the "PerBlock" property as we know it.
2020-03-17 14:00:51 +05:30
Joseph Greathouse bf04d7380a Fix errors in occupancy calculation function (#1926)
Fix two errors in hipOccupancyMaxActiveBlocksPerMultiprocessor.
1) Fix a possible segfault if the user passed in a null pointer for
   the numBlocks value.
2) Handle the situation when the user is asking for a block size
   that is larger than what the target device can hold within a
   single block.
2020-03-17 14:00:38 +05:30
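The two guards described in commit bf04d7380a above can be sketched as below. Everything here is a hypothetical stand-in: `OccStatus` replaces hipError_t, the occupancy formula is reduced to a single division, and reporting zero blocks for an oversized block is an assumption about the intended behaviour, not something the commit message states.

```cpp
// Hypothetical stand-in for hipError_t.
enum class OccStatus { Success, ErrorInvalidValue };

// Simplified occupancy query showing the two guards from the commit message.
OccStatus occupancyMaxActiveBlocks(int* numBlocks, int blockSize,
                                   int maxThreadsPerBlock,
                                   int maxThreadsPerCU) {
    // Guard 1: a null output pointer is rejected instead of dereferenced.
    if (numBlocks == nullptr || blockSize <= 0) {
        return OccStatus::ErrorInvalidValue;
    }
    // Guard 2: a block larger than the device can hold never becomes
    // resident (assumed here to be reported as zero active blocks).
    if (blockSize > maxThreadsPerBlock) {
        *numBlocks = 0;
        return OccStatus::Success;
    }
    *numBlocks = maxThreadsPerCU / blockSize;
    return OccStatus::Success;
}
```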
Evgeny Mankov 821c60a3d9 Merge pull request #1916 from asalmanp/refactor_cooperative_APIs
[HIP] Refactor cooperative APIs
2020-03-12 19:12:50 +03:00
Evgeny Mankov 70f5646f8a Merge pull request #1908 from asalmanp/prop_mulit_coop
[HIP] add hip specific properties for cooperative kernel multi device
2020-03-12 19:12:11 +03:00
srinivamd 65a790bc08 return hipSuccess when count is zero (#1900) 2020-03-11 14:32:54 +05:30
Aryan Salmanpour b663fccf0b [HIP] return an error if blockDim exceeds maxThreadsPerBlock 2020-03-10 15:26:53 -04:00
Aryan Salmanpour 5494f5b247 [HIP] fix formatting/code clean up and fix a bug 2020-03-09 16:03:59 -04:00
Aryan Salmanpour 4844fbdf0a [HIP] Refactor cooperative APIs 2020-03-06 18:30:12 -05:00
Aryan Salmanpour 03797ae986 [HIP] add hip specific properties for cooperative kernel multi device 2020-03-03 13:25:36 -05:00
Siu Chi Chan 57edf48191 improve code object loading error message (#1889) 2020-02-28 16:47:40 +05:30
saleelk 3e1f41c165 Fix HIPRTC headers to export C style symbols (#1879) 2020-02-28 16:47:29 +05:30
Rahul Garg 6c5fa32815 Remove deprecated HIP markers (#1876) 2020-02-28 16:47:15 +05:30
Rahul Garg edc97f3073 Add hipDrvOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags] (#1854)
Equivalent to cuOccupancyMaxActiveBlocksPerMultiprocessor[WithFlags].
2020-02-28 16:46:55 +05:30
Alex Voicu d830dad3be Address post-staging issues in #1809 (#1894)
Fixes SWDEV-223910 and SWDEV-223663
2020-02-27 16:21:12 +05:30
Alex Voicu 9b4f39e1d8 Tweak synchronous memcpy implementation (#1809)
The existing one can have issues on certain systems, therefore this limits use of direct memcpy via largeBAR to sizes where it is unequivocally better.

Also addresses SWDEV-220030 and SWDEV-222237.
2020-02-18 20:50:27 +05:30
Rahul Garg 8c5e5e435b Fix hipMemcpy3D (#1798)
Fixes #1790 and #1791. hipMemcpy3D still requires further refactoring for different input and output combinations.
2020-02-17 19:35:35 +05:30
Maneesh Gupta e7120dd876 Use deque instead of vector for code readers so that the iterators and references will be stable (#1851)
* Use deque instead of vector for code readers so that the iterators and references will be stable

* Fix compile error

* Assign the iterator

* Add multithreaded test

* Make threads a multiple of hardware concurrency

* Output on failure

* Add setDevice to try and initialize the context on cuda

* Create context for cuda

* Set context on each thread

* Reduce threads on cuda

* Skip test on cuda

* Try to initialize the primary context on cuda

* Push ctx to the stack as current

* Revert "Push ctx to the stack as current"

This reverts commit bff8cbe950.

* Revert "Try to initialize the primary context on cuda"

This reverts commit fd98514113.

* updated test for nvidia path

* Add c++11 option for nvcc

Co-authored-by: satyanveshd <53337087+satyanveshd@users.noreply.github.com>
2020-02-15 09:51:24 +05:30
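The container property that commit e7120dd876 above relies on can be checked with a small standalone sketch. It is not the HIP code; the element type and names are placeholders. For std::deque the C++ standard guarantees that inserting at either end leaves references and pointers to existing elements valid (iterators are still invalidated), whereas std::vector may relocate every element when it grows.

```cpp
#include <deque>
#include <string>

// Takes the address of the first element, grows the deque at the back many
// times, and checks that the address and the value are unchanged.
bool dequeReferenceStaysValid() {
    std::deque<std::string> readers;
    readers.push_back("reader-0");
    const std::string* first = &readers.front();

    for (int i = 1; i < 1000; ++i) {
        readers.push_back("reader-" + std::to_string(i));
    }
    // push_back on a deque never moves existing elements.
    return first == &readers.front() && *first == "reader-0";
}
```

The same pattern with std::vector would leave `first` dangling after the first reallocation, which is the kind of failure a multithreaded test is likely to expose.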
Jeff Daily 03bb658721 missing break statement in hipDeviceGetAttribute (#1865)
The break is missing for hipDeviceAttributeMaxTexture3DDepth.
2020-02-13 14:22:56 +05:30
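The class of bug fixed in commit 03bb658721 above can be shown with a reduced sketch. The enum, the struct and the neighbouring case are hypothetical; only the missing `break` for the 3D depth attribute comes from the commit message.

```cpp
// Hypothetical, reduced attribute set and property struct.
enum class Attr { MaxTexture3DHeight, MaxTexture3DDepth, PciBusId };

struct Props {
    int maxTexture3D[3];  // width, height, depth
    int pciBusId;
};

int getAttribute(const Props& p, Attr attr) {
    int value = -1;
    switch (attr) {
        case Attr::MaxTexture3DHeight:
            value = p.maxTexture3D[1];
            break;
        case Attr::MaxTexture3DDepth:
            value = p.maxTexture3D[2];
            break;  // without this break, control falls through and the
                    // next case overwrites the value
        case Attr::PciBusId:
            value = p.pciBusId;
            break;
    }
    return value;
}
```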
Sarbojit2019 1109cbff83 [hip] Fix for bug introduced in #1770 when blockSize is non-power of 2 (#1864)
Fixes SWDEV-222161
2020-02-13 14:22:46 +05:30
Sarbojit2019 fc5256fd28 ihipEnablePeerAccess return error if peer is not accessible (#1858)
hipDeviceEnablePeerAccess returns success and adds the peer to the list even if the peer is not accessible, which causes a problem in hipMalloc when it tries to share the pointer with the peer device.
The proposed change is to check the access status before updating the peer list and to update the list only when the peer is accessible.
2020-02-13 14:22:11 +05:30
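The check-before-update order proposed in commit fc5256fd28 above can be sketched as follows. `Device`, `PeerStatus` and `enablePeerAccess` are hypothetical stand-ins, and the error value returned for an unreachable peer is an assumption.

```cpp
#include <set>

// Hypothetical stand-in for hipError_t.
enum class PeerStatus { Success, ErrorInvalidDevice };

struct Device {
    int id;
    std::set<int> reachable;     // peers this device can actually access
    std::set<int> enabledPeers;  // peers that allocations are shared with
};

// Checks accessibility first and updates the peer list only on success.
PeerStatus enablePeerAccess(Device& self, int peerId) {
    if (self.reachable.count(peerId) == 0) {
        return PeerStatus::ErrorInvalidDevice;  // peer list left untouched
    }
    self.enabledPeers.insert(peerId);
    return PeerStatus::Success;
}
```

With this order, a later allocation that walks `enabledPeers` only ever sees peers that can really be reached.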
ansurya 8c6934223b Reduce GPU copying based on arch it runs on (#1751)
Implements SWDEV-213230.
2020-02-13 14:21:51 +05:30