- Header files in rocclr/utils, when included from hipamd or opencl, should be included as #include "rocclr/utils/xxx.h" instead of #include "utils/xxx.h"
Change-Id: Ic0760c33b9d091f5620dec67e5482c9698d22093
hipDeviceSynchronize, called from __hipUnregisterFatBinary,
accesses static maps and monitors. This change ensures these objects
are not destroyed before __hipUnregisterFatBinary is called.
Additionally, it disables the teardown process for static builds.
Change-Id: I46b58641d60efcf6637a8e99cdd786ffe9e2c77d
hipGraphExecKernelNodeSetParams can also update the kernel function.
Hence the size required for kernel parameters can differ from what was allocated during graph instantiation.
So, create a new 128KB kernel pool and allocate kernel args from the pool.
If the pool is full, create another 128KB pool. Release the kernel pools when the graph exec object is destroyed.
Change-Id: I9567946d63400c79cbfd4c5439c654c92557ceae
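The pool scheme in the commit above can be sketched roughly as follows. This is a minimal illustration, not the runtime's code: the class name KernArgPoolSet, its members, and the 16-byte default alignment are all assumptions, and plain host memory stands in for whatever memory the runtime actually uses for kernel args.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical sketch of "allocate kernel args from fixed-size pools".
// All names here are illustrative and do not come from the HIP sources.
class KernArgPoolSet {
 public:
  static constexpr std::size_t kPoolSize = 128 * 1024;  // 128KB per pool

  // Returns `size` bytes at an offset aligned to `align` (a power of two).
  // Opens a fresh 128KB pool when the current one cannot fit the request.
  void* Allocate(std::size_t size, std::size_t align = 16) {
    assert(size <= kPoolSize);
    assert(align != 0 && (align & (align - 1)) == 0);
    std::size_t offset = AlignUp(used_, align);
    if (pools_.empty() || offset + size > kPoolSize) {
      pools_.push_back(std::make_unique<unsigned char[]>(kPoolSize));
      offset = 0;
    }
    used_ = offset + size;
    return pools_.back().get() + offset;
  }

  std::size_t PoolCount() const { return pools_.size(); }

  // No explicit destructor: every pool is released when the owner
  // (the graph exec object, in the commit's terms) is destroyed.

 private:
  static std::size_t AlignUp(std::size_t value, std::size_t align) {
    return (value + align - 1) & ~(align - 1);
  }

  std::vector<std::unique_ptr<unsigned char[]>> pools_;
  std::size_t used_ = 0;  // bytes consumed in the newest pool
};
```

Individual allocations are never freed in this sketch; memory is only reclaimed in bulk when the owning object goes away, which matches the "release kernel pools when graph exec object is destroyed" behaviour described above.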
Set the flags in hipCtxCreate so that getting the flags works.
Validate hipHostGetDevicePointer for flags != 0.
Validate the memcpy kind and accommodate the new type hipMemcpyDeviceToDeviceNoCU.
Match the error code for hipGetChannelDesc.
Change-Id: If09a635ac01bc53f1fe2b7df3f3f9c1b0d69a0ab
- Aggregate all TLS (Thread Local Storage) variables into a single class
- This is to improve cache accesses per thread
Change-Id: Ic8361eaeae290fff00254684e309471958365eb9
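As a rough illustration of the aggregation described in the commit above, the sketch below keeps per-thread state in one object so that related fields sit next to each other in memory. The struct name, its fields, and the accessor are hypothetical and are not taken from the HIP sources.

```cpp
// Hypothetical aggregate; the field names are illustrative only.
struct ThreadState {
  int last_error = 0;                   // most recent error code on this thread
  int device_id = 0;                    // device selected by this thread
  const void* current_stream = nullptr; // placeholder for a per-thread handle
};

// One thread_local object instead of several separate thread_local variables.
// Each thread gets its own instance, created on first use.
inline ThreadState& CurrentThreadState() {
  thread_local ThreadState state;
  return state;
}

// Example accessors that all go through the single aggregate.
inline void SetThreadDevice(int device) { CurrentThreadState().device_id = device; }
inline int GetThreadDevice() { return CurrentThreadState().device_id; }
```

With separate thread_local variables each access may resolve its own TLS slot; with one aggregate, a single lookup reaches all fields.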
HIP_MEM_POOL_SUPPORT controls memory pool support in runtime.
Currently it's disabled by default. The initial change doesn't
include: IPC, MGPU, virtual memory alloc, suballoc, defragmentation,
internal dependencies.
Change-Id: Ibed8528ebec698b045ebb247e49c0ecd6e587ed7
Set affinity to the NUMA node nearest to the default GPU at init. After that,
set it to the NUMA node that is nearest to whichever GPU is set with
hipSetDevice.
Change-Id: I85749258ea7c25385096ffe4089a70c948f332c7
Change-Id: I99a92c922655e22955bee512073b6ac8e6ced3a2
Remove hip-hcc code from the hip code base
Simplify hip CMakeLists.txt to exclude hip-hcc
Simplify the cmake command for building hip-rocclr
Some minor fixes
Change-Id: I1ae357ecfd638d6c25bca293c1724b026be21ecd
* all thread local access now through single struct
* clean up old commented-out code, more use of GET_TLS()
* fewer calls to GET_TLS by passing tls as a function argument
* revert unnecessary change to printf
* fix failing tests due to TLS change
* fix merge conflicts in ihipOccupancyMaxActiveBlocksPerMultiprocessor
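The "fewer calls to GET_TLS by passing tls as a function argument" item can be pictured with the sketch below. LookupTls is a hypothetical stand-in for the GET_TLS() macro, whose real definition is not shown here; the struct, the helper names, and the lookup counter exist only to make the pattern visible.

```cpp
// Hypothetical single struct holding all thread-local state.
struct TlsAggregate {
  int last_error = 0;
  int lookups = 0;  // demo-only counter of how often the TLS was fetched
};

// Stand-in for a GET_TLS()-style lookup (assumed shape, not the real macro).
inline TlsAggregate& LookupTls() {
  thread_local TlsAggregate tls;
  ++tls.lookups;
  return tls;
}

// Helpers receive the already-fetched struct instead of looking it up again.
inline void SetLastError(TlsAggregate& tls, int err) { tls.last_error = err; }

inline int FinishCall(TlsAggregate& tls, int err) {
  SetLastError(tls, err);
  return tls.last_error;
}

// An API entry point performs one lookup and threads the reference through.
inline int ApiEntryPoint(int err) {
  TlsAggregate& tls = LookupTls();
  return FinishCall(tls, err);
}
```

However deep the helper chain goes, ApiEntryPoint costs exactly one lookup per call in this sketch.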
Logging status of hipCtxSynchronize was missing
Test if hip profiling is active for MARKER_END in ihipPostLaunchKernel
Add MARKER_END after the completion of a kernel launched through
the "grid launch"
1) hipSetDevice sets a flag so that next call to hipCtxGetCurrent returns primary context on current device
2) hipCtxGetCurrent returns primary context on current device if TLS context stack is empty
3) hipCtxPopCurrent falls back to primary context on current device as default
4) hipCtxPushCurrent, hipCtxSetCurrent and hipCtxCreate reset the flag set in hipSetDevice
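The four rules above can be modelled with a small per-thread state machine. This is a sketch of one reading of those rules, not the runtime's implementation: the types Ctx and ThreadCtxModel are hypothetical, hipCtxSetCurrent and hipCtxCreate are folded into the push path, and rule 3 is interpreted as "after a pop empties the stack, the current context is the primary one".

```cpp
#include <vector>

// Hypothetical context record: which device it belongs to, and whether it is
// that device's primary context.
struct Ctx {
  int device;
  bool primary;
};

// Models one thread's view: a context stack, the current device, and the
// flag that hipSetDevice is described as setting.
class ThreadCtxModel {
 public:
  explicit ThreadCtxModel(int device_count) {
    for (int d = 0; d < device_count; ++d) primary_.push_back(Ctx{d, true});
  }

  // Rule 1: selecting a device makes the next "get current" return that
  // device's primary context.
  void SetDevice(int device) {
    device_ = device;
    use_primary_ = true;
  }

  // Rule 2: the primary context is also returned when the stack is empty.
  Ctx* GetCurrent() {
    if (use_primary_ || stack_.empty()) return &primary_[device_];
    return stack_.back();
  }

  // Rule 4: an explicit push (or set/create) clears the flag from SetDevice.
  void PushCurrent(Ctx* ctx) {
    stack_.push_back(ctx);
    use_primary_ = false;
  }

  // Rule 3 (as interpreted here): popping never leaves the thread without a
  // context, because GetCurrent falls back to the primary one.
  Ctx* PopCurrent() {
    if (stack_.empty()) return nullptr;
    Ctx* top = stack_.back();
    stack_.pop_back();
    return top;
  }

 private:
  std::vector<Ctx> primary_;  // one primary context per device
  std::vector<Ctx*> stack_;   // this thread's context stack
  int device_ = 0;
  bool use_primary_ = false;
};
```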
1. hipHccModuleLaunchKernel is same as hipModuleLaunchKernel with OpenCL workitem model
2. Added copyright
3. Fixed header naming
Change-Id: I6a7c35a3566e2f8d3f5056613e34193775d4b236
-Contexts across threads are listed under device
-Device reset cleans up all contexts and re-initializes _primaryCtx
Change-Id: Ie1cfbb26d43a8dc6869be3e6ebaf7344ce374643