aligned.
In hcc_am, a larger buffer is allocated to leave room for alignment,
and _unalignedDevicePointer is added to struct AmPointerInfo to record
the originally allocated address.
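
The over-allocation scheme described above can be sketched as follows. This is a minimal illustration, not the hcc_am code: PointerInfo, aligned_alloc and the raw_alloc callback are hypothetical names, and only the two pointer fields are modeled.

```python
from dataclasses import dataclass


@dataclass
class PointerInfo:
    """Hypothetical stand-in for AmPointerInfo (two pointer fields only)."""
    device_pointer: int            # aligned address handed to the caller
    unaligned_device_pointer: int  # address the allocator originally returned
    size_bytes: int


def aligned_alloc(raw_alloc, size, alignment):
    """Over-allocate by `alignment` bytes, then round the address up.

    `raw_alloc` is an assumed callback that takes a byte count and
    returns an integer address.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a power of two")
    raw = raw_alloc(size + alignment)  # bigger buffer leaves room to shift
    aligned = (raw + alignment - 1) & ~(alignment - 1)
    return PointerInfo(aligned, raw, size)
```

Keeping the original address matters because the buffer has to be released through the pointer the allocator returned, not through the shifted one.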
1) hipSetDevice sets a flag so that the next call to hipCtxGetCurrent returns the primary context on the current device
2) hipCtxGetCurrent returns the primary context on the current device if the TLS context stack is empty
3) hipCtxPopCurrent falls back to the primary context on the current device by default
4) hipCtxPushCurrent, hipCtxSetCurrent and hipCtxCreate reset the flag set by hipSetDevice
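
The four rules above can be modeled with a small toy state machine. This is a sketch only, not the HIP implementation: ContextState and its method names are hypothetical, contexts are plain Python values, and two points are assumptions rather than facts from this change: that set_current replaces the top of the stack, and that rule 3 means a pop on an empty stack yields the primary context.

```python
class ContextState:
    """Toy model of one thread's context state (illustrative only)."""

    def __init__(self, device=0):
        self.device = device
        self.stack = []           # models the TLS context stack
        self.use_primary = False  # flag set by set_device (rule 1)

    def primary(self):
        return ("primary", self.device)

    def set_device(self, device):
        # Rule 1: the next get_current reports the primary context.
        self.device = device
        self.use_primary = True

    def get_current(self):
        # Rules 1 and 2: flag set, or empty stack, means primary context.
        if self.use_primary or not self.stack:
            return self.primary()
        return self.stack[-1]

    def push_current(self, ctx):
        # Rule 4: an explicit context operation clears the flag.
        self.stack.append(ctx)
        self.use_primary = False

    def set_current(self, ctx):
        # Rule 4, assuming set_current replaces the top of the stack.
        if self.stack:
            self.stack[-1] = ctx
        else:
            self.stack.append(ctx)
        self.use_primary = False

    def pop_current(self):
        # Rule 3, read here as: an empty stack yields the primary context.
        if self.stack:
            return self.stack.pop()
        return self.primary()
```

A context creation call would behave like push_current with a freshly made context in this model.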
- Refactoring introduced a bug when the user does not specify any
target via --amdgpu-target but has an invalid target specified in
HCC_AMDGPU_TARGET. In this case the selection logic was defaulting to
gfx803.
- Removed defaulting to any specific target if rocm_agent_enumerator
fails. hipcc will report this and die if linking is required.
Change-Id: I76131867049fef92331807dd19a926406dcc1d02
Existing logic has a bug. If the user specifies targetA via command-line
options while the enumerator returns targetB, hipcc creates a fatbin
containing both targetA and targetB. The enumerator should only be used
when no target is specified by the user (command line or env var).
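
The intended precedence can be sketched as below. This is an illustration of the rule, not hipcc's code: select_targets, its parameters and the known_targets set are hypothetical, and it assumes that command-line targets take priority over the environment variable and that the variable holds a comma-separated list.

```python
def select_targets(cli_targets, env_value, enumerate_agents, known_targets):
    """Return the targets to build for; an empty list means none was chosen.

    cli_targets      -- targets given on the command line (assumed a list)
    env_value        -- raw environment variable text (assumed comma-separated)
    enumerate_agents -- callback standing in for the enumerator
    known_targets    -- set of target names accepted as valid
    """
    requested = list(cli_targets)
    if not requested and env_value:
        requested = [t for t in env_value.split(",") if t]
    if requested:
        # The user made a choice: never mix in enumerator results.
        return [t for t in requested if t in known_targets]
    try:
        return [t for t in enumerate_agents() if t in known_targets]
    except OSError:
        # No silent default target; the caller reports the failure.
        return []
```

With this shape an invalid user-specified target produces an empty result rather than a fallback, so the caller can report the problem when linking is required.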
Change-Id: I6da857f86860c0e671b5988cd858644a08f723b9
An associated change optimizes event recording so that it uses
agent-scope release (it was only using system-scope release
to support non-coherent host memory).
Flags and environment variables exist to obtain the previous behavior
if desired. The options are documented in the new performance guide.
The CMake cache was being rebuilt on each build. This was done
to update HIP_VERSION, HCC_VERSION, .hipInfo and .hipVersion.
However, rebuilding the cache also re-runs the HIT parser, which is
slow. Removing the cache rebuild should speed up the build, but the
user now needs to rebuild the cache explicitly when HIP_VERSION or
HCC_VERSION changes, by calling "make rebuild_cache".
Change-Id: Ia5476eb7105aa614239c4dc7968c37f5e6cb0b29
[hipify-clang] Finally finished syncing with CUDA 8.0.61 Driver and Runtime API (including missing data types, D3D, OpenGL, VDPAU and EGL interop).
+ All the Modules are supported now:
1) 4.1 – 4.31 from CUDA 8.0.61 Driver API
2) CUDA_Driver_API_functions_supported_by_HIP.md updated accordingly
3) 4.1 – 4.31 from CUDA 8.0.61 Runtime API
4) CUDA_Runtime_API_functions_supported_by_HIP.md updated accordingly
+ Typo fixes
+ Annotating