The test needs further changes before it can be built on the nvcc
path. Disabling it for now so that the pull request can be merged.
Change-Id: I19a894fcda7b2159f86a4e4e95a409c5655d3760
[ROCm/clr commit: 0d537f9966]
1) hipSetDevice sets a flag so that the next call to hipCtxGetCurrent returns the primary context on the current device.
2) hipCtxGetCurrent returns the primary context on the current device if the TLS context stack is empty.
3) hipCtxPopCurrent falls back to the primary context on the current device by default.
4) hipCtxPushCurrent, hipCtxSetCurrent and hipCtxCreate reset the flag set by hipSetDevice.
[ROCm/clr commit: c4e9323877]
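
The four rules in the commit above can be sketched as a small state machine. This is a toy Python model, not the HIP runtime: the class name, the string context labels, and the choice to have the pop operation return the primary context when the stack is empty are assumptions made for illustration.

```python
class HipContextModel:
    """Toy model of per-thread context selection (hypothetical, not HIP code)."""

    def __init__(self, device_count):
        # One primary context per device, represented by a string label.
        self.primary = [f"primary:{d}" for d in range(device_count)]
        self.current_device = 0
        self.stack = []           # stands in for the TLS context stack
        self.use_primary = False  # the flag set by set_device (rule 1)

    def set_device(self, device):
        # Rule 1: remember that the next lookup should see the primary context.
        self.current_device = device
        self.use_primary = True

    def ctx_get_current(self):
        # Rules 1 and 2: flag set, or empty stack, yields the primary context.
        if self.use_primary or not self.stack:
            return self.primary[self.current_device]
        return self.stack[-1]

    def ctx_pop_current(self):
        # Rule 3 (one interpretation): with nothing to pop, fall back to
        # the primary context on the current device.
        if self.stack:
            return self.stack.pop()
        return self.primary[self.current_device]

    def ctx_push_current(self, ctx):
        # Rule 4: an explicit push clears the flag.
        self.stack.append(ctx)
        self.use_primary = False

    def ctx_set_current(self, ctx):
        # Rule 4: replace the top of the stack (or start one) and clear the flag.
        if self.stack:
            self.stack[-1] = ctx
        else:
            self.stack.append(ctx)
        self.use_primary = False

    def ctx_create(self, name):
        # Rule 4: assumed here to push the new context and clear the flag.
        ctx = f"ctx:{name}"
        self.stack.append(ctx)
        self.use_primary = False
        return ctx
```

In this model, calling set_device(1) after creating a context makes ctx_get_current() report "primary:1" until the next push, set, or create call clears the flag.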
- Refactoring introduced a bug that appears when the user does not
specify any target via --amdgpu-target but has an invalid target set
in HCC_AMDGPU_TARGET. In this case the selection logic defaulted to
gfx803.
- Removed defaulting to any specific target if rocm_agent_enumerator
fails. hipcc will report this and die if linking was required.
Change-Id: I76131867049fef92331807dd19a926406dcc1d02
[ROCm/clr commit: 8f6c150134]
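
A minimal sketch of the "no silent default" behavior described in the commit above, written in Python rather than in hipcc's own script. The function names, the illustrative target list, the comma-separated format of the environment value, and the choice to raise on an unknown name are all assumptions; the real hipcc logic may differ.

```python
# Illustrative set of target names; not an authoritative list.
KNOWN_TARGETS = frozenset({"gfx701", "gfx801", "gfx802", "gfx803", "gfx900"})


def parse_env_targets(env_value):
    """Split a comma-separated target string and reject unknown names.

    An invalid name raises instead of being replaced by a default target.
    """
    targets = [t.strip() for t in env_value.split(",") if t.strip()]
    bad = [t for t in targets if t not in KNOWN_TARGETS]
    if bad:
        raise ValueError("invalid target(s): " + ", ".join(bad))
    return targets


def check_link_targets(targets, linking_required):
    """Fail only when linking actually needs a target and none is known."""
    if linking_required and not targets:
        raise RuntimeError("no AMDGPU target available for linking")
    return targets
```

With this shape, an empty result from a failed enumerator is harmless for compile-only invocations and becomes an error only at link time.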
The existing logic has a bug. If the user specifies targetA via
command-line options while the enumerator returns targetB, hipcc
creates a fatbin containing both targetA and targetB. The enumerator
should only be used when the user specifies no target (command line
or env var).
Change-Id: I6da857f86860c0e671b5988cd858644a08f723b9
[ROCm/clr commit: 2a2c7575eb]
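
The intended precedence from the commit above can be sketched as follows. This is a hypothetical Python helper, not hipcc code: the function name and parameters are invented for illustration, and ranking command-line targets ahead of the environment variable is an assumption, since the commit only says the enumerator is a last resort.

```python
def select_targets(cli_targets, env_targets, enumerate_agents):
    """Pick build targets; consult the enumerator only if the user gave none.

    cli_targets / env_targets: lists of target names supplied by the user.
    enumerate_agents: zero-argument callable returning detected targets.
    """
    if cli_targets:
        chosen = cli_targets
    elif env_targets:
        chosen = env_targets
    else:
        # The enumerator is called only on this branch, so its results
        # can never be merged with user-specified targets.
        chosen = enumerate_agents()
    # Drop duplicates while preserving the original order.
    return list(dict.fromkeys(chosen))
```

Passing the enumerator as a callable keeps it from running at all when the user has already chosen a target, which is the property the fix is after.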
The associated change optimizes event recording so that it uses
agent-scope release (it was only using system-scope release
to support non-coherent host memory).
Flags and environment variables exist to obtain the previous behavior
if desired. The options are documented in the new performance guide.
[ROCm/clr commit: 6576201ec2]
The CMake cache was being rebuilt on each build. This was done
to update HIP_VERSION, HCC_VERSION, .hipInfo and .hipVersion.
However, rebuilding the cache also re-runs the HIT parser, which is
slow. Removing the cache rebuild should speed up the build, but the
user must now explicitly rebuild the cache when HIP_VERSION or
HCC_VERSION changes by calling "make rebuild_cache".
Change-Id: Ia5476eb7105aa614239c4dc7968c37f5e6cb0b29
[ROCm/clr commit: 1b5d19ff36]
[hipify-clang] Finally finished syncing with CUDA 8.0.61 Driver and Runtime API (including missing data types, D3D, OpenGL, VDPAU and EGL interop).
+ All the Modules are supported now:
1) 4.1 – 4.31 from CUDA 8.0.61 Driver API
2) CUDA_Driver_API_functions_supported_by_HIP.md updated accordingly
3) 4.1 – 4.31 from CUDA 8.0.61 Runtime API
4) CUDA_Runtime_API_functions_supported_by_HIP.md updated accordingly
+ Typo fixes
+ Annotations
[ROCm/clr commit: 9b10efe419]