f68149eafd
Update HIP's unsafeAtomicAdd to:
- Compile properly even when not compiling for gfx90a
- Fall back to safe atomic add on non-gfx90a architectures
- Use flat atomic add for FP64 on gfx90a, instead of dynamically
  checking memory spaces

In addition, when the compiler is passed -munsafe-fp-atomics, it
defines __AMDGCN_UNSAFE_FP_ATOMICS__. With this define, the compiler
requests that the HIP headers force all HIP atomicAdd() calls on floats
or doubles to use their unsafe versions. With this patch, atomicAdd()
therefore calls unsafeAtomicAdd() when that define is seen. The same
applies to atomicSub(), since it calls atomicAdd() underneath. This is
not done for system-scope atomicAdd because, on gfx90a, system-scope
atomic FP add instructions would need to target fine-grained memory,
which is always unsafe.

This patch also creates safeAtomicAdd() functions for float and double.
These functions create a standalone safe atomic, even when the
application is compiled with -munsafe-fp-atomics.

Finally, this patch adds wrappers in the Nvidia path of HIP so that
these HIP functions call through to atomicAdd there as well.

Change-Id: I8af0621d3d28ea30c9278bfeea7393d03bbdac6d
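
The dispatch described above can be modeled on the host as a rough sketch.
This is not the HIP header code: the `sketch` namespace, the `Path`
bookkeeping, and the use of std::atomic with a compare-and-swap loop are
assumptions made purely for illustration. Only the function names and the
__AMDGCN_UNSAFE_FP_ATOMICS__ define come from the message itself.

```cpp
#include <atomic>

namespace sketch {

// Hypothetical bookkeeping so the chosen path is observable in this model.
enum class Path { Safe, Unsafe };
inline Path& lastPath() { static Path p = Path::Safe; return p; }

// Shared host stand-in: a compare-and-swap loop that returns the old value.
inline float casAdd(std::atomic<float>& addr, float value) {
    float old = addr.load(std::memory_order_relaxed);
    while (!addr.compare_exchange_weak(old, old + value,
                                       std::memory_order_relaxed)) {
        // 'old' was refreshed by the failed exchange; retry with it.
    }
    return old;
}

// Models safeAtomicAdd(): always the safe path, regardless of the define.
inline float safeAtomicAdd(std::atomic<float>& addr, float value) {
    lastPath() = Path::Safe;
    return casAdd(addr, value);
}

// Models unsafeAtomicAdd(). The host has no gfx90a FP atomic instruction,
// so this model performs the same CAS add, mirroring the fallback the
// message describes for non-gfx90a architectures.
inline float unsafeAtomicAdd(std::atomic<float>& addr, float value) {
    lastPath() = Path::Unsafe;
    return casAdd(addr, value);
}

// Models atomicAdd(): routed to the unsafe version only when the compiler
// has defined __AMDGCN_UNSAFE_FP_ATOMICS__ (via -munsafe-fp-atomics).
inline float atomicAdd(std::atomic<float>& addr, float value) {
#if defined(__AMDGCN_UNSAFE_FP_ATOMICS__)
    return unsafeAtomicAdd(addr, value);
#else
    return safeAtomicAdd(addr, value);
#endif
}

// Models atomicSub(): implemented on top of atomicAdd(), so it follows
// whichever path atomicAdd() selects.
inline float atomicSub(std::atomic<float>& addr, float value) {
    return atomicAdd(addr, -value);
}

}  // namespace sketch
```

In this model, code that must stay safe under -munsafe-fp-atomics would call
sketch::safeAtomicAdd() directly instead of sketch::atomicAdd().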