Currently the texture C++ API is forwarded to the ihip*Impl() calls, which are not even a part of Cuda. These should be forwarded to their respective Cuda C APIs instead.
This change also fixes a bug with hipUnbindTexture() creating a dangling pointer.
Change-Id: Ifafc9d106855a11bec84a18ea214b3d89e39990d
Similar to the previous patch, this change adds type constraints to texture indirect functions. Since we don't have to deduce the return type for these, we simply just have to check if the user provided a valid channel type.
Change-Id: Ia094bd6126e01df2ea90902c9aa59cb6cfe85773
When sampling a pixel the hw always returns a float4. The type in the texture reference controls the bitcast that we perform before returning the sampled pixel. Creating a texture with an unsupported will lead to potential UB.
This change makes it so that it's only possible to use textures with a type that makes sense. Using something like texture<int, hipTextureType1D, hipReadModeNormalizedFloat> will now lead to a compilation error with a message "Invalid channel type!".
Change-Id: I7fde44cb1d4b9737e0c48c28cb59c018c59ccaa2
HIP-Clang cuda_wrapper headers require clang include path before standard C++ include path.
However libc++ include path requires to be before clang include path.
To workaround this, we pass -isystem with the parent directory of clang include
path instead of the clang include path itself.
This change addresses three things.
First the available APIs are brought up to par with Cuda (missing ones are added and incorrect ones removed).
Second the size of hip/hcc_detail/texture_functions.h. Using some template magic we can bring down the code size down from ~11k lines to only ~900 lines in total.
Third this change fixes some bugs in the declaration of the texture fetch funcitons. Currently the return type for textures with read mode set to hipReadModeNormalizedFloat is not float. This causes pixel data to be lost during the bitcast when the texture pixel element size is less than the size of float.
The new headers will only be enabled for VDI to avoid breaking HCC.
Change-Id: I77cb29293fb79e55681be094c37702a48d80b64c
GCC emits a warning about using static functions like
hipCUDAErrorTohipError inside this function, because it has an
inline directive, but it's not static. Adding static to this function
to silence warnings (and prevent potential problems in the future).
NVCC warned if you tried to use hipOccupancyMaxActiveBlocksPerMultiprocessor
because when passing in a device function pointer, "const void* func" was
insufficient to describe it accurately. Adding a C++ templated class type
definition for this function.
- Need to check the availability of `__has_attribute` builtin macro
instead of compiler versions. That's more reliable and portable among
various compilers.
- Provides a very basic support of vectors for unknown compilers.
There are now two implementations of printf in HIP:
1. The implemenation for HCC is controlled by the HC_FEATURE_PRINTF
macro, and it works only with the HCC compiler used in combination
with the HCC runtime.
2. The implementation for hip-clang requires the VDI runtime, and is
always enabled with that combination.
Change-Id: Ibaeda7900ffe2ce602ca0094aafed0f1147ac2b6
There are now two implementations of printf in HIP:
1. The implemenation for HCC is controlled by the HC_FEATURE_PRINTF
macro, and it works only with the HCC compiler used in combination
with the HCC runtime.
2. The implementation for hip-clang requires the VDI runtime, and is
always enabled with that combination.
Fix compilation error on hip-hcc+clang , hip-vdi+clang
Enabled hipExtLaunchMultiKernelMultiDevice test on hip-vdi path
hipExtLaunchMultiKernelMultiDevice common declaration for all paths
Change-Id: I76031840614fce8e12a8e845548fa43a389a741a
Temporarily comment out Hcc-specific template functions
hipExtLaunchKernelGGL and hipOccupancyMaxPotentialBlockSize for CLang
compiler so that all test cases under hip/samples can be built
successfully for Clang + Hip/Hcc runtime.
Change-Id: Iafc761257be4a7b34eafa6759a01f369570cd6ce
* Device texture functions should not normalize the sampled pixel. This is already done by HW.
* Add support to use h/w capability for normalized float data convertion for driver API's
Co-authored-by: ansurya <50609411+ansurya@users.noreply.github.com>