* [HIP][tests] New testcases for module api
* [HIP][Tests]Support for CUDA devices
* Updated tests as per latest master & test GetGlobal to work on all platforms
* Add Max Texture 1D,2D,3D device properties
* Corrected testcase to use enums defined in hipDeviceAttribute_t
* Added texture 1D,2D and 3D support for NVIDIA path
* UChar and UShort textures as Normalized Float
* UChar and UShort textures as Normalized Float for all float variants
* Handled uninitilaized texture format value
+ Source files are the first to go. It is needed for in-place hipification in order to avoid errors with included but already hipified header files.
+ More extensions support for batch processing.
[Reason] To be compatible with CUDA [#1133]
Update HIP code, hipify-clang, tests and docs
[TODO] Add support of the corresponding functions on nvcc fallback path
+ Add option -print-stats-csv to dump statistics to CSV file
+ If -o-dir is specified, CSV file will be dumped there
+ Generate 1 summary file sum_stat.csv in case of multiple sources
Typo introduced here:
commit 87eac86298
Author: Alex Voicu <alexandru.voicu@amd.com>
Date: Mon Jun 24 20:02:09 2019 -0500
Put 3-wide vector types on a ketogenic diet. (#1180)
* Remove flags parameter from hipOccupancyMaxPotentialBlockSize
This commit makes the hipOccupancyMaxPotentialBlockSize method consistent with hcc path and the CUDA API.
module_api_global relies on a HCC only feature which allows host code
to write to device variables. This feature does not exist in CUDA
or hip-clang, which causes the sample not working in CUDA or hip-clang.
This patch fixes the sample by using standard features of CUDA and
hip-clang. The fixed sample works in HCC, CUDA and hip-clang.
module_api_global relies on a HCC only feature which allows host code
to write to device variables. This feature does not exist in CUDA
or hip-clang, which causes the sample not working in CUDA or hip-clang.
This patch fixes the sample by using standard features of CUDA and
hip-clang. The fixed sample works in HCC, CUDA and hip-clang.
This fixes a bug where GCC++ on Ubuntu 18.04 creates failing executables compared to GCC++ on 16.04 and clang++. While creating function names on Ubuntu 18.04, dl_phdr_info seems to provide a non-zero value for dlpi_addr on initial iteration, and an empty string in dlpi_name. This is causing failure when linking with g++, since the empty string prevents the kernel function from being loaded. Clang++ and GCC on UB16 provide a zero value for dlpi_addr. To fix this, we need to verify both addr and name exists, so that /proc/self/exe can be properly loaded.
* Put 3-wide vector types on a ketogenic diet.
* Remove needless include.
* Do not be narrow-minded.
* Do not be narrow-minded.
* Put the C people on a diet too.