* Enabled gcc for hip host code
* Adding tests for hip code + (gcc & g++), without kernels
* Excluding nvcc platforms for gcc and g++ tests + Addressing review comments
* minor code clean-up
* Add rocm include path
* Added relative path for library
* Hiding non supported functions for gcc
* Incorporating review comments
* all thread local access now through single struct
* clean up old commented-out code, more use of GET_TLS()
* fewer calls to GET_TLS by passing tls as a funtion argument
* revert unnecessary change to printf
* fix failing tests due to TLS change
* fix merge conflicts in ihipOccupancyMaxActiveBlocksPerMultiprocessor
* Added support of hipOccupancyMaxActiveBlocksPerMultiprocessor & hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags APIs
* Taking into account of SGPR usage to determine the max active blocks in hipOccupancyMaxActiveBlocksPerMultiprocessor()
* Add Max Texture 1D,2D,3D device properties
* Corrected testcase to use enums defined in hipDeviceAttribute_t
* Added texture 1D,2D and 3D support for NVIDIA path
* UChar and UShort textures as Normalized Float
* UChar and UShort textures as Normalized Float for all float variants
* Handled uninitilaized texture format value
[Reason] To be compatible with CUDA [#1133]
Update HIP code, hipify-clang, tests and docs
[TODO] Add support of the corresponding functions on nvcc fallback path
Typo introduced here:
commit 67abac1365
Author: Alex Voicu <alexandru.voicu@amd.com>
Date: Mon Jun 24 20:02:09 2019 -0500
Put 3-wide vector types on a ketogenic diet. (#1180)
* Remove flags parameter from hipOccupancyMaxPotentialBlockSize
This commit makes the hipOccupancyMaxPotentialBlockSize method consistent with hcc path and the CUDA API.
* Put 3-wide vector types on a ketogenic diet.
* Remove needless include.
* Do not be narrow-minded.
* Do not be narrow-minded.
* Put the C people on a diet too.
* [hip] implement the hipExtLaunchMultiKernelMultiDevice API
* add a guard to check the HCC version for acquire_locked_hsa_queue() API which was introdued in HCC for ROCm 2.5
* modified code based on the requested changes
* changes to lock all streams before launching kernels for each device and unlock them after the dispatches
* check each stream to be valid before starting to lock all the streams
* Implement the hipOccupancyMaxPotentialBlockSize function
* Replaced hipGetDeviceProperties() call by ihipGetDeviceProperties() in ihipOccupancyMaxPotentialBlockSize()
* Add test for hipOccupancyMaxPotentialBlockSize in Module API
* Added extern declaration for ihipGetDeviceProperties() to be accessed inside ihipOccupancyMaxPotentialBlockSize()
* fixed hipOccupancyMaxPotentialBlockSize test build issue
* Fix hipOccupancyMaxPotentialBlockSize dtest
* Add BUILD_CMD in hipOccupancyMaxPotentialBlockSize dtest
* Revert "Add BUILD_CMD in hipOccupancyMaxPotentialBlockSize dtest"
This reverts commit 0480ff56f1441fc515d2c26ce33783e303423938.
* Disable hipOccupancyMaxPotentialBlockSize dtest on NVCC
* move extern declaration of ihipGetDeviceProperties to hip_module.cpp
* Update the limiation of 32 wavefronts per CU and 800/512 SGPRs for VI/pre-VI chips to calculate the occupancy
+ Makes hip_Memcpy2D struct compatible with CUDA_MEMCPY2D struct
+ Add hipMemcpyParam2D support in nvcc fallback path
+ Update hipify-clang, tests and docs accordingly
* Add ersatz for NVRTC.
* Fix extraneous paren and use correct namespace.
* Use lowerCamelCase (yuck, yuck) consistently.
* Link against FS when building hiprtc lib.
* Correctly mark Manipulators. Fix dual compile.
* Add unit tests. Extend HIT to accept linker options.
* Make sure the HIPRTC library is installed.
* Better logging. Try to auto-detect the target.
* Stop specifying the target explicitly.
* Add missing flavour of `hipModuleLaunchKernel`.
* Program was already destroyed.
* Don't use `--genco`. Fix mangled name trimming.
* Fix HIPRTC breakage due to upstream noise.
* [dtests] Replace RUN -> TEST in hiprtc tests
Change-Id: Ie499e92dfe4e5c94634b1c2b76cf52d241bcfea3
* [hit] Set HIP_PATH to HIP_ROOT_DIR for all tests
Change-Id: Ib0ad1f99bc71c03e363e055dd508a7a4a210680a