Changed the third argument of __hip_as_write_block and __ockl_as_write_block from ulong to uint64_t to fix a compilation error on Windows
* Enabled gcc for hip host code
* Adding tests for hip code + (gcc & g++), without kernels
* Excluding nvcc platforms for gcc and g++ tests + Addressing review comments
* minor code clean-up
* Add rocm include path
* Added relative path for library
* Hiding unsupported functions for gcc
* Incorporating review comments
* Added support for the hipOccupancyMaxActiveBlocksPerMultiprocessor & hipOccupancyMaxActiveBlocksPerMultiprocessorWithFlags APIs
* Taking SGPR usage into account to determine the max active blocks in hipOccupancyMaxActiveBlocksPerMultiprocessor()
* Fix hipMemcpy-size test running out of Host Mem
The hipMemcpy-size test uses a maxElem calculated from the total GPU memory / 8, then allocates 4 times that amount of host memory. The test fails when there is not enough host memory, such as on systems with 32GB of GPU memory and 16GB of RAM. This change fixes the test for systems where not enough host memory is available.
* Add windows support to hipMemcpy-size fix
* avoid linking extra libs for windows
* HIPMemcpy-size Remove freeCPU including swap
* [HIP][tests] New testcases for module api
* [HIP][Tests]Support for CUDA devices
* Updated tests as per latest master & test GetGlobal to work on all platforms
* Add Max Texture 1D,2D,3D device properties
* Corrected testcase to use enums defined in hipDeviceAttribute_t
* Added texture 1D,2D and 3D support for NVIDIA path
* Put 3-wide vector types on a ketogenic diet.
* Remove needless include.
* Do not be narrow-minded.
* Do not be narrow-minded.
* Put the C people on a diet too.
* Implement the hipOccupancyMaxPotentialBlockSize function
* Replaced hipGetDeviceProperties() call by ihipGetDeviceProperties() in ihipOccupancyMaxPotentialBlockSize()
* Add test for hipOccupancyMaxPotentialBlockSize in Module API
* Added extern declaration for ihipGetDeviceProperties() to be accessed inside ihipOccupancyMaxPotentialBlockSize()
* fixed hipOccupancyMaxPotentialBlockSize test build issue
* Fix hipOccupancyMaxPotentialBlockSize dtest
* Add BUILD_CMD in hipOccupancyMaxPotentialBlockSize dtest
* Revert "Add BUILD_CMD in hipOccupancyMaxPotentialBlockSize dtest"
This reverts commit 0480ff56f1441fc515d2c26ce33783e303423938.
* Disable hipOccupancyMaxPotentialBlockSize dtest on NVCC
* move extern declaration of ihipGetDeviceProperties to hip_module.cpp
* Update the limitation of 32 wavefronts per CU and 800/512 SGPRs for VI/pre-VI chips to calculate the occupancy
* Add ersatz for NVRTC.
* Fix extraneous paren and use correct namespace.
* Use lowerCamelCase (yuck, yuck) consistently.
* Link against FS when building hiprtc lib.
* Correctly mark Manipulators. Fix dual compile.
* Add unit tests. Extend HIT to accept linker options.
* Make sure the HIPRTC library is installed.
* Better logging. Try to auto-detect the target.
* Stop specifying the target explicitly.
* Add missing flavour of `hipModuleLaunchKernel`.
* Program was already destroyed.
* Don't use `--genco`. Fix mangled name trimming.
* Fix HIPRTC breakage due to upstream noise.
* [dtests] Replace RUN -> TEST in hiprtc tests
Change-Id: Ie499e92dfe4e5c94634b1c2b76cf52d241bcfea3
* [hit] Set HIP_PATH to HIP_ROOT_DIR for all tests
Change-Id: Ib0ad1f99bc71c03e363e055dd508a7a4a210680a
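
The hipMemcpy-size fix above can be illustrated with a minimal sketch. The helper name `clampMaxElem` and both parameters are hypothetical, and the sketch assumes that maxElem is a byte count and that the test needs exactly 4 * maxElem bytes of host memory; the real test may compute these values differently.

```cpp
#include <cstdint>

// Hypothetical helper, not taken from the test source.
// Assumption: maxElem is a byte count derived from total GPU memory / 8,
// and the test allocates 4 * maxElem bytes on the host.
constexpr std::uint64_t clampMaxElem(std::uint64_t totalGpuBytes,
                                     std::uint64_t freeHostBytes) {
    // Size the test would pick from GPU memory alone.
    const std::uint64_t fromGpu = totalGpuBytes / 8;
    // Largest size whose 4x host allocation still fits in free host memory.
    const std::uint64_t fromHost = freeHostBytes / 4;
    return fromGpu < fromHost ? fromGpu : fromHost;
}
```

With 32GB of GPU memory the unclamped size is 4GB, which needs 16GB on the host; clamping against free host memory keeps the allocation within what the system can actually provide.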
- Current clang disallows any invocation of wrong-side functions, even
in contexts used only for type inspection. Work around that by adding a
variant of `std::declval` with the `__device__` attribute.
- It is a common mistake to assume that `1 << shamt` is promoted to
64-bit when shamt is a 64-bit integer. That is not the case. Replace
that left shift with a 64-bit one to ensure it does not fall into
undefined behavior.
- Fix the host-side implementation as well for device function testing.
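
The shift issue above can be shown with a small sketch. The function name `bitMask64` is hypothetical and is not the code touched by this change; it only illustrates that the type of a shift result comes from the promoted left operand, so the literal has to be widened explicitly.

```cpp
#include <cstdint>
#include <type_traits>

// The result type of a shift is the promoted type of the LEFT operand.
// The right operand's type does not widen it, so `1 << shamt` stays int
// even when shamt is 64-bit (decltype does not evaluate the shift).
static_assert(std::is_same<decltype(1 << std::uint64_t{40}), int>::value,
              "1 << (64-bit shamt) is still an int expression");

// Hypothetical helper: widen the left operand first, so the shift is
// performed in 64 bits. Well defined for shamt in [0, 63].
constexpr std::uint64_t bitMask64(std::uint64_t shamt) {
    return std::uint64_t{1} << shamt;
}
```

Shifting a 32-bit int by 32 or more is undefined behavior, which is why the widened form is needed whenever shamt can exceed 31.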