1. Added fma intrinsic support for double and float
2. Added test for fma
Change-Id: I909fdbec34a3d12c03ba6eff3a39376a7128ee43
[ROCm/clr commit: cc1f8a1011]
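For context, a fused multiply-add computes a*b + c with a single rounding step. The sketch below is host-side only and uses the standard C++ `std::fma` as a stand-in; it is not the device intrinsic this commit adds, and the helper name `fused_mul_add` is hypothetical.

```cpp
#include <cmath>

// Hypothetical helper (not part of HIP): a*b + c with a single rounding.
// std::fma is the standard C++ host function, used here only to
// illustrate what a fused multiply-add computes.
double fused_mul_add(double a, double b, double c) {
    return std::fma(a, b, c);
}

// Single-precision overload, mirroring the "double and float" coverage.
float fused_mul_add(float a, float b, float c) {
    return std::fma(a, b, c);
}
```

With a = 1 + 2^-30, b = 1 - 2^-30 and c = -1, the exact product is 1 - 2^-60, so the fused result keeps the -2^-60 term that a separately rounded multiply would lose.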
1. Use -DHIP_FAST_MATH to compile precise math functions as their fast math variants
2. Added double fast math functions for sqrt
3. Changed hipcc to parse -use_fast_math (not working yet)
4. Added passed tag to hipFloatMath test
Change-Id: I72884b2436b4efe61e9a9297346c1358fee38a2d
[ROCm/clr commit: c2f6ecf264]
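A minimal sketch of how a compile-time define such as HIP_FAST_MATH could switch a call between precise and fast variants. The function names `precise_sqrt`, `fast_sqrt` and `selected_sqrt` are hypothetical, and the fast body is a placeholder; the actual HIP header layout is not shown in this commit message.

```cpp
#include <cmath>

// Hypothetical stand-ins for the precise and fast implementations.
inline double precise_sqrt(double x) { return std::sqrt(x); }
// Placeholder body: a real fast variant would trade accuracy for speed.
inline double fast_sqrt(double x) { return std::sqrt(x); }

// Assumed mechanism: building with -DHIP_FAST_MATH routes the call
// to the fast variant; otherwise the precise variant is used.
#ifdef HIP_FAST_MATH
inline double selected_sqrt(double x) { return fast_sqrt(x); }
#else
inline double selected_sqrt(double x) { return precise_sqrt(x); }
#endif
```

Callers use `selected_sqrt` unchanged; only the build flag decides which variant is compiled in.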
Print TID mapping at init when HIP_TRACE_API=1.
Print base host/dev info from tracker during copy.
Change-Id: I84e26d7b801567e5a91baad36126fb590920ec87
[ROCm/clr commit: 111b57ddd0]
1. Added fast math intrinsics for single precision data types
2. Added test to check the intrinsics
3. Added HIP_PRECISE_MATH macro to enable precise math when fast math is enabled
Change-Id: Iadacbb6182c31252c5e3252854372d1b80dfd27b
[ROCm/clr commit: d9a3527769]
1. Added fast math APIs for sin, cos, tan, sincos
2. Added test for trig math functions
3. Added logarithm fast math
4. Changed how hipGetDevice and hipDeviceGetCacheConfig emit errors
Change-Id: Ie6ab594ddd5853cbe85e39a2f6d3479a807fa323
[ROCm/clr commit: 1a85762f53]
1. Texture functions now compile
2. Renamed hipFuncCache to hipFuncCache_t
Change-Id: I8f815887e4de43ee115bbaff249905b236541c39
[ROCm/clr commit: 2611de2477]
1. Changed test macro to emit line numbers
2. Added getcacheconfig API test for the nvcc path
3. Fixed hipFuncCache_t data type
TODO: After this commit there are two func cache data types:
a. hipFuncCache_t for the runtime API
b. hipFuncCache for the driver API
Map these to a single data type.
Change-Id: Ia47c9f5d7c2633638051bf17b1103048a1ede973
[ROCm/clr commit: b3c16ea7b5]
1. Added copyright to all new tests
2. Added test for hipDeviceGetAttribute
Change-Id: I7a070c5b8316ef6575b3f4c49bda2769aea2a7c4
[ROCm/clr commit: e0aba8647f]
1. Added add, sub, mul packed math i8 intrinsics
2. Removed C++ packed data structures included from HCC
TODO: Add better support for packed data structures
Change-Id: I1d109c5ce10c48b7cd3ea059478b88fc1de78499
[ROCm/clr commit: 603bb321ec]
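A host-side model of what a packed i8 add computes, as a sketch only. It assumes four 8-bit lanes in one 32-bit word and wrapping (non-saturating) arithmetic; neither assumption is confirmed by the commit, and the name `packed_add_i8` is hypothetical rather than the intrinsic's actual name.

```cpp
#include <cstdint>

// Hypothetical model: add four 8-bit lanes packed in a 32-bit word.
// Each lane wraps modulo 256 and never carries into its neighbour.
std::uint32_t packed_add_i8(std::uint32_t a, std::uint32_t b) {
    std::uint32_t result = 0;
    for (int lane = 0; lane < 4; ++lane) {
        const std::uint32_t shift = 8u * static_cast<std::uint32_t>(lane);
        const std::uint32_t x = (a >> shift) & 0xFFu;  // extract lane of a
        const std::uint32_t y = (b >> shift) & 0xFFu;  // extract lane of b
        result |= ((x + y) & 0xFFu) << shift;          // wrap and repack
    }
    return result;
}
```

For example, adding 0x01020304 and 0x10203040 gives 0x11223344, while 0x000000FF plus 0x00000001 gives 0x00000000 because the low lane wraps without carrying.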
Prefer the source engine for DMA copies, even if the user submits the copy
in a stream attached to a different device.
The stream is now used only for synchronization. HIP decides which
engine performs the copy, typically the source copy engine, and passes
that choice to HCC using new APIs.
HIP has additional information about peer visibility and uses it to
decide which agent should perform the copy.
Change-Id: I0cf4cfebeae256e6ca795f08a7ed7130f4857d1f
[ROCm/clr commit: e767e0032e]
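The selection described above can be sketched as follows. This is an illustrative model, not the HIP implementation: the `CopyRequest` struct, its fields, and `choose_copy_device` are hypothetical, and falling back to the destination device when the source cannot see the destination is an assumption.

```cpp
// Hypothetical description of a copy request.
struct CopyRequest {
    int srcDevice;         // device that owns the source buffer
    int dstDevice;         // device that owns the destination buffer
    int streamDevice;      // device the stream is attached to
    bool srcCanAccessDst;  // peer visibility: source can reach destination
};

// Pick the device whose engine performs the copy. The stream's device is
// deliberately ignored: the stream only orders the copy, it does not
// choose the engine.
int choose_copy_device(const CopyRequest& r) {
    // Prefer the source engine; the fallback below is an assumption.
    return r.srcCanAccessDst ? r.srcDevice : r.dstDevice;
}
```

In this model a copy from device 0 to device 1 submitted on a stream attached to device 1 still runs on device 0's engine, provided device 0 can see the destination.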