Aditya Atluri
30674382a4
added fma for double and float
...
1. Added fma intrinsic support for double and float
2. Added test for fma
Change-Id: I909fdbec34a3d12c03ba6eff3a39376a7128ee43
2016-11-23 18:22:05 -06:00
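For readers unfamiliar with why an fma intrinsic matters, here is a minimal host-side sketch. It uses the standard C++ `std::fma` from `<cmath>`, not the HIP device intrinsic this commit added, and the helper names (`fused_residual`, `unfused_residual`, `fused_residual_f`) are made up for illustration. It shows that a fused multiply-add rounds once, so it keeps a low-order term that a separate multiply and add would discard.

```cpp
#include <cmath>

// Double case: x = 1 + 2^-30, so the exact square is 1 + 2^-29 + 2^-60.
// The 2^-60 term is lost when the product is rounded to double.
inline double sample_x() { return 1.0 + std::ldexp(1.0, -30); }
inline double rounded_square() { return 1.0 + std::ldexp(1.0, -29); }

// Fused: x*x - rounded_square() is evaluated exactly and rounded once.
double fused_residual() {
    const double x = sample_x();
    return std::fma(x, x, -rounded_square());
}

// Unfused: the product is rounded to double before the subtraction.
// The volatile store keeps the compiler from contracting the two
// operations into a single fused instruction.
double unfused_residual() {
    const double x = sample_x();
    volatile double product = x * x;
    return product - rounded_square();
}

// Float case: x = 1 + 2^-13, exact square is 1 + 2^-12 + 2^-26.
float fused_residual_f() {
    const float x = 1.0f + std::ldexp(1.0f, -13);
    const float p = 1.0f + std::ldexp(1.0f, -12);
    return std::fma(x, x, -p);
}
```

Calling `fused_residual()` recovers the rounding error of the product, while `unfused_residual()` returns zero because that error was already rounded away.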
pensun
8a8c7a6b4d
Add some missing APIs on nv path and hipify
...
Change-Id: Ic0f4740ab06bf70b1de61b39fedc7a6e7605cb61
2016-11-23 14:36:30 -06:00
pensun
a836395350
Add several missing APIs in hipify
...
Change-Id: I58912871cb0b10128f221ef26a11b0d69fb7873c
2016-11-23 14:06:18 -06:00
Aditya Atluri
043da795f6
Added fast math flag
...
1. Use -DHIP_FAST_MATH to compile precise math functions as fast math
2. Added double fast math functions for sqrt
3. Changed hipcc to parse -use_fast_math (not working)
4. Added passed tag to hipFloatMath test
Change-Id: I72884b2436b4efe61e9a9297346c1358fee38a2d
2016-11-23 11:19:15 -06:00
Ben Sander
75c540fe3c
Add toc to hip_profiling.md
...
Change-Id: I3ae100f12686d0398a0403b78ca571382acce135
2016-11-23 08:36:08 -06:00
Ben Sander
dec59d9909
Improve docs in some places
...
Change-Id: If31e84fbf0c8595ca72edb842dce7ce47783579b
2016-11-23 08:16:18 -06:00
Ben Sander
b6ae6b08fb
Improve debug capabilities.
...
Print TID mapping at init when HIP_TRACE_API=1.
Print base host/dev info from tracker during copy.
Change-Id: I84e26d7b801567e5a91baad36126fb590920ec87
2016-11-23 08:16:18 -06:00
Ben Sander
e4e14211b3
Improve profiler and debug documentation
2016-11-23 08:15:40 -06:00
Rahul Garg
6a4f44bce0
Removed nested HIP calls from hip_device functions
...
Change-Id: I18785b0ee27e32fb8950982fa5c3a64d1ae6a9b8
2016-11-23 18:37:06 +05:30
Aditya Atluri
f843928ddd
added fast math intrinsics to HIP
...
1. Added fast math intrinsics for single precision data types
2. Added test to check the intrinsics
3. Added HIP_PRECISE_MATH macro to enable precise math when fast math is on
Change-Id: Iadacbb6182c31252c5e3252854372d1b80dfd27b
2016-11-22 15:26:00 -06:00
Aditya Atluri
94d2115d6d
added fast math APIs
...
1. Added fast math apis for sin, cos, tan, sincos
2. Added test for trig math functions
3. Added logarithm fast math
4. Changed how hipGetDevice, hipDeviceGetCacheConfig emit errors
Change-Id: Ie6ab594ddd5853cbe85e39a2f6d3479a807fa323
2016-11-22 10:20:09 -06:00
Rahul Garg
2dcf20ac6f
Removed hsaKmtReleaseSystemProperties call
...
Change-Id: I7cb992cccf587c333f0ca0cb518409f3944bdb06
2016-11-22 06:15:35 +05:30
Aditya Atluri
7145ea6a4a
fixed error output for hipDeviceGetAttribute
...
Change-Id: I1e343a4e4e20e1a550d419f701cc1e60e9d03af4
2016-11-21 18:07:01 -06:00
Aditya Atluri
d6ad91ffa4
fixed texture header on nvcc
...
Change-Id: Ibe19f94be5edf972b6b51dea263e1088b6c60c1d
2016-11-21 13:53:28 -06:00
Aditya Atluri
6052eaa761
removed warnings in macros
...
Change-Id: I992b11f6aee2bab09f46885a2d12234aa6814cc5
2016-11-21 09:04:36 -06:00
Aditya Atluri
2412c9a061
fixed compilation bugs
...
1. Texture functions are now compiling fine
2. Fixed hipFuncCache to hipFuncCache_t
Change-Id: I8f815887e4de43ee115bbaff249905b236541c39
2016-11-21 08:56:30 -06:00
Aditya Atluri
afaa5fcf96
Fixed hipDeviceGetCacheConfig on nvcc path
...
1. Changed test macro to emit line numbers
2. Added getcacheconfig api test for nvcc path
3. Fixed hipFuncCache_t data type
TODO: With this commit, right now there are 2 func cache datatypes
a. hipFuncCache_t for runtime API
b. hipFuncCache for driver API
Map these to a single data type
Change-Id: Ia47c9f5d7c2633638051bf17b1103048a1ede973
2016-11-20 12:18:08 -06:00
Aditya Atluri
0edc082ff6
added new test for getting attribute
...
1. Added copyright to all new tests
2. Added test for hipDeviceGetAttribute
Change-Id: I7a070c5b8316ef6575b3f4c49bda2769aea2a7c4
2016-11-20 11:53:16 -06:00
Aditya Atluri
a6c4304725
added copyright to new header
...
Change-Id: I16e1d02194551e4b20019bcb6850a3f84882ef18
2016-11-19 23:02:56 -06:00
Aditya Atluri
428041cfc2
added tests to check nvcc runtime api output
...
Change-Id: Ifdd39b5d0a6a58d20a8e9745e59dd82d50a90e2f
2016-11-19 21:36:28 -06:00
Sandeep Kumar
53e771fc75
fix_format
...
Change-Id: I34e265de434263a11654e5deba044c3f21e86578
2016-11-18 14:34:14 +05:30
Maneesh Gupta
c0419cc749
Refactor for building HIP as dynamic library
...
Change-Id: I65a3d9d589c4fdbbdcf1611e5427224253be8260
2016-11-18 14:33:20 +05:30
scchan
3d6bf5e799
Add extra linker flags to the shared library build
...
Change-Id: I19e569d566fb5e25e343e364a3053a3f12659361
2016-11-18 14:18:29 +05:30
Maneesh Gupta
4fc082ff09
Fix broken tests due to dc64a73
...
Change-Id: I847c80f8462e1c955bdef957e6de2841a3a6ab29
2016-11-18 12:20:47 +05:30
Aditya Atluri
1618cb3f85
moved runtime macros to runtime_api.h
...
Change-Id: Ib47e449328e8e6ec55d1b6ee19899de4b591ea8e
2016-11-17 14:19:18 -06:00
Aditya Atluri
c20c524400
added texture header to memory api source
...
Change-Id: I1af6d60aca5a9a9ef1cadf8c304bea892acbe061
2016-11-17 11:57:53 -06:00
Aditya Atluri
dc64a732d8
make texture a separate header for now
...
Change-Id: I3c65aa75f2f729eedd8c3292fa3cbc37709c1cfe
2016-11-17 11:55:29 -06:00
Aditya Atluri
12dd9df88f
Added i8 packed math intrinsics
...
1. Added add, sub, mul packed math i8 intrinsics
2. Removed c++ packed data structures included from HCC
Change-Id: I1d109c5ce10c48b7cd3ea059478b88fc1de78499
TODO: Add better packed data structures support
2016-11-17 01:09:12 -06:00
Maneesh Gupta
76cbf6954d
Merge branch 'amd-develop' into amd-master
...
Change-Id: I4fbb7ac287c182fea97bf31562a3d64554e59e94
2016-11-15 10:44:21 +05:30
Maneesh Gupta
888a3528d2
Enable USE_COPY_EXT_V2 by default
...
Change-Id: I2c0dc80f85a0ccb5744715b5418a604e38b249ed
2016-11-15 10:42:27 +05:30
Ben Sander
0c624c009b
tweak hcc demangler
2016-11-14 15:26:27 -07:00
Maneesh Gupta
8d40253664
Merge branch 'amd-develop' into amd-master
...
Change-Id: I32d41081ac065f2c50531dc2e420802d765665e2
2016-11-14 06:12:03 +05:30
Sandeep Kumar
09b157ca8c
Add p2p for cookbook
...
Change-Id: Id2e77ab31123ef95885d665efe34bc0d4596733a
(cherry picked from commit 6fbd0352713ca36e399b1ed4f17c486207a53875)
2016-11-14 06:10:36 +05:30
Maneesh Gupta
fd1483ce35
Revert "hipcc: Turn back linking hip_ir.ll by default"
...
This reverts commit 528b257004.
2016-11-14 06:05:31 +05:30
Ben Sander
faf2a1e01a
Add draft doc on profiling with hip.
...
Change-Id: I79727dd2500333b3f16acb381dd5852a15ed408a
2016-11-13 10:01:05 -06:00
Ben Sander
c9401cb95f
Add   to demangler
...
Change-Id: I89586c7c17f5152b7a6850d0d6c2aa1d3ebc8190
2016-11-11 16:50:56 -06:00
pensun
50867efa10
Add direct test case for threadfence_system workaround
...
Change-Id: I5b21b590e957c901044741ac94e816cd8b1426f9
2016-11-11 15:09:43 -06:00
Aditya Atluri
abf6872b2b
fixed multi-dim module kernel launch
...
Change-Id: Id1d81f2375d058979ab526433f905cf0ea3d23d6
2016-11-11 12:25:23 -06:00
Ben Sander
1e5515ee9f
Add option to deny peer access.
...
Also fix test.
Change-Id: I1b247f6c4271442b008e560669bca4daf8eb94c7
2016-11-10 23:12:48 -06:00
Ben Sander
65584e48de
Use forceUnpinnedCopy to resolve P2P corner cases.
...
Change-Id: I2aebb419881246cebb696bec87798635bc71acc2
2016-11-10 23:12:48 -06:00
Ben Sander
d3d6feb4de
Enable async copy again.
...
Also add HIP_FORCE_SYNC_COPY chicken bit.
Change-Id: I76a385410494b99bf27305d3c08f55dd81987565
2016-11-10 23:12:48 -06:00
Ben Sander
8724273f28
Doc change only - add comments to test.
...
Change-Id: Ie42087cf3c78e49337b18bb71f3f0e1e7950ee1b
2016-11-10 23:12:48 -06:00
Ben Sander
ced9d72d94
Refactor copy and P2P logic.
...
Prefer the source engine for DMA copies, even if the user submits the copy
in a stream attached to a different device.
The stream is now used only for synchronization. HIP decides which
engine performs the copy (typically the source copy engine) and
passes that choice to HCC using new APIs.
HIP has additional information about peer visibility, which it uses
to decide which agent should perform the copy.
Change-Id: I0cf4cfebeae256e6ca795f08a7ed7130f4857d1f
2016-11-10 23:12:48 -06:00
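The engine choice described in this commit can be sketched roughly as follows. This is an illustrative reconstruction, not the HIP or HCC code: `CopyEngine`, `PeerMatrix`, and `choose_copy_engine` are hypothetical names, and the fallback order after the source engine (destination engine, then a staged copy) is an assumption that the commit message does not spell out. The stream's device is deliberately not an input, since the commit says the stream is used only for synchronization.

```cpp
#include <vector>

// Hypothetical result of the engine choice (not a HIP or HCC type).
enum class CopyEngine { Source, Destination, Staged };

// peer[i][j] is true when device i can directly access memory on device j.
using PeerMatrix = std::vector<std::vector<bool>>;

// Pick the agent that performs a device-to-device copy from src to dst.
CopyEngine choose_copy_engine(const PeerMatrix& peer, int src, int dst) {
    // Same device: the source engine can always do the copy.
    if (src == dst) return CopyEngine::Source;
    // Preferred case: the source device can see the destination memory.
    if (peer[src][dst]) return CopyEngine::Source;
    // Assumed fallback: the destination device pulls from the source.
    if (peer[dst][src]) return CopyEngine::Destination;
    // Assumed last resort: neither side has peer access, so stage the copy.
    return CopyEngine::Staged;
}
```

With a matrix where only device 1 can see device 0, a copy from 0 to 1 would land on the destination engine and a copy from 1 to 0 on the source engine.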
Ben Sander
2dea3a0b1a
Improve memory debug
...
Change-Id: I0f033139aa4e4b47039eb016e404009127bd0a44
2016-11-10 23:12:48 -06:00
pensun
1ec5761a11
Update deprecated information for threadfence_system()
...
Change-Id: Id13d2f81edb51eb42b896a5c06913d59ec907c55
2016-11-10 11:55:12 -06:00
Maneesh Gupta
a12d5a8989
CMakeLists.txt: Cascade CMAKE_BUILD_TYPE to tests
...
Change-Id: I53a3ea951c1fd57e43a02381a457c1dedc1a34f7
2016-11-10 21:26:34 +05:30
Rahul Garg
fcb94863f7
hipDeviceGetByPCIBusId support for HIP/NVCC
...
Change-Id: I8f82890e88d2a15f592bff192179e7d5c5362722
2016-11-10 11:40:59 +05:30
Maneesh Gupta
669d734624
hipcc: Default to HIP_LIB_TYPE=1
...
Change-Id: I83b05accd76f7bc94bd724c66ae060fa0095bc8d
2016-11-10 11:34:00 +05:30
Maneesh Gupta
36024deb3a
hcc_dialects/Makefile: use clamp-config
...
Change-Id: I86df82f75b75125825e22d0545209a19386d9936
2016-11-10 11:31:50 +05:30
pensun
4d7ac1e091
resolve conflicts for git pull
...
Change-Id: Ie353b831e2241bc28042069b6cc7405257e871e1
2016-11-09 21:38:43 -06:00