Commit Graph

6428 Commits

Author SHA1 Message Date
Aditya Atluri 30674382a4 added fma for double and float
1. Added fma intrinsic support for double and float
2. Added test for fma

Change-Id: I909fdbec34a3d12c03ba6eff3a39376a7128ee43
2016-11-23 18:22:05 -06:00
pensun 8a8c7a6b4d Add some missing APIs on nv path and hipify
Change-Id: Ic0f4740ab06bf70b1de61b39fedc7a6e7605cb61
2016-11-23 14:36:30 -06:00
pensun a836395350 Add several missing APIs in hipify
Change-Id: I58912871cb0b10128f221ef26a11b0d69fb7873c
2016-11-23 14:06:18 -06:00
Aditya Atluri 043da795f6 Added fast math flag
1. Use -DHIP_FAST_MATH to compile precise math functions as fast math
2. Added double fast math functions for sqrt
3. Changed hipcc to parse -use_fast_math (not working)
4. Added passed tag to hipFloatMath test

Change-Id: I72884b2436b4efe61e9a9297346c1358fee38a2d
2016-11-23 11:19:15 -06:00
Ben Sander 75c540fe3c Add toc to hip_profiling.md
Change-Id: I3ae100f12686d0398a0403b78ca571382acce135
2016-11-23 08:36:08 -06:00
Ben Sander dec59d9909 Improve docs in some places
Change-Id: If31e84fbf0c8595ca72edb842dce7ce47783579b
2016-11-23 08:16:18 -06:00
Ben Sander b6ae6b08fb Improve debug capabilities.
Print TID mapping at init when HIP_TRACE_API=1.
Print base host/dev info from tracker during copy.

Change-Id: I84e26d7b801567e5a91baad36126fb590920ec87
2016-11-23 08:16:18 -06:00
Ben Sander e4e14211b3 Improve profiler and debug documentation
2016-11-23 08:15:40 -06:00
Rahul Garg 6a4f44bce0 Removed nested HIP calls from hip_device functions
Change-Id: I18785b0ee27e32fb8950982fa5c3a64d1ae6a9b8
2016-11-23 18:37:06 +05:30
Aditya Atluri f843928ddd added fast math intrinsics to HIP
1. Added fast math intrinsics for single precision data types
2. Added test to check the intrinsics
3. Added HIP_PRECISE_MATH macro to enable precise math on fast math

Change-Id: Iadacbb6182c31252c5e3252854372d1b80dfd27b
2016-11-22 15:26:00 -06:00
Aditya Atluri 94d2115d6d added fast math APIs
1. Added fast math apis for sin, cos, tan, sincos
2. Added test for trig math functions
3. Added logarithm fast math
4. Changed how hipGetDevice, hipDeviceGetCacheConfig emit errors

Change-Id: Ie6ab594ddd5853cbe85e39a2f6d3479a807fa323
2016-11-22 10:20:09 -06:00
Rahul Garg 2dcf20ac6f Removed hsaKmtReleaseSystemProperties call
Change-Id: I7cb992cccf587c333f0ca0cb518409f3944bdb06
2016-11-22 06:15:35 +05:30
Aditya Atluri 7145ea6a4a fixed error output for hipDeviceGetAttribute
Change-Id: I1e343a4e4e20e1a550d419f701cc1e60e9d03af4
2016-11-21 18:07:01 -06:00
Aditya Atluri d6ad91ffa4 fixed texture header on nvcc
Change-Id: Ibe19f94be5edf972b6b51dea263e1088b6c60c1d
2016-11-21 13:53:28 -06:00
Aditya Atluri 6052eaa761 removed warnings in macros
Change-Id: I992b11f6aee2bab09f46885a2d12234aa6814cc5
2016-11-21 09:04:36 -06:00
Aditya Atluri 2412c9a061 fixed compilation bugs
1. Texture functions are now compiling fine
2. Fixed hipFuncCache to hipFuncCache_t

Change-Id: I8f815887e4de43ee115bbaff249905b236541c39
2016-11-21 08:56:30 -06:00
Aditya Atluri afaa5fcf96 Fixed hipDeviceGetCacheConfig on nvcc path
1. Changed test macro to emit line numbers
2. Added getcacheconfig api test for nvcc path
3. Fixed hipFuncCache_t data type

TODO: With this commit, right now there are 2 func cache datatypes
a. hipFuncCache_t for runtime API
b. hipFuncCache for driver API

Map these to a single data type

Change-Id: Ia47c9f5d7c2633638051bf17b1103048a1ede973
2016-11-20 12:18:08 -06:00
Aditya Atluri 0edc082ff6 added new test for getting attribute
1. Added copyright to all new tests
2. Added test for hipDeviceGetAttribute

Change-Id: I7a070c5b8316ef6575b3f4c49bda2769aea2a7c4
2016-11-20 11:53:16 -06:00
Aditya Atluri a6c4304725 added copyright to new header
Change-Id: I16e1d02194551e4b20019bcb6850a3f84882ef18
2016-11-19 23:02:56 -06:00
Aditya Atluri 428041cfc2 added tests to check nvcc runtime api output
Change-Id: Ifdd39b5d0a6a58d20a8e9745e59dd82d50a90e2f
2016-11-19 21:36:28 -06:00
Sandeep Kumar 53e771fc75 fix_format
Change-Id: I34e265de434263a11654e5deba044c3f21e86578
2016-11-18 14:34:14 +05:30
Maneesh Gupta c0419cc749 Refactor for building HIP as dynamic library
Change-Id: I65a3d9d589c4fdbbdcf1611e5427224253be8260
2016-11-18 14:33:20 +05:30
scchan 3d6bf5e799 Add extra linker flags to the shared library build
Change-Id: I19e569d566fb5e25e343e364a3053a3f12659361
2016-11-18 14:18:29 +05:30
Maneesh Gupta 4fc082ff09 Fix broken tests due to dc64a73
Change-Id: I847c80f8462e1c955bdef957e6de2841a3a6ab29
2016-11-18 12:20:47 +05:30
Aditya Atluri 1618cb3f85 moved runtime macros to runtime_api.h
Change-Id: Ib47e449328e8e6ec55d1b6ee19899de4b591ea8e
2016-11-17 14:19:18 -06:00
Aditya Atluri c20c524400 added texture header to memory api source
Change-Id: I1af6d60aca5a9a9ef1cadf8c304bea892acbe061
2016-11-17 11:57:53 -06:00
Aditya Atluri dc64a732d8 make texture a separate header for now
Change-Id: I3c65aa75f2f729eedd8c3292fa3cbc37709c1cfe
2016-11-17 11:55:29 -06:00
Aditya Atluri 12dd9df88f Added i8 packed math intrinsics
1. Added add, sub, mul packed math i8 intrinsics
2. Removed c++ packed data structures included from HCC

Change-Id: I1d109c5ce10c48b7cd3ea059478b88fc1de78499
TODO: Add better packed data structures support
2016-11-17 01:09:12 -06:00
Maneesh Gupta 76cbf6954d Merge branch 'amd-develop' into amd-master
Change-Id: I4fbb7ac287c182fea97bf31562a3d64554e59e94
2016-11-15 10:44:21 +05:30
Maneesh Gupta 888a3528d2 Enable USE_COPY_EXT_V2 by default
Change-Id: I2c0dc80f85a0ccb5744715b5418a604e38b249ed
2016-11-15 10:42:27 +05:30
Ben Sander 0c624c009b tweak hcc demangler
2016-11-14 15:26:27 -07:00
Maneesh Gupta 8d40253664 Merge branch 'amd-develop' into amd-master
Change-Id: I32d41081ac065f2c50531dc2e420802d765665e2
2016-11-14 06:12:03 +05:30
Sandeep Kumar 09b157ca8c Add p2p for cookbook
Change-Id: Id2e77ab31123ef95885d665efe34bc0d4596733a
(cherry picked from commit 6fbd0352713ca36e399b1ed4f17c486207a53875)
2016-11-14 06:10:36 +05:30
Maneesh Gupta fd1483ce35 Revert "hipcc: Turn back linking hip_ir.ll by default"
This reverts commit 528b257004.
2016-11-14 06:05:31 +05:30
Ben Sander faf2a1e01a Add draft doc on profiling with hip.
Change-Id: I79727dd2500333b3f16acb381dd5852a15ed408a
2016-11-13 10:01:05 -06:00
Ben Sander c9401cb95f Add &nbsp to demangler
Change-Id: I89586c7c17f5152b7a6850d0d6c2aa1d3ebc8190
2016-11-11 16:50:56 -06:00
pensun 50867efa10 Add direct test case for threadfence_system workaround
Change-Id: I5b21b590e957c901044741ac94e816cd8b1426f9
2016-11-11 15:09:43 -06:00
Aditya Atluri abf6872b2b fixed multi-dim module kernel launch
Change-Id: Id1d81f2375d058979ab526433f905cf0ea3d23d6
2016-11-11 12:25:23 -06:00
Ben Sander 1e5515ee9f Add option to deny peer access.
Also fix test.

Change-Id: I1b247f6c4271442b008e560669bca4daf8eb94c7
2016-11-10 23:12:48 -06:00
Ben Sander 65584e48de Use forceUnpinnedCopy to resolve P2p corner cases.
Change-Id: I2aebb419881246cebb696bec87798635bc71acc2
2016-11-10 23:12:48 -06:00
Ben Sander d3d6feb4de Enable async copy again.
Also add HIP_FORCE_SYNC_COPY chicken bit.

Change-Id: I76a385410494b99bf27305d3c08f55dd81987565
2016-11-10 23:12:48 -06:00
Ben Sander 8724273f28 Doc change only - add comments to test.
Change-Id: Ie42087cf3c78e49337b18bb71f3f0e1e7950ee1b
2016-11-10 23:12:48 -06:00
Ben Sander ced9d72d94 Refactor copy and P2P logic.
Prefer the source engine for DMA copies, even if the user submits the copy
in a stream attached to a different device.
The stream is now used only for synchronization, and HIP
decides which engine should perform the
copy, typically the source copy engine.

HIP now decides which engine should perform the copy
and passes this to HCC using new APIs.
HIP has additional information about peer
visibility and uses it to decide which agent should perform
the copy.

Change-Id: I0cf4cfebeae256e6ca795f08a7ed7130f4857d1f
2016-11-10 23:12:48 -06:00
Ben Sander 2dea3a0b1a Improve memory debug
Change-Id: I0f033139aa4e4b47039eb016e404009127bd0a44
2016-11-10 23:12:48 -06:00
pensun 1ec5761a11 Update deprecated information for threadfence_system()
Change-Id: Id13d2f81edb51eb42b896a5c06913d59ec907c55
2016-11-10 11:55:12 -06:00
Maneesh Gupta a12d5a8989 CMakeLists.txt: Cascade CMAKE_BUILD_TYPE to tests
Change-Id: I53a3ea951c1fd57e43a02381a457c1dedc1a34f7
2016-11-10 21:26:34 +05:30
Rahul Garg fcb94863f7 hipDeviceGetByPCIBusId support for HIP/NVCC
Change-Id: I8f82890e88d2a15f592bff192179e7d5c5362722
2016-11-10 11:40:59 +05:30
Maneesh Gupta 669d734624 hipcc: Default to HIP_LIB_TYPE=1
Change-Id: I83b05accd76f7bc94bd724c66ae060fa0095bc8d
2016-11-10 11:34:00 +05:30
Maneesh Gupta 36024deb3a hcc_dialects/Makefile: use clamp-config
Change-Id: I86df82f75b75125825e22d0545209a19386d9936
2016-11-10 11:31:50 +05:30
pensun 4d7ac1e091 resolve conflicts for git pull
Change-Id: Ie353b831e2241bc28042069b6cc7405257e871e1
2016-11-09 21:38:43 -06:00