Commit Graph

429 Commits

Author SHA1 Message Date
Aditya Atluri fe38e9652b added math functions for half
1. Added math functions for half precision
2. HRCP is not available due to device code linking errors, will be enabled once it is fixed
3. Added math functions to half test file

Change-Id: Ie317ce70ef518a4fc3f27142143d01e0327f5df3
2017-01-13 12:05:29 -06:00
Aditya Atluri 646f566bbf added half2 cmp and conv, data movement device functions
1. Added half2 comparison functions
2. Added conversion and data movement half apis

Change-Id: Ia33c0e957d9deb1f2b7a8fde8e22168f4d41b88b
2017-01-13 10:56:07 -06:00
Aditya Atluri 89998d436f added comparison device functions for fp16
1. Added comparison device functions
2. Added test to check that the correct ISA is generated

Change-Id: I16732f5a1438bdce145f7bfcecd28198e3cc4b79
2017-01-12 14:52:14 -06:00
Aditya Atluri eeef055469 added packed math fp16 native device functions
1. Added SDWA implementation inside IR file
2. Added device functions to header + used them in test

Change-Id: Ib4e059a58eee201cc82438689e3e9bc5f9d26653
2017-01-12 14:10:51 -06:00
Aditya Atluri c286bf6f8a Started adding native half math library support
1. Removed HIP_EXPERIMENTAL env variable so that device code will be accessed from LLVM IR
2. Removed soft support from headers and moved to hip_fp16.cpp
3. Added LLVM IR + inline asm to hip_ir.ll
4. Added test for fp16
5. Added barriers for hcc 3.5 and hcc 4.0 for half support
a. Which means hcc 4.0 can parse __fp16 but hcc 3.5 can't
b. HCC 4.0 code is implemented now, hcc 3.5 will be added later

Change-Id: Ic37859b2688ebb02e168bab643d1882bf4727952
2017-01-12 11:30:20 -06:00
Aditya Atluri f85d7b7d97 changed data type used for complex
Change-Id: I0a3bb281af3d5ac1290207821c7c45aea40f513f
2017-01-11 18:23:37 -06:00
Aditya Atluri 7dbf63dde2 changed copyright year from 2016 to 2017 in include directory
Change-Id: Ib5935a84fb51a04b3446df31cc2287101f791b83
2017-01-11 18:09:33 -06:00
Aditya Atluri 430e1364f5 fixed compilation issues with operator overloading device data types
Change-Id: I6a60282f0c04a3c0d382cdf2d67ad8d9156880ad
2017-01-11 17:53:32 -06:00
Aditya Atluri 4e57822d95 Added proper device data types
Change-Id: I42029635ff68c3c13a764a3eda6447e6c77878c6
2017-01-11 15:06:25 -06:00
Rahul Garg 090eadd0bd Added state for hipDevice.
Change-Id: Idbc3c04cd054a01b634856a1e0a23ff172e991aa
2017-01-09 23:54:01 +05:30
Rahul Garg 0578febb99 Removed redundant GetPCIBusID int version function
Change-Id: I37f2ff87d09fcfb1e3b104c44c51f606fcb83c01
2016-12-20 23:25:16 +05:30
Ben Sander 5d815937de Add name for function
2016-12-17 08:54:09 -06:00
Ben Sander 2bd70ff345 Remove HSA dependency from hipFunction_t
Place _groupSegmentSize and _privateSegmentSize inside Function,
remove hsa_executable_symbol_t.
2016-12-17 07:22:56 -06:00
Ben Sander 3f9404d0e1 Refactor Module and Function APIs.
- hipFunction_t is now returned by value.  This eliminates dynamic
  allocation / memory management complexity in the module.  Removed
  the kernel name so the structure is just 16 bytes now.

- Moved the hsa_executable_load_module and hsa_executable_freeze
  calls to the hipModuleLoad and hipModuleLoadData calls.

- Apply sharedMemBytes in hipModuleLaunchKernel to group segment
  size (not private).
2016-12-17 07:22:33 -06:00
Rahul Garg bddaa0e81c Mapped hipDevice_t to int
Change-Id: I6cfa56c42b7cd04aa0e0bce510c0d72d34ea211a
2016-12-17 16:53:03 +05:30
Aditya Atluri c673aec971 disabled half native support as inline asm is not working
Change-Id: I3073d8ae39eed321987f0f2f0e689eec4cdbb48c
2016-12-16 09:24:59 -06:00
Aditya Atluri a1d1fcfdac fixed compilation issues
Change-Id: I96692538736e2e4f2da9dba9c8c29a164aec4c0d
2016-12-14 16:50:16 -06:00
Aditya Atluri c20a86d866 added half2 support
Change-Id: I0f3b9b7037fed97e80ec99f5369c75a63f001aae
2016-12-14 14:18:48 -06:00
Aditya Atluri 01ed8e91e9 added simple half math ops
Change-Id: I10b1d1023a9e5f2ba63f28c4a2bbe60ee49a8aee
2016-12-13 20:20:58 -06:00
Aditya Atluri 26934a920c disabled compiler flag hcc 4.0 for half support
Change-Id: I32175113f4c05d43310b3a05c2a14e12f6d48b09
2016-12-13 20:06:56 -06:00
Aditya Atluri 7a712aa76b added few type reinterpret cast device functions
1. __int_as_float
2. __hiloint2double

Change-Id: Id247c196887b24a12090f0521bf91e13afeec733
2016-12-13 14:41:36 -06:00
Aditya Atluri 02eab122c5 added half math addition ISA support
Change-Id: I293b771f695b499b795d7e53f600c9e4fe2a2071
2016-12-13 09:18:34 -06:00
Rahul Garg a6b2f9c3a0 Fixed build error due to GetPCIBusId overloaded function
Change-Id: I626446f2c72c8143f08c95367bc1c528abeaf69d
2016-12-08 14:35:58 +05:30
Maneesh Gupta c677041b37 hcc_detail/hip_runtime_api.h: Fix IPC API signature
Change-Id: I0be0f09c62f231620341141bd66183c3338be56a
2016-12-08 12:50:25 +05:30
pensun 7ac5f2e8c3 HIP IPC implementation on ROCr IPC APIs
Change-Id: I1ca9d520f5d0b1b56694211471b81eb7c6c23d16
2016-12-07 15:38:36 -06:00
Rahul Garg d8fdd6c6fc hipDeviceGetPCIBusId int version changes for CUDA runtime API
Change-Id: I4d3b995f1d1ac83415ca84808a074e5c8cd72f3c
2016-12-07 12:12:40 +05:30
pensun 092924d660 IPC prototypes and part of the implementation included
Change-Id: Id88c7f155d23ec63f57a6ef05098fba43f8af336
2016-12-06 14:24:09 -06:00
pensun 808e555247 local changes for hipnccl
Change-Id: I05a1f0381ce2914a800f573342cc954eb5ff82d9
2016-12-06 14:22:02 -06:00
Ben Sander 783ac156ce Add additional controls for forcing serialization and blocking.
Move HIP_COHERENT_HOST_ALLOC so it is read once at init time.
Add HIP_LAUNCH_BLOCKING_KERNELS, HIP_API_BLOCKING.
Update docs on debug and chicken bits.

Conflicts:
	src/hip_hcc.cpp
2016-12-02 18:03:59 -06:00
Rahul Garg bda0704213 Added support for hipMemGetAddressRange
Change-Id: I99a796a4eb765152cf15a12d6a86b58684d34f50
2016-11-29 22:04:09 +05:30
pensun 8e2980c7ef Change the parameter type of hipDeviceGetPCIBusID to char*
Change-Id: Ia72f403126e95f65da53208fc246f45d1417381f
2016-11-28 10:47:18 -06:00
Aditya Atluri de89b25d52 added support for rcp for float and double
Change-Id: Ibeba3a9f64494fc0a176bcb4a854fb2f56567b55
2016-11-23 20:01:18 -06:00
Aditya Atluri cc1f8a1011 added fma for double and float
1. Added fma intrinsic support for double and float
2. Added test for fma

Change-Id: I909fdbec34a3d12c03ba6eff3a39376a7128ee43
2016-11-23 18:22:05 -06:00
pensun 69b43ec17c Add some missing APIs on nv path and hipify
Change-Id: Ic0f4740ab06bf70b1de61b39fedc7a6e7605cb61
2016-11-23 14:36:30 -06:00
Aditya Atluri c2f6ecf264 Added fast math flag
1. Use -DHIP_FAST_MATH to make precise math functions compiled to fast math
2. Added double fast math functions for sqrt
3. Changed hipcc to parse -use_fast_math (not working)
4. Added passed tag to hipFloatMath test

Change-Id: I72884b2436b4efe61e9a9297346c1358fee38a2d
2016-11-23 11:19:15 -06:00
Ben Sander 9db93a1b96 Improve docs in some places
Change-Id: If31e84fbf0c8595ca72edb842dce7ce47783579b
2016-11-23 08:16:18 -06:00
Rahul Garg 8a2685e6cd Removed nested HIP calls from hip_device functions
Change-Id: I18785b0ee27e32fb8950982fa5c3a64d1ae6a9b8
2016-11-23 18:37:06 +05:30
Aditya Atluri d9a3527769 added fast math intrinsics to HIP
1. Added fast math intrinsics for single precision data types
2. Added test to check the intrinsics
3. Added HIP_PRECISE_MATH macro to enable precise math on fast math

Change-Id: Iadacbb6182c31252c5e3252854372d1b80dfd27b
2016-11-22 15:26:00 -06:00
Aditya Atluri 1a85762f53 added fast math APIs
1. Added fast math apis for sin, cos, tan, sincos
2. Added test for trig math functions
3. Added logarithm fast math
4. Changed how hipGetDevice, hipDeviceGetCacheConfig emit errors

Change-Id: Ie6ab594ddd5853cbe85e39a2f6d3479a807fa323
2016-11-22 10:20:09 -06:00
Aditya Atluri 2ded0ce302 fixed texture header on nvcc
Change-Id: Ibe19f94be5edf972b6b51dea263e1088b6c60c1d
2016-11-21 13:53:28 -06:00
Aditya Atluri fef766df88 removed warnings in macros
Change-Id: I992b11f6aee2bab09f46885a2d12234aa6814cc5
2016-11-21 09:04:36 -06:00
Aditya Atluri 2611de2477 fixed compilation bugs
1. Texture functions are now compiling fine
2. Fixed hipFuncCache to hipFuncCache_t

Change-Id: I8f815887e4de43ee115bbaff249905b236541c39
2016-11-21 08:56:30 -06:00
Aditya Atluri b3c16ea7b5 Fixed hipDeviceGetCacheConfig on nvcc path
1. Changed test macro to emit line numbers
2. Added getcacheconfig api test for nvcc path
3. Fixed hipFuncCache_t data type

TODO: With this commit, right now there are 2 func cache datatypes
a. hipFuncCache_t for runtime API
b. hipFuncCache for driver API

Map these to a single data type

Change-Id: Ia47c9f5d7c2633638051bf17b1103048a1ede973
2016-11-20 12:18:08 -06:00
Aditya Atluri cc829f04c5 added copyright to new header
Change-Id: I16e1d02194551e4b20019bcb6850a3f84882ef18
2016-11-19 23:02:56 -06:00
Aditya Atluri 6692ee09d7 added tests to check nvcc runtime api output
Change-Id: Ifdd39b5d0a6a58d20a8e9745e59dd82d50a90e2f
2016-11-19 21:36:28 -06:00
Maneesh Gupta 2195e3c37d Refactor for building HIP as dynamic library
Change-Id: I65a3d9d589c4fdbbdcf1611e5427224253be8260
2016-11-18 14:33:20 +05:30
Aditya Atluri 3b1f0e903c moved runtime macros to runtime_api.h
Change-Id: Ib47e449328e8e6ec55d1b6ee19899de4b591ea8e
2016-11-17 14:19:18 -06:00
Aditya Atluri 94984470d4 make texture a separate header for now
Change-Id: I3c65aa75f2f729eedd8c3292fa3cbc37709c1cfe
2016-11-17 11:55:29 -06:00
Aditya Atluri 603bb321ec Added i8 packed math intrinsics
1. Added add, sub, mul packed math i8 intrinsics
2. Removed c++ packed data structures included from HCC

Change-Id: I1d109c5ce10c48b7cd3ea059478b88fc1de78499
TODO: Add better packed data structures support
2016-11-17 01:09:12 -06:00
pensun 9aa2269d5c Update deprecated information for threadfence_system()
Change-Id: Id13d2f81edb51eb42b896a5c06913d59ec907c55
2016-11-10 11:55:12 -06:00