Rahul Garg
0578febb99
Removed redundant GetPCIBusID int version function
...
Change-Id: I37f2ff87d09fcfb1e3b104c44c51f606fcb83c01
2016-12-20 23:25:16 +05:30
Ben Sander
5d815937de
Add name for function
2016-12-17 08:54:09 -06:00
Ben Sander
2bd70ff345
Remove HSA dependency from hipFunction_t
...
Place _groupSegmentSize and _privateSegmentSize inside Function,
remove hsa_executable_symbol_t.
2016-12-17 07:22:56 -06:00
Ben Sander
3f9404d0e1
Refactor Module and Function APIs.
...
- hipFunction_t is now returned by value. This eliminates dynamic
allocation / memory management complexity in the module. Removed
the kernel name so the structure is just 16 bytes now.
- Moved the hsa_executable_load_module and hsa_executable_freeze
calls to the hipModuleLoad and hipModuleLoadData calls.
- Apply sharedMemBytes in hipModuleLaunchKernel to group segment
size (not private).
2016-12-17 07:22:33 -06:00
Rahul Garg
bddaa0e81c
Mapped hipDevice_t to int
...
Change-Id: I6cfa56c42b7cd04aa0e0bce510c0d72d34ea211a
2016-12-17 16:53:03 +05:30
Aditya Atluri
c673aec971
disabled half native support as inline asm is not working
...
Change-Id: I3073d8ae39eed321987f0f2f0e689eec4cdbb48c
2016-12-16 09:24:59 -06:00
Aditya Atluri
a1d1fcfdac
fixed compilation issues
...
Change-Id: I96692538736e2e4f2da9dba9c8c29a164aec4c0d
2016-12-14 16:50:16 -06:00
Aditya Atluri
c20a86d866
added half2 support
...
Change-Id: I0f3b9b7037fed97e80ec99f5369c75a63f001aae
2016-12-14 14:18:48 -06:00
Aditya Atluri
01ed8e91e9
added simple half math ops
...
Change-Id: I10b1d1023a9e5f2ba63f28c4a2bbe60ee49a8aee
2016-12-13 20:20:58 -06:00
Aditya Atluri
26934a920c
disabled compiler flag hcc 4.0 for half support
...
Change-Id: I32175113f4c05d43310b3a05c2a14e12f6d48b09
2016-12-13 20:06:56 -06:00
Aditya Atluri
7a712aa76b
added a few type-reinterpret cast device functions
...
1. __int_as_float
2. __hiloint2double
Change-Id: Id247c196887b24a12090f0521bf91e13afeec733
2016-12-13 14:41:36 -06:00
Aditya Atluri
02eab122c5
added half math addition ISA support
...
Change-Id: I293b771f695b499b795d7e53f600c9e4fe2a2071
2016-12-13 09:18:34 -06:00
Rahul Garg
a6b2f9c3a0
Fixed build error due to GetPCIBusId overloaded function
...
Change-Id: I626446f2c72c8143f08c95367bc1c528abeaf69d
2016-12-08 14:35:58 +05:30
Maneesh Gupta
c677041b37
hcc_detail/hip_runtime_api.h: Fix IPC API signature
...
Change-Id: I0be0f09c62f231620341141bd66183c3338be56a
2016-12-08 12:50:25 +05:30
pensun
7ac5f2e8c3
HIP IPC implementation on ROCr IPC APIs
...
Change-Id: I1ca9d520f5d0b1b56694211471b81eb7c6c23d16
2016-12-07 15:38:36 -06:00
Rahul Garg
d8fdd6c6fc
hipDeviceGetPCIBusId int version changes for CUDA runtime API
...
Change-Id: I4d3b995f1d1ac83415ca84808a074e5c8cd72f3c
2016-12-07 12:12:40 +05:30
pensun
092924d660
IPC prototypes and part of the implementation included
...
Change-Id: Id88c7f155d23ec63f57a6ef05098fba43f8af336
2016-12-06 14:24:09 -06:00
pensun
808e555247
local changes for hipnccl
...
Change-Id: I05a1f0381ce2914a800f573342cc954eb5ff82d9
2016-12-06 14:22:02 -06:00
Ben Sander
783ac156ce
Add additional controls for forcing serialization and blocking.
...
Move HIP_COHERENT_HOST_ALLOC so it is read once at init time.
Add HIP_LAUNCH_BLOCKING_KERNELS, HIP_API_BLOCKING.
Update docs on debug and chicken bits.
Conflicts:
src/hip_hcc.cpp
2016-12-02 18:03:59 -06:00
Rahul Garg
bda0704213
Added support for hipMemGetAddressRange
...
Change-Id: I99a796a4eb765152cf15a12d6a86b58684d34f50
2016-11-29 22:04:09 +05:30
pensun
8e2980c7ef
Change the parameter type of hipDeviceGetPCIBusID to char*
...
Change-Id: Ia72f403126e95f65da53208fc246f45d1417381f
2016-11-28 10:47:18 -06:00
Aditya Atluri
de89b25d52
added support for rcp for float and double
...
Change-Id: Ibeba3a9f64494fc0a176bcb4a854fb2f56567b55
2016-11-23 20:01:18 -06:00
Aditya Atluri
cc1f8a1011
added fma for double and float
...
1. Added fma intrinsic support for double and float
2. Added test for fma
Change-Id: I909fdbec34a3d12c03ba6eff3a39376a7128ee43
2016-11-23 18:22:05 -06:00
pensun
69b43ec17c
Add some missing APIs on nv path and hipify
...
Change-Id: Ic0f4740ab06bf70b1de61b39fedc7a6e7605cb61
2016-11-23 14:36:30 -06:00
Aditya Atluri
c2f6ecf264
Added fast math flag
...
1. Use -DHIP_FAST_MATH to compile precise math functions as fast math
2. Added double fast math functions for sqrt
3. Changed hipcc to parse -use_fast_math (not working)
4. Added passed tag to hipFloatMath test
Change-Id: I72884b2436b4efe61e9a9297346c1358fee38a2d
2016-11-23 11:19:15 -06:00
Ben Sander
9db93a1b96
Improve docs in some places
...
Change-Id: If31e84fbf0c8595ca72edb842dce7ce47783579b
2016-11-23 08:16:18 -06:00
Rahul Garg
8a2685e6cd
Removed nested HIP calls from hip_device functions
...
Change-Id: I18785b0ee27e32fb8950982fa5c3a64d1ae6a9b8
2016-11-23 18:37:06 +05:30
Aditya Atluri
d9a3527769
added fast math intrinsics to HIP
...
1. Added fast math intrinsics for single precision data types
2. Added test to check the intrinsics
3. Added HIP_PRECISE_MATH macro to enable precise math when fast math is on
Change-Id: Iadacbb6182c31252c5e3252854372d1b80dfd27b
2016-11-22 15:26:00 -06:00
Aditya Atluri
1a85762f53
added fast math APIs
...
1. Added fast math apis for sin, cos, tan, sincos
2. Added test for trig math functions
3. Added logarithm fast math
4. Changed how hipGetDevice, hipDeviceGetCacheConfig emit errors
Change-Id: Ie6ab594ddd5853cbe85e39a2f6d3479a807fa323
2016-11-22 10:20:09 -06:00
Aditya Atluri
2ded0ce302
fixed texture header on nvcc
...
Change-Id: Ibe19f94be5edf972b6b51dea263e1088b6c60c1d
2016-11-21 13:53:28 -06:00
Aditya Atluri
fef766df88
removed warnings in macros
...
Change-Id: I992b11f6aee2bab09f46885a2d12234aa6814cc5
2016-11-21 09:04:36 -06:00
Aditya Atluri
2611de2477
fixed compilation bugs
...
1. Texture functions are now compiling fine
2. Fixed hipFuncCache to hipFuncCache_t
Change-Id: I8f815887e4de43ee115bbaff249905b236541c39
2016-11-21 08:56:30 -06:00
Aditya Atluri
b3c16ea7b5
Fixed hipDeviceGetCacheConfig on nvcc path
...
1. Changed test macro to emit line numbers
2. Added getcacheconfig api test for nvcc path
3. Fixed hipFuncCache_t data type
TODO: With this commit, right now there are 2 func cache datatypes
a. hipFuncCache_t for runtime API
b. hipFuncCache for driver API
Map these to a single data type
Change-Id: Ia47c9f5d7c2633638051bf17b1103048a1ede973
2016-11-20 12:18:08 -06:00
Aditya Atluri
cc829f04c5
added copyright to new header
...
Change-Id: I16e1d02194551e4b20019bcb6850a3f84882ef18
2016-11-19 23:02:56 -06:00
Aditya Atluri
6692ee09d7
added tests to check nvcc runtime api output
...
Change-Id: Ifdd39b5d0a6a58d20a8e9745e59dd82d50a90e2f
2016-11-19 21:36:28 -06:00
Maneesh Gupta
2195e3c37d
Refactor for building HIP as dynamic library
...
Change-Id: I65a3d9d589c4fdbbdcf1611e5427224253be8260
2016-11-18 14:33:20 +05:30
Aditya Atluri
3b1f0e903c
moved runtime macros to runtime_api.h
...
Change-Id: Ib47e449328e8e6ec55d1b6ee19899de4b591ea8e
2016-11-17 14:19:18 -06:00
Aditya Atluri
94984470d4
make texture a separate header for now
...
Change-Id: I3c65aa75f2f729eedd8c3292fa3cbc37709c1cfe
2016-11-17 11:55:29 -06:00
Aditya Atluri
603bb321ec
Added i8 packed math intrinsics
...
1. Added add, sub, mul packed math i8 intrinsics
2. Removed c++ packed data structures included from HCC
Change-Id: I1d109c5ce10c48b7cd3ea059478b88fc1de78499
TODO: Add better packed data structures support
2016-11-17 01:09:12 -06:00
pensun
9aa2269d5c
Update deprecation information for threadfence_system()
...
Change-Id: Id13d2f81edb51eb42b896a5c06913d59ec907c55
2016-11-10 11:55:12 -06:00
Rahul Garg
f86c7b5b3c
hipDeviceGetByPCIBusId support for HIP/NVCC
...
Change-Id: I8f82890e88d2a15f592bff192179e7d5c5362722
2016-11-10 11:40:59 +05:30
Maneesh Gupta
e3b5eef7c9
Merge branch 'rocm-rel-1.3' into amd-develop
...
Conflicts:
include/hip/nvcc_detail/hip_runtime_api.h
Change-Id: I990a7d008da9e8dcc68250cebbc8ee6e723c7e01
2016-11-10 08:56:38 +05:30
pensun
4a8a6a4697
fix hipProfiler* apis on NV path
...
Change-Id: I6adca6151fef3a9b35348163eb6bd13f5c414172
2016-11-09 15:44:01 -06:00
pensun
e5277ab4b6
fix for hipcallback function on NV path
...
Change-Id: If80c0cfe60b1f3b1a71627b5f3f79503cba4d491
2016-11-09 11:33:23 -06:00
Ben Sander
00276d141e
Add debug for Peer APIs. Enable PeerMemcpy APIs by default.
...
Change-Id: I46e39a9e7b07686a78484c1f3b5495b08e052fbb
2016-11-04 08:51:16 -05:00
pensun
212fa7033c
Update document for workaround suggestion on threadfence_system()
...
Change-Id: Icccab8270604a0e578a8614b9afb3f95372f4966
2016-11-02 16:08:27 -05:00
pensun
9f86e47800
Update hipStreamNonBlocking to use cuda define on NV path
...
Change-Id: I74ea09db99d602ba1c5f192b36ff7f2781176e6a
2016-11-01 20:30:56 -05:00
Aditya Atluri
f097b6ef81
added inter thread data movement intrinsics
...
Change-Id: I2a8a8ed49429cb7f96439bd28c4b83b5142737df
2016-11-01 16:37:33 -05:00
Rahul Garg
81c91f5b0b
Added hipDeviceGetByPCIBusId in hip/hcc path
...
Change-Id: I3cca0dc533d0281689d8a407c7da16ca1ba6a3a8
2016-11-01 10:57:48 +05:30
Ben Sander
9edaf0e3f7
add hip_profile.h
...
Change-Id: Id43a4336db53567020584cb7842baf5c1649fd8e
2016-10-28 07:08:46 -05:00