Ben Sander
ff2f54c1bf
Add additional controls for forcing serialization and blocking.
...
Move HIP_COHERENT_HOST_ALLOC so it is read once at init time.
Add HIP_LAUNCH_BLOCKING_KERNELS, HIP_API_BLOCKING.
Update docs on debug and chicken bits.
Conflicts:
src/hip_hcc.cpp
2016-12-02 18:03:59 -06:00
Rahul Garg
fe6ba656c9
Added support for hipMemGetAddressRange
...
Change-Id: I99a796a4eb765152cf15a12d6a86b58684d34f50
2016-11-29 22:04:09 +05:30
pensun
2fbbf2b136
Change the parameter type of hipDeviceGetPCIBusID to char*
...
Change-Id: Ia72f403126e95f65da53208fc246f45d1417381f
2016-11-28 10:47:18 -06:00
Aditya Atluri
7131d0b961
added support for rcp for float and double
...
Change-Id: Ibeba3a9f64494fc0a176bcb4a854fb2f56567b55
2016-11-23 20:01:18 -06:00
Aditya Atluri
30674382a4
added fma for double and float
...
1. Added fma intrinsic support for double and float
2. Added test for fma
Change-Id: I909fdbec34a3d12c03ba6eff3a39376a7128ee43
2016-11-23 18:22:05 -06:00
pensun
8a8c7a6b4d
Add some missing APIs on nv path and hipify
...
Change-Id: Ic0f4740ab06bf70b1de61b39fedc7a6e7605cb61
2016-11-23 14:36:30 -06:00
Aditya Atluri
043da795f6
Added fast math flag
...
1. Use -DHIP_FAST_MATH to compile precise math functions as fast math
2. Added double fast math functions for sqrt
3. Changed hipcc to parse -use_fast_math (not working)
4. Added passed tag to hipFloatMath test
Change-Id: I72884b2436b4efe61e9a9297346c1358fee38a2d
2016-11-23 11:19:15 -06:00
Ben Sander
dec59d9909
Improve docs in some places
...
Change-Id: If31e84fbf0c8595ca72edb842dce7ce47783579b
2016-11-23 08:16:18 -06:00
Rahul Garg
6a4f44bce0
Removed nested HIP calls from hip_device functions
...
Change-Id: I18785b0ee27e32fb8950982fa5c3a64d1ae6a9b8
2016-11-23 18:37:06 +05:30
Aditya Atluri
f843928ddd
added fast math intrinsics to HIP
...
1. Added fast math intrinsics for single precision data types
2. Added test to check the intrinsics
3. Added HIP_PRECISE_MATH macro to enable precise math on fast math
Change-Id: Iadacbb6182c31252c5e3252854372d1b80dfd27b
2016-11-22 15:26:00 -06:00
Aditya Atluri
94d2115d6d
added fast math APIs
...
1. Added fast math apis for sin, cos, tan, sincos
2. Added test for trig math functions
3. Added logarithm fast math
4. Changed how hipGetDevice, hipDeviceGetCacheConfig emit errors
Change-Id: Ie6ab594ddd5853cbe85e39a2f6d3479a807fa323
2016-11-22 10:20:09 -06:00
Aditya Atluri
d6ad91ffa4
fixed texture header on nvcc
...
Change-Id: Ibe19f94be5edf972b6b51dea263e1088b6c60c1d
2016-11-21 13:53:28 -06:00
Aditya Atluri
6052eaa761
removed warnings in macros
...
Change-Id: I992b11f6aee2bab09f46885a2d12234aa6814cc5
2016-11-21 09:04:36 -06:00
Aditya Atluri
2412c9a061
fixed compilation bugs
...
1. Texture functions are now compiling fine
2. Fixed hipFuncCache to hipFuncCache_t
Change-Id: I8f815887e4de43ee115bbaff249905b236541c39
2016-11-21 08:56:30 -06:00
Aditya Atluri
afaa5fcf96
Fixed hipDeviceGetCacheConfig on nvcc path
...
1. Changed test macro to emit line numbers
2. Added getcacheconfig api test for nvcc path
3. Fixed hipFuncCache_t data type
TODO: With this commit, right now there are 2 func cache datatypes
a. hipFuncCache_t for runtime API
b. hipFuncCache for driver API
Map these to a single data type
Change-Id: Ia47c9f5d7c2633638051bf17b1103048a1ede973
2016-11-20 12:18:08 -06:00
Aditya Atluri
a6c4304725
added copyright to new header
...
Change-Id: I16e1d02194551e4b20019bcb6850a3f84882ef18
2016-11-19 23:02:56 -06:00
Aditya Atluri
428041cfc2
added tests to check nvcc runtime api output
...
Change-Id: Ifdd39b5d0a6a58d20a8e9745e59dd82d50a90e2f
2016-11-19 21:36:28 -06:00
Maneesh Gupta
c0419cc749
Refactor for building HIP as dynamic library
...
Change-Id: I65a3d9d589c4fdbbdcf1611e5427224253be8260
2016-11-18 14:33:20 +05:30
Aditya Atluri
1618cb3f85
moved runtime macros to runtime_api.h
...
Change-Id: Ib47e449328e8e6ec55d1b6ee19899de4b591ea8e
2016-11-17 14:19:18 -06:00
Aditya Atluri
dc64a732d8
make texture a separate header for now
...
Change-Id: I3c65aa75f2f729eedd8c3292fa3cbc37709c1cfe
2016-11-17 11:55:29 -06:00
Aditya Atluri
12dd9df88f
Added i8 packed math intrinsics
...
1. Added add, sub, mul packed math i8 intrinsics
2. Removed c++ packed data structures included from HCC
Change-Id: I1d109c5ce10c48b7cd3ea059478b88fc1de78499
TODO: Add better packed data structures support
2016-11-17 01:09:12 -06:00
pensun
1ec5761a11
Update deprecation information for threadfence_system()
...
Change-Id: Id13d2f81edb51eb42b896a5c06913d59ec907c55
2016-11-10 11:55:12 -06:00
Rahul Garg
fcb94863f7
hipDeviceGetByPCIBusId support for HIP/NVCC
...
Change-Id: I8f82890e88d2a15f592bff192179e7d5c5362722
2016-11-10 11:40:59 +05:30
Maneesh Gupta
72c722c3d6
Merge branch 'rocm-rel-1.3' into amd-develop
...
Conflicts:
include/hip/nvcc_detail/hip_runtime_api.h
Change-Id: I990a7d008da9e8dcc68250cebbc8ee6e723c7e01
2016-11-10 08:56:38 +05:30
pensun
57cd3c8244
fix hipProfiler* apis on NV path
...
Change-Id: I6adca6151fef3a9b35348163eb6bd13f5c414172
2016-11-09 15:44:01 -06:00
pensun
76c3c20da6
fix for hipcallback function on NV path
...
Change-Id: If80c0cfe60b1f3b1a71627b5f3f79503cba4d491
2016-11-09 11:33:23 -06:00
Ben Sander
3f0a2b8dc1
Add debug for Peer APIs. Enable PeerMemcpy APIs by default.
...
Change-Id: I46e39a9e7b07686a78484c1f3b5495b08e052fbb
2016-11-04 08:51:16 -05:00
pensun
774de273d0
Update document for workaround suggestion on threadfence_system()
...
Change-Id: Icccab8270604a0e578a8614b9afb3f95372f4966
2016-11-02 16:08:27 -05:00
pensun
4817131cdc
Update hipStreamNonBlocking to use cuda define on NV path
...
Change-Id: I74ea09db99d602ba1c5f192b36ff7f2781176e6a
2016-11-01 20:30:56 -05:00
Aditya Atluri
f48c53534e
added inter thread data movement intrinsics
...
Change-Id: I2a8a8ed49429cb7f96439bd28c4b83b5142737df
2016-11-01 16:37:33 -05:00
Rahul Garg
2d15d0741c
Added hipDeviceGetByPCIBusId in hip/hcc path
...
Change-Id: I3cca0dc533d0281689d8a407c7da16ca1ba6a3a8
2016-11-01 10:57:48 +05:30
Ben Sander
87a2e8f12b
add hip_profile.h
...
Change-Id: Id43a4336db53567020584cb7842baf5c1649fd8e
2016-10-28 07:08:46 -05:00
Ben Sander
bb58f4f6fc
Add initial hipProfileStart/Stop
...
And modify sample to show how to use.
Still needs some work to understand interaction with CXL.
Change-Id: I2579824d2dd7863ea23874d34f0dabb3cb305d3e
2016-10-27 23:09:32 -05:00
pensun
334e9c6f8e
Add missing hipStream typedef for NV path
...
Change-Id: I915cd14a9ff32b55b0121062d7804a7fbbdc3341
2016-10-27 13:34:14 -05:00
pensun
2abf300797
Remove extra semicolons and extra spaces in header on NV path
...
Change-Id: Ib33aec2451a4e0b298d537dbb1b9df000405871b
2016-10-26 10:23:10 +05:30
pensun
8a7dcfce0b
Remove extra semicolons and extra spaces in header on NV path
...
Change-Id: Ib33aec2451a4e0b298d537dbb1b9df000405871b
2016-10-25 15:29:52 -05:00
pensun
1f11a9554e
Add workaround for hipStreamAddCallback function: call stream synchronize on host and then execute the callback function
...
Change-Id: If361f8e053949904b19b9e09245d267f05e29f7b
2016-10-22 23:59:39 -05:00
Aditya Atluri
48f6d52e7c
Added support for constant memory
...
1. Added support for constant memory
2. Added test which uses memcpytosymbol for constant memory
3. Corrected code error on nvcc path
Change-Id: I2ab69f516832bf7a037132ac81273ea6f5107401
2016-10-20 09:57:53 -05:00
Ben Sander
261ff423e1
Add hipDeviceSchedule* support to queue wait
...
Change-Id: Iffa7a356500b026f3737c3f5719ca9f62b10d855
2016-10-18 22:27:16 -05:00
Ben Sander
d21d3ec222
Remove some TODO items
...
Change-Id: I7e9de2e43a8584f8dc9ee6d45c8ed00ca465f591
2016-10-18 22:27:16 -05:00
Ben Sander
61af94a555
Update docs for event, review event TODO.
...
Change-Id: Iec491f9f22df163f01c0af6639fcbe33c81acdcc
2016-10-18 22:27:16 -05:00
Ben Sander
9315ac1a29
Move some internal headers from "include/hip/" to src.
...
Change-Id: I7041bd5c803d9318979f4a7c1d658445c614691e
2016-10-18 22:27:16 -05:00
Maneesh Gupta
2df7159ad7
Rename hipComplex.h -> hip_complex.h
...
Change-Id: I86af4ddccc6ebb19606156b459e3065d2c979108
2016-10-16 11:02:36 +05:30
Maneesh Gupta
9608fb93b5
include headers: Update copyright header and fix line endings
...
Change-Id: If2b0855f4ebf1e966edb54de5667687d154cc574
2016-10-15 22:52:10 +05:30
Ben Sander
c54220eca9
Cleanup files from code review.
...
- Remove some stale code
- Update docs
- Correct define for __HIP_ARCH_HAS_GLOBAL_INT64_ATOMICS__
Change-Id: Ic5e3cdb8269b1c18f6d2693700b55e08c4d0080e
2016-10-15 11:51:20 -05:00
Ben Sander
50e0a363ce
Add code to use new HCC API accelerator_view::dispatch_hsa_kernel.
...
Disabled by default; can be enabled with USE_DISPATCH_HSA_KERNEL=1
Change-Id: I7a6ba76f2bada34952ed47f5335ce695fa2faea5
2016-10-14 23:46:29 -05:00
Maneesh Gupta
84283d0801
Remove orphaned hip_blas.h from hcc_detail and nvcc_detail
...
Change-Id: I7e2dda475b538d30942c52d86fbdb213918c630c
2016-10-14 12:55:50 +05:30
Maneesh Gupta
6a14f39f8b
Remove incorrect executable-bit from non-executable files
...
Change-Id: Iacc434374721e01f7d75d0ab54bceabe0b337f54
2016-10-14 12:53:13 +05:30
Aditya Atluri
e1929e8e82
added limit enum to nvcc
...
Change-Id: If9cb6b1205631da36ec18a84f736f2f2f5155885
2016-10-13 15:15:02 -05:00
Aditya Atluri
00c3db0e60
changed hipLimit to hipLimit_t and data type to enum
...
Change-Id: I94f408cdcac4b0bb38801d58709b68e9630d44d0
2016-10-13 15:13:11 -05:00