Evgeny Mankov
2b6fda77ca
Attribute hipDevAttrConcurrentKernels for obtaining Device property concurrentKernels is added.
2016-02-18 14:34:18 +03:00
Evgeny Mankov
ea8f99702d
Fix typo: maxThreadsPerMultiProcessor -> MaxSharedMemoryPerMultiprocessor
Device property MaxSharedMemoryPerMultiprocessor set equal to totalGlobalMem (HIP path).
Reason: MaxSharedMemoryPerMultiprocessor should be the same as the group memory size. Group memory is not paged out, so the physical memory size = total shared memory size = group region size. The NVCC path remains untouched: CUDA's device property MaxSharedMemoryPerMultiprocessor is reported.
hipify is updated as well.
2016-02-12 01:29:20 +03:00
Evgeny Mankov
33f60c300d
BDFID (BusID/DeviceID/FunctionID) support.
FunctionID (or DomainID in CUDA) is not supported, because cudaDeviceProp::pciDomainID is not reported by CUDA.
2016-02-11 22:26:01 +03:00
Evgeny Mankov
254da4ec53
Formatting, no functional changes
2016-02-10 17:21:18 +03:00
gargrahul
8c40a4ace4
Removed atomicInc and atomicDec support from HIP
2016-02-10 04:29:55 +05:30
Evgeny Mankov
950c3baacd
Device property concurrentKernels is added to hipDeviceProp_t struct.
For the HCC path, concurrentKernels is set to true, since all ROCR hardware supports this feature.
For the NVCC path, concurrentKernels is obtained from CUDA's device property cudaDeviceProp::concurrentKernels.
2016-02-09 17:10:35 +03:00
Maneesh Gupta
3291e0ec96
Move HIP_DEVICE_COMPILE defines to hip_common.h
2016-02-09 10:57:20 +05:30
Ben Sander
9e2c3c8df3
minor doc touchup
2016-02-08 22:11:11 -06:00
Ben Sander
76ebe6dcfd
Fix getdeviceattr compilation for NVCC
2016-02-04 16:26:33 -06:00
Sam Kolton
0a27507208
Implementation of hipDeviceGetAttribute()
2016-02-04 17:39:27 +03:00
Peng Sun
c73996d041
Fix all TODO-doc
2016-02-02 21:29:09 -06:00
Peng Sun
8b74333204
Finish all TODO for error code
2016-02-02 17:39:46 -06:00
scchan
265c42500f
add inline attribute to shfl functions
2016-02-02 12:53:17 -06:00
streamhsa
974d491902
Adjusted the value of __any as per CUDA -sandeep
2016-02-02 15:25:42 +05:30
streamhsa
23904df99b
Added support for __ffs() and __ffsll() with signed input -sandeep
2016-02-02 15:05:46 +05:30
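For readers unfamiliar with the intrinsics named in the entry above, the following is a minimal host-side sketch of the semantics CUDA documents for __ffs() and __ffsll(): the 1-based position of the least significant set bit, or 0 when the input is 0. The helper names ref_ffs_bits, ref_ffs and ref_ffsll are hypothetical and are not part of HIP. The sketch assumes a 32-bit int and a 64-bit long long, and it does not show how this commit implemented the device functions.

```cpp
#include <climits>

// Hypothetical host-side reference, NOT the HIP device intrinsic.
// Returns the 1-based position of the least significant set bit,
// or 0 if no bit is set.
constexpr int ref_ffs_bits(unsigned long long bits) {
    if (bits == 0) {
        return 0;
    }
    int pos = 1;
    while ((bits & 1ULL) == 0) {
        bits >>= 1;
        ++pos;
    }
    return pos;
}

// Signed 32-bit input: reinterpret the value as unsigned first, so that
// negative inputs are scanned by bit pattern rather than by magnitude.
constexpr int ref_ffs(int x) {
    return ref_ffs_bits(static_cast<unsigned int>(x));
}

// Signed 64-bit input, handled the same way.
constexpr int ref_ffsll(long long x) {
    return ref_ffs_bits(static_cast<unsigned long long>(x));
}

// Compile-time checks (assuming 32-bit int and 64-bit long long).
static_assert(ref_ffs(0) == 0, "zero has no set bit");
static_assert(ref_ffs(8) == 4, "bit 3 is reported as position 4");
static_assert(ref_ffs(-1) == 1, "all bits set, lowest is position 1");
static_assert(ref_ffs(INT_MIN) == 32, "only the sign bit is set");
static_assert(ref_ffsll(LLONG_MIN) == 64, "only the 64-bit sign bit is set");
```

Converting to an unsigned type before scanning keeps the result well defined for negative inputs, which is the case the entry refers to as signed input.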
scchan
04f3e3e598
adding shfl, shfl_up, shfl_down, shfl_xor intrinsics
2016-02-01 23:55:31 -06:00
Maneesh Gupta
861cba6f75
Add double and integer intrinsics to test
2016-02-01 16:00:45 +05:30
Maneesh Gupta
d2c6125a7c
Add few more single precision intrinsics to hcc_detail/hip_runtime.h
2016-02-01 14:29:50 +05:30
Maneesh Gupta
3b19fd578d
Restrict using namespace hc::precise_math to device only
2016-02-01 14:26:50 +05:30
Maneesh Gupta
e55f3778e0
Remove redundant #define __HCC__ in hcc_detail/hip_runtime.h
2016-02-01 14:24:41 +05:30
sunway513
02fa107967
Fix some typos and incorrect namings in comments
2016-01-28 13:17:44 -06:00
sunway513
71a841d764
Fix @file and @brief tag on header files
2016-01-28 10:59:21 -06:00
Ben Sander
f38e63ff18
Initial commit for GPUOpen Launch
2016-01-26 20:14:33 -06:00