Aditya Avinash Atluri
ecadb1623c
Merge pull request #4 from AMDComputeLibraries/memtracker
...
hipGetPointerAttrib behavioral changes
2016-02-27 10:51:23 -06:00
Aditya Avinash Atluri
6d66bd63de
Added CUDA support for hipPointerGetAttributes
2016-02-26 12:33:55 -06:00
Ben Sander
ff66ef0779
fixes for titan platform
2016-02-26 05:25:30 -06:00
Ben Sander
369e0d7b5b
Merge branch 'memtracker' into privatestaging
...
Conflicts:
include/nvcc_detail/hip_runtime_api.h
2016-02-26 06:17:05 -06:00
Ben Sander
c300ffe458
Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging
2016-02-26 06:15:09 -06:00
Ben Sander
4adab7b7ef
Merge branch 'memtracker' into privatestaging
...
Conflicts:
src/hip_hcc.cpp
2016-02-25 19:38:46 -06:00
Evgeny Mankov
57e212606d
Add attribute hipDeviceAttributeIsMultiGpuBoard for querying the device property isMultiGpuBoard.
...
On the HIP path the property is obtained by calling hsa_iterate_agents and counting the devices of type HSA_DEVICE_TYPE_GPU.
P.S.
On multi-board systems there may be problems detecting which board a GPU is plugged into (not tested).
2016-02-25 23:44:39 +03:00
Ben Sander
c2d66a48a7
Fix memcpy for Titan. Add <threads> to common includes
2016-02-22 15:09:23 -06:00
Ben Sander
0a98db4b5f
Merge branch 'memtracker' of https://github.com/AMDComputeLibraries/HIP-privatestaging into memtracker
2016-02-22 08:33:47 -06:00
gargrahul
a2fbf06129
Update for shared atomics support
2016-02-22 16:21:52 +05:30
Ben Sander
d33d806a5b
Track last command to a stream.
...
Passing simple tests.
2016-02-20 11:02:07 -06:00
Evgeny Mankov
833c9e52ad
Guard #ifdef USE_ROCR_20 is added for ROCR_20 device properties (memoryClockRate, memoryBusWidth)
...
USE_ROCR_20 is not defined by default.
To add ROCR_20 support, HIP has to be compiled as follows: make CXX_DEFINES+=-DUSE_ROCR_20
2016-02-19 13:27:03 +03:00
Evgeny Mankov
1c19dbb807
Device property memoryBusWidth implementation.
...
+ Device property memoryBusWidth is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryBusWidth is added to hipDeviceAttribute_t enum.
+ Tests update.
2016-02-18 18:15:01 +03:00
Evgeny Mankov
5ea8543d2e
Device property memoryClockRate implementation.
...
+ Device property memoryClockRate is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryClockRate is added to hipDeviceAttribute_t enum.
+ Tests update.
+ Rename hipDevAttrConcurrentKernels to hipDeviceAttributeConcurrentKernels.
2016-02-18 17:25:28 +03:00
Evgeny Mankov
2b6fda77ca
Add attribute hipDevAttrConcurrentKernels for querying the device property concurrentKernels.
2016-02-18 14:34:18 +03:00
Ben Sander
b63470f4cc
remove extra :
2016-02-18 03:05:53 -06:00
Ben Sander
d653782d9d
Remove HIP-local AM tracker (now in HCC)
2016-02-17 21:33:32 -06:00
Ben Sander
caef9b5ced
Add per-stream pool for hsa_signals.
2016-02-16 01:59:13 -06:00
Ben Sander
38c735fd1d
Update before checkin to HCC.
...
Add support for USE_AM_TRACKER=2 (HCC version).
Add AM_ALLOC, AM_FREE indirection to ease swapping AM implementations.
2016-02-15 21:16:00 -06:00
Ben Sander
db3a63360b
Move warpSize to header, have shuffles use default warpsize.
2016-02-15 05:41:09 -06:00
Ben Sander
4637e19da4
Update docs, cleanup
2016-02-15 05:40:12 -06:00
Ben Sander
24c1fdb864
Step 1 in staging buffer copy.
...
- use StagingBuffer class for copies.
- refactor g_device to use array rather than vector.
(keeps pointers from moving).
2016-02-12 18:24:08 -06:00
Ben Sander
d7396b5af3
Query tracked memory sizes.
...
Support more accurate hipMemGetInfo. Add test to hipPointerAttrib.
2016-02-12 18:24:08 -06:00
Ben Sander
de45e2291e
Tracker improvements
...
- add API to add / remove user-pointers from the tracker.
- test for thread-safety with MultiThreadtest_2 - rapid
insertions/removal.
- add mutex to provide thread-safety.
- rename tracker interface to "memtracker_..." for consistency.
- add am_memtracker_reset, connect to hipDeviceReset.
2016-02-12 18:24:08 -06:00
Ben Sander
4ee2a5229b
Create address tracker for am_alloc.
...
Tracks device where memory is allocated, pinned-host or device, and
more.
Uses memory-range-based lookups, so any pointer that falls within the
range from hostPtr to hostPtr + size will find the associated AmPointerInfo.
The insertions and lookups use a self-balancing binary tree and
should support O(log N) lookup speed.
2016-02-12 18:24:08 -06:00
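The range-based lookup described in 4ee2a5229b can be sketched roughly as follows. This is a minimal illustration, not the tracker's actual code: PointerInfo, RangeTracker and their members are hypothetical names (the real AmPointerInfo fields are not shown in this log), std::map stands in for the self-balancing tree, and tracked ranges are assumed not to overlap.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical record; stands in for AmPointerInfo, whose real fields
// are not shown in this log.
struct PointerInfo {
    const void* base;       // start of the tracked allocation
    std::size_t sizeBytes;  // length of the allocation in bytes
    int deviceId;           // device the memory was allocated on
    bool isPinnedHost;      // true for pinned host memory, false for device memory
};

// Hypothetical tracker keyed by base address. std::map is an ordered
// container with logarithmic insert, erase and lookup.
// Assumption: tracked ranges never overlap.
class RangeTracker {
public:
    void insert(const void* base, std::size_t sizeBytes, int deviceId, bool pinnedHost) {
        ranges_[key(base)] = PointerInfo{base, sizeBytes, deviceId, pinnedHost};
    }

    // Removes the range that starts exactly at base; returns true if one existed.
    bool remove(const void* base) {
        return ranges_.erase(key(base)) == 1;
    }

    // Finds the range containing ptr, where ptr may point anywhere in
    // [base, base + sizeBytes). Returns false if no tracked range contains it.
    bool find(const void* ptr, PointerInfo* out) const {
        const std::uintptr_t addr = key(ptr);
        // First range whose base is strictly greater than addr...
        auto it = ranges_.upper_bound(addr);
        if (it == ranges_.begin()) {
            return false;  // every tracked range starts above addr
        }
        --it;  // ...so its predecessor is the last range with base <= addr.
        if (addr - it->first >= it->second.sizeBytes) {
            return false;  // addr lies past the end of that range
        }
        *out = it->second;
        return true;
    }

private:
    static std::uintptr_t key(const void* p) {
        return reinterpret_cast<std::uintptr_t>(p);
    }

    std::map<std::uintptr_t, PointerInfo> ranges_;
};
```

With this sketch, a lookup on an interior pointer such as base + 10 returns the same record as a lookup on base itself, and a pointer one past the end of the range is reported as untracked.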
Evgeny Mankov
ea8f99702d
Fix typo: maxThreadsPerMultiProcessor -> MaxSharedMemoryPerMultiprocessor
...
Device property MaxSharedMemoryPerMultiprocessor is set equal to totalGlobalMem (HIP path).
Reason: MaxSharedMemoryPerMultiprocessor should be the same as the group memory size. Group memory is not paged out, so the physical memory size = total shared memory size = group region size. The NVCC path remains untouched: CUDA's device property MaxSharedMemoryPerMultiprocessor is reported.
hipify is updated as well.
2016-02-12 01:29:20 +03:00
Evgeny Mankov
33f60c300d
BDFID (BusID/DeviceID/FunctionID) support.
...
FunctionID (or DomainID in CUDA) is not supported, because cudaDeviceProp::pciDomainID is not reported by CUDA.
2016-02-11 22:26:01 +03:00
Evgeny Mankov
254da4ec53
Formatting, no functional changes
2016-02-10 17:21:18 +03:00
gargrahul
8c40a4ace4
Removed atomicInc and atomicDec support from HIP
2016-02-10 04:29:55 +05:30
Evgeny Mankov
950c3baacd
Device property concurrentKernels is added to hipDeviceProp_t struct.
...
For the HCC path, concurrentKernels is set to true since all ROCR hardware supports this feature.
For the NVCC path, concurrentKernels is obtained from CUDA's device property cudaDeviceProp::concurrentKernels.
2016-02-09 17:10:35 +03:00
Maneesh Gupta
3291e0ec96
Move HIP_DEVICE_COMPILE defines to hip_common.h
2016-02-09 10:57:20 +05:30
Ben Sander
9e2c3c8df3
minor doc touchup
2016-02-08 22:11:11 -06:00
Ben Sander
76ebe6dcfd
Fix getdeviceattr compilation for NVCC
2016-02-04 16:26:33 -06:00
Sam Kolton
0a27507208
Implementation of hipDeviceGetAttribute()
2016-02-04 17:39:27 +03:00
Peng Sun
c73996d041
Fix all TODO-doc
2016-02-02 21:29:09 -06:00
Peng Sun
8b74333204
Finish all TODO for error code
2016-02-02 17:39:46 -06:00
scchan
265c42500f
add inline attribute to shfl functions
2016-02-02 12:53:17 -06:00
streamhsa
974d491902
Adjusted the value of __any as per CUDA -sandeep
2016-02-02 15:25:42 +05:30
streamhsa
23904df99b
ADDED Support for __ffs() and __ffsll() having signed input -sandeep
2016-02-02 15:05:46 +05:30
scchan
04f3e3e598
adding shfl, shfl_up, shfl_down, shfl_xor intrinsics
2016-02-01 23:55:31 -06:00
Maneesh Gupta
861cba6f75
Add double and integer intrinsics to test
2016-02-01 16:00:45 +05:30
Maneesh Gupta
d2c6125a7c
Add a few more single precision intrinsics to hcc_detail/hip_runtime.h
2016-02-01 14:29:50 +05:30
Maneesh Gupta
3b19fd578d
Restrict using namespace hc::precise_math to device only
2016-02-01 14:26:50 +05:30
Maneesh Gupta
e55f3778e0
Remove redundant #define __HCC__ in hcc_detail/hip_runtime.h
2016-02-01 14:24:41 +05:30
sunway513
02fa107967
Fix some typos and incorrect namings in comments
2016-01-28 13:17:44 -06:00
sunway513
71a841d764
Fix @file and @brief tag on header files
2016-01-28 10:59:21 -06:00
Ben Sander
f38e63ff18
Initial commit for GPUOpen Launch
2016-01-26 20:14:33 -06:00