Commit graph

52 commits

Author SHA1 Message Date
pensun cb19da0aa7 resolve conflicts of doc_update
[ROCm/hip commit: 11ca71bd76]
2016-02-27 15:08:45 -06:00
Aditya Avinash Atluri f2dfb87abf Merge pull request #4 from AMDComputeLibraries/memtracker
hipGetPointerAttrib behavioral changes

[ROCm/hip commit: ecadb1623c]
2016-02-27 10:51:23 -06:00
Aditya Avinash Atluri ed96744f76 Added CUDA support for hipPointerGetAttributes
[ROCm/hip commit: 6d66bd63de]
2016-02-26 12:33:55 -06:00
Ben Sander 1ac07d2b87 fixes for titan platform
[ROCm/hip commit: ff66ef0779]
2016-02-26 05:25:30 -06:00
Ben Sander 193dbe4632 Merge branch 'memtracker' into privatestaging
Conflicts:
	include/nvcc_detail/hip_runtime_api.h


[ROCm/hip commit: 369e0d7b5b]
2016-02-26 06:17:05 -06:00
Ben Sander 8a2bcf2da3 Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging
[ROCm/hip commit: c300ffe458]
2016-02-26 06:15:09 -06:00
Ben Sander 5ca4914e0e Merge branch 'memtracker' into privatestaging
Conflicts:
	src/hip_hcc.cpp


[ROCm/hip commit: 4adab7b7ef]
2016-02-25 19:38:46 -06:00
Evgeny Mankov 82900a1888 Attribute hipDeviceAttributeIsMultiGpuBoard for obtaining Device property isMultiGpuBoard is added.
On the HIP path the property is obtained by calling hsa_iterate_agents and counting the devices of type HSA_DEVICE_TYPE_GPU.

P.S.
On multi-board systems there may be problems detecting which board a GPU is plugged into (not tested).


[ROCm/hip commit: 57e212606d]
2016-02-25 23:44:39 +03:00
Ben Sander 1d027bcaea Fix memcpy for Titan. Add <threads> to common includes
[ROCm/hip commit: c2d66a48a7]
2016-02-22 15:09:23 -06:00
Ben Sander 23b257bca4 Merge branch 'memtracker' of https://github.com/AMDComputeLibraries/HIP-privatestaging into memtracker
[ROCm/hip commit: 0a98db4b5f]
2016-02-22 08:33:47 -06:00
gargrahul ccd1ed0a97 Update for shared atomics support
[ROCm/hip commit: a2fbf06129]
2016-02-22 16:21:52 +05:30
Ben Sander ebf2700936 Track last command to a stream.
Passing simple tests.


[ROCm/hip commit: d33d806a5b]
2016-02-20 11:02:07 -06:00
Evgeny Mankov c3a600c63b Guard #ifdef USE_ROCR_20 is added for ROCR_20 device properties (memoryClockRate, memoryBusWidth)
It is not defined by default.
To add ROCR_20 support, HIP has to be compiled as follows: make CXX_DEFINES+=-DUSE_ROCR_20


[ROCm/hip commit: 833c9e52ad]
2016-02-19 13:27:03 +03:00
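
The compile-time guard described in commit c3a600c63b might look roughly like the sketch below. Only the macro name USE_ROCR_20 and the property names memoryClockRate and memoryBusWidth come from the commit message; the DeviceProps struct and the applyMemoryProps helper are hypothetical names used for illustration, and the actual HIP source may be organized differently.

```cpp
#include <cassert>

// Hypothetical, trimmed-down stand-in for hipDeviceProp_t.
struct DeviceProps {
    int memoryClockRate;  // memory clock rate as reported by the runtime
    int memoryBusWidth;   // memory bus width as reported by the runtime
};

// Hypothetical helper: copies the queried values into props only when the
// build was configured with -DUSE_ROCR_20. Returns true if the fields were set.
inline bool applyMemoryProps(DeviceProps& props, int queriedClockRate, int queriedBusWidth) {
#ifdef USE_ROCR_20
    props.memoryClockRate = queriedClockRate;
    props.memoryBusWidth = queriedBusWidth;
    return true;
#else
    // Guard is off by default: leave the fields untouched.
    (void)props;
    (void)queriedClockRate;
    (void)queriedBusWidth;
    return false;
#endif
}
```

Without the define, the helper leaves both fields at whatever value the caller initialized them to, which matches the commit's statement that the guard is not defined by default.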
Evgeny Mankov 4fcd9f2542 Device property memoryBusWidth implementation.
+ Device property memoryBusWidth is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryBusWidth is added to hipDeviceAttribute_t struct.
+ Tests update.


[ROCm/hip commit: 1c19dbb807]
2016-02-18 18:15:01 +03:00
Evgeny Mankov a0cc7134e3 Device property memoryClockRate implementation.
+ Device property memoryClockRate is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryClockRate is added to hipDeviceAttribute_t struct.
+ Tests update.
+ Rename hipDevAttrConcurrentKernels to hipDeviceAttributeConcurrentKernels.


[ROCm/hip commit: 5ea8543d2e]
2016-02-18 17:25:28 +03:00
Evgeny Mankov 8c1a0d1924 Attribute hipDevAttrConcurrentKernels for obtaining Device property concurrentKernels is added.
[ROCm/hip commit: 2b6fda77ca]
2016-02-18 14:34:18 +03:00
Ben Sander d064a446d0 remove extra :
[ROCm/hip commit: b63470f4cc]
2016-02-18 03:05:53 -06:00
Ben Sander a2d8f9d98e Remove HIP-local AM tracker (now in HCC)
[ROCm/hip commit: d653782d9d]
2016-02-17 21:33:32 -06:00
Ben Sander 512163b889 Add per-stream pool for hsa_signals.
[ROCm/hip commit: caef9b5ced]
2016-02-16 01:59:13 -06:00
Ben Sander 9ab5b92173 Update before checkin to HCC.
Add support for USE_AM_TRACKER=2 (HCC version).
Add AM_ALLOC, AM_FREE indirection to ease swapping AM implementations.


[ROCm/hip commit: 38c735fd1d]
2016-02-15 21:16:00 -06:00
Ben Sander d58eab1706 Move warpSize to header, have shuffles use default warpsize.
[ROCm/hip commit: db3a63360b]
2016-02-15 05:41:09 -06:00
Ben Sander b97e430921 Update docs, cleanup
[ROCm/hip commit: 4637e19da4]
2016-02-15 05:40:12 -06:00
Ben Sander c441d5ec29 Step1 in staging buffer copy.
- use StagingBuffer class for copies.
- refactor g_device to use array rather than vector.
   (keeps pointers from moving).


[ROCm/hip commit: 24c1fdb864]
2016-02-12 18:24:08 -06:00
Ben Sander b9dc0e9497 Query tracked memory sizes.
Support more accurate hipMemGetInfo.  Add test to hipPointerAttrib.


[ROCm/hip commit: d7396b5af3]
2016-02-12 18:24:08 -06:00
Ben Sander 680b600b4a Tracker improvements
- add API to add / remove user-pointers from the tracker.
- test for thread-safety with MultiThreadtest_2 - rapid
  insertions/removal.
- add mutex to provide thread-safety.
- rename tracker interface to "memtracker_..." for consistency.
- add am_memtracker_reset, connect to hipDeviceReset.
-


[ROCm/hip commit: de45e2291e]
2016-02-12 18:24:08 -06:00
Ben Sander d4a90f8afd Create address tracker for am_alloc.
Tracks device where memory is allocated, pinned-host or device, and
more.

Uses memory-range-based lookups - so pointers that exist anywhere in
the range of hostPtr + size will find the associated AmPointerInfo.

The insertions and lookups use a self-balancing binary tree and
should support O(logN) lookup speed.


[ROCm/hip commit: 4ee2a5229b]
2016-02-12 18:24:08 -06:00
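
The range-based lookup described in commit d4a90f8afd can be sketched as follows. This is a minimal illustration, not the tracker's actual code: it assumes std::map (usually a red-black tree) as the self-balancing binary tree, and PointerInfo and RangeTracker are hypothetical names standing in for AmPointerInfo and the tracker itself. It also omits the mutex that a later commit adds for thread-safety.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical stand-in for the AmPointerInfo mentioned in the commit message.
struct PointerInfo {
    std::size_t sizeBytes;  // length of the tracked allocation
    int deviceId;           // device where the memory was allocated
    bool isPinnedHost;      // true for pinned host memory, false for device memory
};

// Tracks allocations keyed by base address. Insert, remove, and find are
// O(log N) because std::map is a balanced search tree.
class RangeTracker {
public:
    void insert(const void* base, const PointerInfo& info) {
        ranges_[reinterpret_cast<std::uintptr_t>(base)] = info;
    }

    bool remove(const void* base) {
        return ranges_.erase(reinterpret_cast<std::uintptr_t>(base)) > 0;
    }

    // Returns the info for the allocation containing ptr, or nullptr.
    // ptr may point anywhere in [base, base + sizeBytes).
    const PointerInfo* find(const void* ptr) const {
        const std::uintptr_t addr = reinterpret_cast<std::uintptr_t>(ptr);
        auto it = ranges_.upper_bound(addr);  // first range starting after addr
        if (it == ranges_.begin()) return nullptr;
        --it;                                 // last range starting at or before addr
        if (addr - it->first < it->second.sizeBytes) return &it->second;
        return nullptr;
    }

private:
    std::map<std::uintptr_t, PointerInfo> ranges_;
};
```

Keying on the base address and stepping back one entry from upper_bound is what lets an interior pointer resolve to its enclosing allocation without scanning every range.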
Evgeny Mankov fcd154097f Fix typo: maxThreadsPerMultiProcessor -> MaxSharedMemoryPerMultiprocessor
Device property MaxSharedMemoryPerMultiprocessor set equal to totalGlobalMem (HIP path).
Reason: MaxSharedMemoryPerMultiprocessor should be the same as the group memory size. Group memory is not paged out, so the physical memory size = total shared memory size = group region size. The NVCC path remains untouched: CUDA's device property MaxSharedMemoryPerMultiprocessor is reported.

hipify is updated as well.


[ROCm/hip commit: ea8f99702d]
2016-02-12 01:29:20 +03:00
Evgeny Mankov 4eade0ce83 BDFID (BusID/DeviceID/FunctionID) support.
FunctionID (DomainID in CUDA) is not supported, because cudaDeviceProp::pciDomainID is not reported by CUDA.


[ROCm/hip commit: 33f60c300d]
2016-02-11 22:26:01 +03:00
Evgeny Mankov 3a032ff317 Formatting, no functional changes
[ROCm/hip commit: 254da4ec53]
2016-02-10 17:21:18 +03:00
gargrahul 1ab2294657 Removed atomicInc and atomicDec support from HIP
[ROCm/hip commit: 8c40a4ace4]
2016-02-10 04:29:55 +05:30
Peng Sun 691aa5cda6 Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into doc_update
[ROCm/hip commit: 1fb48b0714]
2016-02-09 15:08:39 -06:00
Peng Sun 00ef2c28a2 Fix TODO-Doc in hip_texture.h
[ROCm/hip commit: 28025f6a74]
2016-02-09 10:58:23 -06:00
Evgeny Mankov c38a69ef33 Device property concurrentKernels is added to hipDeviceProp_t struct.
For HCC path concurrentKernels is set to true since all ROCR hardware supports this feature.
For NVCC path concurrentKernels is obtained from CUDA's device property cudaDeviceProp::concurrentKernels.


[ROCm/hip commit: 950c3baacd]
2016-02-09 17:10:35 +03:00
Maneesh Gupta 77f61d1a46 Move HIP_DEVICE_COMPILE defines to hip_common.h
[ROCm/hip commit: 3291e0ec96]
2016-02-09 10:57:20 +05:30
Ben Sander a2dac9e12c minor doc touchup
[ROCm/hip commit: 9e2c3c8df3]
2016-02-08 22:11:11 -06:00
Peng Sun f28e3eedc5 fix merging conflicts
[ROCm/hip commit: fb3b11774b]
2016-02-08 15:35:49 -06:00
Ben Sander 9cb14a455c Fix getdeviceattr compilation for NVCC
[ROCm/hip commit: 76ebe6dcfd]
2016-02-04 16:26:33 -06:00
Sam Kolton 2306293526 Implementation of hipDeviceGetAttribute()
[ROCm/hip commit: 0a27507208]
2016-02-04 17:39:27 +03:00
Peng Sun 8347c2163d Additional typo and extra space fix
[ROCm/hip commit: 3d5608ea84]
2016-02-03 09:42:16 -06:00
Peng Sun 503ec9ad24 Fix all TODO-doc
[ROCm/hip commit: c73996d041]
2016-02-02 21:29:09 -06:00
Peng Sun 03630ee0a4 Finish all TODO for error code
[ROCm/hip commit: 8b74333204]
2016-02-02 17:39:46 -06:00
scchan 39fb16bc5f add inline attribute to shfl functions
[ROCm/hip commit: 265c42500f]
2016-02-02 12:53:17 -06:00
streamhsa af8cc35552 Adjusted the value of __any as per CUDA -sandeep
[ROCm/hip commit: 974d491902]
2016-02-02 15:25:42 +05:30
streamhsa e4635c36a0 ADDED Support for __ffs() and __ffsll() having signed input -sandeep
[ROCm/hip commit: 23904df99b]
2016-02-02 15:05:46 +05:30
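
For reference, the semantics that __ffs() and __ffsll() follow in CUDA for signed input can be modeled on the host as below. This is a sketch of the expected behavior only, not the HIP implementation: ffs_ref and ffsll_ref are hypothetical names. The result is the 1-based position of the least significant set bit, or 0 when the input is 0; a signed argument is treated as its raw bit pattern.

```cpp
#include <cassert>

// Host-side reference for __ffs: 1-based index of the lowest set bit, 0 if none.
inline int ffs_ref(int x) {
    unsigned int bits = static_cast<unsigned int>(x);  // use the raw bit pattern
    if (bits == 0) return 0;
    int pos = 1;
    while ((bits & 1u) == 0) {
        bits >>= 1;
        ++pos;
    }
    return pos;
}

// Host-side reference for __ffsll: same rule applied to a 64-bit value.
inline int ffsll_ref(long long x) {
    unsigned long long bits = static_cast<unsigned long long>(x);
    if (bits == 0) return 0;
    int pos = 1;
    while ((bits & 1ull) == 0) {
        bits >>= 1;
        ++pos;
    }
    return pos;
}
```

Converting to the unsigned type before shifting avoids relying on how right shifts of negative values behave.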
scchan ca142c6d9c adding shfl, shfl_up, shfl_down, shfl_xor intrinsics
[ROCm/hip commit: 04f3e3e598]
2016-02-01 23:55:31 -06:00
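
To illustrate what a shuffle-down intrinsic does, the following host-side simulation models one warp as a vector with one element per lane. It is a sketch under stated assumptions: shfl_down_sim and warp_reduce_sum_sim are hypothetical helpers, the full warp is treated as a single segment, and lanes whose source would fall past the end of the warp keep their own value, as in CUDA. It does not call the device intrinsics added in this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simulates shuffle-down: lane i reads the value held by lane i + delta.
// Lanes whose source is out of range keep their own value.
inline std::vector<int> shfl_down_sim(const std::vector<int>& lanes, std::size_t delta) {
    std::vector<int> out(lanes);
    for (std::size_t lane = 0; lane < lanes.size(); ++lane) {
        const std::size_t src = lane + delta;
        if (src < lanes.size()) out[lane] = lanes[src];
    }
    return out;
}

// Tree reduction built on shuffle-down; the total ends up in lane 0.
// Assumes a non-empty, power-of-two lane count (for example 32 or 64).
inline int warp_reduce_sum_sim(std::vector<int> lanes) {
    assert(!lanes.empty());
    for (std::size_t delta = lanes.size() / 2; delta > 0; delta /= 2) {
        const std::vector<int> shifted = shfl_down_sim(lanes, delta);
        for (std::size_t lane = 0; lane < lanes.size(); ++lane) {
            lanes[lane] += shifted[lane];
        }
    }
    return lanes[0];
}
```

Only lane 0 holds a meaningful sum at the end; the upper lanes accumulate partial or duplicated values, which is also the case for the equivalent device-side pattern.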
Maneesh Gupta 9ed3ef50fe Add double and integer intrinsics to test
[ROCm/hip commit: 861cba6f75]
2016-02-01 16:00:45 +05:30
Maneesh Gupta 1b4ad3eedf Add few more single precision intrinsics to hcc_detail/hip_runtime.h
[ROCm/hip commit: d2c6125a7c]
2016-02-01 14:29:50 +05:30
Maneesh Gupta 01c51ce734 Restrict using namespace hc::precise_math to device only
[ROCm/hip commit: 3b19fd578d]
2016-02-01 14:26:50 +05:30
Maneesh Gupta 3e13c7dae7 Remove redundant #define __HCC__ in hcc_detail/hip_runtime.h
[ROCm/hip commit: e55f3778e0]
2016-02-01 14:24:41 +05:30
sunway513 1b93c2f456 Fix some typos and incorrect namings in comments
[ROCm/hip commit: 02fa107967]
2016-01-28 13:17:44 -06:00