Commit graph

71 commits

Author SHA1 Message Date
Ben Sander 3b45e064f9 Refactor staging buffer and sync copies.
- refactor staging buffer to operate on hsa* data structures not
  hc::accelerator.
- use hsa_memory_allocate to allocate staging buffers rather than
  am_alloc.
- Refactor device reset with single member function.  Don't reallocate
  staging buffers on reset.
- Properly track dependencies based on command type.  Add new deps for
  H2D and D2D rather than overloading H2D.
2016-03-17 20:09:10 -05:00
Ben Sander 1b7cc7d921 Refactor to isolate staging buffer code. 2016-03-17 00:20:56 -05:00
Ben Sander a1879ba59b Start separation of staging_buffer.cpp code.
Still #include staging_buffer.cpp into hip_hcc.cpp.
Directed tests compile hip_hcc to static library and use the library.
2016-03-16 22:26:49 -05:00
Ben Sander 15a8e8f8a0 Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging
Conflicts:
	src/hip_hcc.cpp
	tests/src/CMakeLists.txt
2016-03-14 15:01:26 -05:00
Ben Sander 0d05517d0a enable DB, comments 2016-03-14 14:40:41 -05:00
Ben Sander ac272932f6 Improve error reporting.
use throw with error class.
fix bug when memcpyDefault resolved to D2D copy.
2016-03-12 04:02:04 -06:00
Aditya Atluri 3127969d97 Added hipHostRegister for hip with tests and added copyright 2016-03-08 12:57:22 -06:00
Aditya Atluri ffeba62a74 Added hipHostRegister flags 2016-03-07 10:52:40 -06:00
Aditya Atluri 496c549141 Added hipHostRegister feature for CUDA backend and its tests 2016-03-07 03:42:50 -06:00
Ben Sander 9b1b108ea8 Enhance HIP trace debug functions.
- Control with HIP_DB=mask (env var).  See src/hip_hcc.cpp for mask
  values:
    #define DB_API    0 /* 0x01 - shortcut to enable HIP_TRACE_API on single switch */
    #define DB_SYNC   1 /* 0x02 - trace synchronization pieces */
    #define DB_MEM    2 /* 0x04 - trace memory allocation / deallocation */
    #define DB_COPY1  3 /* 0x08 - trace memory copy commands. . */
    #define DB_SIGNAL 4 /* 0x10 - trace signal pool commands */
- Combine with HIP_TRACE to see debug with API trace.
- Use colors to distinguish different flows of debug.
- Add define COMPILE_DB_TRACE to allow removing all debug at compile-time
2016-03-06 23:50:52 -06:00
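
Note on 9b1b108ea8: the DB_* values listed in that message are bit positions, and the hex value in each comment is the matching mask bit (1 << position). The sketch below shows one way such a mask could be read and tested. It is an illustration only: parse_db_mask and db_enabled are hypothetical helpers, not functions from src/hip_hcc.cpp, and parsing with strtoul is an assumption about how HIP_DB is read.

```cpp
#include <cassert>
#include <cstdlib>

// Bit positions copied from the commit message; the mask bit is 1 << position.
#define DB_API    0 /* 0x01 */
#define DB_SYNC   1 /* 0x02 */
#define DB_MEM    2 /* 0x04 */
#define DB_COPY1  3 /* 0x08 */
#define DB_SIGNAL 4 /* 0x10 */

// Hypothetical helper: turn the text of an environment variable into a mask.
// Base 0 lets strtoul accept decimal ("12") as well as hex ("0x0c").
// A null pointer (variable not set) yields an empty mask.
static unsigned parse_db_mask(const char* text) {
    return text ? static_cast<unsigned>(std::strtoul(text, nullptr, 0)) : 0u;
}

// Hypothetical helper: report whether the given bit position is set in the mask.
static bool db_enabled(unsigned mask, int bit) {
    assert(bit >= 0 && bit < 32);
    return ((mask >> bit) & 1u) != 0u;
}
```

Under these assumptions, a caller would pass std::getenv("HIP_DB") to parse_db_mask once at startup, and a setting such as HIP_DB=0x0c would enable DB_MEM and DB_COPY1 tracing only.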
Maneesh Gupta b62040f6fd Fix typo in nvcc_detail/hip_runtime_api.h 2016-03-07 09:40:15 +05:30
Aditya Atluri 45408db5dc added feature for hipHostGetFlags for CUDA and HIP 2016-03-06 12:17:30 -06:00
Aditya Atluri 8a21b42943 corrected hipDeviceGetProperties to hipGetDeviceProperties - not docs 2016-03-06 08:31:04 -06:00
Aditya Atluri 411154f93f Added hipHostAlloc with hipHostAllocMapped flag 2016-03-05 15:57:56 -06:00
Aditya Atluri 2212d35e2d Added hipHostAlloc feature for CUDA 2016-03-05 13:58:56 -06:00
Aditya Atluri 6085d94f7b v2 Added canHostMapMemory 2016-03-05 13:15:07 -06:00
Aditya Atluri a8d30da648 Revert "Added canMapHostMemory feature"
This reverts commit 8c3777d317.
2016-03-05 13:08:57 -06:00
Aditya Atluri 8c3777d317 Added canMapHostMemory feature 2016-03-05 13:06:37 -06:00
Aditya Atluri 1e4d1002a0 Added canMapHostMemory to hipDeviceProp 2016-03-05 19:30:29 -06:00
pensun cb352a17c3 resolve conflicts of doc_update 2016-02-27 15:08:45 -06:00
Aditya Avinash Atluri 9c4819bc29 Merge pull request #4 from AMDComputeLibraries/memtracker
hipGetPointerAttrib behavioral changes
2016-02-27 10:51:23 -06:00
Aditya Avinash Atluri a31f878218 Added CUDA support for hipPointerGetAttributes 2016-02-26 12:33:55 -06:00
Ben Sander 8105bd636f fixes for titan platform 2016-02-26 05:25:30 -06:00
Ben Sander 7a1b4c3878 Merge branch 'memtracker' into privatestaging
Conflicts:
	include/nvcc_detail/hip_runtime_api.h
2016-02-26 06:17:05 -06:00
Ben Sander 4a6173fe58 Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging 2016-02-26 06:15:09 -06:00
Ben Sander af97f5e317 Merge branch 'memtracker' into privatestaging
Conflicts:
	src/hip_hcc.cpp
2016-02-25 19:38:46 -06:00
Evgeny Mankov 7bb0f17656 Add attribute hipDeviceAttributeIsMultiGpuBoard for obtaining the device property isMultiGpuBoard.
On the HIP path the property is obtained by calling hsa_iterate_agents and counting the devices of type HSA_DEVICE_TYPE_GPU.

P.S.
On multi-board systems there might be problems detecting which board a GPU is plugged into (not tested).
2016-02-25 23:44:39 +03:00
Ben Sander 784ebcbc86 Fix memcpy for Titan. Add <threads> to common includes 2016-02-22 15:09:23 -06:00
Ben Sander 16b04fc0d3 Merge branch 'memtracker' of https://github.com/AMDComputeLibraries/HIP-privatestaging into memtracker 2016-02-22 08:33:47 -06:00
gargrahul 14508fd0d6 Update for shared atomics support 2016-02-22 16:21:52 +05:30
Ben Sander d5c777268a Track last command to a stream.
Passing simple tests.
2016-02-20 11:02:07 -06:00
Evgeny Mankov d4b15399f5 Add #ifdef USE_ROCR_20 guard for ROCR_20 device properties (memoryClockRate, memoryBusWidth)
USE_ROCR_20 is not defined by default.
To add ROCR_20 support, HIP has to be compiled as follows: make CXX_DEFINES+=-DUSE_ROCR_20
2016-02-19 13:27:03 +03:00
Evgeny Mankov da8169dd89 Device property memoryBusWidth implementation.
+ Device property memoryBusWidth is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryBusWidth is added to hipDeviceAttribute_t struct.
+ Tests update.
2016-02-18 18:15:01 +03:00
Evgeny Mankov 8aace64dce Device property memoryClockRate implementation.
+ Device property memoryClockRate is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryClockRate is added to hipDeviceAttribute_t struct.
+ Tests update.
+ Rename hipDevAttrConcurrentKernels to hipDeviceAttributeConcurrentKernels.
2016-02-18 17:25:28 +03:00
Evgeny Mankov d4bd94e9a0 Attribute hipDevAttrConcurrentKernels for obtaining Device property concurrentKernels is added. 2016-02-18 14:34:18 +03:00
Ben Sander 866e64f6e2 remove extra : 2016-02-18 03:05:53 -06:00
Ben Sander b08e468c06 Remove HIP-local AM tracker (now in HCC) 2016-02-17 21:33:32 -06:00
Ben Sander 5d721a2649 Add per-stream pool for hsa_signals. 2016-02-16 01:59:13 -06:00
Ben Sander 1ed431c0f6 Update before checkin to HCC.
Add support for USE_AM_TRACKER=2 (HCC version).
Add AM_ALLOC, AM_FREE indirection to ease swapping AM implementations.
2016-02-15 21:16:00 -06:00
Ben Sander bd7e3b83b9 Move warpSize to header, have shuffles use default warpsize. 2016-02-15 05:41:09 -06:00
Ben Sander 322a3bd9b2 Update docs, cleanup 2016-02-15 05:40:12 -06:00
Ben Sander 90af462b85 Step1 in staging buffer copy.
- use StagingBuffer class for copies.
- refactor g_device to use array rather than vector.
   (keeps pointers from moving).
2016-02-12 18:24:08 -06:00
Ben Sander f464cedcf4 Query tracked memory sizes.
Support more accurate hipMemGetInfo.  Add test to hipPointerAttrib.
2016-02-12 18:24:08 -06:00
Ben Sander 7216727fba Tracker improvements
- add API to add / remove user-pointers from the tracker.
- test for thread-safety with MultiThreadtest_2 - rapid
  insertions/removal.
- add mutex to provide thread-safety.
- rename tracker interface to "memtracker_..." for consistency.
- add am_memtracker_reset, connect to hipDeviceReset.
2016-02-12 18:24:08 -06:00
Ben Sander 721508cc2f Create address tracker for am_alloc.
Tracks device where memory is allocated, pinned-host or device, and
more.

Uses memory-range-based lookups, so pointers that exist anywhere in
the range of hostPtr + size will find the associated AmPointerInfo.

The insertions and lookups use a self-balancing binary tree and
should support O(logN) lookup speed.
2016-02-12 18:24:08 -06:00
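
Note on 721508cc2f: the range-based lookup described in that message can be sketched with std::map, which is typically implemented as a red-black tree and gives O(log N) insertion and lookup. This is a minimal illustration, not the tracker's actual code: RangeTracker and PointerInfo are hypothetical names (the real record type is AmPointerInfo, whose fields are not shown in the log), and the sketch does not reject overlapping ranges.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical, simplified stand-in for the per-allocation record.
struct PointerInfo {
    std::uintptr_t base;   // first address of the tracked range
    std::size_t size;      // length of the range in bytes
    int device;            // device where the memory was allocated
    bool is_pinned_host;   // pinned-host memory rather than device memory
};

// Hypothetical tracker keyed by base address.
class RangeTracker {
public:
    // Records the range [ptr, ptr + size). Returns false if the base is already tracked.
    bool insert(const void* ptr, std::size_t size, int device, bool pinned) {
        assert(size > 0);
        auto base = reinterpret_cast<std::uintptr_t>(ptr);
        return ranges_.emplace(base, PointerInfo{base, size, device, pinned}).second;
    }

    // Finds the range containing ptr, where ptr may point anywhere inside it.
    const PointerInfo* find(const void* ptr) const {
        auto addr = reinterpret_cast<std::uintptr_t>(ptr);
        auto it = ranges_.upper_bound(addr);   // first range starting after addr
        if (it == ranges_.begin()) return nullptr;
        --it;                                  // last range starting at or before addr
        return (addr - it->second.base < it->second.size) ? &it->second : nullptr;
    }

    // Removes the range that starts exactly at ptr.
    bool erase(const void* ptr) {
        return ranges_.erase(reinterpret_cast<std::uintptr_t>(ptr)) == 1;
    }

private:
    std::map<std::uintptr_t, PointerInfo> ranges_;
};
```

The lookup relies on upper_bound followed by one step back, which finds the last range whose base is at or below the queried address. A single bounds check then decides whether the address falls inside that range.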
Evgeny Mankov 460b501cbb Fix typo: maxThreadsPerMultiProcessor -> MaxSharedMemoryPerMultiprocessor
Device property MaxSharedMemoryPerMultiprocessor is set equal to totalGlobalMem (HIP path).
Reason: MaxSharedMemoryPerMultiprocessor should be the same as the group memory size. Group memory is not paged out, so the physical memory size = total shared memory size = group region size. The NVCC path remains untouched: CUDA's device property MaxSharedMemoryPerMultiprocessor is reported.

hipify is updated as well.
2016-02-12 01:29:20 +03:00
Evgeny Mankov 658e9f0484 BDFID (BusID/DeviceID/FunctionID) support.
FunctionID (DomainID in CUDA) is not supported, because cudaDeviceProp::pciDomainID is not reported by CUDA.
2016-02-11 22:26:01 +03:00
Evgeny Mankov d9a94191f2 Formatting, no functional changes 2016-02-10 17:21:18 +03:00
gargrahul 51f46d9ddf Removed atomicInc and atomicDec support from HIP 2016-02-10 04:29:55 +05:30
Peng Sun b61f0453c0 Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into doc_update 2016-02-09 15:08:39 -06:00