Commit Graph

1023 Commits

Author SHA1 Message Date
Ben Sander cea37c3e91 Deprecate hipMallocHost and hipFreeHost.
These will print compiler warnings if used, so we can weed them out
before removing.

Also add a default flags arg for hipHostAlloc, in the C++ function
headers.  So you can replace hipMallocHost(&ptr, size) with hipHostAlloc(&ptr, size)
2016-03-19 22:53:59 -05:00
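
The deprecation pattern described in this commit can be sketched as follows. This is a minimal illustration, not the HIP headers: demoHostAlloc, demoHostFree, demoMallocHost and demoHostAllocDefault are hypothetical stand-ins, and the standard C++14 [[deprecated]] attribute is assumed here as the warning mechanism (the log does not say which mechanism HIP used).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-ins for illustration only; not the HIP declarations.
constexpr unsigned int demoHostAllocDefault = 0;

// Newer entry point: the flags argument has a default value, so a
// two-argument call such as demoHostAlloc(&ptr, size) compiles.
inline int demoHostAlloc(void** ptr, std::size_t size,
                         unsigned int flags = demoHostAllocDefault) {
    assert(ptr != nullptr);
    (void)flags;  // flags are ignored in this sketch
    *ptr = std::malloc(size);
    return *ptr != nullptr ? 0 : 1;  // 0 means success
}

inline int demoHostFree(void* ptr) {
    std::free(ptr);
    return 0;
}

// Older entry point: still works, but each use produces a compiler
// warning, which makes remaining callers easy to find before removal.
[[deprecated("use demoHostAlloc instead")]]
inline int demoMallocHost(void** ptr, std::size_t size) {
    return demoHostAlloc(ptr, size);
}
```

Because the new function defaults its flags, migrating a caller is a rename with no other changes to the argument list.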
Ben Sander 0af4d3623f Refactor copy code.
-Move staging buffer locks inside the staging buffer code.
-Remove dedicated per-device completion_signal + per-device lock -
instead allocate signal from the per-stream pool.  This eliminates
the lock and allows more concurrency.
-remove switch HIP_DISABLE_BIDIR_MEMCPY
2016-03-18 03:02:00 -05:00
Ben Sander 7d500599fa Refactor staging buffer and sync copies.
- refactor staging buffer to operate on hsa* data structures not
  hc::accelerator.
- use hsa_memory_allocate to allocate staging buffers rather than
  am_alloc.
- Refactor device reset with single member function.  Don't reallocate
  staging buffers on reset.
- Properly track dependencies based on command type.  Add new deps for
  H2D and D2D rather than overloading H2D.
2016-03-17 20:09:10 -05:00
Ben Sander e7586adb33 Refactor to isolate staging buffer code. 2016-03-17 00:20:56 -05:00
Ben Sander 28ee7aff71 Start separation of staging_buffer.cpp code.
Still #include staging_buffer.cpp into hip_hcc.cpp.
Directed tests compile hip_hcc to static library and use the library.
2016-03-16 22:26:49 -05:00
Ben Sander e1617b9604 Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging
Conflicts:
	src/hip_hcc.cpp
	tests/src/CMakeLists.txt
2016-03-14 15:01:26 -05:00
Ben Sander 1a27e5134e enable DB, comments 2016-03-14 14:40:41 -05:00
Ben Sander 250739666d Improve error reporting.
use throw with error class.
fix bug when memcpyDefault resolved to D2D copy.
2016-03-12 04:02:04 -06:00
Aditya Atluri 102f173396 Added hipHostRegister for hip with tests and added copyright 2016-03-08 12:57:22 -06:00
Aditya Atluri d9429dd4ec Added hipHostRegister flags 2016-03-07 10:52:40 -06:00
Aditya Atluri 4ed0b1cb1a Added hipHostRegister feature for CUDA backend and its tests 2016-03-07 03:42:50 -06:00
Ben Sander aa03e1264c Enhance HIP trace debug functions.
- Control with HIP_DB=mask (env var).  See src/hip_hcc.cpp for mask
  values:
    #define DB_API    0 /* 0x01 - shortcut to enable HIP_TRACE_API on single switch */
    #define DB_SYNC   1 /* 0x02 - trace synchronization pieces */
    #define DB_MEM    2 /* 0x04 - trace memory allocation / deallocation */
    #define DB_COPY1  3 /* 0x08 - trace memory copy commands. */
    #define DB_SIGNAL 4 /* 0x10 - trace signal pool commands */
- Combine with HIP_TRACE to see debug with API trace.
- Use colors to distinguish different flows of debug.
- Add define COMPILE_DB_TRACE to allow removing all debug at compile-time
2016-03-06 23:50:52 -06:00
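
The relationship between the bit positions and the HIP_DB mask listed in this commit can be sketched as below. The bit positions are copied from the commit message; the helper names (parseDebugMask, debugEnabled, debugMaskFromEnv) are hypothetical, and the use of strtoul with base 0 is an assumption, since the log does not show how the runtime actually parses the variable.

```cpp
#include <cassert>
#include <cstdlib>

// Bit positions as listed in the commit message (mask value = 1 << position).
constexpr unsigned DB_API    = 0;  // 0x01
constexpr unsigned DB_SYNC   = 1;  // 0x02
constexpr unsigned DB_MEM    = 2;  // 0x04
constexpr unsigned DB_COPY1  = 3;  // 0x08
constexpr unsigned DB_SIGNAL = 4;  // 0x10

// Hypothetical helper: parse a mask string such as "0x05" or "5".
// Base 0 lets strtoul accept decimal, hex (0x...) and octal input.
inline unsigned parseDebugMask(const char* text) {
    if (text == nullptr) return 0;  // variable not set: no tracing
    return static_cast<unsigned>(std::strtoul(text, nullptr, 0));
}

// Hypothetical helper: true if the given trace category is enabled.
inline bool debugEnabled(unsigned mask, unsigned bit) {
    assert(bit < 32);
    return (mask & (1u << bit)) != 0;
}

// Hypothetical helper: read the mask from the HIP_DB environment variable.
inline unsigned debugMaskFromEnv() {
    return parseDebugMask(std::getenv("HIP_DB"));
}
```

Under these assumptions, HIP_DB=0x05 would enable the API and memory categories (0x01 | 0x04) and leave the others off.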
Maneesh Gupta 39d5a2c079 Fix typo in nvcc_detail/hip_runtime_api.h 2016-03-07 09:40:15 +05:30
Aditya Atluri 75952029d6 added feature for hipHostGetFlags for CUDA and HIP 2016-03-06 12:17:30 -06:00
Aditya Atluri d3ba2b9782 corrected hipDeviceGetProperties to hipGetDeviceProperties - not docs 2016-03-06 08:31:04 -06:00
Aditya Atluri 3aa764d5eb Added hipHostAlloc with hipHostAllocMapped flag 2016-03-05 15:57:56 -06:00
Aditya Atluri f479531be5 Added hipHostAlloc feature for CUDA 2016-03-05 13:58:56 -06:00
Aditya Atluri a5408ed7b6 v2 Added canHostMapMemory 2016-03-05 13:15:07 -06:00
Aditya Atluri 2ebbdd6ec5 Revert "Added canMapHostMemory feature"
This reverts commit af4edd277f.
2016-03-05 13:08:57 -06:00
Aditya Atluri af4edd277f Added canMapHostMemory feature 2016-03-05 13:06:37 -06:00
Aditya Atluri 4b271ec013 Added canMapHostMemory to hipDeviceProp 2016-03-05 19:30:29 -06:00
pensun 11ca71bd76 resolve conflicts of doc_update 2016-02-27 15:08:45 -06:00
Aditya Avinash Atluri ecadb1623c Merge pull request #4 from AMDComputeLibraries/memtracker
hipGetPointerAttrib behavioral changes
2016-02-27 10:51:23 -06:00
Aditya Avinash Atluri 6d66bd63de Added CUDA support for hipPointerGetAttributes 2016-02-26 12:33:55 -06:00
Ben Sander ff66ef0779 fixes for titan platform 2016-02-26 05:25:30 -06:00
Ben Sander 369e0d7b5b Merge branch 'memtracker' into privatestaging
Conflicts:
	include/nvcc_detail/hip_runtime_api.h
2016-02-26 06:17:05 -06:00
Ben Sander c300ffe458 Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging 2016-02-26 06:15:09 -06:00
Ben Sander 4adab7b7ef Merge branch 'memtracker' into privatestaging
Conflicts:
	src/hip_hcc.cpp
2016-02-25 19:38:46 -06:00
Evgeny Mankov 57e212606d Add attribute hipDeviceAttributeIsMultiGpuBoard for obtaining device property isMultiGpuBoard.
On the HIP path, the property is obtained by calling hsa_iterate_agents and counting the devices of type HSA_DEVICE_TYPE_GPU.

P.S.
On multi-board systems there might be problems detecting which board a GPU is plugged into (not tested).
2016-02-25 23:44:39 +03:00
Ben Sander c2d66a48a7 Fix memcpy for Titan. Add <threads> to common includes 2016-02-22 15:09:23 -06:00
Ben Sander 0a98db4b5f Merge branch 'memtracker' of https://github.com/AMDComputeLibraries/HIP-privatestaging into memtracker 2016-02-22 08:33:47 -06:00
gargrahul a2fbf06129 Update for shared atomics support 2016-02-22 16:21:52 +05:30
Ben Sander d33d806a5b Track last command to a stream.
Passing simple tests.
2016-02-20 11:02:07 -06:00
Evgeny Mankov 833c9e52ad Guard #ifdef USE_ROCR_20 is added for ROCR_20 device properties (memoryClockRate, memoryBusWidth)
It is not defined by default.
To add ROCR_20 support, HIP has to be compiled as follows: make CXX_DEFINES+=-DUSE_ROCR_20
2016-02-19 13:27:03 +03:00
Evgeny Mankov 1c19dbb807 Device property memoryBusWidth implementation.
+ Device property memoryBusWidth is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryBusWidth is added to hipDeviceAttribute_t struct.
+ Tests update.
2016-02-18 18:15:01 +03:00
Evgeny Mankov 5ea8543d2e Device property memoryClockRate implementation.
+ Device property memoryClockRate is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryClockRate is added to hipDeviceAttribute_t struct.
+ Tests update.
+ Rename hipDevAttrConcurrentKernels to hipDeviceAttributeConcurrentKernels.
2016-02-18 17:25:28 +03:00
Evgeny Mankov 2b6fda77ca Attribute hipDevAttrConcurrentKernels for obtaining Device property concurrentKernels is added. 2016-02-18 14:34:18 +03:00
Ben Sander b63470f4cc remove extra : 2016-02-18 03:05:53 -06:00
Ben Sander d653782d9d Remove HIP-local AM tracker (now in HCC) 2016-02-17 21:33:32 -06:00
Ben Sander caef9b5ced Add per-stream pool for hsa_signals. 2016-02-16 01:59:13 -06:00
Ben Sander 38c735fd1d Update before checkin to HCC.
Add support for USE_AM_TRACKER=2 (HCC version).
Add AM_ALLOC, AM_FREE indirection to ease swapping AM implementations.
2016-02-15 21:16:00 -06:00
Ben Sander db3a63360b Move warpSize to header, have shuffles use default warpsize. 2016-02-15 05:41:09 -06:00
Ben Sander 4637e19da4 Update docs, cleanup 2016-02-15 05:40:12 -06:00
Ben Sander 24c1fdb864 Step1 in staging buffer copy.
- use StagingBuffer class for copies.
- refactor g_device to use array rather than vector.
   (keeps pointers from moving).
2016-02-12 18:24:08 -06:00
Ben Sander d7396b5af3 Query tracked memory sizes.
Support more accurate hipMemGetInfo.  Add test to hipPointerAttrib.
2016-02-12 18:24:08 -06:00
Ben Sander de45e2291e Tracker improvements
- add API to add / remove user-pointers from the tracker.
- test for thread-safety with MultiThreadtest_2 - rapid
  insertions/removal.
- add mutex to provide thread-safety.
- rename tracker interface to "memtracker_..." for consistency.
- add am_memtracker_reset, connect to hipDeviceReset.
2016-02-12 18:24:08 -06:00
Ben Sander 4ee2a5229b Create address tracker for am_alloc.
Tracks device where memory is allocated, pinned-host or device, and
more.

Uses memory-range-based lookups, so pointers that exist anywhere in
the range of hostPtr + size will find the associated AmPointerInfo.

The insertions and lookups use a self-balancing binary tree and
should support O(logN) lookup speed.
2016-02-12 18:24:08 -06:00
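
The range-based lookup described in this commit can be sketched with an ordered map keyed by base address. This is an illustration under stated assumptions, not the am_alloc tracker itself: DemoPointerInfo and DemoTracker are hypothetical names, the record fields are guesses based on the description (device, pinned-host or device), ranges are assumed not to overlap, and std::map stands in for the self-balancing tree because the C++ standard guarantees logarithmic lookup for it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical record; the real AmPointerInfo fields are not shown in the log.
struct DemoPointerInfo {
    const void* base;
    std::size_t size;
    int device;
    bool isPinnedHost;
};

class DemoTracker {
public:
    // Register the range [base, base + size). Ranges are assumed not to overlap.
    void insert(const void* base, std::size_t size, int device, bool pinnedHost) {
        assert(size > 0);
        ranges_[reinterpret_cast<std::uintptr_t>(base)] =
            DemoPointerInfo{base, size, device, pinnedHost};
    }

    // Return the record whose range contains ptr, or nullptr if none does.
    const DemoPointerInfo* find(const void* ptr) const {
        const auto addr = reinterpret_cast<std::uintptr_t>(ptr);
        auto it = ranges_.upper_bound(addr);  // first range starting after addr
        if (it == ranges_.begin()) return nullptr;
        --it;                                 // last range starting at or before addr
        if (addr - it->first < it->second.size) return &it->second;
        return nullptr;
    }

    // Remove the range that starts at base; returns true if one was removed.
    bool erase(const void* base) {
        return ranges_.erase(reinterpret_cast<std::uintptr_t>(base)) == 1;
    }

private:
    std::map<std::uintptr_t, DemoPointerInfo> ranges_;  // keyed by base address
};
```

Looking up any interior pointer costs one upper_bound call plus a bounds check, which is where the O(logN) behaviour mentioned in the commit would come from in this sketch.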
Evgeny Mankov ea8f99702d Fix typo: maxThreadsPerMultiProcessor -> MaxSharedMemoryPerMultiprocessor
Device property MaxSharedMemoryPerMultiprocessor set equal to totalGlobalMem (HIP path).
Reason: MaxSharedMemoryPerMultiprocessor should be the same as the group memory size. Group memory will not be paged out, so the physical memory size = total shared memory size = group region size. The NVCC path remains untouched: CUDA's device property MaxSharedMemoryPerMultiprocessor is reported.

hipify is updated as well.
2016-02-12 01:29:20 +03:00
Evgeny Mankov 33f60c300d BDFID (BusID/DeviceID/FunctionID) support.
FunctionID (or DomainID in CUDA) is not supported, because cudaDeviceProp::pciDomainID is not reported by CUDA.
2016-02-11 22:26:01 +03:00
Evgeny Mankov 254da4ec53 Formatting, no functional changes 2016-02-10 17:21:18 +03:00