Commit graph

73 Commits

Author SHA1 Message Date
Ben Sander 4c77ecef9a Deprecate hipMallocHost and hipFreeHost.
These will print compiler warnings if used, so we can weed them out
before removing.

Also add a default flags argument for hipHostAlloc in the C++ function
headers, so you can replace hipMallocHost(&ptr, size) with hipHostAlloc(&ptr, size).


[ROCm/hip commit: cea37c3e91]
2016-03-19 22:53:59 -05:00
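The migration described in the commit above can be sketched in plain C++. This is a minimal illustration and not HIP's actual header: the names `sketchHostAlloc`, `sketchMallocHost`, and `kSketchHostAllocDefault` are hypothetical stand-ins, and `std::malloc` takes the place of a pinned host allocation. It shows how a default flags argument keeps the two-argument call form working while the old entry point produces a compiler warning.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical stand-in for a default allocation flag; not a HIP constant.
constexpr unsigned kSketchHostAllocDefault = 0u;

// New entry point: flags has a default, so callers may pass only (ptr, size).
inline int sketchHostAlloc(void** ptr, std::size_t size,
                           unsigned flags = kSketchHostAllocDefault) {
    (void)flags;               // flags are ignored in this sketch
    *ptr = std::malloc(size);  // assumption: plain malloc instead of pinned memory
    return *ptr ? 0 : 1;       // 0 means success
}

// Old entry point: still functional, but warns at compile time when called.
[[deprecated("use sketchHostAlloc instead")]]
inline int sketchMallocHost(void** ptr, std::size_t size) {
    return sketchHostAlloc(ptr, size);
}
```

With this shape, a call site changes only the function name, because the flags argument can be omitted.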
Ben Sander 0134651419 Refactor copy code.
-Move staging buffer locks inside the staging buffer code.
-Remove dedicated per-device completion_signal + per-device lock -
instead allocate signal from the per-stream pool.   This eliminates
the lock and allows more concurrency.
-remove switch HIP_DISABLE_BIDIR_MEMCPY


[ROCm/hip commit: 0af4d3623f]
2016-03-18 03:02:00 -05:00
Ben Sander 3320975a80 Refactor staging buffer and sync copies.
- refactor staging buffer to operate on hsa* data structures not
  hc::accelerator.
- use hsa_memory_allocate to allocate staging buffers rather than
  am_alloc.
- Refactor device reset with single member function.  Don't reallocate
  staging buffers on reset.
- Properly track dependencies based on command type.  Add new deps for
  H2D and D2D rather than overloading H2D.


[ROCm/hip commit: 7d500599fa]
2016-03-17 20:09:10 -05:00
Ben Sander fc27c61c58 Refactor to isolate staging buffer code.
[ROCm/hip commit: e7586adb33]
2016-03-17 00:20:56 -05:00
Ben Sander 0ae7bc7e14 Start separation of staging_buffer.cpp code.
Still #include staging_buffer.cpp into hip_hcc.cpp.
Directed tests compile hip_hcc to static library and use the library.


[ROCm/hip commit: 28ee7aff71]
2016-03-16 22:26:49 -05:00
Ben Sander 0abf5db89e Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging
Conflicts:
	src/hip_hcc.cpp
	tests/src/CMakeLists.txt


[ROCm/hip commit: e1617b9604]
2016-03-14 15:01:26 -05:00
Ben Sander 5951d581d9 enable DB, comments
[ROCm/hip commit: 1a27e5134e]
2016-03-14 14:40:41 -05:00
Ben Sander 4900ebb39f Improve error reporting.
use throw with error class.
fix bug when memcpyDefault resolved to D2D copy.


[ROCm/hip commit: 250739666d]
2016-03-12 04:02:04 -06:00
Aditya Atluri db1ce3ba84 Added hipHostRegister for hip with tests and added copyright
[ROCm/hip commit: 102f173396]
2016-03-08 12:57:22 -06:00
Aditya Atluri c05f4abd71 Added hipHostRegister flags
[ROCm/hip commit: d9429dd4ec]
2016-03-07 10:52:40 -06:00
Aditya Atluri 02760925a9 Added hipHostRegister feature for CUDA backend and its tests
[ROCm/hip commit: 4ed0b1cb1a]
2016-03-07 03:42:50 -06:00
Ben Sander 82116d905d Enhance HIP trace debug functions.
- Control with HIP_DB=mask (env var).  See src/hip_hcc.cpp for mask
  values:
    #define DB_API    0 /* 0x01 - shortcut to enable HIP_TRACE_API on single switch */
    #define DB_SYNC   1 /* 0x02 - trace synchronization pieces */
    #define DB_MEM    2 /* 0x04 - trace memory allocation / deallocation */
    #define DB_COPY1  3 /* 0x08 - trace memory copy commands. */
    #define DB_SIGNAL 4 /* 0x10 - trace signal pool commands */
- Combine with HIP_TRACE to see debug with API trace.
- Use colors to distinguish different flows of debug.
- Add define COMPILE_DB_TRACE to allow removing all debug at compile-time


[ROCm/hip commit: aa03e1264c]
2016-03-06 23:50:52 -06:00
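The mask scheme listed in the commit above (bit position N enables value 1 << N) can be sketched as follows. The bit positions are taken from the commit message; the helper names `parseDbMask` and `dbEnabled` are hypothetical and are not claimed to match anything in src/hip_hcc.cpp.

```cpp
#include <cstdlib>

// Bit positions copied from the commit message above.
enum DbBit { DB_API = 0, DB_SYNC = 1, DB_MEM = 2, DB_COPY1 = 3, DB_SIGNAL = 4 };

// Parse a mask string such as "0x0a" or "10"; a null pointer means no tracing.
// Base 0 lets strtoul accept decimal, hex (0x...), and octal (0...) input.
inline unsigned parseDbMask(const char* text) {
    return text ? static_cast<unsigned>(std::strtoul(text, nullptr, 0)) : 0u;
}

// True when the given trace category is enabled in the mask.
inline bool dbEnabled(unsigned mask, DbBit bit) {
    return (mask & (1u << bit)) != 0;
}
```

A caller would typically read the mask once, for example with `parseDbMask(std::getenv("HIP_DB"))`, and then test individual categories before printing. A value of 0x0a enables DB_SYNC (0x02) and DB_COPY1 (0x08).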
Maneesh Gupta 2de8a1a4a6 Fix typo in nvcc_detail/hip_runtime_api.h
[ROCm/hip commit: 39d5a2c079]
2016-03-07 09:40:15 +05:30
Aditya Atluri 91dbc3114d added feature for hipHostGetFlags for CUDA and HIP
[ROCm/hip commit: 75952029d6]
2016-03-06 12:17:30 -06:00
Aditya Atluri f1b8758919 corrected hipDeviceGetProperties to hipGetDeviceProperties - not docs
[ROCm/hip commit: d3ba2b9782]
2016-03-06 08:31:04 -06:00
Aditya Atluri 3c91a6d0a7 Added hipHostAlloc with hipHostAllocMapped flag
[ROCm/hip commit: 3aa764d5eb]
2016-03-05 15:57:56 -06:00
Aditya Atluri 8e0fc269d7 Added hipHostAlloc feature for CUDA
[ROCm/hip commit: f479531be5]
2016-03-05 13:58:56 -06:00
Aditya Atluri 5e9d9cbabf v2 Added canHostMapMemory
[ROCm/hip commit: a5408ed7b6]
2016-03-05 13:15:07 -06:00
Aditya Atluri 6bfbe0483a Revert "Added canMapHostMemory feature"
This reverts commit 8b585536ef.


[ROCm/hip commit: 2ebbdd6ec5]
2016-03-05 13:08:57 -06:00
Aditya Atluri 8b585536ef Added canMapHostMemory feature
[ROCm/hip commit: af4edd277f]
2016-03-05 13:06:37 -06:00
Aditya Atluri 29c423a22b Added canMapHostMemory to hipDeviceProp
[ROCm/hip commit: 4b271ec013]
2016-03-05 19:30:29 -06:00
pensun cb19da0aa7 resolve conflicts of doc_update
[ROCm/hip commit: 11ca71bd76]
2016-02-27 15:08:45 -06:00
Aditya Avinash Atluri f2dfb87abf Merge pull request #4 from AMDComputeLibraries/memtracker
hipGetPointerAttrib behavioral changes

[ROCm/hip commit: ecadb1623c]
2016-02-27 10:51:23 -06:00
Aditya Avinash Atluri ed96744f76 Added CUDA support for hipPointerGetAttributes
[ROCm/hip commit: 6d66bd63de]
2016-02-26 12:33:55 -06:00
Ben Sander 1ac07d2b87 fixes for titan platform
[ROCm/hip commit: ff66ef0779]
2016-02-26 05:25:30 -06:00
Ben Sander 193dbe4632 Merge branch 'memtracker' into privatestaging
Conflicts:
	include/nvcc_detail/hip_runtime_api.h


[ROCm/hip commit: 369e0d7b5b]
2016-02-26 06:17:05 -06:00
Ben Sander 8a2bcf2da3 Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging
[ROCm/hip commit: c300ffe458]
2016-02-26 06:15:09 -06:00
Ben Sander 5ca4914e0e Merge branch 'memtracker' into privatestaging
Conflicts:
	src/hip_hcc.cpp


[ROCm/hip commit: 4adab7b7ef]
2016-02-25 19:38:46 -06:00
Evgeny Mankov 82900a1888 Attribute hipDeviceAttributeIsMultiGpuBoard for obtaining Device property isMultiGpuBoard is added.
On the HIP path, the property is obtained through hsa_iterate_agents by counting the devices of type HSA_DEVICE_TYPE_GPU.

P.S.
On multi-board systems there might be problems detecting which board a GPU is plugged into (not tested).


[ROCm/hip commit: 57e212606d]
2016-02-25 23:44:39 +03:00
Ben Sander 1d027bcaea Fix memcpy for Titan. Add <threads> to common includes
[ROCm/hip commit: c2d66a48a7]
2016-02-22 15:09:23 -06:00
Ben Sander 23b257bca4 Merge branch 'memtracker' of https://github.com/AMDComputeLibraries/HIP-privatestaging into memtracker
[ROCm/hip commit: 0a98db4b5f]
2016-02-22 08:33:47 -06:00
gargrahul ccd1ed0a97 Update for shared atomics support
[ROCm/hip commit: a2fbf06129]
2016-02-22 16:21:52 +05:30
Ben Sander ebf2700936 Track last command to a stream.
Passing simple tests.


[ROCm/hip commit: d33d806a5b]
2016-02-20 11:02:07 -06:00
Evgeny Mankov c3a600c63b Guard #ifdef USE_ROCR_20 is added for ROCR_20 device properties (memoryClockRate, memoryBusWidth)
It isn't defined by default.
To add ROCR_20 support, HIP has to be compiled as follows: make CXX_DEFINES+=-DUSE_ROCR_20


[ROCm/hip commit: 833c9e52ad]
2016-02-19 13:27:03 +03:00
Evgeny Mankov 4fcd9f2542 Device property memoryBusWidth implementation.
+ Device property memoryBusWidth is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryBusWidth is added to hipDeviceAttribute_t struct.
+ Tests update.


[ROCm/hip commit: 1c19dbb807]
2016-02-18 18:15:01 +03:00
Evgeny Mankov a0cc7134e3 Device property memoryClockRate implementation.
+ Device property memoryClockRate is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryClockRate is added to hipDeviceAttribute_t struct.
+ Tests update.
+ Rename hipDevAttrConcurrentKernels to hipDeviceAttributeConcurrentKernels.


[ROCm/hip commit: 5ea8543d2e]
2016-02-18 17:25:28 +03:00
Evgeny Mankov 8c1a0d1924 Attribute hipDevAttrConcurrentKernels for obtaining Device property concurrentKernels is added.
[ROCm/hip commit: 2b6fda77ca]
2016-02-18 14:34:18 +03:00
Ben Sander d064a446d0 remove extra :
[ROCm/hip commit: b63470f4cc]
2016-02-18 03:05:53 -06:00
Ben Sander a2d8f9d98e Remove HIP-local AM tracker (now in HCC)
[ROCm/hip commit: d653782d9d]
2016-02-17 21:33:32 -06:00
Ben Sander 512163b889 Add per-stream pool for hsa_signals.
[ROCm/hip commit: caef9b5ced]
2016-02-16 01:59:13 -06:00
Ben Sander 9ab5b92173 Update before checkin to HCC.
Add support for USE_AM_TRACKER=2 (HCC version).
Add AM_ALLOC, AM_FREE indirection to ease swapping AM implementations.


[ROCm/hip commit: 38c735fd1d]
2016-02-15 21:16:00 -06:00
Ben Sander d58eab1706 Move warpSize to header, have shuffles use default warpsize.
[ROCm/hip commit: db3a63360b]
2016-02-15 05:41:09 -06:00
Ben Sander b97e430921 Update docs, cleanup
[ROCm/hip commit: 4637e19da4]
2016-02-15 05:40:12 -06:00
Ben Sander c441d5ec29 Step1 in staging buffer copy.
- use StagingBuffer class for copies.
- refactor g_device to use array rather than vector.
   (keeps pointers from moving).


[ROCm/hip commit: 24c1fdb864]
2016-02-12 18:24:08 -06:00
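The remark "keeps pointers from moving" in the commit above refers to a general C++ property: a `std::vector` may reallocate when it grows, which invalidates pointers to its elements, while fixed-size storage never relocates them. A minimal sketch of the fixed-size variant follows; `SketchDevice`, `kMaxDevices`, `deviceTable`, and `registerDevice` are hypothetical names and do not reflect the real g_device layout.

```cpp
#include <array>
#include <cstddef>

struct SketchDevice { int id = -1; };   // hypothetical device record

constexpr std::size_t kMaxDevices = 8;  // assumed upper bound, for illustration only

// Fixed-size storage: elements never relocate, so pointers to them stay valid.
inline std::array<SketchDevice, kMaxDevices>& deviceTable() {
    static std::array<SketchDevice, kMaxDevices> table{};
    return table;
}

// Fill one slot and hand out a pointer that remains valid for the program's lifetime.
inline SketchDevice* registerDevice(std::size_t slot, int id) {
    SketchDevice* d = &deviceTable().at(slot);  // at() throws on an out-of-range slot
    d->id = id;
    return d;
}
```

The trade-off is a compile-time cap on the number of devices in exchange for stable addresses.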
Ben Sander b9dc0e9497 Query tracked memory sizes.
Support more accurate hipMemGetInfo.  Add test to hipPointerAttrib.


[ROCm/hip commit: d7396b5af3]
2016-02-12 18:24:08 -06:00
Ben Sander 680b600b4a Tracker improvements
- add API to add / remove user-pointers from the tracker.
- test for thread-safety with MultiThreadtest_2 - rapid
  insertions/removal.
- add mutex to provide thread-safety.
- rename tracker interface to "memtracker_..." for consistency.
- add am_memtracker_reset, connect to hipDeviceReset.


[ROCm/hip commit: de45e2291e]
2016-02-12 18:24:08 -06:00
Ben Sander d4a90f8afd Create address tracker for am_alloc.
Tracks device where memory is allocated, pinned-host or device, and
more.

Uses memory-range-based lookups - so pointers that exist anywhere in
the range of hostPtr + size will find the associated AmPointerInfo.

The insertions and lookups use a self-balancing binary tree and
should support O(logN) lookup speed.


[ROCm/hip commit: 4ee2a5229b]
2016-02-12 18:24:08 -06:00
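The range-based lookup described in the commit above can be sketched with `std::map`, which is ordered and is typically implemented as a red-black tree, so insertions and lookups take O(log N). This is an illustration under assumptions, not the tracker's real code: `SketchTracker` and `SketchPointerInfo` are hypothetical, the fields of the real AmPointerInfo are not shown in the source, and tracked ranges are assumed not to overlap.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical record; stands in for AmPointerInfo, whose fields are not shown here.
struct SketchPointerInfo {
    std::size_t size;
    int device;
    bool isPinnedHost;
};

class SketchTracker {
public:
    // Track the range [base, base + size).
    void insert(const void* base, std::size_t size, int device, bool pinned) {
        ranges_[reinterpret_cast<std::uintptr_t>(base)] = {size, device, pinned};
    }

    // Return the info for the range containing ptr, or nullptr if untracked.
    const SketchPointerInfo* lookup(const void* ptr) const {
        const auto addr = reinterpret_cast<std::uintptr_t>(ptr);
        auto it = ranges_.upper_bound(addr);  // first range starting after addr
        if (it == ranges_.begin()) return nullptr;
        --it;                                 // last range starting at or before addr
        return (addr - it->first < it->second.size) ? &it->second : nullptr;
    }

private:
    std::map<std::uintptr_t, SketchPointerInfo> ranges_;  // keyed by base address
};
```

Keying the map by base address means an interior pointer is resolved by finding the closest base at or below it and then checking that the offset falls inside the recorded size.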
Evgeny Mankov fcd154097f Fix typo: maxThreadsPerMultiProcessor -> MaxSharedMemoryPerMultiprocessor
Device property MaxSharedMemoryPerMultiprocessor set equal to totalGlobalMem (HIP path).
Reason: MaxSharedMemoryPerMultiprocessor should be the same as the group memory size. Group memory will not be paged out, so the physical memory size = total shared memory size = group region size. The NVCC path remains untouched: CUDA's device property MaxSharedMemoryPerMultiprocessor is reported.

hipify is updated as well.


[ROCm/hip commit: ea8f99702d]
2016-02-12 01:29:20 +03:00
Evgeny Mankov 4eade0ce83 BDFID (BusID/DeviceID/FunctionID) support.
FunctionID (or DomainID in CUDA) is not supported, because cudaDeviceProp::pciDomainID is not reported by CUDA.


[ROCm/hip commit: 33f60c300d]
2016-02-11 22:26:01 +03:00
Evgeny Mankov 3a032ff317 Formatting, no functional changes
[ROCm/hip commit: 254da4ec53]
2016-02-10 17:21:18 +03:00