Aditya Atluri
3f5eb20cf0
Revert "fix nvcc for hipHostMalloc* flags."
...
This reverts commit b6962826eb .
2016-03-21 10:36:14 -05:00
Aditya Atluri
287ba34aca
Revert "fixed memory free apis"
...
This reverts commit 96a1899df7 .
2016-03-21 10:36:11 -05:00
Aditya Atluri
caa80af31b
Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging
2016-03-21 10:34:08 -05:00
Aditya Atluri
96a1899df7
fixed memory free apis
2016-03-21 10:32:30 -05:00
Ben Sander
ab910efb96
hipHostRegister and hipHostMalloc refactor.
...
Note: the name is hipHostMalloc (not hipHostAlloc or hipMallocHost).
- The hipHost* prefix is used for all HIP APIs dealing with host memory
(including hipHostMalloc, hipHostFree, hipHostRegister,
hipHostUnregister, hipHostGetFlags, hipHostGetDevicePointer).
- hipHostMalloc is consistent with "hipMalloc" for allocating device
memory. Enumerations hipHostMalloc* are also used as the optional
flags parameter to hipHostMalloc.
2016-03-22 02:30:10 -05:00
Ben Sander
b6962826eb
fix nvcc for hipHostMalloc* flags.
2016-03-21 09:33:46 -05:00
Ben Sander
deb38625ca
Implement hipHostFree on HCC path
2016-03-19 23:25:11 -05:00
Ben Sander
9941ba0bc6
fix nvcc compiler
...
- MallocHost and FreeHost deprecation.
- Change tests to call new hipHost* equivs.
- Add missing StreamSynchronize.
2016-03-19 04:20:15 -05:00
Ben Sander
cea37c3e91
Deprecate hipMallocHost and hipFreeHost.
...
These will print compiler warnings if used, so we can weed them out
before removing.
Also add a default flags argument for hipHostAlloc in the C++ function
headers, so you can replace hipMallocHost(&ptr, size) with hipHostAlloc(&ptr, size).
2016-03-19 22:53:59 -05:00
Ben Sander
0af4d3623f
Refactor copy code.
...
- Move staging buffer locks inside the staging buffer code.
- Remove dedicated per-device completion_signal + per-device lock;
instead allocate the signal from the per-stream pool. This eliminates
the lock and allows more concurrency.
- Remove switch HIP_DISABLE_BIDIR_MEMCPY.
2016-03-18 03:02:00 -05:00
Ben Sander
7d500599fa
Refactor staging buffer and sync copies.
...
- refactor staging buffer to operate on hsa* data structures not
hc::accelerator.
- use hsa_memory_allocate to allocate staging buffers rather than
am_alloc.
- Refactor device reset with single member function. Don't reallocate
staging buffers on reset.
- Properly track dependencies based on command type. Add new deps for
H2D and D2D rather than overloading H2D.
2016-03-17 20:09:10 -05:00
Ben Sander
e7586adb33
Refactor to isolate staging buffer code.
2016-03-17 00:20:56 -05:00
Ben Sander
28ee7aff71
Start separation of staging_buffer.cpp code.
...
Still #include staging_buffer.cpp into hip_hcc.cpp.
Directed tests compile hip_hcc to static library and use the library.
2016-03-16 22:26:49 -05:00
Ben Sander
e1617b9604
Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging
...
Conflicts:
src/hip_hcc.cpp
tests/src/CMakeLists.txt
2016-03-14 15:01:26 -05:00
Ben Sander
1a27e5134e
enable DB, comments
2016-03-14 14:40:41 -05:00
Ben Sander
250739666d
Improve error reporting.
...
use throw with error class.
fix bug when memcpyDefault resolved to D2D copy.
2016-03-12 04:02:04 -06:00
Aditya Atluri
102f173396
Added hipHostRegister for hip with tests and added copyright
2016-03-08 12:57:22 -06:00
Aditya Atluri
d9429dd4ec
Added hipHostRegister flags
2016-03-07 10:52:40 -06:00
Aditya Atluri
4ed0b1cb1a
Added hipHostRegister feature for CUDA backend and its tests
2016-03-07 03:42:50 -06:00
Ben Sander
aa03e1264c
Enhance HIP trace debug functions.
...
- Control with HIP_DB=mask (env var). See src/hip_hcc.cpp for mask
values:
#define DB_API 0 /* 0x01 - shortcut to enable HIP_TRACE_API on single switch */
#define DB_SYNC 1 /* 0x02 - trace synchronization pieces */
#define DB_MEM 2 /* 0x04 - trace memory allocation / deallocation */
#define DB_COPY1 3 /* 0x08 - trace memory copy commands */
#define DB_SIGNAL 4 /* 0x10 - trace signal pool commands */
- Combine with HIP_TRACE to see debug with API trace.
- Use colors to distinguish different flows of debug.
- Add define COMPILE_DB_TRACE to allow removing all debug at compile-time
2016-03-06 23:50:52 -06:00
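The HIP_DB mask values listed for commit aa03e1264c pair each flag's bit position with its hex value (bit n corresponds to 1 << n). The sketch below only illustrates that arithmetic. The helper names `dbEnabled` and `parseDbMask` are hypothetical and are not taken from src/hip_hcc.cpp, and accepting both hex and decimal text for the mask is an assumption, not documented HIP behavior.

```cpp
#include <cstdlib>

// Bit positions copied from the commit message; comments give the mask value.
#define DB_API    0 /* 0x01 */
#define DB_SYNC   1 /* 0x02 */
#define DB_MEM    2 /* 0x04 */
#define DB_COPY1  3 /* 0x08 */
#define DB_SIGNAL 4 /* 0x10 */

// Hypothetical helper: true if the bit for `flag` is set in `mask`.
static bool dbEnabled(unsigned mask, int flag) {
    return (mask & (1u << flag)) != 0;
}

// Hypothetical helper: parse a mask string such as the value of HIP_DB.
// Base 0 lets strtoul accept "0x0c" as well as "12" (an assumption here).
static unsigned parseDbMask(const char* text) {
    return text ? static_cast<unsigned>(std::strtoul(text, nullptr, 0)) : 0u;
}
```

Under these assumptions, a mask of 0x0c (0x04 | 0x08) would select DB_MEM and DB_COPY1 and leave DB_API, DB_SYNC and DB_SIGNAL off.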
Maneesh Gupta
39d5a2c079
Fix typo in nvcc_detail/hip_runtime_api.h
2016-03-07 09:40:15 +05:30
Aditya Atluri
75952029d6
added feature for hipHostGetFlags for CUDA and HIP
2016-03-06 12:17:30 -06:00
Aditya Atluri
d3ba2b9782
corrected hipDeviceGetProperties to hipGetDeviceProperties - not docs
2016-03-06 08:31:04 -06:00
Aditya Atluri
3aa764d5eb
Added hipHostAlloc with hipHostAllocMapped flag
2016-03-05 15:57:56 -06:00
Aditya Atluri
f479531be5
Added hipHostAlloc feature for CUDA
2016-03-05 13:58:56 -06:00
Aditya Atluri
a5408ed7b6
v2 Added canHostMapMemory
2016-03-05 13:15:07 -06:00
Aditya Atluri
2ebbdd6ec5
Revert "Added canMapHostMemory feature"
...
This reverts commit af4edd277f .
2016-03-05 13:08:57 -06:00
Aditya Atluri
af4edd277f
Added canMapHostMemory feature
2016-03-05 13:06:37 -06:00
Aditya Atluri
4b271ec013
Added canMapHostMemory to hipDeviceProp
2016-03-05 19:30:29 -06:00
pensun
11ca71bd76
resolve conflicts of doc_update
2016-02-27 15:08:45 -06:00
Aditya Avinash Atluri
ecadb1623c
Merge pull request #4 from AMDComputeLibraries/memtracker
...
hipGetPointerAttrib behavioral changes
2016-02-27 10:51:23 -06:00
Aditya Avinash Atluri
6d66bd63de
Added CUDA support for hipPointerGetAttributes
2016-02-26 12:33:55 -06:00
Ben Sander
ff66ef0779
fixes for titan platform
2016-02-26 05:25:30 -06:00
Ben Sander
369e0d7b5b
Merge branch 'memtracker' into privatestaging
...
Conflicts:
include/nvcc_detail/hip_runtime_api.h
2016-02-26 06:17:05 -06:00
Ben Sander
c300ffe458
Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging
2016-02-26 06:15:09 -06:00
Ben Sander
4adab7b7ef
Merge branch 'memtracker' into privatestaging
...
Conflicts:
src/hip_hcc.cpp
2016-02-25 19:38:46 -06:00
Evgeny Mankov
57e212606d
Attribute hipDeviceAttributeIsMultiGpuBoard for obtaining Device property isMultiGpuBoard is added.
...
On the HIP path, the property is obtained by calling hsa_iterate_agents and counting the devices of type HSA_DEVICE_TYPE_GPU.
P.S.
On multi-board systems there may be problems detecting which board a GPU is plugged into (not tested).
2016-02-25 23:44:39 +03:00
Ben Sander
c2d66a48a7
Fix memcpy for Titan. Add <threads> to common includes
2016-02-22 15:09:23 -06:00
Ben Sander
0a98db4b5f
Merge branch 'memtracker' of https://github.com/AMDComputeLibraries/HIP-privatestaging into memtracker
2016-02-22 08:33:47 -06:00
gargrahul
a2fbf06129
Update for shared atomics support
2016-02-22 16:21:52 +05:30
Ben Sander
d33d806a5b
Track last command to a stream.
...
Passing simple tests.
2016-02-20 11:02:07 -06:00
Evgeny Mankov
833c9e52ad
Guard #ifdef USE_ROCR_20 is added for ROCR_20 device properties (memoryClockRate, memoryBusWidth)
...
USE_ROCR_20 is not defined by default.
To add ROCR_20 support, HIP has to be compiled as follows: make CXX_DEFINES+=-DUSE_ROCR_20
2016-02-19 13:27:03 +03:00
Evgeny Mankov
1c19dbb807
Device property memoryBusWidth implementation.
...
+ Device property memoryBusWidth is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryBusWidth is added to hipDeviceAttribute_t struct.
+ Tests update.
2016-02-18 18:15:01 +03:00
Evgeny Mankov
5ea8543d2e
Device property memoryClockRate implementation.
...
+ Device property memoryClockRate is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryClockRate is added to hipDeviceAttribute_t struct.
+ Tests update.
+ Rename hipDevAttrConcurrentKernels to hipDeviceAttributeConcurrentKernels.
2016-02-18 17:25:28 +03:00
Evgeny Mankov
2b6fda77ca
Attribute hipDevAttrConcurrentKernels for obtaining Device property concurrentKernels is added.
2016-02-18 14:34:18 +03:00
Ben Sander
b63470f4cc
remove extra :
2016-02-18 03:05:53 -06:00
Ben Sander
d653782d9d
Remove HIP-local AM tracker (now in HCC)
2016-02-17 21:33:32 -06:00
Ben Sander
caef9b5ced
Add per-stream pool for hsa_signals.
2016-02-16 01:59:13 -06:00
Ben Sander
38c735fd1d
Update before checkin to HCC.
...
Add support for USE_AM_TRACKER=2 (HCC version).
Add AM_ALLOC, AM_FREE indirection to ease swapping AM implementations.
2016-02-15 21:16:00 -06:00
Ben Sander
db3a63360b
Move warpSize to header, have shuffles use default warpsize.
2016-02-15 05:41:09 -06:00