Aditya Avinash Atluri
40eefc1cde
Update hip_hcc.cpp
2016-03-03 13:59:43 -06:00
Aditya Avinash Atluri
b6e34a44b0
Fix output of hipPointerGetAttributes
...
The output of hipPointerGetAttributes is fixed to match its CUDA counterpart.
2016-03-03 13:58:18 -06:00
Aditya Atluri
ce7ae41d42
Initialize hip when single kernel is called
2016-03-02 08:08:45 -06:00
Aditya Avinash Atluri
180bc32db0
H2H Async memcpy fix
...
With this change, the CPU memcpy waits until all commands in the current stream are done.
Note that it waits only on the current stream, not on other streams.
2016-02-29 12:49:50 -06:00
Ben Sander
ba9ad6be80
Copy dependency bug fixes and test modes.
...
Add dependency for host-to-host copy.
Add debug mode for HIP_DISABLE_HW_COPY_DEP and
HIP_DISABLE_HW_KERNEL_DEP - setting these to -1 now ignores
all dependencies.
2016-02-28 21:19:49 -06:00
Ben Sander
af22d056e0
touchup
2016-02-28 21:08:53 -06:00
pensun
39b44cb484
Test cases for HIP_VISIBLE_DEVICES/CUDA_VISIBLE_DEVICES.
...
hipEnvVar is the base test case, to be called by hipEnvVarDriver
at run time.
The test case covers normal use of the environment variable,
invalid values/sequences, and the use of CUDA_VISIBLE_DEVICES as an
alternative.
2016-02-27 14:48:00 -06:00
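For readers unfamiliar with the variable exercised by the tests above, the following is a minimal sketch of how a comma-separated device list could be interpreted. It is not the HIP runtime's code: parseVisibleDevices and visibleDevicesEnv are hypothetical helpers, the stop-at-first-invalid-entry rule is borrowed from the documented CUDA_VISIBLE_DEVICES behavior and is only assumed to apply to HIP, and the precedence of HIP_VISIBLE_DEVICES over CUDA_VISIBLE_DEVICES is likewise an assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper, not taken from the HIP sources.
// Parses a list such as "1,5,2,3" and returns the visible device indices
// in the order listed. Assumption (CUDA-like semantics): parsing stops at
// the first entry that is not a valid index in [0, deviceCount).
std::vector<int> parseVisibleDevices(const std::string& value, int deviceCount) {
    assert(deviceCount >= 0);
    std::vector<int> visible;
    std::size_t pos = 0;
    while (pos <= value.size()) {
        std::size_t comma = value.find(',', pos);
        if (comma == std::string::npos) comma = value.size();
        const std::string token = value.substr(pos, comma - pos);
        pos = comma + 1;

        // Empty, non-numeric, or overly long tokens end the list.
        if (token.empty() || token.size() > 6 ||
            token.find_first_not_of("0123456789") != std::string::npos) {
            break;
        }
        const int id = std::atoi(token.c_str());
        if (id >= deviceCount) break;  // index out of range ends the list
        visible.push_back(id);
    }
    return visible;
}

// Hypothetical lookup order (an assumption): prefer HIP_VISIBLE_DEVICES and
// fall back to CUDA_VISIBLE_DEVICES. Returns nullptr if neither is set.
const char* visibleDevicesEnv() {
    if (const char* v = std::getenv("HIP_VISIBLE_DEVICES")) return v;
    return std::getenv("CUDA_VISIBLE_DEVICES");
}
```

Under these assumptions, on a system with four devices the value "1,5,2,3" would expose only device 1, because 5 is the first invalid index.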
pensun
1f606261c1
improve the HIP_VISIBLE_DEVICES implementation
2016-02-27 14:14:08 -06:00
pensun
07e56d4666
Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging
2016-02-27 04:25:28 -06:00
Aditya Avinash Atluri
ecadb1623c
Merge pull request #4 from AMDComputeLibraries/memtracker
...
hipGetPointerAttrib behavioral changes
2016-02-27 10:51:23 -06:00
Ben Sander
ea09557e1b
disable rocrv2, properly
2016-02-27 03:31:30 -06:00
Aditya Avinash Atluri
66aa7f2f8a
Corrected hipPointerGetAttribute
...
Made hipPointerGetAttribute work the same as cudaPointerGetAttribute for HCC
2016-02-26 18:50:40 -06:00
pensun
57f60b34fb
resolve conflicts
2016-02-26 09:57:40 -06:00
pensun
ee7ac16396
fix compiling error
2016-02-26 09:50:00 -06:00
Ben Sander
ff66ef0779
fixes for titan platform
2016-02-26 05:25:30 -06:00
Ben Sander
6e0ccdfb95
Disable ROCR_V2
2016-02-26 23:34:45 -06:00
Ben Sander
369e0d7b5b
Merge branch 'memtracker' into privatestaging
...
Conflicts:
include/nvcc_detail/hip_runtime_api.h
2016-02-26 06:17:05 -06:00
Ben Sander
c300ffe458
Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging
2016-02-26 06:15:09 -06:00
Ben Sander
d319299ddb
Merge branch 'memtracker' of https://github.com/AMDComputeLibraries/HIP-privatestaging into memtracker
...
Conflicts:
tests/src/hipMemcpy.cpp
2016-02-25 23:22:51 -06:00
Ben Sander
4adab7b7ef
Merge branch 'memtracker' into privatestaging
...
Conflicts:
src/hip_hcc.cpp
2016-02-25 19:38:46 -06:00
Ben Sander
8b64c0dc62
Improve memory copy and commands switching
...
- Add chicken bits to use host-side dependency management.
- Add optional PinInPlace path for unpinned copies
- Synchronize before pinned memcpy path.
- Add mutex to protect two threads launching to same stream.
2016-02-25 19:19:49 -06:00
Evgeny Mankov
57e212606d
Attribute hipDeviceAttributeIsMultiGpuBoard for obtaining the device property isMultiGpuBoard is added.
...
On the HIP path, the property is obtained through hsa_iterate_agents by counting the devices of type HSA_DEVICE_TYPE_GPU.
P.S.
On multi-board systems there might be problems detecting which board a GPU is plugged into (not tested).
2016-02-25 23:44:39 +03:00
Ben Sander
7090f5c3f9
Add tests for multi-threaded streams
2016-02-23 12:08:22 -06:00
Ben Sander
3886d494f4
Sync review.
...
- add calls to ihipInit missing from some routines.
- sync before draining a stream.
2016-02-23 04:07:11 -06:00
Ben Sander
549b18ce77
Improve async copy implementation.
...
- Add device-side signal waits when transitioning between command classes
(Kernel, H2D copy, D2H copy).
- Support waiting in staged memory copies as well.
- Add several chicken bits to control implementation:
  - HIP_DISABLE_ENQ_BARRIER
  - HIP_DISABLE_BIDIR_MEMCPY
  - HIP_ONESHOT_COPY_DEP
- Refactor signal pool to support efficient deallocation based on
  signal sequence number.
- Deallocate copy signals on eventSynchronize.
- Improve copy tests, add pingpong.
2016-02-22 23:15:24 -06:00
Ben Sander
0a98db4b5f
Merge branch 'memtracker' of https://github.com/AMDComputeLibraries/HIP-privatestaging into memtracker
2016-02-22 08:33:47 -06:00
gargrahul
a2fbf06129
Update for shared atomics support
2016-02-22 16:21:52 +05:30
Ben Sander
d33d806a5b
Track last command to a stream.
...
Passing simple tests.
2016-02-20 11:02:07 -06:00
Evgeny Mankov
833c9e52ad
Guard #ifdef USE_ROCR_20 is added for ROCR_20 device properties (memoryClockRate, memoryBusWidth)
...
It is not defined by default.
To add ROCR_20 support, HIP has to be compiled as follows: make CXX_DEFINES+=-DUSE_ROCR_20
2016-02-19 13:27:03 +03:00
Evgeny Mankov
fbdeee39cd
Formatting, no functional changes.
2016-02-18 18:54:19 +03:00
Evgeny Mankov
1c19dbb807
Device property memoryBusWidth implementation.
...
+ Device property memoryBusWidth is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryBusWidth is added to hipDeviceAttribute_t struct.
+ Tests update.
2016-02-18 18:15:01 +03:00
Evgeny Mankov
5ea8543d2e
Device property memoryClockRate implementation.
...
+ Device property memoryClockRate is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryClockRate is added to hipDeviceAttribute_t struct.
+ Tests update.
+ Rename hipDevAttrConcurrentKernels to hipDeviceAttributeConcurrentKernels.
2016-02-18 17:25:28 +03:00
Evgeny Mankov
2b6fda77ca
Attribute hipDevAttrConcurrentKernels for obtaining the device property concurrentKernels is added.
2016-02-18 14:34:18 +03:00
Ben Sander
c6f8883b0d
Enable Tracker and ROCR by default, verify with HCC
2016-02-17 23:03:37 -06:00
Ben Sander
d653782d9d
Remove HIP-local AM tracker (now in HCC)
2016-02-17 21:33:32 -06:00
Ben Sander
44f40e171a
USE_AM_TRACKER=0 works
2016-02-17 21:23:36 -06:00
pensun
8aa4bfce57
1. Bug fix
...
2. Passed initial tests on different sets of
HIP_VISIBLE_DEVICES: (0),(1),(0,1),(1,2),(2,3),(1,2,3),(2,3,4),(1,5,2,3)
and achieved the expected choice of GPU devices at run time.
3. Passed HIP test suite.
2016-02-17 09:32:50 -06:00
pensun
c1e120fb1b
Implementation of HIP_VISIBLE_DEVICES in runtime
2016-02-17 06:59:18 -06:00
Ben Sander
59379ffb44
more work on async copies
2016-02-17 00:59:12 -06:00
pensun
060439b6ab
modify to add removal of invalid device numbers
2016-02-16 10:00:05 -06:00
pensun
d40cbef2af
Implement reading HIP_VISIBLE_DEVICES into an internal global variable
2016-02-16 07:39:04 -06:00
Ben Sander
caef9b5ced
Add per-stream pool for hsa_signals.
2016-02-16 01:59:13 -06:00
Ben Sander
38c735fd1d
Update before checkin to HCC.
...
Add support for USE_AM_TRACKER=2 (HCC version).
Add AM_ALLOC, AM_FREE indirection to ease swapping AM implementations.
2016-02-15 21:16:00 -06:00
Ben Sander
db3a63360b
Move warpSize to header, have shuffles use default warpsize.
2016-02-15 05:41:09 -06:00
Ben Sander
6420655dc8
Add multi-threading synchronization on staging buffers and signals.
...
Also pre-allocate a couple signals for copies.
2016-02-13 03:18:01 -06:00
Ben Sander
b314777bc1
D2H multi-buffer
2016-02-13 01:15:23 -06:00
Ben Sander
1bfd3cdbd0
Improve copy testing
2016-02-12 18:24:08 -06:00
Ben Sander
134d7975ce
Improve copy testing implementation.
...
- add tests for (unpinned/pinned) x H2H x D2D.
- Free memory at end of test.
2016-02-12 18:24:08 -06:00
Ben Sander
24c1fdb864
Step 1 in staging buffer copy.
...
- use StagingBuffer class for copies.
- refactor g_device to use array rather than vector.
(keeps pointers from moving).
2016-02-12 18:24:08 -06:00
Ben Sander
d7396b5af3
Query tracked memory sizes.
...
Support more accurate hipMemGetInfo. Add test to hipPointerAttrib.
2016-02-12 18:24:08 -06:00