Commit graph

61 Commits

Author SHA1 Message Date
Aditya Avinash Atluri 40eefc1cde Update hip_hcc.cpp 2016-03-03 13:59:43 -06:00
Aditya Avinash Atluri b6e34a44b0 Fix output of hipPointerGetAttributes
The output of hipPointerGetAttributes is fixed to match CUDA counterpart.
2016-03-03 13:58:18 -06:00
Aditya Atluri ce7ae41d42 Initialize hip when single kernel is called 2016-03-02 08:08:45 -06:00
Aditya Avinash Atluri 180bc32db0 H2H Async memcpy fix
With this change, the CPU memcpy waits until all commands in the current stream are done.
Note that it waits only on the current stream, not on other streams.
2016-02-29 12:49:50 -06:00
Ben Sander ba9ad6be80 Copy dependency bug fixes and test modes.
Add dependency for host-to-host copy.

Add debug mode for HIP_DISABLE_HW_COPY_DEP and
HIP_DISABLE_HW_KERNEL_DEP - setting these to -1 now ignores
all dependencies.
2016-02-28 21:19:49 -06:00
Ben Sander af22d056e0 touchup 2016-02-28 21:08:53 -06:00
pensun 39b44cb484 Test cases for HIP_VISIBLE_DEVICES/CUDA_VISIBLE_DEVICES.
hipEnvVar is the base test case, called by hipEnvVarDriver
at run time.
The test cases cover normal use of the environment variable,
invalid values/sequences, and use of CUDA_VISIBLE_DEVICES as an
alternative.
2016-02-27 14:48:00 -06:00
pensun 1f606261c1 improve the HIP_VISIBLE_DEVICES implementation 2016-02-27 14:14:08 -06:00
pensun 07e56d4666 Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging 2016-02-27 04:25:28 -06:00
Aditya Avinash Atluri ecadb1623c Merge pull request #4 from AMDComputeLibraries/memtracker
hipGetPointerAttrib behavioral changes
2016-02-27 10:51:23 -06:00
Ben Sander ea09557e1b disable rocrv2, properly 2016-02-27 03:31:30 -06:00
Aditya Avinash Atluri 66aa7f2f8a Corrected hipPointerGetAttribute
Made hipPointerGetAttribute work the same as cudaPointerGetAttribute for HCC
2016-02-26 18:50:40 -06:00
pensun 57f60b34fb resolve conflicts 2016-02-26 09:57:40 -06:00
pensun ee7ac16396 fix compiling error 2016-02-26 09:50:00 -06:00
Ben Sander ff66ef0779 fixes for titan platform 2016-02-26 05:25:30 -06:00
Ben Sander 6e0ccdfb95 Disable ROCR_V2 2016-02-26 23:34:45 -06:00
Ben Sander 369e0d7b5b Merge branch 'memtracker' into privatestaging
Conflicts:
	include/nvcc_detail/hip_runtime_api.h
2016-02-26 06:17:05 -06:00
Ben Sander c300ffe458 Merge branch 'privatestaging' of https://github.com/AMDComputeLibraries/HIP-privatestaging into privatestaging 2016-02-26 06:15:09 -06:00
Ben Sander d319299ddb Merge branch 'memtracker' of https://github.com/AMDComputeLibraries/HIP-privatestaging into memtracker
Conflicts:
	tests/src/hipMemcpy.cpp
2016-02-25 23:22:51 -06:00
Ben Sander 4adab7b7ef Merge branch 'memtracker' into privatestaging
Conflicts:
	src/hip_hcc.cpp
2016-02-25 19:38:46 -06:00
Ben Sander 8b64c0dc62 Improve memory copy and commands switching
- Add chicken bits to use host-side dependency management.
- Add optional PinInPlace path for unpinned copies
- Synchronize before pinned memcpy path.
- Add mutex to protect two threads launching to same stream.
2016-02-25 19:19:49 -06:00
Evgeny Mankov 57e212606d Add attribute hipDeviceAttributeIsMultiGpuBoard for obtaining the device property isMultiGpuBoard.
On the HIP path, the property is obtained by calling hsa_iterate_agents and counting the devices of type HSA_DEVICE_TYPE_GPU.

P.S.
On multi-board systems there might be problems detecting which board a GPU is plugged into (not tested).
2016-02-25 23:44:39 +03:00
Ben Sander 7090f5c3f9 Add tests for multi-threaded streams 2016-02-23 12:08:22 -06:00
Ben Sander 3886d494f4 Sync review.
- add calls to ihipInit missing from some routines.
- sync before draining a stream.
2016-02-23 04:07:11 -06:00
Ben Sander 549b18ce77 Improve async copy implementation.
- Add device-side signal waits when transitioning between command classes
(Kernel, H2D copy, D2H copy).
- Support waiting in staged memory copies as well.
- Add several chicken bits to control implementation:
    - HIP_DISABLE_ENQ_BARRIER
    - HIP_DISABLE_BIDIR_MEMCPY
    - HIP_ONESHOT_COPY_DEP
- Refactor signal pool to support efficient deallocation based on
signal sequence number.
- Deallocate copy signals on eventSynchronize.
- Improve copy tests, add pingpong.
2016-02-22 23:15:24 -06:00
Ben Sander 0a98db4b5f Merge branch 'memtracker' of https://github.com/AMDComputeLibraries/HIP-privatestaging into memtracker 2016-02-22 08:33:47 -06:00
gargrahul a2fbf06129 Update for shared atomics support 2016-02-22 16:21:52 +05:30
Ben Sander d33d806a5b Track last command to a stream.
Passing simple tests.
2016-02-20 11:02:07 -06:00
Evgeny Mankov 833c9e52ad Add #ifdef USE_ROCR_20 guard for ROCR_20 device properties (memoryClockRate, memoryBusWidth)
USE_ROCR_20 is not defined by default.
To add ROCR_20 support, HIP has to be compiled as follows: make CXX_DEFINES+=-DUSE_ROCR_20
2016-02-19 13:27:03 +03:00
Evgeny Mankov fbdeee39cd Formatting, no functional changes. 2016-02-18 18:54:19 +03:00
Evgeny Mankov 1c19dbb807 Device property memoryBusWidth implementation.
+ Device property memoryBusWidth is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryBusWidth is added to hipDeviceAttribute_t struct.
+ Tests update.
2016-02-18 18:15:01 +03:00
Evgeny Mankov 5ea8543d2e Device property memoryClockRate implementation.
+ Device property memoryClockRate is added to hipDeviceProp_t struct.
+ Device attribute hipDeviceAttributeMemoryClockRate is added to hipDeviceAttribute_t struct.
+ Tests update.
+ Rename hipDevAttrConcurrentKernels to hipDeviceAttributeConcurrentKernels.
2016-02-18 17:25:28 +03:00
Evgeny Mankov 2b6fda77ca Add attribute hipDevAttrConcurrentKernels for obtaining the device property concurrentKernels. 2016-02-18 14:34:18 +03:00
Ben Sander c6f8883b0d Enable Tracker and ROCR by default, verify with HCC 2016-02-17 23:03:37 -06:00
Ben Sander d653782d9d Remove HIP-local AM tracker (now in HCC) 2016-02-17 21:33:32 -06:00
Ben Sander 44f40e171a USE_AM_TRACKER=0 works 2016-02-17 21:23:36 -06:00
pensun 8aa4bfce57 1. Bug fix
2. Passed initial tests on different sets of
HIP_VISIBLE_DEVICES: (0),(1),(0,1),(1,2),(2,3),(1,2,3),(2,3,4),(1,5,2,3)
and obtained the expected choice of GPU devices at run time.
3. Passed HIP test suite.
2016-02-17 09:32:50 -06:00
pensun c1e120fb1b Implementation of HIP_VISIBLE_DEVICES in runtime 2016-02-17 06:59:18 -06:00
Ben Sander 59379ffb44 more work on async copies 2016-02-17 00:59:12 -06:00
pensun 060439b6ab Add removal of invalid device numbers 2016-02-16 10:00:05 -06:00
pensun d40cbef2af Read HIP_VISIBLE_DEVICES into an internal global variable 2016-02-16 07:39:04 -06:00
Ben Sander caef9b5ced Add per-stream pool for hsa_signals. 2016-02-16 01:59:13 -06:00
Ben Sander 38c735fd1d Update before checkin to HCC.
Add support for USE_AM_TRACKER=2 (HCC version).
Add AM_ALLOC, AM_FREE indirection to ease swapping AM implementations.
2016-02-15 21:16:00 -06:00
Ben Sander db3a63360b Move warpSize to header, have shuffles use default warpsize. 2016-02-15 05:41:09 -06:00
Ben Sander 6420655dc8 Add multi-threading synchronization on staging buffers and signals.
Also pre-allocate a couple signals for copies.
2016-02-13 03:18:01 -06:00
Ben Sander b314777bc1 D2H multi-buffer 2016-02-13 01:15:23 -06:00
Ben Sander 1bfd3cdbd0 Improve copy testing 2016-02-12 18:24:08 -06:00
Ben Sander 134d7975ce Improve copy testing implementation.
- add tests for (unpinned/pinned) x H2H x D2D.
- Free memory at end of test.
2016-02-12 18:24:08 -06:00
Ben Sander 24c1fdb864 Step1 in staging buffer copy.
- use StagingBuffer class for copies.
- refactor g_device to use array rather than vector.
   (keeps pointers from moving).
2016-02-12 18:24:08 -06:00
Ben Sander d7396b5af3 Query tracked memory sizes.
Support more accurate hipMemGetInfo.  Add test to hipPointerAttrib.
2016-02-12 18:24:08 -06:00