Pierre
e60a95d7dd
Fix missing MARKER_END
...
Logging status of hipCtxSynchronize was missing
Test if hip profiling is active for MARKER_END in ihipPostLaunchKernel
Add MARKER_END after the completion of a kernel launched through
the "grid launch"
2017-11-13 16:13:19 -05:00
Ben Sander
955cfbfdc7
Make hipEvent_t thread safe.
...
Support re-recording of same event by different threads.
- Add criticalData structure to hipEvent_t, similar to mechanism used
for streams, contexts, device. Events are always locked
after streams to avoid deadlock.
- ihipEvent_t::locked_copyCrit can be used to copy critical state
including marker. The critical state in the event can then
be re-recorded.
- Refactor hipEventElapsedTime. Remove stale debug code and native signal
refs.
2017-11-06 23:49:25 +00:00
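The lock ordering and state-copy idea from the hipEvent_t commit above can be sketched as follows. This is a minimal illustration, not the HIP runtime code: `Stream`, `Event`, `EventCrit`, `lockedCopyCrit`, and `recordEvent` are hypothetical stand-ins, and plain `std::mutex` is assumed in place of the runtime's critical-data wrappers.

```cpp
#include <cstdint>
#include <mutex>

// Hypothetical stand-in for an event's critical state.
struct EventCrit {
    std::uint64_t marker = 0;  // stand-in for the recorded marker
    bool recorded = false;
};

class Event {
public:
    // Return a snapshot of the critical state, taken under the event lock,
    // so a caller can inspect it while another thread re-records the event.
    EventCrit lockedCopyCrit() const {
        std::lock_guard<std::mutex> guard(mutex_);
        return crit_;
    }

    // Caller must already hold mutex().
    void recordWhileLocked(std::uint64_t marker) {
        crit_.marker = marker;
        crit_.recorded = true;
    }

    std::mutex& mutex() const { return mutex_; }

private:
    mutable std::mutex mutex_;
    EventCrit crit_;
};

// Hypothetical stand-in for a stream; nextMarker is illustrative only.
struct Stream {
    std::mutex mutex;
    std::uint64_t nextMarker = 1;
};

// Always take the stream lock first and the event lock second. Keeping one
// global order means two threads can never each hold the lock the other needs.
inline void recordEvent(Stream& stream, Event& event) {
    std::lock_guard<std::mutex> streamGuard(stream.mutex);
    std::lock_guard<std::mutex> eventGuard(event.mutex());
    event.recordWhileLocked(stream.nextMarker++);
}
```

Calling `recordEvent` twice on the same event from different threads simply overwrites the snapshot; readers that went through `lockedCopyCrit` keep a consistent copy of whichever state they observed.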
Ben Sander
fe32685fbc
Merge pull request #237 from bensander/use_ctxptr_for_p2p
...
Use ctxptr for p2p
2017-11-01 18:55:25 +01:00
Ben Sander
dc7d993a02
Add ns-level timer for HIP API routines
...
Refactor some misuses of ihipLogStatus; these should only be in top-level
HIP APIs and should be paired with HIP_API_INIT calls.
2017-10-30 20:20:51 +00:00
Ben Sander
7e8b39fc96
Merge pull request #222 from bensander/fix_device_prop
...
Fix device prop
2017-10-30 17:58:48 +01:00
Ben Sander
4c7b2be1c2
Check for null copyEngine before looking at peers.
2017-10-30 16:58:03 +00:00
Ben Sander
a417241507
Fix bug with peer-to-peer combined with context API
...
- Store context inside the tracker rather than using int deviceID that
was always mapped to primary context
- IsPeerWatcher now based on device IDs rather than specific peers.
2017-10-26 19:44:22 +00:00
Aditya Atluri
698721be34
Enhance debug for copy pointers
...
- show more pointer tracking fields
- show pointer info before and after "tailoring"
2017-10-26 19:44:22 +00:00
Siu Chi Chan
da7c37947c
replace __hcc_workweek__ with HC_FEATURE_PRINTF flag
2017-10-23 18:30:08 -04:00
Ben Sander
b4c7876244
Remove printf
2017-10-20 13:24:04 -07:00
Ben Sander
ed85b15c3e
Update device properties.
...
- clear properties to defined initial state.
- enable some property flags which are now supported.
2017-10-20 15:52:13 +00:00
Ben Sander
e738fa66b5
Modify device properties to use pool API.
...
- Also better error code checking
2017-10-20 14:49:29 +00:00
Siu Chi Chan
a1956f64e6
hipDeviceReset(): make sure to reinitialize the printf buffer in hcc RT
2017-10-18 16:26:13 -04:00
Wen-Heng (Jack) Chung
1efc99e69f
Bump device major version from 2 to 3
...
This would significantly improve performance for certain apps in kernel
selection logic.
2017-09-15 15:47:39 +00:00
Ben Sander
1ea468e279
Merge branch 'master' into hip_init_alloc
2017-09-14 11:53:33 -05:00
Ben Sander
fff42fd591
Add HIP_INIT_ALLOC to init allocated memory.
2017-09-13 23:31:48 +00:00
Ben Sander
fff74eee21
hipStreamQuery uses av::is_empty. Add HIP_FORCE_NULL_STREAM.
2017-08-31 03:00:14 +00:00
Ben Sander
ed8c3ba7e7
Refactor hipStreamWaitEvent
...
- Null streams use same flow as non-null.
- Add HIP_SYNC_STREAM_WAIT
- Resolve null stream.
2017-08-31 03:00:14 +00:00
Ben Sander
bc9ba7cd81
Lock streams when waiting on event completion or querying event safety.
2017-08-28 18:40:16 -05:00
Maneesh Gupta
172a568aa6
[texture] guard new HCC APIs under workweek
...
Change-Id: I4f60a64fb0b0496ca1eb01ffe6ddda121c25d976
2017-08-15 15:51:38 +05:30
Weixing Zhang
e4de2d1138
[HIP Texture] The GPU virtual address for texture memory needs to be aligned.
...
In hcc_am, a bigger buffer will be allocated for alignment purpose
and _unalignedDevicePointer is added in struct AmPointerInfo for
original allocated address.
2017-08-08 11:18:00 -04:00
Maneesh Gupta
06b51109c6
Merge pull request #135 from bensander/fix_tracing
...
Some fixes to tracing.
2017-07-31 10:24:41 +05:30
Ben Sander
9e9f384899
Some fixes to tracing.
2017-07-28 22:13:43 -05:00
Maneesh Gupta
8330fb3fe0
Merge pull request #122 from bensander/enable_async_null_stream
...
Set HIP_SYNC_NULL_STREAM=0.
2017-07-28 09:15:56 +05:30
Ben Sander
cd42711134
Set HIP_SYNC_NULL_STREAM=0.
...
Optimizes null stream synchronization so it uses GPU-side dependency
resolution. Requires HCC __hcc_workweek__ > 17300.
2017-07-27 11:11:54 -05:00
Ben Sander
6576201ec2
Make host memory allocations coherent by default.
...
Associated change is to optimize event recording so it uses
agent-scope release (since it was only using system-scope release
to support non-coherent host mem).
Flags and environment variables exist to obtain previous behavior
if desired. Options are documented in new performance guide.
2017-07-26 19:20:34 -05:00
Ben Sander
cdc4291431
Enable HCC_OPT_FLUSH=1 (if HCC compiler new enough)
2017-07-24 18:57:19 -05:00
Wen-Heng (Jack) Chung
32909c4a92
Temporarily disables HCC_OPT_FLUSH
...
Change-Id: I290791e58dd52ab3823f6c3315e42b0d386e9d64
2017-07-12 16:08:36 +00:00
Ben Sander
61fafdceb1
Set default HIP_SYNC_NULL_STREAM=1.
2017-06-30 19:01:14 -05:00
Aditya Atluri
0fe0381608
automate gcnarch detection
...
Change-Id: Ibbad22db136f7f5e2be84c82e9169298a144cc77
2017-06-29 12:01:40 -05:00
Rahul Garg
a7c727f352
Fixed hipDeviceGetPCIBusId for HIP/HCC
...
Change-Id: I3688fa2476e1baada2d3c5fc3735cec3f15a1e21
2017-06-28 23:48:27 +05:30
Ben Sander
eb2c5e166c
Set default HIP_HIDDEN_FREE_MEM
2017-06-27 12:17:12 -05:00
Sun, Peng
e5ce585307
Add support for HIP_HIDDEN_FREE_MEM, which deducts a given amount (in MB) from the available memory returned by hipMemGetInfo.
...
Change-Id: I7a8260c12e032e04e26611db4c38c893a29f2653
2017-06-26 15:29:38 -05:00
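The deduction described in the HIP_HIDDEN_FREE_MEM commit above amounts to simple arithmetic. A rough sketch follows; `visibleFreeBytes` is a hypothetical helper, not a HIP function. It assumes 1 MB = 1024 * 1024 bytes and clamps the result at zero, neither of which is stated in the commit message.

```cpp
#include <cstddef>

// Hypothetical helper, not part of HIP: subtract a hidden reserve, given in
// MB, from the actual free byte count before reporting it to the caller.
// Assumptions of this sketch: 1 MB = 1024 * 1024 bytes, and the reported
// value is clamped at zero rather than allowed to wrap around.
inline std::size_t visibleFreeBytes(std::size_t actualFreeBytes,
                                    std::size_t hiddenMB) {
    const std::size_t hiddenBytes = hiddenMB * 1024 * 1024;
    return actualFreeBytes > hiddenBytes ? actualFreeBytes - hiddenBytes : 0;
}
```

With 512 MB actually free and a hidden reserve of 256 MB, the helper reports 256 MB; with a reserve larger than the free amount it reports zero.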
Ben Sander
42882ddf9c
Clean up old USE_* and RELEASE.md notes.
2017-06-23 18:05:30 -05:00
Ben Sander
c2baa4f6e6
Enable HCC_OPT_FLUSH=1.
...
Requires appropriate HCC with this support :
commit 38e392b517a46a09a3b1c8f388e6a0db3741c510
2017-06-07 00:15:05 -05:00
Ben Sander
344b6cb0c0
Enable HIP_SYNC_NULL_STREAM=0 optimization.
2017-06-05 08:50:41 -05:00
Ben Sander
823281dcba
Fix HIP_SYNC_NULL_STREAM=0 mode.
...
- Fix null-stream sync
- hipStreamDestroy of null stream returns hipErrorInvalidResourceHandle
- Update documentation.
- Add tests for null stream sync, hipEventElapsedTime.
- Rename internal enum hipEventStatusRecorded to hipEventStatusComplete
- refactor hipStreamWaitEvent to streamline control-flow
2017-06-05 08:50:22 -05:00
Ben Sander
15f54fb943
Update tests, add p2p coherency test.
2017-06-03 17:11:34 -05:00
Ben Sander
942ec0eff8
Add event controls for release fences.
...
Env var : HIP_EVENT_SYS_RELEASE
Event allocation flags : hipEventReleaseToDevice, hipEventReleaseToSystem
(remove hipEventDisableSystemRelease)
Update test for new functionality.
2017-05-27 16:02:34 -05:00
Ben Sander
c8178c6838
Cleanup hipEvent. (Intermediate checkpoint)
...
Support hipEventDisableSystemRelease flag.
Update test.
Remove stray printf
2017-05-27 16:02:34 -05:00
Ben Sander
35212632e7
Remove HIP_NUM_KERNELS_INFLIGHT. (redundant with HCC controls)
2017-05-24 01:03:28 -05:00
Ben Sander
dda70ae514
Add hipHostMallocCoherent, hipHostMallocNonCoherent
...
Provide per-allocation control over coherent/non-coherent mem.
These override the default HIP_COHERENT_HOST_ALLOC setting.
2017-05-24 00:48:10 -05:00
Ben Sander
d43d57d39c
Remove HIP_MAX_QUEUES (replaced with HCC_MAX_QUEUES)
2017-05-23 23:48:01 -05:00
Ben Sander
2d5b3359c6
Use accelerator_scope for create_marker and create_blocking_marker.
...
As optimization when system-scope is not needed.
2017-05-23 23:15:45 -05:00
Ben Sander
8bc6ee5932
Add initial HIP_SYNC_NULL_STREAM=0 mode.
...
This eliminates host-synchronization for null stream. Instead, the
null-stream uses GPU-side events to wait for other streams.
Default is OFF pending additional testing.
Add enhanced null-stream test.
Also refine HIP_TRACE_API.
2017-05-16 19:04:25 -05:00
Ben Sander
7e7ba5027f
Add HIP_TRACE_API=4. Only display memory allocation/free apis.
2017-05-16 19:04:25 -05:00
Aditya Atluri
a6dc00f167
added gfx900 to hipDeviceProp_t
...
Change-Id: I49e7a32f218926fd55f1c94c5dc2366d6c8ac4ca
2017-05-12 21:43:34 -05:00
Ben Sander
c7c62dd022
Remove old USE_ switches no longer needed.
2017-05-12 16:06:03 -05:00
Ben Sander
2c2625cb9e
Add hipEventDisableSystemRelease flag.
2017-05-12 16:06:03 -05:00
Ben Sander
ff9bed6535
hipHostMalloc allocations are mapped to all devices by default.
...
Support hipHostMallocPortable flag.
Default flags are hipHostMallocPortable | hipHostMallocMapped.
Also:
-refactor tests to move addCount and addCountReverse into HipTest
namespace.
-test multi-GPU host memory.
2017-05-10 17:34:36 -05:00