Maneesh Gupta
c0419cc749
Refactor for building HIP as dynamic library
...
Change-Id: I65a3d9d589c4fdbbdcf1611e5427224253be8260
2016-11-18 14:33:20 +05:30
Aditya Atluri
c20c524400
added texture header to memory api source
...
Change-Id: I1af6d60aca5a9a9ef1cadf8c304bea892acbe061
2016-11-17 11:57:53 -06:00
Aditya Atluri
12dd9df88f
Added i8 packed math intrinsics
...
1. Added add, sub, mul packed math i8 intrinsics
2. Removed c++ packed data structures included from HCC
Change-Id: I1d109c5ce10c48b7cd3ea059478b88fc1de78499
TODO: Add better packed data structures support
2016-11-17 01:09:12 -06:00
Maneesh Gupta
888a3528d2
Enable USE_COPY_EXT_V2 by default
...
Change-Id: I2c0dc80f85a0ccb5744715b5418a604e38b249ed
2016-11-15 10:42:27 +05:30
pensun
50867efa10
Add direct test case for threadfence_system workaround
...
Change-Id: I5b21b590e957c901044741ac94e816cd8b1426f9
2016-11-11 15:09:43 -06:00
Aditya Atluri
abf6872b2b
fixed multi-dim module kernel launch
...
Change-Id: Id1d81f2375d058979ab526433f905cf0ea3d23d6
2016-11-11 12:25:23 -06:00
Ben Sander
1e5515ee9f
Add option to deny peer access.
...
Also fix test.
Change-Id: I1b247f6c4271442b008e560669bca4daf8eb94c7
2016-11-10 23:12:48 -06:00
Ben Sander
65584e48de
Use forceUnpinnedCopy to resolve P2p corner cases.
...
Change-Id: I2aebb419881246cebb696bec87798635bc71acc2
2016-11-10 23:12:48 -06:00
Ben Sander
d3d6feb4de
Enable async copy again.
...
Also add HIP_FORCE_SYNC_COPY chicken bit.
Change-Id: I76a385410494b99bf27305d3c08f55dd81987565
2016-11-10 23:12:48 -06:00
Ben Sander
ced9d72d94
Refactor copy and P2P logic.
...
Prefer use of source-engine for DMA copies, even if user submits copy
in a stream attached to a different device.
The stream is now used only for synchronization, and HIP
makes the most optimal decision for which engine to perform the
copy - typically the source copy engine.
HIP now makes decision on which engine should perform the copy
and passes this to HCC using new apis.
HIP has additional information about peer
visibility and will make a decision which agent should perform
the copy .
Change-Id: I0cf4cfebeae256e6ca795f08a7ed7130f4857d1f
2016-11-10 23:12:48 -06:00
Ben Sander
2dea3a0b1a
Improve memory debug
...
Change-Id: I0f033139aa4e4b47039eb016e404009127bd0a44
2016-11-10 23:12:48 -06:00
pensun
4d7ac1e091
resolve conflicts for git pull
...
Change-Id: Ie353b831e2241bc28042069b6cc7405257e871e1
2016-11-09 21:38:43 -06:00
pensun
f7e9f12bf1
Add option to alloc fingrained system memory
...
Change-Id: Ia13c8e058cb988b5857e75a590a4d67411362ae1
2016-11-09 21:36:30 -06:00
Ben Sander
d728819d17
Improve Peer support and testing.
...
Change-Id: Icadc65988aaf145a265587ab0357c5bf4d26f3eb
2016-11-06 03:22:36 -06:00
Ben Sander
092b3dacda
Set forceHostCopyEngine for other copy dirs. Support HIP_FORCE_P2P_HOST
...
Also: more debug for copy and P2p.
Change-Id: I87030c525410e041b2a00baaf6c68e6c0977ff42
2016-11-04 19:53:23 -05:00
Ben Sander
5d79384832
Refactor resolve-mem step1
...
Change-Id: I7b8b2bbb56d7b31a97b48ebd42002641cd07a460
2016-11-04 09:37:56 -05:00
Ben Sander
3f0a2b8dc1
Add debug for Peer APIs. Enable PeerMemcpy APIs by default.
...
Change-Id: I46e39a9e7b07686a78484c1f3b5495b08e052fbb
2016-11-04 08:51:16 -05:00
Aditya Atluri
f48c53534e
added inter thread data movement intrinsics
...
Change-Id: I2a8a8ed49429cb7f96439bd28c4b83b5142737df
2016-11-01 16:37:33 -05:00
Rahul Garg
2d15d0741c
Added hipDeviceGetByPCIBusId in hip/hcc path
...
Change-Id: I3cca0dc533d0281689d8a407c7da16ca1ba6a3a8
2016-11-01 10:57:48 +05:30
Ben Sander
024d9ab090
Print short hipLaunchKernel correctly.
...
Change-Id: I6ca03d7c707cd03d6982199830213953d5855f17
2016-10-27 23:09:32 -05:00
Ben Sander
bb58f4f6fc
Add initial hipProfileStart/Stop
...
And modify sample to show how to use.
Still needs some work to understand interaction with CXL.
Change-Id: I2579824d2dd7863ea23874d34f0dabb3cb305d3e
2016-10-27 23:09:32 -05:00
Ben Sander
ef8eac9b66
Add two levels of HIP_PROFILE_API (1=short,2=long)
...
Change-Id: I7ef98589f8731fb879db109fd573c62b489f2b61
2016-10-27 23:09:31 -05:00
Ben Sander
ab1836544a
Fix scoped marker so begin/end ATP timestamps correct
...
Change-Id: Ic944d3fc00d7bc31b756c0e6c327b99eb489537e
2016-10-27 23:09:31 -05:00
Ben Sander
e9056798f6
Rename HIP_ATP_MARKER and profiling vars
...
HIP_PROFILE_API
HIP_DB_START_API
HIP_DB_STOP_API
Change-Id: I6c4da67212ff8217e6356a2622d4c6278a188c34
2016-10-27 23:09:31 -05:00
Ben Sander
f5e8090f2f
Allow HIP_DB to be number or string flags (ie HIP_DB=api+mem+sync)
...
Add callbacks for processing env vars.
Change-Id: I4ddf50e2da56b1dae43f50657bc693b07b23c03d
2016-10-27 23:09:31 -05:00
Ben Sander
710be682ca
Add HIP_PROFILE_START_API, HIP_PROFILE_STOP_API
...
Refactor HIP_INIT_API to call recordApiTrace.
Change-Id: Ieff4b5018236f59e49e1b9841474440a34f821df
2016-10-27 23:09:31 -05:00
Ben Sander
739bc37503
Add per-thread API seqnum to debug
...
Change-Id: Ib13733a3e84cd56bae13a32bae40f936c20b7543
2016-10-27 23:09:31 -05:00
Ben Sander
354091f357
Don't call allow-access if allocating device's only peer is self.
...
Change-Id: Iac58e6c3e460675833f10b1e8b2e393de223654d
2016-10-27 23:09:31 -05:00
Ben Sander
346c519ace
Improve HIP TID printing in debug mode.
...
Map long thread-id to a short one that is printed with each message.
Remove clunky stirng creation code for tid_tr.
Print TID on every message.
Change-Id: I780a91d8ce789cb4957789036b478bf5cde8c4e4
2016-10-27 23:09:31 -05:00
Aditya Atluri
e1c1b4c009
reverted change for cache size query
...
Change-Id: I44a1f43818cd287a2a3b6265f43d183f9bd5b71c
2016-10-25 11:03:35 -05:00
Aditya Atluri
820a914b98
correct cachesize to output correct value
...
Change-Id: I5db031591eb718b0c12e78a35e4b19349de9526d
2016-10-25 09:33:45 -05:00
pensun
1f11a9554e
Add workaround for hipStreamAddCallback function: call stream synchronize on host and then add execute the call back function
...
Change-Id: If361f8e053949904b19b9e09245d267f05e29f7b
2016-10-22 23:59:39 -05:00
Ben Sander
203348ac9b
Fix P2P for async
...
Also improve HIP debug message: Add more DB_COPY1 messages. memcpyStr,
expand HIP_DB bitmask.
2016-10-20 10:02:23 -05:00
Rahul Garg
cd6eb7af78
Quickfix for HCSWAP-60, support for hipHostMallocPortable
...
Change-Id: I2a4fcacea9d916ef222324fc9e9d8191f6dc12d0
2016-10-20 10:44:30 +05:30
Ben Sander
000d75de95
Add HIP_WAIT_MODE env var.
...
Also weaken cases where hipSetDeviceFlags returns hipErrorInvalidValue.
Change-Id: I7f113338be6fe498eaf1ab40fd0fd6b23849bb5e
2016-10-18 22:27:16 -05:00
Ben Sander
261ff423e1
Add hipDeviceSchedule* support to queue wait
...
Change-Id: Iffa7a356500b026f3737c3f5719ca9f62b10d855
2016-10-18 22:27:16 -05:00
Ben Sander
d21d3ec222
Remove some TODO items
...
Change-Id: I7e9de2e43a8584f8dc9ee6d45c8ed00ca465f591
2016-10-18 22:27:16 -05:00
Ben Sander
2239e99f60
Fix event flag detection.
...
Change-Id: I0b0ba66c2339021320fe3d7760fdad1a0490a76b
2016-10-18 22:27:16 -05:00
Ben Sander
61af94a555
Update docs for event, review event TODO.
...
Change-Id: Iec491f9f22df163f01c0af6639fcbe33c81acdcc
2016-10-18 22:27:16 -05:00
Ben Sander
9315ac1a29
Move some internal headers from "include/hip/" to src.
...
Change-Id: I7041bd5c803d9318979f4a7c1d658445c614691e
2016-10-18 22:27:16 -05:00
Maneesh Gupta
8471682f26
src/*: Update copyright header
...
Change-Id: I455f5d0d12fe9cb39a3ba873bd22b4c25ed07cbf
2016-10-15 22:55:22 +05:30
Ben Sander
c54220eca9
Cleanup files from code review.
...
- Remove some stale code
- Update docs
- Correct define for __HIP_ARCH_HAS_GLOBAL_INT64_ATOMICS__
Change-Id: Ic5e3cdb8269b1c18f6d2693700b55e08c4d0080e
2016-10-15 11:51:20 -05:00
Ben Sander
50e0a363ce
Add code to use new HCC API accelerator_view::dispatch_hsa_kernel.
...
Disabed by default, can enable with USE_DISPATCH_HSA_KERNEL=1
Change-Id: I7a6ba76f2bada34952ed47f5335ce695fa2faea5
2016-10-14 23:46:29 -05:00
Maneesh Gupta
6a14f39f8b
Remove incorrect executable-bit from non-executable files
...
Change-Id: Iacc434374721e01f7d75d0ab54bceabe0b337f54
2016-10-14 12:53:13 +05:30
Aditya Atluri
00c3db0e60
changed hipLimit to hipLimit_t and data type to enum
...
Change-Id: I94f408cdcac4b0bb38801d58709b68e9630d44d0
2016-10-13 15:13:11 -05:00
Aditya Atluri
90a71c4be4
added compiler flag for polaris
...
Change-Id: Ib14c14c0618982ac7b48f5bc704c04b54ff40ed9
2016-10-13 14:16:48 -05:00
Ben Sander
89012201c9
Fix HIP_USE_PRODUCT_NAME detection.
...
Change-Id: I6879ec3a11845bea66a18a9328bd4eaf54713420
2016-10-13 11:51:53 -05:00
Ben Sander
fa075091b5
fix file-not-found detection
...
Change-Id: Ida84923ed18b3ebf8ffcfd6ee84d8a72f611ecd3
2016-10-13 11:43:49 -05:00
pensun
b70409f3ad
Add ifdef guard for the feature requires ROCm1.3
...
Change-Id: I7154517c47000c37fe5eb09a3c1cf2a9aacbe27c
2016-10-13 10:57:31 -05:00
Aditya Atluri
237837d9bd
added constant memory property to 16KB
...
Change-Id: If067b4057c2e3fc0c26cf4604a1d4fac7f139b12
2016-10-13 10:47:40 -05:00