Aditya Atluri
e2dd339cfd
added texture header to memory api source
...
Change-Id: I1af6d60aca5a9a9ef1cadf8c304bea892acbe061
[ROCm/clr commit: 84d0d10fad ]
2016-11-17 11:57:53 -06:00
Aditya Atluri
d3559bffb4
make texture a separate header for now
...
Change-Id: I3c65aa75f2f729eedd8c3292fa3cbc37709c1cfe
[ROCm/clr commit: 94984470d4 ]
2016-11-17 11:55:29 -06:00
Aditya Atluri
a997f0f074
Added i8 packed math intrinsics
...
1. Added add, sub, mul packed math i8 intrinsics
2. Removed c++ packed data structures included from HCC
Change-Id: I1d109c5ce10c48b7cd3ea059478b88fc1de78499
TODO: Add better packed data structures support
[ROCm/clr commit: 603bb321ec ]
2016-11-17 01:09:12 -06:00
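The commit above does not show the intrinsics themselves. As a rough illustration of what "packed i8" add, sub, and mul mean, here is a host-side sketch that treats a 32-bit word as four independent 8-bit lanes. The function names are hypothetical and are not the names added by the commit, and wrap-around (modulo 256) lane arithmetic is an assumption; the real intrinsics may behave differently.

```cpp
#include <cstdint>

// Hypothetical helpers, not the intrinsics added by the commit.
// Each 32-bit word holds four 8-bit lanes; every lane is processed
// independently and wraps modulo 256 (an assumption).
template <typename Op>
inline std::uint32_t packed_i8_apply(std::uint32_t a, std::uint32_t b, Op op) {
    std::uint32_t out = 0;
    for (int lane = 0; lane < 4; ++lane) {
        const unsigned shift = 8u * static_cast<unsigned>(lane);
        const std::uint32_t x = (a >> shift) & 0xFFu;
        const std::uint32_t y = (b >> shift) & 0xFFu;
        // Mask the result so nothing carries into the neighbouring lane.
        out |= (op(x, y) & 0xFFu) << shift;
    }
    return out;
}

inline std::uint32_t packed_add_i8(std::uint32_t a, std::uint32_t b) {
    return packed_i8_apply(a, b, [](std::uint32_t x, std::uint32_t y) { return x + y; });
}

inline std::uint32_t packed_sub_i8(std::uint32_t a, std::uint32_t b) {
    return packed_i8_apply(a, b, [](std::uint32_t x, std::uint32_t y) { return x - y; });
}

inline std::uint32_t packed_mul_i8(std::uint32_t a, std::uint32_t b) {
    return packed_i8_apply(a, b, [](std::uint32_t x, std::uint32_t y) { return x * y; });
}
```

The point of the sketch is the per-lane mask: a plain 32-bit add would let a carry from one lane spill into the next, which is exactly what a packed operation must avoid.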
Maneesh Gupta
65782bbaeb
Enable USE_COPY_EXT_V2 by default
...
Change-Id: I2c0dc80f85a0ccb5744715b5418a604e38b249ed
[ROCm/clr commit: 0696d4417f ]
2016-11-15 10:42:27 +05:30
Ben Sander
ab0362087f
tweak hcc demangler
...
[ROCm/clr commit: 2ea3f8f68e ]
2016-11-14 15:26:27 -07:00
Sandeep Kumar
dc599cf2b8
Add p2p for cookbook
...
Change-Id: Id2e77ab31123ef95885d665efe34bc0d4596733a
(cherry picked from commit 6fbd0352713ca36e399b1ed4f17c486207a53875)
[ROCm/clr commit: 39e1b16d0b ]
2016-11-14 06:10:36 +05:30
Maneesh Gupta
d8564db4a5
Revert "hipcc: Turn back linking hip_ir.ll by default"
...
This reverts commit 5a48591fc5 .
[ROCm/clr commit: f9d598d66c ]
2016-11-14 06:05:31 +05:30
Ben Sander
b8fb23009b
Add draft doc on profiling with hip.
...
Change-Id: I79727dd2500333b3f16acb381dd5852a15ed408a
[ROCm/clr commit: 09d88d3b97 ]
2016-11-13 10:01:05 -06:00
Ben Sander
603c3a3a38
Add   to demangler
...
Change-Id: I89586c7c17f5152b7a6850d0d6c2aa1d3ebc8190
[ROCm/clr commit: d3dbf66ab1 ]
2016-11-11 16:50:56 -06:00
pensun
dd1061b874
Add direct test case for threadfence_system workaround
...
Change-Id: I5b21b590e957c901044741ac94e816cd8b1426f9
[ROCm/clr commit: 992f94b3a1 ]
2016-11-11 15:09:43 -06:00
Aditya Atluri
a3286737aa
fixed multi-dim module kernel launch
...
Change-Id: Id1d81f2375d058979ab526433f905cf0ea3d23d6
[ROCm/clr commit: 6dcdf08e0d ]
2016-11-11 12:25:23 -06:00
Ben Sander
5e354dcd77
Add option to deny peer access.
...
Also fix test.
Change-Id: I1b247f6c4271442b008e560669bca4daf8eb94c7
[ROCm/clr commit: d666fbaafe ]
2016-11-10 23:12:48 -06:00
Ben Sander
40f8947cc3
Use forceUnpinnedCopy to resolve P2P corner cases.
...
Change-Id: I2aebb419881246cebb696bec87798635bc71acc2
[ROCm/clr commit: 6e54a600b6 ]
2016-11-10 23:12:48 -06:00
Ben Sander
f634f73fef
Enable async copy again.
...
Also add HIP_FORCE_SYNC_COPY chicken bit.
Change-Id: I76a385410494b99bf27305d3c08f55dd81987565
[ROCm/clr commit: 0eeaa3bcd5 ]
2016-11-10 23:12:48 -06:00
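A "chicken bit" such as HIP_FORCE_SYNC_COPY is typically an environment variable that switches a new code path off at run time. The sketch below shows one common way to read such a flag. Only the variable name comes from the commit; the helper names and the parsing rule (unset or 0 means off, any other integer means on) are assumptions, not HIP's actual implementation.

```cpp
#include <cstdlib>

// Hypothetical helper: treat a missing value or "0" as off, any other
// integer as on. The real parsing rule in HIP is not shown in the log.
inline bool parseFlag(const char* value) {
    return value != nullptr && std::atoi(value) != 0;
}

// Hypothetical wrapper around the environment lookup.
inline bool forceSyncCopyRequested() {
    return parseFlag(std::getenv("HIP_FORCE_SYNC_COPY"));
}
```

Keeping the parsing separate from the `std::getenv` call makes the rule easy to check without touching the process environment.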
Ben Sander
0c66772f37
Doc change only - add comments to test.
...
Change-Id: Ie42087cf3c78e49337b18bb71f3f0e1e7950ee1b
[ROCm/clr commit: 85e65b55ff ]
2016-11-10 23:12:48 -06:00
Ben Sander
ee41609b48
Refactor copy and P2P logic.
...
Prefer the source device's engine for DMA copies, even if the user submits
the copy in a stream attached to a different device.
The stream is now used only for synchronization. HIP decides which engine
performs the copy (typically the source copy engine) and passes that
decision to HCC through new APIs.
HIP has additional information about peer visibility, which it uses to
choose the agent that performs the copy.
Change-Id: I0cf4cfebeae256e6ca795f08a7ed7130f4857d1f
[ROCm/clr commit: e767e0032e ]
2016-11-10 23:12:48 -06:00
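The decision described in the commit above can be sketched roughly as follows. All types and names here are hypothetical and do not come from HIP or HCC. The fallback order (source engine, then destination engine, then a staged copy through the host) is an assumption based on the commit text and the neighbouring P2P commits, not a statement of the actual implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result of the decision; not a HIP or HCC type.
enum class CopyPath { SourceEngine, DestEngine, StagedThroughHost };

// peerVisible[a][b] is true when device a can access memory on device b.
using PeerMatrix = std::vector<std::vector<bool>>;

// Pick the engine for a device-to-device copy. The stream's device is
// deliberately not an input: per the commit message, the stream is used
// only for synchronization.
inline CopyPath chooseCopyPath(const PeerMatrix& peerVisible,
                               std::size_t src, std::size_t dst) {
    assert(src < peerVisible.size() && dst < peerVisible.size());
    assert(dst < peerVisible[src].size() && src < peerVisible[dst].size());

    // Preferred case: the source device's engine can reach the destination.
    if (src == dst || peerVisible[src][dst]) return CopyPath::SourceEngine;
    // Otherwise let the destination pull the data if it can see the source.
    if (peerVisible[dst][src]) return CopyPath::DestEngine;
    // No peer visibility in either direction (assumed fallback).
    return CopyPath::StagedThroughHost;
}
```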
Ben Sander
ae2992bcb9
Improve memory debug
...
Change-Id: I0f033139aa4e4b47039eb016e404009127bd0a44
[ROCm/clr commit: e9835617f1 ]
2016-11-10 23:12:48 -06:00
pensun
8ea566f2b2
Update deprecated information for threadfence_system()
...
Change-Id: Id13d2f81edb51eb42b896a5c06913d59ec907c55
[ROCm/clr commit: 9aa2269d5c ]
2016-11-10 11:55:12 -06:00
Maneesh Gupta
df096e871e
CMakeLists.txt: Cascade CMAKE_BUILD_TYPE to tests
...
Change-Id: I53a3ea951c1fd57e43a02381a457c1dedc1a34f7
[ROCm/clr commit: cdcf04d744 ]
2016-11-10 21:26:34 +05:30
Rahul Garg
cd7ad3d620
hipDeviceGetByPCIBusId support for HIP/NVCC
...
Change-Id: I8f82890e88d2a15f592bff192179e7d5c5362722
[ROCm/clr commit: f86c7b5b3c ]
2016-11-10 11:40:59 +05:30
Maneesh Gupta
d55b32b765
hipcc: Default to HIP_LIB_TYPE=1
...
Change-Id: I83b05accd76f7bc94bd724c66ae060fa0095bc8d
[ROCm/clr commit: 462ffb8117 ]
2016-11-10 11:34:00 +05:30
Maneesh Gupta
99678b0000
hcc_dialects/Makefile: use clamp-config
...
Change-Id: I86df82f75b75125825e22d0545209a19386d9936
[ROCm/clr commit: 052a580d5b ]
2016-11-10 11:31:50 +05:30
pensun
daf19a2dbb
resolve conflicts for git pull
...
Change-Id: Ie353b831e2241bc28042069b6cc7405257e871e1
[ROCm/clr commit: bbb619c732 ]
2016-11-09 21:38:43 -06:00
pensun
ce1b4bdc06
Add documentation on threadfence_system workaround guidelines.
...
Change-Id: I9636a3808798f3dabe992285ce5652187cee6eb8
[ROCm/clr commit: 94dfff9db2 ]
2016-11-09 21:36:30 -06:00
pensun
61635b585f
Add option to alloc fine-grained system memory
...
Change-Id: Ia13c8e058cb988b5857e75a590a4d67411362ae1
[ROCm/clr commit: 23de0e1b50 ]
2016-11-09 21:36:30 -06:00
Maneesh Gupta
68b4d20b26
Merge branch 'rocm-rel-1.3' into amd-develop
...
Conflicts:
include/hip/nvcc_detail/hip_runtime_api.h
Change-Id: I990a7d008da9e8dcc68250cebbc8ee6e723c7e01
[ROCm/clr commit: e3b5eef7c9 ]
2016-11-10 08:56:38 +05:30
pensun
85fa855e18
fix hipProfiler* apis on NV path
...
Change-Id: I6adca6151fef3a9b35348163eb6bd13f5c414172
[ROCm/clr commit: 4a8a6a4697 ]
2016-11-09 15:44:01 -06:00
pensun
12a5923a2b
fix for hipcallback function on NV path
...
Change-Id: If80c0cfe60b1f3b1a71627b5f3f79503cba4d491
[ROCm/clr commit: e5277ab4b6 ]
2016-11-09 11:33:23 -06:00
Maneesh Gupta
ded929878c
Update release notes for 1.0 release
...
Change-Id: I74fa2b41afc334a76c309b125c27aa141cd59554
[ROCm/clr commit: 01a38c82a6 ]
2016-11-08 16:31:56 +05:30
Ben Sander
3d4a76d560
Fix tests to read warpSize from device props.
...
Change-Id: I9583577793afad49f9eb1ee9069bd4c6963a6023
[ROCm/clr commit: a13ec441bf ]
2016-11-06 04:26:28 -06:00
Ben Sander
445f888d97
Update gitignore for some common output files
...
Change-Id: I9cd60f042af4dba07fe0fdbd2ee442936ff8c7bd
[ROCm/clr commit: 0e5cfed3eb ]
2016-11-06 04:26:15 -06:00
Ben Sander
03fcf556e9
Improve Peer support and testing.
...
Change-Id: Icadc65988aaf145a265587ab0357c5bf4d26f3eb
[ROCm/clr commit: f3d38c2615 ]
2016-11-06 03:22:36 -06:00
Ben Sander
0af2722827
Set forceHostCopyEngine for other copy dirs. Support HIP_FORCE_P2P_HOST
...
Also: more debug for copy and P2P.
Change-Id: I87030c525410e041b2a00baaf6c68e6c0977ff42
[ROCm/clr commit: 06ecfa3975 ]
2016-11-04 19:53:23 -05:00
Ben Sander
2bf51afaa1
Expand hipP2PSimple testing.
...
Cover cases where P2P is used for H2D copies, where host is pinned
but not accessible to the copy agent.
Change-Id: I9464b787228b40f93473708c3fde9726e1986365
[ROCm/clr commit: 60a8a5405d ]
2016-11-04 16:13:32 -05:00
Ben Sander
06b9391974
Refactor resolve-mem step1
...
Change-Id: I7b8b2bbb56d7b31a97b48ebd42002641cd07a460
[ROCm/clr commit: 926e63c655 ]
2016-11-04 09:37:56 -05:00
Ben Sander
74c9c6e591
Add debug for Peer APIs. Enable PeerMemcpy APIs by default.
...
Change-Id: I46e39a9e7b07686a78484c1f3b5495b08e052fbb
[ROCm/clr commit: 00276d141e ]
2016-11-04 08:51:16 -05:00
Ben Sander
43723d77cc
Print non-peers too
...
Change-Id: I2a6905edcdf144aa732ae3120c17780477f232ac
[ROCm/clr commit: 44aee4b61c ]
2016-11-04 06:34:07 -05:00
Ben Sander
97d9a5722e
Prepend HIP_PATH/lib to linker, so we find developer object code
...
Previously the linker might pick up libs from /opt/rocm/lib instead.
Change-Id: Ia7adb345defe433d5952aa61706fe03fd7cbcd35
[ROCm/clr commit: d1db786910 ]
2016-11-04 06:06:04 -05:00
pensun
8911a02b17
Update document for workaround suggestion on threadfence_system()
...
Change-Id: Icccab8270604a0e578a8614b9afb3f95372f4966
[ROCm/clr commit: 212fa7033c ]
2016-11-02 16:08:27 -05:00
pensun
00ce529177
Update hipStreamNonBlocking to use cuda define on NV path
...
Change-Id: I74ea09db99d602ba1c5f192b36ff7f2781176e6a
[ROCm/clr commit: 9f86e47800 ]
2016-11-01 20:30:56 -05:00
Aditya Atluri
2d299543bf
added inter thread data movement intrinsics
...
Change-Id: I2a8a8ed49429cb7f96439bd28c4b83b5142737df
[ROCm/clr commit: f097b6ef81 ]
2016-11-01 16:37:33 -05:00
Rahul Garg
5040e8bcc3
Added hipDeviceGetByPCIBusId in hip/hcc path
...
Change-Id: I3cca0dc533d0281689d8a407c7da16ca1ba6a3a8
[ROCm/clr commit: 81c91f5b0b ]
2016-11-01 10:57:48 +05:30
Evgeny Mankov
6b06d071b9
[HIPIFY] wrap kernel name with HIP_KERNEL_NAME macros...
...
only in case of commas in it.
[ROCm/clr commit: e1812a1319 ]
2016-10-28 20:05:51 +03:00
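The rule in the commit above (wrap only when the kernel name contains a comma) can be sketched as a small string transform. The likely reason is that a comma inside template arguments, as in `foo<int, float>`, would otherwise be parsed as an argument separator when the name is passed to a macro-based launch. The function name below is hypothetical and this is not hipify's actual code; only the `HIP_KERNEL_NAME` macro name comes from the commit.

```cpp
#include <string>

// Hypothetical helper: wrap a kernel name in HIP_KERNEL_NAME(...) only
// when it contains a comma; names without commas pass through unchanged.
inline std::string wrapKernelName(const std::string& name) {
    if (name.find(',') == std::string::npos) {
        return name;
    }
    return "HIP_KERNEL_NAME(" + name + ")";
}
```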
Evgeny Mankov
85b20ca376
* [HIPIFY] Initial Profiler support.
...
CUDA Driver API porting to HIP:
+ cuProfilerStart, cuProfilerStop.
- cuProfilerInitialize & cudaProfilerInitialize - not yet supported by HIP.
[ROCm/clr commit: 3101c26d14 ]
2016-10-28 18:32:13 +03:00
Ben Sander
f31e602346
add hip_profile.h
...
Change-Id: Id43a4336db53567020584cb7842baf5c1649fd8e
[ROCm/clr commit: 9edaf0e3f7 ]
2016-10-28 07:08:46 -05:00
Maneesh Gupta
63ffd01391
hipdemangleatp: Try handling HC kernels as well
...
Change-Id: Ie438ddd28e5bc6067fcd682df849d3183046b40a
[ROCm/clr commit: c26f5d7d5a ]
2016-10-28 15:46:59 +05:30
Maneesh Gupta
04eb05f1a0
CMakeLists.txt: Update include paths needed for Fedora support
...
Change-Id: Ib84f9dba30d2c64f344d6f8e85ddbe15f30af1a0
[ROCm/clr commit: 1f08f2adaf ]
2016-10-28 14:12:53 +05:30
Maneesh Gupta
6872cb2ceb
hipcc: Update flags for Fedora support
...
Change-Id: I90be7768410e491b4f11c3b0f08470246d781d80
[ROCm/clr commit: 0d8aa10473 ]
2016-10-28 14:12:13 +05:30
Ben Sander
c8aad6ee8e
Print short hipLaunchKernel correctly.
...
Change-Id: I6ca03d7c707cd03d6982199830213953d5855f17
[ROCm/clr commit: 3d0fa30183 ]
2016-10-27 23:09:32 -05:00
Ben Sander
4378a14789
Add initial hipProfileStart/Stop
...
And modify sample to show how to use.
Still needs some work to understand interaction with CXL.
Change-Id: I2579824d2dd7863ea23874d34f0dabb3cb305d3e
[ROCm/clr commit: 18dbafe6e8 ]
2016-10-27 23:09:32 -05:00