1. Updated the FAQ in hip_faq.md to note that shfl_*_sync is not supported
2. Corrected some input parameter descriptions in hcc_details/hip_runtime_api.h
3. Redirected shfl*() to shfl_*_sync() for the nvcc path where CUDA > 9.0
Change-Id: I3d8184db5fcc622852c9bad96b706348e8dfc16c
[ROCm/hip commit: 83b11f9a61]
find_package should now be the only way to import ROCclr. Also update
the build example comment.
The build scripts used 2 custom variables to manually specify the
build and source directories for where to find VDI. Once renamed to
ROCclr, these conflicted with the variables automatically set by
find_package(ROCclr). These hacks existed only as an intermediate
step to work around commit ordering problems and get through PSDB.
The INSTALL.md documentation should also be updated, but it's
completely missing any mention of ROCclr now, and still gives
directions for hcc.
Change-Id: I6fc94b6cb36241a9d4f22d24e49523367f803461
[ROCm/hip commit: a2d2709ec1]
When libamdhip64_static.a is built by Jenkins, the square sample cannot
be built successfully because libamdhip64_static.a is archived in thin
mode. This patch archives it in full mode instead. As a result,
libamdhip64_static_temp.a is no longer needed and is removed.
Change-Id: Ifd3882598ef0dc5e7af8db0e389e786025ceb455
[ROCm/hip commit: 470b89a6bf]
This points to the cmake directory where the find module was found,
not a prefix for where it was found.
Based on the search below looking in roctracer, searching in ROCclr
for the header doesn't make much sense. The header should be provided
by either ROCclr or roctracer, not both. Having it possibly be provided by
two different dependencies is confusing, and a potential source of
version mismatch problems.
Change-Id: Ic2f6ec03f9a7b86225cf7e5c43f39a1360318a34
[ROCm/hip commit: d6aad8ae91]
If the start and stop events refer to the same command internally,
then measure from that command's start to its end
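
A minimal sketch of the rule above, not the actual runtime code. The
Command and Event types and the elapsedTicks helper are hypothetical, and
the end-to-end measurement used when the two events refer to different
commands is an assumption made only to complete the example.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a runtime command with profiling timestamps (ticks).
struct Command {
    std::uint64_t start;
    std::uint64_t end;
};

// Hypothetical event that only remembers the command it is attached to.
struct Event {
    const Command* command;
};

// Elapsed ticks between two recorded events.
std::int64_t elapsedTicks(const Event& startEv, const Event& stopEv) {
    assert(startEv.command != nullptr && stopEv.command != nullptr);
    if (startEv.command == stopEv.command) {
        // Same command behind both events: measure command start to command end.
        return static_cast<std::int64_t>(stopEv.command->end - startEv.command->start);
    }
    // Assumption: for different commands, measure end to end.
    return static_cast<std::int64_t>(stopEv.command->end - startEv.command->end);
}
```

Without the first branch, two events attached to the same command would
report zero elapsed ticks under the end-to-end rule assumed here.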
Change-Id: Ie70cfa37c06c06573f0ed58dab2bbe4434c1724b
[ROCm/hip commit: 50be95e169]
When the original size is divided across all GPUs, rounding can
occur, causing incorrect validation. Readjust the final value
for comparison to the new size accordingly.
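
The rounding effect can be sketched as follows. The helper names are
hypothetical, and truncating integer division is assumed as the way the
size is split; the test itself may divide the work differently.

```cpp
#include <cassert>
#include <cstddef>

// Elements each GPU receives when the total is split evenly.
// Integer division drops any remainder.
std::size_t perGpuSize(std::size_t total, std::size_t numGpus) {
    assert(numGpus > 0);
    return total / numGpus;
}

// Size actually covered by all GPUs together. Validation should compare
// against this value rather than the original total.
std::size_t adjustedSize(std::size_t total, std::size_t numGpus) {
    return perGpuSize(total, numGpus) * numGpus;
}
```

For example, 1000 elements over 3 GPUs gives 333 per GPU, so only 999
elements are processed and the expected value must be derived from 999.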
Change-Id: I9b42149e33dfcb328de7419e546a0202a69a8610
[ROCm/hip commit: 20f0e36041]
We need this; otherwise ROCr can give us a matching address
for another allocation and doing "insert" in ROCclr will not
update the map with the newest object. We would then end up
using stale objects (yikes)
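
A small illustration of the failure mode, assuming the map behaves like
std::map, whose insert leaves an existing key untouched. The AddressMap
alias and the track/untrack helpers are hypothetical, and erasing the
entry when the allocation goes away is only one possible fix, not
necessarily the one in this change.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical address-to-object map; a string stands in for the object.
using AddressMap = std::map<std::uintptr_t, std::string>;

// Returns true if a new entry was created. If the address is already
// present, std::map::insert keeps the old value and returns false.
bool track(AddressMap& m, std::uintptr_t addr, const std::string& obj) {
    return m.insert(AddressMap::value_type(addr, obj)).second;
}

// Drop the entry so a later allocation at the same address is tracked fresh.
void untrack(AddressMap& m, std::uintptr_t addr) {
    m.erase(addr);
}
```

If the old entry is never removed, a second track call for a reused
address silently keeps the stale object.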
SWDEV-234992
Change-Id: I3475adf9781a9309d64a024fae45181d7e5afb04
[ROCm/hip commit: a03fee04fe]
If hipModule(Un)Load is called from a different thread than hipInit, we need to grab the lock,
as both are going to modify modules_
Also add some logging to __hipExtractCodeObjectFromFatBinary for the case where no binary is found for the GPU
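
The locking pattern can be sketched like this. ModuleRegistry and its
members are hypothetical stand-ins; only the name modules_ comes from the
message above, and the real container type is not known from it.

```cpp
#include <algorithm>
#include <cstddef>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical registry: every access to modules_ goes through the same mutex.
class ModuleRegistry {
public:
    void load(const std::string& name) {
        std::lock_guard<std::mutex> guard(lock_);
        modules_.push_back(name);
    }

    void unload(const std::string& name) {
        std::lock_guard<std::mutex> guard(lock_);
        auto it = std::find(modules_.begin(), modules_.end(), name);
        if (it != modules_.end()) {
            modules_.erase(it);
        }
    }

    std::size_t size() const {
        std::lock_guard<std::mutex> guard(lock_);
        return modules_.size();
    }

private:
    mutable std::mutex lock_;            // guards modules_
    std::vector<std::string> modules_;   // shared between init and (un)load paths
};
```

Because load and unload both take lock_, calls arriving from different
threads are serialized instead of modifying the vector concurrently.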
SWDEV-236032
Change-Id: Icbd72b412502df80d5066cea42a4fbcd5b0b8a98
[ROCm/hip commit: f100ae3679]
This issue happens because we call getLastQueuedCommand when recording
the event and compute end_ - start_, so the result is the tick count
for the completion of the last command queued before the event record.
This may not happen if one records a marker command for hipEventRecord
Change-Id: I1d6b06a5befb3b93f16b67692c59dca25c982e0f
[ROCm/hip commit: 43986c6791]
Maintain compatibility with the old finding mechanism for now, for
the convenience of commit ordering.
Change-Id: I99b236cbb3d61b00650e3da7fe5931d4c4b3fec6
[ROCm/hip commit: 024764c337]
SWDEV-235579
Move the lock before destroying the queue, as there is a multithreaded race condition: while the
queue is being destroyed, right after we set queue_ to nullptr, another thread can call
ihipWaitStreams, which will then call create on that same stream because queue_ is now nullptr.
Taking the lock on streamSet first prevents this, because the stream is removed from that list
and ihipWaitStreams will therefore not call asHostQueue, which creates the queue if it has not
been created yet, on a stream that is no longer in the list
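
The ordering described above can be sketched as follows. All types here
are simplified, hypothetical stand-ins: Queue is an empty placeholder,
and StreamSet::destroy and StreamSet::waitAll only mimic the roles of
stream destruction and ihipWaitStreams.

```cpp
#include <cstddef>
#include <memory>
#include <mutex>
#include <set>

struct Queue {};  // placeholder for the real host queue

class Stream {
public:
    // Lazily creates the queue when it does not exist yet.
    Queue* asHostQueue() {
        if (!queue_) {
            queue_.reset(new Queue());
            ++creations_;
        }
        return queue_.get();
    }
    void releaseQueue() { queue_.reset(); }
    bool hasQueue() const { return queue_ != nullptr; }
    int creations() const { return creations_; }

private:
    std::unique_ptr<Queue> queue_;
    int creations_ = 0;
};

class StreamSet {
public:
    void add(Stream* s) {
        std::lock_guard<std::mutex> guard(lock_);
        streams_.insert(s);
    }

    // Take the lock first, remove the stream from the set, then release the queue.
    void destroy(Stream* s) {
        std::lock_guard<std::mutex> guard(lock_);
        streams_.erase(s);
        s->releaseQueue();
    }

    // Visits only streams still in the set; returns how many were visited.
    std::size_t waitAll() {
        std::lock_guard<std::mutex> guard(lock_);
        for (Stream* s : streams_) {
            s->asHostQueue();
        }
        return streams_.size();
    }

private:
    std::mutex lock_;
    std::set<Stream*> streams_;
};
```

Since destroy holds the lock while it erases the stream and releases the
queue, a concurrent waitAll either runs before the removal or no longer
sees the stream, so the queue is never recreated after destruction.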
Change-Id: I3108657ab403d39d4123e83294fcf1f0880e5563
[ROCm/hip commit: 6b361bc1a0]