We need this; otherwise ROCr can give us a matching address
for another allocation, and doing "insert" in ROCclr will not
update the map with the newest object. We would then end up
using stale objects (yikes)
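
The failure mode can be sketched with a plain std::map, whose insert() leaves an existing key untouched. This is only an illustration: MemObject, AddressMap and both helper functions are hypothetical names, and the real ROCclr map type may differ.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for a runtime memory object.
struct MemObject {
  std::string label;
};

// Hypothetical address -> object map (the real ROCclr container may differ).
using AddressMap = std::map<std::uintptr_t, MemObject>;

// Registers an allocation with plain insert(). If the address is already
// present, insert() does nothing and the stale object stays in the map.
// Returns true only when a new entry was actually created.
inline bool registerWithInsert(AddressMap& map, std::uintptr_t addr, MemObject obj) {
  assert(addr != 0 && "a null address is never a valid allocation");
  return map.insert(AddressMap::value_type(addr, std::move(obj))).second;
}

// Drops the entry when the allocation is freed, so that a later allocation
// that reuses the same address can be inserted cleanly.
inline void unregisterOnFree(AddressMap& map, std::uintptr_t addr) {
  map.erase(addr);
}
```

Without the erase on free, a second registerWithInsert() for a reused address returns false and lookups keep returning the first object.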
SWDEV-234992
Change-Id: I3475adf9781a9309d64a024fae45181d7e5afb04
[ROCm/hip commit: a03fee04fe]
If hipModule(Un)Load is called from a different thread than hipInit, we need to grab the lock,
as both are going to modify modules_
Also add some logging to __hipExtractCodeObjectFromFatBinary for the case where no binary is found for the GPU
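
A minimal sketch of the locking pattern, assuming a single mutex guards the module container. ModuleRegistry and its members are hypothetical names, not the actual HIP classes; only modules_ is borrowed from the text above.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_set>

// Hypothetical registry: every path that touches modules_ takes the same lock,
// so load/unload may safely run on a different thread than initialization.
class ModuleRegistry {
 public:
  // Returns true if the module was newly added.
  bool load(const std::string& name) {
    assert(!name.empty());
    std::lock_guard<std::mutex> guard(lock_);
    return modules_.insert(name).second;
  }

  // Returns true if the module was present and has been removed.
  bool unload(const std::string& name) {
    std::lock_guard<std::mutex> guard(lock_);
    return modules_.erase(name) == 1;
  }

  std::size_t size() const {
    std::lock_guard<std::mutex> guard(lock_);
    return modules_.size();
  }

 private:
  mutable std::mutex lock_;                  // guards modules_
  std::unordered_set<std::string> modules_;  // loaded module names
};
```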
SWDEV-236032
Change-Id: Icbd72b412502df80d5066cea42a4fbcd5b0b8a98
[ROCm/hip commit: f100ae3679]
This issue happens because we call getLastQueuedCommand when recording
the event and compute end_ - start_, so it takes the ticks for the
completion of the last command before the event record. This may not
happen if one records a marker command for hipEventRecord
Change-Id: I1d6b06a5befb3b93f16b67692c59dca25c982e0f
[ROCm/hip commit: 43986c6791]
Maintain compatibility with the old finding for now for the
convenience of commit order.
Change-Id: I99b236cbb3d61b00650e3da7fe5931d4c4b3fec6
[ROCm/hip commit: 024764c337]
SWDEV-235579
Move the lock to before destroying the queue. There is a multithreaded race condition: while the queue
is being destroyed, right after we set queue_ to nullptr, another thread can call ihipWaitStreams,
which will then call create on that same stream because its queue is now nullptr.
Taking the lock on streamSet first prevents this from happening, because we remove the stream from that
list before destroying the queue. ihipWait will therefore not call asHostQueue (which creates the queue
if it does not exist yet) on a stream that is no longer in the list.
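
The ordering can be sketched as follows. This is a simplified model, not the HIP implementation: Queue, Stream, StreamSet and createCount are hypothetical, and only the name asHostQueue is borrowed from the text above.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <unordered_set>

struct Queue {};  // hypothetical stand-in for a host queue

struct Stream {
  std::unique_ptr<Queue> queue;
  int createCount = 0;  // how many times the queue was (re)created

  // Lazily creates the queue if it does not exist yet.
  Queue* asHostQueue() {
    if (!queue) {
      queue = std::make_unique<Queue>();
      ++createCount;
    }
    return queue.get();
  }
};

class StreamSet {
 public:
  void add(Stream* s) {
    assert(s != nullptr);
    std::lock_guard<std::mutex> guard(lock_);
    set_.insert(s);
  }

  // Take the lock first, remove the stream from the set, then destroy the
  // queue. A concurrent waitAll() can no longer see the stream and recreate it.
  void destroy(Stream* s) {
    std::lock_guard<std::mutex> guard(lock_);
    set_.erase(s);
    s->queue.reset();
  }

  // Touches the queue of every stream that is still registered.
  void waitAll() {
    std::lock_guard<std::mutex> guard(lock_);
    for (Stream* s : set_) s->asHostQueue();
  }

 private:
  std::mutex lock_;
  std::unordered_set<Stream*> set_;
};
```

If destroy() reset the queue before taking the lock, a waitAll() on another thread could run in between and recreate the queue of a stream that is about to go away.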
Change-Id: I3108657ab403d39d4123e83294fcf1f0880e5563
[ROCm/hip commit: 6b361bc1a0]
This technique should never be used; the functionality should only
be accessed through __builtins.
There's currently no builtin for groupstaticsize. I left ds_swizzle
alone since, for some reason, it switches to the builtin depending
on __HCC__.
Change-Id: If1e1394221dba83ea4add6db5e94d6b715552044
[ROCm/hip commit: d2dd307c7d]
The hipcc script takes arguments and uses them to build up a new
command. Characters which are special to the shell need to be quoted
to prevent them being interpreted.
In particular adding
-Wl,--enable-new-dtags -Wl,--rpath,'$ORIGIN:$ORIGIN/../lib'
to the command should pass quoted dollar signs into the resulting
string so the shell passes them on, rather than substituting the
values.
The arguments are processed in a conventional loop, but can be altered
during the course of the loop, and also by linker response files.
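
The quoting idea can be sketched in C++ (hipcc itself is a script, so this is purely illustrative). quoteForShell is a hypothetical helper, and the set of characters treated as safe is an assumption that may not match what hipcc actually uses.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Backslash-escapes every character that is not known to be shell-safe, so the
// shell passes the argument through literally instead of expanding it.
// ASSUMPTION: alphanumerics plus "-_=+,./" are treated as safe; the real script
// may use a different set.
inline std::string quoteForShell(const std::string& arg) {
  static const std::string safePunct = "-_=+,./";
  std::string out;
  out.reserve(arg.size() * 2);
  for (char c : arg) {
    const bool safe = std::isalnum(static_cast<unsigned char>(c)) != 0 ||
                      safePunct.find(c) != std::string::npos;
    if (!safe) out.push_back('\\');
    out.push_back(c);
  }
  assert(out.size() >= arg.size());  // escaping never shortens the argument
  return out;
}
```

With this safe set, the dollar signs and colons in an rpath argument gain a leading backslash, while an ordinary file name such as fred.c is left unchanged.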
Tested by running
HIPCC_VERBOSE=7 HIP_COMPILER=clang hipcc --cxxflags \
fred.c -Wl,,--rpath,'$ORIGIN:$ORIGIN:/../lib'
and observing "-Wl,--rpath,\$ORIGIN\:\$ORIGIN\:..\/lib" in the
displayed hipcc-cmd output (and ignoring the errors due to rocm not
being installed)
Change-Id: I26b62f09ff3518cceeb85fa8823bb12a95c1c78e
Signed-off-by: Icarus Sparry <icarus.sparry@amd.com>
[ROCm/hip commit: a4f01ffca6]
We should be returning the max workgroup size calculated by the compiler.
Change-Id: If86590efbb9b291f470bdbe87e5df992e661c539
[ROCm/hip commit: 1b1c032e9f]
Before setting the HIP_RUNTIME and HIP_COMPILER variables, first check
whether they are already set in the environment. We should prioritize
the environment settings. On Windows they will be set, and we also
explicitly call perl when invoking hipconfig.
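
The "environment first" rule can be sketched as below. The actual scripts are not written in C++; envOrDefault is a hypothetical helper, and treating an empty value as unset is an assumption.

```cpp
#include <cassert>
#include <cstdlib>
#include <string>

// Returns the value of the environment variable `name` when it is set and
// non-empty; otherwise returns the computed default.
// ASSUMPTION: an empty value is treated the same as an unset variable.
inline std::string envOrDefault(const char* name, const std::string& fallback) {
  assert(name != nullptr);
  const char* value = std::getenv(name);
  return (value != nullptr && *value != '\0') ? std::string(value) : fallback;
}
```

A caller would use it as, for example, envOrDefault("HIP_COMPILER", detected), where detected is whatever default the script would otherwise compute.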
Change-Id: I89ad267285239e6d8a897dc681c4af5906e7b9d8
[ROCm/hip commit: 5fbae827c2]
Support performance tests while keeping the direct test commands unchanged.
To build performance tests, run "make build_perf".
To run all performance tests, run "make perf".
To run specific tests, run, for example,
/usr/bin/ctest -C performance -R performance_tests/perfDispatch --verbose
To run an individual test, run, for example,
performance_tests/memory/hipPerfMemMallocCpyFree
Change-Id: I168c1b9ef1ec21b392d48648d0c71e8fbd37d57b
[ROCm/hip commit: 6e972dd3bb]