We do not want to release resources during setStatus in HIP because of Graphs
Change-Id: Idc7b188ab5f8be6975ea91005dd2bbf177401f8c
[ROCm/clr commit: 133287f31f]
Add lock protection for signal processing
If a signal is reused, disable the reference to it from HIP
Increase the signal pool size to 32
Change-Id: I7d529b35910f83ce577c9eca6d3386759611ccc0
[ROCm/clr commit: a1629cad26]
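The locking and reuse behavior described in the signal-processing change above could look roughly like the following sketch. `PooledSignal`, `SignalPool`, `Acquire`, and `Release` are hypothetical names chosen for illustration and are not taken from ROCclr; only the pool size of 32 comes from the commit message.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <mutex>

// Hypothetical stand-in for a pooled HSA signal; not a ROCclr type.
struct PooledSignal {
  bool in_use = false;
  const void* owner = nullptr;  // HIP-side object that currently refers to the signal
};

// Hypothetical pool; every access to the signals goes through one mutex.
class SignalPool {
 public:
  static constexpr std::size_t kPoolSize = 32;  // pool size named in the commit

  // Returns a free signal bound to new_owner, or nullptr if the pool is exhausted.
  PooledSignal* Acquire(const void* new_owner) {
    std::lock_guard<std::mutex> lock(mutex_);
    for (auto& signal : signals_) {
      if (!signal.in_use) {
        signal.in_use = true;
        signal.owner = new_owner;  // a reused signal never keeps its old owner
        return &signal;
      }
    }
    return nullptr;
  }

  // Returns the signal to the pool and drops the reference held on the HIP side.
  void Release(PooledSignal* signal) {
    assert(signal != nullptr);
    std::lock_guard<std::mutex> lock(mutex_);
    signal->in_use = false;
    signal->owner = nullptr;
  }

 private:
  std::mutex mutex_;
  std::array<PooledSignal, kPoolSize> signals_{};
};
```

Clearing `owner` on release is one way to model "disable reference to it from HIP": a later user of the same slot cannot be confused with the earlier one.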
- Create an env var ROC_ACTIVE_WAIT_TIMEOUT to set active wait timeout
- Record profiling information if the marker_ts_ property is valid.
Change-Id: If0d8aec8d9b0715027cf0f7c3dc8a4c722a6bae6
[ROCm/clr commit: b416ad7b9d]
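Reading a timeout such as ROC_ACTIVE_WAIT_TIMEOUT from the environment might be sketched as below. The function names, the microsecond unit, and the fallback behavior are assumptions made for illustration; the commit message does not specify them.

```cpp
#include <cassert>
#include <chrono>
#include <cstdlib>

// Parses a decimal timeout string. The unit (microseconds) is an assumption.
// Returns fallback for null, empty, non-numeric, or negative input.
std::chrono::microseconds ParseActiveWaitTimeout(const char* raw,
                                                 std::chrono::microseconds fallback) {
  if (raw == nullptr || *raw == '\0') return fallback;
  char* end = nullptr;
  const long long value = std::strtoll(raw, &end, 10);
  if (end == raw || *end != '\0' || value < 0) return fallback;
  return std::chrono::microseconds(value);
}

// Hypothetical helper: looks up the variable named in the commit message.
std::chrono::microseconds ActiveWaitTimeout(std::chrono::microseconds fallback) {
  return ParseActiveWaitTimeout(std::getenv("ROC_ACTIVE_WAIT_TIMEOUT"), fallback);
}
```

Splitting the parsing from the `std::getenv` lookup keeps the validation logic checkable without touching the process environment.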
If an AMD event contains a reference to a HW event, then the runtime
can check or wait on the HW event directly. The CPU status update will
occur later, after the HSA signal callback, but that does not affect the result.
Change-Id: I591391a953bbdba6a25ac07e2cd98aeb17cd4596
[ROCm/clr commit: 85c70a7495]
Revert back to using the Raven (gfx902) target ID for Raven 2 (gfx909).
This is due to the HSAIL compiler not supporting gfx909.
In theory, there should be no issue with running Raven ISA on Raven 2.
Change-Id: I425edebc99075799eda5522fad231b8fb3184873
[ROCm/clr commit: 0b1481d4f1]
For DD, send a NOP packet so that we leverage the handler to indicate
completion.
Change-Id: Ie57ea0124a8497d39cc49da1c4575c2cd86b9319
[ROCm/clr commit: 9d0846e732]
Switch HSA_AMD_SVM_ATTRIB_READ_ONLY to
HSA_AMD_SVM_ATTRIB_READ_MOSTLY to match CUDA. The new attribute
was just exposed in ROCr/KFD.
Change-Id: I2ee522d33c347ba52a4e272d2cd7f67960490cf7
[ROCm/clr commit: 89b69638d1]
All KMD/asic_reg/UGL headers are located under the drivers folder. No
need for the AMD_UGL_PATH variable as it essentially is
${AMD_DRIVERS_PATH}/ugl.
Change-Id: I070d737d50f2096493b3e75ef9b9e824cb19d048
[ROCm/clr commit: 1423c1db64]
Add an extension to memory advise to disable cache coherency for
better performance
Change-Id: I283703d81d9c36ddfa2c8fffa15eef60e2195056
[ROCm/clr commit: a9a1e21445]
With a HIP API callback, the runtime has to stall the queue until the
callback is done. ROCclr will introduce a SW blocking HSA signal,
which will be released after the callback is done.
Change-Id: I6411f3efab31b468e3b87ebb5c8d155e116b613d
[ROCm/clr commit: d93df7037c]
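The idea of a software blocking signal that is released only after the callback finishes can be modeled on the host side as follows. This is a minimal sketch using a condition variable, not the HSA signal mechanism itself; `BlockingSignal` and `RunCallbackThenRelease` are hypothetical names.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <mutex>

// Hypothetical host-side model of a blocking signal: waiters stall until Release().
class BlockingSignal {
 public:
  void Release() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      released_ = true;
    }
    cv_.notify_all();
  }

  // Blocks the caller (standing in for the stalled queue) until the signal is released.
  void Wait() {
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return released_; });
  }

  bool IsReleased() {
    std::lock_guard<std::mutex> lock(mutex_);
    return released_;
  }

 private:
  std::mutex mutex_;
  std::condition_variable cv_;
  bool released_ = false;
};

// Runs the user callback first and releases the signal only afterwards,
// so anything waiting on the signal cannot proceed while the callback runs.
void RunCallbackThenRelease(const std::function<void()>& callback, BlockingSignal& signal) {
  callback();
  signal.Release();
}
```

In this model, work queued behind the callback would call `Wait()` before continuing, which gives the ordering the commit describes.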
This change refactors the current ROCclr cmake build to accommodate a
more modular approach. This allows easier support for multiple compilers
and/or multiple runtime backends.
Currently supported compilers:
HSAIL - enabled by ROCCLR_ENABLE_HSAIL (defaults to OFF)
LC - enabled by ROCCLR_ENABLE_LC (defaults to ON)
Currently supported runtimes:
HSA - enabled by ROCCLR_ENABLE_HSA (defaults to ON)
PAL - enabled by ROCCLR_ENABLE_PAL (defaults to OFF)
Any configuration is supported as long as at least one compiler and one
runtime is enabled.
Since ROCclr clients can configure it differently, one cannot reuse the
same ROCclr build artifacts between different clients. To ensure this,
this patch assumes that ROCclr will be built as part of the client's
project.
Change-Id: Id4a5c43634296802b8ae87d1ad5984968391ccaf
[ROCm/clr commit: 7f0c18457d]
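The configuration rule from the build refactoring above (at least one compiler and at least one runtime enabled) can be expressed as a small predicate. The struct and function below are hypothetical and only mirror the four option names and defaults listed in the commit message; they are not part of the ROCclr build.

```cpp
#include <cassert>

// Hypothetical mirror of the four build options, with the defaults from the commit message.
struct RocclrBuildConfig {
  bool enable_hsail = false;  // ROCCLR_ENABLE_HSAIL
  bool enable_lc = true;      // ROCCLR_ENABLE_LC
  bool enable_hsa = true;     // ROCCLR_ENABLE_HSA
  bool enable_pal = false;    // ROCCLR_ENABLE_PAL
};

// A configuration is supported when at least one compiler and one runtime are enabled.
inline bool IsSupported(const RocclrBuildConfig& config) {
  const bool has_compiler = config.enable_hsail || config.enable_lc;
  const bool has_runtime = config.enable_hsa || config.enable_pal;
  return has_compiler && has_runtime;
}
```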
- Make sure barrier with dependent signals issues before queue
index reservation
- Don't issue extra barrier if it's already a barrier command
with dependent signals
Change-Id: I179a8b7adac79eed698f4a4d9eca2606d8e913aa
[ROCm/clr commit: 148a5cac39]
In the HSAIL path, kernel akc info is obtained after code object
loading, and kernel signature creation requires the akc info
when one of the kernel arguments is a reference object.
Change-Id: I9cdb1dbf2c72f4620b0b6e46a88402a2473c3e97
[ROCm/clr commit: 8cac880779]
With direct dispatch enabled, make sure the queue is done before
destruction.
Change-Id: Ib80af3efb97dfb93e2dce60a11db34fb5c45f5cd
[ROCm/clr commit: a81756bba3]