The early return taken when the thread is not alive causes memory leaks:
neither doorbell_ nor urilocator is released in that case.
This change alters the logic so that the HostcallListener releases its
memory regardless of the thread status.
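A minimal sketch of the pattern, not the actual HostcallListener code: the class and member names below are hypothetical stand-ins, and std::unique_ptr is used only to make the release observable. Only the thread shutdown is conditional; the resource release is not.

```cpp
#include <atomic>
#include <memory>
#include <thread>

// Hypothetical stand-in for the listener; all names are illustrative only.
class Listener {
 public:
  Listener()
      : doorbell_(std::make_unique<int>(0)),
        locator_(std::make_unique<int>(0)) {}

  ~Listener() { terminate(); }

  // Start the worker thread. Optional, so the "not alive" path can be exercised.
  void start() {
    stop_.store(false);
    worker_ = std::thread([this] {
      while (!stop_.load()) std::this_thread::yield();
    });
  }

  void terminate() {
    // Only the thread shutdown depends on whether the thread is alive.
    if (worker_.joinable()) {
      stop_.store(true);
      worker_.join();
    }
    // Resources are released unconditionally, so nothing is left behind
    // when the thread was never started or has already exited.
    doorbell_.reset();
    locator_.reset();
  }

  bool released() const { return !doorbell_ && !locator_; }

 private:
  std::atomic<bool> stop_{false};
  std::thread worker_;
  std::unique_ptr<int> doorbell_;
  std::unique_ptr<int> locator_;
};
```

Calling terminate() on a Listener whose start() was never called still leaves released() returning true.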
Change-Id: Ie912360ec0e2ee257de9937b1a8d7375e6aebd83
1. Move the global amd::monitor listenerLock before the global
runtime_tear_down object, as it is referenced in
~RuntimeTearDown() after main() returns and must therefore be
destroyed later than runtime_tear_down.
2. Update Device::~Device() to SVM-free coopHostcallBuffer_
before context_ is released and freed.
Change-Id: I1d21378ff463477d3238d71e5e2a1a7d6b9147ad
Avoid a deadlock on hostcall buffer creation. Because the buffer is
allocated in the queue thread, use direct device memory allocation,
which skips the global context lock.
Change-Id: I09b55ee03bb42ab5d320c152b52a8c842c5fdcc1
- In ROCr, there is supposed to be exactly one HSA signal, whose pointer is stored in every hostcall buffer so that device code can find it
- However, hostcallListener->initDevice creates a new HSA signal every time enableHostcalls() gets called
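One way to keep a single signal is to create it lazily and hand back the same object on every later call. The sketch below assumes a hypothetical Signal type and Listener class; a real backend would hold an hsa_signal_t instead of an integer handle.

```cpp
#include <mutex>

// Hypothetical signal handle; illustrative only.
struct Signal {
  int handle = 0;
};

class Listener {
 public:
  // Returns the one shared signal, creating it only on the first call.
  Signal* initDevice() {
    std::lock_guard<std::mutex> guard(lock_);
    if (!initialized_) {
      signal_.handle = ++created_;  // stands in for the real signal creation
      initialized_ = true;
    }
    return &signal_;
  }

  // Number of signals created so far.
  int created() const { return created_; }

 private:
  std::mutex lock_;
  bool initialized_ = false;
  Signal signal_;
  int created_ = 0;
};
```

Repeated calls to initDevice() return the same pointer, so every hostcall buffer stores the same signal address.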
Change-Id: I100595ec37442bcdb73da5991062f0a474de2935
Update the timeout for the hostcall signal wait. If the timeout is too
small, the listener checks frequently enough to hurt performance for
certain applications, which may be CPU bound.
Change-Id: I0a879559e4ad111b09a994a5b82a6faf6e4fea3f
Change the scope of hostcall buffer access lock during destruction.
Make sure wait() returns the signal value after timeout. That
matches ROCr behaviour for HSA signal wait.
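A minimal sketch of that contract, assuming a hypothetical Signal class with an active (spinning) wait; the method name waitNe and the atomic-backed storage are illustrative and not taken from the actual code. On timeout the caller still receives the last observed value and decides what to do with it.

```cpp
#include <atomic>
#include <chrono>
#include <cstdint>

// Hypothetical emulated signal; names are illustrative only.
class Signal {
 public:
  explicit Signal(std::int64_t init) : value_(init) {}

  void store(std::int64_t v) { value_.store(v, std::memory_order_release); }

  // Spin until the value differs from `expected` or the timeout expires.
  // Either way, the return value is the signal value observed last.
  std::int64_t waitNe(std::int64_t expected,
                      std::chrono::microseconds timeout) const {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    std::int64_t v = value_.load(std::memory_order_acquire);
    while (v == expected && std::chrono::steady_clock::now() < deadline) {
      v = value_.load(std::memory_order_acquire);
    }
    return v;
  }

 private:
  std::atomic<std::int64_t> value_;
};
```

A caller compares the returned value against `expected` to tell a timeout apart from a real change.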
Change-Id: I3df34207e0c2e21972ec8052777e5742bda1dca0
Note that this requires base driver CL#2340320+ to have SQ interrupt
functionality enabled by default.
Change-Id: I04b936819ebe1eb7cf5de1db4fafe83af3a1b5f6
This reverts commit 9df70fa03ce60d47247eb0e8f278e1f8dbd33d6e.
Reason for revert: need SWDEV-294782 to be resolved before we can enable SQ interrupt support.
Change-Id: I328170b60f1a3aab28c0b1fd3191297a1a51ecb7
Since the majority of the Hostcall implementation now sits in the
common layer, the PAL backend only needs to invoke it. One thing
that is still missing, though, is HSA signal support.
The newly added pal::Signal class is a light emulation of what HSA
signals provide. The current implementation is just enough to get
Hostcall working, but it can be expanded in the future if needed to
fully emulate HSA signals.
The major difference for now between the PAL and ROCm hostcall
implementations is that PAL doesn't support blocking signals. This will
be enabled in the near future. For now, use an active wait for PAL.
Change-Id: I746557354ab9d71a7d4a31f9320fcc2fee5aee7f
This change unifies the hostcall implementation for all the backends,
by pushing the common logic to the device layer. This is done by
replacing the use of hsa_signal_t with device::Signal (a light wrapper
around it).
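The layering can be sketched roughly as follows. This is an assumption-laden illustration: the method set on device::Signal, the demo::EmulatedSignal backend, and the hasPendingWork helper are all hypothetical and only show how common code can depend on the abstract type alone.

```cpp
#include <atomic>
#include <cstdint>

namespace device {
// Backend-neutral signal interface; the method set here is an assumption.
class Signal {
 public:
  virtual ~Signal() = default;
  virtual void Reset(std::int64_t value) = 0;
  virtual std::int64_t Load() const = 0;
};
}  // namespace device

namespace demo {
// Hypothetical backend that emulates a signal with an atomic. A ROCm
// backend would forward these calls to an hsa_signal_t instead.
class EmulatedSignal final : public device::Signal {
 public:
  void Reset(std::int64_t value) override { value_.store(value); }
  std::int64_t Load() const override { return value_.load(); }

 private:
  std::atomic<std::int64_t> value_{0};
};
}  // namespace demo

// Common-layer code sees only device::Signal, never a backend type.
inline bool hasPendingWork(const device::Signal& doorbell, std::int64_t idle) {
  return doorbell.Load() != idle;
}
```

With this split, adding a backend means supplying another device::Signal subclass while the common logic stays unchanged.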
Change-Id: I7b6fca7930b5a0b199da5d85e2e048354cc04e7b