* SWDEV-532479 - Add tracking of hostcall memory allocations
* SWDEV-532479 - Remove hostcall allocations if request is received
* SWDEV-532479 - Cleanup
* SWDEV-532479 - Naming fix
* SWDEV-532479 - Add new separator after each new function
[ROCm/clr commit: b58faa2f37]
The early return when the thread is not alive causes memory leaks:
neither doorbell_ nor urilocator is released in that case.
This change alters the logic so that the HostcallListener releases its
memory regardless of the thread status.
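
The leak fix described above can be sketched as follows. This is a minimal illustration, not the actual HostcallListener code: the class ListenerSketch, its members, and its methods are hypothetical stand-ins, and the resources are plain heap integers. The point is only that the release step sits outside the "is the thread alive" branch.

```cpp
#include <atomic>
#include <cassert>
#include <memory>
#include <thread>

// Hypothetical stand-in for the listener; not the real HostcallListener.
class ListenerSketch {
 public:
  ListenerSketch() : doorbell_(new int(0)), locator_(new int(0)) {}
  ~ListenerSketch() { terminate(); }

  // Starts a worker thread that spins until asked to stop.
  void start() {
    stop_.store(false);
    worker_ = std::thread([this] {
      while (!stop_.load()) std::this_thread::yield();
    });
  }

  bool threadAlive() const { return worker_.joinable(); }

  void terminate() {
    // Only the thread shutdown depends on the thread status.
    if (threadAlive()) {
      stop_.store(true);
      worker_.join();
    }
    // Resources are released unconditionally (no early return above).
    doorbell_.reset();
    locator_.reset();
  }

  bool released() const { return !doorbell_ && !locator_; }

 private:
  std::unique_ptr<int> doorbell_;  // stands in for the doorbell signal
  std::unique_ptr<int> locator_;   // stands in for the URI locator
  std::thread worker_;
  std::atomic<bool> stop_{false};
};
```

Calling terminate() on a listener that was never started and on one with a running worker leaves both in the released state.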
Change-Id: Ie912360ec0e2ee257de9937b1a8d7375e6aebd83
[ROCm/clr commit: f0063ba8da]
Hence, it is not required to check it if the thread has already finished processing packets.
Change-Id: If1b43a169a06203f3e1ab0529cf592879496d7c4
[ROCm/clr commit: 3f3f3d0f1c]
1. Move the global amd::monitor listenerLock before the global
runtime_tear_down object, because it is referenced in
~RuntimeTearDown() after main() and must therefore be destroyed
later than runtime_tear_down.
2. Update Device::~Device() to SVM-free coopHostcallBuffer_
before context_ is released and freed.
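
The ordering argument in item 1 relies on the C++ rule that objects are destroyed in the reverse order of their construction. The sketch below shows that rule with local objects in a block scope, which follow the same reverse-order rule as namespace-scope objects defined in one translation unit and make the order easy to observe. LockSketch and TearDownSketch are hypothetical stand-ins, not the real amd::monitor or RuntimeTearDown.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the lock; records its own destruction.
struct LockSketch {
  std::vector<std::string>* log;
  bool alive = true;
  explicit LockSketch(std::vector<std::string>* l) : log(l) {}
  ~LockSketch() {
    alive = false;
    log->push_back("lock destroyed");
  }
};

// Hypothetical stand-in for the teardown object; its destructor uses the lock.
struct TearDownSketch {
  LockSketch* lock;
  std::vector<std::string>* log;
  TearDownSketch(LockSketch* lk, std::vector<std::string>* l) : lock(lk), log(l) {}
  ~TearDownSketch() {
    log->push_back(lock->alive ? "teardown used live lock"
                               : "teardown used dead lock");
  }
};

// Defines the lock first, so it is destroyed last.
std::vector<std::string> runTeardownOrder() {
  std::vector<std::string> log;
  {
    LockSketch listenerLock(&log);                 // constructed first
    TearDownSketch tearDown(&listenerLock, &log);  // constructed second
  }  // tearDown is destroyed first, then listenerLock
  return log;
}
```

Because the lock is defined first, the teardown destructor still sees a live lock when it runs.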
Change-Id: I1d21378ff463477d3238d71e5e2a1a7d6b9147ad
[ROCm/clr commit: 544c45364f]
Avoid a deadlock on the host call buffer creation. Since the buffer will be
allocated in the queue thread, use direct device memory allocation,
skipping the global context lock.
Change-Id: I09b55ee03bb42ab5d320c152b52a8c842c5fdcc1
[ROCm/clr commit: 62559a6e5a]
- In ROCr, there is supposed to be exactly one HSA signal ever whose pointer is stored in every hostcall buffer so that device code can find it
- But hostcallListener->initDevice creates a new HSA signal every time enableHostcalls() gets called
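
One way to satisfy the "exactly one signal" requirement is to create the signal lazily on the first call and hand out the same pointer afterwards. The sketch below assumes that this is the intended shape of the fix; the message above only states the problem. SignalSketch and ListenerInitSketch are hypothetical names, and no HSA API is called.

```cpp
#include <cassert>
#include <memory>
#include <mutex>

// Hypothetical stand-in for an HSA signal handle.
struct SignalSketch {
  int id;
};

// Hypothetical listener that creates its signal only once.
class ListenerInitSketch {
 public:
  // Returns the shared signal, creating it on the first call only.
  SignalSketch* initDevice() {
    std::lock_guard<std::mutex> guard(lock_);
    if (!signal_) {
      ++created_;
      signal_.reset(new SignalSketch{created_});
    }
    return signal_.get();
  }

  // Number of signals created so far.
  int created() {
    std::lock_guard<std::mutex> guard(lock_);
    return created_;
  }

 private:
  std::mutex lock_;
  std::unique_ptr<SignalSketch> signal_;
  int created_ = 0;
};
```

Repeated calls then return the same pointer, so every hostcall buffer can store the same signal.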
Change-Id: I100595ec37442bcdb73da5991062f0a474de2935
[ROCm/clr commit: 42da508815]
Update the timeout for the hostcall wait on the signal. If the timeout is
small, it checks frequently enough to affect performance for certain
applications, which may be CPU bound.
Change-Id: I0a879559e4ad111b09a994a5b82a6faf6e4fea3f
[ROCm/clr commit: 9292abb2d8]
Change the scope of hostcall buffer access lock during destruction.
Make sure wait() returns the signal value after timeout. That
matches ROCr behaviour for HSA signal wait.
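
The wait() contract described above can be sketched as a timed wait that always returns the value it last observed, whether or not the condition was met. This is an illustration only: TimedSignalSketch and waitEq are hypothetical names, and the sketch uses a standard condition variable rather than any HSA or PAL call.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstdint>
#include <mutex>

// Hypothetical signal with a timed, blocking wait.
class TimedSignalSketch {
 public:
  explicit TimedSignalSketch(std::int64_t v) : value_(v) {}

  void store(std::int64_t v) {
    {
      std::lock_guard<std::mutex> guard(lock_);
      value_ = v;
    }
    cv_.notify_all();
  }

  // Waits until the value equals `expected` or the timeout expires.
  // In both cases the observed value is returned, so the caller can
  // tell a timeout apart by comparing the result with `expected`.
  std::int64_t waitEq(std::int64_t expected, std::chrono::milliseconds timeout) {
    std::unique_lock<std::mutex> guard(lock_);
    cv_.wait_for(guard, timeout, [this, expected] { return value_ == expected; });
    return value_;
  }

 private:
  std::mutex lock_;
  std::condition_variable cv_;
  std::int64_t value_;
};
```

A caller that receives a value different from the expected one knows the wait timed out and can simply retry.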
Change-Id: I3df34207e0c2e21972ec8052777e5742bda1dca0
[ROCm/clr commit: 9a9d10a10b]
Note that this requires base driver CL#2340320+ to have SQ interrupt
functionality enabled by default.
Change-Id: I04b936819ebe1eb7cf5de1db4fafe83af3a1b5f6
[ROCm/clr commit: 4171e9e0a3]
This reverts commit 9df70fa03ce60d47247eb0e8f278e1f8dbd33d6e.
Reason for revert: need SWDEV-294782 to be resolved before we can enable SQ interrupt support.
Change-Id: I328170b60f1a3aab28c0b1fd3191297a1a51ecb7
[ROCm/clr commit: 6566361144]
Since the majority of the Hostcall implementation now sits in the
common layer, the PAL backend just needs to invoke it. One thing
that is missing though is HSA signal support.
The newly added pal::Signal class is a light emulation of what HSA
signals provide. The current implementation is just enough to get
Hostcall working, but it can be expanded in the future if needed to
fully emulate HSA signals.
The major difference for now between the PAL and ROCm hostcall
implementations is that PAL doesn't support blocking signals. This will
be enabled in the near future. For now, use active wait for PAL.
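
An active wait of the kind mentioned above can be sketched as a polling loop over an atomic value. This is not the pal::Signal implementation; ActiveWaitSignalSketch and its methods are hypothetical, and the sketch only shows the difference from a blocking wait: the waiting thread keeps running and re-reads the value until it matches or the deadline passes.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Hypothetical signal that supports only active (polling) waits.
class ActiveWaitSignalSketch {
 public:
  explicit ActiveWaitSignalSketch(std::int64_t v) : value_(v) {}

  void store(std::int64_t v) { value_.store(v, std::memory_order_release); }

  // Polls until the value equals `expected` or the timeout expires,
  // then returns the last observed value.
  std::int64_t waitEq(std::int64_t expected, std::chrono::milliseconds timeout) const {
    const auto deadline = std::chrono::steady_clock::now() + timeout;
    std::int64_t observed = value_.load(std::memory_order_acquire);
    while (observed != expected && std::chrono::steady_clock::now() < deadline) {
      std::this_thread::yield();  // stay runnable instead of sleeping on the signal
      observed = value_.load(std::memory_order_acquire);
    }
    return observed;
  }

 private:
  std::atomic<std::int64_t> value_;
};
```

The trade-off is CPU time spent polling in exchange for not needing interrupt-driven wakeups.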
Change-Id: I746557354ab9d71a7d4a31f9320fcc2fee5aee7f
[ROCm/clr commit: 99e8ac55cd]
This change unifies the hostcall implementation for all the backends,
by pushing the common logic to the device layer. This is done by
replacing the use of hsa_signal_t with device::Signal (a light wrapper
around it).
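
The unification described above can be pictured as common code written once against an abstract signal type, with each backend supplying its own implementation. The sketch below is an assumption about the general shape only: SignalIface, FakeBackendSignal, and consumeDoorbell are hypothetical names and do not reflect the actual device::Signal interface.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical backend-neutral signal interface.
class SignalIface {
 public:
  virtual ~SignalIface() = default;
  virtual std::int64_t load() const = 0;
  virtual void store(std::int64_t v) = 0;
};

// Stand-in backend that keeps the value in a plain integer
// (single-threaded sketch; a real backend would wrap its native signal).
class FakeBackendSignal : public SignalIface {
 public:
  std::int64_t load() const override { return value_; }
  void store(std::int64_t v) override { value_ = v; }

 private:
  std::int64_t value_ = 0;
};

// Common-layer logic that depends only on the interface:
// reports whether the doorbell was rung and clears it.
bool consumeDoorbell(SignalIface& doorbell) {
  if (doorbell.load() == 0) return false;
  doorbell.store(0);
  return true;
}
```

With this split, adding a backend means implementing the interface, while the shared logic stays untouched.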
Change-Id: I7b6fca7930b5a0b199da5d85e2e048354cc04e7b
[ROCm/clr commit: 671778bdd3]