Replace amd::Atomic with std::atomic. Remove make_atomic uses by
converting the variable to std::atomic and making sure the memory
order is relaxed when synchronizes-with is not needed.
Delete utils/atomic.hpp.
Change-Id: I0b36db8d604a8510ac6e36b32885fd16a1b8ccfa
1.Some files are not built in rocclr so the issues are
not found in rocclr build. But the issues are exposed
in TC build.
2.Clear unused codes in test cmake file.
Change-Id: I1ad4decdf4df5237b93e1ea2547eb39a19f7dc4a
There is a small window where a thread can go to sleep in
Monitor::wait after releasing the lock but before another thread
notifies the monitor and updates the on-deck thread.
A simple approach to fix this problem is to wake-up the Monitor::wait
every 10 milliseconds and check if it is on-deck.
Change-Id: I4b9abda89d1fc653cdae2b4c84cdda01efde1cf2
Reduce the size of the queueLock and lastCmdLock critical sections
to improve lock contention performance. The smaller the critical
sections are the better.
lasCmdLock is still needed to guarantee that getLastEnqueueCommand_
can retain the command before it is swapped out and released.
Change-Id: Id35d4a77c035b2da0de4c15568b153d49e958bb7
Replace constexpr with const in kernel source
codes because some kernel compiler doesn't
support constexpr.
Replace scheduler with __amd_rocclr_scheduler
due to name change.
Change-Id: I1ad4ddcdf1df5237b83e1ea2447eb39a19f7dc4a
root cause - cooperative queue is not inserted into queuePool_ (HSA queues) of ROC device calss causing a crash when creating hostcall buffers for printf
Change-Id: I3f9aceb4e5fe6a7c7a2a549a4bb0a3511fe02799
There is no synchronize with relationship between the monitor micro-
lock and the onDeck microlock, so it is possible for an onDeck.load to
move above a contendersList.store, or a contendersList.load to move
above an ondeck.store.
To fix this issue a full memory fence (mm_mfence on x86) is needed
after the last store in the contendersList and onDeck critical regions.
Change-Id: I5beb7dfe0d21010c5bf00cd65d59b9c7af58e919
Fix a typo with the name define, when compilation wasn't enabled.
Force CPU prefetch if system was forced in runtime
Change-Id: Id4b578f9fa44a45426fdb5d8ecb1da803aa42313
The current implementation creates default reference in the stack and assigns it to class member cuMasks_, so whenever the content of the stack changes, cuMask_ would change.
Change-Id: Iefab63c335d504b83c4ae90bd34ae76c6afb8f3c
Optimizaiton to remove extra syncs uncovered a bug with the cache
coherency layer, there runtime could lose the track of mem address
if coherency layer performed a sync.
Change-Id: I25647cfa4a4be9cdbd8577ff076a740bbdac79c8