Set the highest init_priority on the affected global variables so that
they are constructed first and destroyed last.
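
A minimal sketch of the ordering this relies on, assuming GCC or Clang
(init_priority is a compiler extension, not standard C++). The types
Runtime and Client and the helper names are hypothetical and are not
taken from this change. Priority 101 is the highest value available to
user code; objects without the attribute are initialized after all
prioritized ones in the same translation unit and destroyed before them.

```cpp
#include <cstdio>
#include <cstring>

// Zero-initialized statics need no dynamic initialization, so they are
// safe to touch from any global constructor.
static const char* g_order[2];
static int g_count = 0;

static void record(const char* name) { g_order[g_count++] = name; }

// Hypothetical stand-in for a runtime-wide global that others depend on.
struct Runtime {
  Runtime() { record("runtime"); }
  ~Runtime() { std::puts("runtime destroyed"); }
};

// Hypothetical stand-in for a global that uses the runtime.
struct Client {
  Client() { record("client"); }
  ~Client() { std::puts("client destroyed"); }
};

// Declared first, but with default priority: constructed after g_runtime.
Client g_client;

// Highest user priority: constructed first, destroyed last.
Runtime g_runtime __attribute__((init_priority(101)));

// Returns true if g_runtime was constructed before g_client.
bool runtimeConstructedFirst() {
  return g_count == 2 && std::strcmp(g_order[0], "runtime") == 0 &&
         std::strcmp(g_order[1], "client") == 0;
}
```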
Change-Id: Ied59fbecab66ba8195c4a7a02b6bef9fa2fad3af
TF no longer reserves all available memory. Any client that
wants to reserve memory can explicitly set the
HIP_HIDDEN_FREE_MEM env var.
Change-Id: Ied3a948b79f49aa7327f6a820e9789e39cec143b
This workaround avoids the performance penalty of the SDMA engine
taking a while to clock up from a lower DPM state. Add the env var
GPU_FORCE_BLIT_COPY_SIZE (in KB; 1024 by default for HIP). Forcing
the Src and Dst agents to be amdgpu makes ROCr take the blit copy path
for what would otherwise have been an SDMA copy.
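
The size check can be sketched roughly as below. This is an
illustration only: forceBlitCopySizeKB, chooseCopyPath and CopyPath are
hypothetical names, and treating the threshold as inclusive is an
assumption, not something stated by this change.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical helper: read the threshold in KB from the environment,
// falling back to 1024 (the HIP default named in this message).
std::size_t forceBlitCopySizeKB() {
  const char* value = std::getenv("GPU_FORCE_BLIT_COPY_SIZE");
  if (value == nullptr || *value == '\0') return 1024;
  return static_cast<std::size_t>(std::strtoull(value, nullptr, 10));
}

enum class CopyPath { Blit, Sdma };

// Assumption: copies at or below the threshold take the blit path,
// larger copies stay on SDMA.
CopyPath chooseCopyPath(std::size_t bytes, std::size_t thresholdKB) {
  return bytes <= thresholdKB * 1024 ? CopyPath::Blit : CopyPath::Sdma;
}
```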
Change-Id: I222f687155f86000d17d66d25182e490b6710463
Don't return an error when querying the number of devices if no devices are present in the system.
Just return 0 for the number of devices in this case and let the application handle the situation.
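
A minimal sketch of the intended behavior, using hypothetical names
(Status, Device, getDeviceCount) rather than the actual runtime API: an
empty device list is reported as a successful query with a count of 0.

```cpp
#include <vector>

// Hypothetical status codes and device type, for illustration only.
enum class Status { Success, InvalidValue };
struct Device {};

// An empty device list is not an error: report 0 and succeed, so the
// caller decides how to handle a system without devices.
Status getDeviceCount(const std::vector<Device>& devices, int* count) {
  if (count == nullptr) return Status::InvalidValue;
  *count = static_cast<int>(devices.size());
  return Status::Success;
}
```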
Change-Id: I20614ade5e649f3ce9ddd970d4b38bfe296f6cdb
~45% to 50% performance drop on the rocBLAS_int8 test
Add support for active waits without blocking the host thread.
Change-Id: Ie7bb48dcafcb4c93d448bf74749b829b626c3578