Add kernel execution perf test per block size and number of blocks.

Implement a method to roughly estimate the GPU's variable clock
frequency from clock64() and wall_clock64().
Change-Id: Ic87761a862d4a894fdcaab3431d63fe2592bb682
[ROCm/hip-tests commit: 0a22d14775]