* Fixing missing header include for ROCm 3.0 changes
* Adding ability to switch between memset/memcpy
* Adding standalone TransferBench tool