* Adding CPU-based execution, fixing typos, adding fine-grained memory
* Exposing sampling factor when generating range of data sizes
* Refactoring how Links are launched, now once per thread
* Documentation updates
* Upgrading TransferBench to support pinned CPU memory, expanding functionality, cleaning up env vars
* Adding standalone TransferBench tool