- Remove "call-to-call" for hipStreamCreate and hipEventCreate.
These now call internal functions rather than calling through
hipStreamCreateWithFlags and hipEventCreateWithFlags.
- Add HIP_INIT_API for more functions so they trace correctly.
- Use stream#DEVICE.STREAMID in debug messages via new specialization in
trace_helper.
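
A minimal sketch of the "call-to-call" removal described above, in plain C++.
Every name here (Stream, trace_api, create_stream_internal, streamCreate,
streamCreateWithFlags) is a hypothetical stand-in and is not taken from the HIP
sources. The assumption is that each public entry point records itself once at
entry, so a public function that calls another public function would produce
two trace entries for a single user call.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a stream object; not the HIP type.
struct Stream {
    unsigned flags;
};

// Hypothetical trace log: one entry per public API call.
static std::vector<std::string> g_trace;

// Hypothetical analogue of an API-entry trace hook.
static void trace_api(const char* name) { g_trace.push_back(name); }

// Internal helper: does the real work and never traces.
static int create_stream_internal(Stream** out, unsigned flags) {
    if (out == nullptr) return 1;  // nonzero means invalid argument
    *out = new Stream{flags};
    return 0;
}

// Public entry point with flags: traces once, then delegates internally.
int streamCreateWithFlags(Stream** out, unsigned flags) {
    trace_api("streamCreateWithFlags");
    return create_stream_internal(out, flags);
}

// Public entry point without flags: traces once and calls the internal
// helper directly instead of calling streamCreateWithFlags.
int streamCreate(Stream** out) {
    trace_api("streamCreate");
    return create_stream_internal(out, 0u);
}
```

With this layout a call to streamCreate leaves exactly one entry in the trace
log, named after the function the caller actually used.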
Note: the name is hipHostMalloc (not hipHostAlloc or hipMallocHost).
- The hipHost* prefix is used for all HIP APIs dealing with host memory
(including hipHostMalloc, hipHostFree, hipHostRegister,
hipHostUnregister, hipHostGetFlags, hipHostGetDevicePointer).
- hipHostMalloc is consistent with "hipMalloc" for allocating device
memory. The hipHostMalloc* enumerations are also used as the optional
flags parameter to hipHostMalloc.