8264d5d6bd
Add support for both a CUDA-compatible implementation and a faster hcc implementation, with test coverage

Change-Id: I79a22344f458391d7dffac5f147619a542e97e4e