- hipLaunchModuleKernel maps to cuLaunchKernel
- Whole lot of new error codes added for the use of driver api
- KernelParams arguments is not yet supported
- hipLaunchModuleKernel is a synchronous api (will change eventually)
- All the commands in a stream will wait on host when hipLaunchModuleKernel is called on it
Change-Id: Ib4a4fae1db06fbb3a81d5a5575b026aa821264ed
- Complete translation tables for cudaError <-> hipError_t.
- Remove some odd errors that were not correctly translated or not used.
- Add HIPCHECK_API to test infrastructure. Used for negative testing
an API ; if a mismatch occurs it shows the expected return error
code. Can also print a warning rather than error.
- Enable hipMemoryAllocate on NV system, and review error coded.
- Add hipErrorName to nvcc.
Change-Id: I680427dcf32a5796d5913cf9e7f3b4c6f6b91599
Conflicts:
tests/src/CMakeLists.txt
Bug fixes and improved docs for hipFree and hipHostFree.
- Passing NULL pointer initialized runtime and return hipSuccess
(not an error like before).
- add negative test for this. (hipMemoryAllocate, improved)
- Match NVCC errors for invalid pointers, add to test.
- Update hipFree and hipHostFree docs.
- hipGetDevicePointer always set *devicePointer=NULL, even for
invalid flags.
- Gate shared memory usage on specific HCC work-week.
Change-Id: I533b4fd3280a3d6cdbf05eb768976f0c7506c012
introduce LockedAccessor option so destructor does not unlock.
Allows locks to exist across function boundaries, required
for hipLaunchKernel macro which has several unusual requirements.
(including C comppatibility, must use variadic macro, more).
Note hipHostMalloc (not hipHostAlloc or hipMallocHost).
- the hipHost* is used for all HIP APIs dealing with Host memory.
(including hipHostMalloc, hipHostFree, hipHostRegister,
hipHostUnregister, hipHostGetFlags, hipHostGetDevicePointer).
- hipMallocHost is consistent with "hipMalloc" for allocating device
memory. Enumerations hipHostMalloc* also used as optional
flags parm to hipHostMalloc.
These will print compiler warnings if used, so we can weed them out
before removing.
Also add a default flags args for hipHostAlloc, in the C++ functioin
headers. So you can replace hipMallocHost(&ptr, size( with hipHostAlloc(&ptr, size)
On HIP path property obtaining done through hsa_iterate_agents and counting the devices of HSA_DEVICE_TYPE_GPU type.
P.S.
On multi-boards systems it might be problems with detection what board a GPU plugged into (not tested).
Tracks device where memory is allocated, pinned-host or device, and
more.
Uses memory-range-based lookups - so pointers that exist anywhere in
the range of hostPtr + size will find the associated AmPointerInfo.
The insertions and lookups use a self-balancing binary tree and
should support O(logN) lookup speed.
Device property MaxSharedMemoryPerMultiprocessor set equal to totalGlobalMem (HIP path).
Reason: MaxSharedMemoryPerMultiprocessor should be as the same as group memory size. Group memory will not be paged out, so, the physical memory size = total shared memory size = group region size. NVCC path remains untouched: CUDA's device property MaxSharedMemoryPerMultiprocessor is reported.
hipify is updated as well.
For HCC path concurrentKernels is set to true since all ROCR hardware supports this feature.
For NVCC path concurrentKernels is obtained from CUDA's device property cudaDeviceProp::concurrentKernels.