1. Simply enable tests on NV
Some need minor fixes
performance/compute/hipPerfDotProduct.cpp
performance/dispatch/hipPerfDispatchSpeed.cpp
performance/memory/hipPerfBufferCopyRectSpeed.cpp
performance/memory/hipPerfBufferCopySpeed.cpp
performance/memory/hipPerfDevMemReadSpeed.cpp
performance/memory/hipPerfDevMemWriteSpeed.cpp
performance/memory/hipPerfMemcpy.cpp
performance/memory/hipPerfMemset.cpp
performance/memory/hipPerfSharedMemReadSpeed.cpp
performance/stream/hipPerfDeviceConcurrency.cpp
performance/stream/hipPerfStreamCreateCopyDestroy.cpp
2. Enable and fix on NV
performance/compute/hipPerfMandelbrot.cpp
Root cause: coordIdx is random
Solution: Initialize coordIdx correctly
performance/memory/hipPerfMemFill.cpp
Root cause: HIP ext APIs are called.
Solution: Exclude the cases that involve HIP ext APIs.
performance/memory/hipPerfMemMallocCpyFree.cpp
Root cause: Test allocates more device memory than the GPU has.
Solution: Allocate device memory according to GPU capacity.
tests/performance/memory/hipPerfSampleRate.cpp
Root cause: CUDA has no += operators for float2 and float4.
Solution: Provide the operators.
performance/stream/hipPerfStreamConcurrency.cpp
Root cause: float4 format doesn't match CUDA, and the
operators are missing in the CUDA library.
Solution: Use the (x, y, z, w) format.
Add the necessary float4 operators for CUDA.
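The missing-operator fix above can be sketched as follows. This is a minimal host-only illustration, not the code from the commit: `Float4` is a hypothetical stand-in for the real `float4` type so that the sketch compiles without GPU headers, and in an actual test the overloads would presumably also carry `__host__ __device__` qualifiers.

```cpp
#include <cassert>

// Hypothetical stand-in for the CUDA/HIP float4 type; it uses the same
// (x, y, z, w) field names so the operators read like the real thing.
struct Float4 {
    float x, y, z, w;
};

// Component-wise compound addition, the kind of overload the commit
// says is missing from the CUDA headers.
inline Float4& operator+=(Float4& lhs, const Float4& rhs) {
    lhs.x += rhs.x;
    lhs.y += rhs.y;
    lhs.z += rhs.z;
    lhs.w += rhs.w;
    return lhs;
}

// Binary addition expressed in terms of +=, so the two stay consistent.
inline Float4 operator+(Float4 lhs, const Float4& rhs) {
    lhs += rhs;
    return lhs;
}
```

Defining the operators only when building for NV (for example behind a platform macro) would avoid clashing with the overloads that HIP already provides on AMD; which guard the tests use is not stated in this message.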
Change-Id: I5add29ebabcfb21fb3ef89d09004c5d13423a291
[ROCm/hip commit: 9035ae3154]
* SWDEV-266829 - Enable more tests on AMD and NV devices
1. Enable tests on AMD and NV devices
tests/src/runtimeApi/event/hipEventMultiThreaded.cpp
Reduce the loop count and threads per core so that the test
finishes in a shorter time.
tests/src/runtimeApi/stream/hipStreamCreateWithPriority.cpp
Fix logic error in how priority_normal is obtained
2. Simply enable test on AMD device
tests/src/runtimeApi/memory/hipManagedKeyword.cpp
tests/src/runtimeApi/module/hipManagedKeyword.cpp
tests/src/runtimeApi/stream/hipStreamACb_MultiThread.cpp
tests/src/runtimeApi/memory/p2p_copy_coherency.cpp
3. Simply enable test on NV device
tests/src/runtimeApi/module/hipModuleLoadDataMultThreaded.cpp
4. Fix typo
tests/src/runtimeApi/stream/hipStreamAddCallbackCatch.cpp
5. Remove useless tests
tests/src/hipC.c
tests/src/hipHcc.cpp
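For the priority_normal fix in hipStreamCreateWithPriority.cpp above, one plausible way to derive a "normal" priority is to take the midpoint of the range reported by hipDeviceGetStreamPriorityRange(). The sketch below is an assumption about the shape of the fix, not the committed code; `normalPriority` is a hypothetical helper, and it relies only on the documented convention that numerically lower values mean higher priority.

```cpp
#include <cassert>

// Hypothetical helper: given the two values returned by
// hipDeviceGetStreamPriorityRange(&leastPriority, &greatestPriority),
// pick a priority halfway between them. greatestPriority is numerically
// the smaller value, so the result always stays inside
// [greatestPriority, leastPriority].
inline int normalPriority(int leastPriority, int greatestPriority) {
    return greatestPriority + (leastPriority - greatestPriority) / 2;
}
```

Computing the value from the queried range, rather than hard-coding 0, keeps the test valid on devices whose range does not contain 0 as its midpoint.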
Change-Id: Ia4406353e64d69bd34c58ebb56185701f7ce1caa
* Remove tests/src/runtimeApi/module/hipModuleLoadDataMultThreaded.cpp for cuda test
Co-authored-by: anusha GodavarthySurya <Anusha.GodavarthySurya@amd.com>
Co-authored-by: Jenkins <jenkins-compute@amd.com>
[ROCm/hip commit: 3fd16c0b5b]
Fix the following failing tests on NV:
hipCGMultiGridGroupType
hipCGMultiGridGroupTypeViaBaseType
hipCGMultiGridGroupTypeViaPublicApi
1. Fix wrong logic in kernel for both AMD and NV.
2. Remove unnecessary hipDeviceSynchronize().
3. In hipCGMultiGridGroupTypeViaBaseType.cpp, change
multi_grid_group to thread_group, as originally intended.
4. Replace hipFree(syncResultD) with hipHostFree(syncResultD).
5. Optimize some host code.
Change-Id: I3fe6dac35a7b14bab12adf397b7885df83d28059
[ROCm/hip commit: c57e0f8fe5]
Enable cooperativeGrps/cooperative_streams on NV.
Add test cases of the least/half/full capacity.
Verify data according to the device type (AMD/NV).
Optimize code.
Change-Id: I3fe6dbc35b7b24abb11adf297b7885df83d28154
[ROCm/hip commit: 67b3681d26]
Add a new extension to HIP for querying the coherency mode.
The new enum hipMemRangeAttributeCoherencyMode can be used in
hipMemRangeGetAttribute(s), which will return one of the following
values:
hipMemRangeCoherencyModeFineGrain, hipMemRangeCoherencyModeCoarseGrain,
hipMemRangeCoherencyModeIndeterminate
Change-Id: I8717873c254888ea69facc1178d3682e8747c3a7
[ROCm/hip commit: 1f53fbea8f]
Migrated malloc-related files under the memory folder to the Catch2 framework
Change-Id: I5aa07fc8148bdf6bef135947091aaf1d3c54663b
[ROCm/hip commit: 05e230f5c1]
Add test cases for filter modes: hipFilterModePoint and hipFilterModeLinear
Change-Id: I3fe6dbc35a7b14aab12adf297b7885df83d28056
[ROCm/hip commit: 48d8040b06]
Change return type from int to unsigned int.
It is correctly documented in hip-math-api.md but not at the place where this PR is updating the documentation.
[ROCm/hip commit: 7eaa45d2d5]
Migrated all hipMemcpy-related APIs to the Catch2 framework, optimizing
the code and moving the stress-related tests to the stress folder.
Change-Id: Id47669b49304c35d1a68fabdaaf3f6e3ab0428a5
[ROCm/hip commit: 346a77b4c0]
1. In kernel/hipDynamicShared
Fix shared memory size and type mismatch in host and kernel.
2. In kernel/hipDynamicShared2
CUDA kernels relying on shared memory allocations over 48 KB require
the size to be set explicitly using hipFuncSetAttribute().
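The 48 KB rule in item 2 can be sketched as a small host-side check. The names `kDefaultDynamicSharedBytes` and `needsSharedMemOptIn` are hypothetical and only illustrate the decision; the actual opt-in call needs the HIP runtime and is therefore shown in a comment rather than executed.

```cpp
#include <cassert>
#include <cstddef>

// Dynamic shared memory a CUDA kernel may use without opting in (48 KB).
constexpr std::size_t kDefaultDynamicSharedBytes = 48 * 1024;

// Hypothetical helper: true when the requested dynamic shared memory size
// exceeds the default limit and the kernel must opt in before launch.
inline bool needsSharedMemOptIn(std::size_t requestedBytes) {
    return requestedBytes > kDefaultDynamicSharedBytes;
}

// In a HIP test the opt-in would look roughly like this (not compiled here):
//
//   if (needsSharedMemOptIn(bytes)) {
//       hipFuncSetAttribute(reinterpret_cast<const void*>(kernel),
//                           hipFuncAttributeMaxDynamicSharedMemorySize,
//                           static_cast<int>(bytes));
//   }
```

Whether a given device can actually provide the requested size still depends on its per-block shared memory limit, which the test would have to query separately.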
Change-Id: I4248b6cebd3dc156f9d5d427e1897da22fb964ed
[ROCm/hip commit: 5b739b0373]
Make hipIpcOpenEventHandle behave the same as cudaIpcOpenEventHandle.
Add API usages.
Change-Id: I4248b2cebd3de156f9d5d427e1797da22fb964eb
[ROCm/hip commit: c053d60282]
1. Fix hipModuleNegative failure on all NV GPUs
a. Add a signal handler for signals sent by CUDA functions.
b. Make hipModuleGetGlobal match cuModuleGetGlobal behaviour.
That is, if one of the first two parameters is nullptr, ignore it.
2. Fix hipModuleLoadDataMultThreaded failure on NV RTX5000
Improve the lambda function.
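The nullptr handling described in item 1b can be sketched with a host-only model. Everything here is illustrative: `GlobalInfo` and `reportGlobal` are hypothetical names, not part of HIP, and the sketch only shows the "ignore whichever output pointer is null" pattern rather than the runtime's real lookup.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical record of what a module lookup found for a global symbol.
struct GlobalInfo {
    void* address;
    std::size_t bytes;
};

// Hypothetical helper modelling the output handling: each output pointer is
// optional, and a null one is skipped instead of being treated as an error.
inline void reportGlobal(const GlobalInfo& info, void** dptr, std::size_t* bytes) {
    if (dptr != nullptr) {
        *dptr = info.address;
    }
    if (bytes != nullptr) {
        *bytes = info.bytes;
    }
}
```

A caller interested only in the size can then pass nullptr for the address output, and vice versa.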
Change-Id: I3fe6dbc35a7a14aa9119df197b7885df83d28047
[ROCm/hip commit: ae30c5cd6b]