There were several error messages that appeared even if the hipEnvVarDriver.exe test passes and executes successfully. Now it is cleaned up. The following are those instances:
* When popen searches for directed_test directory but does not find it, it outputs an error, then finds the hipEnvVar at the same level. Currently the fix will prompt the test to only output an error if both searches for hipEnvVar fails.
* When assertion is used towards the later half of the test, conditions were set to specifically hide the devices, resulting in No Hip Device detected in the latter half of the test. The fix will make these errors not appear as they are intended to not find any devices. Assertions themselves are untouched.
HipEnvVarDriver.cpp has also been refactored. Reading HipEnvVar will now happen in a helper function for getDeviceNumber and getDevicePCIBusNumRemote, as the code to read HipEnvVar were really similar in them.
The existing one can have issues on certain systems, therefore this limits use of direct memcpy via largeBAR to sizes where it is unequivocally better.
Also addresses SWDEV-220030 and SWDEV-222237.
This is a quick workaround to match HCC behavior for performance since inlining usually
results in more optimization opportunities therefore better performance.
We will fine tuning inline threashold later.
* Use deque instead of vector for code readers so that the iterators and references will be stable
* Fix compile error
* Assign the iterator
* Add multithreaded test
* Make threads a multiple of hardware concurrency
* Output on failure
* Add setDevice to try and initialize the context on cuda
* Create context for cuda
* Set context on each thread
* Reduce threads on cuda
* Skip test on cuda
* Try to initialize the primary context on cuda
* Push ctx to the stack as current
* Revert "Push ctx to the stack as current"
This reverts commit bff8cbe950.
* Revert "Try to initialize the primary context on cuda"
This reverts commit fd98514113.
* updated test for nvidia path
* Add c++11 option for nvcc
Co-authored-by: satyanveshd <53337087+satyanveshd@users.noreply.github.com>
Some PyTorch unit tests have regression. Disabling cov3 to allow more
time to debug and unblock PyTorch
Change-Id: Iba7f425ef3499c20c42ec45d9152b5d27ce97d03
hipDeviceEnablePeerAccess returns success and adds peer into the list even if it is not accessible which creates problem in hipMalloc when it tries to share the ptr to peer device.
Proposed change is to check the access status before updating the peer list and update only when it can access the peer.
- The known target checking should skip `gfx000` as well as it won't be
used in real compilation command formation. The avoid generating
annoying warning on `gfx000`.