rocm-systems

Author	SHA1	Message	Date
Gilbert Lee	f1a9ce3fa5	Using GTEST_SKIP() to skip unit tests that have insufficient devices. Skipping out earlier	2021-02-09 03:54:04 +00:00
Stanley Tsang	d00b7d17bd	Update MP UT to support arbitrary # of GPUs; multiple bugfixes (#16) * Fixing temp file creation/deletion for Clique kernel mode. * Refactoring of MP unit tests; include bugfixes and general support for any number of GPUs * GroupCall MP UT properly quits when too many devices specified * MP UT will programmatically set NCCL_COMM_ID if not specified; updated install script	2021-02-05 16:49:25 -08:00
Wenkai Du	ab1e7a0318	Merge remote-tracking branch 'origin/develop' into 2.8.3	2021-02-04 20:02:34 -05:00
Gilbert Lee	01a998b17c	Removing in-place tests from Combined calls (no support for send/recv)	2021-01-28 20:09:03 +00:00
gilbertlee-amd	3e62ceddc5	Clique kernel support (#295) (#15) * Adding experimental clique-based kernels (opt-in only) Co-authored-by: Stanley Tsang <stanley.tsang@amd.com> Co-authored-by: Gilbert Lee <gilbert.lee@amd.com> Co-authored-by: Wenkai Du <43822138+wenkaidu@users.noreply.github.com> Co-authored-by: Stanley Tsang <stanley.tsang@amd.com> Co-authored-by: Wenkai Du <43822138+wenkaidu@users.noreply.github.com>	2021-01-28 09:45:01 -07:00
Stanley Tsang	d3fa257682	Adding multiprocess unit tests (#312) Adding multiprocess unit tests for collectives. To run, NCCL_COMM_ID=$HOSTNAME:12345 build/release/test/UnitTestsMultiProcess	2021-01-15 16:34:36 -07:00
Wenkai Du	b33a2cac8b	gtest: add scatter to combined calls and use loops (#303) * gtest: add scatter to combined calls and use loops * gtest: run validation inside loop * gtest: revert small element count to 2520 * gtest: fix memory leak in validation (cherry picked from commit `b0853ccd51`) * Fix combined call UT * Fix memory leak * Fix alltoallv test	2021-01-14 19:28:01 -05:00
Wenkai Du	b0853ccd51	gtest: add scatter to combined calls and use loops (#303) * gtest: add scatter to combined calls and use loops * gtest: run validation inside loop * gtest: revert small element count to 2520 * gtest: fix memory leak in validation	2020-11-13 17:57:44 -08:00
gilbertlee-amd	41bcfb8878	Clique kernel support (#295) * Adding experimental clique-based kernels (opt-in only) Co-authored-by: Stanley Tsang <stanley.tsang@amd.com> Co-authored-by: Gilbert Lee <gilbert.lee@amd.com> Co-authored-by: Wenkai Du <43822138+wenkaidu@users.noreply.github.com>	2020-11-10 15:44:10 -07:00
Wenkai Du	b871ea3c0c	Add Alltoallv RCCL kernel implementation (#269) * Add alltoallv API and implementation * Extend Rome P2P channel limit to multinode and alltoall kernels * topo_expl: fix compilation and sync up with main * gtest: use RCCL alltoallv API * Code review changes	2020-09-30 16:25:36 -07:00
Wenkai Du	60819dcf8d	Merge pull request #262 from wenkaidu/alignment Make data alignment requirements matching ISA manual	2020-09-08 10:40:42 -07:00
Aaron Enye Shi	958b213428	Add RCCL Static Lib Creation with -fgpu-rdc RCCL uses -fgpu-rdc to compile its source objects. When linking the RCCL static library, the link and archive step must do through hipcc and uses the flag --emit-static-lib. When compiling UnitTests, the librccl.a must be consumed through -l and -L.	2020-09-03 11:25:41 -04:00
Wenkai Du	b163a8898f	gtest: add alltoallv test	2020-09-02 21:28:32 +00:00
Wenkai Du	7e3f841fab	Merge remote-tracking branch 'nccl/master' into 2.7.8	2020-08-10 16:11:00 +00:00
saadrahim	0dc019e35f	Download GTest if not found in system (#237) Co-authored-by: Stanley Tsang <stanley.tsang@amd.com>	2020-08-06 09:36:58 -06:00
Stanley Tsang	684f3e6af4	Adding better naming to unit tests for filtering; adding short and full unit test suites (#235)	2020-07-21 12:19:47 -06:00
saadrahim	99a491273f	Changing GTest inclusion in cmake to use find_package (#234) * GTest is used via find_package. No longer downloaded in cmake. * Adding error handling	2020-07-15 20:51:48 -06:00
gilbertlee-amd	f87ba17737	Removing UnitTest as install, removing unused env var (#231)	2020-07-10 09:30:28 -06:00
Wenkai Du	8db0aa8f4c	gtest: extend testing up to 8 GPUs	2020-06-29 09:32:31 -07:00
Wenkai Du	fee1a20b74	gtest: add scatter, gather and all to all unit tests	2020-06-09 17:44:15 -07:00
Stanley Tsang	20fa04d9b6	Updating copyright notices for 2020.	2020-01-29 15:28:08 -08:00
Wenkai Du	fe6d012eb0	Merge remote-tracking branch 'remotes/rccl/master' into rccl_2.5.6_cleanup	2020-01-29 15:28:03 -08:00
Wenkai Du	1e55645d97	Misc fixes and improvements for 2.5.6 1. Fix RCCL unit test 2. Add ROME detection and tuning 3. Change default P2P level 4. Fix search algorithm for XGMI 5. Remove explicit channel duplication with implicit by using half of link speed 6. Add collective trace support 7. Correct Intel Skylake CPU detection and bandwidth 8. Fix topo connect function 9. Disable GDR read and remove unreachable code 10. Disable LL128 kernels 11. Add tuning parameters 12. Use original clock64() implementation which returns RTC counter value 13. Print out timestamp of collective trace 14. Do not use struct ncclColl in kernel launch parameter 15. Fix abort handling and add tracing 17. Add __launch_bounds__ to kernel functions 18. Remove unused abortCount 19. Unset default MIN_NRINGS and MIN_NCHANNELS 20. Do not allocate shared memory when not using LL128 kernels 21. Correct time print out in tuning log	2020-01-29 15:27:05 -08:00
gilbertlee-amd	000bce6f27	Removing OpenMP from unit tests (#163)	2019-12-20 11:41:56 -07:00
Wenkai Du	4ca05c1297	Support bfloat16 on rest of the unit tests	2019-11-18 14:18:34 -08:00
Wenkai Du	bdac0256a5	Add bfloat16 all reduce unit test	2019-11-18 13:50:29 -08:00
Akila Premachandra	f48ae5c98d	Added hip-clang options to install script, and openmp/pthread options to CMakeLists.txt	2019-08-23 22:02:42 +00:00
Wenkai Du	f11c8f60cd	RCCL 2.4 update	2019-08-14 10:42:35 -07:00
Sylvain Jeaugey	f93fe9bfd9	2.3.5-5 Add support for inter-node communication using sockets and InfiniBand/RoCE. Improve latency. Add support for aggregation. Improve LL/regular tuning. Remove tests as those are now at github.com/nvidia/nccl-tests.	2018-09-25 14:12:01 -07:00
sclarkson	680a35c6b7	fix tests on maxwell	2017-11-11 19:22:06 -08:00
Sylvain Jeaugey	1093821c33	Replace min BW by average BW in tests	2016-12-01 15:16:35 -08:00
Sylvain Jeaugey	ca330b110a	Add scan tests	2016-09-22 11:58:33 -07:00
Sylvain Jeaugey	6c77476cc1	Make tests check for deltas and report bandwidth	2016-09-22 11:58:28 -07:00
Sylvain Jeaugey	75bad643bd	Updated LICENCE.txt	2016-08-26 15:08:20 -07:00
Nathan Luehr	55c42ad681	Fixed redundant contexts in multi-process apps Change-Id: If787014450fd281304f0c7baf01d25963e40905d	2016-07-25 10:10:30 -07:00
Sylvain Jeaugey	bd3cf73e6e	Changed CURAND generator to work on a wider set of platforms.	2016-06-06 14:34:03 -07:00
Nathan Luehr	03df4c7759	Moved no-as-needed flag to link rule. Avoids link errors for tests linked with nvcc.	2016-04-19 14:51:03 -07:00
Sylvain Jeaugey	9de361a1b9	Fix MPI test usage Only display usage from rank 0 and exit instead of continuing (and seg fault).	2016-04-19 10:43:38 -07:00
Nathan Luehr	2758353380	Added NCCL error checking to tests. Also cleaned up makefile so that tests and lib are not built unnecessarily. Change-Id: Ia0c596cc2213628de2f066be97615c09bb1bb262 Reviewed-on: http://git-master/r/999627 Reviewed-by: Przemek Tredak <ptredak@nvidia.com> Tested-by: Przemek Tredak <ptredak@nvidia.com>	2016-01-29 11:09:05 -08:00
Sylvain Jeaugey	c05312f151	Moved tests to separate dir and improved MPI test test sources moved to test/ directory. MPI test displays PASS/FAIL and returns code accordingly. Change-Id: I058ebd1bd5202d8f38cc9787898b2480100c102b Reviewed-on: http://git-master/r/936086 Reviewed-by: Przemek Tredak <ptredak@nvidia.com> Tested-by: Przemek Tredak <ptredak@nvidia.com>	2016-01-28 12:56:36 -08:00

40 Commits