Atul Kulkarni
9839d1c7c8
Updated tests based on NCCL 2.27.3-1 sync ( #1892 )
2025-09-18 09:56:09 -05:00
Laura Promberger
0f6fec1553
Bump minimum cmake version to 3.16 to enable cmake 4 ( #1909 )
...
Minimum required cmake version of test/CMakeLists.txt is bumped from 2.8
to 3.16. This aligns with the version used in CMakeLists.txt and will
enable building with cmake 4.
2025-09-16 23:10:22 -05:00
Kapil S. Pawar
86a6d06e40
Added new tests for rccl_wrap - rcclOverrideProtocol, rcclOverrideAlgorithm ( #1895 )
...
* Added new unit tests for rccl_wrap
2025-09-15 18:00:26 -05:00
Kapil S. Pawar
f418a4c6d0
Added new tests for rccl_wrap - rcclSetPipelining ( #1890 )
...
* Added tests for rcclSetPipelining
* Added conditions to skip the test
* Updated message size
2025-09-05 09:29:11 -05:00
ycui1984
361d596229
[rocm_regression] Return errors when HSA_NO_SCRATCH_RECLAIM=1 even for rocm>=6.4.0 ( #1867 )
...
* [rocm_regression] Return errors when HSA_NO_SCRATCH_RECLAIM=1 even for rocm >= 6.4.0
* [rocm_regression] Check firmware version
* [rocm_regression] Resolve review comments
* [rocm_regression] Move hsa env checking into init once func
* [rocm_regression] Prevent hot fix version in firmware
* [rocm_regression] Improve unit tests
2025-08-29 11:18:23 -05:00
Kapil S. Pawar
c9becd89cd
Code coverage tests for param.cc ( #1872 )
...
* Added code coverage unit tests for param.cc
* Updated ParamTests.cpp and removed ParamTestsConfFile.txt
* Updated ParamTests.cpp
* Removed NCCL_LOG_INFO and added sample config file
---------
Co-authored-by: Pawar <kpawar@ctr2-alola-ctrl-01.amd.com >
2025-08-27 09:30:37 -05:00
ishkool
c288fbf1b2
Code coverage tests for net_socket.cc ( #1840 )
...
* Code coverage UTs for net_socket.cc
* Addressed review comments
---------
Co-authored-by: Atul Kulkarni <atul.kulkarni@amd.com >
2025-08-27 09:24:21 -05:00
corey-derochie-amd
b88c134874
Changed TestBedChild to avoid hang if the call fails ( #1875 )
...
Changed `TestBedChild` protocol to send the result code before the return value to avoid hanging if the call fails. Switched `TestBedChild::GetUniqueId` to use this.
2025-08-23 00:17:34 -05:00
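The protocol change in the TestBedChild entry above (result code first, return value second) can be sketched as follows. This is only an illustration under stated assumptions: `sendReply` and `recvReply` are hypothetical helper names, the transport is assumed to be a POSIX pipe, and none of this is taken from the actual `TestBedChild` code. The point is that the reader only waits for the payload when the result code says one is coming, so a failed call cannot leave it blocked.

```cpp
#include <unistd.h>   // POSIX read/write
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical sender side: write the result code first, then the payload
// only when the call succeeded (resultCode == 0).
bool sendReply(int fd, std::int32_t resultCode, const void* payload, std::size_t size) {
  if (write(fd, &resultCode, sizeof(resultCode)) != static_cast<ssize_t>(sizeof(resultCode)))
    return false;
  if (resultCode != 0) return true;  // on failure nothing else is sent
  return write(fd, payload, size) == static_cast<ssize_t>(size);
}

// Hypothetical receiver side: read the result code first and read the payload
// only when it indicates success, so a failed call never leaves us waiting.
bool recvReply(int fd, std::int32_t* resultCode, void* payload, std::size_t size) {
  if (read(fd, resultCode, sizeof(*resultCode)) != static_cast<ssize_t>(sizeof(*resultCode)))
    return false;
  if (*resultCode != 0) return true;  // payload was never written; do not block on it
  return read(fd, payload, size) == static_cast<ssize_t>(size);
}
```

With the opposite order (payload first, code second), a receiver would block on a payload that a failed call never produces.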
awelling2801
a1a65c65c4
Added new tests for rccl_wrap - rcclUpdateThreadThreshold ( #1855 )
...
* Added tests for rccl_wrap - rcclUpdateThreadThreshold
* Skipped tests gtest_skip added
* Added tests for new functions rcclSetP2pNetChunkSize and rcclSetPxn
---------
Co-authored-by: Atul Kulkarni <atul.kulkarni@amd.com >
2025-08-21 16:39:53 -05:00
Arm Patinyasakdikul
9d3acffa5f
Test: delete child object to address memory leak. ( #1863 )
2025-08-20 10:15:03 -05:00
ishkool
876f985e0f
Code Coverage: Proxy.cc tests ( #1818 )
...
* Proxy.cc tests
* Update ProxyTest.cpp
Cleaned up the code.
* Update ProxyTests.cpp
Bring back deleting dynamically allocated memory
2025-08-15 19:06:32 -05:00
Atul Kulkarni
84f3cc6a02
Added new unit tests for src/enqueue.cc ( #1853 )
2025-08-15 18:26:26 -05:00
ishkool
6453273aa6
Code Coverage Unit Tests for comm.h ( #1783 )
...
* File containing test for comm.h
* Update CommTest.cpp
Added gtest API for assert
* Update CommTest.cpp
Adding copyright
* Update CommTest.cpp
Removing info and tested as not required.
* Update and rename CommTest.cpp to CommTests.cpp
* Update CMakeLists.txt
2025-08-15 17:44:24 -05:00
Rahul Vaidya
ee9ed3ef87
[BUILD] Fix UT packaging on Debian family OS ( #1854 )
...
* Fix UT packaging on Debian family OSes
Signed-off-by: ravaidya <ravaidya@amd.com >
* Split OR condition when performing Debian checks
Signed-off-by: ravaidya <ravaidya@amd.com >
---------
Signed-off-by: ravaidya <ravaidya@amd.com >
2025-08-11 17:03:16 -05:00
Nilesh M Negi
5036d0e713
[BUILD] Fix UT packaging on Debian OS ( #1848 )
2025-08-11 09:43:26 -05:00
Rahul Vaidya
cbbc713b03
Fix rccl-UnitTests packaging on Debian systems ( #1846 )
...
Signed-off-by: ravaidya <ravaidya@amd.com >
2025-08-08 12:28:56 -05:00
awelling2801
82bea39280
Created coverage tests for rccl_wrap ( #1694 )
...
* Created coverage tests for rccl_wrap
RCCL_EXPOSE_STATIC off by default
Coverage tests for rccl_wrap.cc
* Remove RCCL_EXPOSE_STATIC dependency
* Removed Rcclwrap.RcclGetAlgoInfoTest
* Remove comments
* Corrected RCCL_EXPOSE_STATIC definition logic
---------
Co-authored-by: Welling <awelling@ctr2-alola-login-01.amd.com >
Co-authored-by: Atul Kulkarni <atul.kulkarni@amd.com >
2025-08-06 14:48:00 -05:00
Atul Kulkarni
0e7d7da55d
Add unit tests for graph/xml.cc & graph/xml.h ( #1833 )
...
* Added new binary for executing unit tests
Added new unit tests for argcheck.cc and alt_rsmi.cc files
Modified the method to execute unit tests to cover static methods
by using a bash script to convert static to non-static functions
and variables on the fly restricted to debug build type.
* Added new unit tests for src/transport/shm.cc
* Added new unit tests for graph/xml.cc
2025-08-01 14:20:27 -05:00
awelling2801
5ecc1b7ede
Added tests for coll_reg ( #1700 )
...
Changes to coll_reg
Co-authored-by: Welling <awelling@ctr2-alola-login-01.amd.com >
2025-07-31 13:49:23 -05:00
awelling2801
7320752bf3
Added tests for transport.cc ( #1725 )
...
Co-authored-by: Welling <awelling@ctr2-alola-login-01.amd.com >
2025-07-31 11:04:28 -05:00
Rahul Vaidya
0adc5edc74
Fix RHEL10 packaging for rcclras and rccl-UnitTests ( #1831 )
...
Signed-off-by: ravaidya <ravaidya@amd.com >
2025-07-31 11:00:49 -05:00
ycui1984
874cd657ef
Add collective latency profiler ( #1785 )
...
* [LatencyProfiler] Initial commit
* [LatencyProfiler] Add unit tests
* [LatencyProfiler] add more
* [LatencyProfiler] Pass unit tests
* [LatencyProfiler] Add hooks to integrate with meta internal tools
* [LatencyProfiler] Restore install.sh
* [LatencyProfiler] Resolved comments 1. add proper license 2. use proper namespace
* [LatencyProfiler] Add header
2025-07-30 14:59:28 -07:00
awelling2801
9843adaab2
Added tests for Ipcsocket ( #1690 )
...
Co-authored-by: Welling <awelling@ctr2-alola-ctrl-01.amd.com >
2025-07-29 10:03:28 -05:00
awelling2801
e118aadc14
Code coverage improvements for alloc.h ( #1676 )
...
* Added tests for alloc.h
* Added tests for ZeroElementCopy and MemcpyNullSrcOrDstPointer
---------
Co-authored-by: Welling <awelling@ctr2-alola-ctrl-01.amd.com >
2025-07-29 09:19:57 -05:00
peizhang56
fe182d6546
Add Unit Test for bitops.h ( #1821 )
...
* Add Unit Test for bitops.h
* Change the style
* Fix the code review comments
* Add more test cases
2025-07-28 11:25:15 -05:00
Atul Kulkarni
81ec6bff4c
Added new unit tests for src/transport/p2p.cc ( #1774 )
2025-07-25 12:57:57 -05:00
Atul Kulkarni
1c3d1b3842
Added new unit tests for src/transport/shm.cc ( #1689 )
2025-07-25 05:54:42 -05:00
Atul Kulkarni
275fdd43c1
Code coverage improvements ( #1665 )
...
* Increased max stack size to 640
* Added new binary for executing unit tests
Added new unit tests for argcheck.cc and alt_rsmi.cc files
Modified the method to execute unit tests to cover static methods
by using a bash script to convert static to non-static functions
and variables on the fly restricted to debug build type.
2025-07-17 11:20:49 -05:00
Nilesh M Negi
6b4ad0fd74
[BUILD] Use fmt-header instead of libfmt ( #1791 )
2025-07-10 17:19:53 -05:00
mberenjk
697bee4ee8
Improving build time by removing the gfx11xx and host code from rccl_float8.h ( #1789 )
...
* removing extra build time by removing the gfx11xx arch from using hip_fp8
---------
Co-authored-by: Marzieh Berenjkoub <mberenjk@amd.com >
2025-07-09 14:03:47 -05:00
Rakesh Roy
dd3b1d816c
Fix chrono build error ( #1790 )
2025-07-04 08:27:30 -05:00
Dingming Wu
020dcf0a7c
Add proxyTrace ( #1732 )
...
This feature tracks the proxy events and status of each send/recv op. ProxyTrace keeps a fixed number of active ops in host mem and dumps the status of each op when the program crashes or hangs.
2025-06-25 23:01:34 -05:00
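The idea in the proxyTrace entry above (a fixed number of op records kept in host memory, dumped on demand) can be sketched with a small ring buffer. This is a sketch under assumptions, not the ProxyTrace implementation: `OpTraceRing`, `OpRecord`, and `OpStatus` are hypothetical names, and overwriting the oldest record when the buffer is full is an assumed eviction policy that the commit message does not specify.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>

// Hypothetical status values for a send/recv op.
enum class OpStatus { Posted, InProgress, Done };

// Hypothetical record for one traced op.
struct OpRecord {
  std::uint64_t opId;
  bool isSend;
  OpStatus status;
};

inline const char* statusName(OpStatus s) {
  switch (s) {
    case OpStatus::Posted:     return "posted";
    case OpStatus::InProgress: return "in-progress";
    case OpStatus::Done:       return "done";
  }
  return "unknown";
}

// Fixed-capacity trace: keeps the N most recent records in host memory.
template <std::size_t N>
class OpTraceRing {
 public:
  // Store a record, overwriting the oldest one once the buffer is full.
  void record(const OpRecord& r) {
    slots_[next_ % N] = r;
    ++next_;
  }

  std::size_t size() const { return next_ < N ? next_ : N; }

  // Render the retained records, oldest first, one per line.
  std::string dump() const {
    std::ostringstream out;
    const std::size_t first = next_ < N ? 0 : next_ - N;
    for (std::size_t i = first; i < next_; ++i) {
      const OpRecord& r = slots_[i % N];
      out << "op " << r.opId << (r.isSend ? " send " : " recv ")
          << statusName(r.status) << "\n";
    }
    return out.str();
  }

 private:
  std::array<OpRecord, N> slots_{};
  std::size_t next_ = 0;  // total number of records ever stored
};
```

A crash or hang handler could call `dump()` and write the result to a log. Because the storage is a fixed-size array, memory use stays bounded no matter how many ops are traced.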
BertanDogancay
aaf023976a
Merge remote-tracking branch 'nccl/master' into develop
2025-06-20 07:54:49 -05:00
Tim
ba97c9c18b
replayer update v0 ( #1733 )
...
* First version of new replayer, with comments on future TODOs
* plus minor fixes for UT
* Updated the recorder format, especially the binary format, to match the replayer's needs
2025-06-13 15:05:34 -04:00
Arm Patinyasakdikul
6c37ae9470
Added missing copyright message. ( #1742 )
...
* Added missing copyright message.
* addressed comments.
2025-06-12 09:58:01 -05:00
Atul Kulkarni
682ed36fe6
Added new ENABLE_CODE_COVERAGE option. ( #1664 )
...
Modified install.sh script to add this new option
2025-06-10 12:12:36 -05:00
vstojilj
2ac44cfe4e
SWDEV-536040 - Include <thread> header ( #1724 )
2025-06-06 10:28:11 -06:00
Arm Patinyasakdikul
c07445d5b4
Test: bump max stacksize once again to match current expectation.
2025-05-23 11:18:25 -05:00
Arm Patinyasakdikul
523e0893e4
Test: Change max stack size to 520 to accommodate new ROCm changes.
2025-05-21 20:21:27 -05:00
corey-derochie-amd
170acf3bda
Switched to using the hip_fp8 header instead of rccl_float8, resolving compatibility issues. ( #1546 )
...
* Revert "Revert "replacing rccl_float8 with hip_fp8 and address compatibility …"
This reverts commit 824b81c034.
* [UT] Modify max stack size to 496
* adding a check for OCP type and replacing ROCM_VERSION with HIP_VERSION
* addressing the ci failure
* Adding the device tag
---------
Co-authored-by: Marzieh Berenjkoub <mberenjk@amd.com >
2025-05-14 15:33:03 -05:00
mberenjk
e70003736e
Write JSON file to /tmp directory to avoid incorrect write access in recorderTest ( #1680 )
...
Co-authored-by: Marzieh Berenjkoub <mberenjk@amd.com >
2025-05-07 13:58:27 -05:00
Siu Chi Chan
9525c5b2ef
rccl-UnitTests - link to dl library ( #1673 )
2025-05-02 21:20:22 -05:00
deeksha-amd
2486838465
Added new tests for improving the code coverage ( #1656 )
...
Signed-off-by: Deeksha Goplani <deeksha.goplani@amd.com >
2025-04-30 18:01:11 -05:00
BertanDogancay
a6bf9bfc9e
Merge remote-tracking branch 'nccl/master' into develop
2025-04-23 20:47:43 -07:00
gilbertlee-amd
ee85a70bb4
Adding UT_DEBUG_PAUSE to unit tests ( #1653 )
2025-04-21 21:15:07 -06:00
Tim
9a55ff60a9
RCCL Replayer update ( #1603 )
...
RCCL recorder w/ suggested change and UT
2025-04-19 00:21:27 -04:00
AbandiGa
7a84c5dbb0
added copyright ( #1635 )
2025-04-14 09:46:18 -05:00
BertanDogancay
0b2062c560
Merge remote-tracking branch 'nccl/master' into develop
2025-03-27 12:53:04 -05:00
Nilesh M Negi
d6b987a53f
[UT] Increase stack size for StandaloneTests to 480 ( #1616 )
...
Signed-off-by: nileshnegi <Nilesh.Negi@amd.com >
2025-03-21 21:33:32 -05:00
gilbertlee-amd
626dc50ab5
Removing the experimental clique kernel files ( #1610 )
2025-03-20 18:10:01 -06:00