mberenjk
697bee4ee8
Improving build time by removing the gfx11xx and host code from rccl_float8.h ( #1789 )
...
* removing extra build time by removing the gfx11xx arch from using hip_fp8
---------
Co-authored-by: Marzieh Berenjkoub <mberenjk@amd.com>
2025-07-09 14:03:47 -05:00
BertanDogancay
a6bf9bfc9e
Merge remote-tracking branch 'nccl/master' into develop
2025-04-23 20:47:43 -07:00
mberenjk
39483c55f8
Initializing all ranks to the same value to avoid failure of UT AllR… ( #1459 )
...
* Initializing all ranks to the same value to avoid failure of UT AllReduce for FP8 type
Co-authored-by: Marzieh Berenjkoub <mberenjk@amd.com>
2025-01-02 11:39:02 -06:00
mberenjk
db840f024e
adding all nccl apis to api_support to enable rccl tracing by rocprofv3 ( #1297 )
...
* adding all nccl apis to api_support to enable rccl tracing by rocprofv3
Co-authored-by: Marzieh Berenjkoub <mberenjk@amd.com>
Co-authored-by: Jonathan R. Madsen <jonathanrmadsen@gmail.com>
2024-08-22 12:36:07 -05:00
Tim
a4793286c7
Adding User Buffer Registration support for Unit test ( #1199 )
...
* Adding UBR support for UT SendRecv
Signed-off-by: Tim Hu <timhu102@amd.com>
* Update test/common/TestBedChild.cpp
Co-authored-by: corey-derochie-amd <161367113+corey-derochie-amd@users.noreply.github.com>
---------
Signed-off-by: Tim Hu <timhu102@amd.com>
Co-authored-by: corey-derochie-amd <161367113+corey-derochie-amd@users.noreply.github.com>
2024-07-30 13:39:25 -04:00
mberenjk
428837ffe4
replacing rccl_bfloat16 with hip_bfloat16 ( #1126 )
...
Co-authored-by: mberenjk <mberenjk@amd.com>
2024-04-11 11:30:37 -05:00
Andy li
6777e65c1d
Enable fp8 support ( #1101 )
...
* initial checkin
* resolve cr comments
* resolve the build issue
* fix the data correctness issue
* update fp8 header file and update the unit test for fp8 support
* remove fp16 from fp8 headers
* fix ut issue and catch up the latest code from develop
* update according to cr comments
* update ut according to cr comments
* update num floats for each SumPostDiv from 4 to 6
* update fp8 header file name
* fix the typo
2024-03-08 15:17:53 -08:00
Tim
0d06b0f1de
Adding FP16 cases to unit tests ( #1093 )
...
Signed-off-by: Tim Hu <timhu102@amd.com>
2024-02-26 12:08:04 -05:00
gilbertlee-amd
ebb8b5bf63
Updating files for missing licenses ( #637 )
2022-10-14 13:49:16 -06:00
akolliasAMD
06bce9d0c9
added stream synch after hipMemset ( #609 )
2022-08-30 16:18:37 -06:00
Edgar Gabriel
f6e00dec13
introduce support for ncclFloat16/half in UT
2022-08-24 15:28:24 +00:00
gilbertlee-amd
29ad0f5fbe
Unit test refactor ( #500 )
...
Refactoring and consolidating single-process / multi-process unit testing
2022-02-25 08:59:07 -07:00