Yiltan
1347d5d628
[GDA] Alltoall optimization - single warp ( #319 )
...
* Remove testing of data types
As the collective is templated, we are just testing if sizeof(T) works
* Added single-threaded variants
* Applied thread puts optimization to barrier
* Apply single threaded optimization to alltoall
* This optimization only works on bnxt, so place a switch to protect it
* Handle the edge case where the thread count is smaller than the number of PEs
2025-11-19 14:25:29 -05:00
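The edge case named in the commit message (fewer threads than PEs) is commonly handled with a strided loop, so that each thread services several destination PEs instead of exactly one. The host-side sketch below illustrates only that general pattern. It is not taken from the commit: `pes_for_thread` and `coverage` are hypothetical helper names, and the actual device code may distribute the work differently.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper (an assumption, not the commit's code): lists the
// destination PEs that thread `tid` would service when `num_threads`
// threads share `num_pes` PEs.
std::vector<int> pes_for_thread(int tid, int num_threads, int num_pes) {
  assert(num_threads > 0 && tid >= 0 && tid < num_threads);
  std::vector<int> pes;
  // Strided loop: thread tid takes PEs tid, tid + num_threads, and so on.
  // With num_threads < num_pes a thread gets more than one PE; with
  // num_threads >= num_pes it gets at most one, and some threads get none.
  for (int pe = tid; pe < num_pes; pe += num_threads) {
    pes.push_back(pe);
  }
  return pes;
}

// Hypothetical check: counts how many times each PE is visited when all
// threads run the strided loop above.
std::vector<int> coverage(int num_threads, int num_pes) {
  std::vector<int> hits(num_pes, 0);
  for (int tid = 0; tid < num_threads; ++tid) {
    for (int pe : pes_for_thread(tid, num_threads, num_pes)) {
      ++hits[pe];
    }
  }
  return hits;
}
```

Under these assumptions every PE is visited exactly once whether the thread count is smaller or larger than the PE count, which is the property the edge-case handling needs to preserve.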