
Commit d4afa07

docs(distributed/collective): add alltoall_single
1 parent d334bd9 commit d4afa07

File tree

2 files changed (+30, -4 lines)


docs/api/paddle/distributed/Overview_cn.rst

Lines changed: 5 additions & 4 deletions
@@ -86,12 +86,13 @@ paddle.distributed.fleet is the unified entry API for distributed training, used to configure…
 " :ref:`all_gather <cn_api_distributed_all_gather>` ", "Group gather: gathers the tensors from all processes in the group and broadcasts the result to every process"
 " :ref:`all_gather_object <cn_api_distributed_all_gather_object>` ", "Group gather: gathers the objects from all processes in the group and broadcasts the result to every process"
 " :ref:`alltoall <cn_api_distributed_alltoall>` ", "Scatters a list of tensors to every process and gathers the received tensors"
+" :ref:`alltoall_single <cn_api_distributed_alltoall_single>` ", "Scatters a single tensor to every process and gathers the received data into the target tensor"
 " :ref:`broadcast <cn_api_distributed_broadcast>` ", "Broadcasts a tensor to every process"
 " :ref:`scatter <cn_api_distributed_scatter>` ", "Scatters tensors to every process"
 " :ref:`split <cn_api_distributed_split>` ", "Splits parameters across multiple devices"
 " :ref:`barrier <cn_api_distributed_barrier>` ", "Synchronization barrier: a blocking operation that synchronizes all processes in the group"
-" :ref:`send <cn_api_distributed_send>` ", "Sends a tensor to the specified receiver"
-" :ref:`recv <cn_api_distributed_recv>` ", "Receives a tensor from the specified sender"
-" :ref:`isend <cn_api_distributed_isend>` ", "Asynchronously sends a tensor to the specified receiver"
-" :ref:`irecv <cn_api_distributed_irecv>` ", "Asynchronously receives a tensor from the specified sender"
+" :ref:`send <cn_api_distributed_send>` ", "Sends a tensor to the specified process"
+" :ref:`recv <cn_api_distributed_recv>` ", "Receives a tensor from the specified process"
+" :ref:`isend <cn_api_paddle_distributed_isend>` ", "Asynchronously sends a tensor to the specified process"
+" :ref:`irecv <cn_api_paddle_distributed_irecv>` ", "Asynchronously receives a tensor from the specified process"
 " :ref:`reduce_scatter <cn_api_paddle_distributed_reduce_scatter>` ", "Reduces, then scatters the list of tensors to all processes in the group"
Lines changed: 25 additions & 0 deletions
@@ -0,0 +1,25 @@
+.. _cn_api_distributed_alltoall_single:
+
+alltoall_single
+-------------------------------
+
+
+.. py:function:: alltoall_single(in_tensor, out_tensor, in_split_sizes=None, out_split_sizes=None, group=None, use_calc_stream=True)
+
+Scatters the input tensor to all processes and gathers the received tensors into out_tensor.
+
+Parameters
+:::::::::
+- in_tensor (Tensor): The input tensor. Its data type must be float16, float32, float64, int32, int64, int8, uint8 or bool.
+- out_tensor (Tensor): The output tensor. Its data type must match that of the input tensor.
+- in_split_sizes (list[int], optional): Split sizes along dim[0] of in_tensor. If not specified, in_tensor is split evenly across the processes in the group (the size of in_tensor must be divisible by the number of processes in the group). Default: None.
+- out_split_sizes (list[int], optional): Split sizes along dim[0] of out_tensor. If not specified, out_tensor gathers the data from each process evenly (the size of out_tensor must be divisible by the number of processes in the group). Default: None.
+- use_calc_stream (bool, optional): Whether to use the calculation stream (True) or the communication stream (False). Default: True.
+
+Returns
+:::::::::
+None if use_calc_stream is True; a Task if use_calc_stream is False.
+
+Code Examples
+:::::::::
+COPY-FROM: paddle.distributed.alltoall_single
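The COPY-FROM directive above pulls the runnable example from the Python docstring of paddle.distributed.alltoall_single. As a hedged sketch only (not part of the commit), using the documented signature and assuming a two-process job, usage might look like this:

# Sketch: even all-to-all exchange on two processes, launched e.g. with
# `python -m paddle.distributed.launch --gpus=0,1 demo.py` (illustrative name).
import paddle
import paddle.distributed as dist

dist.init_parallel_env()
rank = dist.get_rank()
size = dist.get_world_size()          # assumed to be 2 here

# Each rank contributes a (size, 2) tensor; with no split sizes given,
# dim[0] is split evenly, so row i of `data` is sent to process i.
data = paddle.full([size, 2], fill_value=float(rank), dtype='float32')
output = paddle.empty([size, 2], dtype='float32')

# use_calc_stream defaults to True, so the call is synchronous and returns None.
dist.alltoall_single(data, output)

# Row i of `output` holds the slice received from process i, so every rank
# ends up with [[0., 0.], [1., 1.]]. Uneven partitions can be described via
# in_split_sizes / out_split_sizes.
print(output)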
