
Commit 4eedf85

Reduce the overhead of UCX add_procs with intercommunicators
* When creating a large number of intercommunicators with `MPI_Intercomm_create`, the UCX PML add_procs routine is called for each "new" process. This results in a call to `ucp_ep_create`, which overwrites the old endpoint at the PML level if one was already in place. However, it also adds a new endpoint to the underlying UCX instance without removing the old one. As a result, a large number of endpoints accumulate on the UCX worker. Creating these endpoints has overhead, which contributes to the slowdown of `MPI_Intercomm_create`.
* On finalize, these endpoints are cleaned up in the `ucp_worker_destroy` function. Because there are a significant number of endpoints, the cleanup takes quite a while.
* In this patch, we first check whether an endpoint has already been created for this process. If so, we skip adding it again. Otherwise, we create a new endpoint.

Signed-off-by: Joshua Hursey <jhursey@us.ibm.com>
1 parent fb66617 commit 4eedf85


1 file changed: +5 -0 lines changed

ompi/mca/pml/ucx/pml_ucx.c

Lines changed: 5 additions & 0 deletions
@@ -410,6 +410,11 @@ static ucp_ep_h mca_pml_ucx_add_proc_common(ompi_proc_t *proc)
     ucp_ep_h ep;
     int ret;
 
+    /* Do not add a new endpoint if we already created one */
+    if (NULL != proc->proc_endpoints[OMPI_PROC_ENDPOINT_TAG_PML]) {
+        return proc->proc_endpoints[OMPI_PROC_ENDPOINT_TAG_PML];
+    }
+
     ret = mca_pml_ucx_recv_worker_address(proc, &address, &addrlen);
     if (ret < 0) {
         return NULL;
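
The effect of the early return can be illustrated outside of Open MPI with a small standalone model. This is only a sketch under stated assumptions: every `mock_*` type and function, as well as `add_proc_unguarded`, `add_proc_guarded`, and `count_endpoints_after`, is a hypothetical stand-in invented for illustration and is not part of Open MPI or UCX. `mock_ep_create` plays the role of `ucp_ep_create`, and `pml_endpoint` plays the role of `proc->proc_endpoints[OMPI_PROC_ENDPOINT_TAG_PML]`.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_EPS 64

/* Hypothetical stand-ins; these are NOT real Open MPI or UCX types. */
typedef struct mock_ep {
    int id;
} mock_ep_t;

typedef struct mock_worker {
    mock_ep_t eps[MOCK_MAX_EPS]; /* endpoints the worker owns until destroyed */
    size_t    num_eps;           /* how many endpoints have accumulated */
} mock_worker_t;

typedef struct mock_proc {
    mock_ep_t *pml_endpoint;     /* cached endpoint slot for this process */
} mock_proc_t;

/* Stand-in for endpoint creation: every call registers one more
 * endpoint with the worker, whether or not the caller needed it. */
static mock_ep_t *mock_ep_create(mock_worker_t *worker)
{
    if (worker->num_eps >= MOCK_MAX_EPS) {
        return NULL;
    }
    mock_ep_t *ep = &worker->eps[worker->num_eps];
    ep->id = (int)worker->num_eps;
    worker->num_eps++;
    return ep;
}

/* Models the pre-patch behavior: always create a new endpoint and
 * overwrite the cached pointer; the old endpoint stays on the worker. */
static mock_ep_t *add_proc_unguarded(mock_worker_t *worker, mock_proc_t *proc)
{
    assert(worker != NULL && proc != NULL);
    mock_ep_t *ep = mock_ep_create(worker);
    if (NULL != ep) {
        proc->pml_endpoint = ep;
    }
    return ep;
}

/* Models the patched behavior: reuse the cached endpoint if present. */
static mock_ep_t *add_proc_guarded(mock_worker_t *worker, mock_proc_t *proc)
{
    assert(worker != NULL && proc != NULL);
    if (NULL != proc->pml_endpoint) {
        return proc->pml_endpoint;
    }
    return add_proc_unguarded(worker, proc);
}

/* Adds the same process `calls` times and reports how many endpoints
 * the worker ends up holding. */
size_t count_endpoints_after(int guarded, int calls)
{
    mock_worker_t worker = { .num_eps = 0 };
    mock_proc_t   proc   = { .pml_endpoint = NULL };

    for (int i = 0; i < calls; i++) {
        if (guarded) {
            add_proc_guarded(&worker, &proc);
        } else {
            add_proc_unguarded(&worker, &proc);
        }
    }
    return worker.num_eps;
}
```

In this model, calling `count_endpoints_after(0, 10)` leaves ten endpoints on the worker for a single process, while `count_endpoints_after(1, 10)` leaves one. That mirrors the accumulation described in the commit message, although the real cost in UCX depends on the transport and on what `ucp_worker_destroy` has to tear down.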
