Commit 58c5777

wckzhang and Palmer Stolly committed
pml/cm: Add CUDA detection to the PML/CM send paths
PML/CM recv paths already have CUDA detection: both recv and irecv call opal_convertor_copy_and_prepare_for_recv, which does the detection.

This patch and a subsequent MTL datapack patch address an error that occurs when using heterogeneous device send buffers. The code mallocs a temporary send buffer and then, because CONVERTOR_CUDA is never set, the subsequent opal_cuda_memcpy attempts a memcpy from the device buffer to the host buffer, which causes an error. Preparing the send buffers and setting CONVERTOR_CUDA allows the correct cuMemcpy to be used.

Signed-off-by: William Zhang <wilzhang@amazon.com>
Co-authored-by: William Zhang <wilzhang@amazon.com>
Co-authored-by: Palmer Stolly <pstolly@amazon.com>
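The failure mode described in the commit message can be sketched in isolation. The block below is a minimal illustration, not the OPAL implementation: every `sketch_` name is hypothetical, device memory is simulated with a static array, and the "device" copy is an ordinary `memcpy` standing in for a device-aware copy. It only shows why the flag has to be set during preparation for the pack step to choose the right copy path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-ins; these are NOT the real OPAL definitions. */
#define SKETCH_CONVERTOR_CUDA 0x1u

typedef struct {
    const void *base;   /* user send buffer */
    size_t      len;    /* bytes to pack */
    uint32_t    flags;  /* SKETCH_CONVERTOR_CUDA once a device buffer is detected */
} sketch_convertor_t;

typedef enum { SKETCH_COPY_HOST, SKETCH_COPY_DEVICE } sketch_copy_kind_t;

/* Simulated device memory: any pointer into this array counts as "device". */
static unsigned char sketch_device_heap[64];

static bool sketch_is_device_ptr(const void *p)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t lo   = (uintptr_t)sketch_device_heap;
    return addr >= lo && addr < lo + sizeof sketch_device_heap;
}

/* Analogue of the prepare step: record the buffer and run detection. */
static void sketch_prepare_for_send(sketch_convertor_t *conv,
                                    const void *buf, size_t len)
{
    assert(conv != NULL);
    conv->base = buf;
    conv->len  = len;
    if (sketch_is_device_ptr(buf)) {
        conv->flags |= SKETCH_CONVERTOR_CUDA;
    }
}

/* Pack into a host staging buffer and report which copy path was taken. */
static sketch_copy_kind_t sketch_pack(const sketch_convertor_t *conv,
                                      void *staging)
{
    if (conv->flags & SKETCH_CONVERTOR_CUDA) {
        /* Real code would issue a device-aware copy here; memcpy only
           works in this sketch because the "device" is simulated. */
        memcpy(staging, conv->base, conv->len);
        return SKETCH_COPY_DEVICE;
    }
    /* Host path: with a genuine device pointer this is the copy that fails. */
    memcpy(staging, conv->base, conv->len);
    return SKETCH_COPY_HOST;
}
```

A convertor that is filled in by hand without calling `sketch_prepare_for_send` keeps `flags == 0`, so `sketch_pack` takes the host path even for a device buffer. That is the situation the commit message describes for the send paths before this change.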
1 parent 68a7310 commit 58c5777

3 files changed: 24 additions, 1 deletion

ompi/mca/mtl/portals4/mtl_portals4_component.c

Lines changed: 6 additions & 0 deletions
@@ -428,6 +428,12 @@ ompi_mtl_portals4_component_init(bool enable_progress_threads,
                          id.phys.nid, id.phys.pid));
 
     ompi_mtl_portals4.base.mtl_max_tag = MTL_PORTALS4_MAX_TAG;
+
+    /* Disable opal from checking if buffer being sent is cuda */
+#if OPAL_CUDA_SUPPORT
+    ompi_mtl_portals4.base.mtl_flags |= MCA_MTL_BASE_FLAG_CUDA_INIT_DISABLE;
+#endif /* OPAL_CUDA_SUPPORT */
+
     return &ompi_mtl_portals4.base;
 
  error:

ompi/mca/pml/cm/pml_cm.h

Lines changed: 10 additions & 1 deletion
@@ -379,7 +379,16 @@ mca_pml_cm_send(const void *buf,
         convertor.pBaseBuf = (unsigned char*)buf + datatype->super.true_lb;
         convertor.count = count;
         convertor.pDesc = &datatype->super;
-    } else
+
+#if OPAL_CUDA_SUPPORT
+        /* Switches off CUDA detection if
+           MTL set MCA_MTL_BASE_FLAG_CUDA_INIT_DISABLE during init */
+        MCA_PML_CM_SWITCH_CUDA_CONVERTOR_OFF(flags, datatype, count);
+        convertor.flags |= flags;
+        /* Sets CONVERTOR_CUDA flag if CUDA buffer */
+        opal_convertor_prepare_for_send( &convertor, &datatype->super, count, buf );
+#endif
+    } else
 #endif
     {
         ompi_proc = ompi_comm_peer_lookup(comm, dst);

ompi/mca/pml/cm/pml_cm_sendreq.h

Lines changed: 8 additions & 0 deletions
@@ -242,6 +242,14 @@ do { \
             (unsigned char*)buf + datatype->super.true_lb; \
         (req_send)->req_base.req_convertor.count = count; \
         (req_send)->req_base.req_convertor.pDesc = &datatype->super; \
+        /* Switches off CUDA detection if \
+           MTL set MCA_MTL_BASE_FLAG_CUDA_INIT_DISABLE during init */ \
+        MCA_PML_CM_SWITCH_CUDA_CONVERTOR_OFF(flags, datatype, count); \
+        (req_send)->req_base.req_convertor.flags |= flags; \
+        /* Sets CONVERTOR_CUDA flag if CUDA buffer */ \
+        opal_convertor_prepare_for_send( \
+            &req_send->req_base.req_convertor, \
+            &datatype->super, count, buf ); \
     } else { \
         MCA_PML_CM_SWITCH_CUDA_CONVERTOR_OFF(flags, datatype, count); \
         opal_convertor_copy_and_prepare_for_send( \