Commit 1758e3d

btl/base: push operation->hdr to am_rdma_respond for queued operation
Currently, when calling am_rdma_respond() for a queued operation, am_rdma_retry_operation() passes NULL for the hdr argument. The idea was that hdr is only used to allocate operation->descriptor, and a queued operation should already have a descriptor, so it does not need hdr. This overlooked the possibility that the descriptor allocation in am_rdma_respond() can fail, in which case the operation is queued without a descriptor. This patch makes am_rdma_retry_operation() pass operation->hdr to am_rdma_respond() to address the issue. It also adds an assertion in am_rdma_respond() that hdr is not NULL before hdr is used.

Signed-off-by: Wei Zhang <wzam@amazon.com>
1 parent 8b7976a commit 1758e3d

1 file changed: +2 -1 lines changed


opal/mca/btl/base/btl_base_am_rdma.c

Lines changed: 2 additions & 1 deletion
@@ -605,6 +605,7 @@ static int am_rdma_respond(mca_btl_base_module_t *btl,
     *descriptor = NULL;
 
     if (NULL == send_descriptor) {
+        assert(NULL != hdr);
         am_rdma_response_hdr_t *resp_hdr;
         size_t data_size = am_rdma_is_atomic(hdr->type) ? hdr->data.atomic.size
                                                         : hdr->data.rdma.size;
@@ -780,7 +781,7 @@ static void am_rdma_retry_operation(am_rdma_operation_t *operation)
     } else {
         ret = am_rdma_respond(operation->btl, operation->endpoint,
                               &operation->descriptor,
-                              /*addr=*/NULL, /*hdr=*/NULL);
+                              /*addr=*/NULL, &operation->hdr);
     }
 
     if (OPAL_SUCCESS == ret) {
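
The failure mode described in the commit message can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions, not the Open MPI code: every name prefixed with toy_ (and the TOY_ constants) is hypothetical, the allocator is a stub that can be told to fail, and the "send" is reduced to freeing the descriptor. It shows why the retry path needs the stored header: if the first allocation fails, the operation is queued with a NULL descriptor, and the retry has to allocate again from hdr.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the real header, descriptor and operation types. */
typedef struct { size_t size; } toy_hdr_t;
typedef struct { size_t capacity; } toy_descriptor_t;
typedef struct {
    toy_hdr_t hdr;                /* copy of the request header kept with the queued operation */
    toy_descriptor_t *descriptor; /* NULL when allocation failed before the operation was queued */
} toy_operation_t;

enum { TOY_SUCCESS = 0, TOY_ERR_OUT_OF_RESOURCE = -1 };

/* Number of upcoming allocations that should fail (test knob, assumption). */
static int toy_alloc_failures = 0;

static toy_descriptor_t *toy_alloc(size_t size)
{
    if (toy_alloc_failures > 0) {
        toy_alloc_failures--;
        return NULL;
    }
    toy_descriptor_t *d = malloc(sizeof(*d));
    if (NULL != d) {
        d->capacity = size;
    }
    return d;
}

/* Mirrors the shape of the patched function: hdr is needed whenever no
 * descriptor is handed in, so it is asserted before first use. */
static int toy_respond(toy_descriptor_t **descriptor, const toy_hdr_t *hdr)
{
    toy_descriptor_t *send_descriptor = *descriptor;
    *descriptor = NULL;

    if (NULL == send_descriptor) {
        assert(NULL != hdr);
        send_descriptor = toy_alloc(hdr->size);
        if (NULL == send_descriptor) {
            /* Caller queues the operation; it still has no descriptor. */
            return TOY_ERR_OUT_OF_RESOURCE;
        }
    }

    /* Stand-in for the actual send: release the descriptor and report success. */
    free(send_descriptor);
    return TOY_SUCCESS;
}

/* Retry path: always pass the stored header, since the queued operation
 * may not own a descriptor yet. */
static int toy_retry(toy_operation_t *op)
{
    return toy_respond(&op->descriptor, &op->hdr);
}
```

With toy_alloc_failures set to 1, a first toy_respond() call on an operation returns TOY_ERR_OUT_OF_RESOURCE and leaves op.descriptor as NULL; a following toy_retry() then succeeds because it can allocate from op.hdr. Passing NULL for hdr in the retry would instead trip the assertion (or dereference NULL in a build with NDEBUG defined).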
