Commit 614733f

bcodding-rh authored and Trond Myklebust committed
nfs/blocklayout: Limit repeat device registration on failure
Every pNFS SCSI IO wants to do LAYOUTGET, then within the layout find the device which can drive GETDEVINFO, then finally may need to prep the device with a reservation. This slow work makes a mess of IO latencies if one of the later steps is going to fail for awhile.

If we're unable to register a SCSI device, ensure we mark the device as unavailable so that it will timeout and be re-added via GETDEVINFO. This avoids repeated doomed attempts to register a device in the IO path.

Add some clarifying comments as well.

Fixes: d869da9 ("nfs/blocklayout: Fix premature PR key unregistration")
Signed-off-by: Benjamin Coddington <bcodding@redhat.com>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Reviewed-by: Chuck Lever <chuck.lever@oracle.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
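
The negative-cache behaviour described above can be sketched in userspace C. This is only an illustration under stated assumptions, not the kernel implementation: `struct dev_entry`, `find_get_device()`, `register_dev()` and `RETRY_TIMEOUT` are hypothetical stand-ins, the "fetch" step stands in for GETDEVINFO, and time is a plain tick counter passed in by the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the retry timeout, in caller-supplied ticks. */
#define RETRY_TIMEOUT 120UL

/* Hypothetical, simplified cache entry for one device. */
struct dev_entry {
	bool present;                   /* entry exists in the cache */
	bool unavailable;               /* negative cache entry */
	unsigned long marked_at;        /* tick at which it was marked unavailable */
	unsigned int fetches;           /* number of simulated GETDEVINFO calls */
	unsigned int register_attempts; /* number of registration attempts */
};

/* Simulated registration: fails for as long as the device is "broken". */
static bool register_dev(struct dev_entry *e, bool broken)
{
	e->register_attempts++;
	return !broken;
}

/* Returns true if the device can be used for IO at tick `now`. */
static bool find_get_device(struct dev_entry *e, unsigned long now, bool broken)
{
	assert(e != NULL);
retry:
	if (!e->present) {
		/* Stands in for fetching fresh device info from the server. */
		e->present = true;
		e->unavailable = false;
		e->fetches++;
	}

	if (e->unavailable) {
		/* Unsigned subtraction keeps the age correct across wraparound. */
		if (now - e->marked_at > RETRY_TIMEOUT) {
			/* Window expired: drop the entry and fetch it again. */
			e->present = false;
			goto retry;
		}
		/* Still inside the window: fail fast, no registration attempt. */
		return false;
	}

	if (!register_dev(e, broken)) {
		/* Registration failed: leave a negative cache entry behind. */
		e->unavailable = true;
		e->marked_at = now;
		return false;
	}
	return true;
}
```

In this sketch a failed registration costs one attempt per retry window rather than one attempt per IO, which is the effect the commit message describes. Calls made inside the window return immediately without touching `register_dev()`.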
1 parent 3a4ce14 commit 614733f

1 file changed: +14 -1 lines changed

fs/nfs/blocklayout/blocklayout.c

Lines changed: 14 additions & 1 deletion
@@ -571,19 +571,32 @@ bl_find_get_deviceid(struct nfs_server *server,
 	if (!node)
 		return ERR_PTR(-ENODEV);
 
+	/*
+	 * Devices that are marked unavailable are left in the cache with a
+	 * timeout to avoid sending GETDEVINFO after every LAYOUTGET, or
+	 * constantly attempting to register the device. Once marked as
+	 * unavailable they must be deleted and never reused.
+	 */
 	if (test_bit(NFS_DEVICEID_UNAVAILABLE, &node->flags)) {
 		unsigned long end = jiffies;
 		unsigned long start = end - PNFS_DEVICE_RETRY_TIMEOUT;
 
 		if (!time_in_range(node->timestamp_unavailable, start, end)) {
+			/* Uncork subsequent GETDEVINFO operations for this device */
 			nfs4_delete_deviceid(node->ld, node->nfs_client, id);
 			goto retry;
 		}
 		goto out_put;
 	}
 
-	if (!bl_register_dev(container_of(node, struct pnfs_block_dev, node)))
+	if (!bl_register_dev(container_of(node, struct pnfs_block_dev, node))) {
+		/*
+		 * If we cannot register, treat this device as transient:
+		 * Make a negative cache entry for the device
+		 */
+		nfs4_mark_deviceid_unavailable(node);
 		goto out_put;
+	}
 
 	return node;
 
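
The retry-window test in the hunk above (`start = end - PNFS_DEVICE_RETRY_TIMEOUT`, then `time_in_range(...)`) depends on comparisons that stay correct when the tick counter wraps. The following is a minimal userspace sketch of that idea, not the kernel macros themselves: `tick_after_eq()`, `tick_in_range()` and `retry_window_open()` are hypothetical names, and the signed-difference comparison assumes the usual two's-complement conversion.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * True if tick a is at or after tick b. Comparing the signed difference
 * instead of the raw values keeps the answer right across wraparound,
 * provided the two ticks are less than half the counter range apart.
 */
static bool tick_after_eq(unsigned long a, unsigned long b)
{
	return (long)(a - b) >= 0;
}

/* True if start <= t <= end in wrap-safe tick order. */
static bool tick_in_range(unsigned long t, unsigned long start,
			  unsigned long end)
{
	return tick_after_eq(t, start) && tick_after_eq(end, t);
}

/*
 * True while a device marked unavailable at `marked_at` should still be
 * treated as unavailable at tick `now`, i.e. the retry window is open.
 */
static bool retry_window_open(unsigned long marked_at, unsigned long now,
			      unsigned long timeout)
{
	unsigned long start = now - timeout; /* may wrap; that is fine */

	/* The signed comparison only works for windows below half the range. */
	assert(timeout <= (unsigned long)LONG_MAX);
	return tick_in_range(marked_at, start, now);
}
```

When `retry_window_open()` returns false in this sketch, the caller would drop the cached entry and look the device up again, mirroring the `nfs4_delete_deviceid()` plus `goto retry` path in the diff.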