Commit 4e18c82

josefbacik authored and gregkh committed
btrfs: wait for actual caching progress during allocation
commit fc1f91b upstream.

Recently we've been having mysterious hangs while running generic/475 on
the CI system. This turned out to be something like this:

  Task 1
  dmsetup suspend --nolockfs
  -> __dm_suspend
   -> dm_wait_for_completion
    -> dm_wait_for_bios_completion
     -> Unable to complete because of IO's on a plug in Task 2

  Task 2
  wb_workfn
  -> wb_writeback
   -> blk_start_plug
    -> writeback_sb_inodes
     -> Infinite loop unable to make an allocation

  Task 3
  cache_block_group
  ->read_extent_buffer_pages
   ->Waiting for IO to complete that can't be submitted because Task 1
     suspended the DM device

The problem here is that we need Task 2 to be scheduled completely for
the blk plug to flush. Normally this would happen, we normally wait for
the block group caching to finish (Task 3), and this schedule would
result in the block plug flushing.

However if there's enough free space available from the current caching
to satisfy the allocation we won't actually wait for the caching to
complete. This check however just checks that we have enough space, not
that we can make the allocation. In this particular case we were trying
to allocate 9MiB, and we had 10MiB of free space, but we didn't have
9MiB of contiguous space to allocate, and thus the allocation failed and
we looped.

We specifically don't cycle through the FFE loop until we stop finding
cached block groups because we don't want to allocate new block groups
just because we're caching, so we short circuit the normal loop once we
hit LOOP_CACHING_WAIT and we found a caching block group.

This is normally fine, except in this particular case where the caching
thread can't make progress because the DM device has been suspended.

Fix this by not only waiting for free space to >= the amount of space we
want to allocate, but also that we make some progress in caching from
the time we start waiting. This will keep us from busy looping when the
caching is taking a while but still theoretically has enough space for
us to allocate from, and fixes this particular case by forcing us to
actually sleep and wait for forward progress, which will flush the plug.

With this fix we're no longer hanging with generic/475.

CC: stable@vger.kernel.org # 6.1+
Reviewed-by: Boris Burkov <boris@bur.io>
Signed-off-by: Josef Bacik <josef@toxicpanda.com>
Signed-off-by: David Sterba <dsterba@suse.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
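The 9MiB versus 10MiB case in the message can be illustrated with a small userspace sketch. This is not btrfs code: `struct free_extent`, `total_free()`, `largest_free()`, `old_check_passes()` and `can_allocate()` are hypothetical names invented here, and the extent layout is made up. It only shows why a check on total free bytes can pass while a contiguous allocation of the same size still fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)

/* Hypothetical stand-in for one free extent; not a btrfs structure. */
struct free_extent {
	uint64_t start;
	uint64_t len;
};

/* Sum of all free bytes, the quantity the old wait condition compared. */
static uint64_t total_free(const struct free_extent *ext, size_t n)
{
	uint64_t sum = 0;

	for (size_t i = 0; i < n; i++)
		sum += ext[i].len;
	return sum;
}

/* Largest single free extent, which bounds a contiguous allocation. */
static uint64_t largest_free(const struct free_extent *ext, size_t n)
{
	uint64_t best = 0;

	for (size_t i = 0; i < n; i++)
		if (ext[i].len > best)
			best = ext[i].len;
	return best;
}

/* "Enough space" in the sense of the old check: total bytes only. */
static bool old_check_passes(const struct free_extent *ext, size_t n,
			     uint64_t num_bytes)
{
	return total_free(ext, n) >= num_bytes;
}

/* Whether a contiguous request of num_bytes could actually be satisfied. */
static bool can_allocate(const struct free_extent *ext, size_t n,
			 uint64_t num_bytes)
{
	assert(num_bytes > 0);
	return largest_free(ext, n) >= num_bytes;
}
```

With two free extents of 6MiB and 4MiB, `old_check_passes()` reports success for a 9MiB request while `can_allocate()` does not, which is the mismatch that let the allocator loop without ever sleeping.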
1 parent b8cd871 commit 4e18c82

2 files changed: +17 -2 lines changed
fs/btrfs/block-group.c

Lines changed: 15 additions & 2 deletions
@@ -436,13 +436,23 @@ void btrfs_wait_block_group_cache_progress(struct btrfs_block_group *cache,
 					   u64 num_bytes)
 {
 	struct btrfs_caching_control *caching_ctl;
+	int progress;
 
 	caching_ctl = btrfs_get_caching_control(cache);
 	if (!caching_ctl)
 		return;
 
+	/*
+	 * We've already failed to allocate from this block group, so even if
+	 * there's enough space in the block group it isn't contiguous enough to
+	 * allow for an allocation, so wait for at least the next wakeup tick,
+	 * or for the thing to be done.
+	 */
+	progress = atomic_read(&caching_ctl->progress);
+
 	wait_event(caching_ctl->wait, btrfs_block_group_done(cache) ||
-		   (cache->free_space_ctl->free_space >= num_bytes));
+		   (progress != atomic_read(&caching_ctl->progress) &&
+		    (cache->free_space_ctl->free_space >= num_bytes)));
 
 	btrfs_put_caching_control(caching_ctl);
 }
@@ -660,8 +670,10 @@ static int load_extent_tree_free(struct btrfs_caching_control *caching_ctl)
 
 			if (total_found > CACHING_CTL_WAKE_UP) {
 				total_found = 0;
-				if (wakeup)
+				if (wakeup) {
+					atomic_inc(&caching_ctl->progress);
 					wake_up(&caching_ctl->wait);
+				}
 			}
 		}
 		path->slots[0]++;
@@ -767,6 +779,7 @@ int btrfs_cache_block_group(struct btrfs_block_group *cache, bool wait)
 	init_waitqueue_head(&caching_ctl->wait);
 	caching_ctl->block_group = cache;
 	refcount_set(&caching_ctl->count, 2);
+	atomic_set(&caching_ctl->progress, 0);
 	btrfs_init_work(&caching_ctl->work, caching_thread, NULL, NULL);
 
 	spin_lock(&cache->lock);
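The change to the `wait_event()` condition in the first hunk can be modelled as two pure predicates. This is a simplified userspace sketch, not kernel code: `struct wait_state`, `old_wait_cond()` and `new_wait_cond()` are hypothetical names, and the fields are plain values rather than the kernel's `atomic_t` and free-space structures. The point it tries to show is that `wait_event()` evaluates its condition before sleeping, so a condition that is already true returns without ever scheduling.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the state the wait condition looks at. */
struct wait_state {
	bool done;           /* models btrfs_block_group_done(cache) */
	int progress;        /* models atomic_read(&caching_ctl->progress) */
	uint64_t free_space; /* models cache->free_space_ctl->free_space */
};

/* Condition before the patch: done, or enough total free space. */
static bool old_wait_cond(const struct wait_state *s, uint64_t num_bytes)
{
	return s->done || s->free_space >= num_bytes;
}

/*
 * Condition after the patch: done, or the progress counter has moved
 * since the waiter sampled it AND there is enough total free space.
 */
static bool new_wait_cond(const struct wait_state *s, int progress_at_entry,
			  uint64_t num_bytes)
{
	return s->done ||
	       (progress_at_entry != s->progress &&
		s->free_space >= num_bytes);
}
```

In this model, a waiter that enters with 10 units free and asks for 9 sees `old_wait_cond()` true immediately, while `new_wait_cond()` stays false until the counter differs from the value sampled on entry or caching is done. That forced sleep is what gives the blocked task a chance to be scheduled out and flush its plug.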

fs/btrfs/block-group.h

Lines changed: 2 additions & 0 deletions
@@ -70,6 +70,8 @@ struct btrfs_caching_control {
 	wait_queue_head_t wait;
 	struct btrfs_work work;
 	struct btrfs_block_group *block_group;
+	/* Track progress of caching during allocation. */
+	atomic_t progress;
 	refcount_t count;
 };
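The new `progress` field is used as a generation counter: the waiter samples it, the caching side bumps it before each wakeup, and the waiter only compares for inequality. A rough userspace analogue using C11 `<stdatomic.h>` in place of the kernel's `atomic_t` might look like the following. All names here (`struct caching_progress` and the `progress_*` helpers) are hypothetical and exist only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace analogue of the counter added to the struct. */
struct caching_progress {
	atomic_int ticks;
};

/* Mirrors atomic_set(&caching_ctl->progress, 0) at setup time. */
static void progress_init(struct caching_progress *p)
{
	atomic_init(&p->ticks, 0);
}

/* Mirrors the waiter's atomic_read() before it starts waiting. */
static int progress_snapshot(struct caching_progress *p)
{
	return atomic_load(&p->ticks);
}

/* Mirrors the caching side's atomic_inc() just before wake_up(). */
static void progress_tick(struct caching_progress *p)
{
	atomic_fetch_add(&p->ticks, 1);
}

/* True once at least one tick has happened since the snapshot was taken. */
static bool progress_made_since(struct caching_progress *p, int snapshot)
{
	return atomic_load(&p->ticks) != snapshot;
}
```

Because the comparison is for inequality rather than "greater than", the absolute value of the counter does not matter; only a change relative to the waiter's snapshot does.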
