Commit 255265e

fdmanana authored and kdave committed
btrfs: propagate last_unlink_trans earlier when doing a rmdir

In case the removed directory had a snapshot that was deleted, we are propagating its inode's last_unlink_trans to the parent directory after we removed the entry from the parent directory. This leaves a small race window where someone can log the parent directory after we removed the entry and before we updated last_unlink_trans, and as a result if we ever try to replay such a log tree, we will fail since we will attempt to remove a snapshot during log replay, which is currently not possible and causes the log replay (and mount) to fail. This is the type of failure described in commit 1ec9a1a ("Btrfs: fix unreplayable log after snapshot delete + parent dir fsync").

So fix this by propagating the last_unlink_trans to the parent directory before we remove the entry from it.

Fixes: 44f714d ("Btrfs: improve performance on fsync against new inode after rename/unlink")
Signed-off-by: Filipe Manana <fdmanana@suse.com>
Reviewed-by: Johannes Thumshirn <johannes.thumshirn@wdc.com>
Signed-off-by: David Sterba <dsterba@suse.com>

1 parent 2b6f3e4 commit 255265e

1 file changed, +18 -18 lines changed

fs/btrfs/inode.c

Lines changed: 18 additions & 18 deletions
@@ -4707,7 +4707,6 @@ static int btrfs_rmdir(struct inode *dir, struct dentry *dentry)
 	struct btrfs_fs_info *fs_info = BTRFS_I(inode)->root->fs_info;
 	int ret = 0;
 	struct btrfs_trans_handle *trans;
-	u64 last_unlink_trans;
 	struct fscrypt_name fname;
 
 	if (inode->i_size > BTRFS_EMPTY_DIR_SIZE)
@@ -4733,6 +4732,23 @@ static int btrfs_rmdir(struct inode *dir, struct dentry *dentry)
 		goto out_notrans;
 	}
 
+	/*
+	 * Propagate the last_unlink_trans value of the deleted dir to its
+	 * parent directory. This is to prevent an unrecoverable log tree in the
+	 * case we do something like this:
+	 * 1) create dir foo
+	 * 2) create snapshot under dir foo
+	 * 3) delete the snapshot
+	 * 4) rmdir foo
+	 * 5) mkdir foo
+	 * 6) fsync foo or some file inside foo
+	 *
+	 * This is because we can't unlink other roots when replaying the dir
+	 * deletes for directory foo.
+	 */
+	if (BTRFS_I(inode)->last_unlink_trans >= trans->transid)
+		BTRFS_I(dir)->last_unlink_trans = BTRFS_I(inode)->last_unlink_trans;
+
 	if (unlikely(btrfs_ino(BTRFS_I(inode)) == BTRFS_EMPTY_SUBVOL_DIR_OBJECTID)) {
 		ret = btrfs_unlink_subvol(trans, BTRFS_I(dir), dentry);
 		goto out;
@@ -4742,27 +4758,11 @@ static int btrfs_rmdir(struct inode *dir, struct dentry *dentry)
 	if (ret)
 		goto out;
 
-	last_unlink_trans = BTRFS_I(inode)->last_unlink_trans;
-
 	/* now the directory is empty */
 	ret = btrfs_unlink_inode(trans, BTRFS_I(dir), BTRFS_I(d_inode(dentry)),
 				 &fname.disk_name);
-	if (!ret) {
+	if (!ret)
 		btrfs_i_size_write(BTRFS_I(inode), 0);
-		/*
-		 * Propagate the last_unlink_trans value of the deleted dir to
-		 * its parent directory. This is to prevent an unrecoverable
-		 * log tree in the case we do something like this:
-		 * 1) create dir foo
-		 * 2) create snapshot under dir foo
-		 * 3) delete the snapshot
-		 * 4) rmdir foo
-		 * 5) mkdir foo
-		 * 6) fsync foo or some file inside foo
-		 */
-		if (last_unlink_trans >= trans->transid)
-			BTRFS_I(dir)->last_unlink_trans = last_unlink_trans;
-	}
 out:
 	btrfs_end_transaction(trans);
 out_notrans:
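
The effect of the reordering can be illustrated with a small userspace model. This is only a sketch and not kernel code: `struct model_inode`, `must_commit()`, `propagate()`, `unsafe()` and `rmdir_exposes_unsafe_state()` are hypothetical names, and the rule that a logger falls back to a full transaction commit when the directory's `last_unlink_trans` is at least the running transaction id is an assumption modeled on the comparison used in the patch. The model checks the parent's state after each step, which is where a concurrent fsync could observe it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the parts of a btrfs inode that matter here. */
struct model_inode {
	uint64_t last_unlink_trans;
	bool has_entry;	/* parent only: the child's dir entry is still present */
};

enum rmdir_order { PROPAGATE_AFTER_UNLINK, PROPAGATE_BEFORE_UNLINK };

/* Assumed logging rule: an unlink in this transaction forces a full commit. */
static bool must_commit(const struct model_inode *dir, uint64_t transid)
{
	return dir->last_unlink_trans >= transid;
}

/* Same guard as in the patch: copy only if the value is >= the transid. */
static void propagate(struct model_inode *parent,
		      const struct model_inode *child, uint64_t transid)
{
	assert(parent && child);
	if (child->last_unlink_trans >= transid)
		parent->last_unlink_trans = child->last_unlink_trans;
}

/* Unsafe: the entry is gone, yet a logger would not fall back to a commit. */
static bool unsafe(const struct model_inode *parent, uint64_t transid)
{
	return !parent->has_entry && !must_commit(parent, transid);
}

/* Returns true if any intermediate state of the modeled rmdir is unsafe. */
bool rmdir_exposes_unsafe_state(enum rmdir_order order)
{
	const uint64_t transid = 10;	/* arbitrary id of the running transaction */
	struct model_inode parent = { .last_unlink_trans = 0, .has_entry = true };
	/* The removed directory had a snapshot deleted in this transaction. */
	struct model_inode child = { .last_unlink_trans = transid };
	bool exposed = false;

	if (order == PROPAGATE_BEFORE_UNLINK) {
		propagate(&parent, &child, transid);
		exposed |= unsafe(&parent, transid);
		parent.has_entry = false;		/* entry removed */
		exposed |= unsafe(&parent, transid);
	} else {
		parent.has_entry = false;		/* entry removed first */
		exposed |= unsafe(&parent, transid);	/* the race window */
		propagate(&parent, &child, transid);
		exposed |= unsafe(&parent, transid);
	}
	return exposed;
}
```

In this model, `rmdir_exposes_unsafe_state(PROPAGATE_AFTER_UNLINK)` returns true because the parent is briefly observable with the entry removed and a stale `last_unlink_trans`, while `rmdir_exposes_unsafe_state(PROPAGATE_BEFORE_UNLINK)` returns false. The model is single-threaded and ignores locking, so it only illustrates the ordering argument.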
