Skip to content

Commit 194505b

Browse files
Brian Fostertytso
authored andcommitted
ext4: drop dio overwrite only flag and associated warning
The commit referenced below opened up concurrent unaligned dio under shared locking for pure overwrites. In doing so, it enabled use of the IOMAP_DIO_OVERWRITE_ONLY flag and added a warning on unexpected -EAGAIN returns as an extra precaution, since ext4 does not retry writes in such cases. The flag itself is advisory in this case since ext4 checks for unaligned I/Os and uses appropriate locking up front, rather than on a retry in response to -EAGAIN. As it turns out, the warning check is susceptible to false positives because there are scenarios where -EAGAIN can be expected from lower layers without necessarily having IOCB_NOWAIT set on the iocb. For example, one instance of the warning has been seen where io_uring sets IOCB_HIPRI, which in turn results in REQ_POLLED|REQ_NOWAIT on the bio. This results in -EAGAIN if the block layer is unable to allocate a request, etc. [Note that there is an outstanding patch to untangle REQ_POLLED and REQ_NOWAIT such that the latter relies on IOCB_NOWAIT, which would also address this instance of the warning.] Another instance of the warning has been reproduced by syzbot. A dio write is interrupted down in __get_user_pages_locked() waiting on the mm lock and returns -EAGAIN up the stack. If the iomap dio iteration layer has made no progress on the write to this point, -EAGAIN returns up to the filesystem and triggers the warning. This use of the overwrite flag in ext4 is precautionary and half-baked. I.e., ext4 doesn't actually implement overwrite checking in the iomap callbacks when the flag is set, so the only extra verification it provides are i_size checks in the generic iomap dio layer. Combined with the tendency for false positives, the added verification is not worth the extra trouble. Remove the flag, associated warning, and update the comments to document when concurrent unaligned dio writes are allowed and why said flag is not used. Cc: stable@kernel.org Reported-by: syzbot+5050ad0fb47527b1808a@syzkaller.appspotmail.com Reported-by: Pengfei Xu <pengfei.xu@intel.com> Fixes: 310ee09 ("ext4: allow concurrent unaligned dio overwrites") Signed-off-by: Brian Foster <bfoster@redhat.com> Reviewed-by: Jan Kara <jack@suse.cz> Link: https://lore.kernel.org/r/20230810165559.946222-1-bfoster@redhat.com Signed-off-by: Theodore Ts'o <tytso@mit.edu>
1 parent 68228da commit 194505b

File tree

1 file changed

+10
-15
lines changed

1 file changed

+10
-15
lines changed

fs/ext4/file.c

Lines changed: 10 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -476,6 +476,11 @@ static ssize_t ext4_dio_write_checks(struct kiocb *iocb, struct iov_iter *from,
476476
* required to change security info in file_modified(), for extending
477477
* I/O, any form of non-overwrite I/O, and unaligned I/O to unwritten
478478
* extents (as partial block zeroing may be required).
479+
*
480+
* Note that unaligned writes are allowed under shared lock so long as
481+
* they are pure overwrites. Otherwise, concurrent unaligned writes risk
482+
* data corruption due to partial block zeroing in the dio layer, and so
483+
* the I/O must occur exclusively.
479484
*/
480485
if (*ilock_shared &&
481486
((!IS_NOSEC(inode) || *extend || !overwrite ||
@@ -492,21 +497,12 @@ static ssize_t ext4_dio_write_checks(struct kiocb *iocb, struct iov_iter *from,
492497

493498
/*
494499
* Now that locking is settled, determine dio flags and exclusivity
495-
* requirements. Unaligned writes are allowed under shared lock so long
496-
* as they are pure overwrites. Set the iomap overwrite only flag as an
497-
* added precaution in this case. Even though this is unnecessary, we
498-
* can detect and warn on unexpected -EAGAIN if an unsafe unaligned
499-
* write is ever submitted.
500-
*
501-
* Otherwise, concurrent unaligned writes risk data corruption due to
502-
* partial block zeroing in the dio layer, and so the I/O must occur
503-
* exclusively. The inode lock is already held exclusive if the write is
504-
* non-overwrite or extending, so drain all outstanding dio and set the
505-
* force wait dio flag.
500+
* requirements. We don't use DIO_OVERWRITE_ONLY because we enforce
501+
* behavior already. The inode lock is already held exclusive if the
502+
* write is non-overwrite or extending, so drain all outstanding dio and
503+
* set the force wait dio flag.
506504
*/
507-
if (*ilock_shared && unaligned_io) {
508-
*dio_flags = IOMAP_DIO_OVERWRITE_ONLY;
509-
} else if (!*ilock_shared && (unaligned_io || *extend)) {
505+
if (!*ilock_shared && (unaligned_io || *extend)) {
510506
if (iocb->ki_flags & IOCB_NOWAIT) {
511507
ret = -EAGAIN;
512508
goto out;
@@ -608,7 +604,6 @@ static ssize_t ext4_dio_write_iter(struct kiocb *iocb, struct iov_iter *from)
608604
iomap_ops = &ext4_iomap_overwrite_ops;
609605
ret = iomap_dio_rw(iocb, from, iomap_ops, &ext4_dio_write_ops,
610606
dio_flags, NULL, 0);
611-
WARN_ON_ONCE(ret == -EAGAIN && !(iocb->ki_flags & IOCB_NOWAIT));
612607
if (ret == -ENOTBLK)
613608
ret = 0;
614609

0 commit comments

Comments
 (0)