
Commit 2b3f004

Darrick J. Wong authored and Chandan Babu R committed
xfs: drop xfarray sortinfo folio on error
Chandan Babu reports the following livelock in xfs/708:

 run fstests xfs/708 at 2024-05-04 15:35:29
 XFS (loop16): EXPERIMENTAL online scrub feature in use. Use at your own risk!
 XFS (loop5): Mounting V5 Filesystem e96086f0-a2f9-4424-a1d5-c75d53d823be
 XFS (loop5): Ending clean mount
 XFS (loop5): Quotacheck needed: Please wait.
 XFS (loop5): Quotacheck: Done.
 XFS (loop5): EXPERIMENTAL online scrub feature in use. Use at your own risk!
 INFO: task xfs_io:143725 blocked for more than 122 seconds.
 Not tainted 6.9.0-rc4+ #1
 "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
 task:xfs_io state:D stack:0 pid:143725 tgid:143725 ppid:117661 flags:0x00004006
 Call Trace:
  <TASK>
  __schedule+0x69c/0x17a0
  schedule+0x74/0x1b0
  io_schedule+0xc4/0x140
  folio_wait_bit_common+0x254/0x650
  shmem_undo_range+0x9d5/0xb40
  shmem_evict_inode+0x322/0x8f0
  evict+0x24e/0x560
  __dentry_kill+0x17d/0x4d0
  dput+0x263/0x430
  __fput+0x2fc/0xaa0
  task_work_run+0x132/0x210
  get_signal+0x1a8/0x1910
  arch_do_signal_or_restart+0x7b/0x2f0
  syscall_exit_to_user_mode+0x1c2/0x200
  do_syscall_64+0x72/0x170
  entry_SYSCALL_64_after_hwframe+0x76/0x7e

The shmem code is trying to drop all the folios attached to a shmem
file and gets stuck on a locked folio after a bnobt repair. It looks
like the process has a signal pending, so I started looking for places
where we lock an xfile folio and then deal with a fatal signal.

I found a bug in xfarray_sort_scan via code inspection. This function
is called to set up the scanning phase of a quicksort operation, which
may involve grabbing a locked xfile folio. If we exit the function with
an error code, the caller does not call xfarray_sort_scan_done to put
the xfile folio. If _sort_scan returns an error code while si->folio is
set, we leak the reference and never unlock the folio.

Therefore, change xfarray_sort to call _scan_done on exit. This is safe
to call multiple times because it sets si->folio to NULL and ignores a
NULL si->folio. Also change _sort_scan to use an intermediate variable
so that we never pollute si->folio with an errptr.

Fixes: 232ea05 ("xfs: enable sorting of xfile-backed arrays")
Reported-by: Chandan Babu R <chandanbabu@kernel.org>
Signed-off-by: "Darrick J. Wong" <djwong@kernel.org>
Reviewed-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Chandan Babu R <chandanbabu@kernel.org>

1 parent b33874f commit 2b3f004

1 file changed, +6 -3 lines changed

fs/xfs/scrub/xfarray.c

Lines changed: 6 additions & 3 deletions
@@ -822,12 +822,14 @@ xfarray_sort_scan(
 
 	/* Grab the first folio that backs this array element. */
 	if (!si->folio) {
+		struct folio	*folio;
 		loff_t		next_pos;
 
-		si->folio = xfile_get_folio(si->array->xfile, idx_pos,
+		folio = xfile_get_folio(si->array->xfile, idx_pos,
 				si->array->obj_size, XFILE_ALLOC);
-		if (IS_ERR(si->folio))
-			return PTR_ERR(si->folio);
+		if (IS_ERR(folio))
+			return PTR_ERR(folio);
+		si->folio = folio;
 
 		si->first_folio_idx = xfarray_idx(si->array,
 				folio_pos(si->folio) + si->array->obj_size - 1);
@@ -1048,6 +1050,7 @@ xfarray_sort(
 
 out_free:
 	trace_xfarray_sort_stats(si, error);
+	xfarray_sort_scan_done(si);
 	kvfree(si);
 	return error;
 }
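
The two changes in this diff can be modelled in userspace. What follows is a minimal sketch, not the kernel code: every `fake_*` name is hypothetical, the `ERR_PTR`/`PTR_ERR`/`IS_ERR` helpers are simplified stand-ins for the kernel macros of the same names, and the "folio" is just a struct with a lock flag and a reference count. It illustrates why the getter's result goes through a local variable first (so the cached pointer is only ever NULL or valid), and why an idempotent release on the common exit path covers every error return.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Simplified userspace stand-ins for the kernel's error-pointer helpers. */
#define MAX_ERRNO 4095
static inline void *ERR_PTR(long err) { return (void *)(intptr_t)err; }
static inline long PTR_ERR(const void *p) { return (long)(intptr_t)p; }
static inline bool IS_ERR(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-MAX_ERRNO;
}

/* Hypothetical model of a folio that is handed out locked and referenced. */
struct fake_folio {
	bool locked;
	int refcount;
};

/* Hypothetical model of the sort state that caches the current folio. */
struct fake_sortinfo {
	struct fake_folio *folio;
};

/* Hypothetical getter: returns an error pointer when asked to fail. */
static struct fake_folio *fake_get_folio(struct fake_folio *backing, bool fail)
{
	if (fail)
		return ERR_PTR(-EINTR);
	backing->locked = true;
	backing->refcount++;
	return backing;
}

/* Idempotent release: ignores NULL and resets the cached pointer to NULL. */
static void fake_scan_done(struct fake_sortinfo *si)
{
	if (!si->folio)
		return;
	assert(si->folio->refcount > 0);
	si->folio->locked = false;
	si->folio->refcount--;
	si->folio = NULL;
}

/* Only a valid pointer is ever stored in si->folio, never an error pointer. */
static int fake_sort_scan(struct fake_sortinfo *si, struct fake_folio *backing,
			  bool fail)
{
	if (!si->folio) {
		struct fake_folio *folio;

		folio = fake_get_folio(backing, fail);
		if (IS_ERR(folio))
			return (int)PTR_ERR(folio);
		si->folio = folio;
	}
	return 0;
}

/* Every exit path, success or error, passes through the release at "out". */
static int fake_sort(struct fake_folio *backing, bool fail_get, bool fail_later)
{
	struct fake_sortinfo si = { .folio = NULL };
	int error;

	error = fake_sort_scan(&si, backing, fail_get);
	if (error)
		goto out;
	if (fail_later) {
		/* Error while the folio is still locked and referenced. */
		error = -EINTR;
		goto out;
	}
	fake_scan_done(&si);	/* normal release */
out:
	fake_scan_done(&si);	/* no-op if the folio was already released */
	return error;
}
```

Calling `fake_sort()` with either failure flag set should return `-EINTR` and leave the backing folio unlocked with a zero reference count. Without the release at `out`, the `fail_later` path would leave it locked, which is the leak the commit message describes.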