
Commit 936184e

konis authored and akpm00 committed
nilfs2: fix unexpected freezing of nilfs_segctor_sync()
A potential and reproducible race issue has been identified where nilfs_segctor_sync() would block even after the log writer thread writes a checkpoint, unless there is an interrupt or other trigger to resume log writing.

This turned out to be because, depending on the execution timing of the log writer thread running in parallel, the log writer thread may skip responding to nilfs_segctor_sync(), which causes a call to schedule() waiting for completion within nilfs_segctor_sync() to lose the opportunity to wake up.

The reason why waking up the task waiting in nilfs_segctor_sync() may be skipped is that two operations are not done atomically: updating the request generation issued using a shared sequence counter, and adding a wait queue entry to the log writer's request wait queue. Log writing and request completion notification by nilfs_segctor_wakeup() may occur between the two operations. In that case, the wait queue entry is not yet visible to nilfs_segctor_wakeup(), and the wake-up of nilfs_segctor_sync() is carried over until the next request occurs.

Fix this issue by performing these two operations together within the lock section of sc_state_lock. Also, following the memory barrier guidelines for event waiting loops, move the call to set_current_state() in the same location into the event waiting loop to ensure that a memory barrier is inserted just before the event condition determination.

Link: https://lkml.kernel.org/r/20240520132621.4054-3-konishi.ryusuke@gmail.com
Fixes: 9ff0512 ("nilfs2: segment constructor")
Signed-off-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Tested-by: Ryusuke Konishi <konishi.ryusuke@gmail.com>
Cc: <stable@vger.kernel.org>
Cc: "Bai, Shuangpeng" <sjb7183@psu.edu>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
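The interleaving described in the commit message can be illustrated with a small single-threaded model. This is only a sketch under stated assumptions: `struct model`, `struct waiter`, `writer_pass()` and the two `*_order_wakes_waiter()` functions are hypothetical names invented for illustration, not nilfs2 code, and the model replays one fixed schedule instead of using real threads or locks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; none of these names exist in nilfs2. */
#define MAX_WAITERS 4

struct waiter {
	unsigned int seq;	/* request generation this waiter waits for */
	bool done;		/* set by the writer once that generation completes */
};

struct model {
	unsigned int seq_request;		/* last request generation issued */
	unsigned int seq_done;			/* last generation completed */
	struct waiter *queue[MAX_WAITERS];	/* stands in for the wait queue */
	size_t nwaiters;
};

static void enqueue_waiter(struct model *m, struct waiter *w)
{
	assert(m->nwaiters < MAX_WAITERS);
	m->queue[m->nwaiters++] = w;
}

/* Writer side: complete all requests issued so far, notify queued waiters. */
static void writer_pass(struct model *m)
{
	m->seq_done = m->seq_request;
	for (size_t i = 0; i < m->nwaiters; i++) {
		if (m->seq_done >= m->queue[i]->seq)
			m->queue[i]->done = true;
	}
}

/* Old ordering: the writer runs between the counter bump and the enqueue. */
bool unfixed_order_wakes_waiter(void)
{
	struct model m = {0};
	struct waiter w = { .seq = 0, .done = false };

	w.seq = ++m.seq_request;	/* step 1: issue the request */
	writer_pass(&m);		/* writer completes it; queue is empty */
	enqueue_waiter(&m, &w);		/* step 2: too late to be notified */
	return w.done;
}

/* New ordering: both steps act as one unit, so the writer runs after them. */
bool fixed_order_wakes_waiter(void)
{
	struct model m = {0};
	struct waiter w = { .seq = 0, .done = false };

	w.seq = ++m.seq_request;	/* step 1 and step 2 happen together, */
	enqueue_waiter(&m, &w);		/* as if inside a single lock section */
	writer_pass(&m);
	return w.done;
}
```

In this model, `unfixed_order_wakes_waiter()` leaves the waiter unnotified until some later writer pass, while `fixed_order_wakes_waiter()` notifies it on the first pass, which mirrors the behavior the commit message describes.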
1 parent f5d4e04 commit 936184e

1 file changed: +13 -4 lines changed


fs/nilfs2/segment.c

Lines changed: 13 additions & 4 deletions
@@ -2168,19 +2168,28 @@ static int nilfs_segctor_sync(struct nilfs_sc_info *sci)
 	struct nilfs_segctor_wait_request wait_req;
 	int err = 0;
 
-	spin_lock(&sci->sc_state_lock);
 	init_wait(&wait_req.wq);
 	wait_req.err = 0;
 	atomic_set(&wait_req.done, 0);
+	init_waitqueue_entry(&wait_req.wq, current);
+
+	/*
+	 * To prevent a race issue where completion notifications from the
+	 * log writer thread are missed, increment the request sequence count
+	 * "sc_seq_request" and insert a wait queue entry using the current
+	 * sequence number into the "sc_wait_request" queue at the same time
+	 * within the lock section of "sc_state_lock".
+	 */
+	spin_lock(&sci->sc_state_lock);
 	wait_req.seq = ++sci->sc_seq_request;
+	add_wait_queue(&sci->sc_wait_request, &wait_req.wq);
 	spin_unlock(&sci->sc_state_lock);
 
-	init_waitqueue_entry(&wait_req.wq, current);
-	add_wait_queue(&sci->sc_wait_request, &wait_req.wq);
-	set_current_state(TASK_INTERRUPTIBLE);
 	wake_up(&sci->sc_wait_daemon);
 
 	for (;;) {
+		set_current_state(TASK_INTERRUPTIBLE);
+
 		if (atomic_read(&wait_req.done)) {
 			err = wait_req.err;
 			break;
