Commit 8bc3547
workqueue: Fix spurious data race in __flush_work()
When flushing a work item for cancellation, __flush_work() knows that it
exclusively owns the work item through its PENDING bit. 134874e
("workqueue: Allow cancel_work_sync() and disable_work() from atomic
contexts on BH work items") added a read of @work->data to determine
whether to use busy wait for BH work items that are being canceled.
While the read is safe when @from_cancel, @work->data was read before
testing @from_cancel to simplify code structure:

	data = *work_data_bits(work);
	if (from_cancel &&
	    !WARN_ON_ONCE(data & WORK_STRUCT_PWQ) && (data & WORK_OFFQ_BH)) {

While the read data was never used if !@from_cancel, this could trigger
KCSAN data race detection spuriously:

	==================================================================
	BUG: KCSAN: data-race in __flush_work / __flush_work

	write to 0xffff8881223aa3e8 of 8 bytes by task 3998 on cpu 0:
	 instrument_write include/linux/instrumented.h:41 [inline]
	 ___set_bit include/asm-generic/bitops/instrumented-non-atomic.h:28 [inline]
	 insert_wq_barrier kernel/workqueue.c:3790 [inline]
	 start_flush_work kernel/workqueue.c:4142 [inline]
	 __flush_work+0x30b/0x570 kernel/workqueue.c:4178
	 flush_work kernel/workqueue.c:4229 [inline]
	 ...

	read to 0xffff8881223aa3e8 of 8 bytes by task 50 on cpu 1:
	 __flush_work+0x42a/0x570 kernel/workqueue.c:4188
	 flush_work kernel/workqueue.c:4229 [inline]
	 flush_delayed_work+0x66/0x70 kernel/workqueue.c:4251
	 ...

	value changed: 0x0000000000400000 -> 0xffff88810006c00d

Reorganize the code so that @from_cancel is tested before @work->data is
accessed. The only problem is triggering KCSAN detection spuriously.
This shouldn't need READ_ONCE() or other access qualifiers. No
functional changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: syzbot+b3e4f2f51ed645fd5df2@syzkaller.appspotmail.com
Fixes: 134874e ("workqueue: Allow cancel_work_sync() and disable_work() from atomic contexts on BH work items")
Link: http://lkml.kernel.org/r/000000000000ae429e061eea2157@google.com
Cc: Jens Axboe <axboe@kernel.dk>
1 parent 98cc173 commit 8bc3547

File tree

1 file changed (+25, -20 lines)


kernel/workqueue.c

Lines changed: 25 additions & 20 deletions
@@ -4166,7 +4166,6 @@ static bool start_flush_work(struct work_struct *work, struct wq_barrier *barr,
 static bool __flush_work(struct work_struct *work, bool from_cancel)
 {
 	struct wq_barrier barr;
-	unsigned long data;
 
 	if (WARN_ON(!wq_online))
 		return false;
@@ -4184,29 +4183,35 @@ static bool __flush_work(struct work_struct *work, bool from_cancel)
 	 * was queued on a BH workqueue, we also know that it was running in the
 	 * BH context and thus can be busy-waited.
 	 */
-	data = *work_data_bits(work);
-	if (from_cancel &&
-	    !WARN_ON_ONCE(data & WORK_STRUCT_PWQ) && (data & WORK_OFFQ_BH)) {
-		/*
-		 * On RT, prevent a live lock when %current preempted soft
-		 * interrupt processing or prevents ksoftirqd from running by
-		 * keeping flipping BH. If the BH work item runs on a different
-		 * CPU then this has no effect other than doing the BH
-		 * disable/enable dance for nothing. This is copied from
-		 * kernel/softirq.c::tasklet_unlock_spin_wait().
-		 */
-		while (!try_wait_for_completion(&barr.done)) {
-			if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
-				local_bh_disable();
-				local_bh_enable();
-			} else {
-				cpu_relax();
+	if (from_cancel) {
+		unsigned long data = *work_data_bits(work);
+
+		if (!WARN_ON_ONCE(data & WORK_STRUCT_PWQ) &&
+		    (data & WORK_OFFQ_BH)) {
+			/*
+			 * On RT, prevent a live lock when %current preempted
+			 * soft interrupt processing or prevents ksoftirqd from
+			 * running by keeping flipping BH. If the BH work item
+			 * runs on a different CPU then this has no effect other
+			 * than doing the BH disable/enable dance for nothing.
+			 * This is copied from
+			 * kernel/softirq.c::tasklet_unlock_spin_wait().
+			 */
+			while (!try_wait_for_completion(&barr.done)) {
+				if (IS_ENABLED(CONFIG_PREEMPT_RT)) {
+					local_bh_disable();
+					local_bh_enable();
+				} else {
+					cpu_relax();
+				}
 			}
+			goto out_destroy;
 		}
-	} else {
-		wait_for_completion(&barr.done);
 	}
 
+	wait_for_completion(&barr.done);
+
+out_destroy:
 	destroy_work_on_stack(&barr.work);
 	return true;
 }
