Skip to content

Commit 3cb97a9

Browse files
Chen Ridonghtejun
authored andcommitted
cgroup/cpuset: remove kernfs active break
A warning was found: WARNING: CPU: 10 PID: 3486953 at fs/kernfs/file.c:828 CPU: 10 PID: 3486953 Comm: rmdir Kdump: loaded Tainted: G RIP: 0010:kernfs_should_drain_open_files+0x1a1/0x1b0 RSP: 0018:ffff8881107ef9e0 EFLAGS: 00010202 RAX: 0000000080000002 RBX: ffff888154738c00 RCX: dffffc0000000000 RDX: 0000000000000007 RSI: 0000000000000004 RDI: ffff888154738c04 RBP: ffff888154738c04 R08: ffffffffaf27fa15 R09: ffffed102a8e7180 R10: ffff888154738c07 R11: 0000000000000000 R12: ffff888154738c08 R13: ffff888750f8c000 R14: ffff888750f8c0e8 R15: ffff888154738ca0 FS: 00007f84cd0be740(0000) GS:ffff8887ddc00000(0000) knlGS:0000000000000000 CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 CR2: 0000555f9fbe00c8 CR3: 0000000153eec001 CR4: 0000000000370ee0 DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400 Call Trace: kernfs_drain+0x15e/0x2f0 __kernfs_remove+0x165/0x300 kernfs_remove_by_name_ns+0x7b/0xc0 cgroup_rm_file+0x154/0x1c0 cgroup_addrm_files+0x1c2/0x1f0 css_clear_dir+0x77/0x110 kill_css+0x4c/0x1b0 cgroup_destroy_locked+0x194/0x380 cgroup_rmdir+0x2a/0x140 It can be explained by: rmdir echo 1 > cpuset.cpus kernfs_fop_write_iter // active=0 cgroup_rm_file kernfs_remove_by_name_ns kernfs_get_active // active=1 __kernfs_remove // active=0x80000002 kernfs_drain cpuset_write_resmask wait_event //waiting (active == 0x80000001) kernfs_break_active_protection // active = 0x80000001 // continue kernfs_unbreak_active_protection // active = 0x80000002 ... kernfs_should_drain_open_files // warning occurs kernfs_put_active This warning is caused by 'kernfs_break_active_protection' when it is writing to cpuset.cpus, and the cgroup is removed concurrently. The commit 3a5a6d0 ("cpuset: don't nest cgroup_mutex inside get_online_cpus()") made cpuset_hotplug_workfn asynchronous, This change involves calling flush_work(), which can create a multiple processes circular locking dependency that involve cgroup_mutex, potentially leading to a deadlock. To avoid deadlock. the commit 76bb5ab ("cpuset: break kernfs active protection in cpuset_write_resmask()") added 'kernfs_break_active_protection' in the cpuset_write_resmask. This could lead to this warning. After the commit 2125c00 ("cgroup/cpuset: Make cpuset hotplug processing synchronous"), the cpuset_write_resmask no longer needs to wait the hotplug to finish, which means that concurrent hotplug and cpuset operations are no longer possible. Therefore, the deadlock doesn't exist anymore and it does not have to 'break active protection' now. To fix this warning, just remove kernfs_break_active_protection operation in the 'cpuset_write_resmask'. Fixes: bdb2fd7 ("kernfs: Skip kernfs_drain_open_files() more aggressively") Fixes: 76bb5ab ("cpuset: break kernfs active protection in cpuset_write_resmask()") Reported-by: Ji Fa <jifa@huawei.com> Signed-off-by: Chen Ridong <chenridong@huawei.com> Acked-by: Waiman Long <longman@redhat.com> Signed-off-by: Tejun Heo <tj@kernel.org>
1 parent 9b496a8 commit 3cb97a9

File tree

1 file changed

+0
-25
lines changed

1 file changed

+0
-25
lines changed

kernel/cgroup/cpuset.c

Lines changed: 0 additions & 25 deletions
Original file line numberDiff line numberDiff line change
@@ -3124,29 +3124,6 @@ ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
31243124
int retval = -ENODEV;
31253125

31263126
buf = strstrip(buf);
3127-
3128-
/*
3129-
* CPU or memory hotunplug may leave @cs w/o any execution
3130-
* resources, in which case the hotplug code asynchronously updates
3131-
* configuration and transfers all tasks to the nearest ancestor
3132-
* which can execute.
3133-
*
3134-
* As writes to "cpus" or "mems" may restore @cs's execution
3135-
* resources, wait for the previously scheduled operations before
3136-
* proceeding, so that we don't end up keep removing tasks added
3137-
* after execution capability is restored.
3138-
*
3139-
* cpuset_handle_hotplug may call back into cgroup core asynchronously
3140-
* via cgroup_transfer_tasks() and waiting for it from a cgroupfs
3141-
* operation like this one can lead to a deadlock through kernfs
3142-
* active_ref protection. Let's break the protection. Losing the
3143-
* protection is okay as we check whether @cs is online after
3144-
* grabbing cpuset_mutex anyway. This only happens on the legacy
3145-
* hierarchies.
3146-
*/
3147-
css_get(&cs->css);
3148-
kernfs_break_active_protection(of->kn);
3149-
31503127
cpus_read_lock();
31513128
mutex_lock(&cpuset_mutex);
31523129
if (!is_cpuset_online(cs))
@@ -3179,8 +3156,6 @@ ssize_t cpuset_write_resmask(struct kernfs_open_file *of,
31793156
out_unlock:
31803157
mutex_unlock(&cpuset_mutex);
31813158
cpus_read_unlock();
3182-
kernfs_unbreak_active_protection(of->kn);
3183-
css_put(&cs->css);
31843159
flush_workqueue(cpuset_migrate_mm_wq);
31853160
return retval ?: nbytes;
31863161
}

0 commit comments

Comments
 (0)