
Commit 2feab24

joshdon authored and Peter Zijlstra committed
Revert "sched/fair: Make sure to try to detach at least one movable task"
This reverts commit b0defa7.

b0defa7 changed the load balancing logic to ignore env.max_loop if all tasks examined to that point were pinned. The goal of the patch was to make it more likely to be able to detach a task buried in a long list of pinned tasks. However, this has the unfortunate side effect of creating an O(n) iteration in detach_tasks(), as we now must fully iterate every task on a cpu if all or most are pinned. Since this load balance code is done with rq lock held, and often in softirq context, it is very easy to trigger hard lockups. We observed such hard lockups with a user who affined O(10k) threads to a single cpu.

When I discussed this with Vincent he initially suggested that we keep the limit on the number of tasks to detach, but increase the number of tasks we can search. However, after some back and forth on the mailing list, he recommended we instead revert the original patch, as it seems likely no one was actually getting hit by the original issue.

Fixes: b0defa7 ("sched/fair: Make sure to try to detach at least one movable task")
Signed-off-by: Josh Don <joshdon@google.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Vincent Guittot <vincent.guittot@linaro.org>
Link: https://lore.kernel.org/r/20240620214450.316280-1-joshdon@google.com
1 parent f266106 commit 2feab24


1 file changed: +3 -9 lines changed


kernel/sched/fair.c

Lines changed: 3 additions & 9 deletions
@@ -9149,12 +9149,8 @@ static int detach_tasks(struct lb_env *env)
 			break;
 
 		env->loop++;
-		/*
-		 * We've more or less seen every task there is, call it quits
-		 * unless we haven't found any movable task yet.
-		 */
-		if (env->loop > env->loop_max &&
-		    !(env->flags & LBF_ALL_PINNED))
+		/* We've more or less seen every task there is, call it quits */
+		if (env->loop > env->loop_max)
 			break;
 
 		/* take a breather every nr_migrate tasks */
@@ -11393,9 +11389,7 @@ static int sched_balance_rq(int this_cpu, struct rq *this_rq,
 
 		if (env.flags & LBF_NEED_BREAK) {
 			env.flags &= ~LBF_NEED_BREAK;
-			/* Stop if we tried all running tasks */
-			if (env.loop < busiest->nr_running)
-				goto more_balance;
+			goto more_balance;
 		}
 
 		/*
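
The cost difference described in the commit message can be illustrated with a small user-space model of the scan bound. This is only a sketch and not kernel code: `tasks_examined()`, the `pinned` array, and the `exempt_all_pinned` switch are hypothetical names introduced here, the `all_pinned` variable merely stands in for `LBF_ALL_PINNED`, and the model stops at the first movable task, whereas the real `detach_tasks()` also tracks imbalance and takes periodic breaks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical, simplified model of the scan bound in detach_tasks().
 * pinned[i] is true when task i cannot be migrated.
 * exempt_all_pinned = true models the reverted patch (b0defa7): the
 * loop_max cap is ignored while every task seen so far was pinned.
 * exempt_all_pinned = false models the behavior after this revert.
 * Returns how many tasks were examined before the scan stopped.
 */
size_t tasks_examined(const bool *pinned, size_t nr_tasks,
		      size_t loop_max, bool exempt_all_pinned)
{
	bool all_pinned = true;	/* stands in for LBF_ALL_PINNED */
	size_t loop = 0;
	size_t examined = 0;

	assert(pinned != NULL || nr_tasks == 0);

	for (size_t i = 0; i < nr_tasks; i++) {
		loop++;
		/* Cap the scan, unless the exemption is active. */
		if (loop > loop_max && !(exempt_all_pinned && all_pinned))
			break;

		examined++;
		if (!pinned[i]) {
			/* Found a movable task; this model stops here. */
			all_pinned = false;
			break;
		}
	}
	return examined;
}
```

With 10,000 pinned tasks and an assumed cap of 32, the model examines 32 tasks when `exempt_all_pinned` is false and all 10,000 when it is true. The second case corresponds to the O(n) walk that, in the kernel, runs with the rq lock held.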
