
Commit 7da7107

Merge tag 'pm-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm
Pull power management updates from Rafael Wysocki:

"These add support for new processors (Sierra Forest, Grand Ridge and Meteor Lake) to the intel_idle driver, make intel_pstate run on Emerald Rapids without HWP support and adjust it to utilize EPP values supplied by the platform firmware, fix issues, clean up code and improve documentation.

The most significant fix addresses deadlocks in the core system-wide resume code that occur if async_schedule_dev() attempts to run its argument function synchronously (for example, due to a memory allocation failure). It rearranges the code in question which may increase the system resume time in some cases, but this basically is a removal of a premature optimization. That optimization will be added back later, but properly this time.

Specifics:

 - Add support for the Sierra Forest, Grand Ridge and Meteorlake SoCs to the intel_idle cpuidle driver (Artem Bityutskiy, Zhang Rui)

 - Do not enable interrupts when entering idle in the haltpoll cpuidle driver (Borislav Petkov)

 - Add Emerald Rapids support in no-HWP mode to the intel_pstate cpufreq driver (Zhenguo Yao)

 - Use EPP values programmed by the platform firmware as balanced performance ones by default in intel_pstate (Srinivas Pandruvada)

 - Add a missing function return value check to the SCMI cpufreq driver to avoid unexpected behavior (Alexandra Diupina)

 - Fix parameter type warning in the armada-8k cpufreq driver (Gregory CLEMENT)

 - Rework trans_stat_show() in the devfreq core code to avoid buffer overflows (Christian Marangi)

 - Synchronize devfreq_monitor_[start/stop] so as to prevent a timer list corruption from occurring when devfreq governors are switched frequently (Mukesh Ojha)

 - Fix possible deadlocks in the core system-wide PM code that occur if device-handling functions cannot be executed asynchronously during resume from system-wide suspend (Rafael J. Wysocki)

 - Clean up unnecessary local variable initializations in multiple places in the hibernation code (Wang chaodong, Li zeming)

 - Adjust core hibernation code to avoid missing wakeup events that occur after saving an image to persistent storage (Chris Feng)

 - Update hibernation code to enforce correct ordering during image compression and decompression (Hongchen Zhang)

 - Use kmap_local_page() instead of kmap_atomic() in copy_data_page() during hibernation and restore (Chen Haonan)

 - Adjust documentation and code comments to reflect recent tasks freezer changes (Kevin Hao)

 - Repair excess function parameter description warning in the hibernation image-saving code (Randy Dunlap)

 - Fix _set_required_opps when opp is NULL (Bryan O'Donoghue)

 - Use device_get_match_data() in the OPP code for TI (Rob Herring)

 - Clean up OPP level and other parts and call dev_pm_opp_set_opp() recursively for required OPPs (Viresh Kumar)"

* tag 'pm-6.8-rc1' of git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm: (35 commits)
  OPP: Rename 'rate_clk_single'
  OPP: Pass rounded rate to _set_opp()
  OPP: Relocate dev_pm_opp_sync_regulators()
  PM: sleep: Fix possible deadlocks in core system-wide PM code
  OPP: Move dev_pm_opp_icc_bw to internal opp.h
  async: Introduce async_schedule_dev_nocall()
  async: Split async_schedule_node_domain()
  cpuidle: haltpoll: Do not enable interrupts when entering idle
  OPP: Fix _set_required_opps when opp is NULL
  OPP: The level field is always of unsigned int type
  PM: hibernate: Repair excess function parameter description warning
  PM: sleep: Remove obsolete comment from unlock_system_sleep()
  cpufreq: intel_pstate: Add Emerald Rapids support in no-HWP mode
  Documentation: PM: Adjust freezing-of-tasks.rst to the freezer changes
  PM: hibernate: Use kmap_local_page() in copy_data_page()
  intel_idle: add Sierra Forest SoC support
  intel_idle: add Grand Ridge SoC support
  PM / devfreq: Synchronize devfreq_monitor_[start/stop]
  cpufreq: armada-8k: Fix parameter type warning
  PM: hibernate: Enforce ordering during image compression/decompression
  ...
2 parents 7f73ba6 + f1e5e46 commit 7da7107

21 files changed: +658 -395 lines changed


Documentation/ABI/testing/sysfs-class-devfreq

Lines changed: 3 additions & 0 deletions
@@ -52,6 +52,9 @@ Description:
 
 echo 0 > /sys/class/devfreq/.../trans_stat
 
+If the transition table is bigger than PAGE_SIZE, reading
+this will return an -EFBIG error.
+
 What: /sys/class/devfreq/.../available_frequencies
 Date: October 2012
 Contact: Nishanth Menon <nm@ti.com>
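
A userspace reader of trans_stat may want to tell the documented -EFBIG case apart from other failures. The following is a minimal C sketch under stated assumptions: `read_sysfs_attr()` and `trans_stat_too_big()` are hypothetical helper names (not kernel or libc APIs), and the caller supplies the full attribute path, because the device directory is elided ("...") in the ABI text.

```c
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read a sysfs attribute into buf as a NUL-terminated
 * string with a single read() call. Returns the number of bytes read, or a
 * negative errno value on failure.
 */
long read_sysfs_attr(const char *path, char *buf, size_t len)
{
	ssize_t n;
	int fd, saved_errno;

	if (len == 0)
		return -EINVAL;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -errno;

	n = read(fd, buf, len - 1);	/* leave room for the terminator */
	saved_errno = errno;		/* close() may overwrite errno */
	close(fd);

	if (n < 0)
		return -saved_errno;

	buf[n] = '\0';
	return (long)n;
}

/*
 * Hypothetical helper: nonzero if the read failed with EFBIG, which the ABI
 * text above associates with a transition table bigger than PAGE_SIZE.
 */
int trans_stat_too_big(long ret)
{
	return ret == -EFBIG;
}
```

A caller would typically pass a buffer of at least one page and check `trans_stat_too_big(ret)` before treating a negative return value as a generic I/O error.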

Documentation/power/freezing-of-tasks.rst

Lines changed: 48 additions & 37 deletions
@@ -14,27 +14,28 @@ architectures).
 II. How does it work?
 =====================
 
-There are three per-task flags used for that, PF_NOFREEZE, PF_FROZEN
-and PF_FREEZER_SKIP (the last one is auxiliary). The tasks that have
-PF_NOFREEZE unset (all user space processes and some kernel threads) are
-regarded as 'freezable' and treated in a special way before the system enters a
-suspend state as well as before a hibernation image is created (in what follows
-we only consider hibernation, but the description also applies to suspend).
+There is one per-task flag (PF_NOFREEZE) and three per-task states
+(TASK_FROZEN, TASK_FREEZABLE and __TASK_FREEZABLE_UNSAFE) used for that.
+The tasks that have PF_NOFREEZE unset (all user space tasks and some kernel
+threads) are regarded as 'freezable' and treated in a special way before the
+system enters a sleep state as well as before a hibernation image is created
+(hibernation is directly covered by what follows, but the description applies
+to system-wide suspend too).
 
 Namely, as the first step of the hibernation procedure the function
 freeze_processes() (defined in kernel/power/process.c) is called. A system-wide
-variable system_freezing_cnt (as opposed to a per-task flag) is used to indicate
-whether the system is to undergo a freezing operation. And freeze_processes()
-sets this variable. After this, it executes try_to_freeze_tasks() that sends a
-fake signal to all user space processes, and wakes up all the kernel threads.
-All freezable tasks must react to that by calling try_to_freeze(), which
-results in a call to __refrigerator() (defined in kernel/freezer.c), which sets
-the task's PF_FROZEN flag, changes its state to TASK_UNINTERRUPTIBLE and makes
-it loop until PF_FROZEN is cleared for it. Then, we say that the task is
-'frozen' and therefore the set of functions handling this mechanism is referred
-to as 'the freezer' (these functions are defined in kernel/power/process.c,
-kernel/freezer.c & include/linux/freezer.h). User space processes are generally
-frozen before kernel threads.
+static key freezer_active (as opposed to a per-task flag or state) is used to
+indicate whether the system is to undergo a freezing operation. And
+freeze_processes() sets this static key. After this, it executes
+try_to_freeze_tasks() that sends a fake signal to all user space processes, and
+wakes up all the kernel threads. All freezable tasks must react to that by
+calling try_to_freeze(), which results in a call to __refrigerator() (defined
+in kernel/freezer.c), which changes the task's state to TASK_FROZEN, and makes
+it loop until it is woken by an explicit TASK_FROZEN wakeup. Then, that task
+is regarded as 'frozen' and so the set of functions handling this mechanism is
+referred to as 'the freezer' (these functions are defined in
+kernel/power/process.c, kernel/freezer.c & include/linux/freezer.h). User space
+tasks are generally frozen before kernel threads.
 
 __refrigerator() must not be called directly. Instead, use the
 try_to_freeze() function (defined in include/linux/freezer.h), that checks
@@ -43,31 +44,40 @@ if the task is to be frozen and makes the task enter __refrigerator().
 For user space processes try_to_freeze() is called automatically from the
 signal-handling code, but the freezable kernel threads need to call it
 explicitly in suitable places or use the wait_event_freezable() or
-wait_event_freezable_timeout() macros (defined in include/linux/freezer.h)
-that combine interruptible sleep with checking if the task is to be frozen and
-calling try_to_freeze(). The main loop of a freezable kernel thread may look
+wait_event_freezable_timeout() macros (defined in include/linux/wait.h)
+that put the task to sleep (TASK_INTERRUPTIBLE) or freeze it (TASK_FROZEN) if
+freezer_active is set. The main loop of a freezable kernel thread may look
 like the following one::
 
 	set_freezable();
-	do {
-		hub_events();
-		wait_event_freezable(khubd_wait,
-				!list_empty(&hub_event_list) ||
-				kthread_should_stop());
-	} while (!kthread_should_stop() || !list_empty(&hub_event_list));
-
-(from drivers/usb/core/hub.c::hub_thread()).
-
-If a freezable kernel thread fails to call try_to_freeze() after the freezer has
-initiated a freezing operation, the freezing of tasks will fail and the entire
-hibernation operation will be cancelled. For this reason, freezable kernel
-threads must call try_to_freeze() somewhere or use one of the
+
+	while (true) {
+		struct task_struct *tsk = NULL;
+
+		wait_event_freezable(oom_reaper_wait, oom_reaper_list != NULL);
+		spin_lock_irq(&oom_reaper_lock);
+		if (oom_reaper_list != NULL) {
+			tsk = oom_reaper_list;
+			oom_reaper_list = tsk->oom_reaper_list;
+		}
+		spin_unlock_irq(&oom_reaper_lock);
+
+		if (tsk)
+			oom_reap_task(tsk);
+	}
+
+(from mm/oom_kill.c::oom_reaper()).
+
+If a freezable kernel thread is not put to the frozen state after the freezer
+has initiated a freezing operation, the freezing of tasks will fail and the
+entire system-wide transition will be cancelled. For this reason, freezable
+kernel threads must call try_to_freeze() somewhere or use one of the
 wait_event_freezable() and wait_event_freezable_timeout() macros.
 
 After the system memory state has been restored from a hibernation image and
 devices have been reinitialized, the function thaw_processes() is called in
-order to clear the PF_FROZEN flag for each frozen task. Then, the tasks that
-have been frozen leave __refrigerator() and continue running.
+order to wake up each frozen task. Then, the tasks that have been frozen leave
+__refrigerator() and continue running.
 
 
 Rationale behind the functions dealing with freezing and thawing of tasks
@@ -96,7 +106,8 @@ III. Which kernel threads are freezable?
 Kernel threads are not freezable by default. However, a kernel thread may clear
 PF_NOFREEZE for itself by calling set_freezable() (the resetting of PF_NOFREEZE
 directly is not allowed). From this point it is regarded as freezable
-and must call try_to_freeze() in a suitable place.
+and must call try_to_freeze() or variants of wait_event_freezable() in a
+suitable place.
 
 IV. Why do we do that?
 ======================
