Commit a72bbec

Eric DeVolder authored and akpm00 committed
crash: hotplug support for kexec_load()

The hotplug support for kexec_load() requires changes to the userspace kexec-tools and a little extra help from the kernel.

Given a kdump capture kernel loaded via kexec_load(), and a subsequent hotplug event, the crash hotplug handler finds the elfcorehdr and rewrites it to reflect the hotplug change. That is the desired outcome; however, at kernel panic time, the purgatory integrity check fails (because the elfcorehdr changed), the capture kernel does not boot, and no vmcore is generated. Therefore, the userspace kexec-tools/kexec must indicate to the kernel that the elfcorehdr can be modified (because kexec excluded the elfcorehdr from the digest, and sized the elfcorehdr memory buffer appropriately).

To facilitate hotplug support with kexec_load():
 - a new kexec flag KEXEC_UPDATE_ELFCOREHDR indicates that it is safe for the kernel to modify the kexec_load()'d elfcorehdr
 - the /sys/kernel/crash_elfcorehdr_size node communicates the preferred size of the elfcorehdr memory buffer
 - the sysfs crash_hotplug nodes (i.e. /sys/devices/system/[cpu|memory]/crash_hotplug) dynamically take into account kexec_file_load() vs kexec_load() and KEXEC_UPDATE_ELFCOREHDR. This is critical so that the udev rule processing of crash_hotplug is all that is needed to determine whether the userspace unload-then-load of the kdump image is to be skipped.

The proposed udev rule change looks like:

 # The kernel updates the crash elfcorehdr for CPU and memory changes
 SUBSYSTEM=="cpu", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"
 SUBSYSTEM=="memory", ATTRS{crash_hotplug}=="1", GOTO="kdump_reload_end"

The table below indicates the behavior of kexec_load()'d kdump image updates (with the new udev crash_hotplug rule in place):

         | Kexec
  Kernel | Old | New
  -------+-----+-----
  Old    |  a  |  a
  -------+-----+-----
  New    |  a  |  b
  -------+-----+-----

where kexec 'old' and 'new' indicate whether kexec-tools has the needed modifications for the crash hotplug feature, and kernel 'old' and 'new' indicate whether the kernel supports this crash hotplug feature.

Behavior 'a' indicates the unload-then-reload of the entire kdump image. For the kexec 'old' column, the unload-then-reload occurs due to the missing flag KEXEC_UPDATE_ELFCOREHDR. An 'old' kernel (with 'new' kexec) does not present the crash_hotplug sysfs node, which leads to the unload-then-reload of the kdump image.

Behavior 'b' indicates the desired optimized behavior of the kernel directly modifying the elfcorehdr and avoiding the unload-then-reload of the kdump image.

If the udev rule is not updated with the crash_hotplug node check, then regardless of whether the kernel or kexec is new or old, the kdump image continues to be unloaded and then reloaded on hotplug changes.

To fully support the crash hotplug feature, there needs to be a rollout of kernel, kexec-tools and udev rule changes. However, the order of the rollout of these pieces does not matter; kexec_load()'d kdump images still function for hotplug as-is.

Link: https://lkml.kernel.org/r/20230814214446.6659-7-eric.devolder@oracle.com
Signed-off-by: Eric DeVolder <eric.devolder@oracle.com>
Suggested-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Hari Bathini <hbathini@linux.ibm.com>
Acked-by: Baoquan He <bhe@redhat.com>
Cc: Akhil Raj <lf32.dev@gmail.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: Borislav Petkov (AMD) <bp@alien8.de>
Cc: Boris Ostrovsky <boris.ostrovsky@oracle.com>
Cc: Dave Hansen <dave.hansen@linux.intel.com>
Cc: Dave Young <dyoung@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Cc: "H. Peter Anvin" <hpa@zytor.com>
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Jonathan Corbet <corbet@lwn.net>
Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Cc: Mimi Zohar <zohar@linux.ibm.com>
Cc: Naveen N. Rao <naveen.n.rao@linux.vnet.ibm.com>
Cc: Oscar Salvador <osalvador@suse.de>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: Sean Christopherson <seanjc@google.com>
Cc: Sourabh Jain <sourabhjain@linux.ibm.com>
Cc: Takashi Iwai <tiwai@suse.de>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Thomas Weißschuh <linux@weissschuh.net>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

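The behavior table in the commit message can be read as a small decision function. The following is a minimal sketch, not code from the patch: the enum and function names are hypothetical, and it assumes (as the message describes) that the crash_hotplug node reads "1" for a kexec_load()'d image only when the kernel has the feature and kexec-tools passed KEXEC_UPDATE_ELFCOREHDR.

```c
#include <stdbool.h>

/* Hypothetical names, used only to illustrate the commit message's table. */
enum kdump_hotplug_behavior {
	KDUMP_RELOAD_IMAGE,	/* behavior 'a': userspace unload-then-reload */
	KDUMP_KERNEL_UPDATES,	/* behavior 'b': kernel rewrites elfcorehdr */
};

/*
 * Model of what happens to a kexec_load()'d kdump image on a hotplug event.
 * kernel_new:        kernel supports the crash hotplug feature
 * kexec_new:         kexec-tools passes KEXEC_UPDATE_ELFCOREHDR
 * udev_rule_updated: the udev rule checks ATTRS{crash_hotplug}=="1"
 */
static enum kdump_hotplug_behavior
kdump_hotplug_behavior(bool kernel_new, bool kexec_new, bool udev_rule_updated)
{
	/* An old kernel has no crash_hotplug node; an old kexec sets no flag. */
	bool crash_hotplug_reads_one = kernel_new && kexec_new;

	/* Without the updated rule, udev never consults crash_hotplug. */
	if (udev_rule_updated && crash_hotplug_reads_one)
		return KDUMP_KERNEL_UPDATES;

	return KDUMP_RELOAD_IMAGE;
}
```

Only the combination of new kernel, new kexec and updated udev rule yields behavior 'b'; every other combination falls back to 'a', which is why the rollout order of the three pieces does not matter.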
1 parent ea53ad9 commit a72bbec

8 files changed: +102 -6 lines changed

arch/x86/include/asm/kexec.h

Lines changed: 7 additions & 4 deletions
@@ -214,14 +214,17 @@ void arch_crash_handle_hotplug_event(struct kimage *image);
 #define arch_crash_handle_hotplug_event arch_crash_handle_hotplug_event
 
 #ifdef CONFIG_HOTPLUG_CPU
-static inline int crash_hotplug_cpu_support(void) { return 1; }
-#define crash_hotplug_cpu_support crash_hotplug_cpu_support
+int arch_crash_hotplug_cpu_support(void);
+#define crash_hotplug_cpu_support arch_crash_hotplug_cpu_support
 #endif
 
 #ifdef CONFIG_MEMORY_HOTPLUG
-static inline int crash_hotplug_memory_support(void) { return 1; }
-#define crash_hotplug_memory_support crash_hotplug_memory_support
+int arch_crash_hotplug_memory_support(void);
+#define crash_hotplug_memory_support arch_crash_hotplug_memory_support
 #endif
+
+unsigned int arch_crash_get_elfcorehdr_size(void);
+#define crash_get_elfcorehdr_size arch_crash_get_elfcorehdr_size
 #endif
 
 #endif /* __ASSEMBLY__ */

arch/x86/kernel/crash.c

Lines changed: 27 additions & 0 deletions
@@ -429,6 +429,33 @@ int crash_load_segments(struct kimage *image)
 #undef pr_fmt
 #define pr_fmt(fmt) "crash hp: " fmt
 
+/* These functions provide the value for the sysfs crash_hotplug nodes */
+#ifdef CONFIG_HOTPLUG_CPU
+int arch_crash_hotplug_cpu_support(void)
+{
+	return crash_check_update_elfcorehdr();
+}
+#endif
+
+#ifdef CONFIG_MEMORY_HOTPLUG
+int arch_crash_hotplug_memory_support(void)
+{
+	return crash_check_update_elfcorehdr();
+}
+#endif
+
+unsigned int arch_crash_get_elfcorehdr_size(void)
+{
+	unsigned int sz;
+
+	/* kernel_map, VMCOREINFO and maximum CPUs */
+	sz = 2 + CONFIG_NR_CPUS_DEFAULT;
+	if (IS_ENABLED(CONFIG_MEMORY_HOTPLUG))
+		sz += CONFIG_CRASH_MAX_MEMORY_RANGES;
+	sz *= sizeof(Elf64_Phdr);
+	return sz;
+}
+
 /**
  * arch_crash_handle_hotplug_event() - Handle hotplug elfcorehdr changes
  * @image: a pointer to kexec_crash_image
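
To get a feel for the value that arch_crash_get_elfcorehdr_size() reports, the arithmetic in the hunk above can be reproduced in userspace. This is a sketch under stated assumptions: `struct phdr64` is a local mirror of the 56-byte Elf64_Phdr layout, `elfcorehdr_size()` is a hypothetical helper, and the example values 64 and 8192 are illustrative stand-ins for CONFIG_NR_CPUS_DEFAULT and CONFIG_CRASH_MAX_MEMORY_RANGES, not values taken from this commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirror of the Elf64_Phdr layout, so the sketch needs no ELF header. */
struct phdr64 {
	uint32_t p_type;
	uint32_t p_flags;
	uint64_t p_offset;
	uint64_t p_vaddr;
	uint64_t p_paddr;
	uint64_t p_filesz;
	uint64_t p_memsz;
	uint64_t p_align;
};

static_assert(sizeof(struct phdr64) == 56, "Elf64_Phdr is 56 bytes");

/*
 * Hypothetical helper mirroring the kernel arithmetic:
 * (kernel_map + VMCOREINFO + CPUs [+ memory ranges]) program headers.
 */
static size_t elfcorehdr_size(unsigned int nr_cpus,
			      unsigned int max_mem_ranges,
			      int memory_hotplug)
{
	size_t nr_phdrs = 2 + (size_t)nr_cpus;

	if (memory_hotplug)
		nr_phdrs += max_mem_ranges;

	return nr_phdrs * sizeof(struct phdr64);
}
```

With the assumed values, `elfcorehdr_size(64, 8192, 1)` is (2 + 64 + 8192) * 56 = 462448 bytes, which stays under the 1MiB bound mentioned in the CRASH_MAX_MEMORY_RANGES help text; without memory hotplug it drops to 66 * 56 = 3696 bytes.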

include/linux/kexec.h

Lines changed: 12 additions & 2 deletions
@@ -320,6 +320,10 @@ struct kimage {
 	unsigned int preserve_context : 1;
 	/* If set, we are using file mode kexec syscall */
 	unsigned int file_mode:1;
+#ifdef CONFIG_CRASH_HOTPLUG
+	/* If set, allow changes to elfcorehdr of kexec_load'd image */
+	unsigned int update_elfcorehdr:1;
+#endif
 
 #ifdef ARCH_HAS_KIMAGE_ARCH
 	struct kimage_arch arch;
@@ -396,9 +400,9 @@ bool kexec_load_permitted(int kexec_image_type);
 
 /* List of defined/legal kexec flags */
 #ifndef CONFIG_KEXEC_JUMP
-#define KEXEC_FLAGS KEXEC_ON_CRASH
+#define KEXEC_FLAGS (KEXEC_ON_CRASH | KEXEC_UPDATE_ELFCOREHDR)
 #else
-#define KEXEC_FLAGS (KEXEC_ON_CRASH | KEXEC_PRESERVE_CONTEXT)
+#define KEXEC_FLAGS (KEXEC_ON_CRASH | KEXEC_PRESERVE_CONTEXT | KEXEC_UPDATE_ELFCOREHDR)
 #endif
 
 /* List of defined/legal kexec file flags */
@@ -486,6 +490,8 @@ static inline void arch_kexec_pre_free_pages(void *vaddr, unsigned int pages) {
 static inline void arch_crash_handle_hotplug_event(struct kimage *image) { }
 #endif
 
+int crash_check_update_elfcorehdr(void);
+
 #ifndef crash_hotplug_cpu_support
 static inline int crash_hotplug_cpu_support(void) { return 0; }
 #endif
@@ -494,6 +500,10 @@ static inline int crash_hotplug_cpu_support(void) { return 0; }
 static inline int crash_hotplug_memory_support(void) { return 0; }
 #endif
 
+#ifndef crash_get_elfcorehdr_size
+static inline unsigned int crash_get_elfcorehdr_size(void) { return 0; }
+#endif
+
 #else /* !CONFIG_KEXEC_CORE */
 struct pt_regs;
 struct task_struct;
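
The KEXEC_FLAGS change above widens the set of flag bits that kexec_load() treats as legal. The sketch below illustrates why that matters. It assumes the syscall rejects any non-architecture flag bit outside KEXEC_FLAGS; the flag values are copied from the uapi header in this commit, while `kexec_flags_legal()` and the old/new masks passed to it are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Values as defined in include/uapi/linux/kexec.h in this commit. */
#define KEXEC_ON_CRASH		0x00000001UL
#define KEXEC_PRESERVE_CONTEXT	0x00000002UL
#define KEXEC_UPDATE_ELFCOREHDR	0x00000004UL
#define KEXEC_ARCH_MASK		0xffff0000UL

static_assert((KEXEC_UPDATE_ELFCOREHDR &
	       (KEXEC_ON_CRASH | KEXEC_PRESERVE_CONTEXT)) == 0,
	      "the new flag must not overlap existing flags");

/*
 * Hypothetical helper: flags are legal when every bit outside the
 * architecture field is present in legal_mask (the kernel's KEXEC_FLAGS).
 */
static bool kexec_flags_legal(unsigned long flags, unsigned long legal_mask)
{
	return (flags & ~KEXEC_ARCH_MASK & ~legal_mask) == 0;
}
```

Under that assumption, `KEXEC_ON_CRASH | KEXEC_UPDATE_ELFCOREHDR` passes against the new mask but fails against the pre-patch mask `KEXEC_ON_CRASH | KEXEC_PRESERVE_CONTEXT`, so a kexec-tools build that sets the new flag would likely want to confirm kernel support first.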

include/uapi/linux/kexec.h

Lines changed: 1 addition & 0 deletions
@@ -12,6 +12,7 @@
 /* kexec flags for different usage scenarios */
 #define KEXEC_ON_CRASH 0x00000001
 #define KEXEC_PRESERVE_CONTEXT 0x00000002
+#define KEXEC_UPDATE_ELFCOREHDR 0x00000004
 #define KEXEC_ARCH_MASK 0xffff0000
 
 /*

kernel/Kconfig.kexec

Lines changed: 4 additions & 0 deletions
@@ -143,4 +143,8 @@ config CRASH_MAX_MEMORY_RANGES
 	  memory buffer/segment size under 1MiB. This represents a sane choice
 	  to accommodate both baremetal and virtual machine configurations.
 
+	  For the kexec_load() syscall path, CRASH_MAX_MEMORY_RANGES is part of
+	  the computation behind the value provided through the
+	  /sys/kernel/crash_elfcorehdr_size attribute.
+
 endmenu

kernel/crash_core.c

Lines changed: 31 additions & 0 deletions
@@ -740,6 +740,33 @@ subsys_initcall(crash_notes_memory_init);
 #ifdef CONFIG_CRASH_HOTPLUG
 #undef pr_fmt
 #define pr_fmt(fmt) "crash hp: " fmt
+
+/*
+ * This routine utilized when the crash_hotplug sysfs node is read.
+ * It reflects the kernel's ability/permission to update the crash
+ * elfcorehdr directly.
+ */
+int crash_check_update_elfcorehdr(void)
+{
+	int rc = 0;
+
+	/* Obtain lock while reading crash information */
+	if (!kexec_trylock()) {
+		pr_info("kexec_trylock() failed, elfcorehdr may be inaccurate\n");
+		return 0;
+	}
+	if (kexec_crash_image) {
+		if (kexec_crash_image->file_mode)
+			rc = 1;
+		else
+			rc = kexec_crash_image->update_elfcorehdr;
+	}
+	/* Release lock now that update complete */
+	kexec_unlock();
+
+	return rc;
+}
+
 /*
  * To accurately reflect hot un/plug changes of cpu and memory resources
  * (including onling and offlining of those resources), the elfcorehdr
@@ -770,6 +797,10 @@ static void crash_handle_hotplug_event(unsigned int hp_action, unsigned int cpu)
 
 	image = kexec_crash_image;
 
+	/* Check that updating elfcorehdr is permitted */
+	if (!(image->file_mode || image->update_elfcorehdr))
+		goto out;
+
 	if (hp_action == KEXEC_CRASH_HP_ADD_CPU ||
 		hp_action == KEXEC_CRASH_HP_REMOVE_CPU)
 		pr_debug("hp_action %u, cpu %u\n", hp_action, cpu);

kernel/kexec.c

Lines changed: 5 additions & 0 deletions
@@ -129,6 +129,11 @@ static int do_kexec_load(unsigned long entry, unsigned long nr_segments,
 	if (flags & KEXEC_PRESERVE_CONTEXT)
 		image->preserve_context = 1;
 
+#ifdef CONFIG_CRASH_HOTPLUG
+	if (flags & KEXEC_UPDATE_ELFCOREHDR)
+		image->update_elfcorehdr = 1;
+#endif
+
 	ret = machine_kexec_prepare(image);
 	if (ret)
 		goto out;

kernel/ksysfs.c

Lines changed: 15 additions & 0 deletions
@@ -165,6 +165,18 @@ static ssize_t vmcoreinfo_show(struct kobject *kobj,
 }
 KERNEL_ATTR_RO(vmcoreinfo);
 
+#ifdef CONFIG_CRASH_HOTPLUG
+static ssize_t crash_elfcorehdr_size_show(struct kobject *kobj,
+			       struct kobj_attribute *attr, char *buf)
+{
+	unsigned int sz = crash_get_elfcorehdr_size();
+
+	return sysfs_emit(buf, "%u\n", sz);
+}
+KERNEL_ATTR_RO(crash_elfcorehdr_size);
+
+#endif
+
 #endif /* CONFIG_CRASH_CORE */
 
 /* whether file capabilities are enabled */
@@ -255,6 +267,9 @@ static struct attribute * kernel_attrs[] = {
 #endif
 #ifdef CONFIG_CRASH_CORE
 	&vmcoreinfo_attr.attr,
+#ifdef CONFIG_CRASH_HOTPLUG
+	&crash_elfcorehdr_size_attr.attr,
+#endif
 #endif
 #ifndef CONFIG_TINY_RCU
 	&rcu_expedited_attr.attr,
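
On the userspace side, the new attribute is a single unsigned decimal followed by a newline (per the `sysfs_emit(buf, "%u\n", sz)` call above). A minimal sketch of how a tool might read it follows; `read_uint_attr()` is a hypothetical helper, not part of kexec-tools, and treating a missing file as "feature not available" is an assumption about how a caller would want to behave on older kernels.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Hypothetical helper: read one unsigned decimal value from a sysfs-style
 * attribute file. Returns 0 on success, -1 if the file cannot be opened
 * (for example on a kernel without the attribute) or cannot be parsed.
 */
static int read_uint_attr(const char *path, unsigned int *out)
{
	FILE *f;
	int parsed;

	assert(path && out);

	f = fopen(path, "r");
	if (!f)
		return -1;

	parsed = (fscanf(f, "%u", out) == 1);
	fclose(f);

	return parsed ? 0 : -1;
}
```

A caller would pass "/sys/kernel/crash_elfcorehdr_size" and, on success, size the elfcorehdr buffer for kexec_load() to at least the returned number of bytes; on failure it can fall back to its existing sizing logic.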
