Add support for PVH direct boot API #3155

Merged
merged 7 commits into from
Aug 14, 2023
5 changes: 5 additions & 0 deletions CHANGELOG.md
@@ -4,6 +4,11 @@

### Added

- Added support for PVH boot mode. This is used when an x86 kernel provides
the appropriate ELF Note to indicate that PVH boot mode is supported.
Linux kernels compiled with CONFIG_XEN_PVH=y set this ELF Note, as do
FreeBSD kernels.

### Changed

- Updated deserialization of `bitmap` for custom CPU templates to allow usage
15 changes: 15 additions & 0 deletions docs/pvh.md
@@ -0,0 +1,15 @@
# PVH boot mode

Firecracker supports booting x86 kernels in "PVH direct boot" mode
[as specified by the Xen project](https://github.com/xen-project/xen/blob/master/docs/misc/pvh.pandoc).
If the provided kernel contains the `XEN_ELFNOTE_PHYS32_ENTRY` ELF Note,
Firecracker uses this boot mode. It was designed for virtualized
environments that load the kernel directly, and is simpler than the "Linux
boot" mode, which is designed to be launched from a legacy boot loader.

PVH boot mode can be enabled for Linux by setting CONFIG_XEN_PVH=y in the
kernel configuration. (This is not the default setting.)

PVH boot mode is enabled by default in FreeBSD, which has support for
Firecracker starting with FreeBSD 14.0. Instructions on building a FreeBSD
kernel and root filesystem are available [here](rootfs-and-kernel-setup.md).
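To make the detection rule above concrete, here is a minimal sketch of scanning the raw bytes of an ELF note segment for the PVH entry note. It assumes little-endian x86 notes, the owner name `Xen`, and note type 18 for `XEN_ELFNOTE_PHYS32_ENTRY` (as defined in Xen's public `elfnote.h`). The helper names `find_pvh_entry` and `build_note` are hypothetical, and this is not how Firecracker or its kernel loader actually implements the check.

```rust
/// Note type of XEN_ELFNOTE_PHYS32_ENTRY in Xen's public elfnote.h.
const XEN_ELFNOTE_PHYS32_ENTRY: u32 = 18;

/// Decode a little-endian u32 from the first four bytes of `b`.
fn le_u32(b: &[u8]) -> u32 {
    u32::from_le_bytes([b[0], b[1], b[2], b[3]])
}

/// Round `n` up to a multiple of 4; note names and descriptors are 4-byte padded.
fn align4(n: usize) -> Option<usize> {
    Some(n.checked_add(3)? & !3)
}

/// Hypothetical helper: walk a note segment and return the low 32 bits of the
/// PVH entry descriptor, or None if no matching (or well-formed) note exists.
fn find_pvh_entry(notes: &[u8]) -> Option<u32> {
    let mut off = 0usize;
    // Each note starts with a 12-byte header: namesz, descsz, type.
    while let Some(hdr) = off.checked_add(12).and_then(|end| notes.get(off..end)) {
        let namesz = le_u32(&hdr[0..4]) as usize;
        let descsz = le_u32(&hdr[4..8]) as usize;
        let ntype = le_u32(&hdr[8..12]);

        let name_off = off + 12;
        let desc_off = name_off.checked_add(align4(namesz)?)?;
        let next = desc_off.checked_add(align4(descsz)?)?;

        let name = notes.get(name_off..name_off.checked_add(namesz)?)?;
        if name == b"Xen\0" && ntype == XEN_ELFNOTE_PHYS32_ENTRY && descsz >= 4 {
            return notes.get(desc_off..desc_off.checked_add(4)?).map(le_u32);
        }
        off = next;
    }
    None
}

/// Hypothetical helper used only to fabricate input for the demo below.
fn build_note(name: &[u8], ntype: u32, desc: &[u8]) -> Vec<u8> {
    let mut out = Vec::new();
    out.extend_from_slice(&(name.len() as u32).to_le_bytes());
    out.extend_from_slice(&(desc.len() as u32).to_le_bytes());
    out.extend_from_slice(&ntype.to_le_bytes());
    out.extend_from_slice(name);
    out.resize((out.len() + 3) & !3, 0); // pad name to 4 bytes
    out.extend_from_slice(desc);
    out.resize((out.len() + 3) & !3, 0); // pad descriptor to 4 bytes
    out
}

fn main() {
    let note = build_note(b"Xen\0", XEN_ELFNOTE_PHYS32_ENTRY, &0x0100_0000u32.to_le_bytes());
    match find_pvh_entry(&note) {
        Some(entry) => println!("PVH entry point: {:#x}", entry),
        None => println!("no PVH note, fall back to Linux boot"),
    }
}
```

A kernel without the note simply yields `None`, which corresponds to falling back to the "Linux boot" mode described above.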
46 changes: 44 additions & 2 deletions docs/rootfs-and-kernel-setup.md
@@ -1,6 +1,6 @@
# Creating Custom rootfs and kernel Images

-## Creating a kernel Image
+## Creating a Linux kernel Image

### Manual compilation

@@ -72,7 +72,7 @@ config="resources/guest_configs/microvm-kernel-arm64-4.14.config"

on an aarch64 machine.

-## Creating a rootfs Image
+## Creating a Linux rootfs Image

A rootfs image is just a file system image that hosts at least an init system.
For instance, our getting started guide uses an ext4 filesystem image. Note
@@ -178,3 +178,45 @@ adjust the script(s) to suit your use case.

You should now have a kernel image (`vmlinux`) and a rootfs image
(`rootfs.ext4`), that you can boot with Firecracker.

## Creating FreeBSD rootfs and kernel Images

Here's a quick step-by-step guide to building a FreeBSD rootfs and kernel that
Firecracker can boot:

1. Boot a FreeBSD system. In EC2, the
[FreeBSD 13 Marketplace image](https://aws.amazon.com/marketplace/pp/prodview-ukzmy5dzc6nbq)
is a good option; you can also use weekly snapshot AMIs published by the
FreeBSD project. (Firecracker support is in FreeBSD 14 and later, so you'll
need FreeBSD 13 or later to build it.)

The build will require about 50 GB of disk space, so size the disk
appropriately.

1. Log in to the FreeBSD system and become root. If using EC2, you'll want to
ssh in as `ec2-user` with your chosen SSH key and then `su` to become root.

1. Install git and check out the FreeBSD src tree:

```sh
pkg install -y git
git clone https://git.freebsd.org/src.git /usr/src
```

At present (July 2023) Firecracker support is only present in the `main`
branch.

1. Build FreeBSD:

```sh
make -C /usr/src buildworld buildkernel KERNCONF=FIRECRACKER
make -C /usr/src/release firecracker DESTDIR=`pwd`
```

You should now have a rootfs `freebsd-rootfs.bin` and a kernel `freebsd-kern.bin`
in the current directory (or elsewhere if you change the `DESTDIR` value) that
you can boot with Firecracker. Note that the FreeBSD rootfs generated in this
manner is somewhat minimized compared to "stock" FreeBSD; it omits utilities
which are only relevant on physical systems (e.g., utilities related to floppy
disks, USB devices, and some network interfaces) and also debug files and the
system compiler.
31 changes: 31 additions & 0 deletions src/vmm/src/arch/mod.rs
@@ -60,3 +60,34 @@ impl fmt::Display for DeviceType {
write!(f, "{:?}", self)
}
}

/// Supported boot protocols.
#[derive(Debug, Copy, Clone, PartialEq)]
pub enum BootProtocol {
/// Linux 64-bit boot protocol
LinuxBoot,
#[cfg(target_arch = "x86_64")]
/// PVH boot protocol (x86/HVM direct boot ABI)
PvhBoot,
}

impl fmt::Display for BootProtocol {
fn fmt(&self, f: &mut ::std::fmt::Formatter) -> ::std::fmt::Result {
match self {
BootProtocol::LinuxBoot => write!(f, "Linux 64-bit boot protocol"),
#[cfg(target_arch = "x86_64")]
BootProtocol::PvhBoot => write!(f, "PVH boot protocol"),
}
}
}

#[derive(Debug, Copy, Clone)]
/// Specifies the entry point address where the guest must start
/// executing code, as well as which boot protocol is to be used
/// to configure the guest initial state.
pub struct EntryPoint {
/// Address in guest memory where the guest must start execution
pub entry_addr: utils::vm_memory::GuestAddress,
/// Specifies which boot protocol to use
pub protocol: BootProtocol,
}
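As a rough illustration of how these two types might be used together, the sketch below picks an `EntryPoint` depending on whether a PVH entry address was found in the kernel image. It is self-contained, so `GuestAddress` is a local stand-in for `utils::vm_memory::GuestAddress`, the `#[cfg(target_arch = "x86_64")]` gates are dropped, and `select_entry_point` is a hypothetical helper rather than a function from this PR.

```rust
use std::fmt;

/// Stand-in for `utils::vm_memory::GuestAddress` so the sketch compiles alone.
#[derive(Debug, Copy, Clone, PartialEq)]
pub struct GuestAddress(pub u64);

/// Supported boot protocols (cfg gating omitted in this sketch).
#[derive(Debug, Copy, Clone, PartialEq)]
pub enum BootProtocol {
    LinuxBoot,
    PvhBoot,
}

impl fmt::Display for BootProtocol {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        match self {
            BootProtocol::LinuxBoot => write!(f, "Linux 64-bit boot protocol"),
            BootProtocol::PvhBoot => write!(f, "PVH boot protocol"),
        }
    }
}

/// Entry address plus the protocol used to set up the initial guest state.
#[derive(Debug, Copy, Clone)]
pub struct EntryPoint {
    pub entry_addr: GuestAddress,
    pub protocol: BootProtocol,
}

/// Hypothetical helper: use the PVH entry when the kernel advertised one,
/// otherwise fall back to the regular kernel entry and Linux boot.
fn select_entry_point(kernel_entry: GuestAddress, pvh_entry: Option<GuestAddress>) -> EntryPoint {
    match pvh_entry {
        Some(addr) => EntryPoint {
            entry_addr: addr,
            protocol: BootProtocol::PvhBoot,
        },
        None => EntryPoint {
            entry_addr: kernel_entry,
            protocol: BootProtocol::LinuxBoot,
        },
    }
}

fn main() {
    let ep = select_entry_point(GuestAddress(0x100_0000), Some(GuestAddress(0x100_0200)));
    println!("booting at {:#x} using {}", ep.entry_addr.0, ep.protocol);
}
```

Carrying the protocol alongside the address lets later setup code (registers, boot parameters) branch on a single value instead of re-inspecting the kernel image.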
35 changes: 33 additions & 2 deletions src/vmm/src/arch/x86_64/gdt.rs
@@ -1,3 +1,5 @@
// Copyright © 2020, Oracle and/or its affiliates.
//
// Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
// SPDX-License-Identifier: Apache-2.0
//
@@ -24,8 +26,37 @@ fn get_base(entry: u64) -> u64 {
| (((entry) & 0x0000_0000_FFFF_0000) >> 16)
}

// Extract the segment limit from the GDT segment descriptor.
//
// In a segment descriptor, the limit field is 20 bits, so it can directly describe
// a range from 0 to 0xFFFFF (1 MB). When the G flag is set (4 KB page granularity), the
// value in the limit field is scaled by a factor of 2^12 (4 KB), making the effective
// limit range from 0xFFF (4 KB) to 0xFFFF_FFFF (4 GB).
//
// However, the limit field in the VMCS definition is a 32 bit field, and the limit value is not
// automatically scaled using the G flag. This means that for a desired range of 4GB for a
// given segment, its limit must be specified as 0xFFFF_FFFF. Therefore the method of obtaining
// the limit from the GDT entry is not sufficient, since it only provides 20 bits when 32 bits
// are necessary. Fortunately, we can check if the G flag is set when extracting the limit since
// the full GDT entry is passed as an argument, and perform the scaling of the limit value to
// return the full 32 bit value.
//
// The scaling mentioned above is required when using PVH boot, since the guest boots in protected
// (32-bit) mode and must be able to access the entire 32-bit address space. It does not cause
// issues for the case of direct boot to 64-bit (long) mode, since in 64-bit mode the processor does
// not perform runtime limit checking on code or data segments.
//
// (For more information concerning the formats of segment descriptors, VMCS fields, et cetera,
// please consult the Intel Software Developer Manual.)
fn get_limit(entry: u64) -> u32 {
-((((entry) & 0x000F_0000_0000_0000) >> 32) | ((entry) & 0x0000_0000_0000_FFFF)) as u32
+let limit: u32 =
+((((entry) & 0x000F_0000_0000_0000) >> 32) | ((entry) & 0x0000_0000_0000_FFFF)) as u32;

// Perform manual limit scaling if G flag is set
match get_g(entry) {
0 => limit,
_ => (limit << 12) | 0xFFF, // G flag is either 0 or 1
}
}

fn get_g(entry: u64) -> u8 {
@@ -109,7 +140,7 @@ mod tests {
assert_eq!(0xB, seg.type_);
// base and limit
assert_eq!(0x10_0000, seg.base);
-assert_eq!(0xfffff, seg.limit);
+assert_eq!(0xffff_ffff, seg.limit);
assert_eq!(0x0, seg.unusable);
}
}
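The limit scaling described in the `get_limit` comment can be checked with a small standalone sketch. `get_limit` follows the new version in this diff. The `gdt_entry` packing helper and the `get_g` bit mask (bit 55 of the descriptor) are written from the standard x86 descriptor layout and are assumptions here, since their bodies are not shown in the hunk.

```rust
/// Pack flags, base and limit into a 64-bit segment descriptor.
/// Assumed layout: flags occupy bits 40..=47 and 52..=55, limit bits 0..=15
/// and 48..=51, base bits 16..=39 and 56..=63.
fn gdt_entry(flags: u16, base: u32, limit: u32) -> u64 {
    ((u64::from(base) & 0xff00_0000) << (56 - 24))
        | ((u64::from(flags) & 0x0000_f0ff) << 40)
        | ((u64::from(limit) & 0x000f_0000) << (48 - 16))
        | ((u64::from(base) & 0x00ff_ffff) << 16)
        | (u64::from(limit) & 0x0000_ffff)
}

/// Granularity flag: bit 55 of the descriptor.
fn get_g(entry: u64) -> u8 {
    ((entry & 0x0080_0000_0000_0000) >> 55) as u8
}

/// Extract the 20-bit limit and scale it to a 32-bit byte limit when G is set.
fn get_limit(entry: u64) -> u32 {
    let limit: u32 =
        (((entry & 0x000F_0000_0000_0000) >> 32) | (entry & 0x0000_0000_0000_FFFF)) as u32;

    match get_g(entry) {
        // Byte granularity: the 20-bit field is already the limit in bytes.
        0 => limit,
        // Page granularity: shift by 12 and fill the low 12 bits with ones.
        _ => (limit << 12) | 0xFFF,
    }
}

fn main() {
    // Same flags/base/limit as the unit test in this diff (G flag set).
    let page_granular = gdt_entry(0xa09b, 0x10_0000, 0xf_ffff);
    // Same descriptor with the G flag cleared.
    let byte_granular = gdt_entry(0x209b, 0x10_0000, 0xf_ffff);
    println!("G=1 limit: {:#x}", get_limit(page_granular));
    println!("G=0 limit: {:#x}", get_limit(byte_granular));
}
```

With the G flag set, a raw limit of `0xfffff` becomes `(0xfffff << 12) | 0xfff = 0xffff_ffff`, which is why the expected value in the updated unit test changes.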
11 changes: 11 additions & 0 deletions src/vmm/src/arch/x86_64/layout.rs
@@ -27,5 +27,16 @@ pub const IRQ_MAX: u32 = 23;
/// Address for the TSS setup.
pub const KVM_TSS_ADDRESS: u64 = 0xfffb_d000;

/// Address of the hvm_start_info struct used in PVH boot
pub const PVH_INFO_START: u64 = 0x6000;

/// Starting address of array of modules of hvm_modlist_entry type.
/// Used to enable initrd support using the PVH boot ABI.
pub const MODLIST_START: u64 = 0x6040;

/// Address of memory map table used in PVH boot. Can overlap
/// with the zero page address since they are mutually exclusive.
pub const MEMMAP_START: u64 = 0x7000;

/// The 'zero page', a.k.a. Linux kernel boot params.
pub const ZERO_PAGE_START: u64 = 0x7000;
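The spacing of these addresses can be sanity-checked with a short sketch. The struct definitions below are reproduced from my reading of Xen's public `start_info.h` (field names and sizes are assumptions relative to this diff, which shows only the constants), and `fits_before` and `max_modlist_entries` are hypothetical helpers.

```rust
use std::mem::size_of;

pub const PVH_INFO_START: u64 = 0x6000;
pub const MODLIST_START: u64 = 0x6040;
pub const MEMMAP_START: u64 = 0x7000;

/// Assumed layout of Xen's `hvm_start_info`.
#[repr(C)]
#[derive(Debug, Default, Copy, Clone)]
pub struct HvmStartInfo {
    pub magic: u32,
    pub version: u32,
    pub flags: u32,
    pub nr_modules: u32,
    pub modlist_paddr: u64,
    pub cmdline_paddr: u64,
    pub rsdp_paddr: u64,
    pub memmap_paddr: u64,
    pub memmap_entries: u32,
    pub reserved: u32,
}

/// Assumed layout of Xen's `hvm_modlist_entry` (one per module, e.g. an initrd).
#[repr(C)]
#[derive(Debug, Default, Copy, Clone)]
pub struct HvmModlistEntry {
    pub paddr: u64,
    pub size: u64,
    pub cmdline_paddr: u64,
    pub reserved: u64,
}

/// Assumed layout of Xen's `hvm_memmap_table_entry`.
#[repr(C)]
#[derive(Debug, Default, Copy, Clone)]
pub struct HvmMemmapTableEntry {
    pub addr: u64,
    pub size: u64,
    pub type_: u32,
    pub reserved: u32,
}

/// True if the region `[start, start + len)` ends at or before `next`.
fn fits_before(start: u64, len: usize, next: u64) -> bool {
    start
        .checked_add(len as u64)
        .map_or(false, |end| end <= next)
}

/// Number of module entries that fit between MODLIST_START and MEMMAP_START.
fn max_modlist_entries() -> u64 {
    (MEMMAP_START - MODLIST_START) / size_of::<HvmModlistEntry>() as u64
}

fn main() {
    println!("hvm_start_info size: {} bytes", size_of::<HvmStartInfo>());
    println!(
        "start info fits before modlist: {}",
        fits_before(PVH_INFO_START, size_of::<HvmStartInfo>(), MODLIST_START)
    );
    println!("module entries that fit: {}", max_modlist_entries());
}
```

Under these assumed layouts, the start info structure ends before `MODLIST_START`, and the module list has room for far more entries than the single initrd module the comment mentions.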