-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Add option to force guests to use swiotlb in dedicated memory region for all I/O #5108
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
roypat
merged 17 commits into
firecracker-microvm:feature/secret-hiding
from
roypat:secret-freedom-v4
Apr 8, 2025
Merged
Add option to force guests to use swiotlb in dedicated memory region for all I/O #5108
roypat
merged 17 commits into
firecracker-microvm:feature/secret-hiding
from
roypat:secret-freedom-v4
Apr 8, 2025
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
8fe8462
to
75f4ee1
Compare
5573725
to
ad06919
Compare
e11ee8e
to
7635ee8
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## feature/secret-hiding #5108 +/- ##
=========================================================
- Coverage 83.03% 82.86% -0.18%
=========================================================
Files 250 251 +1
Lines 26911 27291 +380
=========================================================
+ Hits 22345 22614 +269
- Misses 4566 4677 +111
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
6c587d6
to
c9019db
Compare
ac4f8ce
to
62d7fbb
Compare
This lint forbids using `..Default::default()` in struct initializers after all fields have already been initialized, but this is a useful pattern if you know you want to add more fields to a struct in a future PR without needing to touch a ton of initializers in unittests again (_heavy foreshadowing_). So silence the paperclip. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
There's no need to test this through VmResources when it can be tested in isolation. Also, everytime I touch MachineConfig I get confsued by where the hell the tests are, cuz not only are they in a different module, they're also one directory level away. So move the tests into machine_config.rs, where it makes sense to have them. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
62d7fbb
to
1c44d53
Compare
We will use the generified GuestRegionCollection type from vm-memory#312, so temporarily copy it from the vm-memory PR into the firecracker repository. This commit must be dropped in favor of the upstream solution once vm-memory#312 is merged, and before the feature branch is merged. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
Use a special KvmRegion type, which allows us to manage kvm_userspace_memory_region2 inside of GuestRegionCollection. This means we can stop using .zip(0u32..) to determine kvm slot indices, which doesnt work anymore when multiple GuestMemoryMmaps are around, which have no order imposed on them wrt which one is initialized first. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
1c44d53
to
547fde1
Compare
kalyazin
reviewed
Apr 7, 2025
Add a mem_config parameter to the PUT/PATCH /machine-config endpoints, which allows configuration of the size of a swiotlb region to use. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
Add a special `GuestMemoryMmap` to `struct Vm` that has the semantics that if its not empty, it represents the swiotlb area in the guest. Have all virtio devices only get access to swiotlb regions if they exist, only falling back onto having access to all of guest memory otherwise. This means that in case of swiotlb regions being set up, virtio devices cannot access generic memory, so if the guest decides to (incorrectly) place I/O buffers outside of swiotlb we'll get a somewhat meaningful error of "invalid guest address" (or similar), instead of random things failing further down the line in case generic memory isn't accessible by firecracker/the host kernel for some reason (e.g. if its guest_memfd backed in the future). When a VM with swiotlb regions is being snapshotted, the regions are simply concatenated to the end of the snapshot memory file. This changes the snapshot memory file format in the sense that regions in the file are not necessarily sorted by guest address anymore. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
If swiotlb is requested through the API, this function will allocate a single contiguous region intended for swiotlb use. It needs to be continguous because we must be able to describe it using a single memory range in FDT, as devices cannot be assigned to reserved memory consisting of multiple regions (e.g. FDT assignment is to memory regions, not to #reserved-memory nodes). While we're at it, always use anon memory for the "normal" part of guest memory if swiotlb is enabled, as if swiotlb is enabled we know that vhost devices will never need to access that part of memory. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
Allocate the swiotlb region if requested via the /machine-config API endpoint, and register it to the VM/KVM. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
describe the swiotlb region as a restricted-mem node in the flattened device tree, and force all virtio devices to use it for their vrings and buffers by setting their `memory-region` property. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
Have all virtio devices off VIRTIO_F_PLATFORM_ACCESS, which communicates to the guest that DMA APIs should be used for allocating vrings and virtio buffers. Since we have created a dedicated swiotlb region in the geust and assigned all virtio devices to this, it means that all such allocations will go through the swiotlb region. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
Support restoring VMs with swiotlb regions. For this, untangle the uffd handshake from the actual restoration of memory regions, as we first need to restore and register to the Vm _all_ memory regions, before we can then send a single handshake containing both normal and swiotlb regions to the Uffd handler. While we're here, significantly simplify the jungle of error types. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
This will make it easier to validate in an integration test that virt queues are actually placed in the swiotlb region, as a sort of sanity check that on Firecracker's side the swiotlb regions are actually passed to the virtio devices, and that the guest is honoring the request to place virtio stuff into the swiotlb regions. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
This is needed for the guest to correctly parse the restricted-memory nodes that firecracker now puts into FDT. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
these contain 6.1 guest kernels that have CONFIG_DMA_RESTRICTED_MEM enabled and can thus parse swiotlb region descriptions from FDT. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
If the CI artifacts dont contain old firecracker releases, still succeed at setting them up after downloading them. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
23752f4
to
9188feb
Compare
On aarch64, additionally parametrize our throughput performance tests (network, block and vsock) by swiotlb size, to get a feeling for the performance impact bounce buffering will have. Try out different sizes of swiotlb region to see whether we can identify a cut off. Also add it to the boottime test, because swiotlb can impact the time taken to read the rootfs. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
Add an integration test on ARM that configures an in-guest swiotlb region to which all virtio devices should be bound. Asserts the guest sees the swiotlb region, and also that block and net I/O work (indirectly through loading of rootfs during boot and by running commands via SSH). Also asserts that all virtio queue components have been placed into swiotlb, by parsing firecracker log messages. Also add a test for snapshotting VMs with swiotlb regions, which just checks the network device after restoration to sanity check that the swiotlb region was correctly restored. Signed-off-by: Patrick Roy <roypat@amazon.co.uk>
9188feb
to
08e0f03
Compare
kalyazin
approved these changes
Apr 8, 2025
JackThomson2
approved these changes
Apr 8, 2025
71e33a0
into
firecracker-microvm:feature/secret-hiding
5 of 7 checks passed
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Changes
Add a new API field to /machine-config,
initial_swiotlb_size
, which causes Firecracker to carve out a region of guest memory of the specified size, mark is as dma-restricted in FDT, and force all virtio devices to use this memory for their vrings and buffers.Reason
With secret hiding, I/O needs to be handled by swiotlb, as we cannot do I/O into guest memory that is direct map removed
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md
.PR Checklist
tools/devtool checkstyle
to verify that the PR passes theautomated style checks.
how they are solving the problem in a clear and encompassing way.
in the PR.
CHANGELOG.md
.Runbook for Firecracker API changes.
integration tests.
TODO
.rust-vmm
.