fix `IovDeque` for non 4K pages #5222

ShadowCurse · 2025-05-22T14:33:22Z

Changes

The L const generic was determining the maximum number of iov
elements in the IovDeque. This causes an issue when the host kernel
uses pages which can contain more entries than L. For example, the usual
4K pages can contain 256 iovs while 16K pages can contain 1024 iovs.
On 16K pages (and any other page size bigger than 4K), the current
implementation still wraps the IovDeque when it reaches the 256th element.
This breaks the implementation since elements written past the 256th index
are not 'duplicated' at the beginning of the queue.

Current implementation expects this behavior:

 page 1 page 2
|ABCD|#|ABCD|
      ^ will wrap here

With bigger page sizes, the current implementation does this:

 page 1              page 2
|ABCD|EFGD________|#|ABCDEFGD________|
     ^ will wrap here
                   ^ but should wrap here

The solution is to calculate the maximum capacity the IovDeque can
hold, and use it for wrapping purposes. This capacity is allowed to be
bigger than L. The actual used number of entries in the queue will
still be guarded by the L parameter used in the is_full method.
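
The effect of the wrap point can be sketched with a toy model. This is not the Firecracker implementation: `ToyDeque`, `wrap_at` and `drain_after_refill` are hypothetical names, and the double mapping of a page is only imitated by indexing a single buffer modulo its capacity.

```rust
/// Toy model of a deque stored in one page that is mapped twice back to back:
/// virtual slot `v` aliases physical slot `v % capacity`.
pub struct ToyDeque {
    phys: Vec<u32>, // one "page" worth of slots; its length is the capacity
    start: usize,   // virtual index of the first element
    len: usize,     // number of elements currently stored
    wrap_at: usize, // `start` wraps back to 0 once it reaches this index
    max_len: usize, // plays the role of `L`: limit on stored elements
}

impl ToyDeque {
    pub fn new(capacity: usize, wrap_at: usize, max_len: usize) -> Self {
        Self { phys: vec![0; capacity], start: 0, len: 0, wrap_at, max_len }
    }

    /// Fullness is still decided by `max_len`, not by the page capacity.
    pub fn is_full(&self) -> bool {
        self.len == self.max_len
    }

    pub fn push_back(&mut self, value: u32) {
        assert!(!self.is_full());
        let cap = self.phys.len();
        // A write past the end of the first mapping lands at the
        // beginning of the same physical page.
        self.phys[(self.start + self.len) % cap] = value;
        self.len += 1;
    }

    pub fn pop_front(&mut self) -> Option<u32> {
        if self.len == 0 {
            return None;
        }
        let cap = self.phys.len();
        let value = self.phys[self.start % cap];
        self.start += 1;
        self.len -= 1;
        if self.start >= self.wrap_at {
            self.start -= self.wrap_at;
        }
        Some(value)
    }
}

/// Fill to `max_len`, pop all but one element, refill past the old wrap
/// point, then drain and return the remaining elements in pop order.
pub fn drain_after_refill(capacity: usize, wrap_at: usize, max_len: usize) -> Vec<u32> {
    let mut q = ToyDeque::new(capacity, wrap_at, max_len);
    let n = max_len as u32;
    for v in 0..n {
        q.push_back(v);
    }
    for _ in 0..max_len - 1 {
        q.pop_front();
    }
    for v in n..2 * n - 1 {
        q.push_back(v);
    }
    std::iter::from_fn(|| q.pop_front()).collect()
}
```

In this model, `drain_after_refill(8, 8, 4)` (wrapping at the capacity) returns the elements in insertion order, while `drain_after_refill(8, 4, 4)` (wrapping at `L` on a page that holds more than `L` entries) returns stale values after the wrap, which mirrors the behavior described above.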

Reason

Fixes #5217

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

I have read and understand CONTRIBUTING.md.
I have run tools/devtool checkstyle to verify that the PR passes the
automated style checks.
I have described what is done in these changes, why they are needed, and
how they are solving the problem in a clear and encompassing way.
I have updated any relevant documentation (both in code and in the docs)
in the PR.
I have mentioned all user-facing changes in CHANGELOG.md.
If a specific issue led to this PR, this PR closes the issue.
When making API changes, I have followed the
Runbook for Firecracker API changes.
I have tested all new and changed functionalities in unit tests and/or
integration tests.
I have linked an issue to every new TODO.

This functionality cannot be added in rust-vmm.

codecov · 2025-05-22T14:38:17Z

Codecov Report

Attention: Patch coverage is 87.50000% with 1 line in your changes missing coverage. Please review.

Project coverage is 82.93%. Comparing base (6417786) to head (29df8ea).
Report is 3 commits behind head on main.

Files with missing lines	Patch %	Lines
src/vmm/src/devices/virtio/iov_deque.rs	87.50%	1 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##             main    #5222      +/-   ##
==========================================
+ Coverage   82.88%   82.93%   +0.05%     
==========================================
  Files         250      250              
  Lines       26936    26942       +6     
==========================================
+ Hits        22325    22344      +19     
+ Misses       4611     4598      -13

Flag	Coverage Δ
5.10-c5n.metal	`83.37% <87.50%> (-0.01%)`	⬇️
5.10-m5n.metal	`83.36% <87.50%> (-0.01%)`	⬇️
5.10-m6a.metal	`82.58% <87.50%> (+<0.01%)`	⬆️
5.10-m6g.metal	`79.20% <87.50%> (+<0.01%)`	⬆️
5.10-m6i.metal	`83.36% <87.50%> (+<0.01%)`	⬆️
5.10-m7a.metal-48xl	`82.57% <87.50%> (?)`
5.10-m7g.metal	`79.20% <87.50%> (+<0.01%)`	⬆️
5.10-m7i.metal-24xl	`83.33% <87.50%> (?)`
5.10-m7i.metal-48xl	`83.33% <87.50%> (?)`
5.10-m8g.metal-24xl	`79.19% <87.50%> (?)`
5.10-m8g.metal-48xl	`79.19% <87.50%> (?)`
6.1-c5n.metal	`83.42% <87.50%> (-0.01%)`	⬇️
6.1-m5n.metal	`83.41% <87.50%> (-0.01%)`	⬇️
6.1-m6a.metal	`82.63% <87.50%> (+<0.01%)`	⬆️
6.1-m6g.metal	`79.20% <87.50%> (+<0.01%)`	⬆️
6.1-m6i.metal	`83.40% <87.50%> (+<0.01%)`	⬆️
6.1-m7a.metal-48xl	`82.62% <87.50%> (?)`
6.1-m7g.metal	`79.20% <87.50%> (+<0.01%)`	⬆️
6.1-m7i.metal-24xl	`83.43% <87.50%> (?)`
6.1-m7i.metal-48xl	`83.42% <87.50%> (?)`
6.1-m8g.metal-24xl	`79.19% <87.50%> (?)`
6.1-m8g.metal-48xl	`79.19% <87.50%> (?)`


src/vmm/src/devices/virtio/iov_deque.rs

louwers · 2025-05-22T17:08:21Z

Confirmed that this fix seems to solve the problem I reported.

The issue is no longer reproducible.

src/vmm/src/devices/virtio/iov_deque.rs

Manciukic · 2025-05-27T13:36:57Z

Build is complaining about some markdown formatting

FAILED integration_tests/style/test_markdown.py::test_markdown_style - AssertionError: Some markdown files need formatting. Either run `./tools/devtool sh mdformat .` in the repository root, or apply the above diffs manually.

The `L` const generic was determining the maximum number of `iov` elements in the `IovDeque`. This causes an issue when the host kernel uses pages which can contain more entries than `L`. For example, the usual 4K pages can contain 256 `iov`s while 16K pages can contain 1024 `iov`s. On 16K pages (and any other page size bigger than 4K), the current implementation still wraps the `IovDeque` when it reaches the 256th element. This breaks the implementation since elements written past the 256th index are not 'duplicated' at the beginning of the queue.

Current implementation expects this behavior:

 page 1 page 2
|ABCD|#|ABCD|
      ^ will wrap here

With bigger page sizes, the current implementation does this:

 page 1              page 2
|ABCD|EFGD________|#|ABCDEFGD________|
     ^ will wrap here
                   ^ but should wrap here

The solution is to calculate the maximum capacity the `IovDeque` can hold, and use it for wrapping purposes. This capacity is allowed to be bigger than `L`. The actual used number of entries in the queue will still be guarded by the `L` parameter used in the `is_full` method.

Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>

CHANGELOG.md

Add note about `IovDeque` fix for non 4K pages. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>

Currently only 4K pages on the host and in the guest are officially supported. Other configurations might work, but are not continuously tested. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>

ShadowCurse force-pushed the net_16k_fix branch 4 times, most recently from 62393e7 to 9b52af9 May 22, 2025 15:01

ShadowCurse mentioned this pull request May 22, 2025

[Bug] Regression v1.10.0 tap device unreliable and unresponsive #5217

Closed

3 tasks

Manciukic reviewed May 22, 2025

src/vmm/src/devices/virtio/iov_deque.rs (resolved)

ShadowCurse force-pushed the net_16k_fix branch 3 times, most recently from 5b7c45f to 4ae81dc May 27, 2025 13:19

ShadowCurse marked this pull request as ready for review May 27, 2025 13:19

ShadowCurse requested review from xmarcalx, kalyazin and pb8o as code owners May 27, 2025 13:19

ShadowCurse self-assigned this May 27, 2025

ShadowCurse added the labels Status: Awaiting review, Type: Documentation, Type: Fix May 27, 2025

Manciukic previously approved these changes May 27, 2025

src/vmm/src/devices/virtio/iov_deque.rs (resolved)

ShadowCurse dismissed Manciukic’s stale review via cc98fe9 May 27, 2025 14:23

ShadowCurse force-pushed the net_16k_fix branch from 4ae81dc to cc98fe9 May 27, 2025 14:23

ShadowCurse force-pushed the net_16k_fix branch from cc98fe9 to 650b3c5 May 27, 2025 15:13

ShadowCurse requested a review from Manciukic May 27, 2025 15:13

Manciukic previously approved these changes May 27, 2025

roypat reviewed May 27, 2025

CHANGELOG.md (outdated, resolved)

ShadowCurse added 2 commits May 27, 2025 23:52

chore: update CHANGELOG with a fix

7ecc47e

Add note about `IovDeque` fix for non 4K pages. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>

chore: update documentation about page size support

29df8ea

Currently only 4K pages on the host and in the guest are officially supported. Other configurations might work, but are not continuously tested. Signed-off-by: Egor Lazarchuk <yegorlz@amazon.co.uk>

ShadowCurse dismissed Manciukic’s stale review via 29df8ea May 27, 2025 22:52

ShadowCurse force-pushed the net_16k_fix branch from 650b3c5 to 29df8ea May 27, 2025 22:52

roypat approved these changes May 28, 2025

ShadowCurse requested a review from Manciukic May 28, 2025 08:18

Manciukic approved these changes May 28, 2025

roypat merged commit 1d0f9af into firecracker-microvm:main May 28, 2025
6 of 7 checks passed

ShadowCurse deleted the net_16k_fix branch May 28, 2025 08:42
