test(restore_latency): do not reuse netns #5287

Merged
3 commits merged into firecracker-microvm:main from snap_warmup on Jul 3, 2025

Conversation

@kalyazin (Contributor) commented on Jul 1, 2025

Changes

Disable netns reuse for the test_restore_latency test.

Reason

With netns reuse enabled, the first test variant spends time initialising its netns, while subsequent variants pick up already-initialised namespaces, which skews the first variant's measurements. Disabling netns reuse makes every variant pay the same cost, so the performance data stays consistent.
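
For illustration, the skew and the fix boil down to the sketch below (all names are illustrative stand-ins, not the actual test-framework API):

```python
# Illustrative stand-ins only; the real pooling logic lives in
# Firecracker's integration-test framework.
_pool: list["NetNs"] = []


class NetNs:
    """A network namespace whose first-time setup is the slow part."""

    def setup(self):
        pass  # stand-in for the expensive one-off initialisation


def acquire_netns(reuse: bool) -> NetNs:
    # reuse=True: only the first test variant pays for setup(), so its
    # latency samples carry a cost later variants never see (skewed data).
    # reuse=False: every variant sets up its own netns (consistent data).
    if reuse and _pool:
        return _pool.pop()
    ns = NetNs()
    ns.setup()
    return ns
```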

License Acceptance

By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md.

PR Checklist

  • [x] I have read and understand CONTRIBUTING.md.
  • [x] I have run tools/devtool checkstyle to verify that the PR passes the
    automated style checks.
  • [x] I have described what is done in these changes, why they are needed, and
    how they are solving the problem in a clear and encompassing way.
  • [ ] I have updated any relevant documentation (both in code and in the docs)
    in the PR.
  • [ ] I have mentioned all user-facing changes in CHANGELOG.md.
  • [ ] If a specific issue led to this PR, this PR closes the issue.
  • [ ] When making API changes, I have followed the
    Runbook for Firecracker API changes.
  • [ ] I have tested all new and changed functionalities in unit tests and/or
    integration tests.
  • [ ] I have linked an issue to every new TODO.

  • [x] This functionality cannot be added in rust-vmm.

codecov bot commented on Jul 1, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 82.91%. Comparing base (033825d) to head (cb28847).
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #5287      +/-   ##
==========================================
+ Coverage   82.86%   82.91%   +0.05%     
==========================================
  Files         250      250              
  Lines       26897    26897              
==========================================
+ Hits        22288    22302      +14     
+ Misses       4609     4595      -14     
| Flag | Coverage Δ |
| --- | --- |
| 5.10-c5n.metal | 83.35% <ø> (ø) |
| 5.10-m5n.metal | 83.34% <ø> (ø) |
| 5.10-m6a.metal | 82.56% <ø> (+<0.01%) ⬆️ |
| 5.10-m6g.metal | 79.17% <ø> (ø) |
| 5.10-m6i.metal | 83.34% <ø> (-0.01%) ⬇️ |
| 5.10-m7a.metal-48xl | 82.55% <ø> (?) |
| 5.10-m7g.metal | 79.17% <ø> (ø) |
| 5.10-m7i.metal-24xl | 83.30% <ø> (?) |
| 5.10-m7i.metal-48xl | 83.31% <ø> (?) |
| 5.10-m8g.metal-24xl | 79.17% <ø> (?) |
| 5.10-m8g.metal-48xl | 79.17% <ø> (?) |
| 6.1-c5n.metal | 83.40% <ø> (ø) |
| 6.1-m5n.metal | 83.40% <ø> (+<0.01%) ⬆️ |
| 6.1-m6a.metal | 82.61% <ø> (+<0.01%) ⬆️ |
| 6.1-m6g.metal | 79.17% <ø> (ø) |
| 6.1-m6i.metal | 83.38% <ø> (-0.01%) ⬇️ |
| 6.1-m7a.metal-48xl | 82.60% <ø> (?) |
| 6.1-m7g.metal | 79.17% <ø> (ø) |
| 6.1-m7i.metal-24xl | 83.40% <ø> (?) |
| 6.1-m7i.metal-48xl | 83.41% <ø> (?) |
| 6.1-m8g.metal-24xl | 79.17% <ø> (?) |
| 6.1-m8g.metal-48xl | 79.17% <ø> (?) |

Flags with carried forward coverage won't be shown.
@kalyazin self-assigned this on Jul 1, 2025
@kalyazin added the "Status: Awaiting review" label on Jul 1, 2025
@roypat (Contributor) left a comment

Mh, do you see any way we can make this work even if someone only runs a subset of these tests? Maybe something as simple as just always dropping the first datapoint when emitting metrics?
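
For illustration, the drop-the-first-datapoint idea amounts to something like this (a hypothetical helper, not existing test code):

```python
def emit_restore_latency(samples: list[float], metrics) -> None:
    # Hypothetical: skip the first sample, which may carry the one-off
    # netns initialisation cost, and emit only the rest.
    for value in samples[1:]:
        metrics.put_metric("restore_latency", value, "Milliseconds")
```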

@kalyazin (Contributor, Author) commented on Jul 2, 2025

> Mh, do you see any way we can make this work even if someone only runs a subset of these tests? Maybe something as simple as just always dropping the first datapoint when emitting metrics?

Good point, let me see. The problem is not a single datapoint for some reason, but the entire test result.

@kalyazin changed the title from "test(restore_latency): add warmup run" to "[WIP] test(restore_latency): add warmup run" on Jul 2, 2025
@kalyazin marked this pull request as draft on July 2, 2025 09:46
@kalyazin (Contributor, Author) commented on Jul 2, 2025

> Mh, do you see any way we can make this work even if someone only runs a subset of these tests? Maybe something as simple as just always dropping the first datapoint when emitting metrics?

> Good point, let me see. The problem is not a single datapoint for some reason, but the entire test result.

@roypat Playing with it a bit more, I can't really think of a better way. The netnses are cached between "functions" (i.e. tests). After a test finishes, it returns all its namespaces to the pool so subsequent tests can reuse them. There is no reuse within a test, so we can't run a warmup in every test that creates a netns and then omit the metrics from that execution. I was also thinking of "prefilling" namespaces in every test, i.e. creating them and returning them to the pool immediately. But this doesn't help: even though the namespaces are preallocated, they aren't used/initialised (the ioctl isn't called), so they are no faster than the ones created on demand.
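
To make the constraint concrete, the pooling behaviour described above looks roughly like this (a sketch under assumed names such as `NetNs` and `NetNsPool`; the real code differs):

```python
# Sketch of the described behaviour; all names here are assumptions.
class NetNs:
    def __init__(self):
        self.initialised = False  # created, but the setup ioctl has not run

    def use(self):
        # First real use triggers the expensive initialisation; this is why
        # "prefilled" namespaces are no faster than on-demand ones.
        if not self.initialised:
            self._init_ioctl()
            self.initialised = True

    def _init_ioctl(self):
        pass  # stand-in for the slow kernel-side setup


class NetNsPool:
    """Namespaces are cached across tests, never reused within one test."""

    def __init__(self):
        self._free: list[NetNs] = []

    def get(self) -> NetNs:
        # Hand out a cached namespace if any earlier test returned one.
        return self._free.pop() if self._free else NetNs()

    def put_back(self, ns: NetNs) -> None:
        # Called when a test finishes; later tests pick `ns` up from here.
        self._free.append(ns)


pool = NetNsPool()
# "Prefilling": creates namespaces but never calls use(), so they are
# still uninitialised when the measured test picks them up.
pool.put_back(NetNs())
```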

@kalyazin marked this pull request as ready for review on July 2, 2025 14:31
@kalyazin changed the title from "[WIP] test(restore_latency): add warmup run" to "test(restore_latency): add warmup run" on Jul 2, 2025
@kalyazin requested a review from roypat on July 2, 2025 21:01
This makes it possible to disable netns reuse in order to get consistent
performance results.

Signed-off-by: Nikita Kalyazin <kalyazin@amazon.com>
@kalyazin changed the title from "test(restore_latency): add warmup run" to "test(restore_latency): do not reuse netns" on Jul 3, 2025
This is because with netns reuse, the first test variant spends time on
initialising netns, while the subsequent test variants do not.
Disable netns reuse to make the performance data consistent.

Signed-off-by: Nikita Kalyazin <kalyazin@amazon.com>
@kalyazin merged commit 0fd78cf into firecracker-microvm:main on Jul 3, 2025
7 checks passed
@kalyazin deleted the snap_warmup branch on July 3, 2025 13:55