-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Fixing VMM & RNG per-device metrics. (Minor Tap fix) #5196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi @bstrong04 . Thanks for your contribution!
I left some inline comments regarding the approach.
I also triggered a CI run: https://buildkite.com/firecracker/firecracker-pr/builds/13383#0196b489-4d01-4b39-b984-2e3a363e5745 . It reports some style errors. You can run those yourself locally (please see https://github.com/firecracker-microvm/firecracker/blob/main/tests/README.md).
Worked with @gjkeller for this.
I suggest you split your change in multiple commits, at least the tap test change and rng/vsock changes. You could tag @gjkeller appropriately in those commits via Co-Developed-by, Tested-by, etc.
@@ -47,37 +47,39 @@ where | |||
const PROCESS_NOTIFY_BACKEND: u32 = 4; | |||
|
|||
pub fn handle_rxq_event(&mut self, evset: EventSet) -> bool { | |||
let global = VsockMetricsPerDevice::alloc("global".to_string()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the reason for allocating it on every request? We should be allocating it only when we create a new device or when we restore it from a snapshot.
seq.serialize_entry("vsock", &METRICS)?; | ||
let vsock_metrics = METRICS.read().unwrap(); | ||
let metrics_len = vsock_metrics.metrics.len(); | ||
// +1 to accomodate aggregate net metrics |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here and below replace net with vsock
static METRICS: RwLock<VsockMetricsPerDevice> = RwLock::new(VsockMetricsPerDevice { | ||
metrics: { | ||
let tree = BTreeMap::new(); | ||
tree.insert( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess if we allocate per device structure on device creation, it looks like we don't need the global anymore or am I missing something?
|
||
let json_output = serde_json::to_string(&*METRICS.read().unwrap()).unwrap(); | ||
|
||
// Optional: print JSON to visually verify structure |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this as we run our tests continuously. If there is something extra to check we should implement those via asserts.
@@ -113,14 +113,15 @@ impl Entropy { | |||
} | |||
|
|||
fn handle_one(&mut self) -> Result<u32, EntropyError> { | |||
let global = EntropyMetricsPerDevice::alloc("global".to_string()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please see my comments for the vsock device
Changes
Added a new device metric system for both RNG and VMM VIRTIO devices. This includes mapping them through a BTreeMap as mentioned in issue #4145 for the net devices. Handles multiple devices through ID matching while also adding an Arc to each metric to make them thread safe. Also adds some minor tests replicating the ones in the net devices folder to check for thread safety and aggregation functionality.
Also resolves all former references to metrics within the vsock and rng objects to a reference to the "global" id set to arbitrarily point to the single required vsock and rng devices for their respective files.
Also adds a minor fix to the test_tap_name method allowing for regex matching.
Reason
Done to allow unit tests to run in parallel to each other. In production one vsock device and one rng device suffice for use, but with this implementation device metrics will overwrite each other during testing and tests would have to be run sequentially. Therefore, support for a global metric and multiple other metrics during testing is needed to ensure functionality for tests and production. This behavior is simply replicated on both vsock and rng.
test_tap_name changed from prior hardcoded value of tap0 to handle any numerical value after 'tap' when passed in.
Closes #4709
Worked with @gjkeller for this.
License Acceptance
By submitting this pull request, I confirm that my contribution is made under
the terms of the Apache 2.0 license. For more information on following Developer
Certificate of Origin and signing off your commits, please check
CONTRIBUTING.md
.PR Checklist
tools/devtool checkstyle
to verify that the PR passes theautomated style checks.
how they are solving the problem in a clear and encompassing way.
in the PR.
CHANGELOG.md
.Runbook for Firecracker API changes.
integration tests.
TODO
.rust-vmm
.