Skip to content

Conversation

sahityadg
Copy link
Contributor

@sahityadg sahityadg commented Oct 14, 2025

This PR updates metric names to follow consistent naming patterns for OTLP export. However, the logs are still not fully updated to use the new convention. This also adds integration tests for metrics validation

TODO: Remove all TODO messages in desc. For PR description guidance, review https://github.com/awslabs/mountpoint-s3/blob/main/doc/CONTRIBUTING.md#pull-request-title-and-description.

TODO: What changed and why?

Does this change impact existing behavior?

Yes, it changes metrics names in logs

Does this change need a changelog entry? Does it require a version change?

Yes, it require an update about metric names.


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license and I agree to the terms of the Developer Certificate of Origin (DCO).

Signed-off-by: Sahitya Damera <sahityad@amazon.com>
Signed-off-by: Sahitya Damera <sahityad@amazon.com>
…ion tracking

Signed-off-by: Sahitya Damera <sahityad@amazon.com>
Signed-off-by: Sahitya Damera <sahityad@amazon.com>
This adds TestRecorder utility for metrics validation in tests. This
also updates tracing integation for metrics collection in all our tests.

Signed-off-by: Sahitya Damera <sahityad@amazon.com>
Signed-off-by: Sahitya Damera <sahityad@amazon.com>
use opentelemetry_sdk::metrics::data::{AggregatedMetrics, MetricData, ResourceMetrics};
use opentelemetry_sdk::metrics::in_memory_exporter::InMemoryMetricExporter;
use opentelemetry_sdk::metrics::{PeriodicReader, SdkMeterProvider};
use std::sync::Arc;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: No newline

}

// Every expected label must be in the actual labels
labels.iter().all(|(k, v)| actual_labels.contains(&(*k, *v)))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: O(n^2) here though shouldn't matter due to it being only a test

pub fn get_metric(recorder: &TestRecorder, name: &str, labels: &[(&str, &str)]) -> Arc<Metric> {
recorder
.get(name, labels)
.unwrap_or_else(|| panic!("Expected metric '{name}' with labels {labels:?} to exist"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use expect

fn get_metric(recorder: &TestRecorder, name: &str, labels: &[(&str, &str)]) -> Arc<Metric> {
recorder
.get(name, labels)
.unwrap_or_else(|| panic!("Expected metric '{name}' with labels {labels:?} to exist"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use expect

std::thread::sleep(Duration::from_millis(100));

let read_io_size = get_histogram(&context.recorder, FUSE_IO_SIZE, &[(ATTR_FUSE_REQUEST, "read")]);
assert!(read_io_size.contains(&1024.0));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this assertion really what we want to verify?

@sahityadg sahityadg marked this pull request as ready for review October 14, 2025 13:29
Copy link
Contributor

@dannycjones dannycjones left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I haven't reviewed the test changes in detail nor the previous metric changes, but overall looks good

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We already said that logged metrics aren't stable in docs, but I think it would be worth adding a note to the changelog to cover how we expect the metrics to change. (Any log line format change, metric name changes, etc..)

if last_mem != process.memory() {
metrics::gauge!("process.memory_usage").set(process.memory() as f64);
metrics::gauge!(PROCESS_MEMORY_USAGE).set(process.memory() as f64);
metrics::gauge!("system.available_memory").set(sys.available_memory() as f64);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not also for system memory now?

use std::sync::Once;
use std::time::Duration;

static RECORDER: OnceCell<TestRecorder> = OnceCell::new();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can replace all of the logic for RECORDER with std's LazyCell, or I'm assuming it'll actually need std's LazyLock.

https://doc.rust-lang.org/std/sync/struct.LazyLock.html

This should avoid taking a dependency on once_cell crate.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants