
Metrics Reporter API #1466

Open
@DerGut

Description


Is your feature request related to a problem or challenge?

I'd like to get the Metrics Reporting API into Rust. It's part of the catalog spec and currently only implemented by Iceberg Java (Python, Go and Rust haven't implemented it yet).

In short, the Metrics API allows us to monitor and better understand how clients operate on Iceberg files by providing numbers around:

  • how long the scan planning phase took (e.g. determining which data files to read)
  • how many manifest files were considered, skipped and scanned (from a manifest list)
  • how many data files were considered, skipped and scanned (from a manifest file)
  • the same for delete files
  • similar metrics for commits

To me, this is a crucial piece of functionality. In an Iceberg deployment, various clients can use the same catalog but operate on cloud storage directly. From the catalog, we only get to see which tables are accessed, or how they are modified at a high level. The data plane is completely opaque to the catalog. This means it can be hard to understand how each of the clients is using the stored data, at least in a consistent way. The metrics reporting API provides a solution for this.

I've checked in the #rust Slack channel and there seems to be general interest for this functionality 👍

Note that I marked open questions with ❓.

Describe the solution you'd like

👣 The general steps I think this can be broken down into:

1. Foundational work

All other work items will re-use these APIs.

An API to consume reports (i.e. Java's MetricsReporter interface)

This trait will need to be async because implementations will likely perform IO. Java's RestMetricsReporter, which sends its reports over the network, is a good example.
It will also need the supertraits Debug, Send and Sync. Debug is necessary because the structs that will hold the reporter derive it (e.g. TableScan); Send and Sync are necessary because the same reporter may be reused by many (concurrently running) operations.
Its visibility can be pub(crate) for now. This allows us to integrate metrics reporting into the table operations and provide a default implementation (like the LoggingMetricsReporter). Later, the trait can be made public so that users can implement their own, and we can add a RestMetricsReporter.
While metrics reporting can fail (e.g. when it performs IO), an error during reporting shouldn't affect the table operation in any way. To emphasize this, I think it makes sense to force implementations to handle or log errors internally rather than return a Result<()>.

  • ❓A borrowed report parameter would probably suffice. At the same time, reporting should be the terminal operation for any report; after it has been reported, it shouldn't be reused. Taking the report by value makes the signature explicit about this.
use std::fmt::Debug;

use async_trait::async_trait;

/// This trait defines the basic API for reporting metrics on table operations.
#[async_trait]
pub(crate) trait MetricsReporter: Debug + Send + Sync {
    /// Indicates that an operation is done by publishing a MetricsReport.
    async fn report(&self, report: MetricsReport);
}
Types for metric reports

These will be the scan and commit reports, encoding all the information made available to the reporter. For a list of fields, we can refer to the Java implementations ScanReport.java and CommitReport.java.
Note that the ScanMetrics and CommitMetrics structs can grow quite large because they contain a number of counters (see ScanMetricsResult.java and CommitMetricsResult.java for reference). In my local draft, Clippy recommended boxing them to keep the enum variants at a similar size. This is subject to change.

pub(crate) enum MetricsReport {
    Scan {
        // ...All scan report fields...
        metrics: Box<ScanMetrics>,
    },
    Commit {
        // ...All commit report fields...
        metrics: Box<CommitMetrics>,
    },
}

pub(crate) struct ScanMetrics {
    total_planning_duration: std::time::Duration,
    result_data_files: Counter,
    // more metrics...
}

The Java implementation uses custom classes for both DurationMetric and CounterMetric. I don't see much value in a separate type for durations because the unit is already encoded in std::time::Duration.
That's not the case for u64s, so I included a custom Counter in the snippet above. It probably makes sense to tell reporter implementations which unit a newly added metric uses, but I'm open to suggestions here.

pub(crate) struct Counter {
    pub(crate) value: u64,
    pub(crate) unit: String,
}

Since these types are initially only built internally and then passed to potentially public reporter implementations, we should keep the fields pub(crate) and provide accessors for them. This only becomes relevant later, however.

impl ScanMetrics {
    // One accessor per metric, e.g.:
    pub fn result_data_files(&self) -> &Counter {
        &self.result_data_files
    }
}

In particular, this approach differs from the Java implementation in that the MetricsReport only defines an already built report, ready to be passed to the reporter. The Java implementation instead splits this into mutable metrics and immutable result types (e.g. ScanMetrics vs. ScanMetricsResult). How the counters and durations are determined and set on the ScanMetrics will be discussed next.

2. Report building

Relevant table operations will need to create metric reports, populate them with data and pass them to the metrics reporter.

I've primarily looked at the table scan operation (see TableScan.plan_files) so far; this partially corresponds to Java's SnapshotScan.planFiles.
The challenges I've identified are mainly around its multi-threaded nature. The operation spawns multiple threads to process and filter manifest files and data (+delete) files (example). This means we can't simply pass the metrics structs around but need a more sophisticated builder pattern that accounts for concurrent writes. I'm still working on this in my local draft; a rough sketch of one possible shape follows below.
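
To make this concrete, here is a minimal sketch of such a builder, assuming counters backed by atomics that can be shared across the concurrent tasks. The AtomicCounter and ScanMetricsBuilder names are hypothetical and not part of the existing code:

use std::sync::atomic::{AtomicU64, Ordering};
use std::time::Duration;

/// Hypothetical counter that many concurrent tasks can increment at once.
#[derive(Debug, Default)]
pub(crate) struct AtomicCounter(AtomicU64);

impl AtomicCounter {
    pub(crate) fn increment(&self) {
        self.0.fetch_add(1, Ordering::Relaxed);
    }
}

/// Hypothetical builder, shared (e.g. via Arc) by the tasks spawned in plan_files.
#[derive(Debug, Default)]
pub(crate) struct ScanMetricsBuilder {
    pub(crate) result_data_files: AtomicCounter,
    pub(crate) skipped_data_files: AtomicCounter,
    // more counters...
}

impl ScanMetricsBuilder {
    /// Freezes the counters into an immutable ScanMetrics once planning is done.
    pub(crate) fn build(self, planning_duration: Duration) -> ScanMetrics {
        ScanMetrics {
            total_planning_duration: planning_duration,
            result_data_files: Counter {
                value: self.result_data_files.0.load(Ordering::Relaxed),
                unit: "count".to_string(),
            },
            // more metrics...
        }
    }
}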

Commit reporting will likely be analogous, but I haven't looked into it yet. This could also be added as a follow-up contribution.

3. Reporter implementations

To serve most use cases, and to validate the API, I think it makes sense to include at least one default implementation: a trivial LoggingMetricsReporter (sketched below).

  • ❓I don't yet see tracing or another log crate used outside of tests. If we don't want to commit to a specific log crate, we may need to leave this implementation to users or gate it behind a feature flag instead.
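
As a rough illustration, such a reporter could be as small as the following sketch. It assumes the log facade crate (the same shape works with tracing) and that MetricsReport derives Debug:

use async_trait::async_trait;

/// Hypothetical default reporter that simply logs every report it receives.
#[derive(Debug)]
pub(crate) struct LoggingMetricsReporter;

#[async_trait]
impl MetricsReporter for LoggingMetricsReporter {
    async fn report(&self, report: MetricsReport) {
        // Nothing can fail here; an IO-based reporter would log its
        // errors instead of returning them.
        log::info!("metrics report: {:?}", report);
    }
}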

In a follow-up I would like to add a REST reporter. In Java, this is the default reporter whenever a REST catalog is used. It essentially sends the report to the catalog so that the catalog can aggregate reports across clients.
The iceberg-rest-fixture already used in this repo ships with the /namespaces/{namespace}/tables/{table}/metrics endpoint, which can be used to validate this implementation.
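
Just to sketch the direction (not part of this issue's initial scope), such a reporter could roughly look like this, assuming reqwest with its json feature and a serializable MetricsReport:

use async_trait::async_trait;

/// Hypothetical reporter that posts reports to the catalog's metrics endpoint.
#[derive(Debug)]
pub(crate) struct RestMetricsReporter {
    client: reqwest::Client,
    /// The /namespaces/{namespace}/tables/{table}/metrics endpoint mentioned above.
    endpoint: String,
}

#[async_trait]
impl MetricsReporter for RestMetricsReporter {
    async fn report(&self, report: MetricsReport) {
        // Assumes MetricsReport implements serde::Serialize; failures are
        // logged rather than surfaced to the table operation.
        let result = self
            .client
            .post(self.endpoint.as_str())
            .json(&report)
            .send()
            .await;
        if let Err(err) = result {
            log::warn!("failed to send metrics report: {err}");
        }
    }
}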

4. Configuration

As a start, we will only have the LoggingMetricsReporter available. The Java implementation uses it by default when nothing else is configured. As long as the reporter API is kept private, we won't need to expose any configuration.

Looking forward, users may want to pass their own reporter implementations. The Java implementation allows this at the catalog level and on individual table operations.
If we decide to follow this approach, the reporter will need to be stored in the catalog and then passed down to scan and commit operations.
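
To illustrate the hand-off (names are hypothetical), the catalog could hold a shared reporter and clone the Arc into every operation it creates:

use std::sync::Arc;

/// Hypothetical: catalog-level configuration holding a shared reporter.
#[derive(Debug)]
pub(crate) struct CatalogConfig {
    metrics_reporter: Arc<dyn MetricsReporter>,
    // other catalog configuration...
}

impl CatalogConfig {
    /// Hands the same reporter to each scan/commit, which is why the trait
    /// needs the Send + Sync supertraits.
    pub(crate) fn reporter(&self) -> Arc<dyn MetricsReporter> {
        Arc::clone(&self.metrics_reporter)
    }
}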


Note: While I will be able to work on this independently, I'm still happy and thankful for feedback or comments! 🙏

Willingness to contribute

I can contribute to this feature independently
