-
Notifications
You must be signed in to change notification settings - Fork 17
feat(metrics): add metrics collection mechanism #1071
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
c2b81c4
to
7d678a3
Compare
555b13b
to
0650834
Compare
cf: #1074 -> Introduce the watermark to bound the time range |
36a8e6c
to
7fe2dbd
Compare
Notes from our discussion :
|
2c1f4f2
to
f2a0d4a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I read some of the PR, though I'm not finished. I'll try to finish tomorrow.
Side question btw, you did the dataset and table creation on BQ manually ? |
For now, yes, manually in the sandbox account. Once we end up on a finalized schema, I will write the Terraform for it. This is what the table might look like after the few changes we mentioned: create table `marble-dev-sandbox.metrics.events` (
start_time timestamp not null,
end_time timestamp not null,
deployment_id string(36) not null,
license_key string(36),
org_id string(36),
event_type string not null,
value float64,
text string,
collected_at timestamp default current_timestamp()
)
partition by timestamp_trunc(start_time, month)
cluster by deployment_id, license_key, org_id, event_type; Partition and clustering keys to be discussed, of course. |
…llection configuration
…o ensure all org IDs are represented with zero counts if absent
…ompatibility in metrics adaptation
…nd table, and update ingestion repository to use dynamic dataset/table names
…e the license name
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
…n metric collection
…P request logic in metrics ingestion
…roved initialization and error handling
…nsureGlobalQueuesAreActive function
…torsV1 and update related tests and implementations
…on and caching for deployment ID
…d MetricData structs for consistency
…eamline watermark management in repositories
…r for job duration management
…EMETRY and streamline BigQuery initialization for Marble SaaS projects
509aa6c
to
9658147
Compare
…necessary project ID checks and using IsMarbleSaasProject function
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's merge and improve afterwards.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
This PR introduces the concept of a metric collector, a new use case that runs periodically to launch all collectors. It introduces the notion of a Collector which has a list of collectors (both global and organization-specific), a collector is a module which implement how to compute/extract a metric.
A versioning system is implemented within the Collector to determine the list of collectors we are likely to receive and to know how to parse the results accordingly.
Now that the mechanism is implemented, the next steps will be to define the list of metrics and implement the collectors in usecases/metrics_collection, following the example of the stub.
This PR also introduces the concept of periodicity to determine if the metric is monthly (calendar-based), daily, or instantaneous. If there is an overlap between two periods, the collector will handle the separation and emit two items. A utility is provided to manage these date notions.
There are still some TODOs remaining, and there are PRs branching from this one to implement these TODOs. Some configuration still needs to be set up.