Mimir recording rule/alert gaps #11276
danielpof asked this question in Help and support
We recently moved our Alertmanager and ruler to Mimir, and since then we have been seeing gaps in the alert and recording rule metrics.
Looking at the `ALERTS` metric, there is a gap (the same seems to happen with recording rules). While trying to reproduce it, I was able to trigger the gap by making a single ingester replica unreachable (I changed the StatefulSet to remove all pod ports), at which point the `ALERTS` metric became absent. A normal rollout restart, repeatedly killing one of the ingester replicas, or scaling down to 2 replicas (while keeping the replication factor of 3) does not seem to trigger this behaviour, and the metrics continue to exist.
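To make "gap" concrete when comparing the scenarios above, here is a minimal sketch that flags missing evaluations in a list of sample timestamps. It is plain Python and does not call any Mimir or Prometheus API. The function name `find_gaps`, the 60-second interval, and the 1.5x tolerance are assumptions for illustration only; the timestamps would come from exporting the `ALERTS` series by whatever means is convenient.

```python
def find_gaps(timestamps, interval, tolerance=1.5):
    """Return (previous, current) timestamp pairs that are further apart
    than tolerance * interval, i.e. where at least one sample is missing.

    timestamps: sample times in seconds (any order).
    interval:   expected rule evaluation interval in seconds (assumed value).
    """
    ordered = sorted(timestamps)
    gaps = []
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > tolerance * interval:
            gaps.append((prev, cur))
    return gaps


# Hypothetical samples: evaluations at 180 s and 240 s are missing.
samples = [0, 60, 120, 300, 360]
print(find_gaps(samples, interval=60))  # [(120, 300)]
```

Running the same check against each scenario (unreachable ingester, rollout restart, killed replica, scale-down) would show which ones actually drop samples rather than just delaying them.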
The documentation (https://grafana.com/docs/mimir/latest/references/architecture/components/ingester/#replication-and-availability) doesn't seem to go into much detail on this, so I'm at a loss as to how to stop these ALERTS/recording rule gaps from happening.
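For context, here is a rough sketch of the quorum arithmetic as I understand it from that page. It assumes writes succeed when a majority of the replicas for a series acknowledge them; this is an assumption about the behaviour, not something confirmed against the Mimir source. The function names are hypothetical.

```python
def write_quorum(replication_factor):
    """Minimum number of replicas that must acknowledge a write,
    assuming a simple majority quorum."""
    return replication_factor // 2 + 1


def tolerated_failures(replication_factor):
    """Number of replicas that can be unavailable while a majority
    quorum is still reachable."""
    return replication_factor - write_quorum(replication_factor)


rf = 3  # replication factor used in this setup
print(write_quorum(rf))        # 2
print(tolerated_failures(rf))  # 1
```

Under that assumption, a single unreachable ingester with a replication factor of 3 should still leave a quorum of 2, which is why the gap in the reproduction above is unexpected.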
Has anyone had any problem like this?