-
Well, we got it tuned so that those connections stop happening, but we still see very high heap usage in some pods, which can eventually result in 'envoy overloaded' errors.
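For reference, the 'envoy overloaded' responses come from Envoy's overload manager once a resource monitor trips its threshold. A minimal sketch of that kind of configuration, with an assumed 2 GiB heap ceiling and illustrative thresholds (both should be tuned to the pod's actual memory limit):

```yaml
overload_manager:
  refresh_interval: 0.25s
  resource_monitors:
    - name: "envoy.resource_monitors.fixed_heap"
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.resource_monitors.fixed_heap.v3.FixedHeapConfig
        # Assumed ceiling; should track the container's memory limit.
        max_heap_size_bytes: 2147483648
  actions:
    # Try to release unused heap back to the OS first...
    - name: "envoy.overload_actions.shrink_heap"
      triggers:
        - name: "envoy.resource_monitors.fixed_heap"
          threshold:
            value: 0.95
    # ...and only reject requests ('envoy overloaded') near the ceiling.
    - name: "envoy.overload_actions.stop_accepting_requests"
      triggers:
        - name: "envoy.resource_monitors.fixed_heap"
          threshold:
            value: 0.98
```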
-
We recently went live with Envoy in production, and while we don't notice any impact from the outside, we do see some odd behavior in our metrics.
A couple of facts about our setup:
- CPU: 2000m (currently)

What we are experiencing is that even with the above values, some pods "fail" and get recycled by Kubernetes. Before we made the liveness probe adjustments above, we saw sudden CPU spikes (150% avg) that caused the majority of the pods to fail.
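For illustration, a sketch of the kind of probe and resource settings meant above, assuming the liveness check hits Envoy's admin `/ready` endpoint (the port and thresholds are placeholders, not our exact values):

```yaml
resources:
  limits:
    cpu: "2000m"
livenessProbe:
  httpGet:
    path: /ready        # Envoy admin readiness endpoint
    port: 9901          # assumed admin port
  initialDelaySeconds: 15
  periodSeconds: 10
  timeoutSeconds: 5
  failureThreshold: 3   # tolerate short CPU spikes before a restart
```

With a tight failureThreshold, a sustained CPU spike can fail the probe and make Kubernetes restart an otherwise healthy pod, which would match the recycling we see.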
My question is: is there something generally wrong with our configuration? The pods look like they should be stable, yet we see a "crash" of 5-10 pods every 5-10 minutes.
Since we do seem to have a large number of connections (upwards of 70,000), do we have to adjust: ??
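One knob that looks relevant (an assumption on our side, not confirmed) is the global downstream connection limit, which can be raised via a static runtime layer in the bootstrap; newer Envoy versions express the same cap through the overload manager's downstream-connections monitor:

```yaml
layered_runtime:
  layers:
    - name: static_layer_0
      static_layer:
        overload:
          # Illustrative value; with ~70,000 connections the limit
          # needs headroom above the observed peak.
          global_downstream_max_connections: 100000
```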
A screenshot of our logs is below, in case it helps.