workflow controller keeps crashing under load #14232
static-moonlight started this conversation in General
Replies: 1 comment
-
related to death by |
Scenario: we use Argo to run lots of small workflows, about 500 per hour during normal operation.
After a small outage, Argo gets flooded with 1000+ workflows. It seems the workflow controller can't handle that.
This is a serious problem. I need to know why the workflow controller keeps crashing. How do I find out? Where do I need to look?
On that note: I also need ideas on how to make it more stable, resilient, and reliable.
Any ideas?