What version of Strimzi are you using? What is your configuration? Logs? Etc. In general, no, I do not think the tasks are expected to be moved to other nodes during rolling updates.
Hi all!
I have a question about how Strimzi Kafka Connect rebalances behave during rolling restarts.
While going through a Kafka Connect book, I found a section on rebalancing that recommends keeping the number of rebalances low to reduce cluster strain and stream pauses. It also explains how you can tune Connect with `scheduled.rebalance.max.delay.ms` so that rebalances are not triggered during cluster restarts, as long as workers come back up fast enough.
When triggering a rolling restart with Strimzi Connect in a 3-replica cluster (for example, by changing a `spec.config` or `spec.template.pod.annotations` value in a `KafkaConnect` resource), I see that tasks are immediately rebalanced to the other workers the moment one pod terminates. Am I observing this right? Is this expected behavior?
The potential issue I see is that the moment worker 1 restarts, its tasks get rebalanced to the other workers, which will themselves be restarted very soon. That causes what look like unnecessary rebalances, as work has to be redistributed several times during the roll.
If this is indeed the expected behavior (and it's not due to user error on my side) I would like to tweak the settings in such a way that a rebalance is not triggered if a worker comes up fast enough.
We are currently not setting `scheduled.rebalance.max.delay.ms` and are assuming it takes the default of 5 minutes. Our pods become `Ready` within 90 seconds, but we still see the behavior described above. The other rebalancing-related settings, `session.timeout.ms` and `rebalance.timeout.ms`, have sane defaults (according to the book, and I tend to agree), so I'd rather not change them.
Cheers!
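For reference, this is roughly what setting the delay explicitly would look like in our resource. It is only a sketch: the resource name and bootstrap address are placeholders, and the value shown is just the 5 minute default written out in milliseconds. I have not confirmed that setting it changes the behavior described above.

```yaml
# Sketch of a KafkaConnect resource with the rebalance delay set explicitly.
# metadata.name and bootstrapServers are placeholders, not our real values.
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  name: my-connect-cluster                            # placeholder
spec:
  replicas: 3
  bootstrapServers: my-cluster-kafka-bootstrap:9092   # placeholder
  config:
    # Kafka Connect worker setting. 300000 ms = 5 minutes, the value we
    # currently assume applies by default; well above our ~90 s pod startup.
    scheduled.rebalance.max.delay.ms: 300000
```

Changing anything under `spec.config` triggers a rolling restart itself, so applying this would also be a way to observe the behavior again.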
TL;DR
What is the expected rebalance behavior during a Strimzi Kafka Connect cluster restart?
Is it possible to prevent unnecessary rebalances when rolling a Strimzi Kafka Connect cluster?