Description
This issue is related to #11854 and its proposed fix #11857 (still under discussion), but it remains valid regardless of how #11854 is resolved, because min.insync.replicas can't be deleted when ELR (Eligible Leader Replicas) is enabled.
Create a Kafka custom resource without min.insync.replicas set. It defaults to 1.
When the cluster is ready, the operator sets the following warning in the Kafka CR status:
conditions:
- lastTransitionTime: "2025-09-13T14:16:45.398420014Z"
  message: min.insync.replicas option is not configured. It defaults to 1 which
    does not guarantee reliability and availability. You should configure this option
    in .spec.kafka.config.
  reason: KafkaMinInsyncReplicas
  status: "True"
  type: Warning
Now change min.insync.replicas by setting it explicitly in the .spec.kafka.config section as min.insync.replicas: 2.
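For reference, the change described above amounts to something like the following fragment of the Kafka CR. This is a minimal sketch that shows only the path named in the warning message; every other field of the resource is omitted.

```yaml
# Fragment of the Kafka custom resource; all other fields are omitted.
spec:
  kafka:
    config:
      # Explicit value, replacing the implicit default of 1.
      min.insync.replicas: 2
```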
The operator changes min.insync.replicas dynamically (no broker restart needed) as a cluster-wide configuration, so it is now 2 and the warning condition is cleared from the Kafka CR.
Next, remove min.insync.replicas from the configuration.
There are two possible situations here:
- If ELR is disabled (default in Kafka 4.0), the configuration is deleted (because it's allowed) and the brokers fall back to the default 1. The operator shows the warning message again in the Kafka CR, which makes sense.
- If ELR is enabled (default in Kafka 4.1), the configuration cannot be deleted (it's not allowed) and the brokers stay with the current value 2. But because .spec.kafka.config doesn't include the parameter, the operator shows the warning message again in the Kafka CR, which this time makes no sense and is not true.
The warning is raised by the KafkaSpecChecker class.
To avoid the extra complexity of also checking through the Admin API whether ELR is enabled, I would simply remove this check starting from 0.48.0. It is only advice to users (to guarantee reliability and availability) anyway, not something that prevents the cluster from becoming ready.