
Removing min.insync.replicas from Kafka CR can lead to a misleading warning #11866

@ppatierno

This issue is related to #11854 and its proposed fix #11857 (still under discussion). It remains valid whatever the outcome of #11854, because min.insync.replicas can't be deleted when ELR is enabled.

Create a Kafka custom resource without min.insync.replicas set; it defaults to 1.
When the cluster is ready, the operator sets the following warning in the Kafka CR status:

conditions:
  - lastTransitionTime: "2025-09-13T14:16:45.398420014Z"
    message: min.insync.replicas option is not configured. It defaults to 1 which
      does not guarantee reliability and availability. You should configure this option
      in .spec.kafka.config.
    reason: KafkaMinInsyncReplicas
    status: "True"
    type: Warning

Now change min.insync.replicas by setting it explicitly in the .spec.kafka.config section as min.insync.replicas: 2.
The operator applies the change dynamically (no broker restart) as a cluster-wide configuration, so the value is now 2 and the warning condition is cleared from the Kafka CR.

Next, remove min.insync.replicas from the configuration.
There are two possible situations:

  • If ELR is disabled (default in Kafka 4.0), the configuration is deleted (this is allowed) and the brokers go back to the default of 1. The operator shows the warning message again in the Kafka CR, which makes sense.
  • If ELR is enabled (default in Kafka 4.1), the configuration cannot be deleted (this is not allowed) and the brokers keep the current value of 2. Because .spec.kafka.config no longer includes the parameter, the operator still shows the warning message again in the Kafka CR, which this time is wrong: the brokers are not using the default of 1.
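
The two situations above can be sketched as a small toy model. This is only an illustration of the behaviour described in this issue, not the actual KafkaSpecChecker code: the class name `MinIsrWarningSketch`, its methods, and the shortened warning text are all hypothetical, and the model assumes the check looks only at .spec.kafka.config.

```java
import java.util.Map;
import java.util.Optional;

/** Toy model of the two cases; NOT the actual operator code (all names are hypothetical). */
public class MinIsrWarningSketch {
    static final String MIN_ISR = "min.insync.replicas";
    static final int KAFKA_DEFAULT_MIN_ISR = 1;

    /** Value the brokers end up with after the option is removed from .spec.kafka.config. */
    static int effectiveValueAfterRemoval(boolean elrEnabled, int previousValue) {
        // Assumption from the issue: with ELR enabled the cluster-wide config
        // cannot be deleted, so the previously set value stays in place.
        return elrEnabled ? previousValue : KAFKA_DEFAULT_MIN_ISR;
    }

    /** Spec-only check: warn whenever the option is absent from the spec config. */
    static Optional<String> specOnlyWarning(Map<String, Object> specConfig) {
        if (specConfig.containsKey(MIN_ISR)) {
            return Optional.empty();
        }
        return Optional.of(MIN_ISR + " option is not configured. It defaults to 1.");
    }

    /** The warning is misleading when it fires although the brokers do not use the default. */
    static boolean warningIsMisleading(Map<String, Object> specConfig, int effectiveValue) {
        return specOnlyWarning(specConfig).isPresent() && effectiveValue != KAFKA_DEFAULT_MIN_ISR;
    }

    public static void main(String[] args) {
        // The option was set to 2 and has now been removed from the spec.
        Map<String, Object> specWithoutMinIsr = Map.of();
        for (boolean elr : new boolean[] {false, true}) {
            int effective = effectiveValueAfterRemoval(elr, 2);
            System.out.println("ELR enabled=" + elr
                    + ", effective=" + effective
                    + ", misleading=" + warningIsMisleading(specWithoutMinIsr, effective));
        }
    }
}
```

In this model, the warning is accurate with ELR disabled (the effective value is 1) and misleading with ELR enabled (the effective value stays 2), which matches the two cases above.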

The warning is raised by the KafkaSpecChecker class.
To avoid the extra complexity of also checking through the Admin API whether ELR is enabled, I would just remove this check starting from 0.48.0. It is only advice to users (to guarantee reliability and availability), not something that prevents the cluster from becoming ready.
