-
Hello, I am using the Strimzi Kafka operator with Cruise Control. I have questions about the metrics below; can you please give your input on them?
Cruise Control logs - AnomalyDetectorState: {selfHealingEnabled:[], selfHealingDisabled:[DISK_FAILURE, BROKER_FAILURE, GOAL_VIOLATION, METRIC_ANOMALY, TOPIC_ANOMALY, MAINTENANCE_EVENT], selfHealingEnabledRatio:{DISK_FAILURE=0.0, BROKER_FAILURE=0.0, GOAL_VIOLATION=0.0, METRIC_ANOMALY=0.0, TOPIC_ANOMALY=0.0, MAINTENANCE_EVENT=0.0}, recentGoalViolations:[], recentBrokerFailures:[], recentMetricAnomalies:[], recentDiskFailures:[], recentTopicAnomalies:[], recentMaintenanceEvents:[], metrics:{meanTimeBetweenAnomalies:{GOAL_VIOLATION:0.00 milliseconds, BROKER_FAILURE:0.00 milliseconds, METRIC_ANOMALY:0.00 milliseconds, DISK_FAILURE:0.00 milliseconds, TOPIC_ANOMALY:0.00 milliseconds}, meanTimeToStartFix:0.00 milliseconds, numSelfHealingStarted:0, numSelfHealingFailedToStart:0, ongoingAnomalyDuration=0.00 milliseconds}, ongoingSelfHealingAnomaly:None, balancednessScore:100.000}
-
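kafka_cruisecontrol_KafkaCruiseControlServlet_REBALANCE_request_rate_Count: This is the number of times the REBALANCE endpoint is requested. It is not only incremented when a rebalance is executed but also when the proposal, status, or result of that rebalance is being checked. So even if a single KafkaRebalance resource was created for a partition rebalance, the Strimzi Operator will hit this endpoint several times throughout the lifecycle of the rebalancing process.
kafka_cruisecontrol_AnomalyDetector_balancedness_score_Value: The default value is set to 100, and the score is only decreased by anomaly detection goals that are violated. Since no anomaly detection goals are listed, no anomaly detection goals are violated and the balancedness score stays at 100.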
-
Then this metric cannot be used to monitor how frequently rebalancing is happening. This will confuse users.
Is there any other metric that is more suitable and gives a better view of the rebalancing rate?
…On Fri, 8 Aug, 2025, 8:41 pm Kyle Liberti, ***@***.***> wrote:
kafka_cruisecontrol_KafkaCruiseControlServlet_REBALANCE_request_rate_Count - this metric shows much higher value as compared to rebalancing done in actual. I am doing manual rebalancing.
This is the number of times the REBALANCE endpoint is requested. It is not only incremented when a rebalance is executed but also when the proposal, status, or result of that rebalance is being checked. So even if a single KafkaRebalance resource was created for a partition rebalance, the Strimzi Operator will hit this endpoint several times throughout the lifecycle of the rebalancing process.
kafka_cruisecontrol_AnomalyDetector_balancedness_score_Value - self-healing is disabled, and no anomaly detector goals are set, still this metric shows value all the time as 100.
The default value is set to 100, the score is only decreased by anomaly detection goals that are violated. Since no anomaly detection goals are listed, no anomaly detection goals are violated and the balancedness score stays at 100.
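As a rough illustration of the balancedness score behaviour described above, here is a toy model in Python. It is only a sketch: the function name `balancedness_score` and the per-goal cost values are hypothetical, and it does not reproduce how Cruise Control actually weights goals. It only shows why a score that starts at 100 and is reduced solely by violated goals stays at 100 when no anomaly detection goals are configured.

```python
def balancedness_score(violation_costs, max_score=100.0):
    """Toy model: start at max_score and subtract a cost per violated goal.

    violation_costs maps a goal name to a hypothetical cost. These costs are
    an assumption for illustration, not Cruise Control's real weighting.
    """
    return max(0.0, max_score - sum(violation_costs.values()))


# No anomaly detection goals configured, so nothing can be violated.
print(balancedness_score({}))

# One violated goal with a made-up cost lowers the score.
print(balancedness_score({"ReplicaDistributionGoal": 12.5}))
```

In this model an empty set of violations always yields the maximum score, which matches the constant value of 100 seen in the AnomalyDetectorState log.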
What would be the right metric to expose here?
Note that this dashboard example only exposes metrics that are provided by Cruise Control "sensors".
From what I understand from the upstream Cruise Control wiki, this metric isn't supposed to be the number of rebalances executed. It is supposed to be the average number of HTTP requests to Cruise Control's "REBALANCE" endpoint.
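To make the gap between endpoint requests and executed rebalances concrete, here is a small Python simulation. It is a sketch under stated assumptions: the function name `simulate_kafka_rebalance` and the number of follow-up checks are hypothetical, and the real number of requests the Strimzi Operator sends per KafkaRebalance resource is not specified in this thread.

```python
from collections import Counter


def simulate_kafka_rebalance(follow_up_checks):
    """Toy model of one KafkaRebalance lifecycle.

    Counts requests to the REBALANCE endpoint separately from executions.
    follow_up_checks is a hypothetical number of proposal/status/result checks.
    """
    counts = Counter()
    counts["rebalance_requests"] += 1                 # initial proposal request
    counts["rebalance_requests"] += follow_up_checks  # later checks hit the same endpoint
    counts["rebalance_requests"] += 1                 # request that starts the execution
    counts["executions"] += 1                         # only one rebalance actually runs
    return counts


# Three manual rebalances, each with an assumed five follow-up checks.
total = Counter()
for _ in range(3):
    total += simulate_kafka_rebalance(follow_up_checks=5)

print(total["rebalance_requests"], total["executions"])
```

Under these assumptions the request counter grows several times faster than the number of rebalances, which is the pattern reported for the REBALANCE request metric.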