-
Notifications
You must be signed in to change notification settings - Fork 949
Open
Labels
Description
The current implementation of clusterNodeCleanupFailureReports
walks the entire fail reports in the list on every invocation and performs a listDelNode() for each expired or non‑voting entry. When a cluster node has accumulated hundreds or thousands of failure reports, this function becomes a CPU hotspot.
Lines 1702 to 1721 in 3ceae81
void clusterNodeCleanupFailureReports(clusterNode *node) { | |
list *l = node->fail_reports; | |
if (!listLength(l)) return; | |
listNode *ln; | |
listIter li; | |
clusterNodeFailReport *fr; | |
mstime_t maxtime = server.cluster_node_timeout * CLUSTER_FAIL_REPORT_VALIDITY_MULT; | |
mstime_t now = mstime(); | |
listRewind(l, &li); | |
while ((ln = listNext(&li)) != NULL) { | |
fr = ln->value; | |
if (now - fr->time > maxtime) { | |
listDelNode(l, ln); | |
} else if (!clusterNodeIsVotingPrimary(fr->node)) { | |
listDelNode(l, ln); | |
} | |
} | |
} |
(The above result was observed during a 2,000‑node cluster failover scenario)
Current behavior:
- Where N is the number of failure reports, O(N) scan on every call, regardless of how many reports are actually expired.
Improvement ideas:
Use a priority queue with expiration time, so expired nodes can be removed in O(RlogN) rather than scanned in full. (R is # of actually expired reports)- Use lazy deletion instead of removing them on every call
- Use early exit when meeting quorum
Metadata
Metadata
Assignees
Labels
Type
Projects
Status
Optional for next release candidate