stop master on AOF short write if there are enough good replicas #2375
base: unstable
Conversation
Signed-off-by: kronwerk <kronwerk@users.noreply.github.com>
Codecov Report
Attention: Patch coverage is
Additional details and impacted files

@@             Coverage Diff              @@
##           unstable    #2375      +/-   ##
============================================
  Coverage     71.41%    71.41%
============================================
  Files           123       123
  Lines         67139     67153      +14
============================================
+ Hits          47947     47959      +12
- Misses        19192     19194       +2
kronwerk force-pushed the branch from 223d7a7 to 6489e52
Thanks for the PR! I think there is definitely a gap here - thanks for identifying it and putting together a solution. I have a few comments about the approach, please take a look!
listIter li;
listNode *ln;
int good = 0;

if (!server.repl_min_replicas_to_write || !server.repl_min_replicas_max_lag) return;

listRewind(server.replicas, &li);
while ((ln = listNext(&li))) {
    client *replica = ln->value;
    time_t lag = server.unixtime - replica->repl_data->repl_ack_time;
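For context, a sketch of how this counting loop presumably continues, modeled on the existing min-replicas machinery; the comparison and the final check below are assumptions, not the PR's actual lines:

    /* Sketch (assumption): count a replica as "good" when its last
     * REPLCONF ACK arrived within the configured maximum lag. */
    if (lag <= server.repl_min_replicas_max_lag) good++;
}
/* Sketch (assumption): with enough good replicas, the AOF short-write path
 * can decide it is safe to stop the primary. */
if (good >= server.repl_min_replicas_to_write) {
    /* ... trigger the shutdown / failover described in the PR ... */
}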
Lag here means "how long (in seconds) since the last REPLCONF ACK response".
If we are testing lag==0, that doesn't necessarily guarantee that the replica is fully caught up. It just means that we have gotten a REPLCONF ACK on the current second (this could have been up to 999ms ago in the worst case).
Should we compare the replication offset instead?
@murphyjacob4 Hmm, this was introduced 12 years ago, so maybe it could be improved.
I'm not sure about the right way to do it, though. In my opinion, repl_min_replicas_max_lag is a way to allow considering a replica "good" even with a user-configured amount of lag.
If we want to switch to offsets, what should we do: ignore repl_min_replicas_max_lag, deprecate it, or introduce a repl_min_replicas_max_offset? Would it be convenient and clear for a user which offset to pick for the same use case? Or should repl_min_replicas_max_offset always be 0 (equal offsets for primary and replica)? In that case we lose this functionality (allowing a lagged replica to be considered "good"). Is that ok for us?
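To make the offset idea concrete, here is a minimal sketch of an offset-based variant, assuming a per-replica acknowledged-offset field (repl_ack_off, named by analogy with repl_ack_time) and server.primary_repl_offset for the primary's current offset; both identifiers and the function name are assumptions, not confirmed against the tree:

/* Hypothetical sketch: count replicas that have acknowledged every byte
 * the primary has produced. Field names are assumptions for illustration. */
static int countCaughtUpReplicas(void) {
    listIter li;
    listNode *ln;
    int caught_up = 0;

    listRewind(server.replicas, &li);
    while ((ln = listNext(&li))) {
        client *replica = ln->value;
        /* An exact offset match means the replica is fully caught up, with
         * no allowance for user-configured lag. */
        if (replica->repl_data->repl_ack_off == server.primary_repl_offset)
            caught_up++;
    }
    return caught_up;
}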
When the primary's disk fills up with AOF data, the primary may end up stalled forever in that state. This PR adds a config option to stop the primary when there are enough good replicas in such a situation (reusing the recently added failover attempt from the cluster version).
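As a rough illustration of the behavior described above (not the actual diff; the option name, helper, and log text are hypothetical):

/* Hypothetical sketch of the idea in the PR description. Identifiers
 * introduced here (aof_shutdown_on_short_write, countGoodReplicas,
 * handleAofShortWrite) are illustrative assumptions, not the PR's names. */
void handleAofShortWrite(void) {
    /* Only act when the new opt-in config option is enabled. */
    if (!server.aof_shutdown_on_short_write) return;

    /* Reuse the "good replicas" notion from min-replicas-to-write. */
    if (countGoodReplicas() >= server.repl_min_replicas_to_write) {
        serverLog(LL_WARNING, "AOF short write and enough good replicas: "
                              "stopping the primary so a replica can take over.");
        /* In cluster mode, the recently added failover attempt would be
         * triggered here before exiting. */
        exit(1);
    }
}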