-
Couldn't load subscription status.
- Fork 929
Respect REPLCONF VERSION in full sync #2735
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: unstable
Are you sure you want to change the base?
Conversation
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## unstable #2735 +/- ##
============================================
- Coverage 72.58% 72.41% -0.18%
============================================
Files 128 128
Lines 71277 70165 -1112
============================================
- Hits 51736 50807 -929
+ Misses 19541 19358 -183
🚀 New features to boost your workflow:
|
Given that this is a config for replica behavior, maybe we should also have a config here? AWS has the If we were to add a config
AWS also quietly launched a downgrade feature from 7.2 -> 7.1, and users have mentioned the most important benefit is that they feel a lot more comfortable to do the upgrade. The code looks mostly OK, but I didn't dive too deeply yet. I know this was contentious in the past, so it would be good to get @enjoy-binbin @soloestoy @PingXie @hwware to chime in as well. I think we probably need to just make a decision here since it keeps coming up as important. |
Fine! Let's always abort in this case then? We don't need a config IMHO. What matters is that it works when no new features are used. If the user has started playing with HFE keys, they need to delete them and then it will work. |
Yeah, I'm OK with this too. |
|
I think I will vote as abstention, because in our side, we have no this option to our users, and we have no idea which kind of surprise could happen. But I think this feature is useful for some other users, thus I vote "abstention" |
Only in full sync. Use the replica's REPLCONF VERSION to select RDB version. Abort if any data that can't be represented in the older RDB version is found. Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
a5e0272 to
d04cf78
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my experience, providing a way to undo an upgrade makes users less worried and actually more willing to upgrade. This is true not least when Valkey is a part of a larger system which is upgraded as a whole. Valkey may be just one microservice of many within a larger system; not uncommon in on-prem deployments. If anything goes wrong (even if it's not Valkey itself) the user wants a way to roll back the whole system to the last working state.
I agree with this, and this is indeed an important one, i am in it. The code mostly LGTM but i did not dive too deeply, just comment that i am ok with this
Signed-off-by: Viktor Söderqvist <viktor.soderqvist@est.tech>
Provide an RDB snapshot that the replica can handle, for resilience during rolling upgrades.
This permits an older replica to do a full sync from a newer primary. The primary takes the replica's announced version into account when generating the snapshot. In particular, it allows latest Valkey to send snapshots as RDB 11 to replicas running Valkey 7.2 and 8.x.
The implementation is structurally similar to how filtered snapshots with REPLCONF RDB-FILTER-ONLY works (commits 1bf6d6f, 65a7635) and to the feature negotiation that replicas initiate using REPLCONF CAPA.
If any new features that the replica can't handle (such as hash-field expiration and atomic slot migration) are in use, the full sync is aborted and the replica connection is closed.
This mechanism will allow us to do RDB changes more often. In the recent years, we have been avoiding RDB changes. With this mechanism, there is no need to avoid introducing RDB changes such as new encodings and new compression algorithms (#1962).
In my experience, providing a way to undo an upgrade makes users less worried and actually more willing to upgrade. This is true not least when Valkey is a part of a larger system which is upgraded as a whole. Valkey may be just one microservice of many within a larger system; not uncommon in on-prem deployments. If anything goes wrong (even if it's not Valkey itself) the user wants a way to roll back the whole system to the last working state.
This speaker in this talk presents a similar experience:
https://www.youtube.com/watch?v=IIElfMbE8cw
In the talk, it's mentioned that rolling back a system can even be part of compliance and SLA requirements.