-
Notifications
You must be signed in to change notification settings - Fork 3.8k
Add host replacement to tracked keyspaces #4396
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: cep-45-mutation-tracking
Are you sure you want to change the base?
Add host replacement to tracked keyspaces #4396
Conversation
// for correctness vs complex protocols topology updates. You could make the case that mutable state would be | ||
// a better tradeoff for node replacement, but it seems likely that handling token movements will be simpler | ||
// if we use a copy on write pattern for topology changes. | ||
private final ReentrantReadWriteLock shardLock = new ReentrantReadWriteLock(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Practically it should be fine to keep this lock unfair, but I wonder if we'll find workloads with high read and write query throughput to starve topology changes. Could be worth using StampedLock here, we don't seem to require reentrancy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree this is probably fine, but it would be worth looking at better solutions. Would you mind if we just added a TODO to consider better options here before merge, instead of addressing it in this ticket?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah definitely, fine to defer
*/ | ||
private boolean isTrackedReplicationEnabled(String keyspace) | ||
{ | ||
return ClusterMetadata.current().schema.getKeyspaceMetadata(keyspace).useMutationTracking(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Null check for dropped keyspace?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
StreamPlan assumes that you're only sending/requesting streams for keyspaces that exist, and the rest of the class has the same null unsafety around keyspace lookups. I can add a more descriptive error message if you like, but if the keyspace doesn't exist, other parts of stream plan will have already thrown an NPE before getting here
|
||
public long serializedSize(int version) | ||
{ | ||
return 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Intentional?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is, yes. OutgoingStreamMessage does the same thing since these messages aren't serialized into a buffer, but put directly into the socket.
// end-of-stream marker | ||
out.writeBoolean(false); | ||
|
||
session.logStreamSent(this); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we avoid side effects like timeout scheduling during serde?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is the same pattern used by sstable streaming
mutation.getKeyspaceName(), | ||
mutation.key().getToken()); | ||
|
||
mutation.apply(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was thinking we'd receive all the mutation logs, then do replay to apply all before completing the session, rather than deserializing and applying each at a time
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d thought about doing that as part of this initial implementation, but it wasn’t really clear what would be gained from the additional state changes. IIRC the motivation for staging sstables before making them visible to reads is to prevent data resurrection, but that’s not a concern with read reconciliation. Additionally, I think we’ll only be doing log streaming like this for pending ranges, so any data written here won’t actually be read until the streams complete successfully anyway.
No description provided.