Describe the bug
A shard can end up with two primaries when cluster-allow-replica-migration is disabled.
To reproduce
Let’s say a shard has a primary(A) and a replica(B). Also let's assume that cluster-allow-replica-migration is disabled.
- Node A goes down, and Node B takes over primaryship.
- Node A continues to be down while another Node C is added as a replica of B.
- Node B goes down, and Node C takes over primaryship.
- Node A and Node B come back up and start learning about the topology.
- Node A comes up thinking it was the primary (but has an older config epoch compared to C).
- Node A learns about Node C via gossip and assigns it a random shard_id.
- Node A receives a direct ping from Node C.
  a. Node C advertises the same set of slots that Node A previously owned.
  b. Because Node A has assigned a random shard_id to Node C, Node A concludes that it lost all its slots to Node C, which it believes is in another shard, and so it remains a primary.
- Node A then updates Node C's shard_id to the actual value while processing the shard_id in the ping extensions.
- Node A and Node C end up as primaries in the same shard, while Node C continues to own the slots.
Expected behavior
I would have expected node A to become a replica of node C after learning that node C is in the same shard.
Potential fix
Currently, when Node A receives a ping from Node C, it first processes the slots config and only then processes the shard_id in the ping extension. One possible fix is to process the shard_id (or all the ping extensions) first and update the slot configuration afterwards.
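
To illustrate why the processing order matters, here is a minimal Python sketch of the scenario. It is a toy model, not the actual cluster code: the `Node` and `Ping` classes, the `view` dictionary (Node A's belief about other nodes' shard IDs), and every function name are hypothetical and exist only for this illustration. It assumes that, with replica migration disabled, a node that loses all its slots becomes a replica of the sender only if it believes the sender is in its own shard.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    name: str
    shard_id: str
    config_epoch: int
    slots: set = field(default_factory=set)
    role: str = "primary"
    replicates: Optional[str] = None  # name of the primary, if a replica


@dataclass
class Ping:
    sender: str
    config_epoch: int
    slots: frozenset
    shard_id: str  # carried in the ping extension


def update_slots(receiver, view, ping, allow_replica_migration=False):
    """Hypothetical slot-config update: hand over slots claimed by a newer epoch."""
    if ping.config_epoch <= receiver.config_epoch:
        return
    lost = receiver.slots & ping.slots
    receiver.slots -= lost
    if lost and not receiver.slots:
        # The same-shard check uses whatever shard_id the receiver
        # currently believes the sender has.
        same_shard = view[ping.sender] == receiver.shard_id
        if same_shard or allow_replica_migration:
            receiver.role = "replica"
            receiver.replicates = ping.sender


def apply_shard_id_extension(view, ping):
    """Record the sender's real shard_id from the ping extension."""
    view[ping.sender] = ping.shard_id


def handle_ping(receiver, view, ping, extensions_first):
    if extensions_first:
        apply_shard_id_extension(view, ping)
        update_slots(receiver, view, ping)
    else:
        update_slots(receiver, view, ping)
        apply_shard_id_extension(view, ping)


def simulate(extensions_first):
    # Node A restarts with a stale epoch, still believing it owns the slots.
    a = Node("A", shard_id="shard-1", config_epoch=1, slots=set(range(100)))
    # A learned about C via gossip and assigned it a random shard_id.
    view = {"C": "random-placeholder"}
    ping = Ping("C", config_epoch=3, slots=frozenset(range(100)), shard_id="shard-1")
    handle_ping(a, view, ping, extensions_first)
    return a, view


if __name__ == "__main__":
    for order in (False, True):
        node, view = simulate(extensions_first=order)
        print(order, node.role, node.replicates, view["C"])
```

In this model, `simulate(extensions_first=False)` leaves Node A as a slotless primary even though it now knows Node C shares its shard, while `simulate(extensions_first=True)` turns Node A into a replica of Node C, which matches the expected behavior described above.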
sarthakaggarwal97