[BUG] Two primaries in same shard #2261

**Describe the bug**

A shard can end up with two primaries when `cluster-allow-replica-migration` is disabled.

**To reproduce**

Suppose a shard has a primary (A) and a replica (B), and that [cluster-allow-replica-migration](https://github.com/valkey-io/valkey/blob/8.1/src/cluster_legacy.c#L6873) is disabled.

1. Node A goes down, and Node B takes over primaryship.
2. Node A continues to be down while another Node C is added as a replica of B.
3. Node B goes down, and Node C takes over primaryship.
4. Node A and Node B come back up and start learning about the topology.
5. Node A comes up thinking it was the primary (but has an older config epoch compared to C).
6. Node A learns about Node C via gossip and assigns it a random `shard_id`.
7. Node A receives a direct ping from Node C.
    a. Node C advertises the same set of slots that Node A was earlier owning.
    b. Because Node A assigned a random shard ID to Node C, Node A concludes that it has lost all its slots to Node C, a node in another shard, and therefore remains a primary.
8. Node A then updates the actual `shard_id` of Node C while processing `shard_id` in ping extensions.
9. Node A and Node C end up being primaries in the same shard while Node C continues to own slots.
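
Steps 6 to 9 can be replayed in a small toy model. This is only a sketch of the ordering described in this report, not valkey's actual code: the `Node` class, `handle_ping_current_order`, the dictionary-shaped ping, and the epoch values are all hypothetical, and the rule that an emptied primary follows the sender only when both are in the same shard (or replica migration is allowed) is a simplification assumed from the description above.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class Node:
    """Toy view of a cluster node (hypothetical, not valkey's clusterNode)."""
    name: str
    shard_id: str
    role: str = "primary"
    config_epoch: int = 0
    slots: set = field(default_factory=set)


def handle_ping_current_order(me, sender_view, ping, allow_replica_migration=False):
    """Toy model of the reported ordering: slots first, shard_id second."""
    # Slot update: the sender claims the slots with a higher config epoch.
    if ping["config_epoch"] > me.config_epoch:
        lost = me.slots & ping["slots"]
        me.slots -= lost
        sender_view.slots |= lost
        same_shard = sender_view.shard_id == me.shard_id
        # Assumed rule: an emptied primary follows the sender only if the
        # sender is in the same shard or replica migration is allowed.
        if lost and not me.slots and (same_shard or allow_replica_migration):
            me.role = "replica"
    # Ping extension: only now is the sender's real shard_id recorded.
    sender_view.shard_id = ping["shard_id"]


shard = "shard-1"
slots = set(range(100))
node_a = Node("A", shard, config_epoch=1, slots=set(slots))

# Step 6: A learns about C via gossip and gives it a random shard_id.
c_seen_by_a = Node("C", uuid.uuid4().hex, config_epoch=3)

# Steps 7-8: direct ping from C carrying its slots and its real shard_id.
handle_ping_current_order(
    node_a, c_seen_by_a,
    {"config_epoch": 3, "slots": slots, "shard_id": shard},
)

# Step 9: A is still a primary, owns no slots, and shares a shard with C.
print(node_a.role, len(node_a.slots), c_seen_by_a.shard_id == node_a.shard_id)
```

In this model Node A ends up as a slotless primary in the same shard as Node C, which matches the end state in step 9.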

**Expected behavior**

I would expect Node A to become a replica of Node C once it learns that Node C is in the same shard.

**Potential fix**

Currently, when Node A receives a ping from Node C, it [first processes the slots config](https://github.com/valkey-io/valkey/blob/8.1/src/cluster_legacy.c#L3652) and [then processes the shard_id in the ping extension](https://github.com/valkey-io/valkey/blob/8.1/src/cluster_legacy.c#L3701). One possible fix is to process the `shard_id` (or all the ping extensions) first and update the slots configuration afterwards.
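
A minimal sketch of that reordering, again as a toy model rather than a patch against `cluster_legacy.c`: the function name `handle_ping_extensions_first`, the dictionary fields, and the epoch values are hypothetical, and the same-shard rule for an emptied primary is an assumption based on the behaviour described in this report.

```python
import uuid


def handle_ping_extensions_first(me, sender_view, ping, allow_replica_migration=False):
    """Hypothetical reordering: record shard_id before the slot update."""
    # Ping extension first, so the shard comparison below sees the real value.
    sender_view["shard_id"] = ping["shard_id"]

    # Slot update: the sender claims the slots with a higher config epoch.
    if ping["config_epoch"] > me["config_epoch"]:
        lost = me["slots"] & ping["slots"]
        me["slots"] -= lost
        sender_view["slots"] |= lost
        same_shard = sender_view["shard_id"] == me["shard_id"]
        # Assumed rule: an emptied primary follows the sender only if the
        # sender is in the same shard or replica migration is allowed.
        if lost and not me["slots"] and (same_shard or allow_replica_migration):
            me["role"] = "replica"
            me["replica_of"] = sender_view["name"]


node_a = {"name": "A", "shard_id": "shard-1", "role": "primary",
          "config_epoch": 1, "slots": set(range(100)), "replica_of": None}

# Node A's view of C right after gossip: a random placeholder shard_id.
c_seen_by_a = {"name": "C", "shard_id": uuid.uuid4().hex,
               "config_epoch": 3, "slots": set()}

ping_from_c = {"config_epoch": 3, "slots": set(range(100)), "shard_id": "shard-1"}

handle_ping_extensions_first(node_a, c_seen_by_a, ping_from_c)
print(node_a["role"], node_a["replica_of"])
```

With the extension handled first, the placeholder shard ID is replaced before the slot comparison runs, so in this model Node A recognizes Node C as a member of its own shard and becomes its replica even with replica migration disabled.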

