Add a section about manual failover in HA #1231

Merged
merged 4 commits into from
Apr 14, 2025
41 changes: 23 additions & 18 deletions pages/clustering/high-availability.mdx
@@ -267,6 +267,27 @@ If there is already a MAIN instance in the cluster, this query will fail.

This operation will result in writing to the Raft log.

### Demote instance

An admin can use the demote instance query to demote the current MAIN to a REPLICA. In this case, the leader coordinator won't perform a failover; as a user,
you should promote one of the data instances to MAIN yourself using the `SET INSTANCE instanceName TO MAIN` query.

```plaintext
DEMOTE INSTANCE instanceName;
```

This operation will result in writing to the Raft log.

<Callout type="info">

By combining the `DEMOTE INSTANCE instanceName` and `SET INSTANCE instanceName TO MAIN` queries, you get a manual failover capability. This can be useful,
e.g., during maintenance work on the instance where the current MAIN is deployed.

</Callout>
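
Concretely, a manual failover can be sketched as the following two queries run on the leader coordinator (`instance1` and `instance2` are hypothetical instance names standing in for the current MAIN and the replica you want to promote):

```plaintext
DEMOTE INSTANCE instance1;
SET INSTANCE instance2 TO MAIN;
```

After the second query, `instance2` should start accepting writes, while the demoted `instance1` continues running as a REPLICA.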

### Unregister instance

There are various reasons that could lead to the decision that an instance needs to be removed from the cluster. The hardware can be broken,
@@ -284,22 +305,6 @@ operation cannot be guaranteed to succeed.

The instance requested to be unregistered will also be unregistered from the current MAIN's REPLICA set.
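
For illustration, and assuming the syntax follows the same pattern as the other instance-management queries in this section, unregistering looks like:

```plaintext
UNREGISTER INSTANCE instanceName;
```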

### Demote instance

Demoting instances should be done through the leader coordinator.

The demote instance query will result in several actions:
1. The coordinator instance will demote the data instance to REPLICA.
2. The coordinator instance will continue pinging the data instance every `--instance-health-check-frequency-sec` seconds to check its status.

In this case, the leader coordinator won't choose a new MAIN; as a user, you should choose one instance and promote it to MAIN using the `SET INSTANCE instanceName TO MAIN` query.

```plaintext
DEMOTE INSTANCE instanceName;
```

This operation will result in writing to the Raft log.


### Force reset cluster state

In case the cluster gets stuck, there is an option to force reset the cluster. You need to execute the command on the leader coordinator.
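
For reference, the force reset is also issued as a query on the leader coordinator; the syntax below assumes it follows the same conventions as the other cluster-management queries:

```plaintext
FORCE RESET CLUSTER STATE;
```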
@@ -502,7 +507,7 @@ the cluster:
Memgraph prioritizes availability over strict consistency (leaning towards AP in
the CAP theorem). While it aims to maintain consistency as much as possible, the
current failover logic can result in a non-zero Recovery Point Objective (RPO),
that is, data loss, because:
that is, the loss of committed data, because:
- The promoted MAIN might not have received all commits from the previous MAIN
before the failure.
- This design ensures that the MAIN remains writable for the maximum possible
@@ -604,7 +609,7 @@ Failure of a REPLICA data instance isn't very harmful since users can continue writing to MAIN and reading from REPLICAs. The most important thing to analyze is what happens when MAIN goes down. In that case, the leader coordinator uses
user-controllable parameters related to the frequency of health checks from the leader to replication instances (`--instance-health-check-frequency-sec`)
and the time needed to realize the instance is down (`--instance-down-timeout-sec`). After collecting enough evidence, the leader concludes the MAIN is down and performs failover
using just a handful of RPC messages (the exact time depends on the distance between instances). It is important to mention that the whole failover is performed with zero data loss
using just a handful of RPC messages (the exact time depends on the distance between instances). It is important to mention that the whole failover is performed without the loss of committed data
if the newly chosen MAIN (previously REPLICA) had all up-to-date data.
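
As an illustration, the two detection parameters mentioned above are passed as startup flags to the coordinator instances; the values below are hypothetical:

```plaintext
--instance-health-check-frequency-sec=1
--instance-down-timeout-sec=5
```

With these values, the leader pings each data instance every second and concludes the MAIN is down after it has been unreachable for roughly five seconds, at which point failover starts.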

The current deployment assumes the existence of only one datacenter, which automatically means that Memgraph won't be available in case the whole datacenter goes down. We are actively