Add TypeDB Cluster support and failover #765

Draft
wants to merge 35 commits into
base: cluster-support-feature-branch
Member

@farost farost commented Jun 26, 2025

Usage and product changes

Introduce TypeDB Cluster support, including:

Multiple addresses and address translation

Drivers can be created using a single address, multiple addresses, or an address translation (a mapping between public, user-facing addresses and internal cluster addresses, for deployments where the two differ). The address translation can be updated after the driver is created.
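
To illustrate the address translation idea, here is a minimal Python sketch. It is not the driver's API: the `AddressTranslation` class, its methods, and the example addresses are all hypothetical. It only shows a public-to-private mapping that can be swapped after creation.

```python
class AddressTranslation:
    """Hypothetical helper: maps public (user-facing) addresses to private cluster addresses."""

    def __init__(self, public_to_private):
        self._public_to_private = dict(public_to_private)

    def update(self, public_to_private):
        # Replace the whole mapping, mimicking an update after the driver is created.
        self._public_to_private = dict(public_to_private)

    def to_public(self, private_address):
        # Replicas know each other by private addresses; a client outside
        # the cluster network has to connect through the public one.
        for public, private in self._public_to_private.items():
            if private == private_address:
                return public
        raise KeyError(f"no public address known for {private_address}")

    def public_addresses(self):
        return list(self._public_to_private)


# Example addresses are made up for illustration.
translation = AddressTranslation({
    "typedb.example.com:11729": "10.0.0.1:1729",
    "typedb.example.com:21729": "10.0.0.2:1729",
})
```

With this mapping, a replica that reports itself as `10.0.0.2:1729` would be reached by the client at `typedb.example.com:21729`.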

Server and replica statuses

Server version and distribution, as well as the list of active replicas, can be retrieved using TypeDB Driver.

Multiple consistency levels for read requests

Read operations can be executed using one of three strategies:

  • Strong consistency level -- The strongest guarantee: reads are always up to date because they are served by the primary replica.
  • Eventual consistency level -- Allows stale reads from any replica, so results may not reflect the latest writes. (The server will not support this level in the first Cluster release, so such requests will probably return errors.)
  • Replica-dependent consistency level -- The operation is executed only against the specified replica address. This is especially useful for testing.

By default, all operations use the strong consistency level. Every read operation now also accepts an explicit consistency level; for read transactions, it is configured through transaction options.
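
The three levels can be pictured as a rule for choosing which replicas may serve a read. The Python sketch below is an assumption-laden model rather than the driver's implementation: the `Strong`, `Eventual`, `ReplicaDependent`, and `Replica` types and the `candidate_replicas` function are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Strong:
    """Served by the primary replica only."""


@dataclass(frozen=True)
class Eventual:
    """May be served by any replica; reads can be stale."""


@dataclass(frozen=True)
class ReplicaDependent:
    """Served only by the replica at `address`."""
    address: str


@dataclass(frozen=True)
class Replica:
    address: str
    is_primary: bool


def candidate_replicas(replicas, level=Strong()):
    """Return the replicas allowed to serve a read at the given level.

    The default mirrors the description above: strong consistency
    unless the caller asks for something else.
    """
    if isinstance(level, Strong):
        return [r for r in replicas if r.is_primary]
    if isinstance(level, Eventual):
        return list(replicas)
    if isinstance(level, ReplicaDependent):
        return [r for r in replicas if r.address == level.address]
    raise TypeError(f"unknown consistency level: {level!r}")


cluster = [
    Replica("10.0.0.1:1729", is_primary=True),
    Replica("10.0.0.2:1729", is_primary=False),
    Replica("10.0.0.3:1729", is_primary=False),
]
```

Calling `candidate_replicas(cluster)` yields only the primary, while `candidate_replicas(cluster, ReplicaDependent("10.0.0.3:1729"))` pins the read to a single secondary.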

Failover customization

Driver options can be used to configure the failover strategies used with strong and eventual consistency levels.

primary_failover_retries -- Limits the number of attempts to redirect a strongly consistent request to the new primary replica after a failure caused by a change of replica roles. Defaults to 1.
replica_discovery_attempts -- Limits the number of attempts the driver makes to find a working replica for an operation when a replica is unavailable. Each replica is tried at most once, so the driver performs at most:
- {limit} operations if the limit <= the number of replicas.
- {number of replicas} operations if the limit > the number of replicas.
- {number of replicas} operations if the limit is None.
Affects every eventually consistent operation, including redirect failover when the new primary replica is unknown. Defaults to None.
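
The bound described for replica_discovery_attempts reduces to "the limit, capped at the number of replicas". The Python sketch below shows that arithmetic and a discovery loop built on it. The function names and the use of `ConnectionError` to signal an unavailable replica are assumptions made for illustration, not the driver's actual code.

```python
from typing import Callable, Optional, Sequence, TypeVar

T = TypeVar("T")


def max_discovery_operations(replica_count: int, limit: Optional[int] = None) -> int:
    """Upper bound on operations performed while looking for a working replica."""
    if limit is None:
        # No limit: every replica may be tried once.
        return replica_count
    return min(limit, replica_count)


def run_on_first_working_replica(
    replicas: Sequence[str],
    operation: Callable[[str], T],
    replica_discovery_attempts: Optional[int] = None,
) -> T:
    """Try each replica at most once, stopping at the first success."""
    budget = max_discovery_operations(len(replicas), replica_discovery_attempts)
    last_error = None
    for address in replicas[:budget]:
        try:
            return operation(address)
        except ConnectionError as error:  # assumed signal for "replica unavailable"
            last_error = error
    raise ConnectionError("no working replica found") from last_error


def flaky_operation(address: str) -> str:
    # Hypothetical operation: only the third replica is reachable.
    if address != "10.0.0.3:1729":
        raise ConnectionError(f"{address} is unavailable")
    return f"ok from {address}"


addresses = ["10.0.0.1:1729", "10.0.0.2:1729", "10.0.0.3:1729"]
```

With no limit, `run_on_first_working_replica(addresses, flaky_operation)` reaches the third replica and succeeds; with `replica_discovery_attempts=2` it gives up after two replicas and raises `ConnectionError`.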

Implementation

Missing parts:

  • The runtime update of address translation is not fully implemented. It probably requires more work, or it can be dropped from the first release.
  • Testing cannot be automated without Cluster snapshots. Instead, for local testing, use temp-cluster-server (instructions below).
  • The driver itself works, and the tests for Core pass. However, its replication features are untested. There is a new clustering integration test that successfully boots up servers, but missing server functionality prevents further testing. Once the server PR referenced below is completed and merged, this test can be completed as well. More BDD tests should be added in typedb-behaviour#373 (Add cluster-specific steps).

This driver can be tested using the server from https://github.com/typedb/typedb-cluster/pull/608. To do so, run cargo build in typedb-cluster, then copy the resulting binary into typedb-driver, for example:

cp path-to-typedb/typedb-cluster/target/debug/typedb_server_bin path-to-typedb/typedb-driver/tool/test/temp-cluster-server/typedb

After that, test scripts such as ./tool/test/temp-cluster-server/start-cluster-servers.sh 3 can be used.
Note that temp-cluster-server is a temporary directory, created so that the server can be used until Cluster snapshots are available from Bazel.

@farost farost requested a review from lolski June 26, 2025 15:27
@farost farost changed the base branch from master to cluster-support-feature-branch June 26, 2025 15:28
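Diff context for the review comment below (the two `Driver *driver` lines are the same line as it appears on each side of the diff):

These pointers are then used for further operations:
```c
char* dbName = "hello";
Driver *driver = driver_open_core("127.0.0.1:1729");
Driver *driver = driver_new_core("127.0.0.1:1729");
```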
Member Author

Don't mind this for now
