| 1 | +# Asynchronous replication |
| 2 | + |
| 3 | +Asynchronous replication allows for synchronizing data between {{ ydb-short-name }} [databases](glossary.md#database) in near real time. It can also be used for data migration between databases with minimal downtime for applications interacting with these databases. Such databases can be located in the same {{ ydb-short-name }} [cluster](glossary.md#cluster) as well as in different clusters. |
| 4 | + |
| 5 | +## Overview {#how-it-works} |
| 6 | + |
| 7 | +Asynchronous replication is based on [Change Data Capture](cdc.md) and operates on logical data. The following diagram illustrates the replication process: |
| 8 | + |
| 9 | +```mermaid |
| 10 | +sequenceDiagram |
| 11 | + participant dst as Target |
| 12 | + participant src as Source |
| 13 | + |
| 14 | + dst-->src: Initialization |
| 15 | + dst->>dst: Creating an asynchronous replication instance |
| 16 | + dst->>src: Creating changefeeds |
| 17 | + dst->>dst: Creating replica objects |
| 18 | + |
| 19 | + dst-->src: Initial table scan |
| 20 | + loop |
| 21 | + dst->>src: Request to get data |
| 22 | + src->>dst: Data |
| 23 | + end |
| 24 | + |
| 25 | + dst-->src: Change data replication |
| 26 | + loop |
| 27 | + dst->>src: Request to get data |
| 28 | + src->>dst: Data |
| 29 | + end |
| 30 | +``` |
| 31 | + |
| 32 | +As shown in the diagram above, asynchronous replication involves two databases: |
| 33 | + |
| 34 | +1. **Source**. A database with [replicated objects](glossary.md#replicated-object). |
| 35 | +2. **Target**. A database where an [asynchronous replication instance](glossary.md#async-replication-instance) and [replica objects](glossary.md#replica-object) will be created. |
| 36 | + |
| 37 | +Asynchronous replication consists of the following stages: |
| 38 | + |
| 39 | +* [Initialization](#init) |
| 40 | +* [Initial table scan](#initial-scan) |
| 41 | +* [Change data replication](#replication-of-changes) |
| 42 | + |
| 43 | +### Initialization {#init} |
| 44 | + |
| 45 | +Initialization of asynchronous replication includes the following steps: |
| 46 | + |
| 47 | +* Creating an asynchronous replication instance on the target database using the [CREATE ASYNC REPLICATION](../yql/reference/syntax/create-async-replication.md) YQL expression. |
| 48 | +* Establishing a connection with the source database. The target database connects to the source using the [connection parameters](../yql/reference/syntax/create-async-replication.md#params) specified during the creation of the asynchronous replication instance. |
| 49 | + |
| 50 | +{% note info %} |
| 51 | + |
| 52 | +The user account that is used to connect to the source database must have the following [permissions](../security/short-access-control-notation.md#access-rights): |
| 53 | + |
| 54 | +* Read permissions for schema objects and directory objects |
| 55 | +* Create, update, delete, and read permissions for changefeeds |
| 56 | + |
| 57 | +{% endnote %} |
| 58 | + |
| 59 | +* The following objects are created for the replicated objects: |
| 60 | + * [changefeeds](glossary.md#changefeed) on the source |
| 61 | + * [replica objects](glossary.md#replica-object) on the target |
| 62 | + |
| 63 | +{% note info %} |
| 64 | + |
| 65 | +Replicas are created under the user account that was used to create the asynchronous replication instance. |
| 66 | + |
| 67 | +{% endnote %} |
| 68 | + |
| 69 | +### Initial table scan {#initial-scan} |
| 70 | + |
| 71 | +During the [initial table scan](cdc.md#initial-scan), the source data is exported to changefeeds. The target runs [consumers](topic.md#consumer) that read the source data from the changefeeds and write it to replicas. |
| 72 | + |
| 73 | +You can get the progress of the initial table scan from the [description](../reference/ydb-cli/commands/scheme-describe.md) of the asynchronous replication instance. |
| 74 | + |
| 75 | +### Change data replication {#replication-of-changes} |
| 76 | + |
| 77 | +After the initial table scan is completed, the consumers read the change data and write it to replicas. |
| 78 | + |
| 79 | +Each block of change data has a *creation time* ($created\_at$), and consumers record the *reception time* of that block ($received\_at$). The *replication lag* is the difference between the two: |
| 80 | + |
| 81 | +$$ |
| 82 | +replication\_lag = received\_at - created\_at |
| 83 | +$$ |
| 84 | + |
| 85 | +You can also get the replication lag from the [description](../reference/ydb-cli/commands/scheme-describe.md) of the asynchronous replication instance. |
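
The formula above can be illustrated with a minimal Python sketch. The function name and the timestamps are hypothetical and exist only for illustration; in practice the lag is reported in the description of the asynchronous replication instance.

```python
from datetime import datetime, timedelta, timezone


def replication_lag(created_at: datetime, received_at: datetime) -> timedelta:
    """Return received_at - created_at for one block of change data."""
    return received_at - created_at


# Hypothetical timestamps for a single block of change data.
created_at = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
received_at = created_at + timedelta(milliseconds=250)

lag = replication_lag(created_at, received_at)
print(lag.total_seconds())  # 0.25
```

Both timestamps are timezone-aware here so that the subtraction is well defined regardless of where the source and the target are located.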
| 86 | + |
| 87 | +## Restrictions {#restrictions} |
| 88 | + |
| 89 | +* The set of replicated objects is immutable and is generated when {{ ydb-short-name }} creates an asynchronous replication instance. |
| 90 | +* {{ ydb-short-name }} supports the following types of replicated objects: |
| 91 | + |
| 92 | + * [row-based tables](datamodel/table.md#row-oriented-tables) |
| 93 | + * [directories](datamodel/dir.md) |
| 94 | + |
| 95 | + {{ ydb-short-name }} will replicate all row-based tables that are located in the given directories and subdirectories at the time the asynchronous replication instance is created. |
| 96 | + |
| 97 | +* During asynchronous replication, you cannot [add or delete columns](../yql/reference/syntax/alter_table/columns.md) in the source tables. |
| 98 | +* During asynchronous replication, replicas are available only for reading. |
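
The directory rule above can be sketched as follows. This is only an illustration under stated assumptions: the scheme snapshot, the kind labels (`row_table`, `column_table`, `directory`), and the function name are hypothetical and do not correspond to any real API.

```python
def tables_to_replicate(scheme: dict[str, str], requested: list[str]) -> list[str]:
    """Select row-based tables named directly or located under a requested directory.

    `scheme` is a hypothetical snapshot (path -> kind) taken at the moment
    the asynchronous replication instance is created.
    """
    selected = []
    for path, kind in sorted(scheme.items()):
        if kind != "row_table":
            continue  # only row-based tables are replicated
        for root in requested:
            prefix = root.rstrip("/") + "/"
            if path == root or path.startswith(prefix):
                selected.append(path)
                break
    return selected


# Hypothetical scheme snapshot.
scheme = {
    "/Root/db/orders": "row_table",
    "/Root/db/archive": "directory",
    "/Root/db/archive/2023": "row_table",
    "/Root/db/archive/stats": "column_table",
    "/Root/db/archive_old": "row_table",
}

print(tables_to_replicate(scheme, ["/Root/db/archive"]))
```

Because the set is computed once from the snapshot, tables created in those directories later are not picked up, which matches the immutability restriction above.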
| 99 | + |
| 100 | +## Error handling during asynchronous replication {#error-handling} |
| 101 | + |
| 102 | +Possible errors during asynchronous replication can be grouped into the following classes: |
| 103 | + |
| 104 | +* **Temporary failures**, such as transport errors, system overload, etc. Requests will be resent until they are processed successfully. |
| 105 | +* **Critical errors**, such as access violation errors, schema errors, etc. Replication will be aborted, and the [description](../reference/ydb-cli/commands/scheme-describe.md) of the asynchronous replication instance will include the text of the error. |
| 106 | + |
| 107 | +{% note warning %} |
| 108 | + |
| 109 | +Currently, asynchronous replication that is aborted due to a critical error cannot be resumed. In this case, you must [drop](../yql/reference/syntax/drop-async-replication.md) the asynchronous replication instance and [create](../yql/reference/syntax/create-async-replication.md) a new one. |
| 110 | + |
| 111 | +{% endnote %} |
| 112 | + |
| 113 | +For more information about error classes and how to address them, refer to [Error handling](../reference/ydb-sdk/error_handling.md). |
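
The difference between the two error classes can be sketched in Python. This is not the actual replication implementation: the exception classes and the `run_with_retries` helper are hypothetical names that only model the behavior described above (temporary failures are retried, critical errors abort).

```python
import time


class TemporaryFailure(Exception):
    """Hypothetical stand-in for transport errors, system overload, etc."""


class CriticalError(Exception):
    """Hypothetical stand-in for access violation errors, schema errors, etc."""


def run_with_retries(request, delay_seconds: float = 0.0):
    """Resend `request` until it succeeds; critical errors propagate to the caller."""
    attempts = 0
    while True:
        attempts += 1
        try:
            return request(), attempts
        except TemporaryFailure:
            time.sleep(delay_seconds)  # wait before resending


def make_flaky(failures: int):
    """Build a request that fails `failures` times, then succeeds."""
    state = {"left": failures}

    def request():
        if state["left"] > 0:
            state["left"] -= 1
            raise TemporaryFailure("overloaded")
        return "ok"

    return request


print(run_with_retries(make_flaky(2)))  # ('ok', 3)
```

A `CriticalError` is deliberately not caught, so it ends the loop immediately, mirroring how replication is aborted and the error text is exposed in the description.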
| 114 | + |
| 115 | +## Asynchronous replication completion {#done} |
| 116 | + |
| 117 | +Completing asynchronous replication can be the final step of migrating data from one database to another. In this case, the client stops writing data to the source, waits until the replication lag reaches zero, and completes replication. After replication is completed, replicas become available for both reading and writing. You can then switch the load from the source database to the target database and complete the data migration. |
| 118 | + |
| 119 | +{% note info %} |
| 120 | + |
| 121 | +You cannot resume completed asynchronous replication. |
| 122 | + |
| 123 | +{% endnote %} |
| 124 | + |
| 125 | +{% note warning %} |
| 126 | + |
| 127 | +{{ ydb-short-name }} currently supports only **forced** completion of asynchronous replication, in which no additional checks are performed for data consistency, replication lag, etc. |
| 128 | + |
| 129 | +{% endnote %} |
| 130 | + |
| 131 | +To complete asynchronous replication, use the [ALTER ASYNC REPLICATION](../yql/reference/syntax/alter-async-replication.md) YQL expression. |
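
Because forced completion performs no checks, the client has to confirm that the lag has reached zero before completing replication. A minimal sketch of such a wait loop follows; `get_lag` is a hypothetical callable that would obtain the current lag, for example from the description of the asynchronous replication instance, and the sample values are made up.

```python
import time
from datetime import timedelta


def wait_for_zero_lag(get_lag, poll_seconds: float = 0.0, max_polls: int = 100) -> bool:
    """Poll `get_lag` until it reports zero lag; return False if it never does."""
    for _ in range(max_polls):
        if get_lag() <= timedelta(0):
            return True
        time.sleep(poll_seconds)
    return False


# Hypothetical lag samples observed after writes to the source have stopped.
samples = iter([timedelta(milliseconds=500), timedelta(milliseconds=120), timedelta(0)])
drained = wait_for_zero_lag(lambda: next(samples))
print(drained)  # True
```

Only after this check succeeds would the client complete replication and switch the load to the target database.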
| 132 | + |
| 133 | +## Dropping an asynchronous replication instance {#drop} |
| 134 | + |
| 135 | +When you drop an asynchronous replication instance: |
| 136 | + |
| 137 | +* Changefeeds are deleted from the source tables. |
| 138 | +* The source tables are unlocked, and you can add and delete columns again. |
| 139 | +* Optionally, all replicas are deleted. |
| 140 | +* The asynchronous replication instance is deleted. |
| 141 | + |
| 142 | +To drop an asynchronous replication instance, use the [DROP ASYNC REPLICATION](../yql/reference/syntax/drop-async-replication.md) YQL expression. |