## Pipeline lifecycle

After you deploy a pipeline, it goes through the following phases:
1. *Deploy* - When you deploy the pipeline, RDI first validates it before use.
Then, the [operator]({{< relref "/integrate/redis-data-integration/architecture#how-rdi-is-deployed">}}) creates and configures the collector and stream processor that will run the pipeline.
1. *Snapshot* - The collector starts the pipeline by creating a snapshot of the full
dataset. This involves reading all the relevant source data, transforming it and then
writing it into the Redis target. You should expect this phase to take minutes or
hours to complete if you have a lot of data.
1. *CDC* - Once the snapshot is complete, the collector starts listening for updates to
the source data. Whenever a change is committed to the source, the collector captures
it and adds it to the target through the pipeline. This phase continues indefinitely
unless you change the pipeline configuration.
1. *Update* - If you update the pipeline configuration, the operator applies it
to the collector and the stream processor. Note that the changes only affect newly captured
data unless you reset the pipeline completely. Once RDI has accepted the updates, the
pipeline returns to the CDC phase with the new configuration.
1. *Reset* - There are circumstances where you might want to rebuild the dataset
completely. For example, you might want to apply a new transformation to all the source
data or refresh the dataset if RDI is disconnected from the
source for a long time. In situations like these, you can *reset* the pipeline back
to the snapshot phase. When this is complete, the pipeline continues with CDC as usual
(see the CLI sketch after this list).
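
If you are working from the command line on a VM installation, you can observe and drive
these phases with the RDI CLI. The sketch below is illustrative rather than definitive:
it assumes the `redis-di` CLI is installed and that your version provides the `status`
and `reset` subcommands.

```bash
# Check which phase the pipeline is currently in (snapshot, CDC, etc.)
# and view processing statistics.
redis-di status

# Send the pipeline back to the snapshot phase to rebuild the full
# dataset; once the snapshot completes, CDC resumes as usual.
redis-di reset
```
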
## Using a pipeline
Follow the steps described in the sections below to prepare and run an RDI pipeline.
### 1. Prepare the source database
Before using the pipeline, you must first prepare your source database to use
the Debezium connector for *change data capture (CDC)*. See the
[Prepare source databases]({{< relref "/integrate/redis-data-integration/data-pipelines/prepare-dbs" >}})
section to learn how to do this.
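
The preparation steps depend on your database type. As a hedged illustration only
(PostgreSQL is assumed here, and `rdi_user` is a placeholder name), Debezium's
PostgreSQL connector needs logical replication enabled and a user with replication
privileges:

```bash
# Enable logical decoding so Debezium can stream row-level changes
# (PostgreSQL must be restarted for this setting to take effect).
psql -U postgres -c "ALTER SYSTEM SET wal_level = logical;"

# Create a dedicated account for the collector to connect with.
psql -U postgres -c "CREATE USER rdi_user WITH REPLICATION PASSWORD 'changeme';"
```
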
### 2. Set secrets

You should supply the source and target database credentials as secrets rather than
storing them in plain text. Use the `redis-di set-secret`
command (VM deployment) or the `rdi-secret.sh` script (K8s deployment) to set the secret value.
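
For example, on a VM deployment you might store the source credentials like this
(a sketch; the exact argument form may vary between RDI versions, so check
`redis-di set-secret --help`):

```bash
# Store the connection credentials as secrets so they never appear
# in plain text in config.yaml.
redis-di set-secret SOURCE_DB_USERNAME myuser
redis-di set-secret SOURCE_DB_PASSWORD mypassword
```
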
You can then refer to these secrets in the `config.yaml` file using the syntax `${SECRET_NAME}`
(the sample [config.yaml file]({{< relref "/integrate/redis-data-integration/data-pipelines/data-pipelines#the-configyaml-file" >}}) shows these secrets in use).
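
As an illustration of what such a reference might look like, here is a minimal,
incomplete `config.yaml` fragment; the field layout follows the general shape of the
sample file, and the host and database names are placeholders:

```yaml
sources:
  mysql:
    type: cdc
    connection:
      type: mysql
      host: mysql.example.com        # placeholder host
      port: 3306
      database: inventory            # placeholder database name
      user: ${SOURCE_DB_USERNAME}    # resolved from the stored secret
      password: ${SOURCE_DB_PASSWORD}
```
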
The table below lists all valid secret names. Note that the
username and password are required for the source and target, but the other
secrets are only relevant if you use TLS/mTLS connections. The certificate paths
contained in the secrets `SOURCE_DB_CACERT`, `SOURCE_DB_CERT`, and `SOURCE_DB_KEY`
must point to certificate files that RDI can access.
## Deploy a pipeline
When you have created your configuration, including the [jobs]({{< relref "/integrate/redis-data-integration/data-pipelines/transform-examples" >}}), you are
ready to deploy. Use [Redis Insight]({{< relref "/develop/tools/insight/rdi-connector" >}})
to configure and deploy pipelines for both VM and K8s installations.
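
On a VM installation, the RDI CLI may also be used to deploy from a terminal. A sketch,
assuming a `redis-di deploy` subcommand that takes the pipeline directory (the path here
is a placeholder):

```bash
# Validate and deploy the pipeline configuration (config.yaml plus
# any job files) from the given directory.
redis-di deploy --dir /opt/rdi/pipeline
```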