This document describes the options in RDI's `config.yaml` file in detail. See
12
+
[Configure data pipelines]({{< relref "/integrate/redis-data-integration/data-pipelines/data-pipelines" >}})
13
+
for more information about the role `config.yaml` plays in defining a pipeline.
14
+
15
+
## Note about fully-qualified table names
16
+
17
+
Throughout this document we use the format `<databaseName>.<tableName>` to refer to a fully-qualified table name. This format is actually the one used by MySQL, but for Oracle,
18
+
SQLServer, and PostgreSQL, you should use `<schemaName>.<tableName>` instead.
19
+
20
+
{{< note >}}You can specify the fully-qualified table name `<databaseName>.<tableName>` as
21
+
a regular expression instead of providing the full name of the `databaseName` and `tableName`.
22
+
{{< /note >}}
23
+
24
+
The example below shows the MySQL format specifying the desired columns and primary keys
25
+
for the `chinook.customer` and `chinook.employee` tables:
26
+
27
+
```yaml
tables:
  # Sync selected columns of a table and set its keys:
  chinook.customer:
    columns:
      - ID
      - FirstName
      - LastName
      - Company
      - Address
      - Email
    keys:
      - FirstName
      - LastName
  chinook.employee:
    columns:
      - ID
      - FirstName
      - LastName
      - ReportsTo
      - Address
      - City
      - State
    keys:
      - FirstName
      - LastName
```
54
+
11
55
## Top level objects
12
56
13
57
These objects define the sections at the root level of `config.yaml`.
@@ -74,62 +118,81 @@ See the Debezium documentation for more information about the specific connector
74
118
| `schema.exclude.list` | `string` | Oracle, PostgreSQL, SQLServer | An optional, comma-separated list of regular expressions that match names of schemas for which you do not want to capture changes. The connector captures changes in any schema whose name is not included in `schema.exclude.list`. Do not specify the `schemas` section if you are using the `schema.exclude.list` property to filter out schemas. |
75
119
| `table.exclude.list` | `string` | MariaDB, MySQL, Oracle, PostgreSQL, SQLServer | An optional, comma-separated list of regular expressions that match fully-qualified table identifiers for the tables that you want to exclude from being captured. The connector captures all tables that are not included in `table.exclude.list`. Do not specify the `tables` block in the configuration if you are using the `table.exclude.list` property to filter out tables. |
76
120
| `column.exclude.list` | `string` | MariaDB, MySQL, Oracle, PostgreSQL, SQLServer | An optional comma-separated list of regular expressions that match the fully-qualified names of columns that should be excluded from change event message values. Fully-qualified names for columns are of the form `schemaName.tableName.columnName`. Do not specify the `columns` block in the configuration if you are using the `column.exclude.list` property to filter out columns. |
77
-
|`snapshot.select.statement.overrides`|`string`| MariaDB, MySQL, Oracle, PostgreSQL, SQLServer |Specifies the table rows to include in a snapshot. Use this property if you want a snapshot to include only a subset of the rows in a table. This property affects snapshots only. It does not apply to events that the connector reads from the log. |
121
+
| `snapshot.select.statement.overrides` | `string` | MariaDB, MySQL, Oracle, PostgreSQL, SQLServer |Specifies the table rows to include in a snapshot. Use this property if you want a snapshot to include only a subset of the rows in a table. This property affects snapshots only. It does not apply to events that the connector reads from the log. See [Using custom queries in the initial snapshot](#custom-initial-query) below for more information. |
78
122
| `lob.enabled` | `string` | Oracle | Enables capturing and serialization of large object (CLOB, NCLOB, and BLOB) column values in change events.<br/>Default: `false`|
79
123
| `unavailable.value.placeholder` | Special | Oracle | Specifies the constant that the connector provides to indicate that the original value is unchanged and not provided by the database (the default is `__debezium_unavailable_value`). |
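
As a rough illustration of how the exclude properties in the table above might be set, the sketch below assumes they are placed under `advanced.source` in a source definition. The source name `mysql`, the table and column names, and the patterns are hypothetical, and the connection settings are omitted.

```yaml
sources:
  mysql: # Hypothetical source name
    advanced:
      source:
        # Skip these tables (hypothetical names); all other tables are captured.
        # Do not add a `tables` block when filtering this way.
        table.exclude.list: chinook.audit_log,chinook.tmp_.*
        # Leave one column (hypothetical) out of change event values.
        column.exclude.list: chinook.customer.Email
```

Each entry is a regular expression, so a single pattern such as `chinook.tmp_.*` can cover several tables.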
80
124
81
-
### Using queries in the initial snapshot (relevant for MySQL, Oracle, PostgreSQL and SQLServer)
125
+
### Using custom queries in the initial snapshot {#custom-initial-query}
82
126
83
-
- In case you want a snapshot to include only a subset of the rows in a table, you need to add the property `snapshot.select.statement.overrides` and add a comma-separated list of [fully-qualified table names](#fully-qualified-table-name). The list should include every table for which you want to add a SELECT statement.
127
+
{{< note >}}This section is relevant only for MySQL, Oracle, PostgreSQL, and SQLServer.
128
+
{{< /note >}}
84
129
85
-
-**For each table in the list above, add a further configuration property** that specifies the `SELECT` statement for the connector to run on the table when it takes a snapshot.
130
+
By default, the initial snapshot captures all rows from each table.
131
+
If you want the snapshot to include only a subset of the rows in a table, you can use a
132
+
custom `SELECT` statement to override the default and select only the rows you are interested in.
133
+
To do this, you must first specify the tables whose `SELECT` statement you want to override by adding a `snapshot.select.statement.overrides` property to the `source` section, set to a comma-separated list of [fully-qualified table names](#note-about-fully-qualified-table-names).
86
134
87
-
The specified `SELECT` statement determines the subset of table rows to include in the snapshot.
135
+
After the `snapshot.select.statement.overrides` list, you must then add another configuration property for each table in the list to specify the custom `SELECT` statement for that table.
136
+
The format of the property name depends on the database you are using:
88
137
89
-
Use the following format to specify the name of this `SELECT` statement property:
138
+
- For Oracle, SQLServer, and PostgreSQL, use: `snapshot.select.statement.overrides.<SCHEMA_NAME>.<TABLE_NAME>`
139
+
- For MySQL, use: `snapshot.select.statement.overrides.<DATABASE_NAME>.<TABLE_NAME>`
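
A minimal sketch of the two properties in the MySQL form is shown here, assuming they are placed under `advanced.source` of the source definition. The source name, the `chinook.customer` table, and the `WHERE` clause are illustrative only.

```yaml
sources:
  mysql: # Hypothetical source name
    advanced:
      source:
        # 1. List the tables whose snapshot query you want to override.
        snapshot.select.statement.overrides: chinook.customer
        # 2. Give the custom SELECT statement for each table in the list.
        snapshot.select.statement.overrides.chinook.customer: |
          SELECT ID, FirstName, LastName
          FROM chinook.customer
          WHERE Company IS NOT NULL
```

The second property name is the first property name followed by a dot and the fully-qualified table name, so each table in the list gets its own statement.
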
For example, with PostgreSQL, you would have a configuration like the following:
93
142
94
-
- Add the list of columns you want to include in the `SELECT` statement using fully-qualified names. Each column should be specified in the configuration as shown below:
- To capture all columns from a table, use empty curly braces `{}` instead of listing individual columns:
159
+
You must also add the list of columns you want to include in the custom `SELECT` statement using fully-qualified names under "sources.tables". Specify each column in the configuration as shown below:
160
+
161
+
```yaml
tables:
  schema_name.table_name: # For MySQL: use database_name.table_name
    columns:
      - column_name1 # Each column on a new line
      - column_name2
      - column_name3
```
169
+
170
+
If you want to capture all columns from a table, you can use empty curly braces `{}` instead of listing all the individual columns:
106
171
107
172
```yaml
tables:
  schema_name.table_name: {} # Captures all columns
```
111
176
112
-
### Example
113
-
114
-
To select the columns `CustomerId`, `FirstName` and `LastName` from `customer` table and join it with `invoice` table in order to get customers with total invoices greater than 8000, we need to add the following properties to the `config.yaml` file:
177
+
The example configuration below selects the columns `CustomerId`, `FirstName` and `LastName` from the `customer` table and joins it with the `invoice` table to select customers with total invoices greater than 8000:
### Form custom message key(s) for change event records
@@ -154,52 +217,7 @@ advanced:
154
217
- When specifying columns in the `keys` field, ensure that these same columns are also listed under the `columns` field in your configuration.
155
218
- There is no limit to the number of columns that can be used to create custom message keys. However, it’s best to use the minimum required number of columns to specify a unique key.
156
219
157
-
### Fully-qualified table name
158
-
159
-
In this document we refer to the fully-qualified table name as `<databaseName>.<tableName>`. This format is for MySQL database. For Oracle, SQLServer and Postgresql databases use `<schemaName>`.`<tableName>` instead.
{{< note >}}You can specify the fully-qualified table name `<databaseName>.<tableName>` as
167
-
a regular expression instead of providing the full name of the `databaseName` and `tableName`.
168
-
{{< /note >}}
169
-
170
-
### Examples
171
220
172
-
- The primary key of the tables `customer` and `employee` is `ID`.
173
-
174
-
To establish custom messages keys based on `FirstName` and `LastName` for the tables `customer` and `employee`, add the following block to the `config.yaml` file: