Commit d762fd6

Merge pull request #1577 from ilianiliev-redis/RDSC-3566-data-denormalisation-docs-improvement
RDSC-3566: Improving documentation for data denormalization
2 parents 15fb199 + 30df7bc commit d762fd6

1 file changed

+114
-14
lines changed

content/integrate/redis-data-integration/data-pipelines/data-denormalization.md

Lines changed: 114 additions & 14 deletions
@@ -25,19 +25,118 @@ at the expense of speed.
2525
A Redis cache, on the other hand, is focused on making *read* queries fast, so RDI provides data
2626
*denormalization* to help with this.
2727

28-
## Nest strategy
28+
## Joining one-to-one relationships
2929

30-
*Nesting* is the strategy RDI uses to denormalize many-to-one relationships in the source database.
31-
It does this by representing the
32-
parent object (the "one") as a JSON document with the children (the "many") nested inside a JSON map
33-
attribute in the parent. The diagram belows shows a nesting with the child objects in a map
34-
called `InvoiceLineItems`:
30+
You can join one-to-one relationships by having more than one job write to the same Redis key.
31+
32+
First, you must configure the parent entity to use `merge` as the `on_update` strategy.
33+
34+
```yaml
# jobs/customers.yaml
source:
  table: customers

output:
  - uses: redis.write
    with:
      data_type: json
      on_update: merge
```
45+
46+
Then, you can configure the child entity to write to the same Redis key as the parent entity. You can do this by using the `key` attribute in the `with` block of the job, as shown in this example:
47+
48+
```yaml
# jobs/addresses.yaml
source:
  table: addresses

transform:
  - uses: add_field
    with:
      field: customer_address
      language: jmespath
      # You can use the following JMESPath expression to create a JSON object and combine the address fields into a single object.
      expression: |
        {
          "street": street,
          "city": city,
          "state": state,
          "zip": zip
        }

output:
  - uses: redis.write
    with:
      data_type: json
      # We specify the key to write to the same key as the parent entity.
      key:
        expression: concat(['customers:id:', customer_id])
        language: jmespath
      on_update: merge
      mapping:
        # You can specify one or more fields to write to the parent entity.
        - customer_address: customer_address
```
80+
81+
The joined data will look like this in Redis:
82+
83+
```json
{
  "id": "1",
  "first_name": "John",
  "last_name": "Doe",
  "customer_address": {
    "street": "123 Main St",
    "city": "Anytown",
    "state": "CA",
    "zip": "12345"
  }
}
```
96+
97+
{{< note >}}
98+
If you don't set `merge` as the `on_update` strategy for all jobs targeting the same key, the entire parent record in Redis will be overwritten whenever any related record in the source database is updated. This will result in the loss of values written by other jobs.
99+
{{< /note >}}
100+
101+
When using this approach, you must ensure that the `key` expression in the child job matches the key expression in the parent job. If you use a different key expression, the child data will not be written to the same Redis key as the parent data.
102+
103+
In the example above, the `addresses` job uses the default key pattern to write to the same Redis key as the `customers` job. You can find more information about the default key pattern [here]({{< relref "/integrate/redis-data-integration/data-pipelines/transform-examples/redis-set-key-name" >}}).
104+
105+
You can also use custom keys for the parent entity, as long as you use the same key for all jobs that write to the same Redis key.
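
As a minimal sketch of the custom key case, the two job files below (separated by `---` for illustration only) both resolve to the same Redis key. The `cust:` prefix is a hypothetical pattern chosen for this example, and the sketch assumes `id` is the primary key column of `customers`. The `key` block reuses the `expression` and `language` attributes shown in the `addresses` job above.

```yaml
# jobs/customers.yaml (sketch: parent job with a custom key)
source:
  table: customers

output:
  - uses: redis.write
    with:
      data_type: json
      # Hypothetical custom key pattern: 'cust:' followed by the customer's ID.
      key:
        expression: concat(['cust:', id])
        language: jmespath
      on_update: merge
---
# jobs/addresses.yaml (sketch: child job writing to the same custom key)
source:
  table: addresses

# The transform block that builds customer_address is omitted here;
# it is the same as in the addresses example above.

output:
  - uses: redis.write
    with:
      data_type: json
      # Must resolve to the same key as the parent job: 'cust:' plus the customer's ID.
      key:
        expression: concat(['cust:', customer_id])
        language: jmespath
      on_update: merge
      mapping:
        - customer_address: customer_address
```

Both jobs keep `on_update: merge`, so neither job overwrites the fields written by the other.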
106+
107+
## Joining one-to-many relationships
108+
109+
To join one-to-many relationships, you can use the *Nesting* strategy.
110+
With this, the parent object (the "one") is represented as a JSON document with the children (the "many") nested inside it as a JSON map attribute. The diagram below shows a nesting with the child objects in a map called `InvoiceLineItems`:
35111

36112
{{< image filename="/images/rdi/ingest/nest-flow.webp" width="500px" >}}
37113

38-
You configure normalization with a `nest` block in the child entities' RDI job, as shown in this example:
114+
115+
To configure denormalization with nesting, you must first configure the parent entity to use JSON as the target data type. Add `data_type: json` to the parent job, as shown in the example below:
116+
117+
```yaml
# jobs/invoice.yaml
source:
  server_name: chinook
  schema: public
  table: Invoice

output:
  - uses: redis.write
    with:
      # Setting the data type to json ensures that the parent object will be created in a way that supports nesting.
      data_type: json
      # Important: do not set a custom key for the parent entity.
      # When a child object is nested under the parent, the parent key is calculated automatically
      # from the parent table name and the parent key field. A custom key on the parent would not
      # match the key used to write the child.
```
135+
136+
After you have configured the parent entity, you can then configure the child entities to be nested under it, based on their relation type. To do this, use the `nest` block, as shown in this example:
39137

40138
```yaml
139+
# jobs/invoice_line.yaml
41140
source:
42141
server_name: chinook
43142
schema: public
@@ -50,8 +149,9 @@ output:
50149
# server_name: chinook
51150
# schema: public
52151
table: Invoice
53-
nesting_key: InvoiceLineId # cannot be composite
54-
parent_key: InvoiceId # cannot be composite
152+
nesting_key: InvoiceLineId # the unique key in the composite structure under which the child data will be stored
153+
parent_key: InvoiceId
154+
child_key: InvoiceId # optional, if different from parent_key
55155
path: $.InvoiceLineItems # path must start from document root ($)
56156
structure: map # only map supported for now
57157
on_update: merge # only merge supported for now
@@ -67,17 +167,16 @@ The job must include the following attributes in the `nest` block:
67167
`schema` attributes. Note that this attribute refers to a Redis *key* that will be added to the target
68168
database, not to a table you can access from the pipeline. See [Using nesting](#using-nesting) below
69169
for the format of the key that is generated.
70-
- `nesting_key`: The field of the child entity that stores the unique ID (primary key) of the child entity.
71-
- `parent_key`: The field in the parent entity that stores the unique ID (foreign key) of the parent entity.
72-
- `child_key`: The field in the child entity that stores the unique ID (foreign key) of the parent entity.
73-
You only need to add this attribute if the name of the child's foreign key field is different from the parent's.
170+
- `nesting_key`: The unique key of each child entry in the JSON map that will be created under the path.
171+
- `parent_key`: The field in the parent entity that stores the unique ID (primary key) of the parent entity. This can't be a composite key.
172+
- `child_key`: The field in the child entity that stores the unique ID (foreign key) of the parent entity. You only need to add this attribute if the name of the child's foreign key field is different from the name of the parent's key field. This can't be a composite key.
74173
- `path`: The [JSONPath](https://goessner.net/articles/JsonPath/)
75174
for the map where you want to store the child entities. The path must start with the `$` character, which denotes
76175
the document root.
77176
- `structure`: (Optional) The type of JSON nesting structure for the child entities. Currently, only a JSON map
78177
is supported so if you supply this attribute then the value must be `map`.
79178

80-
## Using nesting
179+
### Using nesting
81180

82181
There are several important things to note when you use nesting:
83182

@@ -111,3 +210,4 @@ There are several important things to note when you use nesting:
111210
See the
112211
[Debezium PostgreSQL Connector Documentation](https://debezium.io/documentation/reference/connectors/postgresql.html#postgresql-replica-identity)
113212
for more information about this.
213+
- If you change the foreign key of a child object, the child object will be added to the new parent, but the old parent will not be updated. This is a known limitation of the current implementation and is subject to change in future versions.
