Commit 3f22a35

Improving documentation regarding full row format and operation codes
1 parent 92aff1a commit 3f22a35

2 files changed

+121
-3
lines changed

content/integrate/redis-data-integration/data-pipelines/data-pipelines.md

Lines changed: 7 additions & 3 deletions
@@ -318,13 +318,17 @@ The main sections of these files are:
318318
*transformation block* that will use the parameters supplied in the `with` section. See the
319319
[data transformation reference]({{< relref "/integrate/redis-data-integration/reference/data-transformation" >}})
320320
for more details about the supported transformation blocks, and also the
321-
[JMESPath custom functions]({{< relref "/integrate/redis-data-integration/reference/jmespath-custom-functions" >}}) reference. You can test your transformation logic using the [dry run]({{< relref "/integrate/redis-data-integration/reference/api-reference/#tag/secure/operation/job_dry_run_api_v1_pipelines_jobs_dry_run_post" >}}) feature in the API.
321+
[JMESPath custom functions]({{< relref "/integrate/redis-data-integration/reference/jmespath-custom-functions" >}}) reference. You can test your transformation logic using the [dry run]({{< relref "/integrate/redis-data-integration/reference/api-reference/#tag/secure/operation/job_dry_run_api_v1_pipelines_jobs_dry_run_post" >}}) feature in the API.
322322

323323
{{< note >}}If you set `row_format` to `full` under the `source` settings, you can access extra data from the
324324
change record in the transformation:
325-
- Use the expression `key.key` to get the generated Redis key as a string.
325+
- You can access the attributes of the key under the `key` object. For example, `key.id` gives you the value of the `id` column, as long as it is part of the primary key.
326326
- Use `before.<FIELD_NAME>` to get the value of a field *before* it was updated in the source database
327-
(the field name by itself gives you the value *after* the update).{{< /note >}}
327+
- Use `after.<FIELD_NAME>` to get the value of a field *after* it was updated in the source database
328+
- Use `after.<FIELD_NAME>` when adding new fields during transformations
329+
330+
For a more detailed explanation of the full format, see [Row Format]({{< relref "/integrate/redis-data-integration/data-pipelines/transform-examples/redis-row-format#full" >}}).
331+
{{< /note >}}
328332

329333
- `output`: This is a mandatory section to specify the data structure(s) that
330334
RDI will write to
Lines changed: 114 additions & 0 deletions
@@ -0,0 +1,114 @@
1+
---
2+
Title: Row Format
3+
aliases: /integrate/redis-data-integration/ingest/data-pipelines/transform-examples/redis-row-format/
4+
alwaysopen: false
5+
categories:
6+
- docs
7+
- integrate
8+
- rs
9+
- rdi
10+
description: null
11+
group: di
12+
linkTitle: Row Format
13+
summary: Explanation of the row formats supported by Redis Data Integration jobs.
14+
type: integration
15+
weight: 30
16+
---
17+
18+
19+
RDI pipelines support two row formats, which you can specify in the `source` section of the job file:
20+
21+
- `basic` - The default format, used when no `row_format` is specified. It contains only the current value of the row.
22+
- `full` - A full representation of the row, including the key, the before and after values, the operation code, and other metadata.
23+
24+
The `full` row format is useful when you want to access the metadata associated with the row, such as the operation code, the before and after values, etc.
25+
The structure of the data passed to the `transform` and `output` sections depends on the row format, so you must reference fields according to the format you chose.
26+
The following two examples demonstrate the difference between the two row formats.
27+
28+
## Default row format
29+
30+
With the default row format, the input value is a JSON object containing the current value of the row, and fields can be referenced directly by their name.
31+
32+
Usage example:
33+
34+
```yaml
source:
  table: addresses
transform:
  - uses: add_field
    with:
      field: city_state
      expression: concat([CITY, ', ', STATE])
      language: jmespath
  - uses: add_field
    with:
      field: op_code_value
      # Operation code is not available in standard row format
      # so the following expression will result in `op_code - None`
      expression: concat(['op_code', ' - ', opcode])
      language: jmespath
output:
  - uses: redis.write
    with:
      data_type: hash
      key:
        expression: concat(['addresses', '#', ID])
        language: jmespath
```
58+
59+
60+
## Full row format {#full}
61+
62+
With `row_format: full` the input value is a JSON object with the following structure:
63+
64+
- `key` - An object structure containing the attributes of the primary key. For example, `key.id` gives you the value of the `id` column, as long as it is part of the primary key.
65+
- `before` - An object structure containing the previous value of the row.
66+
- `after` - An object structure containing the current value of the row.
67+
- `opcode` - The operation code. Some databases use different values for the operation code; see the [operation code values]({{< relref "#operation-codes" >}}) section for more information.
68+
- `db` - The database name.
69+
- `table` - The table name.
70+
- `schema` - The schema name.
71+
72+
Note: The `db` and `schema` values are database-specific and may not be available for all databases. For example, with MySQL, `schema` is not available and `db` is the database name.
73+
74+
75+
Usage example:
76+
77+
```yaml
source:
  table: addresses
  row_format: full
transform:
  - uses: add_field
    with:
      # opcode is only available in full row format and can be used in the transformations
      field: after.op_code_value
      expression: opcode
      language: jmespath
  - uses: add_field
    with:
      field: after.city_state
      # Note that we need to use the `after` prefix to access the current value of the row
      # or `before` to access the previous value
      expression: concat([after.CITY, ', ', after.STATE])
      language: jmespath
output:
  - uses: redis.write
    with:
      data_type: hash
      key:
        # There are different ways to express the key
        # If the `ID` column is the primary key the following expressions
        # are equivalent - `key.ID`, `after.ID`, `values(key)[0]`
        expression: concat(['addresses-full', '#', values(key)[0]])
        language: jmespath
```
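
The `before` object can be read in the same way as `after`. The following sketch is not part of the original example: it stores the previous city next to the current one. The field name `previous_city` is hypothetical, and the sketch assumes that `before` is populated for the change being processed. For a newly created row there is presumably no previous value, so the expression would not return anything useful.

```yaml
source:
  table: addresses
  row_format: full
transform:
  - uses: add_field
    with:
      # `previous_city` is a hypothetical field name used for illustration
      field: after.previous_city
      # Read the value of CITY as it was before the change (assumes `before` is populated)
      expression: before.CITY
      language: jmespath
output:
  - uses: redis.write
    with:
      data_type: hash
      key:
        # Assumes `ID` is the primary key, as in the example above
        expression: concat(['addresses-full', '#', key.ID])
        language: jmespath
```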
106+
107+
## Operation code values {#operation-codes}
108+
109+
- r - Read (applies only to snapshots)
110+
- c - Create
111+
- u - Update
112+
- d - Delete
113+
- t - Truncate (PostgreSQL specific)
114+
- m - Message (PostgreSQL specific)
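
As an illustration of how these codes might be used, the sketch below flags rows that came from the initial snapshot by comparing `opcode` with `r`. It assumes `row_format: full`, since `opcode` is not available otherwise. The field name `from_snapshot` and the key prefix are hypothetical, and how the resulting boolean is stored in the target data type is not covered here.

```yaml
source:
  table: addresses
  row_format: full
transform:
  - uses: add_field
    with:
      # `from_snapshot` is a hypothetical field name used for illustration
      field: after.from_snapshot
      # JMESPath comparison: true when the record was read during a snapshot
      expression: opcode == 'r'
      language: jmespath
output:
  - uses: redis.write
    with:
      data_type: hash
      key:
        # Assumes `ID` is the primary key
        expression: concat(['addresses-full', '#', key.ID])
        language: jmespath
```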
