Commit 6e3bfe8

Merge pull request #1565 from ilianiliev-redis/improve-row-format-documentation
Improving documentation regarding full row format and operation codes
2 parents f539fc2 + 88dfc88 commit 6e3bfe8

2 files changed

+120
-3
lines changed

content/integrate/redis-data-integration/data-pipelines/data-pipelines.md

Lines changed: 7 additions & 3 deletions
@@ -336,13 +336,17 @@ The main sections of these files are:
336336
*transformation block* that will use the parameters supplied in the `with` section. See the
337337
[data transformation reference]({{< relref "/integrate/redis-data-integration/reference/data-transformation" >}})
338338
for more details about the supported transformation blocks, and also the
339-
[JMESPath custom functions]({{< relref "/integrate/redis-data-integration/reference/jmespath-custom-functions" >}}) reference. You can test your transformation logic using the [dry run]({{< relref "/integrate/redis-data-integration/reference/api-reference/#tag/secure/operation/job_dry_run_api_v1_pipelines_jobs_dry_run_post" >}}) feature in the API.
339+
[JMESPath custom functions]({{< relref "/integrate/redis-data-integration/reference/jmespath-custom-functions" >}}) reference. You can test your transformation logic using the [dry run]({{< relref "/integrate/redis-data-integration/reference/api-reference/#tag/secure/operation/job_dry_run_api_v1_pipelines_jobs_dry_run_post" >}}) feature in the API.
340340

341341
{{< note >}}If you set `row_format` to `full` under the `source` settings, you can access extra data from the
342342
change record in the transformation:
343-
- Use the expression `key.key` to get the generated Redis key as a string.
343+
- Use the `key` object to access the attributes of the key. For example, `key.id` will give you the value of the `id` column as long as it is part of the primary key.
344344
- Use `before.<FIELD_NAME>` to get the value of a field *before* it was updated in the source database
345-
(the field name by itself gives you the value *after* the update).{{< /note >}}
345+
- Use `after.<FIELD_NAME>` to get the value of a field *after* it was updated in the source database
346+
- Use `after.<FIELD_NAME>` as the target field name when you add new fields during transformations.
347+
348+
See [Row Format]({{< relref "/integrate/redis-data-integration/data-pipelines/transform-examples/redis-row-format#full" >}}) for a more detailed explanation of the full format.
349+
{{< /note >}}
346350

347351
- `output`: This is a mandatory section to specify the data structure(s) that
348352
RDI will write to
Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
1+
---
2+
Title: Row Format
3+
alwaysopen: false
4+
categories:
5+
- docs
6+
- integrate
7+
- rs
8+
- rdi
9+
description: null
10+
group: di
11+
linkTitle: Row Format
12+
summary: Explanation of the row formats supported by Redis Data Integration jobs.
13+
type: integration
14+
weight: 30
15+
---
16+
17+
18+
RDI pipelines support two row formats, which you can specify in the `source` section of the job file:
19+
20+
- `basic` - (Default) Contains the current value of the row only.
21+
- `full` - Contains all information available for the row, including the key, the before and after values, and the operation code.
22+
23+
The `full` row format is useful when you want to access the metadata associated with the row, such as the operation code, and the before and after values.
24+
The structure of the data passed to the `transform` and `output` sections is different depending on the row format you choose. Consider which row format you are using when you reference keys.
25+
The following two examples demonstrate the difference between the two row formats.
26+
27+
## Default row format
28+
29+
With the default row format, the input value is a JSON object containing the current value of the row, and fields can be referenced directly by their name.
30+
31+
Usage example:
32+
```yaml
source:
  table: addresses
transform:
  - uses: add_field
    with:
      field: city_state
      expression: concat([CITY, ', ', STATE])
      language: jmespath
  - uses: add_field
    with:
      field: op_code_value
      # Operation code is not available in standard row format
      # so the following expression will result in `op_code - None`
      expression: concat(['op_code', ' - ', opcode])
      language: jmespath
output:
  - uses: redis.write
    with:
      data_type: hash
      key:
        expression: concat(['addresses', '#', ID])
        language: jmespath
```
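
To make the shape of a default-format record concrete, here is a minimal Python sketch that models one as a plain dictionary. It is not RDI code: the column values are invented, and the f-strings only mimic what the JMESPath expressions in the job above would produce, assuming a missing field renders as `None`.

```python
# Hypothetical default-format record: only the current column values,
# with no key/before/after/opcode wrapper. Values are invented.
basic_record = {"ID": 42, "CITY": "Austin", "STATE": "TX"}

# Fields are referenced directly by name, as in concat([CITY, ', ', STATE]).
city_state = f"{basic_record['CITY']}, {basic_record['STATE']}"

# Mimics the key expression concat(['addresses', '#', ID]).
redis_key = f"addresses#{basic_record['ID']}"

# The record carries no metadata, so looking up `opcode` finds nothing.
op_code_value = f"op_code - {basic_record.get('opcode')}"

print(city_state, redis_key, op_code_value)
```

The point of the sketch is only that every name in the expressions resolves against the row itself, which is why `opcode` has no value here.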
57+
58+
59+
## Full row format {#full}
60+
61+
With `row_format: full` the input value is a JSON object with the following structure:
62+
63+
- `key` - An object containing the attributes of the primary key. For example, `key.id` will give you the value of the `id` column as long as it is part of the primary key.
64+
- `before` - An object containing the previous value of the row.
65+
- `after` - An object containing the current value of the row.
66+
- `opcode` - The operation code. Different databases use different values for the operation code. See [operation code values]({{< relref "#operation-codes" >}}) below for more information.
67+
- `db` - The database name.
68+
- `table` - The table name.
69+
- `schema` - The schema name.
70+
71+
Note: The `db` and `schema` fields are database-specific and may not be available in all databases. For example, MySQL doesn't use `schema` and uses `db` as the database name.
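
As an illustration of the structure listed above, the following Python sketch models a full-format record for an update as a nested dictionary. Only the top-level field names come from the list; the column names, values, and the `city_state` helper are hypothetical, and the code stands in for JMESPath lookups rather than calling RDI.

```python
# Hypothetical full-format change record for an update to `addresses`.
# Top-level keys follow the list above; all values are invented.
record = {
    "key": {"ID": 42},
    "before": {"ID": 42, "CITY": "Boston", "STATE": "MA"},
    "after": {"ID": 42, "CITY": "Austin", "STATE": "TX"},
    "opcode": "u",
    "db": "mydb",
    "table": "addresses",
    "schema": "public",
}


def city_state(row):
    """Mimic concat([CITY, ', ', STATE]) against one row image."""
    return f"{row['CITY']}, {row['STATE']}"


# Column values now live under `before` and `after`, not at the top level.
previous = city_state(record["before"])
current = city_state(record["after"])

# When ID is the primary key, these lookups return the same value.
key_by_name = record["key"]["ID"]
key_from_after = record["after"]["ID"]
key_first_value = list(record["key"].values())[0]

print(previous, "->", current, "| key:", key_by_name)
```

Because the column values are nested one level down, an expression written for the default format (for example a bare `CITY`) would not find the field in a record shaped like this.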
72+
73+
74+
Usage example:
75+
```yaml
source:
  table: addresses
  row_format: full
transform:
  - uses: add_field
    with:
      # opcode is only available in full row format and can be used in the transformations
      field: after.op_code_value
      expression: opcode
      language: jmespath
  - uses: add_field
    with:
      field: after.city_state
      # Note that we need to use the `after` prefix to access the current value of the row
      # or `before` to access the previous value
      expression: concat([after.CITY, ', ', after.STATE])
      language: jmespath
output:
  - uses: redis.write
    with:
      data_type: hash
      key:
        # There are different ways to express the key
        # If the `ID` column is the primary key the following expressions
        # are equivalent - `key.ID`, `after.ID`, `values(key)[0]`
        expression: concat(['addresses-full', '#', values(key)[0]])
        language: jmespath
```
105+
106+
## Operation code values {#operation-codes}
107+
108+
- r - Read (applies only to snapshots)
109+
- c - Create
110+
- u - Update
111+
- d - Delete
112+
- t - Truncate (PostgreSQL specific)
113+
- m - Message (PostgreSQL specific)
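
For code that consumes the `opcode` value downstream, a small lookup table can make the codes easier to read. The Python sketch below is a hypothetical helper, not part of RDI; it encodes only what the list above states.

```python
# Operation codes from the list above, mapped to readable names.
OPCODES = {
    "r": "read",
    "c": "create",
    "u": "update",
    "d": "delete",
    "t": "truncate",
    "m": "message",
}

# Codes the list marks as PostgreSQL specific.
POSTGRES_ONLY = {"t", "m"}


def describe_opcode(opcode):
    """Return a readable name for an operation code, or 'unknown'."""
    return OPCODES.get(opcode, "unknown")


def is_snapshot_read(opcode):
    """True for 'r', which the list says applies only to snapshots."""
    return opcode == "r"


def is_postgres_only(opcode):
    """True for codes the list marks as PostgreSQL specific."""
    return opcode in POSTGRES_ONLY


for code in ("r", "u", "t", "x"):
    print(code, describe_opcode(code), is_postgres_only(code))
```

Returning `"unknown"` for unlisted codes is a design choice of this sketch, made because the text notes that different databases use different values.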
