Merge pull request #1565 from ilianiliev-redis/improve-row-format-documentation

andy-stark-redis · web-flow · commit 6e3bfe8380cd · 2025-05-27T11:21:02.000+01:00
Improving documentation regarding full row format and operation codes
diff --git a/content/integrate/redis-data-integration/data-pipelines/data-pipelines.md b/content/integrate/redis-data-integration/data-pipelines/data-pipelines.md
@@ -336,13 +336,17 @@ The main sections of these files are:
   *transformation block* that will use the parameters supplied in the `with` section. See the 
   [data transformation reference]({{< relref "/integrate/redis-data-integration/reference/data-transformation" >}})
   for more details about the supported transformation blocks, and also the
-  [JMESPath custom functions]({{< relref "/integrate/redis-data-integration/reference/jmespath-custom-functions" >}}) reference. You can test your transformation logic using the [dry run]({{< relref "/integrate/redis-data-integration/reference/api-reference/#tag/secure/operation/job_dry_run_api_v1_pipelines_jobs_dry_run_post" >}}) feature in the API. 
+  [JMESPath custom functions]({{< relref "/integrate/redis-data-integration/reference/jmespath-custom-functions" >}}) reference. You can test your transformation logic using the [dry run]({{< relref "/integrate/redis-data-integration/reference/api-reference/#tag/secure/operation/job_dry_run_api_v1_pipelines_jobs_dry_run_post" >}}) feature in the API.
 
   {{< note >}}If you set `row_format` to `full` under the `source` settings, you can access extra data from the
   change record in the transformation:
-  - Use the expression `key.key` to get the generated Redis key as a string.
+  - Use the `key` object to access the attributes of the key. For example, `key.id` will give you the value of the `id` column as long as it is part of the primary key.
   - Use `before.<FIELD_NAME>` to get the value of a field *before* it was updated in the source database
-    (the field name by itself gives you the value *after* the update).{{< /note >}}
+  - Use `after.<FIELD_NAME>` to get the value of a field *after* it was updated in the source database.
+  - Use `after.<FIELD_NAME>` as the target field name when you add new fields during transformations.
+  
+  See [Row Format]({{< relref "/integrate/redis-data-integration/data-pipelines/transform-examples/redis-row-format#full" >}}) for a more detailed explanation of the full format.
+  {{< /note >}}
  
 - `output`: This is a mandatory section to specify the data structure(s) that
   RDI will write to
diff --git a/content/integrate/redis-data-integration/data-pipelines/transform-examples/redis-row-format.md b/content/integrate/redis-data-integration/data-pipelines/transform-examples/redis-row-format.md
@@ -0,0 +1,113 @@
+---
+Title: Row Format
+alwaysopen: false
+categories:
+- docs
+- integrate
+- rs
+- rdi
+description: null
+group: di
+linkTitle: Row Format
+summary: Explanation of the row formats supported by Redis Data Integration jobs.
+type: integration
+weight: 30
+---
+
+
+RDI pipelines support two row formats, which you can specify in the `source` section of the job file:
+
+- `basic` - (Default) Contains the current value of the row only.
+- `full` - Contains all information available for the row, including the key, the before and after values, and the operation code.
+
+The `full` row format is useful when you want to access the metadata associated with the row, such as the operation code and the before and after values.
+The structure of the data passed to the `transform` and `output` sections depends on the row format you choose, so check which row format you are using when you reference fields in expressions.
+The following two examples demonstrate the difference between the two row formats.
+
+## Default row format
+
+With the default row format, the input value is a JSON object containing the current value of the row, and fields can be referenced directly by their name.
+
+Usage example:
+
+```yaml
+source:
+  table: addresses
+transform:
+  - uses: add_field
+    with:
+      field: city_state
+      expression: concat([CITY, ', ', STATE])
+      language: jmespath
+  - uses: add_field
+    with:
+      field: op_code_value
+      # The operation code is not available in the default row format,
+      # so the following expression will result in `op_code - None`
+      expression: concat(['op_code', ' - ', opcode])
+      language: jmespath
+output:
+  - uses: redis.write
+    with:
+      data_type: hash
+      key:
+        expression: concat(['addresses', '#', ID])
+        language: jmespath
+```
+
+
+## Full row format {#full}
+
+With `row_format: full` the input value is a JSON object with the following structure:
+
+- `key` - An object containing the attributes of the primary key. For example, `key.id` will give you the value of the `id` column as long as it is part of the primary key.
+- `before` - An object containing the previous value of the row.
+- `after` - An object containing the current value of the row.
+- `opcode` - The operation code. Different databases use different values for the operation code. See [operation code values]({{< relref "#operation-codes" >}}) below for more information.
+- `db` - The database name.
+- `table` - The table name.
+- `schema` - The schema name. 
+ 
+Note: The `db` and `schema` fields are database-specific and may not be available in all databases. For example, MySQL doesn't use `schema` and uses `db` as the database name.
+
+
+Usage example:
+
+```yaml
+source:
+  table: addresses
+  row_format: full
+transform:
+  - uses: add_field
+    with:
+      # `opcode` is only available in the full row format and can be used in transformations
+      field: after.op_code_value
+      expression: concat(['op_code', ' - ', opcode])
+      language: jmespath
+  - uses: add_field
+    with:
+      field: after.city_state
+      # Note that we need to use the `after` prefix to access the current value of the row
+      # or `before` to access the previous value
+      expression: concat([after.CITY, ', ', after.STATE])
+      language: jmespath
+output:
+  - uses: redis.write
+    with:
+      data_type: hash
+      key:
+        # There are different ways to express the key
+        # If the `ID` column is the primary key the following expressions 
+        # are equivalent - `key.ID`, `after.ID`, `values(key)[0]`
+        expression: concat(['addresses-full', '#', values(key)[0]])
+        language: jmespath
+```
+
+## Operation code values {#operation-codes}
+
+- `r` - Read (applies only to snapshots)
+- `c` - Create
+- `u` - Update
+- `d` - Delete
+- `t` - Truncate (PostgreSQL specific)
+- `m` - Message (PostgreSQL specific)
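
Illustrative note (not part of the diff): the full row format described in the new page can be pictured as a nested record. The sketch below uses plain Python dictionaries rather than RDI or JMESPath, so it only approximates how the expressions in the YAML example resolve. The sample record, its values, and the helper names `redis_key`, `city_state`, and `describe_opcode` are hypothetical and chosen purely for illustration.

```python
# Hypothetical change record shaped like the `full` row format
# described in the new page (all values are made up for illustration).
full_record = {
    "key": {"ID": 42},
    "before": {"ID": 42, "CITY": "Austin", "STATE": "TX"},
    "after": {"ID": 42, "CITY": "Dallas", "STATE": "TX"},
    "opcode": "u",
    "db": "mydb",
    "table": "addresses",
    "schema": None,  # may be absent for some databases, e.g. MySQL
}

# Operation codes as listed in the "Operation code values" section.
OPCODES = {
    "r": "read (snapshot only)",
    "c": "create",
    "u": "update",
    "d": "delete",
    "t": "truncate (PostgreSQL specific)",
    "m": "message (PostgreSQL specific)",
}


def describe_opcode(code):
    """Return a human-readable label for an operation code."""
    return OPCODES.get(code, "unknown")


def redis_key(record, prefix="addresses-full"):
    """Approximate `concat(['addresses-full', '#', values(key)[0]])`."""
    first_key_value = list(record["key"].values())[0]
    return f"{prefix}#{first_key_value}"


def city_state(record):
    """Approximate `concat([after.CITY, ', ', after.STATE])`."""
    after = record["after"]
    return f"{after['CITY']}, {after['STATE']}"


if __name__ == "__main__":
    print(redis_key(full_record))
    print(city_state(full_record))
    print(describe_opcode(full_record["opcode"]))
```

When `ID` is the only primary key column, `key.ID`, `after.ID`, and `values(key)[0]` all refer to the same value in this sketch, which matches the equivalence stated in the YAML comment in the diff.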