7. [delta live table pipeline migration](/docs/process#delta-live-table-pipeline-migration-process)
8. [final details](#final-details)

The migration process can be schematically visualized as:

---

```text
databricks labs ucx revert-migrated-tables --schema X --table Y [--delete-managed]
```

The [`revert-migrated-tables` command](/docs/reference/commands#revert-migrated-tables) drops the Unity Catalog table or view and resets the `upgraded_to` property on the source object. Use this command to allow a table or view to be migrated again.

## Code Migration

After you're done with the [table migration](#table-migration-process) and after investigating the code linter advice, the code can be migrated. We recommend that you:

- Use the [`migrate-` commands](/docs/reference/commands#code-migration-commands) to migrate resources (see the sketch below).
- Set the [default catalog](https://docs.databricks.com/en/catalogs/default.html) to Unity Catalog.

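As a minimal sketch, two of the `migrate-` commands can be invoked from the ucx CLI as follows; the exact set of available `migrate-` commands depends on your ucx version:

```text
$ databricks labs ucx migrate-local-code
$ databricks labs ucx migrate-dbsql-dashboards
```
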
## Delta Live Table Pipeline Migration Process

> You are required to complete the [assessment workflow](/docs/reference/workflows#assessment-workflow) before starting the pipeline migration workflow.
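
If the assessment has not been run yet, it can be triggered from the ucx CLI. A minimal sketch, assuming your ucx version provides the `ensure-assessment-run` command:

```text
$ databricks labs ucx ensure-assessment-run
```
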
The pipeline migration process is a workflow that clones the Hive Metastore (HMS) Delta Live Table (DLT) pipelines to Unity Catalog.

Upon the first update, the cloned pipeline copies over all the data and checkpoints, and then runs normally thereafter. After the cloned pipeline reaches ‘RUNNING’, both the original and the cloned pipeline can run independently.

### Example:

Suppose the existing HMS DLT pipeline is called "dlt_pipeline". During migration, the pipeline is stopped and renamed to "dlt_pipeline [OLD]", and the new cloned pipeline takes the name "dlt_pipeline".

### Known issues and limitations:

- Only clones from HMS to UC are supported.
- Pipelines may only be cloned within the same workspace.
- HMS pipelines must currently be publishing tables to some target schema.
- Only the following streaming sources are supported:
  - Autoloader. If your pipeline uses Autoloader with file notification events, do not run the original HMS pipeline after cloning, as this will cause some file notification events to be dropped from the UC clone. If the HMS original was started accidentally, missed files can be backfilled by using the `cloudFiles.backfillInterval` option in Autoloader (see the sketch after this list).
  - Kafka, where `kafka.group.id` is not set
  - Kinesis, where `consumerMode` is not "efo"
- [Maintenance](https://docs.databricks.com/en/delta-live-tables/index.html#maintenance-tasks-performed-by-delta-live-tables) is automatically paused (for both pipelines) while migration is in progress.
- If an Autoloader source specifies an explicit `cloudFiles.schemaLocation`, `mergeSchema` needs to be set to `true` for the HMS original and the UC clone to operate concurrently.
- Pipelines that publish tables to custom schemas are not supported.
- On tables cloned to UC, time travel queries are undefined when querying by timestamp to versions originally written on HMS. Time travel queries by version will work correctly, as will time travel queries by timestamp to versions written on UC.
- [All existing limitations](https://docs.databricks.com/en/delta-live-tables/unity-catalog.html#limitations) of using DLT on UC apply.
- If tables in the HMS pipeline specify storage locations (using the `path` parameter in Python or the `LOCATION` clause in SQL), the configuration `pipelines.migration.ignoreExplicitPath` can be set to `true` to ignore the parameter in the cloned pipeline.

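Where the Autoloader limitations above apply, the relevant options are set on the stream source. A minimal sketch, assuming a hypothetical JSON source and schema location; only the `cloudFiles.schemaLocation`, `mergeSchema`, and `cloudFiles.backfillInterval` options come from the list above, and `spark` is the ambient Databricks session:

```python
# Minimal Autoloader source sketch; paths and option values are hypothetical.
df = (
    spark.readStream.format("cloudFiles")
    .option("cloudFiles.format", "json")
    # Explicit schema location: mergeSchema must then be true for the HMS
    # original and the UC clone to operate concurrently.
    .option("cloudFiles.schemaLocation", "/tmp/schemas/events")
    .option("mergeSchema", "true")
    # Backfill interval lets the UC clone pick up files whose notification
    # events were dropped because the HMS original ran after cloning.
    .option("cloudFiles.backfillInterval", "1 day")
    .load("s3://my-bucket/events/")
)
```
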
### Considerations

- Do not edit the notebooks that define the pipeline during cloning.
- The original pipeline should not be running when requesting the clone.
- When a clone is requested, DLT automatically starts an update to migrate the existing data and metadata for streaming tables, allowing them to pick up where the original pipeline left off.
- It is expected that the update metrics do not include the migrated data.
- Make sure all name-based references in the HMS pipeline are fully qualified, e.g. `hive_metastore.schema.table` (see the sketch after this list).
- After the UC clone reaches RUNNING, both the original pipeline and the cloned pipeline may run independently.

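A minimal sketch of a fully qualified reference inside a DLT notebook; the schema and table names are hypothetical, and `spark` is the ambient Databricks session:

```python
import dlt

@dlt.table(comment="Hypothetical table illustrating a fully qualified HMS reference")
def orders_enriched():
    # Fully qualify the HMS source (catalog.schema.table) so the reference
    # resolves identically in the original pipeline and in the UC clone.
    return spark.read.table("hive_metastore.sales.orders")
```
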
## Final details

Once you're done with the [code migration](#code-migration), you can run the:

---

`docs/ucx/docs/reference/commands/index.mdx`

It takes a `WorkspaceClient` object and `from` and `to` parameters as parameters to the `TableMove` class. This command is useful for developers and administrators who want to create an alias for a table. It can also be used to debug issues related to table aliasing.

## Pipeline migration commands

These commands are for [pipeline migration](/docs/process#delta-live-table-pipeline-migration-process) and require the [assessment workflow](/docs/reference/workflows#assessment-workflow) to be completed.

### `migrate-dlt-pipelines`

```text
$ databricks labs ucx migrate-dlt-pipelines [--include-pipeline-ids <comma separated list of pipeline ids>] [--exclude-pipeline-ids <comma separated list of pipeline ids>]
```
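
For example, to migrate only specific pipelines (the pipeline id below is hypothetical):

```text
$ databricks labs ucx migrate-dlt-pipelines --include-pipeline-ids 1234abcd-5678-90ef-aaaa-bbbbccccdddd
```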