[FEATURE]: Migrate tables in unsupported filesystem #1065

@qziyuan

Is there an existing issue for this?

  • I have searched the existing issues

Problem statement

External tables stored in adl:// and wasbs:// locations will be crawled and marked with What.EXTERNAL_NO_SYNC.
We need additional What enum values to differentiate the following scenarios:

  • Hiveserde tables, such as ParquetHiveSerDe tables, which cannot be migrated with SYNC but can be migrated in place by creating a UC table with a supported data source (for example, create external table ... using parquet ... location)
  • Hiveserde tables that have to be migrated using CTAS
  • Tables in unsupported filesystems like adl:// and wasbs://. These require either:
    • migrating the storage to ADLS Gen2 first and updating the HMS table location, then migrating to UC,
    • or deep cloning or CTAS-ing the table into UC.
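
To make the three migration paths above concrete, here is a rough sketch of the SQL each one might issue. The helper names and the table names in the usage lines are hypothetical and are not part of UCX; the statements only illustrate the in-place, CTAS, and deep clone options named in the list, and assume the identifiers and location are already validated.

```python
def in_place_sql(uc_table: str, data_source: str, location: str) -> str:
    """In-place migration: register the existing files as a UC external table."""
    return f"CREATE EXTERNAL TABLE {uc_table} USING {data_source} LOCATION '{location}'"


def ctas_sql(uc_table: str, hms_table: str) -> str:
    """CTAS migration: copy the rows of the HMS table into a new UC table."""
    return f"CREATE TABLE {uc_table} AS SELECT * FROM {hms_table}"


def deep_clone_sql(uc_table: str, hms_table: str) -> str:
    """Deep clone migration: clone the source table's data into a new UC table."""
    return f"CREATE TABLE {uc_table} DEEP CLONE {hms_table}"


# Hypothetical table names, for illustration only.
print(in_place_sql("main.sales.orders", "PARQUET", "abfss://data@acct.dfs.core.windows.net/orders"))
print(ctas_sql("main.sales.orders", "hive_metastore.sales.orders"))
print(deep_clone_sql("main.sales.orders", "hive_metastore.sales.orders"))
```

Which path applies to a given table is exactly what the extra What values would need to encode.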

Proposed Solution

  • Add more What enum values.
  • Discuss the strategy of how to migrate those tables in the future.
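
As a starting point for that discussion, a minimal sketch of what the extended enum could look like. Only EXTERNAL_NO_SYNC comes from this issue; the other member names, the `classify` helper, and its parameters are assumptions made for illustration, and the serde set lists only the one serde named above. The helper is meant for tables that are currently marked EXTERNAL_NO_SYNC.

```python
from enum import Enum, auto
from typing import Optional


class What(Enum):
    EXTERNAL_NO_SYNC = auto()        # existing value named in this issue
    HIVESERDE_IN_PLACE = auto()      # hypothetical: recreate in UC over the same location
    HIVESERDE_CTAS = auto()          # hypothetical: must be copied with CTAS
    UNSUPPORTED_FILESYSTEM = auto()  # hypothetical: adl:// or wasbs:// storage


UNSUPPORTED_SCHEMES = ("adl://", "wasbs://")
# Assumption: only the serde mentioned in this issue is treated as in-place capable.
IN_PLACE_SERDES = {"ParquetHiveSerDe"}


def classify(location: str, table_format: str, serde: Optional[str] = None) -> What:
    """Pick a finer-grained What value for a table that cannot be SYNCed."""
    # The filesystem check comes first: the storage has to be dealt with
    # regardless of the table format.
    if location.lower().startswith(UNSUPPORTED_SCHEMES):
        return What.UNSUPPORTED_FILESYSTEM
    if table_format.upper() == "HIVE":
        # Compare on the short class name, e.g. "...parquet.serde.ParquetHiveSerDe".
        if serde is not None and serde.rsplit(".", 1)[-1] in IN_PLACE_SERDES:
            return What.HIVESERDE_IN_PLACE
        return What.HIVESERDE_CTAS
    return What.EXTERNAL_NO_SYNC
```

Checking the filesystem before the serde is a design choice in this sketch: a Hiveserde table on wasbs:// would land in the unsupported-filesystem bucket, which may or may not be the desired precedence.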

Additional Context

Related issue:

Labels

  • cloud/azure: issues related to Azure
  • feat/workflow: triggered as a Databricks Job managed by UCX
  • migrate/managed: go/uc/upgrade Upgrade Managed Tables and Jobs