Open
Labels
- cloud/azure: issues related to Azure
- feat/workflow: triggered as a Databricks Job managed by UCX
- migrate/managed: go/uc/upgrade Upgrade Managed Tables and Jobs
Description
Is there an existing issue for this?
- I have searched the existing issues
Problem statement
External tables stored in `adl://` and `wasbs://` are crawled and marked with `What.EXTERNAL_NO_SYNC`. The `What` enum needs more members to differentiate the following scenarios:
- Hiveserde tables, such as `ParquetHiveSerDe`, which cannot be migrated with `SYNC` but can be migrated in place by creating a UC table with a supported data source (for example, `create external table ... using parquet ... location`).
- Hiveserde tables that have to be migrated using CTAS.
- Tables in an unsupported filesystem such as `adl://` and `wasbs://`. These require either:
  - migrating the storage to ADLS Gen2 first, updating the HMS table location, and then migrating to UC, or
  - deep cloning or CTAS-ing the table into UC.
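To make the three scenarios concrete, here is a minimal Python sketch of how a crawled table could be sorted into them. Everything in it is an assumption for illustration: `MigrationScenario`, its member names, `IN_PLACE_SERDES`, and `classify` are hypothetical and are not existing UCX code, and the check for a Hiveserde table (`table_format == "HIVE"`) is a guess at how the crawler might expose it.

```python
from enum import Enum, auto
from typing import Optional
from urllib.parse import urlparse


class MigrationScenario(Enum):
    """Hypothetical categories; these are NOT actual members of UCX's `What` enum."""
    HIVESERDE_IN_PLACE = auto()      # serde maps to a supported data source
    HIVESERDE_CTAS = auto()          # serde has no in-place equivalent
    UNSUPPORTED_FILESYSTEM = auto()  # adl:// or wasbs:// location
    OTHER = auto()                   # anything not covered by this issue


# URI schemes named in this issue as unsupported.
UNSUPPORTED_SCHEMES = {"adl", "wasbs"}

# Assumed mapping from serde class (simple name) to a supported data source.
# Only ParquetHiveSerDe is mentioned in the issue.
IN_PLACE_SERDES = {"ParquetHiveSerDe": "parquet"}


def classify(location: str, table_format: str, serde: Optional[str] = None) -> MigrationScenario:
    """Pick a scenario from the table location, format, and serde class name."""
    scheme = urlparse(location).scheme.lower()
    # The filesystem check comes first: an in-place migration cannot help
    # if the storage itself is unsupported.
    if scheme in UNSUPPORTED_SCHEMES:
        return MigrationScenario.UNSUPPORTED_FILESYSTEM
    if table_format.upper() == "HIVE":
        simple_name = (serde or "").rsplit(".", 1)[-1]
        if simple_name in IN_PLACE_SERDES:
            return MigrationScenario.HIVESERDE_IN_PLACE
        return MigrationScenario.HIVESERDE_CTAS
    return MigrationScenario.OTHER
```

Checking the filesystem before the serde is a design choice in this sketch: a Hiveserde table stored on `adl://` is reported as an unsupported-filesystem case, since the storage has to be dealt with either way.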
Proposed Solution
- Add more members to the `What` enum.
- Discuss the strategy for migrating those tables in the future.
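As a starting point for the strategy discussion, the migration options named in the problem statement can be written as SQL templates. This is only a sketch: the helper names are hypothetical, the table names and location are supplied by the caller, and whether each statement is accepted for a given source table has not been verified here.

```python
def in_place_sql(target: str, data_source: str, location: str) -> str:
    """In-place migration: a UC external table over the existing files."""
    return f"CREATE EXTERNAL TABLE {target} USING {data_source} LOCATION '{location}'"


def ctas_sql(target: str, source: str) -> str:
    """CTAS migration: copy the data into a new UC table."""
    return f"CREATE TABLE {target} AS SELECT * FROM {source}"


def deep_clone_sql(target: str, source: str) -> str:
    """Deep clone migration: clone the source table into UC."""
    return f"CREATE TABLE {target} DEEP CLONE {source}"
```

For example, `ctas_sql("main.sales.orders", "hive_metastore.sales.orders")` yields a statement that copies the HMS table into a UC table of the same name; both names are placeholders.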
Additional Context
Related issues:
- Report on ADLSv1 and Azure Blob Storage as `EXTERNAL` table sources don't work with UC #355, which reports unsupported tables in the dashboard.
- [BUG]: Migrate UC External Location should skip unsupported filesystem #1064