Crawlers: append snapshots to history journal, if available #2743

Merged · 89 commits · Oct 23, 2024

Commits (89)
15dd48d
Introduce an optional history log, where crawler snapshots are journa…
asnare Sep 25, 2024
625f16a
Merge branch 'main' into crawler-snapshot-history
asnare Sep 25, 2024
622792e
Switch to integer identifiers for run_id and snapshot_id.
asnare Sep 25, 2024
3d31a1f
Merge branch 'main' into crawler-snapshot-history
asnare Sep 25, 2024
4ebf9b6
Update to store object data with first-level attributes exposed as a …
asnare Sep 25, 2024
d82e06b
Use a 56-bit random number for the snapshot_id.
asnare Sep 25, 2024
4d676bb
Switch to composite object identifier.
asnare Sep 25, 2024
cf48771
Formatting.
asnare Sep 25, 2024
e877532
Modify TODO to not trigger linter.
asnare Sep 25, 2024
b321903
Unit tests for crawler appending new snapshots to the history.
asnare Sep 25, 2024
02fc40f
Merge branch 'main' into crawler-snapshot-history
asnare Sep 30, 2024
6170a63
Ensure call-by-keyword, and indicate the return type.
asnare Sep 30, 2024
4b55625
Fix unit test.
asnare Sep 30, 2024
155cda7
Merge branch 'main' into crawler-snapshot-history
asnare Oct 9, 2024
260becf
Back out changes to the crawler that relate to the history.
asnare Oct 9, 2024
0efb3e5
Merge branch 'main' into crawler-snapshot-history
asnare Oct 9, 2024
6e649af
Merge branch 'main' into crawler-snapshot-history
asnare Oct 9, 2024
a2dd1d5
Merge branch 'main' into crawler-snapshot-history
asnare Oct 14, 2024
f897d98
Merge branch 'main' into crawler-snapshot-history
asnare Oct 15, 2024
4a5321c
Replace initial history record conversion and logging with a new vers…
asnare Oct 16, 2024
76aeefb
Merge branch 'main' into crawler-snapshot-history
asnare Oct 16, 2024
e321589
Mark inner dataclasses in tests as immutable, so they are safe to use…
asnare Oct 16, 2024
4e3e3c8
Fix type hint.
asnare Oct 16, 2024
616b0f1
Mark test class as immutable.
asnare Oct 16, 2024
a52720d
Unit tests for the failures[] mechanism.
asnare Oct 16, 2024
6f38192
When encoding a string field, also handle it being optional.
asnare Oct 16, 2024
af1fff2
Fix comparison.
asnare Oct 16, 2024
724abb2
Unit tests for the history log.
asnare Oct 16, 2024
dd73486
Remove dead code.
asnare Oct 16, 2024
31740dd
Merge branch 'main' into crawler-snapshot-history
asnare Oct 16, 2024
00d5d85
Type hint.
asnare Oct 16, 2024
57f63ba
Test detection of naive timestamps during encoding.
asnare Oct 16, 2024
3aa2431
Rename test argument to avoid shadowing a global.
asnare Oct 16, 2024
4c86ba6
Unit test for handling unserializable values.
asnare Oct 16, 2024
5c27d8b
Merge branch 'main' into crawler-snapshot-history
asnare Oct 17, 2024
39f1105
Inline trivial method.
asnare Oct 17, 2024
1d50ce1
Update error message on unserializable value to provide more context.
asnare Oct 17, 2024
d431322
Merge branch 'main' into crawler-snapshot-history
asnare Oct 17, 2024
95d2a48
Merge branch 'main' into crawler-snapshot-history
asnare Oct 17, 2024
4ce4ce4
Rename for consistency.
asnare Oct 17, 2024
7441077
Update HistoryLog initializer: it doesn't need a workspace client.
asnare Oct 18, 2024
07bc711
Update to object_id support to allow properties as well as fields.
asnare Oct 18, 2024
3b3e5bd
Tweak type signature: the snapshot to append can be any iterable type.
asnare Oct 18, 2024
a9dd77f
Unit tests for logging Table records into the history log.
asnare Oct 18, 2024
0f9a4cd
Update Grants to support logging into the history log.
asnare Oct 18, 2024
ad482b8
Update the migration progress workflow to log tables and grants to th…
asnare Oct 18, 2024
fd2b3ac
Unit tests for a migration-progress task that wasn't covered yet.
asnare Oct 18, 2024
9d4ed60
Merge branch 'main' into crawler-snapshot-history
asnare Oct 18, 2024
3703848
Support classes whose failures attribute is a string (containing JSON…
asnare Oct 18, 2024
62390b5
Ensure updated UDF snapshots are logged to the history table.
asnare Oct 18, 2024
c1fea83
Naming consistency.
asnare Oct 18, 2024
8c70541
Fix copypasta.
asnare Oct 18, 2024
6f41fc0
Ensure updated JobInfo snapshots are appended to the history log.
asnare Oct 18, 2024
927392a
Fix type hints.
asnare Oct 18, 2024
78f592d
Ensure updated ClusterInfo snapshots are stored in the history table.
asnare Oct 18, 2024
6dfed4c
Fix some test names.
asnare Oct 18, 2024
34b1e72
Ensure updated PipelineInfo snapshots are appended to the history table.
asnare Oct 18, 2024
77a06e1
Ensure updated Cluster Policy snapshots are logged to the history table.
asnare Oct 18, 2024
ac60675
Formatting.
asnare Oct 18, 2024
f00a06e
Fix docstring.
asnare Oct 18, 2024
4d6989c
Ensure that updated TableMigrationStatus snapshots are appended to th…
asnare Oct 18, 2024
17100d5
Update query to return at most a single record.
asnare Oct 21, 2024
78a9a13
Update integration test to verify that the history log is written to.
asnare Oct 21, 2024
21ad44d
Update history log to write to multiworkspace.historical instead of u…
asnare Oct 21, 2024
febc9ce
Ensure all tasks run on a UC-enabled cluster.
asnare Oct 21, 2024
383027d
Merge branch 'main' into crawler-snapshot-history
asnare Oct 21, 2024
e223612
Formatting.
asnare Oct 21, 2024
c938dd8
Split the crawling and history-log update across 2 tasks for the tabl…
asnare Oct 21, 2024
8a378dc
Note a limitation of the fast-scan table crawler.
asnare Oct 21, 2024
2e5869d
Factor out the ownership components so they can be used elsewhere.
asnare Oct 21, 2024
bf542a4
Mark the __id_attributes__ sequence as immutable.
asnare Oct 21, 2024
3aad8bf
Mark linters as still needing to be done.
asnare Oct 21, 2024
a990327
Merge branch 'main' into crawler-snapshot-history
asnare Oct 21, 2024
2381d18
Handle UDF failures, which aren't JSON-encoded as with other classes.
asnare Oct 21, 2024
23e7720
Sort imports.
asnare Oct 22, 2024
567d809
Test case (and fix) for when __id_attributes__ is annotated as None.
asnare Oct 22, 2024
f0bd963
Docstring explaining HistoricalEncoder design and intent.
asnare Oct 22, 2024
5b2ab77
Rename some things to align more closely with what they are.
asnare Oct 22, 2024
0ff2e95
Remove redundant type alternative.
asnare Oct 22, 2024
971cb4c
Docstring wording and formatting updates.
asnare Oct 22, 2024
be16738
Mention use-case for these records.
asnare Oct 22, 2024
2b9b476
Clarify reason for assumption.
asnare Oct 22, 2024
f17dc74
Document reason for non-default JSON separators.
asnare Oct 22, 2024
823c886
Detect and handle non-string values being passed in string-hinted fie…
asnare Oct 22, 2024
7377727
Merge branch 'main' into crawler-snapshot-history
asnare Oct 22, 2024
fdb03b2
Handle the remaining lsql-supported fields types.
asnare Oct 22, 2024
b7af858
All tasks in the workflow are supposed to depend on the assessment ha…
asnare Oct 22, 2024
9b489b9
Explicitly declare the dependency of the record_workflow_run on the t…
asnare Oct 23, 2024
0a2f4f2
Use correct method for converting rows to dictionaries.
asnare Oct 23, 2024
5 changes: 5 additions & 0 deletions src/databricks/labs/ucx/assessment/clusters.py
@@ -3,6 +3,7 @@
import logging
from collections.abc import Iterable
from dataclasses import dataclass
from typing import ClassVar

from databricks.labs.lsql.backends import SqlBackend
from databricks.sdk import WorkspaceClient
@@ -46,6 +47,8 @@ class ClusterInfo:
creator: str | None = None
"""User-name of the creator of the cluster, if known."""

__id_attributes__: ClassVar[tuple[str, ...]] = ("cluster_id",)
Contributor:

Typically a dunder signals a Python builtin. I prefer only a leading __ underscore.

Contributor (Author):

We can't use just a leading __: that makes it a private attribute, and it can't be accessed from outside the instance. (Dunder attributes are special: although they start with __, they are not subject to mangling like ordinary fields. Arguably they signal a protocol in use behind the curtains, which is indeed what Python mainly uses them for.)
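
A minimal sketch of the mangling behaviour in question (the class and attribute names here are illustrative, not from this PR):

```python
class Example:
    __private_ids = ("cluster_id",)      # two leading underscores: name-mangled
    __id_attributes__ = ("cluster_id",)  # dunder: exempt from mangling

# The dunder attribute is reachable from outside the class under its own name:
assert Example.__id_attributes__ == ("cluster_id",)

# The "private" attribute is only stored under its mangled name:
assert Example._Example__private_ids == ("cluster_id",)
try:
    Example.__private_ids  # the un-mangled name does not exist outside the class body
except AttributeError as e:
    print(e)
```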

In this case alternatives are:

  • _id_attributes: ClassVar[tuple[str, ...]]: Protected, mildly cranky linter ("access to protected member of a client class").
  • id_attributes: ClassVar[tuple[str, ...]]: Public, which doesn't feel right for what is essentially metadata that we're trying to attach to the instance.

A bit further from home is something like the SQLAlchemy approach, where metadata (such as this) is placed in a sibling class. That's a pretty large hammer for this particular nail, though.

Any further thoughts?

Collaborator:

this confusion could have been avoided by using a method-based protocol...
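
(For illustration, a hypothetical sketch of what such a method-based protocol could look like — neither the protocol nor the method name appears in this PR:)

```python
from typing import Protocol


class HasObjectId(Protocol):
    """Hypothetical: records expose their identity via a method rather than class metadata."""

    def object_id(self) -> tuple[str, ...]: ...


# A record class would then satisfy the protocol structurally, e.g.:
# class ClusterInfo:
#     def object_id(self) -> tuple[str, ...]:
#         return (self.cluster_id,)
```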

Contributor:

Let's stick with this for now



class CheckClusterMixin(CheckInitScriptMixin):
_ws: WorkspaceClient
@@ -203,6 +206,8 @@ class PolicyInfo:
creator: str | None = None
"""User-name of the creator of the cluster policy, if known."""

__id_attributes__: ClassVar[tuple[str, ...]] = ("policy_id",)


class PoliciesCrawler(CrawlerBase[PolicyInfo], CheckClusterMixin):
def __init__(self, ws: WorkspaceClient, sbe: SqlBackend, schema):
3 changes: 3 additions & 0 deletions src/databricks/labs/ucx/assessment/jobs.py
@@ -4,6 +4,7 @@
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from hashlib import sha256
from typing import ClassVar

from databricks.labs.lsql.backends import SqlBackend
from databricks.sdk import WorkspaceClient
@@ -40,6 +41,8 @@ class JobInfo:
creator: str | None = None
"""User-name of the creator of the pipeline, if known."""

__id_attributes__: ClassVar[tuple[str, ...]] = ("job_id",)


class JobsMixin:
@classmethod
3 changes: 3 additions & 0 deletions src/databricks/labs/ucx/assessment/pipelines.py
@@ -2,6 +2,7 @@
import logging
from collections.abc import Iterable
from dataclasses import dataclass
from typing import ClassVar

from databricks.labs.lsql.backends import SqlBackend
from databricks.sdk import WorkspaceClient
@@ -24,6 +25,8 @@ class PipelineInfo:
creator_name: str | None = None
"""User-name of the creator of the pipeline, if known."""

__id_attributes__: ClassVar[tuple[str, ...]] = ("pipeline_id",)


class PipelinesCrawler(CrawlerBase[PipelineInfo], CheckClusterMixin):
def __init__(self, ws: WorkspaceClient, sbe: SqlBackend, schema):
22 changes: 20 additions & 2 deletions src/databricks/labs/ucx/contexts/application.py
@@ -38,17 +38,19 @@
ComputeLocations,
Grant,
GrantsCrawler,
GrantOwnership,
MigrateGrants,
PrincipalACL,
)
from databricks.labs.ucx.hive_metastore.mapping import TableMapping
from databricks.labs.ucx.hive_metastore.table_migration_status import TableMigrationIndex
from databricks.labs.ucx.hive_metastore.table_migration_status import TableMigrationIndex, TableMigrationOwnership
from databricks.labs.ucx.hive_metastore.table_migrate import (
TableMigrationStatusRefresher,
TablesMigrator,
)
from databricks.labs.ucx.hive_metastore.table_move import TableMove
from databricks.labs.ucx.hive_metastore.udfs import UdfsCrawler
from databricks.labs.ucx.hive_metastore.tables import TableOwnership
from databricks.labs.ucx.hive_metastore.udfs import UdfsCrawler, UdfOwnership
from databricks.labs.ucx.hive_metastore.verification import VerifyHasCatalog, VerifyHasMetastore
from databricks.labs.ucx.installer.workflows import DeployedWorkflows
from databricks.labs.ucx.progress.install import VerifyProgressTracking
@@ -243,14 +245,26 @@ def group_manager(self) -> GroupManager:
def grants_crawler(self) -> GrantsCrawler:
return GrantsCrawler(self.tables_crawler, self.udfs_crawler, self.config.include_databases)

@cached_property
def grant_ownership(self) -> GrantOwnership:
return GrantOwnership(self.administrator_locator)

@cached_property
def udfs_crawler(self) -> UdfsCrawler:
return UdfsCrawler(self.sql_backend, self.inventory_database, self.config.include_databases)

@cached_property
def udf_ownership(self) -> UdfOwnership:
return UdfOwnership(self.administrator_locator)

@cached_property
def tables_crawler(self) -> TablesCrawler:
return TablesCrawler(self.sql_backend, self.inventory_database, self.config.include_databases)

@cached_property
def table_ownership(self) -> TableOwnership:
return TableOwnership(self.administrator_locator)

@cached_property
def tables_migrator(self) -> TablesMigrator:
return TablesMigrator(
@@ -363,6 +377,10 @@ def migration_status_refresher(self) -> TableMigrationStatusRefresher:
self.tables_crawler,
)

@cached_property
def table_migration_ownership(self) -> TableMigrationOwnership:
return TableMigrationOwnership(self.tables_crawler, self.table_ownership)

@cached_property
def iam_credential_manager(self) -> CredentialManager:
return CredentialManager(self.workspace_client)
134 changes: 129 additions & 5 deletions src/databricks/labs/ucx/contexts/workflow_task.py
@@ -3,21 +3,35 @@

from databricks.labs.blueprint.installation import Installation
from databricks.labs.lsql.backends import RuntimeBackend, SqlBackend
from databricks.labs.ucx.hive_metastore.table_migration_status import TableMigrationStatus
from databricks.sdk import WorkspaceClient, core

from databricks.labs.ucx.__about__ import __version__
from databricks.labs.ucx.assessment.clusters import ClustersCrawler, PoliciesCrawler
from databricks.labs.ucx.assessment.clusters import (
ClustersCrawler,
PoliciesCrawler,
ClusterOwnership,
ClusterInfo,
ClusterPolicyOwnership,
PolicyInfo,
)
from databricks.labs.ucx.assessment.init_scripts import GlobalInitScriptCrawler
from databricks.labs.ucx.assessment.jobs import JobsCrawler, SubmitRunsCrawler
from databricks.labs.ucx.assessment.pipelines import PipelinesCrawler
from databricks.labs.ucx.assessment.jobs import JobOwnership, JobInfo, JobsCrawler, SubmitRunsCrawler
from databricks.labs.ucx.assessment.pipelines import PipelinesCrawler, PipelineInfo, PipelineOwnership
from databricks.labs.ucx.config import WorkspaceConfig
from databricks.labs.ucx.contexts.application import GlobalContext
from databricks.labs.ucx.hive_metastore import TablesInMounts, TablesCrawler
from databricks.labs.ucx.hive_metastore.grants import Grant
from databricks.labs.ucx.hive_metastore.table_size import TableSizeCrawler
from databricks.labs.ucx.hive_metastore.tables import FasterTableScanCrawler
from databricks.labs.ucx.hive_metastore.tables import FasterTableScanCrawler, Table
from databricks.labs.ucx.hive_metastore.udfs import Udf
from databricks.labs.ucx.installer.logs import TaskRunWarningRecorder
from databricks.labs.ucx.progress.history import HistoryLog
from databricks.labs.ucx.progress.workflow_runs import WorkflowRunRecorder

# As with GlobalContext, service factories unavoidably have a lot of public methods.
# pylint: disable=too-many-public-methods


class RuntimeContext(GlobalContext):
@cached_property
@@ -54,6 +68,10 @@ def installation(self) -> Installation:
def jobs_crawler(self) -> JobsCrawler:
return JobsCrawler(self.workspace_client, self.sql_backend, self.inventory_database)

@cached_property
def job_ownership(self) -> JobOwnership:
return JobOwnership(self.administrator_locator)

@cached_property
def submit_runs_crawler(self) -> SubmitRunsCrawler:
return SubmitRunsCrawler(
@@ -67,10 +85,18 @@ def submit_runs_crawler(self) -> SubmitRunsCrawler:
def clusters_crawler(self) -> ClustersCrawler:
return ClustersCrawler(self.workspace_client, self.sql_backend, self.inventory_database)

@cached_property
def cluster_ownership(self) -> ClusterOwnership:
return ClusterOwnership(self.administrator_locator)

@cached_property
def pipelines_crawler(self) -> PipelinesCrawler:
return PipelinesCrawler(self.workspace_client, self.sql_backend, self.inventory_database)

@cached_property
def pipeline_ownership(self) -> PipelineOwnership:
return PipelineOwnership(self.administrator_locator)

@cached_property
def table_size_crawler(self) -> TableSizeCrawler:
return TableSizeCrawler(self.tables_crawler)
@@ -79,12 +105,18 @@ def table_size_crawler(self) -> TableSizeCrawler:
def policies_crawler(self) -> PoliciesCrawler:
return PoliciesCrawler(self.workspace_client, self.sql_backend, self.inventory_database)

@cached_property
def cluster_policy_ownership(self) -> ClusterPolicyOwnership:
return ClusterPolicyOwnership(self.administrator_locator)

@cached_property
def global_init_scripts_crawler(self) -> GlobalInitScriptCrawler:
return GlobalInitScriptCrawler(self.workspace_client, self.sql_backend, self.inventory_database)

@cached_property
def tables_crawler(self) -> TablesCrawler:
# Warning: Not all runtime contexts support the fast-scan implementation; it requires the JVM bridge to Spark
# and that's not always available.
return FasterTableScanCrawler(self.sql_backend, self.inventory_database, self.config.include_databases)

@cached_property
@@ -116,10 +148,102 @@ def workflow_run_recorder(self) -> WorkflowRunRecorder:
return WorkflowRunRecorder(
self.sql_backend,
self.config.ucx_catalog,
workspace_id=self.workspace_client.get_workspace_id(),
workspace_id=self.workspace_id,
workflow_name=self.named_parameters["workflow"],
workflow_id=int(self.named_parameters["job_id"]),
workflow_run_id=int(self.named_parameters["parent_run_id"]),
workflow_run_attempt=int(self.named_parameters.get("attempt", 0)),
workflow_start_time=self.named_parameters["start_time"],
)

@cached_property
def workspace_id(self) -> int:
return self.workspace_client.get_workspace_id()

@cached_property
def historical_clusters_log(self) -> HistoryLog[ClusterInfo]:
return HistoryLog(
self.sql_backend,
self.cluster_ownership,
ClusterInfo,
int(self.named_parameters["parent_run_id"]),
self.workspace_id,
self.config.ucx_catalog,
)

@cached_property
def historical_cluster_policies_log(self) -> HistoryLog[PolicyInfo]:
return HistoryLog(
self.sql_backend,
self.cluster_policy_ownership,
PolicyInfo,
int(self.named_parameters["parent_run_id"]),
self.workspace_id,
self.config.ucx_catalog,
)

@cached_property
def historical_grants_log(self) -> HistoryLog[Grant]:
return HistoryLog(
self.sql_backend,
self.grant_ownership,
Grant,
int(self.named_parameters["parent_run_id"]),
self.workspace_id,
self.config.ucx_catalog,
)

@cached_property
def historical_jobs_log(self) -> HistoryLog[JobInfo]:
return HistoryLog(
self.sql_backend,
self.job_ownership,
JobInfo,
int(self.named_parameters["parent_run_id"]),
self.workspace_id,
self.config.ucx_catalog,
)

@cached_property
def historical_pipelines_log(self) -> HistoryLog[PipelineInfo]:
return HistoryLog(
self.sql_backend,
self.pipeline_ownership,
PipelineInfo,
int(self.named_parameters["parent_run_id"]),
self.workspace_id,
self.config.ucx_catalog,
)

@cached_property
def historical_tables_log(self) -> HistoryLog[Table]:
return HistoryLog(
self.sql_backend,
self.table_ownership,
Table,
int(self.named_parameters["parent_run_id"]),
self.workspace_id,
self.config.ucx_catalog,
)

@cached_property
def historical_table_migration_log(self) -> HistoryLog[TableMigrationStatus]:
return HistoryLog(
self.sql_backend,
self.table_migration_ownership,
TableMigrationStatus,
int(self.named_parameters["parent_run_id"]),
self.workspace_id,
self.config.ucx_catalog,
)

@cached_property
def historical_udfs_log(self) -> HistoryLog[Udf]:
return HistoryLog(
self.sql_backend,
self.udf_ownership,
Udf,
int(self.named_parameters["parent_run_id"]),
self.workspace_id,
self.config.ucx_catalog,
)
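
These history-log factories all follow the same construction pattern. A hedged sketch of how a workflow task might combine a crawler with its history log — the `append_inventory_snapshot` method name is an assumption based on the PR title, not something visible in this diff:

```python
# Sketch only: assumes HistoryLog exposes an append method for crawler snapshots.
def crawl_clusters(ctx: RuntimeContext) -> None:
    # Refresh the inventory snapshot, then journal it to the history log.
    snapshot = ctx.clusters_crawler.snapshot(force_refresh=True)
    ctx.historical_clusters_log.append_inventory_snapshot(snapshot)  # assumed API
```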
2 changes: 1 addition & 1 deletion src/databricks/labs/ucx/framework/crawlers.py
@@ -159,6 +159,6 @@ def _snapshot(self, fetcher: ResultFn, loader: ResultFn, *, force_refresh: bool)
self._update_snapshot(loaded_records, mode="overwrite")
return loaded_records

def _update_snapshot(self, items: Sequence[Result], mode: Literal["append", "overwrite"] = "append") -> None:
def _update_snapshot(self, items: Sequence[Result], *, mode: Literal["append", "overwrite"]) -> None:
logger.debug(f"[{self.full_name}] found {len(items)} new records for {self._table}")
self._backend.save_table(self.full_name, items, self._klass, mode=mode)
9 changes: 8 additions & 1 deletion src/databricks/labs/ucx/hive_metastore/grants.py
@@ -3,7 +3,7 @@
from collections.abc import Callable, Iterable
from dataclasses import dataclass, replace
from functools import partial, cached_property
from typing import Protocol
from typing import ClassVar, Protocol

from databricks.labs.blueprint.installation import Installation
from databricks.labs.blueprint.parallel import ManyError, Threads
@@ -66,6 +66,8 @@ class Grant:
any_file: bool = False
anonymous_function: bool = False

__id_attributes__: ClassVar[tuple[str, ...]] = ("object_type", "object_key", "action_type", "principal")

@staticmethod
def type_and_key(
*,
@@ -105,6 +107,11 @@ def type_and_key(
)
raise ValueError(msg)

@property
def object_type(self) -> str:
this_type, _ = self.this_type_and_key()
return this_type

@property
def object_key(self) -> str:
_, key = self.this_type_and_key()
src/databricks/labs/ucx/hive_metastore/table_migration_status.py
@@ -2,6 +2,7 @@
import logging
from dataclasses import dataclass, replace
from collections.abc import Iterable, KeysView
from typing import ClassVar

from databricks.labs.lsql.backends import SqlBackend
from databricks.sdk import WorkspaceClient
@@ -25,6 +26,8 @@ class TableMigrationStatus:
dst_table: str | None = None
update_ts: str | None = None

__id_attributes__: ClassVar[tuple[str, ...]] = ("src_schema", "src_table")

def destination(self):
return f"{self.dst_catalog}.{self.dst_schema}.{self.dst_table}".lower()

4 changes: 3 additions & 1 deletion src/databricks/labs/ucx/hive_metastore/tables.py
@@ -1,7 +1,7 @@
import logging
import re
import typing
from collections.abc import Iterable, Iterator, Collection
from collections.abc import Collection, Iterable, Iterator
from dataclasses import dataclass
from enum import Enum, auto
from functools import cached_property, partial
@@ -64,6 +64,8 @@ class Table: # pylint: disable=too-many-public-methods
storage_properties: str | None = None
is_partitioned: bool = False

__id_attributes__: typing.ClassVar[tuple[str, ...]] = ("catalog", "database", "name")

DBFS_ROOT_PREFIXES: typing.ClassVar[list[str]] = [
"/dbfs/",
"dbfs:/",