Crawler: support for object ownership #2774
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -25,6 +25,7 @@ | |
from databricks.labs.ucx.assessment.clusters import CheckClusterMixin | ||
from databricks.labs.ucx.assessment.crawlers import spark_version_compatibility | ||
from databricks.labs.ucx.framework.crawlers import CrawlerBase | ||
from databricks.labs.ucx.framework.owners import Ownership | ||
from databricks.labs.ucx.framework.utils import escape_sql_identifier | ||
|
||
logger = logging.getLogger(__name__) | ||
|
@@ -37,6 +38,7 @@ class JobInfo: | |
failures: str | ||
job_name: str | None = None | ||
creator: str | None = None | ||
"""User-name of the creator of the job, if known.""" | ||
|
||
|
||
class JobsMixin: | ||
|
@@ -106,7 +108,8 @@ def _prepare(all_jobs) -> tuple[dict[int, set[str]], dict[int, JobInfo]]: | |
if not job.job_id: | ||
continue | ||
job_assessment[job.job_id] = set() | ||
if not job.creator_user_name: | ||
creator_user_name = job.creator_user_name or None | ||
if not creator_user_name: | ||
logger.warning( | ||
f"Job {job.job_id} has an unknown creator, which means that the original creator has been deleted " | ||
f"and should be re-created" | ||
|
@@ -122,7 +125,7 @@ def _prepare(all_jobs) -> tuple[dict[int, set[str]], dict[int, JobInfo]]: | |
job_details[job.job_id] = JobInfo( | ||
job_id=str(job.job_id), | ||
job_name=job_name, | ||
creator=job.creator_user_name, | ||
creator=creator_user_name, | ||
success=1, | ||
failures="[]", | ||
) | ||
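The `job.creator_user_name or None` expression in the hunk above presumably collapses an empty string into `None`, so that downstream code only has to check for one "unknown" value. A minimal sketch of that normalization, using a hypothetical helper name (`normalize_creator`) that is not part of the PR:

```python
from __future__ import annotations


def normalize_creator(creator_user_name: str | None) -> str | None:
    """Treat any falsy creator (None or an empty string) as unknown.

    Hypothetical helper for illustration; the PR inlines this expression.
    """
    return creator_user_name or None


# A known creator passes through unchanged; falsy values become None.
known = normalize_creator("alice@example.com")
empty = normalize_creator("")
missing = normalize_creator(None)
```

With this in place, a single `if not creator_user_name:` check covers both the missing and the empty case.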
|
@@ -140,6 +143,16 @@ def _check_jar_task(self, all_task: list[RunTask]) -> list[str]: | |
return task_failures | ||
|
||
|
||
class JobOwnership(Ownership[JobInfo]): | ||
"""Determine ownership of jobs (workflows) in the inventory. | ||
|
||
This is the job creator (if known). | ||
""" | ||
|
||
def _maybe_direct_owner(self, record: JobInfo) -> str | None: | ||
return record.creator | ||
what about
This is interesting. This is not being gathered by the crawler, which means either: a) updating the crawler to include it; or b) doing it here. Wherever this is done, it looks more complicated than at first glance: the Either way I'd suggest covering this as a separate PR. |
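To illustrate how a subclass such as `JobOwnership` might be used, here is a minimal sketch. The `Ownership` base class shown here is an assumption: the diff only shows that it is generic and that subclasses implement `_maybe_direct_owner`. The `owner_of` method and the `fallback_owner` argument are hypothetical names used for illustration, and the real class in `databricks.labs.ucx.framework.owners` may differ.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generic, TypeVar

Record = TypeVar("Record")


class Ownership(ABC, Generic[Record]):
    """Hypothetical base class: resolve an owner, falling back to a default."""

    def __init__(self, fallback_owner: str) -> None:
        # Assumption: some fallback (for example an administrator) is available.
        self._fallback_owner = fallback_owner

    def owner_of(self, record: Record) -> str:
        # Prefer the direct owner; use the fallback when it is unknown.
        return self._maybe_direct_owner(record) or self._fallback_owner

    @abstractmethod
    def _maybe_direct_owner(self, record: Record) -> str | None:
        """Return the direct owner of the record, if one can be determined."""


@dataclass
class JobInfo:
    # Field names follow the diff; the field order is an assumption.
    job_id: str
    success: int
    failures: str
    job_name: str | None = None
    creator: str | None = None


class JobOwnership(Ownership[JobInfo]):
    """The owner of a job is its creator, if known."""

    def _maybe_direct_owner(self, record: JobInfo) -> str | None:
        return record.creator


ownership = JobOwnership(fallback_owner="admin@example.com")
with_creator = JobInfo(job_id="1", success=1, failures="[]", creator="alice@example.com")
without_creator = JobInfo(job_id="2", success=1, failures="[]")
```

Under these assumptions, `ownership.owner_of(with_creator)` yields the creator, while `ownership.owner_of(without_creator)` yields the fallback.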
||
|
||
|
||
@dataclass | ||
class SubmitRunInfo: | ||
run_ids: str # JSON-encoded list of run ids | ||
|
Should the user be recreated, or the cluster?
The cluster should either have its owner changed (there is an API for this) or be recreated.
Interestingly there's a mix of terminology here: the property here is `creator_user_name`, but the REST API for changing it is `…/clusters/change-owner`, and such requests have an `owner_username` property.
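The terminology mix described in this comment can be illustrated with a small sketch. It assumes the change-owner request body carries a `cluster_id` field alongside the `owner_username` property mentioned above. `ClusterRecord` and `build_change_owner_request` are hypothetical names, and no HTTP call is made.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ClusterRecord:
    """Hypothetical stand-in for a crawled cluster record."""

    cluster_id: str
    # The crawled property is named after the *creator*...
    creator_user_name: str | None = None


def build_change_owner_request(record: ClusterRecord, new_owner: str) -> dict[str, str]:
    """Build a request body for the .../clusters/change-owner endpoint.

    ...whereas the request names the *owner*. The `cluster_id` key is an
    assumption about the request shape.
    """
    return {"cluster_id": record.cluster_id, "owner_username": new_owner}


orphaned = ClusterRecord(cluster_id="c-1", creator_user_name=None)
payload = build_change_owner_request(orphaned, "alice@example.com")
```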