Add Switch transpiler with --include-llm-transpiler flag
#2066
Open
hiroyukinakazato-db wants to merge 49 commits into main from feature/switch-installer-integration
Changes from 39 commits
Commits
dd726b5 feat: integrate Switch transpiler with Lakebridge installer (hiroyukinakazato-db)
febb62d Merge branch 'main' into feature/switch-installer-integration (hiroyukinakazato-db)
fa26b4c fix: remove undefined URLError from exception handling (hiroyukinakazato-db)
6511e20 refactor: streamline SwitchInstaller deployment logic and update tests (hiroyukinakazato-db)
33ea7de refactor: simplify SwitchInstaller test structure and improve assertions (hiroyukinakazato-db)
d0c63c3 Merge remote-tracking branch 'origin/main' into feature/switch-instal… (hiroyukinakazato-db)
7cb9ea9 feat: add Switch transpiler installer for Lakebridge integration (hiroyukinakazato-db)
467dea9 fix: support case-insensitive config lookup in SwitchInstaller (hiroyukinakazato-db)
57298b0 Merge branch 'main' into feature/switch-installer-integration (hiroyukinakazato-db)
09c0eb8 Merge branch 'main' into feature/switch-installer-integration (hiroyukinakazato-db)
8439314 refactor: separate Switch installation from workspace deployment (hiroyukinakazato-db)
5f66f3f Merge branch 'main' into feature/switch-installer-integration (hiroyukinakazato-db)
9dc4b04 refactor: encapsulate Switch package path resolution in SwitchDeployment (hiroyukinakazato-db)
7637234 test: update Switch installation tests for refactored interface (hiroyukinakazato-db)
729cb0d Merge branch 'main' into feature/switch-installer-integration (hiroyukinakazato-db)
42ce0df fix: exclude wait_for_completion from Switch job parameters (hiroyukinakazato-db)
f698470 Merge branch 'main' into feature/switch-installer-integration (hiroyukinakazato-db)
22cadc9 Defaults in `labs.yml` are strings. (asnare)
b3d2441 Update flag description to use placeholder syntax. (asnare)
ac7e2a4 Disable flag pending completion of integration. (asnare)
f0426e1 Leave pylint's max-args as-is. (asnare)
934c2e8 Remove unnecessary include_llm arguments. (asnare)
74923cc Refactor Switch installation. (asnare)
084f90f upgrade to latest switch plugin (sundarshankar89)
61f796f Merge branch 'main' into feature/switch-installer-integration (sundarshankar89)
0c1d1d5 fixed package dependencies (sundarshankar89)
6aeea25 Merge branch 'main' into feature/switch-installer-integration (sundarshankar89)
468f8de added additional configuration for making switch (sundarshankar89)
6a57570 Latest Switch (sundarshankar89)
2c3d153 Sorted List for FMAPI (sundarshankar89)
f41dee8 setting logging level (sundarshankar89)
950c1b8 setting logging level (sundarshankar89)
6ca78ed setting logging level (sundarshankar89)
ba65df4 setting logging level (sundarshankar89)
2e2abcb setting logging level (sundarshankar89)
bcbe4df make default as first choice (sundarshankar89)
486250f fix tests (sundarshankar89)
fc1ddca fix tests (sundarshankar89)
42c9c4e fixes few bugs (sundarshankar89)
1831076 update databricks-switch-plugin dependency to version 0.1.4 (hiroyukinakazato-db)
01a0c87 Review Comments (sundarshankar89)
ccce0f2 Review Comments (sundarshankar89)
eae5997 Merge branch 'main' into feature/switch-installer-integration (sundarshankar89)
34c9f8f Merge branch 'main' into feature/switch-installer-integration (sundarshankar89)
23df37b added tests for configurator (sundarshankar89)
4a0bf49 added tests for installer (sundarshankar89)
c49c5b3 added tests for installer (sundarshankar89)
078a0bc added tests for switch (sundarshankar89)
69f93b2 Merge branch 'main' into feature/switch-installer-integration (sundarshankar89)
@@ -0,0 +1,182 @@

```python
import importlib.resources
import logging
from collections.abc import Generator, Sequence
from importlib.abc import Traversable
from pathlib import PurePosixPath
from typing import Any

from databricks.labs import switch
from databricks.labs.switch.__about__ import __version__ as switch_version
from databricks.labs.blueprint.installation import Installation
from databricks.labs.blueprint.installer import InstallState
from databricks.labs.blueprint.paths import WorkspacePath
from databricks.labs.blueprint.wheels import ProductInfo
from databricks.sdk import WorkspaceClient
from databricks.sdk.errors import InvalidParameterValue, NotFound
from databricks.sdk.service.jobs import JobParameterDefinition, JobSettings, NotebookTask, Source, Task

from databricks.labs.lakebridge.deployment.job import JobDeployment

logger = logging.getLogger(__name__)


class SwitchDeployment:
    _INSTALL_STATE_KEY = "Switch"
    _TRANSPILER_ID = "switch"

    def __init__(
        self,
        ws: WorkspaceClient,
        installation: Installation,
        install_state: InstallState,
        product_info: ProductInfo,
        job_deployer: JobDeployment,
    ):
        self._ws = ws
        self._installation = installation
        self._install_state = install_state
        self._product_info = product_info
        self._job_deployer = job_deployer

    def install(self) -> None:
        """Deploy Switch to workspace and configure resources."""
        logger.debug("Deploying Switch resources to workspace...")
        self._deploy_resources_to_workspace()
        self._setup_job()
        logger.debug("Switch deployment completed")

    def uninstall(self) -> None:
        """Remove Switch job from workspace."""
        if self._INSTALL_STATE_KEY not in self._install_state.jobs:
            logger.debug("No Switch job found in InstallState")
            return

        job_id = int(self._install_state.jobs[self._INSTALL_STATE_KEY])
        try:
            logger.info(f"Removing Switch job with job_id={job_id}")
            del self._install_state.jobs[self._INSTALL_STATE_KEY]
            self._ws.jobs.delete(job_id)
            self._install_state.save()
        except (InvalidParameterValue, NotFound):
            logger.warning(f"Switch job {job_id} doesn't exist anymore")
            self._install_state.save()

    def _get_switch_workspace_path(self) -> WorkspacePath:
        installation_root = self._installation.install_folder()
        return WorkspacePath(self._ws, installation_root) / "switch"

    def _deploy_resources_to_workspace(self) -> None:
        """Replicate the Switch package sources to the workspace."""
        # TODO: This is temporary, instead the jobs should directly run the code from the deployed wheel/package.
        resource_root = self._get_switch_workspace_path()
        # Replace existing resources, to avoid stale files and potential confusion.
        if resource_root.exists():
            resource_root.rmdir(recursive=True)
        resource_root.mkdir(parents=True)
        already_created = {resource_root}
        logger.info(f"Copying resources to {resource_root} in workspace.......")
        for resource_path, resource in self._enumerate_package_files(switch):
            # Resource path has a leading 'switch' that we want to strip off.
            nested_path = resource_path.relative_to(PurePosixPath("switch"))
            upload_path = resource_root / nested_path
            if (parent := upload_path.parent) not in already_created:
                logger.debug(f"Creating workspace directory: {parent}")
                parent.mkdir()
                already_created.add(parent)
            logger.debug(f"Uploading: {resource_path} -> {upload_path}")
            upload_path.write_bytes(resource.read_bytes())
        logger.info(f"Completed Copying resources to {resource_root} in workspace...")

    @staticmethod
    def _enumerate_package_files(package) -> Generator[tuple[PurePosixPath, Traversable]]:
        # Locate the root of the package, and then enumerate all its files recursively.
        root = importlib.resources.files(package)

        def _enumerate_resources(
            resource: Traversable, parent: PurePosixPath = PurePosixPath(".")
        ) -> Generator[tuple[PurePosixPath, Traversable]]:
            if resource.name.startswith("."):
                # Skip hidden files and directories
                return
            if resource.is_dir():
                next_parent = parent / resource.name
                for child in resource.iterdir():
                    yield from _enumerate_resources(child, next_parent)
            elif resource.is_file():
                # Skip hidden files and compiled Python files
                if not (name := resource.name).endswith((".pyc", ".pyo")):
                    yield parent / name, resource

        yield from _enumerate_resources(root)

    def _setup_job(self) -> None:
        """Create or update Switch job."""
        existing_job_id = self._get_existing_job_id()
        logger.info("Setting up Switch job in workspace...")
        try:
            job_id = self._create_or_update_switch_job(existing_job_id)
            self._install_state.jobs[self._INSTALL_STATE_KEY] = job_id
            self._install_state.save()
            job_url = f"{self._ws.config.host}/jobs/{job_id}"
            logger.info(f"Switch job created/updated: {job_url}")
        except (RuntimeError, ValueError, InvalidParameterValue) as e:
            logger.error(f"Failed to create/update Switch job: {e}")

    def _get_existing_job_id(self) -> str | None:
        """Check if Switch job already exists in workspace."""
        if self._INSTALL_STATE_KEY not in self._install_state.jobs:
            return None
        try:
            job_id = self._install_state.jobs[self._INSTALL_STATE_KEY]
            self._ws.jobs.get(int(job_id))
            return job_id
        except (InvalidParameterValue, NotFound, ValueError):
            return None

    def _create_or_update_switch_job(self, job_id: str | None) -> str:
        """Create or update Switch job, returning job ID."""
        job_settings = self._get_switch_job_settings()

        # Try to update existing job
        if job_id:
            try:
                logger.info(f"Updating Switch job: {job_id}")
                self._ws.jobs.reset(int(job_id), JobSettings(**job_settings))
                return job_id
            except (ValueError, InvalidParameterValue):
                logger.warning("Previous Switch job not found, creating new one")

        # Create new job
        logger.info("Creating new Switch job")
        new_job = self._ws.jobs.create(**job_settings)
        new_job_id = str(new_job.job_id)
        assert new_job_id is not None
        return new_job_id

    def _get_switch_job_settings(self) -> dict[str, Any]:
        """Build job settings for Switch transpiler."""
        job_name = "Lakebridge_Switch"
        notebook_path = self._get_switch_workspace_path() / "notebooks" / "00_main"

        task = Task(
            task_key="run_transpilation",
            notebook_task=NotebookTask(notebook_path=str(notebook_path), source=Source.WORKSPACE),
            disable_auto_optimization=True,  # To disable retries on failure
        )

        return {
            "name": job_name,
            "tags": {"created_by": self._ws.current_user.me().id, "switch_version": f"v{switch_version}"},
            "tasks": [task],
            "parameters": self._get_switch_job_parameters(),
            "max_concurrent_runs": 100,  # Allow simultaneous transpilations
        }

    def _get_switch_job_parameters(self) -> Sequence[JobParameterDefinition]:
        # Add required runtime parameters, static for now.
        parameters = {
            "source_tech": "",
            "input_dir": "",
            "output_dir": "",
        }
        return [JobParameterDefinition(name=key, default=value) for key, value in parameters.items()]
```
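The recursive walk in `_enumerate_package_files` can be tried outside a workspace. The following is a minimal standalone sketch, not code from this PR: it assumes a plain directory on disk stands in for the installed `switch` package, uses `pathlib.Path` in place of `Traversable`, and sorts directory entries so the output order is stable. The names `enumerate_files` and `demo` are hypothetical and exist only for illustration.

```python
import tempfile
from collections.abc import Iterator
from pathlib import Path, PurePosixPath


def enumerate_files(root: Path) -> Iterator[tuple[PurePosixPath, Path]]:
    """Yield (path relative to root's parent, file) pairs, using the same filters as the diff."""

    def walk(resource: Path, parent: PurePosixPath) -> Iterator[tuple[PurePosixPath, Path]]:
        if resource.name.startswith("."):
            # Skip hidden files and directories.
            return
        if resource.is_dir():
            next_parent = parent / resource.name
            # Sorting is an addition for this sketch; the PR code iterates in native order.
            for child in sorted(resource.iterdir()):
                yield from walk(child, next_parent)
        elif resource.is_file() and not resource.name.endswith((".pyc", ".pyo")):
            # Skip compiled Python files.
            yield parent / resource.name, resource

    yield from walk(root, PurePosixPath("."))


def demo() -> list[str]:
    """Build a throwaway package-like tree and list what would be uploaded."""
    with tempfile.TemporaryDirectory() as tmp:
        pkg = Path(tmp) / "switch"
        (pkg / "notebooks").mkdir(parents=True)
        (pkg / ".hidden").mkdir()
        (pkg / "__init__.py").write_text("")
        (pkg / "notebooks" / "00_main.py").write_text("# notebook")
        (pkg / "notebooks" / "cache.pyc").write_bytes(b"\x00")
        (pkg / ".hidden" / "secret.txt").write_text("x")
        # Strip the leading 'switch' component, as the deployment code does before uploading.
        return [str(path.relative_to("switch")) for path, _ in enumerate_files(pkg)]


if __name__ == "__main__":
    print(demo())
```

Each yielded path starts with the package directory name, which is why the deployment code calls `relative_to(PurePosixPath("switch"))` before joining it onto the workspace root.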