@hiroyukinakazato-db

Changes

This PR adds the llm-transpile command for LLM-powered SQL conversion using the Switch transpiler.

What does this PR do?

Adds the llm-transpile CLI command, which runs Switch transpiler jobs and supports passing parameters to them.

Relevant implementation details

CLI Integration:

  • Add llm-transpile command to Lakebridge CLI
  • Input source validation (workspace paths and local files)
  • Parameter passing to Switch job runs

Switch Runner Implementation:

  • SwitchConfig: manages Switch resources and job ID retrieval from InstallState
  • SwitchRunner: orchestrates Switch job execution with parameters
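
As a rough illustration of the parameter passing described above, the sketch below builds a job-parameter mapping and hands it to a jobs client. It is not the code from this PR: SwitchJobRequest, run_switch_job, and the parameter names are hypothetical, and JobsApi only mirrors the shape of a run_now call that accepts a job ID and string job parameters.

```python
from dataclasses import asdict, dataclass
from typing import Any, Protocol


class JobsApi(Protocol):
    """Minimal slice of a jobs client (assumed shape, for illustration only)."""

    def run_now(self, job_id: int, *, job_parameters: dict[str, str]) -> Any: ...


@dataclass(frozen=True)
class SwitchJobRequest:
    # Hypothetical parameter names; the real job may use different ones.
    input_source: str
    output_ws_folder: str

    def as_job_parameters(self) -> dict[str, str]:
        # Job parameters are passed as strings, so coerce every value.
        return {name: str(value) for name, value in asdict(self).items()}


def run_switch_job(jobs: JobsApi, job_id: int, request: SwitchJobRequest) -> Any:
    """Trigger the job identified by job_id with the request's parameters."""
    return jobs.run_now(job_id, job_parameters=request.as_job_parameters())
```

Accepting the jobs client as an argument keeps the sketch testable with a simple fake, which is the same idea as the unit tests with parameter verification listed under Testing.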

Testing:

  • Unit tests for Switch runner with parameter verification
  • Integration tests for Switch installation lifecycle

Development Environment:

  • Add .env to .gitignore for local development credentials

Caveats/things to watch out for when reviewing:

  • Parameter design: Follows transpile and recon command patterns
  • Catalog/schema usage: Uses values configured during Switch installation (following recon pattern)
  • Output parameter naming: Uses --output-ws-folder (not --output-folder) to explicitly indicate workspace folder
  • Dependencies: Requires PR #2066 ("Add Switch transpiler with --include-llm-transpiler flag", which covers Switch installation) to be merged first

Linked issues

Resolves #2047

Functionality

  • added relevant user documentation
  • added new CLI command: databricks labs lakebridge llm-transpile
  • modified existing command

Tests

  • manually tested
  • added unit tests
  • added integration tests

@hiroyukinakazato-db added the enhancement (New feature or request) and feat/cli (actions that are visible to the user) labels on Oct 6, 2025
github-actions bot commented Oct 6, 2025

✅ 46/46 passed, 6 flaky, 3m12s total

Flaky tests:

  • 🤪 test_validate_non_empty_tables (25ms)
  • 🤪 test_transpiles_informatica_to_sparksql_non_interactive[False] (13.842s)
  • 🤪 test_transpile_teradata_sql_non_interactive[True] (16.419s)
  • 🤪 test_transpile_teradata_sql_non_interactive[False] (5.753s)
  • 🤪 test_transpile_teradata_sql (5.833s)
  • 🤪 test_transpiles_informatica_to_sparksql_non_interactive[True] (8.401s)

Running from acceptance #2812

Implement llm-transpile command for LLM-based code transpilation:
- Add SwitchInstaller for Switch transpiler package management
  - Install Switch package and deploy to workspace
  - Create and manage Databricks jobs with job-level parameters
  - Configure Switch resources (catalog, schema, volume)
- Add SwitchRunner for executing Switch transpilation jobs
  - Upload source files to workspace volume
  - Execute transpilation via Databricks job
  - Download results and handle job lifecycle
- Add llm-transpile CLI command with Switch transpiler support
- Add comprehensive unit and integration tests

Move _get_switch_package_path() from WorkspaceInstallation to SwitchDeployment
as a protected method, following the Single Responsibility Principle. SwitchDeployment
now resolves its own package path internally.

Changes:
- Add _get_switch_package_path() protected method to SwitchDeployment
- Update SwitchDeployment.install() signature to remove path parameter
- Remove duplicate _get_switch_package_path() from WorkspaceInstallation
- Remove unused sys and TranspilerRepository imports from installation.py
- Update tests to use new interface with mocked path resolution

Update test_installation.py to match the refactored SwitchDeployment.install()
interface, which now takes only the resources parameter (path resolution is internal).

Changes:
- Remove switch_repository fixture parameter from test methods
- Delete unused _StubTranspilerRepository stub class
- Remove unused imports (Path, TranspilerRepository)
- Update assertions to check only resources argument

The tests verify that:
1. Switch installation uses the configured resources correctly
2. Missing resources log an appropriate error message

Sync with main branch to incorporate latest documentation updates

# Conflicts:
#	labs.yml

The wait_for_completion option is intended for local CLI execution only
and should not be included in Databricks job parameters. This change
filters it out when building job parameter definitions.

Changes:
- Add excluded_options set to filter local-only options
- Skip wait_for_completion when converting config.yml options
- Add test using FriendOfSwitchDeployment pattern to verify exclusion
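
A minimal sketch of the filtering described in this commit, assuming the config.yml options are available as a list of dictionaries. The function name to_job_parameters and the "flag" and "default" keys are assumptions made for illustration; only the excluded_options set and the wait_for_completion name come from the commit message.

```python
from typing import Any

# Options that are meaningful only for local CLI execution.
excluded_options = frozenset({"wait_for_completion"})


def to_job_parameters(options: list[dict[str, Any]]) -> list[dict[str, str]]:
    """Convert option entries to job parameter definitions, skipping local-only ones."""
    parameters = []
    for option in options:
        name = option["flag"]  # hypothetical key name
        if name in excluded_options:
            continue  # never exposed as a job-level parameter
        parameters.append({"name": name, "default": str(option.get("default", ""))})
    return parameters
```

Keeping the exclusions in a single set means a future local-only option can be filtered by adding one entry rather than another conditional.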
…ntrol

Add test_switch_install_with_transpile for full workflow testing including
job execution and output verification. Test automatically skips without
LAKEBRIDGE_SWITCH_E2E=true environment variable.

Refactor existing test to test_switch_install and extract helper functions
for DRY implementation. All changes in single file with no CI impact.
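
One way the environment-variable gate could look is sketched below. It uses unittest.skipUnless from the standard library to stay dependency-free; the actual test may well use a pytest marker instead. The helper switch_e2e_enabled is hypothetical, and the test body is a placeholder.

```python
import os
import unittest
from collections.abc import Mapping


def switch_e2e_enabled(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True only when LAKEBRIDGE_SWITCH_E2E is set to the string 'true'."""
    return environ.get("LAKEBRIDGE_SWITCH_E2E", "") == "true"


@unittest.skipUnless(switch_e2e_enabled(), "set LAKEBRIDGE_SWITCH_E2E=true to run")
def test_switch_install_with_transpile() -> None:
    # Placeholder: the real test installs Switch, runs a job, and checks the output.
    pass
```

Because the condition is evaluated at import time, the test is skipped on CI unless the variable is exported before the test run starts.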

Merged latest changes from main branch including:
- Profiler skeleton and Synapse profiler scripts
- Transpiler product_name to transpiler_id rename
- Additional transpile command arguments support
- Test improvements for MSSQL and transpiler repository
@sundarshankar89 changed the base branch from main to switch-integration on October 27, 2025, 06:31
@sundarshankar89 added the stacked PR (Should be reviewed, but not merged) label on Oct 27, 2025
@sundarshankar89 changed the base branch from switch-integration to main on October 27, 2025, 06:32

@sundarshankar89 left a comment

LGTM

) -> None:
    """Transpile source code to Databricks using LLM Transpiler (Switch)"""
    ctx = ApplicationContext(w)
    ctx.add_user_agent_extra("cmd", "llm-transpile")

Suggested change
-    ctx.add_user_agent_extra("cmd", "llm-transpile")
+    ctx.add_user_agent_extra("cmd", "switch")

from databricks.labs.lakebridge.transpiler.switch_runner import SwitchConfig, SwitchRunner


class FriendOfSwitchRunner(SwitchRunner):

Suggested change
- class FriendOfSwitchRunner(SwitchRunner):
+ class MockSwitchRunner(SwitchRunner):

- name: interactive
  description: "Whether installing in interactive mode, which may prompt for configuration settings."
  default: auto
- name: include-llm-transpiler

Confused: why is this here in this PR?

@gueniai left a comment

LGTM

    if ctx is None:
        ctx = ApplicationContext(w)
    del w
    ctx.add_user_agent_extra("cmd", "transpile-switch")

The command is llm-transpile.

Development

Successfully merging this pull request may close these issues.

[FEATURE]: Add Switch transpiler CLI integration and testing

4 participants