
Commit ce4d0ad

leibovitzgilmlopseng and richardliaw authored
[data] Fix Databricks host URL handling in Ray Data (#49926)
## Why are these changes needed?

The current Databricks integration in Ray Data requires providing the Databricks host URL without the "https://" prefix. However, this creates compatibility issues when using Ray Data alongside MLflow, as MLflow's Databricks integration (which uses the same DATABRICKS_HOST environment variable) expects the URL to include the "https://" prefix.

## Related issue number

Closes #49925

## Checks

- [ ] I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- [ ] I've run `scripts/format.sh` to lint the changes in this PR.
- [ ] I've included any doc changes needed for https://docs.ray.io/en/master/.
    - [ ] I've added any new APIs to the API Reference. For example, if I added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.
- [ ] I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
- Testing Strategy
    - [ ] Unit tests
    - [ ] Release tests
    - [ ] This PR is not tested :(

---------

Signed-off-by: Gil Leibovitz <gil.leibovitz@doubleverify.com>
Signed-off-by: Richard Liaw <rliaw@berkeley.edu>
Co-authored-by: Gil Leibovitz <gil.leibovitz@doubleverify.com>
Co-authored-by: Richard Liaw <rliaw@berkeley.edu>
1 parent 925b25c commit ce4d0ad

1 file changed: +4 -1 lines changed
python/ray/data/_internal/datasource/databricks_uc_datasource.py

Lines changed: 4 additions & 1 deletion
@@ -37,7 +37,10 @@ def __init__(
         self.schema = schema
         self.query = query
 
-        url_base = f"https://{self.host}/api/2.0/sql/statements/"
+        if not host.startswith(("http://", "https://")):
+            self.host = f"https://{host}"
+
+        url_base = f"{self.host}/api/2.0/sql/statements/"
 
         payload = json.dumps(
             {
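
To illustrate the effect of the change outside the datasource class, here is a minimal standalone sketch. The function names `build_statements_url` and `build_statements_url_old` and the host `example.cloud.databricks.com` are hypothetical and are not part of Ray; only the prefix check and the endpoint path are taken from the diff above.

```python
def build_statements_url_old(host: str) -> str:
    """Pre-patch behavior: "https://" is always prepended."""
    return f"https://{host}/api/2.0/sql/statements/"


def build_statements_url(host: str) -> str:
    """Post-patch behavior: prepend "https://" only if no scheme is present."""
    # str.startswith accepts a tuple and returns True if any prefix matches.
    if not host.startswith(("http://", "https://")):
        host = f"https://{host}"
    return f"{host}/api/2.0/sql/statements/"


bare = "example.cloud.databricks.com"  # hypothetical host, Ray Data style
prefixed = f"https://{bare}"           # same host, MLflow style

# With the patch, both spellings resolve to the same endpoint.
assert build_statements_url(bare) == build_statements_url(prefixed)

# Before the patch, a prefixed host produced a doubled scheme.
print(build_statements_url_old(prefixed))
```

Under these assumptions, one `DATABRICKS_HOST` value can serve both Ray Data and MLflow, and a bare host name keeps working as before. The check only adds a missing scheme. It does not otherwise validate the host or strip a trailing slash.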
