Support Databricks Workload Identity Federation for GitHub tokens #933

hectorcast-db · 2025-03-21T08:57:24Z

What changes are proposed in this pull request?

This PR adds support for Databricks Workload Identity Federation (WIF) using GitHub tokens. Users can then use WIF from their GitHub workflows and authenticate their workloads without long-lived secrets.

This new credentials strategy is added to the DefaultCredentialsStrategy after the other Databricks credentials strategies and before the cloud-specific authentication methods.
WIF credentials use a subset of the configuration values of the other Databricks authentication methods. Adding WIF after them ensures that it is not used when another Databricks authentication method is configured.
WIF uses the Databricks client ID, which the cloud-specific authentication methods do not use. Therefore, it will not be used when cloud-specific authentication methods are configured.
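
The ordering argument above can be illustrated with a small, self-contained sketch. It does not use the SDK: the strategy names, the configuration keys (`github_oidc_token`, `cloud_credentials`), and the `pick_strategy` helper are all hypothetical stand-ins, and the real `DefaultCredentialsStrategy` may decide applicability differently.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Config = Dict[str, str]


def requires(*keys: str) -> Callable[[Config], bool]:
    """Build a predicate that is true when every listed key has a value."""
    return lambda cfg: all(cfg.get(key) for key in keys)


@dataclass(frozen=True)
class Strategy:
    name: str
    applies: Callable[[Config], bool]


# Hypothetical chain mirroring the order described in the PR:
# other Databricks strategies first, then WIF, then cloud-specific methods.
DEFAULT_CHAIN: List[Strategy] = [
    Strategy("pat", requires("host", "token")),
    Strategy("oauth-m2m", requires("host", "client_id", "client_secret")),
    Strategy("github-wif", requires("host", "client_id", "github_oidc_token")),
    Strategy("cloud-native", requires("host", "cloud_credentials")),
]


def pick_strategy(cfg: Config, chain: List[Strategy] = DEFAULT_CHAIN) -> Optional[str]:
    """Return the name of the first strategy whose requirements are met."""
    for strategy in chain:
        if strategy.applies(cfg):
            return strategy.name
    return None
```

In this sketch, a configuration with `host`, `client_id`, and `client_secret` resolves to `oauth-m2m` even when a GitHub token is available, and a configuration without `client_id` falls through to `cloud-native`. A configuration that sets both `client_id` and cloud credentials inside GitHub Actions resolves to `github-wif`, which is the situation the v0.51.0 release notes flag as breaking.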

How is this tested?

Added tests.

renaudhartert-db

LGTM as soon as we have agreement on the auth_type name.

renaudhartert-db · 2025-03-27T08:00:39Z

README.md

-| `token`      | _(String)_ The Databricks personal access token (PAT) _(AWS, Azure, and GCP)_ or Azure Active Directory (Azure AD) token _(Azure)_. | `DATABRICKS_TOKEN` |
-| `username`   | _(String)_ The Databricks username part of basic authentication. Only possible when `Host` is `*.cloud.databricks.com` _(AWS)_. | `DATABRICKS_USERNAME` |
-| `password`   | _(String)_ The Databricks password part of basic authentication. Only possible when `Host` is `*.cloud.databricks.com` _(AWS)_. | `DATABRICKS_PASSWORD` |
+- For Databricks wif authentication, you must provide `host`, `client_id` and `token_audience` _(optional)_; or their environment variable or `.databrickscfg` file field equivalents.


Suggested change

- For Databricks wif authentication, you must provide `host`, `client_id` and `token_audience` _(optional)_; or their environment variable or `.databrickscfg` file field equivalents.

- For Databricks WIF authentication, you must provide the `host`, `client_id` and `token_audience` _(optional)_ either directly, through the corresponding environment variables, or in your `.databrickscfg` configuration file.
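
For context on where `token_audience` would plausibly be used, here is a minimal sketch of how a GitHub Actions job can build the request for its OIDC ID token. The `ACTIONS_ID_TOKEN_REQUEST_URL` and `ACTIONS_ID_TOKEN_REQUEST_TOKEN` variables are the ones GitHub exposes to jobs with the `id-token: write` permission. That the SDK passes `token_audience` as the `audience` query parameter is an assumption, not something this thread confirms, and `build_github_oidc_request` is a hypothetical helper name.

```python
import urllib.parse
import urllib.request
from typing import Mapping, Optional


def build_github_oidc_request(
    env: Mapping[str, str], audience: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) the request for a GitHub Actions OIDC token."""
    url = env.get("ACTIONS_ID_TOKEN_REQUEST_URL")
    bearer = env.get("ACTIONS_ID_TOKEN_REQUEST_TOKEN")
    if not url or not bearer:
        # Outside GitHub Actions, or the job lacks `id-token: write`.
        raise RuntimeError("GitHub OIDC token request variables are not set")
    if audience:
        # Assumption: token_audience maps onto the `audience` query parameter.
        separator = "&" if "?" in url else "?"
        url = f"{url}{separator}audience={urllib.parse.quote(audience, safe='')}"
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {bearer}"})
```

Inside a workflow, the request would be built with `build_github_oidc_request(os.environ, audience)` and sent with `urllib.request.urlopen`; GitHub returns the ID token in the `value` field of the JSON response.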

github-actions · 2025-04-23T07:44:09Z

If integration tests don't run automatically, an authorized user can run them manually by following the instructions below:

Trigger:
go/deco-tests-run/sdk-py

Inputs:

PR number: 933
Commit SHA: a013ff39353943295df3ceff36f4df9269e8802b

Checks will be approved automatically on success.

## Release v0.51.0

### New Features and Improvements

* Enabled asynchronous token refreshes by default. A new `disable_async_token_refresh` configuration option has been added to allow disabling this feature if necessary ([#952](#952)). To disable asynchronous token refresh, set the environment variable `DATABRICKS_DISABLE_ASYNC_TOKEN_REFRESH=true` or configure it within your configuration object. The previous `enable_experimental_async_token_refresh` option has been removed as asynchronous refresh is now the default behavior.
* Introduce support for Databricks Workload Identity Federation in GitHub workflows ([#933](#933)). See README.md for instructions.
* [Breaking] Users running their workflows in GitHub Actions, which use Cloud native authentication and also have a `DATABRICKS_CLIENT_ID` and `DATABRICKS_HOST` environment variables set may see their authentication start failing due to the order in which the SDK tries different authentication methods.

### API Changes

* Added [w.alerts_v2](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/sql/alerts_v2.html) workspace-level service.
* Added `update_ncc_azure_private_endpoint_rule_public()` method for [a.network_connectivity](https://databricks-sdk-py.readthedocs.io/en/latest/account/settings/network_connectivity.html) account-level service.
* Added `update_endpoint_budget_policy()` and `update_endpoint_custom_tags()` methods for [w.vector_search_endpoints](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/vectorsearch/vector_search_endpoints.html) workspace-level service.
* Added `created_at`, `created_by` and `metastore_id` fields for `databricks.sdk.service.catalog.SetArtifactAllowlist`.
* Added `node_type_flexibility` field for `databricks.sdk.service.compute.EditInstancePool`.
* Added `page_size` and `page_token` fields for `databricks.sdk.service.compute.GetEvents`.
* Added `next_page_token` and `prev_page_token` fields for `databricks.sdk.service.compute.GetEventsResponse`.
* Added `node_type_flexibility` field for `databricks.sdk.service.compute.GetInstancePool`.
* Added `node_type_flexibility` field for `databricks.sdk.service.compute.InstancePoolAndStats`.
* Added `effective_performance_target` field for `databricks.sdk.service.jobs.RepairHistoryItem`.
* Added `performance_target` field for `databricks.sdk.service.jobs.RepairRun`.
* [Breaking] Added `network_connectivity_config` field for `databricks.sdk.service.settings.CreateNetworkConnectivityConfigRequest`.
* [Breaking] Added `private_endpoint_rule` field for `databricks.sdk.service.settings.CreatePrivateEndpointRuleRequest`.
* Added `domain_names` field for `databricks.sdk.service.settings.NccAzurePrivateEndpointRule`.
* Added `auto_resolve_display_name` field for `databricks.sdk.service.sql.CreateAlertRequest`.
* Added `auto_resolve_display_name` field for `databricks.sdk.service.sql.CreateQueryRequest`.
* Added `budget_policy_id` field for `databricks.sdk.service.vectorsearch.CreateEndpoint`.
* Added `custom_tags` and `effective_budget_policy_id` fields for `databricks.sdk.service.vectorsearch.EndpointInfo`.
* Added `create_clean_room`, `execute_clean_room_task` and `modify_clean_room` enum values for `databricks.sdk.service.catalog.Privilege`.
* Added `dns_resolution_error` and `gcp_denied_by_org_policy` enum values for `databricks.sdk.service.compute.TerminationReasonCode`.
* Added `disabled` enum value for `databricks.sdk.service.jobs.TerminationCodeCode`.
* Added `expired` enum value for `databricks.sdk.service.settings.NccAzurePrivateEndpointRuleConnectionState`.
* [Breaking] Changed `create_network_connectivity_configuration()` and `create_private_endpoint_rule()` methods for [a.network_connectivity](https://databricks-sdk-py.readthedocs.io/en/latest/account/settings/network_connectivity.html) account-level service with new required argument order.
* [Breaking] Changed `create_index()` method for [w.vector_search_indexes](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/vectorsearch/vector_search_indexes.html) workspace-level service to return `databricks.sdk.service.vectorsearch.VectorIndex` dataclass.
* [Breaking] Changed `delete_data_vector_index()` method for [w.vector_search_indexes](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/vectorsearch/vector_search_indexes.html) workspace-level service. HTTP method/verb has changed.
* [Breaking] Changed `delete_data_vector_index()` method for [w.vector_search_indexes](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/vectorsearch/vector_search_indexes.html) workspace-level service with new required argument order.
* [Breaking] Changed `databricks.sdk.service.vectorsearch.List` dataclass to.
* [Breaking] Changed `workload_size` field for `databricks.sdk.service.serving.ServedModelInput` to type `str` dataclass.
* [Breaking] Changed `group_id` field for `databricks.sdk.service.settings.NccAzurePrivateEndpointRule` to type `str` dataclass.
* [Breaking] Changed `target_services` field for `databricks.sdk.service.settings.NccAzureServiceEndpointRule` to type `databricks.sdk.service.settings.EgressResourceTypeList` dataclass.
* [Breaking] Changed `data_array` field for `databricks.sdk.service.vectorsearch.ResultData` to type `databricks.sdk.service.vectorsearch.ListValueList` dataclass.
* [Breaking] Changed waiter for [VectorSearchEndpointsAPI.create_endpoint](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/vectorsearch/vector_search_endpoints.html#databricks.sdk.service.vectorsearch.VectorSearchEndpointsAPI.create_endpoint) method.
* [Breaking] Removed `name` and `region` fields for `databricks.sdk.service.settings.CreateNetworkConnectivityConfigRequest`.
* [Breaking] Removed `group_id` and `resource_id` fields for `databricks.sdk.service.settings.CreatePrivateEndpointRuleRequest`.
* [Breaking] Removed `null_value` field for `databricks.sdk.service.vectorsearch.Value`.
* [Breaking] Removed `large`, `medium` and `small` enum values for `databricks.sdk.service.serving.ServedModelInputWorkloadSize`.
* [Breaking] Removed `blob`, `dfs`, `mysql_server` and `sql_server` enum values for `databricks.sdk.service.settings.NccAzurePrivateEndpointRuleGroupId`.

hectorcast-db temporarily deployed to test-trigger-is March 21, 2025 08:57 — with GitHub Actions Inactive

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from cef6e21 to 0bb1af4 Compare March 21, 2025 09:03

hectorcast-db temporarily deployed to test-trigger-is March 21, 2025 09:03 — with GitHub Actions Inactive

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from 0bb1af4 to 520d2d1 Compare March 21, 2025 09:24

hectorcast-db temporarily deployed to test-trigger-is March 21, 2025 09:24 — with GitHub Actions Inactive

hectorcast-db temporarily deployed to test-trigger-is March 21, 2025 09:25 — with GitHub Actions Inactive

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from 520d2d1 to 4c38da9 Compare March 21, 2025 09:34

hectorcast-db temporarily deployed to test-trigger-is March 21, 2025 09:34 — with GitHub Actions Inactive

hectorcast-db temporarily deployed to test-trigger-is March 21, 2025 09:36 — with GitHub Actions Inactive

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from 4c38da9 to 98a637e Compare March 21, 2025 09:41

hectorcast-db temporarily deployed to test-trigger-is March 21, 2025 09:42 — with GitHub Actions Inactive

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from 98a637e to 5685b9f Compare March 21, 2025 09:49

hectorcast-db temporarily deployed to test-trigger-is March 21, 2025 09:49 — with GitHub Actions Inactive

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from 5685b9f to f647c21 Compare March 21, 2025 09:54

hectorcast-db temporarily deployed to test-trigger-is March 21, 2025 09:54 — with GitHub Actions Inactive

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from f647c21 to 72c3549 Compare March 21, 2025 12:56

hectorcast-db temporarily deployed to test-trigger-is March 21, 2025 12:56 — with GitHub Actions Inactive

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from 72c3549 to b6f303c Compare March 21, 2025 13:00

hectorcast-db temporarily deployed to test-trigger-is March 21, 2025 13:00 — with GitHub Actions Inactive

Tests

5063be3

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from b6f303c to 5063be3 Compare March 25, 2025 14:40

hectorcast-db temporarily deployed to test-trigger-is March 25, 2025 14:40 — with GitHub Actions Inactive

hectorcast-db changed the title ~~[DRAFT] Support Databricks Workload Identity Federation for GitHub tokens~~ Support Databricks Workload Identity Federation for GitHub tokens Mar 25, 2025

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from 6a584b1 to d810e1b Compare March 26, 2025 07:30

hectorcast-db temporarily deployed to test-trigger-is March 26, 2025 07:30 — with GitHub Actions Inactive

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from d810e1b to d2df30a Compare March 26, 2025 07:44

hectorcast-db temporarily deployed to test-trigger-is March 26, 2025 07:44 — with GitHub Actions Inactive

hectorcast-db requested a review from renaudhartert-db March 26, 2025 07:54

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from d2df30a to 4cc6705 Compare March 26, 2025 10:30

hectorcast-db temporarily deployed to test-trigger-is March 26, 2025 10:30 — with GitHub Actions Inactive

renaudhartert-db approved these changes Mar 28, 2025

Merge branch 'main' into databricks-wif

e9c6fc9

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from 4cc6705 to 00ba2d7 Compare April 7, 2025 07:50

hectorcast-db temporarily deployed to test-trigger-is April 7, 2025 07:50 — with GitHub Actions Inactive

Merge branch 'main' into databricks-wif

46dcb5a

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from 00ba2d7 to 46dcb5a Compare April 7, 2025 07:51

hectorcast-db temporarily deployed to test-trigger-is April 7, 2025 07:51 — with GitHub Actions Inactive

hectorcast-db temporarily deployed to test-trigger-is April 7, 2025 07:52 — with GitHub Actions Inactive

hectorcast-db temporarily deployed to test-trigger-is April 22, 2025 10:38 — with GitHub Actions Inactive

Merge branch 'main' into databricks-wif

a013ff3

hectorcast-db force-pushed the hectorcast-db/databricks-wif branch from 4b0bed3 to a013ff3 Compare April 23, 2025 07:43

hectorcast-db temporarily deployed to test-trigger-is April 23, 2025 07:43 — with GitHub Actions Inactive

hectorcast-db temporarily deployed to test-trigger-is April 23, 2025 07:44 — with GitHub Actions Inactive

hectorcast-db added this pull request to the merge queue Apr 29, 2025

Merged via the queue into main with commit 6d8d906 Apr 29, 2025
17 checks passed

hectorcast-db deleted the hectorcast-db/databricks-wif branch April 29, 2025 10:07
