Support Databricks Workload Identity Federation for GitHub tokens #933
Conversation
LGTM as soon as we have agreement on the auth_type name.
README.md
Outdated
| `token` | _(String)_ The Databricks personal access token (PAT) _(AWS, Azure, and GCP)_ or Azure Active Directory (Azure AD) token _(Azure)_. | `DATABRICKS_TOKEN` |
| `username` | _(String)_ The Databricks username part of basic authentication. Only possible when `Host` is `*.cloud.databricks.com` _(AWS)_. | `DATABRICKS_USERNAME` |
| `password` | _(String)_ The Databricks password part of basic authentication. Only possible when `Host` is `*.cloud.databricks.com` _(AWS)_. | `DATABRICKS_PASSWORD` |
- For Databricks WIF authentication, you must provide the `host`, `client_id` and `token_audience` _(optional)_ either directly, through the corresponding environment variables, or in your `.databrickscfg` configuration file.
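Under that wording, a minimal `.databrickscfg` profile for WIF might look like the following sketch. The profile name and the placeholder values are illustrative assumptions, not values from this PR:

```ini
; Hypothetical profile; host, client_id and token_audience are the fields
; named in the README change above.
[github-wif]
host           = https://my-workspace.cloud.databricks.com
client_id      = <service-principal-application-id>
; token_audience is optional; omit it to use the default audience
token_audience = <custom-audience>
```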
If integration tests don't run automatically, an authorized user can run them manually by following the instructions below. Checks will be approved automatically on success.
## Release v0.51.0

### New Features and Improvements

* Enabled asynchronous token refreshes by default. A new `disable_async_token_refresh` configuration option has been added to allow disabling this feature if necessary ([#952](#952)). To disable asynchronous token refresh, set the environment variable `DATABRICKS_DISABLE_ASYNC_TOKEN_REFRESH=true` or configure it within your configuration object. The previous `enable_experimental_async_token_refresh` option has been removed, as asynchronous refresh is now the default behavior.
* Introduced support for Databricks Workload Identity Federation in GitHub workflows ([#933](#933)). See README.md for instructions.
* [Breaking] Users running their workflows in GitHub Actions that use cloud-native authentication and also have the `DATABRICKS_CLIENT_ID` and `DATABRICKS_HOST` environment variables set may see their authentication start failing due to the order in which the SDK tries different authentication methods.

### API Changes

* Added [w.alerts_v2](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/sql/alerts_v2.html) workspace-level service.
* Added `update_ncc_azure_private_endpoint_rule_public()` method for [a.network_connectivity](https://databricks-sdk-py.readthedocs.io/en/latest/account/settings/network_connectivity.html) account-level service.
* Added `update_endpoint_budget_policy()` and `update_endpoint_custom_tags()` methods for [w.vector_search_endpoints](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/vectorsearch/vector_search_endpoints.html) workspace-level service.
* Added `created_at`, `created_by` and `metastore_id` fields for `databricks.sdk.service.catalog.SetArtifactAllowlist`.
* Added `node_type_flexibility` field for `databricks.sdk.service.compute.EditInstancePool`.
* Added `page_size` and `page_token` fields for `databricks.sdk.service.compute.GetEvents`.
* Added `next_page_token` and `prev_page_token` fields for `databricks.sdk.service.compute.GetEventsResponse`.
* Added `node_type_flexibility` field for `databricks.sdk.service.compute.GetInstancePool`.
* Added `node_type_flexibility` field for `databricks.sdk.service.compute.InstancePoolAndStats`.
* Added `effective_performance_target` field for `databricks.sdk.service.jobs.RepairHistoryItem`.
* Added `performance_target` field for `databricks.sdk.service.jobs.RepairRun`.
* [Breaking] Added `network_connectivity_config` field for `databricks.sdk.service.settings.CreateNetworkConnectivityConfigRequest`.
* [Breaking] Added `private_endpoint_rule` field for `databricks.sdk.service.settings.CreatePrivateEndpointRuleRequest`.
* Added `domain_names` field for `databricks.sdk.service.settings.NccAzurePrivateEndpointRule`.
* Added `auto_resolve_display_name` field for `databricks.sdk.service.sql.CreateAlertRequest`.
* Added `auto_resolve_display_name` field for `databricks.sdk.service.sql.CreateQueryRequest`.
* Added `budget_policy_id` field for `databricks.sdk.service.vectorsearch.CreateEndpoint`.
* Added `custom_tags` and `effective_budget_policy_id` fields for `databricks.sdk.service.vectorsearch.EndpointInfo`.
* Added `create_clean_room`, `execute_clean_room_task` and `modify_clean_room` enum values for `databricks.sdk.service.catalog.Privilege`.
* Added `dns_resolution_error` and `gcp_denied_by_org_policy` enum values for `databricks.sdk.service.compute.TerminationReasonCode`.
* Added `disabled` enum value for `databricks.sdk.service.jobs.TerminationCodeCode`.
* Added `expired` enum value for `databricks.sdk.service.settings.NccAzurePrivateEndpointRuleConnectionState`.
* [Breaking] Changed `create_network_connectivity_configuration()` and `create_private_endpoint_rule()` methods for [a.network_connectivity](https://databricks-sdk-py.readthedocs.io/en/latest/account/settings/network_connectivity.html) account-level service with new required argument order.
* [Breaking] Changed `create_index()` method for [w.vector_search_indexes](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/vectorsearch/vector_search_indexes.html) workspace-level service to return `databricks.sdk.service.vectorsearch.VectorIndex` dataclass.
* [Breaking] Changed `delete_data_vector_index()` method for [w.vector_search_indexes](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/vectorsearch/vector_search_indexes.html) workspace-level service. HTTP method/verb has changed.
* [Breaking] Changed `delete_data_vector_index()` method for [w.vector_search_indexes](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/vectorsearch/vector_search_indexes.html) workspace-level service with new required argument order.
* [Breaking] Changed `databricks.sdk.service.vectorsearch.List` dataclass to.
* [Breaking] Changed `workload_size` field for `databricks.sdk.service.serving.ServedModelInput` to type `str` dataclass.
* [Breaking] Changed `group_id` field for `databricks.sdk.service.settings.NccAzurePrivateEndpointRule` to type `str` dataclass.
* [Breaking] Changed `target_services` field for `databricks.sdk.service.settings.NccAzureServiceEndpointRule` to type `databricks.sdk.service.settings.EgressResourceTypeList` dataclass.
* [Breaking] Changed `data_array` field for `databricks.sdk.service.vectorsearch.ResultData` to type `databricks.sdk.service.vectorsearch.ListValueList` dataclass.
* [Breaking] Changed waiter for [VectorSearchEndpointsAPI.create_endpoint](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/vectorsearch/vector_search_endpoints.html#databricks.sdk.service.vectorsearch.VectorSearchEndpointsAPI.create_endpoint) method.
* [Breaking] Removed `name` and `region` fields for `databricks.sdk.service.settings.CreateNetworkConnectivityConfigRequest`.
* [Breaking] Removed `group_id` and `resource_id` fields for `databricks.sdk.service.settings.CreatePrivateEndpointRuleRequest`.
* [Breaking] Removed `null_value` field for `databricks.sdk.service.vectorsearch.Value`.
* [Breaking] Removed `large`, `medium` and `small` enum values for `databricks.sdk.service.serving.ServedModelInputWorkloadSize`.
* [Breaking] Removed `blob`, `dfs`, `mysql_server` and `sql_server` enum values for `databricks.sdk.service.settings.NccAzurePrivateEndpointRuleGroupId`.
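The environment-variable route for disabling asynchronous token refresh can be sketched as follows. The variable name comes from the release notes above; the commented `Config` keyword is an assumption that the constructor argument matches the documented option name:

```python
import os

# Environment variable named in the v0.51.0 release notes; setting it to
# "true" disables asynchronous token refresh.
os.environ["DATABRICKS_DISABLE_ASYNC_TOKEN_REFRESH"] = "true"

# Assumed configuration-object equivalent (keyword matches the option name):
# from databricks.sdk.core import Config
# cfg = Config(host="...", token="...", disable_async_token_refresh=True)

print(os.environ["DATABRICKS_DISABLE_ASYNC_TOKEN_REFRESH"])
```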
What changes are proposed in this pull request?
This PR adds support for Databricks Workload Identity Federation (WIF) using GitHub tokens. This allows users to use WIF from their GitHub workflows and authenticate their workloads without long-lived secrets.
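A workflow using this feature might look like the sketch below. The `id-token: write` permission is what allows GitHub to issue an OIDC token to the job; `DATABRICKS_HOST` and `DATABRICKS_CLIENT_ID` are the environment variables referenced in the release notes, while the remaining step contents and names are illustrative assumptions:

```yaml
# Hypothetical GitHub Actions workflow sketch; step names and script paths
# are placeholders, not part of this PR.
name: run-databricks-workload
on: push
permissions:
  id-token: write   # required for GitHub to issue an OIDC token to the job
  contents: read
jobs:
  run:
    runs-on: ubuntu-latest
    env:
      DATABRICKS_HOST: https://my-workspace.cloud.databricks.com
      DATABRICKS_CLIENT_ID: ${{ vars.SP_CLIENT_ID }}
    steps:
      - uses: actions/checkout@v4
      - run: pip install databricks-sdk
      - run: python my_script.py  # SDK resolves WIF credentials automatically
```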
The new credentials strategy is added to the DefaultCredentialsStrategy after the other Databricks credential strategies and before the cloud-specific authentication methods.
WIF credentials use a subset of the configuration values of the other Databricks authentication methods, so placing the strategy after them ensures that WIF is not used when another Databricks authentication method is configured.
WIF uses the Databricks client ID, which the cloud-specific authentication methods do not use, so it will not be selected when a cloud-specific authentication method is configured.
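The ordering argument above can be illustrated with a minimal "first strategy that applies wins" chain. This is a toy sketch, not the SDK's actual implementation; class names, strategy names, and the applicability predicates are all illustrative assumptions:

```python
# Toy model of a default-credentials chain: strategies are tried in order
# and the first one whose preconditions hold is selected.

class Strategy:
    def __init__(self, name, applies):
        self.name = name
        self.applies = applies  # predicate over a config dict

def resolve(cfg, strategies):
    """Return the name of the first strategy that can authenticate with cfg."""
    for s in strategies:
        if s.applies(cfg):
            return s.name
    raise ValueError("cannot configure default credentials")

# Hypothetical chain mirroring the ordering described above: Databricks-native
# strategies first, then GitHub WIF, then cloud-specific methods.
CHAIN = [
    Strategy("pat", lambda c: "token" in c),
    Strategy("oauth-m2m", lambda c: "client_id" in c and "client_secret" in c),
    Strategy("github-wif", lambda c: "client_id" in c and c.get("in_github_actions", False)),
    Strategy("azure-msi", lambda c: c.get("use_azure_msi", False)),
]

# A configured PAT takes precedence over WIF:
print(resolve({"token": "dapi...", "client_id": "abc", "in_github_actions": True}, CHAIN))  # pat
# With only a client ID inside GitHub Actions, WIF is selected:
print(resolve({"client_id": "abc", "in_github_actions": True}, CHAIN))  # github-wif
```

The placement means a fully configured OAuth or PAT setup always wins over WIF, while cloud-specific methods never see the client ID that WIF keys on.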
How is this tested?
Added tests.