Commit 9fd6b47

Merge pull request #1 from data-platform-hq/add-module

feat: added databricks runtime module

2 parents 209f7d9 + b6c5a2f

File tree

7 files changed: +312 −3 lines changed


README.md

Lines changed: 67 additions & 3 deletions

```diff
-# Azure <> Terraform module
-Terraform module for creation Azure <>
+# Databricks Workspace Terraform module
+Terraform module for Databricks Workspace configuration and resource creation

 ## Usage

 <!-- BEGIN_TF_DOCS -->
+## Requirements
+
+| Name | Version |
+| ---- | ------- |
+| <a name="requirement_terraform"></a> [terraform](#requirement\_terraform) | >= 1.0.0 |
+| <a name="requirement_azurerm"></a> [azurerm](#requirement\_azurerm) | >= 3.23.0 |
+| <a name="requirement_databricks"></a> [databricks](#requirement\_databricks) | >= 1.4.0 |
+
+## Providers
+
+| Name | Version |
+| ---- | ------- |
+| <a name="provider_azurerm"></a> [azurerm](#provider\_azurerm) | 3.24.0 |
+| <a name="provider_databricks"></a> [databricks](#provider\_databricks) | 1.4.0 |
+
+## Modules
+
+No modules.
+
+## Resources
+
+| Name | Type |
+| ---- | ---- |
+| [azurerm_key_vault_secret.sp_client_id](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/key_vault_secret) | data |
+| [azurerm_key_vault_secret.sp_key](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/key_vault_secret) | data |
+| [azurerm_key_vault_secret.tenant_id](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/data-sources/key_vault_secret) | data |
+| [databricks_token.pat](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/token) | resource |
+| [databricks_user.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/user) | resource |
+| [azurerm_role_assignment.this](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/role_assignment) | resource |
+| [databricks_cluster.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/cluster) | resource |
+| [databricks_mount.adls](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mount) | resource |
+| [databricks_secret_scope.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/secret_scope) | resource |
+| [databricks_secret.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/secret) | resource |
+
+## Inputs
+
+| Name | Description | Type | Default | Required |
+| ---- | ----------- | ---- | ------- | :------: |
+| <a name="input_workspace_id"></a> [workspace\_id](#input\_workspace\_id) | Databricks Workspace ID | `string` | n/a | yes |
+| <a name="input_sp_client_id_secret_name"></a> [sp\_client\_id\_secret\_name](#input\_sp\_client\_id\_secret\_name) | Name of the Azure Key Vault secret that contains the Service Principal client ID | `string` | n/a | yes |
+| <a name="input_sp_key_secret_name"></a> [sp\_key\_secret\_name](#input\_sp\_key\_secret\_name) | Name of the Azure Key Vault secret that contains the Service Principal client secret | `string` | n/a | yes |
+| <a name="input_tenant_id_secret_name"></a> [tenant\_id\_secret\_name](#input\_tenant\_id\_secret\_name) | Name of the Azure Key Vault secret that contains the Service Principal tenant ID | `string` | n/a | yes |
+| <a name="input_key_vault_id"></a> [key\_vault\_id](#input\_key\_vault\_id) | ID of the Key Vault instance where the secrets reside | `string` | n/a | yes |
+| <a name="input_sku"></a> [sku](#input\_sku) | The SKU to use for the Databricks Workspace: [standard \| premium \| trial] | `string` | "standard" | no |
+| <a name="input_pat_token_lifetime_seconds"></a> [pat\_token\_lifetime\_seconds](#input\_pat\_token\_lifetime\_seconds) | The lifetime of the token, in seconds. If no lifetime is specified, the token remains valid indefinitely | `number` | 315569520 | no |
+| <a name="input_cluster_nodes_availability"></a> [cluster\_nodes\_availability](#input\_cluster\_nodes\_availability) | Availability type used for all subsequent nodes past the first_on_demand ones: [SPOT_AZURE \| SPOT_WITH_FALLBACK_AZURE \| ON_DEMAND_AZURE] | `string` | null | no |
+| <a name="input_first_on_demand"></a> [first\_on\_demand](#input\_first\_on\_demand) | The first first_on_demand nodes of the cluster will be placed on on-demand instances | `number` | 0 | no |
+| <a name="input_spot_bid_max_price"></a> [spot\_bid\_max\_price](#input\_spot\_bid\_max\_price) | The max price for Azure spot instances. Use -1 to specify the lowest price | `number` | -1 | no |
+| <a name="input_autotermination_minutes"></a> [autotermination\_minutes](#input\_autotermination\_minutes) | Automatically terminate the cluster after it has been inactive for this many minutes. If not set, Databricks won't automatically terminate an inactive cluster. If specified, the threshold must be between 10 and 10000 minutes; set to 0 to explicitly disable automatic termination | `number` | 15 | no |
+| <a name="input_min_workers"></a> [min\_workers](#input\_min\_workers) | The minimum number of workers the cluster can scale down to when underutilized. This is also the initial number of workers the cluster has after creation | `number` | 0 | no |
+| <a name="input_max_workers"></a> [max\_workers](#input\_max\_workers) | The maximum number of workers the cluster can scale up to when overloaded. max_workers must be strictly greater than min_workers | `number` | 1 | no |
+| <a name="input_users"></a> [users](#input\_users) | List of users granted access to Databricks | `list(string)` | [] | no |
+| <a name="input_secrets"></a> [secrets](#input\_secrets) | Map of secrets to create in Databricks | `map(any)` | {} | no |
+| <a name="input_use_local_secret_scope"></a> [use\_local\_secret\_scope](#input\_use\_local\_secret\_scope) | Create a Databricks secret scope and store secrets in it | `bool` | false | no |
+| <a name="input_permissions"></a> [permissions](#input\_permissions) | Databricks Workspace permission maps | `list(map(string))` | <pre> [{ <br> object_id = null <br> role = null <br> }] </pre> | no |
+| <a name="input_spark_version"></a> [spark\_version](#input\_spark\_version) | Databricks Runtime version | `string` | "9.1.x-scala2.12" | no |
+| <a name="input_node_type"></a> [node\_type](#input\_node\_type) | Databricks node type ID | `string` | "Standard_D3_v2" | no |
+| <a name="input_mountpoints"></a> [mountpoints](#input\_mountpoints) | Mount points for Databricks | `map(any)` | null | no |
+
+## Outputs
+
+| Name | Description |
+| ---- | ----------- |
+| <a name="output_token"></a> [token](#output\_token) | Databricks Personal Authorization Token |
+| <a name="output_cluster_id"></a> [cluster\_id](#output\_cluster\_id) | Databricks Cluster Id |
 <!-- END_TF_DOCS -->

 ## License

-Apache 2 Licensed. For more information please see [LICENSE](https://github.com/data-platform-hq/terraform-azurerm<>/tree/master/LICENSE)
+Apache 2 Licensed. For more information please see [LICENSE](https://github.com/data-platform-hq/terraform-databricks-databricks-runtime/blob/main/LICENSE)
```
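The `## Usage` section of the new README contains no example invocation. A minimal sketch of one, assuming a hypothetical registry source `data-platform-hq/databricks-runtime/databricks` and pre-existing workspace and Key Vault resources under placeholder names (none of these names appear in this commit):

```hcl
# Hypothetical caller; the source path, resource names, and secret names
# are placeholders, not taken from this commit.
module "databricks_runtime" {
  source = "data-platform-hq/databricks-runtime/databricks"

  workspace_id             = azurerm_databricks_workspace.example.id
  key_vault_id             = azurerm_key_vault.example.id
  sp_client_id_secret_name = "sp-client-id" # Key Vault secret names (placeholders)
  sp_key_secret_name       = "sp-key"
  tenant_id_secret_name    = "tenant-id"

  users       = ["user@example.com"]
  min_workers = 0
  max_workers = 2
}
```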

main.tf

Lines changed: 69 additions & 0 deletions

```hcl
# Service Principal credentials are read from Azure Key Vault
data "azurerm_key_vault_secret" "sp_client_id" {
  name         = var.sp_client_id_secret_name
  key_vault_id = var.key_vault_id
}

data "azurerm_key_vault_secret" "sp_key" {
  name         = var.sp_key_secret_name
  key_vault_id = var.key_vault_id
}

data "azurerm_key_vault_secret" "tenant_id" {
  name         = var.tenant_id_secret_name
  key_vault_id = var.key_vault_id
}

locals {
  # User-supplied secrets plus the Service Principal credentials
  secrets = merge(var.secrets, {
    (var.sp_client_id_secret_name) = { value = data.azurerm_key_vault_secret.sp_client_id.value }
    (var.sp_key_secret_name)       = { value = data.azurerm_key_vault_secret.sp_key.value }
  })
  # When no local scope is created, an externally managed scope named "main" is assumed
  secret_scope_name = var.use_local_secret_scope ? databricks_secret_scope.this[0].name : "main"
  mount_secret_name = var.use_local_secret_scope ? databricks_secret.this[var.sp_key_secret_name].key : data.azurerm_key_vault_secret.sp_key.name
}

resource "databricks_token" "pat" {
  comment          = "Terraform Provisioning"
  lifetime_seconds = var.pat_token_lifetime_seconds
}

resource "databricks_user" "this" {
  for_each  = var.sku == "standard" ? toset(var.users) : toset([])
  user_name = each.value

  lifecycle { ignore_changes = [external_id] }
}

resource "azurerm_role_assignment" "this" {
  for_each = {
    for permission in var.permissions : "${permission.object_id}-${permission.role}" => permission
    if permission.role != null
  }
  scope                = var.workspace_id
  role_definition_name = each.value.role
  principal_id         = each.value.object_id
}

resource "databricks_cluster" "this" {
  cluster_name  = "shared autoscaling"
  spark_version = var.spark_version

  node_type_id            = var.node_type
  autotermination_minutes = var.autotermination_minutes

  autoscale {
    min_workers = var.min_workers
    max_workers = var.max_workers
  }

  azure_attributes {
    availability       = var.cluster_nodes_availability
    first_on_demand    = var.first_on_demand
    spot_bid_max_price = var.spot_bid_max_price
  }

  lifecycle {
    ignore_changes = [
      state
    ]
  }
}
```
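The `azurerm_role_assignment` resource in main.tf keys its `for_each` on each entry's `object_id` and `role`, skipping entries whose role is null. An illustrative `permissions` value (the object ID is a placeholder):

```hcl
permissions = [
  {
    object_id = "00000000-0000-0000-0000-000000000000" # AAD object ID (placeholder)
    role      = "Contributor"                          # any valid Azure role name
  }
]
```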

mount.tf

Lines changed: 19 additions & 0 deletions

```hcl
resource "databricks_mount" "adls" {
  for_each = var.mountpoints

  cluster_id = databricks_cluster.this.id
  name       = each.key
  uri        = "abfss://${each.value["container_name"]}@${each.value["storage_account_name"]}.dfs.core.windows.net/${each.value["root_path"]}"
  extra_configs = {
    "fs.azure.account.auth.type" : "OAuth",
    "fs.azure.account.oauth.provider.type" : "org.apache.hadoop.fs.azurebfs.oauth2.ClientCredsTokenProvider",
    "fs.azure.account.oauth2.client.id" : data.azurerm_key_vault_secret.sp_client_id.value,
    # {{secrets/<scope>/<key>}} references are resolved by Databricks at runtime
    "fs.azure.account.oauth2.client.secret" : "{{secrets/${local.secret_scope_name}/${local.mount_secret_name}}}",
    "fs.azure.account.oauth2.client.endpoint" : "https://login.microsoftonline.com/${data.azurerm_key_vault_secret.tenant_id.value}/oauth2/token",
    "fs.azure.createRemoteFileSystemDuringInitialization" : "false",
    "spark.databricks.sqldw.jdbc.service.principal.client.id" : data.azurerm_key_vault_secret.sp_client_id.value,
    "spark.databricks.sqldw.jdbc.service.principal.client.secret" : "{{secrets/${local.secret_scope_name}/${local.mount_secret_name}}}",
  }
}
```
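The mount URI above is assembled from three keys of each `mountpoints` entry, so a value for that variable would look like this (storage account and container names are placeholders):

```hcl
mountpoints = {
  raw = {                                  # each.key becomes the mount name
    storage_account_name = "examplestore"  # placeholder
    container_name       = "raw"
    root_path            = ""              # empty string mounts the container root
  }
}
```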

outputs.tf

Lines changed: 9 additions & 0 deletions

```hcl
output "token" {
  value       = databricks_token.pat.token_value
  description = "Databricks Personal Authorization Token"
  # token_value is provider-sensitive; Terraform >= 0.15 requires the output
  # referencing it to be marked sensitive as well
  sensitive = true
}

output "cluster_id" {
  value       = databricks_cluster.this.id
  description = "Databricks Cluster Id"
}
```
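A sketch of how a caller might consume these outputs, assuming a hypothetical module label `runtime` and an existing workspace resource (names are illustrative, not from this commit):

```hcl
# Authenticate a second databricks provider with the PAT this module creates
provider "databricks" {
  alias = "pat"
  host  = azurerm_databricks_workspace.example.workspace_url
  token = module.runtime.token
}
```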

secrets.tf

Lines changed: 14 additions & 0 deletions

```hcl
resource "databricks_secret_scope" "this" {
  count = var.use_local_secret_scope ? 1 : 0

  name                     = "main"
  initial_manage_principal = "users"
}

resource "databricks_secret" "this" {
  for_each = var.use_local_secret_scope ? local.secrets : {}

  key          = each.key
  string_value = each.value["value"]
  scope        = databricks_secret_scope.this[0].id
}
```
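Each entry of `var.secrets` must expose a `value` key, matching the `each.value["value"]` lookup above. An illustrative value (name and secret are placeholders):

```hcl
secrets = {
  "storage-account-key" = { value = "placeholder-secret" }
}
```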
