Databricks Unity Catalog Metastore Terraform module

Terraform module for creating a Databricks Unity Catalog Metastore.

Usage

This module provisions a Databricks Unity Catalog Metastore and configures its default access credentials. The examples below show provider configuration for the Account-level and Workspace-level APIs.

Account-level API provider authorization example (recommended).

# Prerequisite resources
variable "databricks_account_id" {}

# Databricks Access Connector (managed identity)
resource "azurerm_databricks_access_connector" "example" {
  name                = "example-resource"
  resource_group_name = "example-rg"
  location            = "eastus"

  identity {
    type = "SystemAssigned"
  }
}

# Storage Account
data "azurerm_storage_account" "example" {
  name                = "example-storage-account"
  resource_group_name = "example-rg"
}

# Container
data "azurerm_storage_container" "example" {
  name                 = "example-container-name"
  storage_account_name = data.azurerm_storage_account.example.name
}

# Configure Databricks Provider
provider "databricks" {
  alias      = "account"
  host       = "https://accounts.azuredatabricks.net"
  account_id = var.databricks_account_id
}

# Metastore creation
module "metastore" {
  source  = "data-platform-hq/metastore/databricks"
  version = "~> 1.0.0"

  metastore_name                                    = "primary-metastore"
  region                                            = "eastus" # required if using account-level api
  storage_root                                      = "abfss://${data.azurerm_storage_container.example.name}@${data.azurerm_storage_account.example.name}.dfs.core.windows.net/"
  azure_access_connector_id                         = azurerm_databricks_access_connector.example.id
  credentials_type                                  = "azure"
  delta_sharing_scope                               = "INTERNAL_AND_EXTERNAL"
  delta_sharing_recipient_token_lifetime_in_seconds = 0 # 0 means the token never expires

  providers = {
    databricks = databricks.account
  }
}

Workspace-level API provider authorization.

# Prerequisite resources
# Databricks Access Connector (managed identity)
resource "azurerm_databricks_access_connector" "example" {
  name                = "example-resource"
  resource_group_name = "example-rg"
  location            = "eastus"

  identity {
    type = "SystemAssigned"
  }
}

# Storage Account
data "azurerm_storage_account" "example" {
  name                = "example-storage-account"
  resource_group_name = "example-rg"
}

# Container
data "azurerm_storage_container" "example" {
  name                 = "example-container-name"
  storage_account_name = data.azurerm_storage_account.example.name
}

# Configure Databricks Provider
data "azurerm_databricks_workspace" "example" {
  name                = "example-workspace"
  resource_group_name = "example-rg"
}

provider "databricks" {
  alias                       = "workspace"
  host                        = data.azurerm_databricks_workspace.example.workspace_url
  azure_workspace_resource_id = data.azurerm_databricks_workspace.example.id
}

# Metastore creation
module "metastore" {
  source  = "data-platform-hq/metastore/databricks"
  version = "~> 1.0.0"

  metastore_name                                    = "primary-metastore"
  storage_root                                      = "abfss://${data.azurerm_storage_container.example.name}@${data.azurerm_storage_account.example.name}.dfs.core.windows.net/"
  azure_access_connector_id                         = azurerm_databricks_access_connector.example.id
  credentials_type                                  = "azure"
  delta_sharing_scope                               = "INTERNAL_AND_EXTERNAL"
  delta_sharing_recipient_token_lifetime_in_seconds = 0 # 0 means the token never expires

  providers = {
    databricks = databricks.workspace
  }
}

Requirements

| Name | Version |
|------|---------|
| terraform | >=1.0.0 |
| databricks | >=1.27.0 |
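
The version constraints above can be pinned in the calling configuration. The following is a minimal sketch, assuming the provider is pulled from the `databricks/databricks` registry namespace; adjust the constraints to your own pinning policy.

```hcl
# Sketch: pin Terraform and the Databricks provider to the minimum
# versions listed in the Requirements table.
terraform {
  required_version = ">= 1.0.0"

  required_providers {
    databricks = {
      source  = "databricks/databricks" # assumed registry source
      version = ">= 1.27.0"
    }
  }
}
```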

Providers

| Name | Version |
|------|---------|
| databricks | >=1.27.0 |

Modules

No modules.

Resources

| Name | Type |
|------|------|
| databricks_metastore.this | resource |
| databricks_metastore_data_access.this | resource |

Inputs

| Name | Description | Type | Default | Required |
|------|-------------|------|---------|----------|
| aws_iam_role_arn | The Amazon Resource Name (ARN) of the AWS IAM role for S3 data access | string | null | no |
| azure_access_connector_id | Databricks Access Connector ID that lets you connect managed identities to an Azure Databricks account. Used to access Unity Catalog with the assigned identity | string | null | no |
| credentials_type | Cloud provider. Select from: azure, gcp, aws | string | n/a | yes |
| delta_sharing_recipient_token_lifetime_in_seconds | Expiration duration, in seconds, of recipient data access tokens. Set to 0 for unlimited duration | string | 0 | no |
| delta_sharing_scope | Enables Delta Sharing on the metastore. Valid values: INTERNAL, INTERNAL_AND_EXTERNAL | string | "INTERNAL" | no |
| is_data_access_default | Whether the Data Access Storage Credentials are the default for the assigned Metastore | string | true | no |
| metastore_data_access_force_destroy | DAC force destroy option | bool | true | no |
| metastore_data_access_name | Unity Catalog Metastore Data Access Storage Credentials name | string | null | no |
| metastore_name | Unity Catalog Metastore name | string | n/a | yes |
| region | The region of the metastore. Required when using Account-level API provider authorization | string | null | no |
| storage_root | Path on cloud storage where the managed Unity Catalog Metastore is created | string | n/a | yes |
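
The usage examples cover Azure only, but the inputs above also list `aws` as a credentials type together with `aws_iam_role_arn`. The following is a rough sketch of what an AWS call might look like with the account-level provider. The bucket name, role ARN, region, and module label are hypothetical placeholders, and the combination of inputs is inferred from the table rather than taken from a documented example.

```hcl
# Sketch only: AWS variant inferred from the Inputs table.
variable "databricks_account_id" {}

provider "databricks" {
  alias      = "account"
  host       = "https://accounts.cloud.databricks.com" # AWS account console
  account_id = var.databricks_account_id
}

module "metastore_aws" {
  source  = "data-platform-hq/metastore/databricks"
  version = "~> 1.0.0"

  metastore_name   = "primary-metastore"
  region           = "us-east-1"                        # placeholder; required with the account-level API
  storage_root     = "s3://example-uc-bucket/metastore" # placeholder bucket
  credentials_type = "aws"
  aws_iam_role_arn = "arn:aws:iam::123456789012:role/example-uc-access" # placeholder ARN

  providers = {
    databricks = databricks.account
  }
}
```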

Outputs

| Name | Description |
|------|-------------|
| gcp_service_principal | Databricks-managed GCP Service Account used for UC access |
| metastore_id | Unity Catalog Metastore Id |
| metastore_name | Unity Catalog Metastore name |
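
The module creates only the metastore and its data access credentials, so attaching the metastore to a workspace is left to the caller. A minimal sketch of consuming the `metastore_id` output follows. It assumes the `module "metastore"` block and the `databricks.account` provider alias from the account-level example, and the `workspace_id` variable is a hypothetical input.

```hcl
# Hypothetical input: numeric ID of the workspace to attach.
variable "workspace_id" {
  description = "ID of the Databricks workspace that should use the metastore"
  type        = number
}

# Attach the metastore created by the module to the workspace.
resource "databricks_metastore_assignment" "this" {
  provider     = databricks.account
  metastore_id = module.metastore.metastore_id
  workspace_id = var.workspace_id
}

# Re-export the ID so other configurations can reference it.
output "metastore_id" {
  value = module.metastore.metastore_id
}
```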

License

Apache 2 Licensed. For more information, see LICENSE.