
Google Cloud Dataproc

This module manages a Google Cloud Dataproc cluster resource, including IAM.

- TODO
- Examples
- IAM
  - Authoritative IAM
  - Additive IAM
- Variables
- Outputs

TODO

Add support for Cloud Dataproc autoscaling policy.

Examples

Simple

module "processing-dp-cluster-2" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
}
# tftest modules=1 resources=1

Cluster configuration

To set the cluster configuration, use the `dataproc_config.cluster_config` variable.

module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  dataproc_config = {
    cluster_config = {
      gce_cluster_config = {
        subnetwork             = "https://www.googleapis.com/compute/v1/projects/PROJECT/regions/europe-west1/subnetworks/SUBNET"
        zone                   = "europe-west1-b"
        service_account        = ""
        service_account_scopes = ["cloud-platform"]
        internal_ip_only       = true
      }
    }
  }
}
# tftest modules=1 resources=1

Cluster with CMEK encryption

To encrypt the cluster with a Customer Managed Encryption Key (CMEK), set the `dataproc_config.encryption_config` variable. The Compute Engine service agent and the Cloud Storage service agent need the CryptoKey Encrypter/Decrypter role on the configured KMS key (Documentation).

module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  dataproc_config = {
    cluster_config = {
      gce_cluster_config = {
        subnetwork             = "https://www.googleapis.com/compute/v1/projects/PROJECT/regions/europe-west1/subnetworks/SUBNET"
        zone                   = "europe-west1-b"
        service_account        = ""
        service_account_scopes = ["cloud-platform"]
        internal_ip_only       = true
      }
    }
    encryption_config = {
      kms_key_name = "projects/project-id/locations/region/keyRings/key-ring-name/cryptoKeys/key-name"
    }
  }
}
# tftest modules=1 resources=1
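
The role requirement above can be met outside this module. The following is a minimal sketch, not part of the module: it assumes the standard service agent email formats, uses `PROJECT_NUMBER` as a placeholder, and reuses the placeholder key path from the example. The local names `kms_key` and `service_agents` and the resource name `dataproc_cmek` are hypothetical.

```hcl
# Hypothetical sketch: grants the CryptoKey Encrypter/Decrypter role
# on the KMS key to both service agents. Replace PROJECT_NUMBER and the
# key path with real values.
locals {
  kms_key = "projects/project-id/locations/region/keyRings/key-ring-name/cryptoKeys/key-name"
  service_agents = {
    # Assumed email formats for the Compute Engine and Cloud Storage service agents.
    compute = "serviceAccount:service-PROJECT_NUMBER@compute-system.iam.gserviceaccount.com"
    storage = "serviceAccount:service-PROJECT_NUMBER@gs-project-accounts.iam.gserviceaccount.com"
  }
}

# One additive binding per service agent, so other members on the key are left untouched.
resource "google_kms_crypto_key_iam_member" "dataproc_cmek" {
  for_each      = local.service_agents
  crypto_key_id = local.kms_key
  role          = "roles/cloudkms.cryptoKeyEncrypterDecrypter"
  member        = each.value
}
```

Additive `iam_member` resources are used here on the assumption that the key may already have other bindings managed elsewhere.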

IAM

IAM is managed via several variables that implement different features and levels of control:

- iam and group_iam configure authoritative bindings that manage individual roles exclusively, and are internally merged
- iam_bindings configures authoritative bindings with optional support for conditions, and is not internally merged with the previous two variables
- iam_bindings_additive configures additive bindings via individual role/member pairs with optional support for conditions

The authoritative and additive approaches can be used together, provided different roles are managed by each. Some care must also be taken with the group_iam variable to ensure that variable keys are static values, so that Terraform is able to compute the dependency graph.

Refer to the project module for examples of the IAM interface.

Authoritative IAM

module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  group_iam = {
    "[email protected]" = [
      "roles/dataproc.viewer"
    ]
  }
  iam = {
    "roles/dataproc.viewer" = [
      "serviceAccount:service-account@PROJECT_ID.iam.gserviceaccount.com"
    ]
  }
}
# tftest modules=1 resources=2
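
The `iam_bindings` variable has no example of its own in this section. A minimal sketch follows, based only on the `{KEY => {role = ROLE, members = [], condition = {}}}` shape given in the Variables table. The key `dataproc-editors` is an arbitrary, hypothetical name, and the optional `condition` is left out because its fields are not documented here.

```hcl
module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  iam_bindings = {
    # "dataproc-editors" is an arbitrary key chosen for illustration.
    dataproc-editors = {
      role = "roles/dataproc.editor"
      members = [
        "serviceAccount:service-account@PROJECT_ID.iam.gserviceaccount.com"
      ]
    }
  }
}
```

Because these bindings are not merged with `iam` and `group_iam`, the sketch uses a role that neither of those variables manages.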

Additive IAM

module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  prefix     = "prefix"
  iam_bindings_additive = {
    am1-viewer = {
      member = "user:[email protected]"
      role   = "roles/dataproc.viewer"
    }
  }
}
# tftest modules=1 resources=2

Variables

name	description	type	required	default
name	Cluster name.	`string`	✓
project_id	Project ID.	`string`	✓
region	Dataproc region.	`string`	✓
dataproc_config	Dataproc cluster config.	`object({…})`		`{}`
group_iam	Authoritative IAM binding for organization groups, in {GROUP_EMAIL => [ROLES]} format. Group emails need to be static. Can be used in combination with the `iam` variable.	`map(list(string))`		`{}`
iam	IAM bindings in {ROLE => [MEMBERS]} format.	`map(list(string))`		`{}`
iam_bindings	Authoritative IAM bindings in {KEY => {role = ROLE, members = [], condition = {}}}. Keys are arbitrary.	`map(object({…}))`		`{}`
iam_bindings_additive	Individual additive IAM bindings. Keys are arbitrary.	`map(object({…}))`		`{}`
labels	Resource labels for the cluster, also used to annotate related underlying resources such as Compute Engine VMs.	`map(string)`		`{}`
prefix	Optional prefix used to generate project id and name.	`string`		`null`
service_account	Service account to set on the Dataproc cluster.	`string`		`null`
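
As an illustration of the optional variables above, the sketch below sets `labels` and the top-level `service_account`. The service account email and the label values are hypothetical placeholders, and how `service_account` interacts with `dataproc_config.cluster_config.gce_cluster_config.service_account` is not described in this README.

```hcl
module "processing-dp-cluster" {
  source     = "./fabric/modules/dataproc"
  project_id = "my-project"
  name       = "my-cluster"
  region     = "europe-west1"
  # Hypothetical service account email, shown only to illustrate the variable.
  service_account = "dataproc-sa@my-project.iam.gserviceaccount.com"
  # Hypothetical labels; any map(string) is accepted according to the table above.
  labels = {
    environment = "dev"
    team        = "data"
  }
}
```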

Outputs

name	description	sensitive
bucket_names	List of bucket names which have been assigned to the cluster.
http_ports	The map of port descriptions to URLs.
id	Fully qualified cluster id.
instance_names	List of instance names which have been assigned to the cluster.
name	The name of the cluster.
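
The outputs listed above can be referenced from the calling configuration. A minimal sketch, assuming the module is instantiated as `processing-dp-cluster` as in the examples; the output names `dataproc_cluster_id` and `dataproc_bucket_names` are hypothetical.

```hcl
# Re-export the fully qualified cluster id from the root module.
output "dataproc_cluster_id" {
  description = "Fully qualified ID of the Dataproc cluster."
  value       = module.processing-dp-cluster.id
}

# Expose the buckets assigned to the cluster, for example to grant access to them elsewhere.
output "dataproc_bucket_names" {
  description = "Buckets assigned to the Dataproc cluster."
  value       = module.processing-dp-cluster.bucket_names
}
```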
