Add simple composer 2 blueprint #824

Merged
merged 12 commits into from
Sep 28, 2022
2 changes: 1 addition & 1 deletion blueprints/README.md
@@ -5,7 +5,7 @@ This section **[networking blueprints](./networking/)** that implement core patt
Currently available blueprints:

- **cloud operations** - [Resource tracking and remediation via Cloud Asset feeds](./cloud-operations/asset-inventory-feed-remediation), [Granular Cloud DNS IAM via Service Directory](./cloud-operations/dns-fine-grained-iam), [Granular Cloud DNS IAM for Shared VPC](./cloud-operations/dns-shared-vpc), [Compute Engine quota monitoring](./cloud-operations/quota-monitoring), [Scheduled Cloud Asset Inventory Export to Bigquery](./cloud-operations/scheduled-asset-inventory-export-bq), [Packer image builder](./cloud-operations/packer-image-builder), [On-prem SA key management](./cloud-operations/onprem-sa-key-management), [TCP healthcheck for unmanaged GCE instances](./cloud-operations/unmanaged-instances-healthcheck), [HTTP Load Balancer with Cloud Armor](./cloud-operations/glb_and_armor)
- **data solutions** - [GCE/GCS CMEK via centralized Cloud KMS](./data-solutions/gcs-to-bq-with-least-privileges/), [Cloud Storage to Bigquery with Cloud Dataflow with least privileges](./data-solutions/gcs-to-bq-with-least-privileges/), [Data Platform Foundations](./data-solutions/data-platform-foundations/), [SQL Server AlwaysOn availability groups blueprint](./data-solutions/sqlserver-alwayson), [Cloud SQL instance with multi-region read replicas](./data-solutions/cloudsql-multiregion/)
- **data solutions** - [GCE/GCS CMEK via centralized Cloud KMS](./data-solutions/gcs-to-bq-with-least-privileges/), [Cloud Storage to Bigquery with Cloud Dataflow with least privileges](./data-solutions/gcs-to-bq-with-least-privileges/), [Data Platform Foundations](./data-solutions/data-platform-foundations/), [SQL Server AlwaysOn availability groups blueprint](./data-solutions/sqlserver-alwayson), [Cloud SQL instance with multi-region read replicas](./data-solutions/cloudsql-multiregion/), [Cloud Composer version 2 private instance, supporting Shared VPC and external CMEK key](./data-solutions/composer-2/)
- **factories** - [The why and the how of resource factories](./factories/README.md)
- **GKE** - [GKE multitenant fleet](./gke/multitenant-fleet/), [Shared VPC with GKE support](./networking/shared-vpc-gke/), [Binary Authorization Pipeline](./gke/binauthz/), [Multi-cluster mesh on GKE (fleet API)](./gke/multi-cluster-mesh-gke-fleet-api/)
- **networking** - [hub and spoke via peering](./networking/hub-and-spoke-peering/), [hub and spoke via VPN](./networking/hub-and-spoke-vpn/), [DNS and Google Private Access for on-premises](./networking/onprem-google-access-dns/), [Shared VPC with GKE support](./networking/shared-vpc-gke/), [ILB as next hop](./networking/ilb-next-hop), [PSC for on-premises Cloud Function invocation](./networking/private-cloud-function-from-onprem/), [decentralized firewall](./networking/decentralized-firewall)
9 changes: 8 additions & 1 deletion blueprints/data-solutions/README.md
@@ -30,7 +30,7 @@ This [blueprint](./data-platform-foundations/) implements SQL Server Always On A

### Cloud SQL instance with multi-region read replicas

<a href="./cloudsql-multiregion/" title="Cloud SQL instance with multi-region read replicas"><img src="./cloudsql-multiregion/diagram.png" align="left" width="280px"></a>
<a href="./cloudsql-multiregion/" title="Cloud SQL instance with multi-region read replicas"><img src="./cloudsql-multiregion/images/diagram.png" align="left" width="280px"></a>
This [blueprint](./cloudsql-multiregion/) creates a [Cloud SQL instance](https://cloud.google.com/sql) with multi-region read replicas as described in the [Cloud SQL for PostgreSQL disaster recovery](https://cloud.google.com/architecture/cloud-sql-postgres-disaster-recovery-complete-failover-fallback) article.
<br clear="left">

@@ -41,3 +41,10 @@ This [blueprint](./data-playground/) creates a [Vertex AI
Notebook](https://cloud.google.com/vertex-ai/docs/workbench/introduction)
running on a VPC with a private IP and a dedicated Service Account. A GCS bucket and a BigQuery dataset are created to store inputs and outputs of data experiments.
<br clear="left">

### Cloud Composer version 2 private instance, supporting Shared VPC and external CMEK key

<a href="./composer-2/" title="Cloud Composer version 2 private instance, supporting Shared VPC and external CMEK key"><img src="./composer-2/diagram.png" align="left" width="280px"></a>
This [blueprint](./composer-2/) creates a [Cloud Composer](https://cloud.google.com/composer/) version 2 instance on a VPC with a dedicated service account. The solution accepts a Shared VPC and Cloud KMS CMEK keys as inputs.
<br clear="left">
115 changes: 115 additions & 0 deletions blueprints/data-solutions/composer-2/README.md
@@ -0,0 +1,115 @@
# Cloud Composer version 2 private instance, supporting Shared VPC and external CMEK key

This blueprint creates a private instance of [Cloud Composer version 2](https://cloud.google.com/composer/docs/composer-2/composer-versioning-overview) on a VPC with a dedicated service account. Cloud Composer 2 is the new major version of Cloud Composer and supports:
- environment autoscaling
- workloads configuration: CPU, memory, and storage parameters for Airflow workers, schedulers, web server, and database.

Please consult the [documentation page](https://cloud.google.com/composer/docs/composer-2/composer-versioning-overview) for an exhaustive comparison between Composer Version 1 and Version 2.

The solution will use:
- Cloud Composer
- VPC with Private Service Access to deploy resources, if no Shared VPC configuration is provided.
- Google Cloud NAT to access internet resources, if no Shared VPC configuration is provided.

The solution accepts the following as inputs:
- Shared VPC
- Cloud KMS CMEK keys

This is the high-level diagram:

![Cloud Composer 2 architecture overview](./diagram.png "Cloud Composer 2 architecture overview")

# Requirements
This blueprint deploys all its resources into the project defined by the `project_id` variable. The project is assumed to exist already. However, if you provide appropriate values for the `project_create` variable, the project is created as part of the deployment.

If `project_create` is left as null, the identity performing the deployment needs the `owner` role on the project defined by the `project_id` variable. Otherwise, the identity performing the deployment needs `resourcemanager.projectCreator` on the resource hierarchy node specified by `project_create.parent` and `billing.user` on the billing account specified by `project_create.billing_account_id`.
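
As a minimal sketch of the second case, a `terraform.tfvars` that asks the blueprint to create the project could look like the following. The attribute names come from the `project_create` type in the variables table; the project ID, prefix, billing account ID, and folder number are placeholder values, not real identifiers.

```tfvars
# Hypothetical values: replace every ID with your own.
project_id = "my-composer-project"
prefix     = "demo"

project_create = {
  # Billing account the new project is linked to (placeholder ID).
  billing_account_id = "012345-67890A-BCDEF0"
  # Resource hierarchy node, in "folders/nnn" or "organizations/nnn" format.
  parent = "folders/1234567890"
}
```

Leaving `project_create` unset keeps its default of `null`, in which case the blueprint uses the existing project named by `project_id`.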

# Deployment
Run Terraform init:

```bash
$ terraform init
```

Configure the Terraform variables in your `terraform.tfvars` file. You need to specify at least the following variables:

```tfvars
project_id = "lcaggioni-sandbox"
prefix = "lc"
```

You can now run:

```bash
$ terraform apply
```

You can now connect to your instance.

# Customizations

## VPC
If a Shared VPC is not configured, a VPC will be created within the project. The following IP ranges will be used:
- Cloud SQL: `10.20.10.0/24`
- GKE master: `10.20.11.0/28`

Change the code as needed to match your configuration. These ranges must not overlap with any other range used in the network.

## Shared VPC
As is often the case in real-world configurations, this blueprint accepts an existing [Shared VPC](https://cloud.google.com/vpc/docs/shared-vpc) as input via the `network_config` variable.

Example:
```tfvars
network_config = {
  host_project      = "PROJECT"
  network_self_link = "projects/PROJECT/global/networks/VPC_NAME"
  subnet_self_link  = "projects/PROJECT/regions/REGION/subnetworks/SUBNET_NAME"
  composer_secondary_ranges = {
    pods     = "pods"
    services = "services"
  }
}
```

Make sure that:
- The GKE API (`container.googleapis.com`) is enabled in the VPC host project.
- The subnet has two secondary ranges configured:
  - pods: `/22`, for example `10.10.8.0/22`
  - services: `/24`, for example `10.10.12.0/24`
- Firewall rules are set, as described in the [documentation](https://cloud.google.com/composer/docs/composer-2/configure-private-ip#step_3_configure_firewall_rules)
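
The subnet requirement above can be sketched in Terraform roughly as follows, assuming the subnet is managed with the `google_compute_subnetwork` resource in the host project. The resource label, subnet name, region, primary range, and host project ID are illustrative assumptions; only the secondary range names and sizes come from the list above.

```hcl
# Sketch only: a Shared VPC subnet carrying the two secondary ranges
# that the Composer environment's GKE cluster expects.
resource "google_compute_subnetwork" "composer" {
  project       = "HOST_PROJECT_ID" # placeholder
  name          = "composer-subnet" # hypothetical name
  region        = "europe-west1"
  network       = "projects/HOST_PROJECT_ID/global/networks/VPC_NAME"
  ip_cidr_range = "10.10.0.0/24" # assumed primary range

  # Range names must match `network_config.composer_secondary_ranges`.
  secondary_ip_range {
    range_name    = "pods"
    ip_cidr_range = "10.10.8.0/22"
  }
  secondary_ip_range {
    range_name    = "services"
    ip_cidr_range = "10.10.12.0/24"
  }
}
```

If the ranges are named differently in your subnet, pass those names in `composer_secondary_ranges` instead of renaming the ranges.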

To run the example and deploy Cloud Composer on a Shared VPC, the identity running Terraform must have the following IAM roles on the Shared VPC host project:
- Compute Network Admin (`roles/compute.networkAdmin`)
- Compute Shared VPC Admin (`roles/compute.xpnAdmin`)

## Encryption
As is often the case in real-world configurations, this blueprint accepts existing [Cloud KMS keys](https://cloud.google.com/kms/docs/cmek) as input via the `service_encryption_keys` variable. Provide one key for each region in use, keyed by region name.

Example:
```tfvars
service_encryption_keys = {
  "europe-west1" = "projects/PROJECT/locations/REGION/keyRings/KR_NAME/cryptoKeys/KEY_NAME"
}
```
<!-- BEGIN TFDOC -->

## Variables

| name | description | type | required | default |
|---|---|:---:|:---:|:---:|
| [prefix](variables.tf#L81) | Unique prefix used for resource names. Not used for project if 'project_create' is null. | <code>string</code> | ✓ | |
| [project_id](variables.tf#L95) | Project id, references existing project if `project_create` is null. | <code>string</code> | ✓ | |
| [composer_config](variables.tf#L17) | Composer environment configuration. See [attribute reference](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/composer_environment#argument-reference---cloud-composer-2) for details on settings variables. | <code title="object&#40;&#123;&#10; environment_size &#61; string&#10; software_config &#61; any&#10; workloads_config &#61; object&#40;&#123;&#10; scheduler &#61; object&#40;&#10; &#123;&#10; cpu &#61; number&#10; memory_gb &#61; number&#10; storage_gb &#61; number&#10; count &#61; number&#10; &#125;&#10; &#41;&#10; web_server &#61; object&#40;&#10; &#123;&#10; cpu &#61; number&#10; memory_gb &#61; number&#10; storage_gb &#61; number&#10; &#125;&#10; &#41;&#10; worker &#61; object&#40;&#10; &#123;&#10; cpu &#61; number&#10; memory_gb &#61; number&#10; storage_gb &#61; number&#10; min_count &#61; number&#10; max_count &#61; number&#10; &#125;&#10; &#41;&#10; &#125;&#41;&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code title="&#123;&#10; environment_size &#61; &#34;ENVIRONMENT_SIZE_SMALL&#34;&#10; software_config &#61; &#123;&#10; image_version &#61; &#34;composer-2-airflow-2&#34;&#10; env_variables &#61; &#123;&#10; FOO &#61; &#34;bar&#34;&#10; &#125;&#10; &#125;&#10; workloads_config &#61; null&#10;&#125;">&#123;&#8230;&#125;</code> |
| [iam_groups_map](variables.tf#L61) | Map of Role => groups to be added on the project. Example: { \"roles/composer.admin\" = [\"group:[email protected]\"]}. | <code>map&#40;list&#40;string&#41;&#41;</code> | | <code>null</code> |
| [network_config](variables.tf#L67) | Shared VPC network configurations to use. If null networks will be created in projects with preconfigured values. | <code title="object&#40;&#123;&#10; host_project &#61; string&#10; network_self_link &#61; string&#10; subnet_self_link &#61; string&#10; composer_secondary_ranges &#61; object&#40;&#123;&#10; pods &#61; string&#10; services &#61; string&#10; &#125;&#41;&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code>null</code> |
| [project_create](variables.tf#L86) | Provide values if project creation is needed, uses existing project if null. Parent is in 'folders/nnn' or 'organizations/nnn' format. | <code title="object&#40;&#123;&#10; billing_account_id &#61; string&#10; parent &#61; string&#10;&#125;&#41;">object&#40;&#123;&#8230;&#125;&#41;</code> | | <code>null</code> |
| [region](variables.tf#L100) | Region where instances will be deployed. | <code>string</code> | | <code>&#34;europe-west1&#34;</code> |
| [service_encryption_keys](variables.tf#L106) | Cloud KMS keys to use to encrypt resources. Provide a key for each region in use. | <code>map&#40;string&#41;</code> | | <code>null</code> |

## Outputs

| name | description | sensitive |
|---|---|:---:|
| [composer_airflow_uri](outputs.tf#L22) | The URI of the Apache Airflow Web UI hosted within the Cloud Composer environment. | |
| [composer_dag_gcs](outputs.tf#L17) | The Cloud Storage prefix of the DAGs for the Cloud Composer environment. | |

<!-- END TFDOC -->
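
As an illustration of the `composer_config` variable documented above, the sketch below sets explicit workload sizes in `terraform.tfvars`. The attribute names follow the variable's type; the numeric values are assumptions chosen for a small environment and should be checked against the limits in the Cloud Composer documentation before use.

```tfvars
composer_config = {
  environment_size = "ENVIRONMENT_SIZE_SMALL"
  software_config = {
    image_version = "composer-2-airflow-2"
  }
  # Assumed sizing for a small environment; adjust to your workload.
  workloads_config = {
    scheduler = {
      cpu        = 0.5
      memory_gb  = 1.875
      storage_gb = 1
      count      = 1
    }
    web_server = {
      cpu        = 0.5
      memory_gb  = 1.875
      storage_gb = 1
    }
    worker = {
      cpu        = 0.5
      memory_gb  = 1.875
      storage_gb = 1
      min_count  = 1
      max_count  = 3
    }
  }
}
```

Because the variable is a fully typed object, all three top-level attributes need to be present; set `workloads_config = null` to fall back to the defaults implied by `environment_size`.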
30 changes: 30 additions & 0 deletions blueprints/data-solutions/composer-2/backend.tf.sample
@@ -0,0 +1,30 @@
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# The `impersonate_service_account` option requires the identity launching Terraform
# to have the `roles/iam.serviceAccountTokenCreator` role on the specified service account.

terraform {
  backend "gcs" {
    bucket                      = "BUCKET_NAME"
    prefix                      = "PREFIX"
    impersonate_service_account = "SERVICE_ACCOUNT@PROJECT_ID.iam.gserviceaccount.com"
  }
}
provider "google" {
  impersonate_service_account = "SERVICE_ACCOUNT@PROJECT_ID.iam.gserviceaccount.com"
}
provider "google-beta" {
  impersonate_service_account = "SERVICE_ACCOUNT@PROJECT_ID.iam.gserviceaccount.com"
}
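
The role requirement noted in the comment at the top of this sample could be granted with something like the sketch below. It assumes the grant is applied separately, by an identity that already administers the service account, and uses the provider's `google_service_account_iam_member` resource; the resource label and the member address are hypothetical.

```hcl
# Sketch only: lets a deployer identity mint tokens for (impersonate)
# the service account referenced in backend.tf.
resource "google_service_account_iam_member" "token_creator" {
  service_account_id = "projects/PROJECT_ID/serviceAccounts/SERVICE_ACCOUNT@PROJECT_ID.iam.gserviceaccount.com"
  role               = "roles/iam.serviceAccountTokenCreator"
  member             = "user:deployer@example.com" # hypothetical identity
}
```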
111 changes: 111 additions & 0 deletions blueprints/data-solutions/composer-2/composer.tf
@@ -0,0 +1,111 @@
/**
* Copyright 2022 Google LLC
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

module "comp-sa" {
  source       = "../../../modules/iam-service-account"
  project_id   = module.project.project_id
  prefix       = var.prefix
  name         = "cmp"
  display_name = "Composer service account"
}

resource "google_composer_environment" "env" {
  name    = "${var.prefix}-composer"
  project = module.project.project_id
  region  = var.region
  config {
    dynamic "software_config" {
      for_each = (
        try(var.composer_config.software_config, null) != null
        ? { 1 = 1 }
        : {}
      )
      content {
        airflow_config_overrides = try(var.composer_config.software_config.airflow_config_overrides, null)
        pypi_packages            = try(var.composer_config.software_config.pypi_packages, null)
        env_variables            = try(var.composer_config.software_config.env_variables, null)
        image_version            = try(var.composer_config.software_config.image_version, null)
        python_version           = try(var.composer_config.software_config.python_version, null)
        scheduler_count          = try(var.composer_config.software_config.scheduler_count, null)
      }
    }
    dynamic "workloads_config" {
      for_each = (try(var.composer_config.workloads_config, null) != null ? { 1 = 1 } : {})

      content {
        scheduler {
          cpu        = try(var.composer_config.workloads_config.scheduler.cpu, null)
          memory_gb  = try(var.composer_config.workloads_config.scheduler.memory_gb, null)
          storage_gb = try(var.composer_config.workloads_config.scheduler.storage_gb, null)
          count      = try(var.composer_config.workloads_config.scheduler.count, null)
        }
        web_server {
          cpu        = try(var.composer_config.workloads_config.web_server.cpu, null)
          memory_gb  = try(var.composer_config.workloads_config.web_server.memory_gb, null)
          storage_gb = try(var.composer_config.workloads_config.web_server.storage_gb, null)
        }
        worker {
          cpu        = try(var.composer_config.workloads_config.worker.cpu, null)
          memory_gb  = try(var.composer_config.workloads_config.worker.memory_gb, null)
          storage_gb = try(var.composer_config.workloads_config.worker.storage_gb, null)
          min_count  = try(var.composer_config.workloads_config.worker.min_count, null)
          max_count  = try(var.composer_config.workloads_config.worker.max_count, null)
        }
      }
    }

    environment_size = var.composer_config.environment_size

    node_config {
      network              = local.orch_vpc
      subnetwork           = local.orch_subnet
      service_account      = module.comp-sa.email
      enable_ip_masq_agent = "true"
      tags                 = ["composer-worker"]
      ip_allocation_policy {
        cluster_secondary_range_name = try(
          var.network_config.composer_secondary_ranges.pods, "pods"
        )
        services_secondary_range_name = try(
          var.network_config.composer_secondary_ranges.services, "services"
        )
      }
    }
    private_environment_config {
      enable_private_endpoint = "true"
      cloud_sql_ipv4_cidr_block = try(
        var.network_config.composer_ip_ranges.cloudsql, "10.20.10.0/24"
      )
      master_ipv4_cidr_block = try(
        var.network_config.composer_ip_ranges.gke_master, "10.20.11.0/28"
      )
    }
    dynamic "encryption_config" {
      for_each = (
        try(var.service_encryption_keys[var.region], null) != null
        ? { 1 = 1 }
        : {}
      )
      content {
        kms_key_name = try(var.service_encryption_keys[var.region], null)
      }
    }
  }
  depends_on = [
    google_project_iam_member.shared_vpc,
    module.project
  ]
}
Binary file added blueprints/data-solutions/composer-2/diagram.png