GKE Standard cluster module

This module offers a way to create and manage Google Kubernetes Engine (GKE) Standard clusters. With sensible defaults based on best practices and the authors' experience as Google Cloud practitioners, the module accommodates many common use cases out of the box, without relying on verbose configuration.

Important

This module should be used together with the gke-nodepool module, because the default node pool is removed upon cluster creation unless configured otherwise.
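
As a minimal sketch of that pairing, the snippet below creates a cluster and attaches one node pool to it. The cluster block and the name and location outputs come from this README; the gke-nodepool variable names (cluster_name, location, name) are assumptions that should be checked against that module's own documentation.

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {} # use default names "pods" and "services"
  }
}

# Assumed gke-nodepool interface: verify variable names in that module's README.
module "cluster-1-nodepool-1" {
  source       = "./fabric/modules/gke-nodepool"
  project_id   = "myproject"
  cluster_name = module.cluster-1.name
  location     = module.cluster-1.location
  name         = "nodepool-1"
}
```

Referencing the cluster outputs rather than repeating literal values lets Terraform order the node pool after the cluster.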

Cluster access configurations

The access_config variable can be used to configure access to the control plane and public access to nodes. The following examples illustrate different possible configurations.

Private cluster with DNS endpoint enabled

The default module configuration creates a cluster with private nodes, no public endpoint, and access via the DNS endpoint enabled. The default variable configuration is shown in comments.

Master authorized ranges can be set via the access_config.ip_access.authorized_ranges attribute.

module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  # access_config can be omitted if master authorized ranges are not needed
  access_config = {
    # dns_access = true
    ip_access = {
      authorized_ranges = {
        internal-vms = "10.0.0.0/8"
      }
      # disable_public_endpoint = true
      # private_endpoint_config = {
      #   global_access = true
      # }
    }
    # private_nodes = true
  }
  vpc_config = {
    network    = var.vpc.self_link
    subnetwork = var.subnet.self_link
    secondary_range_names = {
      pods     = "pods"
      services = "services"
    }
  }
  max_pods_per_node = 32
  labels = {
    environment = "dev"
  }
}
# tftest modules=1 resources=1 inventory=access-private.yaml

Public cluster

To configure a public cluster, set access_config.ip_access.disable_public_endpoint to false. Nodes can be left private or made public if needed, as in the example below. The DNS endpoint is turned off here as it's probably redundant for a public cluster.

module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  access_config = {
    dns_access = false
    ip_access = {
      authorized_ranges = {
        "corporate proxy" = "8.8.8.8/32"
      }
      disable_public_endpoint = false
    }
    private_nodes = false
  }
  vpc_config = {
    network    = var.vpc.self_link
    subnetwork = var.subnet.self_link
    secondary_range_names = {
      pods     = "pods"
      services = "services"
    }
  }
  max_pods_per_node = 32
  labels = {
    environment = "dev"
  }
}
# tftest modules=1 resources=1 inventory=access-public.yaml
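
The variant where nodes stay private can be sketched by simply leaving private_nodes at its default. This uses only attributes shown in the examples above; the resulting plan has not been verified here.

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  access_config = {
    dns_access = false
    ip_access = {
      authorized_ranges = {
        "corporate proxy" = "8.8.8.8/32"
      }
      # the control plane gets a public endpoint
      disable_public_endpoint = false
    }
    # nodes keep the module default
    # private_nodes = true
  }
  vpc_config = {
    network    = var.vpc.self_link
    subnetwork = var.subnet.self_link
    secondary_range_names = {
      pods     = "pods"
      services = "services"
    }
  }
}
```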

Regional cluster

Regional clusters are created by setting location to a GCP region and then configuring node_locations, as shown in the example below.

module "cluster-1" {
  source         = "./fabric/modules/gke-cluster-standard"
  project_id     = "myproject"
  name           = "cluster-1"
  location       = "europe-west1"
  node_locations = ["europe-west1-b"]
  access_config = {
    ip_access = {
      authorized_ranges = {
        internal-vms = "10.0.0.0/8"
      }
    }
  }
  vpc_config = {
    network    = var.vpc.self_link
    subnetwork = var.subnet.self_link
    secondary_range_names = {
      pods     = "pods"
      services = "services"
    }
  }
  max_pods_per_node = 32
  labels = {
    environment = "dev"
  }
}
# tftest modules=1 resources=1 inventory=regional.yaml

Enable Dataplane V2

This example shows how to create a zonal GKE cluster with Dataplane V2 enabled, alongside the fqdn_network_policy, secret_manager_config and workload_identity features.

module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-dataplane-v2"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {} # use default names "pods" and "services"
  }
  enable_features = {
    dataplane_v2          = true
    fqdn_network_policy   = true
    secret_manager_config = true
    workload_identity     = true
  }
  labels = {
    environment = "dev"
  }
}
# tftest modules=1 resources=1

Managing GKE logs

This example shows you how to control which logs are sent from your GKE cluster to Cloud Logging.

When you create a new GKE cluster, Cloud Operations for GKE integration with Cloud Logging is enabled by default and System logs are collected. You can enable collection of several other types of logs. The following example enables collection of all optional logs.

module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {}
  }
  logging_config = {
    enable_workloads_logs          = true
    enable_api_server_logs         = true
    enable_scheduler_logs          = true
    enable_controller_manager_logs = true
  }
}
# tftest modules=1 resources=1 inventory=logging-config-enable-all.yaml

Monitoring configuration

This example shows how to configure collection of Kubernetes control plane metrics. These metrics are optional and are not collected by default.

module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {} # use default names "pods" and "services"
  }
  monitoring_config = {
    enable_api_server_metrics         = true
    enable_controller_manager_metrics = true
    enable_scheduler_metrics          = true
  }
}
# tftest modules=1 resources=1 inventory=monitoring-config-control-plane.yaml

The next example shows how to configure collection of kube state metrics. These metrics are optional and are not collected by default.

module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {} # use default names "pods" and "services"
  }
  monitoring_config = {
    enable_daemonset_metrics   = true
    enable_deployment_metrics  = true
    enable_hpa_metrics         = true
    enable_pod_metrics         = true
    enable_statefulset_metrics = true
    enable_storage_metrics     = true
    # Kube state metrics collection requires Google Cloud Managed Service for Prometheus,
    # which is enabled by default.
    # enable_managed_prometheus = true
  }
}
# tftest modules=1 resources=1 inventory=monitoring-config-kube-state.yaml

The control plane metrics and kube state metrics collection can be configured in a single monitoring_config block.
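
A sketch of such a combined block, merging the attributes from the two examples above into one monitoring_config:

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {} # use default names "pods" and "services"
  }
  monitoring_config = {
    # control plane metrics
    enable_api_server_metrics         = true
    enable_controller_manager_metrics = true
    enable_scheduler_metrics          = true
    # kube state metrics (rely on Managed Service for Prometheus, enabled by default)
    enable_daemonset_metrics   = true
    enable_deployment_metrics  = true
    enable_hpa_metrics         = true
    enable_pod_metrics         = true
    enable_statefulset_metrics = true
    enable_storage_metrics     = true
  }
}
```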

Disable GKE logs or metrics collection

Warning

If you've disabled Cloud Logging or Cloud Monitoring, GKE customer support is offered on a best-effort basis and might require additional effort from your engineering team.

This example shows how to fully disable log collection on a zonal GKE Standard cluster. This is not recommended.

module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {}
  }
  logging_config = {
    enable_system_logs = false
  }
}
# tftest modules=1 resources=1 inventory=logging-config-disable-all.yaml

The next example shows how to fully disable metrics collection on a zonal GKE Standard cluster. This is not recommended.

module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {}
  }
  monitoring_config = {
    enable_system_metrics     = false
    enable_managed_prometheus = false
  }
}
# tftest modules=1 resources=1 inventory=monitoring-config-disable-all.yaml

Cloud DNS

This example shows how to use Cloud DNS as a Kubernetes DNS provider for GKE Standard clusters.

module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = var.project_id
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {}
  }
  enable_features = {
    dns = {
      provider = "CLOUD_DNS"
      scope    = "CLUSTER_SCOPE"
      domain   = "gke.local"
    }
  }
}
# tftest modules=1 resources=1 inventory=dns.yaml

Backup for GKE

Note

Although Backup for GKE can be enabled as an add-on when configuring your GKE clusters, it is a separate service from GKE.

Backup for GKE is a service for backing up and restoring workloads in GKE clusters. It has two components:

  • A Google Cloud API that serves as the control plane for the service.
  • A GKE add-on (the Backup for GKE agent) that must be enabled in each cluster for which you wish to perform backup and restore operations.

This example shows how to enable Backup for GKE on a new zonal GKE Standard cluster and plan a set of backups.

module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = var.project_id
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {}
  }
  backup_configs = {
    enable_backup_agent = true
    backup_plans = {
      "backup-1" = {
        region   = "europe-west2"
        schedule = "0 9 * * 1"
        applications = {
          namespace-1 = ["app-1", "app-2"]
        }
      }
    }
  }
}
# tftest modules=1 resources=2 inventory=backup.yaml

Automatic creation of new secondary ranges

You can use var.vpc_config.secondary_range_blocks to let GKE create new secondary ranges for the cluster. The example below reserves an available /14 block for pods and a /20 for services.

module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = var.project_id
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network    = var.vpc.self_link
    subnetwork = var.subnet.self_link
    secondary_range_blocks = {
      pods     = ""
      services = "/20" # can be an empty string as well
    }
  }
}
# tftest modules=1 resources=1

Node auto-provisioning with GPUs and TPUs

You can use the var.cluster_autoscaling block to configure node auto-provisioning for the GKE cluster. The example below configures limits for CPU, memory, GPUs and TPUs.

module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = var.project_id
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network    = var.vpc.self_link
    subnetwork = var.subnet.self_link
    secondary_range_blocks = {
      pods     = ""
      services = "/20"
    }
  }
  cluster_autoscaling = {
    cpu_limits = {
      max = 48
    }
    mem_limits = {
      max = 182
    }
    # Can be GPUs or TPUs
    accelerator_resources = [
      {
        resource_type = "nvidia-l4"
        max           = 2
      },
      {
        resource_type = "tpu-v5-lite-podslice"
        max           = 2
      }
    ]
  }
}
# tftest modules=1 resources=1

Variables

| name | description | type | required | default |
|---|---|---|---|---|
| location | Cluster zone or region. | string | ✓ | |
| name | Cluster name. | string | ✓ | |
| project_id | Cluster project id. | string | ✓ | |
| vpc_config | VPC-level configuration. | object({…}) | ✓ | |
| access_config | Control plane endpoint and nodes access configurations. | object({…}) | | {} |
| backup_configs | Configuration for Backup for GKE. | object({…}) | | {} |
| cluster_autoscaling | Enable and configure limits for Node Auto-Provisioning with Cluster Autoscaler. | object({…}) | | null |
| default_nodepool | Enable default nodepool. | object({…}) | | {} |
| deletion_protection | Whether or not to allow Terraform to destroy the cluster. Unless this field is set to false in Terraform state, a terraform destroy or terraform apply that would delete the cluster will fail. | bool | | true |
| description | Cluster description. | string | | null |
| enable_addons | Addons enabled in the cluster (true means enabled). | object({…}) | | {…} |
| enable_features | Enable cluster-level features. Certain features allow configuration. | object({…}) | | {…} |
| issue_client_certificate | Enable issuing client certificate. | bool | | false |
| labels | Cluster resource labels. | map(string) | | {} |
| logging_config | Logging configuration. | object({…}) | | {} |
| maintenance_config | Maintenance window configuration. | object({…}) | | {…} |
| max_pods_per_node | Maximum number of pods per node in this cluster. | number | | 110 |
| min_master_version | Minimum version of the master, defaults to the version of the most recent official release. | string | | null |
| monitoring_config | Monitoring configuration. Google Cloud Managed Service for Prometheus is enabled by default. | object({…}) | | {} |
| node_config | Node-level configuration. | object({…}) | | {} |
| node_locations | Zones in which the cluster's nodes are located. | list(string) | | [] |
| release_channel | Release channel for GKE upgrades. | string | | null |
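
Several of the scalar variables above are not covered by the earlier examples. The sketch below sets a few of them for a disposable development cluster; the "REGULAR" value is assumed to be one of the GKE release channel names accepted by the module, and the description text is purely illustrative.

```hcl
module "cluster-1" {
  source     = "./fabric/modules/gke-cluster-standard"
  project_id = "myproject"
  name       = "cluster-1"
  location   = "europe-west1-b"
  vpc_config = {
    network               = var.vpc.self_link
    subnetwork            = var.subnet.self_link
    secondary_range_names = {} # use default names "pods" and "services"
  }
  description = "Development cluster" # illustrative value
  # allow terraform destroy to remove the cluster (default is true)
  deletion_protection = false
  # assumed channel name; defaults to null when omitted
  release_channel = "REGULAR"
  labels = {
    environment = "dev"
  }
}
```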

Outputs

| name | description | sensitive |
|---|---|---|
| ca_certificate | Public certificate of the cluster (base64-encoded). | |
| cluster | Cluster resource. | |
| dns_endpoint | Control plane DNS endpoint. | |
| endpoint | Cluster endpoint. | |
| id | Fully qualified cluster id. | |
| location | Cluster location. | |
| master_version | Master version. | |
| name | Cluster name. | |
| notifications | GKE PubSub notifications topic. | |
| self_link | Cluster self link. | |
| workload_identity_pool | Workload identity pool. | |
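
One possible way to consume these outputs is to configure the Terraform kubernetes provider from them. This is a sketch rather than part of the module: it assumes a module "cluster-1" block like the ones above, the hashicorp/google and hashicorp/kubernetes providers, and network reachability to the endpoint, which with the module's default private configuration is not publicly exposed.

```hcl
# Assumption: credentials come from the google provider's current identity.
data "google_client_config" "default" {}

provider "kubernetes" {
  host  = "https://${module.cluster-1.endpoint}"
  token = data.google_client_config.default.access_token
  # the ca_certificate output is base64-encoded, so decode it first
  cluster_ca_certificate = base64decode(module.cluster-1.ca_certificate)
}
```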