
GKE still cannot be reliably spun up using Terraform 0.12.3 (without defining node pool) #4391

Closed
Leectan opened this issue Sep 3, 2019 · 5 comments

Leectan commented Sep 3, 2019

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
  • If an issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to "hashibot", a community member has claimed the issue already.

Terraform Version

Terraform v0.12.3

  • provider.google v2.13.0
  • provider.google-beta v2.13.0

Affected Resource(s)

  • google_container_cluster
  • google_container_node_pool

Terraform Configuration Files

module "gke-sbp-cluster" {
// providers = {
// google = "google-beta"
// }
source = "./modules/gke-base-infra"
cluster_name = var.cluster_name
project_id = var.project_id
vpc_network = google_compute_network.vpc_network.name
vpc_subnetwork = google_compute_subnetwork.vpc_subnetwork.name
location = var.location
master_ipv4_cidr_block = var.master_ipv4_cidr_block
use_ip_aliases = true
// logging_service = var.logging_service
// monitoring_service = var.monitoring_service
// daily_maintenance_window_start_time = var.daily_maintenance_window_start_time
enable_private_endpoint = false
// cluster_client_certificate = true
// enable_kubernetes_dashboard = true
enable_private_nodes = true
enable_network_policy = true
// http_load_balancing = false
master_authorized_networks_cidr_blocks = var.master_authorized_networks_cidr_blocks
cluster_secondary_range_name = var.cluster_secondary_range_name
// cluster_autoscaling = var.cluster_autoscaling
// enable_legacy_abac = false
// pod_security_policy_config = true
// intranode_visibility = true

// #node pool configs below
// node_pool_name = "gke-node_pool-1"
// initial_node_count = 4
//// max_node_count = 4
//// min_node_count = 2
// node_auto_repair = true
// node_auto_upgrade = true
// node_disk_size_gb = 100
// node_image_type = "COS"
// node_machine_type = "n1-standard-1"
// oauth_scopes = [
// "https://www.googleapis.com/auth/logging.write",
// "https://www.googleapis.com/auth/monitoring",
// "https://www.googleapis.com/auth/devstorage.read_only"
//
// ]
// node_preemtible = false
// gke_service_account = module.gke_service_account.email
//// node_cluster_auto_scaling = true
//// cpu_maximum = 4
//// cpu_minimum = 2
//// memory_maximum = 8
//// memory_minimum = 4
}

Debug Output

Panic Output

Expected Behavior

I expected the apply to run to completion and create the cluster.

Actual Behavior

Error: Error waiting for creating GKE cluster: All cluster resources were brought up, but the cluster API is reporting that: 4 nodes out of 4 are unhealthy.

Steps to Reproduce

  1. terraform init
  2. terraform apply

Important Factoids

References

@ghost ghost added the bug label Sep 3, 2019
@Leectan Leectan changed the title from "GKE still cannot be reliably spin up using Terraform 0.12.3" to "GKE still cannot be reliably spin up using Terraform 0.12.3 (without defining node pool)" Sep 3, 2019
@rileykarson rileykarson self-assigned this Sep 3, 2019
rileykarson (Collaborator) commented:

It's hard to say why this is failing, especially when using a module. Are you able to replicate this with a google_container_cluster resource directly, or share debug logs?

Leectan commented Sep 3, 2019

I created brand-new resources with minimal requirements; it seems to fail when remove_default_node_pool = true is declared together with a node_pool resource at the same time.

provider "google" {
  version = "2.14.0"
  region = "us-east1"
  zone = "us-east1-b"
}

provider "google-beta" {
  version = "2.14.0"
  region = "us-east1"
  zone = "us-east1-b"
}

resource "google_compute_network" "vpc_network" {
  name = var.vpc_network_name
  auto_create_subnetworks = false
  routing_mode = "REGIONAL"
  delete_default_routes_on_create = true
  project = var.project_id
}

resource "google_compute_subnetwork" "vpc_subnetwork" {
  ip_cidr_range = var.vpc_subnetwork_cidr_range
  project = var.project_id
  name = var.vpc_subnetwork_name
  network = google_compute_network.vpc_network.self_link
  private_ip_google_access = true
  secondary_ip_range {
    ip_cidr_range = var.network_secondary_range
    range_name = var.network_secondary_range_name
  }
  enable_flow_logs = true
}

resource "google_container_cluster" "my-gke-cluster" {
  provider = "google-beta"
  name = "my-gke-cluster"
  location = "us-east1"
  project = var.project_id
  initial_node_count = 2
  cluster_autoscaling {
    enabled = true
    resource_limits {
      resource_type = "cpu"
      maximum = 8
      minimum = 2
    }
    resource_limits {
      resource_type = "memory"
      maximum = 16
      minimum = 4
    }
  }

  remove_default_node_pool = true
}

resource "google_container_node_pool" "node_pool_1" {
  provider = "google-beta"
  name = "gke-node-pools"
  project = var.project_id
  cluster = google_container_cluster.my-gke-cluster.name
  management {
    auto_repair = true
    auto_upgrade = true
  }

  autoscaling {
    max_node_count = 10
    min_node_count = 2
  }
  depends_on = [google_container_cluster.my-gke-cluster]
}

The cluster spins up fine without remove_default_node_pool, but then it doesn't pick up the node pool I specified in the resource block; it just creates the default node pool.

But if remove_default_node_pool = true is specified together with the node_pool resource, the cluster spins up and the default node pool is deleted, but the node pool from the resource block never comes up. I've been struggling with this issue for the last few weeks now...
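For comparison, the pattern the provider documentation recommends for clusters that only use separately managed node pools is to create the smallest possible default node pool, remove it immediately, and give the node pool resource an explicit location and an initial node count. The sketch below is illustrative only; the names, location, and sizes are assumptions rather than values taken from this report:

resource "google_container_cluster" "example" {
  name     = "example-cluster"
  location = "us-east1"
  project  = var.project_id

  # Create the smallest possible default node pool, then delete it so only
  # the separately managed node pool below remains.
  remove_default_node_pool = true
  initial_node_count       = 1
}

resource "google_container_node_pool" "example" {
  name     = "example-node-pool"
  project  = var.project_id
  location = "us-east1" # must match the cluster's location
  cluster  = google_container_cluster.example.name

  # With autoscaling enabled, set initial_node_count rather than node_count
  # to avoid permanent diffs on the node count.
  initial_node_count = 2

  autoscaling {
    min_node_count = 2
    max_node_count = 10
  }

  node_config {
    machine_type = "n1-standard-1"
  }
}

Compared with the configuration above, the notable differences are the explicit location on the node pool (without it the pool may be looked up in the provider's default zone rather than the regional cluster's location) and initial_node_count instead of node_count alongside autoscaling.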

@ghost ghost removed the waiting-response label Sep 3, 2019
rileykarson (Collaborator) commented:

This may be related to #4024; someone in that issue saw errors under similar circumstances.

Can you share debug logs? I've never seen an issue like this, and can't reproduce it.
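For reference, debug logs for a failing run can be captured with Terraform's standard logging environment variables (the log file path below is arbitrary):

TF_LOG=DEBUG TF_LOG_PATH=./terraform-debug.log terraform apply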

rileykarson (Collaborator) commented:

Closing as stale.

ghost commented Mar 29, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Mar 29, 2020
@github-actions github-actions bot added the service/container and forward/review labels Jan 15, 2025