google_container_cluster with google_container_node_pool update causes 400 badRequest #3035

Open
jkamenik opened this issue Feb 12, 2019 · 3 comments

Comments

@jkamenik

jkamenik commented Feb 12, 2019

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
  • If an issue is assigned to the "modular-magician" user, it is either in the process of being autogenerated, or is planned to be autogenerated soon. If an issue is assigned to a user, that user is claiming responsibility for the issue. If an issue is assigned to "hashibot", a community member has claimed the issue already.

Terraform Version

terraform -v
Terraform v0.11.8
+ provider.google-beta v1.20.0

Your version of Terraform is out of date! The latest version
is 0.11.11. You can update by downloading from www.terraform.io/downloads.html

Affected Resource(s)

  • google_container_cluster
  • google_container_node_pool

Terraform Configuration Files

# The main cluster
resource "google_container_cluster" "main" {
  provider = "google-beta"
  name = "${var.name}"
  description = "K8s cluster ${var.name}"
  zone = "${var.zone}"
  initial_node_count = 1

  min_master_version = "${local.kubernetes_version}"
  node_version = "${local.kubernetes_version}"

  remove_default_node_pool = true

  master_authorized_networks_config = "${var.master_authorized_networks_config}"

  enable_legacy_abac = false

  addons_config {
    horizontal_pod_autoscaling {
      disabled = false
    }

    kubernetes_dashboard {
      disabled = true
    }

    network_policy_config {
      disabled = true
    }
  }

  # Note: on purpose we don't enable master_auth, as it is less secure than
  # using IAM
  master_auth {
    username = ""
    password = ""
    client_certificate_config {
      issue_client_certificate = false
    }
  }

  maintenance_policy {
    daily_maintenance_window {
      start_time = "${var.maintenance_window}"
    }
  }

  resource_labels {
    chargeline = "${lower(var.chargeline_label)}"
    owner = "${lower(var.owner_label)}"
  }

  timeouts {
    create = "${var.create_timeout}"
    update = "${var.update_timeout}"
    delete = "${var.delete_timeout}"
  }
}

resource "google_container_node_pool" "main_pool" {
  provider = "google-beta"
  name = "${join("-",list(var.name,"main"))}"
  cluster = "${google_container_cluster.main.name}"
  zone = "${var.zone}"
  initial_node_count = 1

  # if enable_auto_upgrade is true then don't supply one.
  # Otherwise use main version
  version = "${var.enable_auto_upgrade ? "" : local.kubernetes_version}"

  autoscaling {
    min_node_count = 1
    max_node_count = "${var.main_pool_max_node_count}"
  }

  management {
    auto_repair = "${var.enable_auto_repair}"
    auto_upgrade = "${var.enable_auto_upgrade}"
  }

  node_config {
    disk_size_gb = "${var.node_disk_size}"
    image_type = "${var.main_pool_image_type}"
    machine_type = "${var.main_pool_machine_type}"
    preemptible = "${var.main_pool_preemptible}"
  }

  depends_on = ["google_container_cluster.main"]
}

Debug Output

data.google_container_engine_versions.region: Refreshing state...
google_container_cluster.main: Refreshing state... (ID: johnk)
google_container_node_pool.main_pool: Refreshing state... (ID: us-central1-a/johnk/johnk-main)
module.cluster.google_container_node_pool.main_pool: Destroying... (ID: us-central1-a/johnk/johnk-main)
module.cluster.google_container_cluster.main: Modifying... (ID: johnk)
  min_master_version: "1.11.5-gke.5" => "1.11.6-gke.6"
  node_version:       "1.11.5-gke.5" => "1.11.6-gke.6"
...
module.cluster.google_container_cluster.main: Still modifying... (ID: johnk, 14m19s elapsed)
module.cluster.google_container_cluster.main: Still modifying... (ID: johnk, 14m29s elapsed)

Error: Error applying plan:

1 error(s) occurred:

* module.cluster.google_container_cluster.main: 1 error(s) occurred:

* google_container_cluster.main: googleapi: Error 400: Node_pool_id must be specified., badRequest

Full logs: https://gist.github.com/jkamenik/4fdeff4cb4341358f172910a1cfff3fd

Panic Output

N/A

Expected Behavior

Update the node-pool before updating the main cluster.

Actual Behavior

Both the node-pool and the cluster are updated in the same apply. As soon as the node-pool is deleted, the cluster update starts and then fails with a 400 error.

Steps to Reproduce

  1. terraform apply
  2. Update the K8s version used
  3. Make the node-pool nodes preemptible
  4. terraform apply

Important Factoids

  • If the cluster has a second node-pool then it doesn't fail
  • This might be specific to K8s upgrades at the same time as node-pool destruction.
  • Applying each update individually works (order doesn't matter)
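
The first factoid suggests one possible mitigation: keep a second, small node pool whose configuration rarely changes, so that the cluster is never left without nodes while `main_pool` is being replaced. The following is a minimal sketch in the same 0.11-style syntax as the configuration above. The resource name `keepalive_pool`, the `-keepalive` name suffix, and the machine type are hypothetical, and this has not been verified against the failure described here.

```hcl
# Hypothetical second pool: the resource name, pool name, and machine type
# are illustrative assumptions, not part of the original configuration.
resource "google_container_node_pool" "keepalive_pool" {
  provider = "google-beta"
  name = "${var.name}-keepalive"
  cluster = "${google_container_cluster.main.name}"
  zone = "${var.zone}"
  initial_node_count = 1

  node_config {
    # Assumed machine type; pick whatever is cheapest for your project.
    machine_type = "n1-standard-1"
  }
}
```

The trade-off is the cost of one extra node that exists only to keep the cluster from becoming node-less during a replacement of the main pool.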

References

b/299442591

@ghost ghost added the bug label Feb 12, 2019
@jkamenik
Author

I was able to work around this issue via

terraform apply -target "google_container_cluster.main" && \
terraform apply

modular-magician added a commit to modular-magician/terraform-provider-google that referenced this issue Jan 29, 2020
modular-magician added a commit that referenced this issue Jan 29, 2020
danawillow pushed a commit that referenced this issue Jan 29, 2020
@rileykarson
Collaborator

This is due to the underlying API I think. Some operations are impossible on node-less clusters, which happens if your final node pool is deleted. Unfortunately, there isn't anything that can be done in the provider to mitigate this behaviour. Apply ordering is chosen by Terraform Core.
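
Since the ordering comes from Terraform Core, one way to influence it from configuration is the `create_before_destroy` lifecycle setting, which asks Terraform to create a replacement resource before destroying the old one. The following is a minimal, untested sketch written against the reporter's 0.11-style configuration. It uses `name_prefix` instead of `name` so that the old and new pools can coexist, and the prefix value is an assumption. Whether this avoids the 400 error has not been confirmed in this thread.

```hcl
resource "google_container_node_pool" "main_pool" {
  provider = "google-beta"
  # name_prefix (instead of name) lets each replacement pool get a unique
  # generated name. The prefix value here is an assumption.
  name_prefix = "${var.name}-main-"
  cluster = "${google_container_cluster.main.name}"
  zone = "${var.zone}"
  initial_node_count = 1

  node_config {
    preemptible = "${var.main_pool_preemptible}"
  }

  lifecycle {
    # Ask Terraform to create the replacement pool before destroying the
    # old one, so the cluster should not pass through a node-less state.
    create_before_destroy = true
  }
}
```

With this setting the cluster briefly runs both pools during a replacement, so quota and cost for the extra nodes need to be available.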

@Edwinhr716

I wasn't able to reproduce this using Terraform 1.6.5. Here's what I attempted:

Code tested:


resource "google_container_cluster" "gke_cluster_2" {
    project = "project-1"
    provider = "google-beta"
    name = "test-cluster-5"
    location = "us-central1"

    min_master_version = "1.11.5-gke.5"
    initial_node_count = 1
    remove_default_node_pool = true

    //default is true, need to disable if the version is less than 1.13.0
    enable_shielded_nodes = false
}


resource "google_container_node_pool" "main_pool_2" {
    
    project = "project-1"
    provider = "google-beta"
    cluster = "test-cluster-5"
    location = "us-central1"
    initial_node_count = 1
    name = "test-node-pool-2"

    version = "1.11.5-gke.5"

    node_config {
        preemptible = false
    }

    depends_on = [ google_container_cluster.gke_cluster_2 ]

}

Steps followed:

  1. terraform apply to create a new cluster and nodepool
  2. Changed min_master_version to 1.11.6-gke.6 in the cluster, and version to 1.11.6-gke.6 in the node pool
  3. Changed preemptible to true
  4. terraform apply
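
For reference, the configuration after steps 2 and 3 would look roughly like the sketch below. It assumes the cluster attribute changed in step 2 is `min_master_version`, which is the only version attribute on the cluster in the tested code. Everything else is copied from the code above, with the changed lines marked in comments.

```hcl
resource "google_container_cluster" "gke_cluster_2" {
    project = "project-1"
    provider = google-beta
    name = "test-cluster-5"
    location = "us-central1"

    # Step 2: was "1.11.5-gke.5"
    min_master_version = "1.11.6-gke.6"
    initial_node_count = 1
    remove_default_node_pool = true

    enable_shielded_nodes = false
}

resource "google_container_node_pool" "main_pool_2" {
    project = "project-1"
    provider = google-beta
    cluster = "test-cluster-5"
    location = "us-central1"
    initial_node_count = 1
    name = "test-node-pool-2"

    # Step 2: was "1.11.5-gke.5"
    version = "1.11.6-gke.6"

    node_config {
        # Step 3: was false
        preemptible = true
    }

    depends_on = [ google_container_cluster.gke_cluster_2 ]
}
```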

Output log

google_container_node_pool.main_pool_2: Destroying... [id=projects/project-1/locations/us-central1/clusters/test-cluster-5/nodePools/test-node-pool-2]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 10s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 20s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 30s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 40s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 50s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 1m0s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 1m10s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 1m20s elapsed]
google_container_node_pool.main_pool_2: Still destroying... [id=projects/project-1/locatio...t-cluster-5/nodePools/test-node-pool-2, 1m30s elapsed]
google_container_node_pool.main_pool_2: Destruction complete after 1m32s
google_container_cluster.gke_cluster_2: Modifying... [id=projects/project-1/locations/us-central1/clusters/test-cluster-5]
...
google_container_cluster.gke_cluster_2: Still modifying... [id=projects/project-1/locations/us-central1/clusters/test-cluster-5, 15m10s elapsed]
google_container_cluster.gke_cluster_2: Still modifying... [id=projects/project-1/locations/us-central1/clusters/test-cluster-5, 15m20s elapsed]
google_container_cluster.gke_cluster_2: Modifications complete after 15m21s [id=projects/project-1/locations/us-central1/clusters/test-cluster-5]
google_container_node_pool.main_pool_2: Creating...
google_container_node_pool.main_pool_2: Still creating... [10s elapsed]
google_container_node_pool.main_pool_2: Still creating... [20s elapsed]
google_container_node_pool.main_pool_2: Still creating... [30s elapsed]
google_container_node_pool.main_pool_2: Still creating... [40s elapsed]
google_container_node_pool.main_pool_2: Still creating... [50s elapsed]
google_container_node_pool.main_pool_2: Still creating... [1m0s elapsed]
google_container_node_pool.main_pool_2: Still creating... [1m10s elapsed]
google_container_node_pool.main_pool_2: Still creating... [1m20s elapsed]
google_container_node_pool.main_pool_2: Still creating... [1m30s elapsed]
google_container_node_pool.main_pool_2: Still creating... [1m40s elapsed]
google_container_node_pool.main_pool_2: Creation complete after 1m43s [id=projects/project-1/locations/us-central1/clusters/test-cluster-5/nodePools/test-node-pool-2]

Apply complete! Resources: 1 added, 1 changed, 1 destroyed.
