
Terraform replacing AKS nodepool cluster when changing VM count #3835

Closed
local-master opened this issue Jul 12, 2019 · 8 comments · Fixed by #4898
Comments

@local-master

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform (and AzureRM Provider) Version

Terraform v0.12.4
provider.azurerm v1.31.0

Affected Resource(s)

azurerm_kubernetes_cluster

Terraform Configuration Files

resource "azurerm_resource_group" "test" {
  location = "${var.location}"
  name = "${var.resource_group_name}"
}
resource "azurerm_kubernetes_cluster" "test" {
  name                = "${var.resource_group_name}"
  location            = "${var.location}"
  resource_group_name = "${azurerm_resource_group.test.name}"
  dns_prefix          = "test"
  kubernetes_version  = "${var.kubernetes_version}"

  linux_profile {
    admin_username = "${var.admin_user}"

    ssh_key {
      key_data = "${var.ssh_key}"
    }
  }

  agent_pool_profile {
    name              = "${var.agent_pool1_name}"
    count             = "${var.agent_pool1_count}"
    vm_size           = "${var.agent_pool1_vm_size}"
    os_type           = "Linux"
    os_disk_size_gb   = "${var.agent_pool1_os_disk_size}"
    vnet_subnet_id    = "${var.subnet_id}"
    max_pods          = "${var.max_pods_per_node}"
    type              = "${var.agent_pool_scaling}"
  }

  agent_pool_profile {
    name              = "${var.agent_pool2_name}"
    count             = "${var.agent_pool2_count}"
    vm_size           = "${var.agent_pool2_vm_size}"
    os_type           = "Linux"
    os_disk_size_gb   = "${var.agent_pool2_os_disk_size}"
    vnet_subnet_id    = "${var.subnet_id}"
    max_pods          = "${var.max_pods_per_node}"
    type              = "${var.agent_pool_scaling}"
  }

  service_principal {
    client_id         = "is somewhere"
    client_secret     = "is somewhere"
  }

  network_profile {
    network_plugin    = "${var.network_plugin}"
  }

  role_based_access_control {
    enabled = true
  }
}

Expected Behavior

Changing "count" in one of the "agent_pool_profile" and running "terraform apply" should add one more node to cluster.

Actual Behavior

Terraform replaces the whole cluster and adds a new one with the new number of nodes in the given nodepool. Looking at the plan, it also seems to be swapping the nodepool names around.

terraform plan output:

  # azurerm_kubernetes_cluster.cloudbees-jenkins-dev must be replaced

      ~ agent_pool_profile {
          ~ count           = 1 -> 2
          + dns_prefix      = (known after apply)
          ~ fqdn            = "test" -> (known after apply)
            max_pods        = 30
          ~ name            = "medium" -> "performance" # forces replacement
            os_disk_size_gb = 50
            os_type         = "Linux"
            type            = "VirtualMachineScaleSets"
          ~ vm_size         = "Standard_D4s_v3" -> "Standard_D8s_v3" # forces replacement
        }

      ~ agent_pool_profile {
            count           = 1
          + dns_prefix      = (known after apply)
          ~ fqdn            = "test" -> (known after apply)
            max_pods        = 30
          ~ name            = "performance" -> "medium" # forces replacement
            os_disk_size_gb = 50
            os_type         = "Linux"
            type            = "VirtualMachineScaleSets"
          ~ vm_size         = "Standard_D8s_v3" -> "Standard_D4s_v3" # forces replacement
        }

      - service_principal {
          - client_id = "acutal_client_id" -> null
        }

Steps to Reproduce

1. Change the nodepool count from 1 to 2 (for example, via a tfvars change as sketched below)
2. Run `terraform plan` or `terraform apply`
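
A minimal sketch of that change, assuming the counts are supplied through a terraform.tfvars file using the variable names from the configuration above (values are hypothetical):

# terraform.tfvars
agent_pool1_count = 2   # previously 1; on its own this should only add a node to the pool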

@rmb938

rmb938 commented Jul 23, 2019

I would also like the ability to modify agent pools without having the cluster be recreated.

All of this can be done via the command line without having to delete the cluster: https://docs.microsoft.com/en-us/cli/azure/ext/aks-preview/aks/nodepool?view=azure-cli-latest

So it should be simple to modify the provider to do something similar.
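
For example, an existing node pool can be scaled in place from the CLI; the resource group, cluster, and pool names below are placeholders:

az aks nodepool scale \
  --resource-group my-rg \
  --cluster-name my-aks-cluster \
  --name medium \
  --node-count 2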

@djsly
Contributor

djsly commented Aug 1, 2019

@titilambert here's another one that we could tackle at the same time

@davidack

Modifications to node pools causing the cluster to be destroyed and recreated are definitely a problem with the current version of the azurerm provider. That problem still needs to be fixed.

However, in this case I believe you are running into the same problem I ran into last week: the provider is not sorting the agent_pool_profile blocks from your code before comparing them to the node pools in the current state (which appear to be returned in alphabetical order by name). Two of the four agent_pool_profile blocks in my code were not in alphabetical order by name, and running terraform plan or terraform apply would result in exactly the kind of behavior you are seeing: a plan that wanted to destroy and then recreate the two node pools in question along with the cluster, while swapping all the differing parameters of the two node pools (name, vm_size, etc.), even if no changes had been made to the code since the last apply.

It seems to me that the provider should be sorting both the elements from the code and the elements from the query of the current state in the same way, so they can be compared properly. Should this be considered another aspect of this issue, or should I open a separate issue for it?

The workaround, until this sorting bug is fixed, is to make sure the agent_pool_profile blocks in your code are listed in alphabetical order by name, as in the sketch below.
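
As a minimal sketch of that ordering, using the two pool names and sizes from the plan output above (other arguments omitted for brevity):

  # agent_pool_profile blocks listed alphabetically by name, matching the order
  # in which the API appears to return node pools
  agent_pool_profile {
    name    = "medium"
    count   = 2
    vm_size = "Standard_D4s_v3"
    os_type = "Linux"
  }

  agent_pool_profile {
    name    = "performance"
    count   = 1
    vm_size = "Standard_D8s_v3"
    os_type = "Linux"
  }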

@artburkart

@davidack, someone ultimately made an issue documenting what you reported: #4560

@davidack

Thanks Art, for both the heads up and for the fix in #4676.

@ghost

ghost commented Nov 26, 2019

This has been released in version 1.37.0 of the provider. Please see the Terraform documentation on provider versioning or reach out if you need any assistance upgrading. As an example:

provider "azurerm" {
    version = "~> 1.37.0"
}
# ... other configuration ...

@nidhi5885

I am facing the same kind of issue:

Problem statement: Terraform is causing the Kubernetes cluster to be recreated every time I execute the command below:
az aks get-credentials -n K8clustername -g resourcegroupname

In particular, this command replaces my .kube/config file.

I do not understand how executing the above command changes the Terraform state.

Provider Versions I am using:
Terraform v0.12.2

  • provider.azurerm v1.39.0
  • provider.helm v0.10.4
  • provider.kubernetes v1.10.0
  • provider.local v1.4.0

@ghost

ghost commented Jan 17, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. If you feel I made an error 🤖 🙉 , please reach out to my human friends 👉 [email protected]. Thanks!

@ghost ghost locked and limited conversation to collaborators Jan 17, 2020