
Terraform wants to recreate cluster on every apply #8

Closed
asaghri opened this issue Oct 25, 2018 · 19 comments


@asaghri

asaghri commented Oct 25, 2018

Hello,

Thanks for the great module. However, with the availability zones variable set, Terraform wants to recreate the cluster on every apply, as you can see in this issue (hashicorp/terraform#16724).

I guess a workaround would be to drop the availability_zones variable from the cluster.

Thanks!

@max-rocket-internet
Contributor

Hey @asaghri
Interesting. Could you paste your code?

I am using this without any problem:

availability_zones              = ["${data.aws_availability_zones.available.names}"]

@asaghri
Author

asaghri commented Oct 25, 2018

Hey @max-rocket-internet,

That was super fast! Thanks for the advice, I'll try that right away.

Before, I had this:
availability_zones = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]

@max-rocket-internet
Contributor

availability_zones = ["eu-west-1a", "eu-west-1b", "eu-west-1c"]

That's functionally the same as my example. So no worries there.

If you could paste your code and the output from Terraform when it wants to destroy and recreate the cluster, that would be good 🙂

@asaghri
Author

asaghri commented Oct 25, 2018

OK, so here is the code:

module "aurora" {
  source                          = "github.com/terraform-aws-modules/terraform-aws-rds-aurora"
  name                            = "${local.name_prefix}-db"
  engine                          = "aurora-postgresql"
  engine_version                  = "10.4"
  subnets                         = ["${module.vpc.database_subnets}"]
  availability_zones              = ["${data.aws_availability_zones.available.names}"]
  vpc_id                          = "${module.vpc.vpc_id}"
  replica_count                   = "${var.aurora_replica_count}"
  username                        = "${var.aurora_master_username}"
  password                        = "${var.aurora_master_password}"
  instance_type                   = "${var.aurora_instance_type}"
  snapshot_identifier             = "${var.snapshot_identifier}"
  apply_immediately               = true
  skip_final_snapshot             = true
  db_parameter_group_name         = "${aws_db_parameter_group.aurora_db_postgres10_parameter_group.id}"
  db_cluster_parameter_group_name = "${aws_rds_cluster_parameter_group.aurora_cluster_postgres10_parameter_group.id}"
}

And the output:

-/+ module.cleo.module.aurora.aws_rds_cluster.this (new resource required)
      id:                               "staging-cleo-bfmtv-db" => <computed> (forces new resource)
      apply_immediately:                "true" => "true"
      arn:                              "arn:aws:rds:eu-west-1:834179885026:cluster:staging-cleo-bfmtv-db" => <computed>
      availability_zones.#:             "3" => "3"
      availability_zones.1924028850:    "eu-west-1b" => "eu-west-1b"
      availability_zones.3953592328:    "eu-west-1a" => "eu-west-1a"
      availability_zones.94988580:      "eu-west-1c" => "eu-west-1c"
      backup_retention_period:          "7" => "7"
      cluster_identifier:               "staging-cleo-bfmtv-db" => "staging-cleo-bfmtv-db"
      cluster_identifier_prefix:        "" => <computed>
      cluster_members.#:                "1" => <computed>
      cluster_resource_id:              "cluster-SBQNHFRE2DT7SUUDDBPAEMFZQY" => <computed>
      database_name:                    "" => <computed>
      db_cluster_parameter_group_name:  "staging-cleo-bfmtv-aurora-postgres10-cluster-parameter-group" => "staging-cleo-bfmtv-aurora-postgres10-cluster-parameter-group"
      db_subnet_group_name:             "staging-cleo-bfmtv-db" => "staging-cleo-bfmtv-db"
      endpoint:                         "staging-cleo-bfmtv-db.cluster-cw3emfe46duo.eu-west-1.rds.amazonaws.com" => <computed>
      engine:                           "aurora-postgresql" => "aurora-postgresql"
      engine_mode:                      "provisioned" => "provisioned"
      engine_version:                   "10.4" => "10.4"
      final_snapshot_identifier:        "final-staging-cleo-bfmtv-db-557bdeaf" => "final-staging-cleo-bfmtv-db-557bdeaf"
      hosted_zone_id:                   "Z29XKXDKYMONMX" => <computed>
      kms_key_id:                       "" => <computed>
      master_password:                  <sensitive> => <sensitive> (attribute changed)
      master_username:                  "cleorecette" => "cleorecette"
      port:                             "5432" => "5432"
      preferred_backup_window:          "02:00-03:00" => "02:00-03:00"
      preferred_maintenance_window:     "sun:05:00-sun:06:00" => "sun:05:00-sun:06:00"
      reader_endpoint:                  "staging-cleo-bfmtv-db.cluster-ro-cw3emfe46duo.eu-west-1.rds.amazonaws.com" => <computed>
      skip_final_snapshot:              "true" => "true"
      snapshot_identifier:              "cleo-aurora" => "cleo-aurora"
      storage_encrypted:                "false" => "true" (forces new resource)
      vpc_security_group_ids.#:         "1" => "1"
      vpc_security_group_ids.560316842: "sg-002c06942ec79f3f6" => "sg-002c06942ec79f3f6"

-/+ module.cleo.module.aurora.aws_rds_cluster_instance.this (new resource required)
      id:                               "staging-cleo-bfmtv-db-1" => <computed> (forces new resource)
      apply_immediately:                "true" => "true"
      arn:                              "arn:aws:rds:eu-west-1:834179885026:db:staging-cleo-bfmtv-db-1" => <computed>
      auto_minor_version_upgrade:       "true" => "true"
      availability_zone:                "eu-west-1c" => <computed>
      cluster_identifier:               "staging-cleo-bfmtv-db" => "${aws_rds_cluster.this.id}" (forces new resource)
      db_parameter_group_name:          "staging-cleo-bfmtv-aurora-db-postgres10-parameter-group" => "staging-cleo-bfmtv-aurora-db-postgres10-parameter-group"
      db_subnet_group_name:             "staging-cleo-bfmtv-db" => "staging-cleo-bfmtv-db"
      dbi_resource_id:                  "db-RVT2U4M3CN7DQTQIMDOK6QQVVU" => <computed>
      endpoint:                         "staging-cleo-bfmtv-db-1.cw3emfe46duo.eu-west-1.rds.amazonaws.com" => <computed>
      engine:                           "aurora-postgresql" => "aurora-postgresql"
      engine_version:                   "10.4" => "10.4"
      identifier:                       "staging-cleo-bfmtv-db-1" => "staging-cleo-bfmtv-db-1"
      identifier_prefix:                "" => <computed>
      instance_class:                   "db.r4.large" => "db.r4.large"
      kms_key_id:                       "" => <computed>
      monitoring_interval:              "0" => "0"
      monitoring_role_arn:              "" => <computed>
      performance_insights_enabled:     "false" => "false"
      performance_insights_kms_key_id:  "" => <computed>
      port:                             "5432" => <computed>
      preferred_backup_window:          "02:00-03:00" => <computed>
      preferred_maintenance_window:     "sun:05:00-sun:06:00" => "sun:05:00-sun:06:00"
      promotion_tier:                   "1" => "1"
      publicly_accessible:              "false" => "false"
      storage_encrypted:                "false" => <computed>
      writer:                           "true" => <computed>


Plan: 2 to add, 0 to change, 2 to destroy.

@asaghri
Author

asaghri commented Oct 25, 2018

OK, sorry, the problem came from storage encryption.
The snapshot I used wasn't encrypted, so on every apply Terraform tried to change the storage encryption setting, but it seems it's not possible to change storage encryption in place.

The plan shows the cluster being recreated with storage encryption set to true, but it never actually changes it.
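A minimal sketch of what aligning the config with an unencrypted snapshot could look like, assuming the module exposes a `storage_encrypted` variable (other arguments elided). Since encryption can't be toggled on an existing cluster, the config has to match the snapshot:

```hcl
module "aurora" {
  source              = "github.com/terraform-aws-modules/terraform-aws-rds-aurora"
  # ... other arguments as in the example above ...

  # The snapshot is unencrypted, so keep the cluster unencrypted too;
  # storage encryption cannot be changed in place, which is what
  # produced the "forces new resource" diff on every apply.
  snapshot_identifier = "${var.snapshot_identifier}"
  storage_encrypted   = false
}
```

To actually get an encrypted cluster, you'd restore from an encrypted copy of the snapshot instead.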

@max-rocket-internet
Contributor

Ah I see! Well mystery solved then.

@gannino

gannino commented Nov 16, 2018

Hi Everyone,
I'm new to terraform (only 2 months) and I've been hit from this odd behavior too to be honest my belief is that the Aurora cluster try to use 3AZ when you specify the value availability_zone, even if you set your AZ to be only a and b when terraform refresh the status it find a 3rd zone that causes the destroy and recreate behaviour, as from info per the previous issue this disappear once you comment out availability_zone and specify only the db_subnet_group_name for your cluster.

hope this help, but an article explaining this behaviour might save some day of research for someone else.

hope this helps to find the root cause,
thanks everyone!
Giovanni

@nergdron

nergdron commented Nov 22, 2018

I'm seeing this right now with the latest code, using only the subnets option, with no AZs or anything else specified. For me, it can't seem to correctly read the AZs and id state for the existing cluster, so it always thinks those have changed:

      id:                                "my-db-id" => <computed> (forces new resource)
      availability_zones.#:              "3" => "0" (forces new resource)
      availability_zones.3551460226:     "us-east-1e" => "" (forces new resource)
      availability_zones.3569565595:     "us-east-1a" => "" (forces new resource)
      availability_zones.986537655:      "us-east-1c" => "" (forces new resource)

@asaghri
Author

asaghri commented Nov 22, 2018

Could you share the code?
I use data.aws_availability_zones.available.names to indicate the AZs and it works fine.
Do you use a snapshot ID?
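For completeness, the data source behind that expression looks roughly like this (a sketch in Terraform 0.11 syntax; `available` is just a local label):

```hcl
# Enumerates the availability zones of the current region at plan time,
# so the list always matches what AWS reports back on refresh.
data "aws_availability_zones" "available" {}

module "aurora" {
  # ...
  availability_zones = ["${data.aws_availability_zones.available.names}"]
}
```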

@nergdron

Yes, I'm not specifying the AZs; all I'm supplying is the subnets, and it's computing the AZs from that. I'm not using a snapshot at all, this is a fresh install.

@nergdron

module "db" {
  source = "../../modules/aws-rds-aurora"

  name                    = "something-${var.aws_env}"
  identifier_prefix       = "something-${var.aws_env}"
  vpc_id                  = "${data.aws_vpc.info.id}"
  subnets                 = "${var.aws_private_subnet_ids}"
  allowed_security_groups = ["${aws_security_group.mysg.id}"]

  engine                          = "aurora-postgresql"
  engine_version                  = "10.4"
  storage_encrypted               = "true"
  preferred_maintenance_window    = "Sun:03:00-Sun:03:30"
  preferred_backup_window         = "04:00-04:30"
  replica_count                   = 1
  instance_type                   = "db.${var.aws_instance_type}"
  skip_final_snapshot             = true                          # not useful on initial create
  db_parameter_group_name         = "default.aurora-postgresql10"
  db_cluster_parameter_group_name = "default.aurora-postgresql10"

  username = "something"
  password = "somethingelse" # default, must be changed after setup

  tags = {
    Name        = "something-${var.aws_env}"
    environment = "${var.aws_env}"
    terraform   = "true"
  }
}

@asaghri
Author

asaghri commented Nov 22, 2018

OK, then you should try adding this to use all the available zones and see if it works:

availability_zones = "${data.aws_availability_zones.available.names}"

@nergdron

Looks like if I do that, it'll want to kill and recreate my instance too, since we're intentionally using specific AZs, given that us-east-1 is a bit of a mess and not all instance types and configurations are available in all AZs.

-/+ module.db.aws_rds_cluster.this (new resource required)
      id:                                "chatapi-metrics-dev" => <computed> (forces new resource)
      apply_immediately:                 "false" => "false"
      arn:                               "arn:aws:rds:us-east-1:759579518471:cluster:chatapi-metrics-dev" => <computed>
      availability_zones.#:              "3" => "6" (forces new resource)
      availability_zones.1252502072:     "" => "us-east-1f" (forces new resource)
      availability_zones.1305112097:     "" => "us-east-1b" (forces new resource)
      availability_zones.2762590996:     "" => "us-east-1d" (forces new resource)
      availability_zones.3551460226:     "us-east-1e" => "us-east-1e"
      availability_zones.3569565595:     "us-east-1a" => "us-east-1a"
      availability_zones.986537655:      "us-east-1c" => "us-east-1c"

Strangely, it sees the existing AZs when I supply AZs as an arg, but not when I let it compute them, so I feel like this may be something wrong in the upstream Terraform module. Note that even if I fix the AZs problem, it still wants to recreate the cluster every time because of the id as well, so I'm not sure what the solution here is.

@max-rocket-internet
Contributor

I think the problem is already solved in this PR: #10.

i.e. just pass subnets to the module and don't use the availability_zones argument at all. It's not really clear from the documentation how these 2 arguments interact when they don't match.
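Something like this, as a sketch (trimmed down from nergdron's example above):

```hcl
module "db" {
  source  = "../../modules/aws-rds-aurora"
  name    = "something-${var.aws_env}"
  vpc_id  = "${data.aws_vpc.info.id}"
  subnets = "${var.aws_private_subnet_ids}"

  # availability_zones intentionally omitted: the AZs are derived
  # from the subnets, so specifying both can produce a perpetual
  # destroy/recreate diff when the two lists don't match.
}
```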

given that us-east-1 is a bit of a mess and not all instance types and configurations are available in all AZs.

Are you sure?? I've never heard of different AZs being inconsistent in this way.

@nergdron

Oh yeah, PR #10 does seem to be what I'm encountering, thanks!

As for the AZ issues, we've definitely run into this in the past, where certain instance types were only available in certain AZs for extended periods of time, but only in us-east-1. us-west-2, for instance, never seems to have this issue. We've always put it down to it being the oldest and crustiest region, and AWS not exactly keeping things consistent much of anywhere in their codebases.

@MarkAsbrey

OK, sorry, the problem came from storage encryption.
The snapshot I used wasn't encrypted, so on every apply Terraform tried to change the storage encryption setting, but it seems it's not possible to change storage encryption in place.

The plan shows the cluster being recreated with storage encryption set to true, but it never actually changes it.

I'm having what looks to be the same issue. Did you manage to get around it without needing to destroy the cluster?

@wayneworkman

wayneworkman commented Jun 26, 2019

I'm having the same problems with DocumentDB. If I pass a subnet group to the cluster, it wants to rebuild it every time. If I pass the availability zones, it wants to rebuild it every time. If I comment out those two lines, it doesn't rebuild it every time.

I have to be able to pass a list of subnets at the minimum. We only use two availability zones of the 3 available in the region.
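In case it helps, a sketch of the same ignore_changes workaround applied to DocumentDB (resource types and arguments are from the AWS provider; the identifiers, credentials, and subnet variables here are placeholders):

```hcl
# Subnet group spanning only the two AZs we actually use.
resource "aws_docdb_subnet_group" "this" {
  name       = "my-docdb-subnets"
  subnet_ids = ["${var.subnet_a_id}", "${var.subnet_b_id}"]
}

resource "aws_docdb_cluster" "this" {
  cluster_identifier   = "my-docdb"
  master_username      = "root"
  master_password      = "${var.docdb_password}"
  db_subnet_group_name = "${aws_docdb_subnet_group.this.name}"

  lifecycle {
    # AWS can report an extra AZ on refresh; ignoring the attribute
    # prevents the destroy/recreate on every apply.
    ignore_changes = ["availability_zones"]
  }
}
```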

@sanoop19

sanoop19 commented Apr 10, 2020

My requirement was only two AZs with Aurora. Don't use a DB subnet group on the RDS instance. I'm not sure of the logic, but this resolved my issue. You also need a lifecycle rule in place on the cluster:

  apply_immediately = "true"

  lifecycle {
    ignore_changes = [
      "availability_zones",
    ]
  }
}

Before the issue:

resource "aws_rds_cluster_instance" "test" {
  count                   = var.count1
  instance_class          = var.instance_type
  identifier              = "${var.rds_identifier}-${count.index + 1}"
  cluster_identifier      = aws_rds_cluster.testmigcluster.id
  #db_subnet_group_name   = aws_db_subnet_group.subnet_group.name
  ca_cert_identifier      = "rds-ca-2019"
  promotion_tier          = "1"
  db_parameter_group_name = aws_db_parameter_group.rds-parameter.name
  engine                  = var.engine
  engine_version          = var.engine_version
}

After solving:

resource "aws_rds_cluster_instance" "test" {
  count              = var.count1
  instance_class     = var.instance_type
  identifier         = "${var.rds_identifier}-${count.index + 1}"
  cluster_identifier = aws_rds_cluster.testmigcluster.id
  #db_subnet_group_name = aws_db_subnet_group.subnet_group.name
  ca_cert_identifier = "rds-ca-2019"
  promotion_tier     = "1"
  engine             = var.engine
  engine_version     = var.engine_version
}

Also, for the availability zones I specified eu-west-1a and eu-west-1b, and in the subnet group for the RDS cluster I used the eu-west-1a and eu-west-1b subnets.

bryantbiggs pushed a commit to bryantbiggs/terraform-aws-rds-aurora that referenced this issue Aug 16, 2022

[HARRI-136673] Add example for parameter-group
@github-actions

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Apr 15, 2023