Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

r/aws_security_group_rule create_before_destroy fails when changing CIDRs #25173

Closed
Nuru opened this issue Jun 4, 2022 · 8 comments
Closed
Labels
bug Addresses a defect in current functionality. service/vpc Issues and PRs that pertain to the vpc service. stale Old or inactive issues managed by automation, if no further action taken these will get closed.

Comments

@Nuru
Copy link

Nuru commented Jun 4, 2022

Community Note

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or other comments that do not add relevant new information or questions, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment

Terraform CLI and Terraform AWS Provider Version

Terraform v1.2.0
on darwin_amd64
+ provider registry.terraform.io/hashicorp/aws v4.17.1

Affected Resource(s)

  • aws_security_group_rule

Terraform Configuration Files

Click to reveal
variable "ipv6_allowed" {
  type    = bool
  default = false
}

variable "other_vpc_allowed" {
  type    = bool
  default = false
}

provider "aws" {
  region = "us-west-2"
}

resource "aws_vpc" "default" {
  cidr_block = "10.9.8.0/24"

  assign_generated_ipv6_cidr_block = true
}

resource "aws_security_group" "default" {
  vpc_id = aws_vpc.default.id
}

resource "aws_security_group_rule" "default" {
  security_group_id = aws_security_group.default.id
  type              = "ingress"
  from_port         = 0
  to_port           = 0 # [sic] from and to port ignored when protocol is "-1", warning if not zero
  protocol          = "-1"
  description       = "Allow ingress from VPC"
  cidr_blocks       = concat([aws_vpc.default.cidr_block], var.other_vpc_allowed ? ["10.1.2.0/24"] : [])
  ipv6_cidr_blocks  = var.ipv6_allowed ? [aws_vpc.default.ipv6_cidr_block] : []

  lifecycle {
    create_before_destroy = true
  }
}

Expected Behavior

I want to use create_before_destroy = true for Security Group Rules so that I can avoid the service interruption that comes when existing rules are destroyed before the new rules are created. I expect the provider to be smart enough to avoid trying to create an existing rule.

Actual Behavior

Whenever an CIDR is added or removed from a rule, the apply fails if an existing CIDR is retained.

│ Error: [WARN] A duplicate Security Group rule was found on (sg-01b3487e05d48a6a3). This may be
│ a side effect of a now-fixed Terraform issue causing two security groups with
│ identical attributes but different source_security_group_ids to overwrite each
│ other in the state. See https://github.com/hashicorp/terraform/pull/2376 for more
│ information and instructions for recovery. Error: InvalidPermission.Duplicate: the specified rule "peer: 10.9.8.0/24, ALL, ALLOW" already exists

Unlike with other, similar issues (see References below), in this case the change is not actually made.

Steps to Reproduce

  1. terraform apply -auto-approve
  2. terraform apply -auto-approve -var other_vpc_allowed=true # error
  3. terraform apply -auto-approve -var ipv6_allowed=true # error
  4. terraform destroy
  5. terraform apply -auto-approve -var other_vpc_allowed=true
  6. terraform apply -auto-approve # error

References

Similar to, but different from these other issues. Different because in this case the rules are not properly updated and there is no good workaround. The only fix that does not cause a service interruption is to add the new rule manually.

@github-actions github-actions bot added needs-triage Waiting for first response or review from a maintainer. service/vpc Issues and PRs that pertain to the vpc service. labels Jun 4, 2022
@osterman
Copy link

osterman commented Jun 6, 2022

Linking hashicorp/terraform#2376 since it's mentioned in the output.

@justinretzolk justinretzolk added bug Addresses a defect in current functionality. and removed needs-triage Waiting for first response or review from a maintainer. labels Jun 6, 2022
@carlosjgp
Copy link

carlosjgp commented Sep 9, 2022

This is a limitation, or feature (?), of the AWS API that doesn't allow duplicated entries + the fact that the resource aws_security_group_rule creates different SG rule IDs per CIDR for each item provided on the list

Later on when trying to change that list something goes terribly wrong and either

  • TF tries to add the whole list again and fails since there are entries that already exist
  • TF tries to delete all the SG Rule IDs and create them all again but fails to delete them or there is a race condition doing this or AWS API still reports that it exists for a while after being deleted 🤷

I'd advise the following workaround

resource "aws_security_group_rule" "default_ipv4" {
  for_each = toset(concat([aws_vpc.default.cidr_block], var.other_vpc_allowed ? ["10.1.2.0/24"] : []))
  security_group_id = aws_security_group.default.id
  type              = "ingress"
  from_port         = 0
  to_port           = 0 # [sic] from and to port ignored when protocol is "-1", warning if not zero
  protocol          = "-1"
  description       = "Allow ingress from VPC"
  cidr_blocks       = [each.value]
}

resource "aws_security_group_rule" "default_ipv6" {
  for_each = toset(var.ipv6_allowed ? [aws_vpc.default.ipv6_cidr_block] : [])
  security_group_id = aws_security_group.default.id
  type              = "ingress"
  from_port         = 0
  to_port           = 0 # [sic] from and to port ignored when protocol is "-1", warning if not zero
  protocol          = "-1"
  description       = "Allow ingress from VPC"
  ipv6_cidr_blocks  = [each.value]
}

You might need to increase your account SG rule per SG attachments

@ewbankkit
Copy link
Contributor

Relates #20104.

@aidanmelen
Copy link
Contributor

aidanmelen commented Oct 7, 2022

The challenge working with the aws_security_group_rule resource is that individual list elements and individual argument types are not updated in-place and are instead replaced. That is the fundamental challenge here.

Attempts to use Create Before Destroy lifecycle management with this resource will frequently result in duplcate rule failures. The only generalization solution that I can think of is to have users create a single aws_security_group_rule resource for each EC2 Security Group rule you intend to create. This obviously defeats the purpose of the argument rule argument grouping.

I am working to solve this at the module level by unpacking user provided rules such that the user can specify any permutation of rule argument groups in the Terraform while guaranteeing that for_each user defined rule a single/dedicated aws_security_group_rule resource will be created. Rule replacement caused by updates will not have the unwanted side-effect of changing other rules that just so happen to be grouped in the same resource.

@aidanmelen
Copy link
Contributor

@aidanmelen
Copy link
Contributor

@aidanmelen
Copy link
Contributor

I just released v2.0.0 of the terraform-aws-security-group-v2 module with support for what I call unpacked rules. That is:

A single aws_security_group_rule resource can result in one or many security group rules being created by the EC2 API. When unpacking is enabled, aws_security_group_rule resource arguments supplied by the user will be unpacked such that each resource is guaranteed to result in the EC2 API creating exactly one rule. This prevents the side-effect of a single rule update causing many other unwanted updates during the replacement.

The core unpack feature works like this:

module "unpack" {
  source  = "aidanmelen/security-group-v2/aws"
  version = ">= 2.0.0"
  rules = [
    {
      from_port                = "443"
      to_port                  = "443"
      protocol                 = "tcp"
      cidr_blocks              = ["10.0.1.0/24", "10.0.2.0/24"]
      ipv6_cidr_blocks         = ["2001:db8::/64"]
      prefix_list_ids          = [data.aws_prefix_list.private_s3.id]
      source_security_group_id = data.aws_security_group.default.id
      self                     = true
      description              = "managed by Terraform"
    }
  ]
}

# Outputs:

# rules = [
#   {
#     "cidr_blocks" = [
#       "10.0.1.0/24",
#     ]
#     "description" = "managed by Terraform"
#     "from_port" = "443"
#     "protocol" = "tcp"
#     "to_port" = "443"
#   },
#   {
#     "cidr_blocks" = [
#       "10.0.2.0/24",
#     ]
#     "description" = "managed by Terraform"
#     "from_port" = "443"
#     "protocol" = "tcp"
#     "to_port" = "443"
#   },
#   {
#     "description" = "managed by Terraform"
#     "from_port" = "443"
#     "ipv6_cidr_blocks" = [
#       "2001:db8::/64",
#     ]
#     "protocol" = "tcp"
#     "to_port" = "443"
#   },
#   {
#     "description" = "managed by Terraform"
#     "from_port" = "443"
#     "prefix_list_ids" = [
#       "pl-11111111",
#     ]
#     "protocol" = "tcp"
#     "to_port" = "443"
#   },
#   {
#     "description" = "managed by Terraform"
#     "from_port" = "443"
#     "protocol" = "tcp"
#     "self" = true
#     "to_port" = "443"
#   },
#   {
#     "description" = "managed by Terraform"
#     "from_port" = "443"
#     "protocol" = "tcp"
#     "source_security_group_id" = "sg-11111111"
#     "to_port" = "443"
#   },
# ]

In this way each rule gets a dedicated aws_security_group_rule resource which solves for the need for create before destroy.

Please see the Unpack Example for more information.

Copy link

Marking this issue as stale due to inactivity. This helps our maintainers find and focus on the active issues. If this issue receives no comments in the next 30 days it will automatically be closed. Maintainers can also remove the stale label.

If this issue was automatically closed and you feel this issue should be reopened, we encourage creating a new issue linking back to this one for added context. Thank you!

@github-actions github-actions bot added the stale Old or inactive issues managed by automation, if no further action taken these will get closed. label Sep 29, 2024
@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Nov 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Addresses a defect in current functionality. service/vpc Issues and PRs that pertain to the vpc service. stale Old or inactive issues managed by automation, if no further action taken these will get closed.
Projects
None yet
Development

No branches or pull requests

6 participants