aws_security_group: DependencyViolation: resource sg-XXX has a dependent object #11047

Closed
brikis98 opened this issue Jan 5, 2017 · 23 comments

Comments

@brikis98
Contributor

brikis98 commented Jan 5, 2017

Terraform Version

Terraform v0.8.2

Affected Resource(s)

  • aws_security_group

Terraform Configuration Files

This is part of a larger configuration, but I think the relevant parts are as follows.

Under modules/webserver-cluster/main.tf, I define a module with the following code:

resource "aws_autoscaling_group" "example" {
  launch_configuration = "${aws_launch_configuration.example.id}"
  availability_zones   = ["${data.aws_availability_zones.all.names}"]
  load_balancers       = ["${aws_elb.example.name}"]
  health_check_type    = "ELB"

  min_size = 2
  max_size = 10
}

resource "aws_launch_configuration" "example" {
  image_id        = "ami-40d28157"
  instance_type   = "t2.micro"
  security_groups = ["${aws_security_group.instance.id}"]

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_security_group" "instance" {
  name = "my-security-group"

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_security_group_rule" "allow_http_inbound" {
  type              = "ingress"
  security_group_id = "${aws_security_group.instance.id}"

  from_port   = 80
  to_port     = 80
  protocol    = "tcp"
  cidr_blocks = ["0.0.0.0/0"]
}

data "aws_availability_zones" "all" {}

resource "aws_elb" "example" {
  name               = "my-example-elb"
  availability_zones = ["${data.aws_availability_zones.all.names}"]
  security_groups    = ["${aws_security_group.elb.id}"]

  listener {
    lb_port           = 80
    lb_protocol       = "http"
    instance_port     = 80
    instance_protocol = "http"
  }

  health_check {
    healthy_threshold   = 2
    unhealthy_threshold = 2
    timeout             = 3
    interval            = 30
    target              = "HTTP:80/"
  }
}

resource "aws_security_group" "elb" {
  name = "elb"
}

resource "aws_security_group_rule" "allow_elb_http_inbound" {
  type              = "ingress"
  security_group_id = "${aws_security_group.elb.id}"

  from_port   = 80
  to_port     = 80
  protocol    = "tcp"
  cidr_blocks = ["0.0.0.0/0"]
}

resource "aws_security_group_rule" "allow_all_outbound" {
  type              = "egress"
  security_group_id = "${aws_security_group.elb.id}"

  from_port   = 0
  to_port     = 0
  protocol    = "-1"
  cidr_blocks = ["0.0.0.0/0"]
}

output "elb_security_group_id" {
  value = "${aws_security_group.elb.id}"
}

In a separate folder, I use this module in the usual way, but also add a custom security group rule:

module "webserver_cluster" {
  source = "modules/webserver-cluster"

  # ... pass various parameters ...
}

resource "aws_security_group_rule" "allow_testing_inbound" {
  type              = "ingress"
  security_group_id = "${module.webserver_cluster.elb_security_group_id}"

  from_port   = 12345
  to_port     = 12345
  protocol    = "tcp"
  cidr_blocks = ["0.0.0.0/0"]
}

Expected Behavior

I expect to be able to run terraform apply and terraform destroy without errors.

Actual Behavior

terraform apply works fine. Occasionally, terraform destroy fails with the following error:

aws_security_group.elb: DependencyViolation: resource sg-344baa48 has a dependent object

Steps to Reproduce

  1. terraform apply
  2. terraform destroy

Important Factoids

It's an intermittent issue, so I can't be sure, but I don't think this error happened with Terraform 0.7.x.

@catsby catsby added the core label Jan 5, 2017
@leigh507

leigh507 commented Jan 6, 2017

Same issue seen on terraform-0.7.13.

@r0ps3c

r0ps3c commented Jan 31, 2017

I'm having the same problem with v0.8.5

@brikis98
Contributor Author

brikis98 commented Mar 7, 2017

Hitting the issue regularly now with Terraform 0.8.8.

@josh-padnick

I can confirm I'm also hitting the issue regularly in v0.8.8.

TestPersistentEbsVolume 2017/03/08 01:46:34 module.example.aws_security_group.instance: Still destroying... (1m0s elapsed)
TestPersistentEbsVolume 2017/03/08 01:46:44 module.example.aws_security_group.instance: Still destroying... (1m10s elapsed)
...
TestPersistentEbsVolume 2017/03/08 01:50:24 module.example.aws_security_group.instance: Still destroying... (4m50s elapsed)
TestPersistentEbsVolume 2017/03/08 01:50:34 module.example.aws_security_group.instance: Still destroying... (5m0s elapsed)
TestPersistentEbsVolume 2017/03/08 01:50:36 Error applying plan:
TestPersistentEbsVolume 2017/03/08 01:50:36 
TestPersistentEbsVolume 2017/03/08 01:50:36 1 error(s) occurred:
TestPersistentEbsVolume 2017/03/08 01:50:36 
TestPersistentEbsVolume 2017/03/08 01:50:36 * aws_security_group.instance: DependencyViolation: resource sg-14029373 has a dependent object
TestPersistentEbsVolume 2017/03/08 01:50:36 	status code: 400, request id: 486cd332-60a5-4ffc-ad49-8979207f96d9
TestPersistentEbsVolume 2017/03/08 01:50:36 

It looks like Terraform is indeed waiting for some dependent object on the Security Group...

@josh-padnick

An anecdotal update on this. When I manually delete the Security Group from the AWS Console, Terraform immediately continues executing, so I suspect there's some bug around deleting a Security Group or first terminating an EC2 Instance and then deleting the Security Group.

@mpas

mpas commented Mar 9, 2017

I ran into the same situation: I need to manually delete the network interfaces attached to the security groups, and then it works.
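
To see which network interfaces are still holding on to a security group before deleting them by hand, something like the following sketch may help. It assumes an AWS provider version that ships the `aws_network_interfaces` data source (not the versions discussed in this thread), and the variable name `stuck_security_group_id` is hypothetical.

```hcl
# Hypothetical input: the ID of the security group that refuses to delete,
# e.g. the sg-... value from the DependencyViolation message.
variable "stuck_security_group_id" {}

# Look up every network interface that still references that group.
# "group-id" is a filter of the EC2 DescribeNetworkInterfaces API.
data "aws_network_interfaces" "holding_group" {
  filter {
    name   = "group-id"
    values = ["${var.stuck_security_group_id}"]
  }
}

# Print the interface IDs so they can be inspected in the console or CLI.
output "interfaces_holding_group" {
  value = "${data.aws_network_interfaces.holding_group.ids}"
}
```

This only lists the interfaces; it does not detach or delete anything.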

@philoserf

philoserf commented Mar 13, 2017

This appears to be an order of execution problem. I ran into it with security groups.

I have verified that the destroy is issued:

  1. before the new security group is added to an instance, and
  2. before the security group being destroyed is removed from an instance.

I will explore some more and update this issue as I find more detail.


As a workaround for other users I offer the following:

While the destroy is underway (Terraform prints "Still destroying... (10s elapsed)"):

  1. Add the replacement security group to the target instance(s).
  2. Remove the security group being destroyed from the target instance(s).
  3. Watch Terraform fail to add the new security group to the instance(s):

* aws_instance.PLACEHOLDER: InvalidParameterValue: Value (PLACEHOLDER) for parameter iamInstanceProfile.name is invalid. Invalid IAM Instance Profile name
	status code: 400, request id: PLACEHOLDER_GUID

  4. Rerun the Terraform build.

@josh-padnick

Still hitting this on Terraform 0.9. Can maintainers make an official decision on whether to declare this a bug?

@kerbou

kerbou commented Mar 29, 2017

Hi all. I experienced the same error message today when renaming security groups. I suspect that my EC2 instance is the dependent object, since my environment is quite small (1 EC2 instance, 1 ELB, 1 RDS instance, plus the security groups and subnets needed to make things communicate).

Terraform version: 0.9.1

@dolphane

I am also experiencing this with two interdependent security groups (each refers to the other as a source).

Terraform v0.9.3
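
For reference, here is a minimal sketch of two groups that refer to each other, with the cross-references written as standalone `aws_security_group_rule` resources. The group names, rule names, and ports are hypothetical. The idea is that each rule depends on both groups, so Terraform should destroy the rules before it tries to delete either group. Nothing in this thread confirms that this avoids the intermittent error.

```hcl
# Two groups with no inline rules, so neither group definition refers to the other.
resource "aws_security_group" "app" {
  name = "app"
}

resource "aws_security_group" "db" {
  name = "db"
}

# app accepts traffic from db (hypothetical port).
resource "aws_security_group_rule" "app_from_db" {
  type                     = "ingress"
  security_group_id        = "${aws_security_group.app.id}"
  source_security_group_id = "${aws_security_group.db.id}"

  from_port = 8080
  to_port   = 8080
  protocol  = "tcp"
}

# db accepts traffic from app (hypothetical port).
resource "aws_security_group_rule" "db_from_app" {
  type                     = "ingress"
  security_group_id        = "${aws_security_group.db.id}"
  source_security_group_id = "${aws_security_group.app.id}"

  from_port = 5432
  to_port   = 5432
  protocol  = "tcp"
}
```

Writing the cross-references as inline `ingress` blocks instead would make each group depend on the other, which Terraform reports as a dependency cycle.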

@josh-padnick

We're still suffering from this and it's causing build failures in our automated Terraform tests up to 50% of the time. Is there an update on this?

@houqp

houqp commented Jun 4, 2017

I am on v0.9.6 and I am also running into this issue.

@elektron9

I'm on 0.9.5 and still see this issue.

@leftathome

I ran into this with 0.9.3. I'm not sure what the Terraform-side answer is since the API pretty much just says, "Dependency failure." From a Terraform provider perspective, one would have to write something that would at least attempt to resolve what that dependency is and either present it to the user or automatically take some kind of action.

In my case I was trying to delete a security group that was itself the source security group of a rule on another security group ("Let 'sg-being-destroyed' access this security group on port '80', protocol 'tcp'"). When I removed this rule from the other SG, Terraform finished destroying successfully.

I've spoken to others who were able to resolve this issue by attempting to delete the object via the AWS management console - the console will do the legwork of finding the conflict for you.

@jennyfountain

I am on 0.9.11 and still see this issue.

@jsdevel

jsdevel commented Jul 19, 2017

This happened to me recently after I implemented this, following issues renaming my launch configuration. Using name_prefix on my ASG worked. The previous ASG had instances that were dependent on the SG.
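
A rough sketch of what the `name_prefix` approach could look like, reusing resource names from the configuration in the original report and following its interpolation syntax. It is a fragment: `aws_security_group.instance` and `data.aws_availability_zones.all` are assumed to be defined as in that report, and the prefix value `example-` is made up.

```hcl
resource "aws_launch_configuration" "example" {
  # Terraform appends a unique suffix, so a replacement never collides by name.
  name_prefix     = "example-"
  image_id        = "ami-40d28157"
  instance_type   = "t2.micro"
  security_groups = ["${aws_security_group.instance.id}"]

  lifecycle {
    create_before_destroy = true
  }
}

resource "aws_autoscaling_group" "example" {
  # Same idea for the ASG: the new group can exist alongside the old one.
  name_prefix          = "example-"
  launch_configuration = "${aws_launch_configuration.example.id}"
  availability_zones   = ["${data.aws_availability_zones.all.names}"]

  min_size = 2
  max_size = 10

  lifecycle {
    create_before_destroy = true
  }
}
```

With fixed names, the replacement cannot be created while the old resource still exists, so the old ASG's instances may still be attached to the security group when Terraform tries to delete it.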

@s-nakka

s-nakka commented Aug 2, 2017

I still see it on Terraform v0.9.9.

@geekifier

geekifier commented Aug 31, 2017

Apparently still a problem in Terraform 0.10.

resource "aws_instance" "temp_bastion" {
  subnet_id                   = "${var.vpc_public_subnets[0]}"
  instance_type               = "t2.nano"
  ami                         = "${var.dcos_ami}"
  key_name                    = "${var.ec2_keypair}"
  associate_public_ip_address = true
  vpc_security_group_ids      = ["${var.vpc_security_group_public_id}", "${aws_security_group.bastion.id}"]

  root_block_device {
    volume_type = "gp2"
    volume_size = "20"
  }

  tags {
    "TerraForm"   = "True"
    "Name"        = "${var.dcos_cluster_name}-bastion"
    "Environment" = "${var.tf_env_name}"
  }

  volume_tags {
    "TerraForm"   = "True"
    "Name"        = "${var.dcos_cluster_name}-bastion"
    "Environment" = "${var.tf_env_name}"
  }
}

resource "aws_security_group" "bastion" {
  vpc_id = "${data.aws_subnet.public.vpc_id}"
  name   = "appBastion"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  tags {
    "Name"        = "appBastion"
    "Environment" = "${var.tf_env_name}"
    "TerraForm"   = "True"
  }
}

Changing the name of the SG produces the behavior described by previous posters.

module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 10s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 20s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 30s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 40s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 50s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 1m0s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 1m10s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 1m20s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 1m30s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 1m40s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 1m50s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 2m0s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 2m10s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 2m20s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 2m30s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 2m40s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 2m50s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 3m0s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 3m10s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 3m20s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 3m30s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 3m40s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 3m50s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 4m0s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 4m10s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 4m20s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 4m30s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 4m40s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 4m50s elapsed)
module.dcos_cluster1.aws_security_group.bastion: Still destroying... (ID: sg-5e2bfb36, 5m0s elapsed)
Error applying plan:

1 error(s) occurred:

* module.dcos_cluster1.aws_security_group.bastion (destroy): 1 error(s) occurred:

* aws_security_group.bastion: DependencyViolation: resource sg-5e2bfb36 has a dependent object
	status code: 400, request id: 823c264e-0016-400c-a227-0f16b4165cde

Interestingly, adding create_before_destroy to the SG does not help. In my case, I was changing the name of the SG from AppBastion to appBastion. Terraform differentiates the two strings, but the AWS API does not. As a result, a new SG could not be created prior to destroy.
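
One way to sidestep the name collision described above might be to let Terraform generate a unique name through `name_prefix`, so the replacement group can be created while the old one still exists. This is a sketch only: the variable `vpc_id` and the prefix value are hypothetical, and the thread does not confirm that this resolves the error.

```hcl
# Hypothetical input standing in for the VPC lookup used in the config above.
variable "vpc_id" {}

resource "aws_security_group" "bastion" {
  vpc_id = "${var.vpc_id}"

  # A generated suffix keeps the new group's name distinct from the old one,
  # even when only the letter case of the prefix changes.
  name_prefix = "appBastion-"

  ingress {
    from_port   = 22
    to_port     = 22
    protocol    = "tcp"
    cidr_blocks = ["0.0.0.0/0"]
  }

  lifecycle {
    # Create the replacement first, repoint the instance, then delete the old group.
    create_before_destroy = true
  }
}
```

The instance's `vpc_security_group_ids` would then need to be updated to the new group ID before the old group is deleted, which Terraform should order correctly because the instance references the group.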

@realflash
Contributor

Still present in v0.10.3. I changed the description of an SG, which forces a new resource and then triggers this failure.

@apparentlymart
Contributor

This error is coming from the AWS API itself. Terraform is retrying it for five minutes because sometimes when e.g. an EC2 instance is connected with a security group there is a delay between the instance being destroyed and its network interface being destroyed, and the network interface "holds on to" the security group.

The relationship between network interface and instance is something Terraform doesn't directly manage -- it's done behind the scenes as part of the EC2 instances API -- and so this five-minute retry loop was put into place to allow us to wait until this hidden process completes and the network interface is deleted.

It sounds like there's either a situation where the network interface takes longer than five minutes to detach or where some other object is attached to the security group that Terraform can't "see". Either way, this is going to require some research to understand what's going on, so I'm going to have our bot move this over to the AWS provider repository where it's more likely to be seen by those working on that provider.
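
If the cause is only that the network interface takes longer than the retry window to go away, a longer delete timeout might help. The sketch below assumes an AWS provider version in which `aws_security_group` accepts a `timeouts` block, which may not be true of the versions discussed in this thread. The group name is taken from the original report and the `15m` value is arbitrary.

```hcl
resource "aws_security_group" "instance" {
  name = "my-security-group"

  # Assumption: this provider version supports per-resource timeouts here.
  # A longer window gives AWS more time to remove the hidden network interface.
  timeouts {
    delete = "15m"
  }
}
```

This only lengthens the wait. It would not help if some other object that Terraform cannot see is attached to the group.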

@apparentlymart
Contributor

I think some people here are encountering a slightly different problem where they are trying to apply a plan with an action like -/+ aws_security_group.example which doesn't also re-create the associated EC2 instances.

I don't think there's any way we can support replacement of the security group without also replacing the EC2 instance, but there is a limitation here similar to hashicorp/terraform-provider-aws#1315 where Terraform would ideally be able to understand that the instance must be replaced in order to replace the security group and correctly describe that in the diff, allowing the user to decide what to do.

Since we already have #16065 open to discuss a core change to help Terraform detect and handle that scenario, let's consider this particular issue to be about the unexplained occurrences of this "has a dependent object" error, where it comes up specifically during terraform destroy and thus we'd expect that all of the dependent resources should've been destroyed already by the time we're destroying the security group.

@hashibot
Contributor

This issue has been automatically migrated to hashicorp/terraform-provider-aws#1671 because it looks like an issue with that provider. If you believe this is not an issue with the provider, please reply to this issue and let us know.

@ghost

ghost commented Apr 7, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 7, 2020