provider/aws: Cleanup the Lambda ENI deletion fails on destroy (after update Terraform to v0.7.11) #10272

Closed
mgagliardo opened this issue Nov 21, 2016 · 14 comments · Fixed by #11849

Comments

@mgagliardo

mgagliardo commented Nov 21, 2016

Hi there,

I was running v0.7.8 and recently upgraded to v0.7.11 (without jumping to 0.7.9 or 0.7.10 first). It seems that GH-5767 came back at some point between those versions.

Basically, we have an SNS topic that triggers a Lambda function within a VPC. If the Lambda function is invoked at least once, the ENI attached to it doesn't get removed, the destroy command fails, and we need to remove the rest of the VPC components by hand.

I should also mention that I did not touch any code during the 0.7.8 -> 0.7.11 upgrade, neither the Lambda function nor the Terraform config files.

Should you need any additional info please let me know.

Thanks a lot in advance! (and sorry if I forgot to add some info, first issue on TF here!)


Terraform Version

$ terraform --version
Terraform v0.7.11

Affected Resource(s)

  • aws_security_group.lambda_sg
  • aws_subnet

Terraform Configuration Files

I will just add some example code as there is some sensitive information on mine, but this should be enough to check out.

resource "aws_security_group" "lambda_sg" {
  name        = "lambda-cert-removal-sg-service"
  vpc_id      = "${var.vpc_id}"
}

resource "aws_lambda_function" "main" {
    function_name    = "${var.function_name}"
    description      = "Lambda function"
    filename         = "./my_file_path.zip"
    role             = "${aws_iam_role.iam_role.arn}"
    runtime          = "python2.7"
    timeout          = "60"
    handler          = "certs_removal_script.lambda_handler"

    vpc_config       = {
      subnet_ids         = [ "subnet-4ab34e12", "subnet-9j5gs95s" ]
      security_group_ids = [ "${aws_security_group.lambda_sg.id}" ]
    }
}

Panic Output

[...]
module.base_vpc.aws_subnet.private.1: Still destroying... (4m0s elapsed)
module.base_vpc.aws_subnet.private.1: Still destroying... (4m10s elapsed)
module.base_vpc.aws_subnet.private.1: Still destroying... (4m20s elapsed)
module.base_vpc.aws_subnet.private.1: Still destroying... (4m30s elapsed)
module.base_vpc.aws_subnet.private.1: Still destroying... (4m40s elapsed)
module.base_vpc.aws_subnet.private.1: Still destroying... (4m50s elapsed)
Error applying plan:

2 error(s) occurred:

* aws_security_group.lambda_sg: DependencyViolation: resource sg-f70eb68a has a dependent object
	status code: 400, request id: dc5e1aa2-df29-4d1c-8bbb-0dab2b1513ba
* aws_subnet.private.1: Error deleting subnet: timeout while waiting for state to become 'destroyed' (last state: 'pending', timeout: 5m0s)

Expected Behavior

The terraform destroy command should have succeeded without issues (like it did on 0.7.8 before I upgraded to 0.7.11)

Actual Behavior

The Lambda ENI was not removed; it was not even detached.

* aws_security_group.lambda_sg: DependencyViolation: resource sg-f70eb68a has a dependent object
	status code: 400, request id: dc5e1aa2-df29-4d1c-8bbb-0dab2b1513ba

Steps to Reproduce

  1. Use the lambda function at least once.
  2. terraform destroy
  3. See it fail :(

References

Issue (exactly same issue):

PRs that fixed this in the past:

@jsonmaur

jsonmaur commented Nov 22, 2016

Having this same problem, but with aws_security_group attached to a Lambda function with a VPC. Any change to the security group that forces a destroy/recreation of the resource will fail with the message aws_security_group.main: DependencyViolation: resource sg-xxxxx has a dependent object. I have to manually go into the VPC and detach/delete the network interface (ENI) automatically created by Lambda before I can continue with Terraform.

@msassak

msassak commented Dec 2, 2016

Just ran into this myself. Same issue with Lambda plus VPC config. I was able to solve it by importing the related ENI by hand:

terraform import aws_network_interface.lambda_eni <eni-id>

After this terraform plan -destroy recognizes the ENI and marks it for destruction, but that is obviously not a long-term solution.

@gozer

gozer commented Feb 2, 2017

Right now, I am seeing this issue with v0.8.5, and I believe it's because the requesterid in my case isn't what the code is searching for in builtin/providers/aws/resource_aws_security_group.go

filter3 := &ec2.Filter{
		Name:   aws.String("requester-id"),
		Values: []*string{aws.String("*:awslambda_*")},
	}

But in the case of my ENI, I can see it's got a requester-id of

"RequesterId": "AROAJ2WVHV7OGRSULBYCI:<lambda-function-name>",

@grayaii
Contributor

grayaii commented Feb 9, 2017

So if I understand that correctly, the RequesterId no longer has the string "awslambda_" in it, therefore, the ENI does not get picked up for deletion. I'm not sure what value to replace "awslambda_" with.
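
To make the mismatch concrete, here is a minimal Go sketch. It assumes the filter's `*` wildcard behaves like a simple glob and approximates it with the standard library's `path.Match`; the helper `matchesFilter`, the older-style requester ID, and the function name `my-function` are hypothetical and used only for illustration.

```go
package main

import (
	"fmt"
	"path"
)

// matchesFilter approximates a wildcard filter value with path.Match,
// where "*" matches any run of characters other than "/".
// This is an approximation for illustration, not the EC2 implementation.
func matchesFilter(pattern, value string) bool {
	ok, err := path.Match(pattern, value)
	return err == nil && ok
}

func main() {
	pattern := "*:awslambda_*" // value used by the requester-id filter quoted above

	// Hypothetical requester ID in the shape the filter expects.
	oldStyle := "AROAEXAMPLEROLEID:awslambda_example"
	// Shape reported in this thread; "my-function" stands in for the real name.
	reported := "AROAJ2WVHV7OGRSULBYCI:my-function"

	fmt.Println(matchesFilter(pattern, oldStyle)) // true
	fmt.Println(matchesFilter(pattern, reported)) // false
}
```

Under that assumption, an ENI whose requester ID ends in the function name never satisfies the filter, so it would not be returned for cleanup.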

@grayaii
Contributor

grayaii commented Feb 10, 2017

I can't find (nor can I remember) why RequesterId was even added here (most likely as an extra safety net to make sure we don't delete the wrong ENI), but deleteLingeringLambdaENIs only gets called when resourceAwsSecurityGroupDelete is called, and the remaining filters are "security group-id" and "description", so it should be "good enough". I still don't know who added "awslambda_" to the requester-id (Terraform? or AWS?), but I removed that filter and things seem to be working fine. I'll create a PR shortly.
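
A rough sketch of the effect of dropping that filter, assuming filters are ANDed together and that `*` acts as a simple glob (approximated here with `path.Match`). The `eni` and `filter` types, the `selected` helper, the description pattern "AWS Lambda VPC ENI: *", and the function name `my-function` are all assumptions for illustration; they are not copied from the provider code.

```go
package main

import (
	"fmt"
	"path"
)

// eni is a hypothetical stand-in for the ENI fields the filters inspect.
type eni struct {
	GroupID     string
	Description string
	RequesterID string
}

// filter is a hypothetical stand-in for one filter: a field name and a glob.
type filter struct {
	Name    string
	Pattern string
}

// field returns the ENI value that a filter name refers to.
func field(e eni, name string) string {
	switch name {
	case "group-id":
		return e.GroupID
	case "description":
		return e.Description
	case "requester-id":
		return e.RequesterID
	}
	return ""
}

// selected reports whether e satisfies every filter (filters are ANDed).
func selected(e eni, filters []filter) bool {
	for _, f := range filters {
		ok, err := path.Match(f.Pattern, field(e, f.Name))
		if err != nil || !ok {
			return false
		}
	}
	return true
}

func main() {
	e := eni{
		GroupID:     "sg-f70eb68a",
		Description: "AWS Lambda VPC ENI: example", // assumed description format
		RequesterID: "AROAJ2WVHV7OGRSULBYCI:my-function",
	}
	withRequester := []filter{
		{"group-id", "sg-f70eb68a"},
		{"description", "AWS Lambda VPC ENI: *"},
		{"requester-id", "*:awslambda_*"},
	}
	withoutRequester := withRequester[:2]

	fmt.Println(selected(e, withRequester))    // false: requester-id glob fails
	fmt.Println(selected(e, withoutRequester)) // true: group and description suffice
}
```

The trade-off is that the match becomes slightly broader, but it stays scoped to ENIs attached to the security group being deleted.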

@grayaii
Contributor

grayaii commented Feb 10, 2017

Here is the PR for this issue: #11849

@AirbornePorcine

I don't suppose anyone has come up with a clever workaround for this issue for the moment that allows destroy to work seamlessly?

@grayaii
Contributor

grayaii commented Mar 16, 2017

What version of terraform are you using? It's been fixed for quite a while now. This issue used to bite us big time, but ever since the fix went in, destroys work just fine.

@AirbornePorcine

The PR doesn't seem to be merged which is why I'm asking - I'm on 0.8.8 right now and seeing this behaviour.

@grayaii
Contributor

grayaii commented Mar 16, 2017

Ah, you're right. I'm using Terraform built from the branch that has the fix. Looks like there are merge conflicts with base now :( I'll try to resolve those conflicts so that we can re-request this PR.

@grayaii
Contributor

grayaii commented Mar 20, 2017

I resolved the conflicts and asked for the PR to be merged: #11849

@brikis98
Contributor

This is definitely still broken in Terraform 0.9.x. If you use Terraform to a) create a lambda function, b) give that lambda function access to a VPC, and c) attach a security group to that lambda function, then terraform destroy will no longer work. It makes lambda unusable with a VPC :(

@primeroz

primeroz commented Jul 4, 2017

I am having the very same issue on 0.9.10. Was this closed because we have a resolution/workaround? If yes, what is the workaround?

@ghost

ghost commented Apr 8, 2020

I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues.

If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@ghost ghost locked and limited conversation to collaborators Apr 8, 2020