Using element with splat reference should scope dependency to selected resource #3449
Confirmed. We have run into this issue as well. I think it has to do with dependencies not taking the "count" into account.
I think this comes down to the fact that the state does not track which specific instance another instance depends on, only the resource. Here is an example:
When removing an aws_instance, you would have to find all aws_volume_attachments which happen to share the same "instance_id" attribute. But that would be provider-specific, and perhaps even resource-specific. However, this is not specific to AWS. It will occur any time you have two resources with count parameters, where one resource depends on the other. The right abstraction would be to depend on "gw_instance.db_instance.0" in this case. I don't know what the implications of that would be, though.
Turns out I was wrong. The "depends_on" attribute in the state file has nothing to do with this. Consider this diff:
It seems like changing one element of aws_instance.my_instance.*.id causes the entire "element" expression to be considered changed.
Our current workaround is to duplicate the "aws_volume_attachment" resources, rather than using the element function.
I dug further into this. It seems the expected behaviour broke with commit 7735847, which was introduced to fix issue #2744. To me, it seems like you want the treatment of unknown values in splats to behave differently depending on the interpolation context: when you use formatlist, you want to treat the entire list as unknown if it contains any unknown value, but for element, you only care about whether the specific value you select is unknown. I did a test where I introduced a new splat operator, with the only difference being how it is treated when the list contains unknown values. It solves the problem, but having two splat operators is kind of confusing. @mitchellh: Ideas?
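To illustrate the two interpolation contexts, a minimal sketch (the resource and variable names here are illustrative, not from the original report):

# formatlist consumes the whole splat list: if any instance's
# public_dns is still unknown during planning, the entire result
# is rightly unknown.
output "node_urls" {
  value = "${formatlist("https://%s/", aws_instance.app.*.public_dns)}"
}

# element selects a single item: only that item being unknown should
# matter, but since commit 7735847 one unknown item makes the whole
# list unknown, so every attachment looks changed.
resource "aws_volume_attachment" "att" {
  count       = "${var.node_count}"
  device_name = "/dev/sdh"
  volume_id   = "${element(aws_ebs_volume.data.*.id, count.index)}"
  instance_id = "${element(aws_instance.app.*.id, count.index)}"
}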
Thanks for the report @kklipsch and thanks for taking the time to do a deep dive, @danabr!
Yep, I think this is the key insight. Switching the treatment of unknown splat values based on interpolation context needs some more thought on our side. In the meantime, duplicating the aws_volume_attachment resources as described above seems like the most reasonable workaround.
Ok, thanks. Unfortunately, for our use case that very quickly becomes unwieldy, as we are running tens of nodes currently but want to be able to scale up to hundreds.
@kklipsch: If you are OK with running a patched terraform for a while, and you don't rely on the formatlist behavior anywhere, you can just comment out the three lines at https://github.com/hashicorp/terraform/blob/master/terraform/interpolate.go#L466, and compile terraform yourself.
@danabr and @phinze

resource "aws_instance" "appnodes" {
instance_type = "${var.flavor_name}"
ami = "${var.image_name}"
key_name = "${var.key_name}"
security_groups = ["${split(",", var.security_groups)}"]
availability_zone = "${var.availability_zone}"
user_data = "${file("mount.sh")}"
tags {
Name = "${var.app_name}-${format("%02d", 1)}"
}
}
resource "aws_volume_attachment" "ebsatt" {
device_name = "/dev/sdh"
volume_id = "${aws_ebs_volume.ebsvolumes.id}"
instance_id = "${aws_instance.appnodes.id}"
}
resource "aws_ebs_volume" "ebsvolumes" {
availability_zone = "${var.availability_zone}"
size = "${var.ebs_size}"
type = "${var.ebs_type}"
}
resource "aws_instance" "app-nodes" {
instance_type = "${var.flavor_name}"
ami = "${var.image_name}"
key_name = "${var.key_name}"
security_groups = ["${split(",", var.security_groups)}"]
availability_zone = "${var.availability_zone}"
user_data = "${file("mount.sh")}"
tags {
Name = "${var.app_name}-${format("%02d", 1)}"
}
}
resource "aws_volume_attachment" "ebs_att" {
device_name = "/dev/sdh"
volume_id = "${aws_ebs_volume.ebs-volumes.id}"
instance_id = "${aws_instance.app-nodes.id}"
}
resource "aws_ebs_volume" "ebs-volumes" {
availability_zone = "${var.availability_zone}"
size = "${var.ebs_size}"
type = "${var.ebs_type}"
}
@pdakhane: Just take kklipsch's example, but instead of using a "count" attribute on the aws_volume_attachment resource, create multiple aws_volume_attachment resources referring directly to the instances and volumes. For example, if you have three instances:
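A minimal sketch of that workaround, assuming aws_instance.app-nodes and aws_ebs_volume.ebs-volumes from the config above are created with count = 3:

resource "aws_volume_attachment" "ebs_att_0" {
  device_name = "/dev/sdh"
  volume_id   = "${aws_ebs_volume.ebs-volumes.0.id}"
  instance_id = "${aws_instance.app-nodes.0.id}"
}

resource "aws_volume_attachment" "ebs_att_1" {
  device_name = "/dev/sdh"
  volume_id   = "${aws_ebs_volume.ebs-volumes.1.id}"
  instance_id = "${aws_instance.app-nodes.1.id}"
}

resource "aws_volume_attachment" "ebs_att_2" {
  device_name = "/dev/sdh"
  volume_id   = "${aws_ebs_volume.ebs-volumes.2.id}"
  instance_id = "${aws_instance.app-nodes.2.id}"
}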
This only works if you have a small number of nodes, though, and are OK with using the same number of instances in all environments.
@phinze pointed to this issue as potentially related to mine. Here is my config (redacted for readability):
The basic idea is that every instance gets a "runner" attached that does binary deployment and other things. I'm using a null_resource to break a dependency cycle with ELB addresses used by the runner. The first time I bring up an instance, everything works fine: each instance gets created, then the null_resource runs properly on each. Here's the log of terraform plan after terminating an instance:
I was expecting only "null_resource.cockroach-runner.1" to be updated, but it seems that 0 and 2 changed as well.
Re-titling this to indicate the nature of the core issue here. We'll get this looked at soon!
Pinging here since we just ran into this issue as well.
Okay, I just consolidated a few other issue threads that were expressions of this bug into this one. My apologies to all the users who have been hitting this - this is now in my list of top priority core bugs to get fixed soon.

As I alluded to with the re-title, this issue comes down to the fact that Terraform core is currently unaware that an expression like element(aws_instance.web.*.id, count.index) depends on only a single element of the splat list, so a change to any one element invalidates everything that references the list.

The most direct solution would be to "just make it work for element()" by special-casing that function during interpolation. This is probably not the right way to go, as it is (a) difficult to implement "context-awareness" into that part of the codebase, and (b) a brittle solution that sets a bad precedent of special-casing certain functions in the core.

Because of this, the core team thinks the best way forward is to add first-class list indexing into the interpolation language. This would promote the behavior of element() to proper syntax that core can analyze, allowing the dependency to be scoped to the selected resource.

I've got a spike of first-class-indexing started, and I'll keep this thread updated with my progress.
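For concreteness, a minimal sketch of the two forms (resource names are illustrative):

# Today: element() is an opaque function call, so core treats the
# whole splat list as the dependency.
instance_id = "${element(aws_instance.web.*.id, count.index)}"

# With first-class indexing, core can see that only one element is used.
instance_id = "${aws_instance.web.*.id[count.index]}"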
💯
@phinze thank you so much for the detailed response and the ongoing effort! 🎉
Thanks for the report @phinze - is there a WIP branch available to follow along?
Keen to see this one resolved. It's quite limiting for those of us using count with instances and template_file to generate user data. Does anyone know of a workaround?
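For reference, a minimal sketch of the affected pattern (the template path and variable names are assumptions):

data "template_file" "userdata" {
  count    = "${var.node_count}"
  template = "${file("userdata.tpl")}"

  vars {
    node_index = "${count.index}"
  }
}

resource "aws_instance" "node" {
  count         = "${var.node_count}"
  ami           = "${var.ami}"
  instance_type = "t2.micro"

  # Any change that makes one rendered template unknown taints the
  # whole splat list, touching every instance.
  user_data = "${element(data.template_file.userdata.*.rendered, count.index)}"
}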
I created a new issue #14536 with details of my problem |
I have a similar issue passing a splat list to a module, even if I access the elements inside the module with [] syntax. Does passing a list into a module affect whether it's considered 'unknown' as a whole?
@apparentlymart I have the same issue as @dannytrigo :'( Here is my sample:
main.tf:

variable "shortnames" {
type = "list"
}
module "generic_linux" {
source = "./instance/"
shortnames = "${var.shortnames}"
}
resource "aws_ebs_volume" "data" {
count = "${length(var.shortnames)}"
availability_zone = "eu-west-1a"
size = "5"
type = "gp2"
}
resource "aws_volume_attachment" "data_ebs_att" {
count = "${length(var.shortnames)}"
device_name = "/dev/sdc"
volume_id = "${aws_ebs_volume.data.*.id[count.index]}"
instance_id = "${module.generic_linux.instances_ids[count.index]}"
} My module code is : variable "shortnames" {
type = "list"
description = "list of shortname"
}
resource "aws_instance" "instances" {
count = "${length(var.shortnames)}"
instance_type = "t2.micro"
key_name = "formation-hpc"
ami = "ami-xxxxxxxx"
vpc_security_group_ids = ["sg-xxxxxxxx"]
subnet_id = "subnet-xxxxxxxx"
tags {
Name = "${var.shortnames[count.index]}-${count.index}"
}
}
output "instances_ids" {
value = "${aws_instance.instances.*.id}"
} Usage of
Is there a workaround to add nodes on an undefined size cluster based on a generic instance module without recreate each dependant resources ? |
I have the same issue. It's a bit shocking that this issue has been open for two years and hasn't been fixed yet. It'd be great if someone had a workaround for this.
I just happened to find a workaround-ish using lifecycle ignore_changes:

resource "aws_volume_attachment" "data_ebs_att" {
count = "${length(var.shortnames)}"
device_name = "/dev/sdc"
volume_id = "${aws_ebs_volume.data.*.id[count.index]}"
instance_id = "${module.generic_linux.instances_ids[count.index]}"
lifecycle {
ignore_changes = ["instance_id"]
}
}
I've done something similar, @loalf - but it feels as though that really shouldn't be necessary. Terraform being intentionally declarative, I can see how it's ended up this way. In my case, I dynamically allocate instances in round-robin fashion to whatever variable number of subnets I have. BUT, when you change the number of subnets you have provisioned in a given VPC, it can dangerously trigger the recreation of your EC2s, so I've done something similar to what you have. Check this:
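A minimal sketch of such a round-robin pattern, with assumed names (var.subnet_ids, var.instance_count, and aws_instance.app are illustrative):

resource "aws_instance" "app" {
  count         = "${var.instance_count}"
  ami           = "${var.ami}"
  instance_type = "t2.micro"

  # Distribute instances round-robin across however many subnets exist;
  # element() wraps around the list when count.index exceeds its length.
  subnet_id = "${element(var.subnet_ids, count.index)}"

  lifecycle {
    # Without this, adding or removing a subnet shifts every element()
    # result and would force recreation of existing instances.
    ignore_changes = ["subnet_id"]
  }
}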
@armenr would your solution create additional EC2 instances, or remove extra ones, when the count of subnets changes? Or is this designed to always keep the number of EC2 instances static after initial creation?
Please check this option: instead of "element", use the [] index syntax.
@misham - Good question! It will KEEP existing instances in the subnets where they reside, and add instances when you add a subnet to your list of subnets. From what I recall, if I issue a destroy on a specific subnet, the EC2s get destroyed also.
Recently upgraded terraform. I've tried the syntax suggested by @apparentlymart:

# Create AWS Instances
resource "aws_instance" "web" {
count = "${var.count}"
ami = "${var.aws_ami}"
instance_type = "${var.aws_instance_type}"
associate_public_ip_address = "${var.aws_public_ip}"
...
}
# Attach Instances to Application Load Balancer
resource "aws_alb_target_group_attachment" "web" {
count = "${var.count}"
target_group_arn = "${var.aws_alb_target_group_arn}"
# target_id = "${element(aws_instance.web.*.id, count.index)}"
target_id = "${aws_instance.web.*.id[count.index]}"
port = "${var.aws_alb_target_group_port}"
} However when I issue the command:
or just
Terraform wants to destroy all aws_alb_target_group_attachments:
I can properly remove just the aws_alb_target_group_attachment:
However, if I follow that up with a destroy of the instance, it will still want to remove all the other remaining target group attachments. Is the approach wrong, or is there still a bug here?
I still have the same problem. Example:

resource "aws_instance" "masters" {
count = "3"
ami = "${var.ami}"
}
resource "null_resource" "outindex" {
count = "3"
triggers {
cluster_instance = "${aws_instance.masters.*.id[count.index]}"
}
provisioner "local-exec" {
command = "date"
}
lifecycle { create_before_destroy = true }
}

When I try to update the first instance with a new AMI, it first updates ALL the instances, then starts executing the null_resource.
I expected to see changes only for the first instance. Environment:
OS: macOS

UPDATE: Currently, to fix this I do:
You cannot use count.index with a list variable to get a resource id; it forces new resources, that is, it deletes all the old resources and re-creates them.
I see that this is closed but I'm still experiencing the same issue in v0.11.10. Is this expected?
I noticed this issue appears to still be happening in Terraform v0.11.14. Could this be because we are using a module under the hood to create the EC2 instances? Incrementing our count from 7 => 8 causes all volume attachments 1-7 to be re-attached.

module "elk-elasticsearch-node" {
source = "./app-cluster-static"
} # ./app-cluster-static/main.yml
module "this" {
source = "terraform-aws-modules/ec2-instance/aws"
version = "~> 1.19.0"
...
}
|
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.
I'm trying to set up a multi-node cluster with attached EBS volumes. An example below:
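A minimal sketch of the shape of such a configuration (the resource names, counts, and sizes here are illustrative):

resource "aws_instance" "node" {
  count         = "${var.node_count}"
  ami           = "${var.ami}"
  instance_type = "m3.medium"
}

resource "aws_ebs_volume" "data" {
  count             = "${var.node_count}"
  availability_zone = "${var.availability_zone}"
  size              = 100
}

# Each attachment should depend only on its own instance/volume pair,
# but a change to any one instance causes all attachments to be recreated.
resource "aws_volume_attachment" "data" {
  count       = "${var.node_count}"
  device_name = "/dev/sdh"
  volume_id   = "${element(aws_ebs_volume.data.*.id, count.index)}"
  instance_id = "${element(aws_instance.node.*.id, count.index)}"
}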
If a change happens to a single node (for instance, if a single EC2 instance is terminated), ALL of the aws_volume_attachments are recreated.
Clearly we would not want volume attachments to be removed in a production environment. Worse, in conjunction with #2957, you must first unmount these attachments before they can be recreated. This has the effect of making volume attachments viable only on brand-new clusters.