Unknown values should not block successful planning #30937
Isn't that a typo? s/known/unknown

Same typo here, I think: s/known/unknown

Thanks for the corrections! I've updated the original comment to fix them.
Hi @apparentlymart, I believe you made a similar typo in your blog post.

Apparently it's easy for my fingers to confuse those! I'll update my blog at some point, too. Thanks!
@apparentlymart Is this why `count = length(var.a_list)` works even when the values in the list are unknown, but `count = length(compact(var.a_list))` fails? `count = length(sort(var.a_list))` fails too. The cardinality of the list is not changed by sorting it, so why should it break?
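A minimal configuration sketching those three cases (using the hashicorp/null provider's `null_resource` purely as a placeholder; the pass/fail behavior is as reported in this thread):

```hcl
variable "a_list" {
  type = list(string)
}

# Reported to work: the number of elements is known even when the
# element values themselves are not.
resource "null_resource" "plain" {
  count = length(var.a_list)
}

# Reported to fail when elements are unknown: compact() drops empty
# strings, so the result's length depends on the unknown values.
resource "null_resource" "compacted" {
  count = length(compact(var.a_list))
}

# Also reported to fail: sorting preserves cardinality, but sort()
# returns a wholly-unknown list when any element is unknown, because
# the ordering depends on the values.
resource "null_resource" "sorted" {
  count = length(sort(var.a_list))
}
```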
I expect that's because `sort()` has to return a wholly-unknown list when any of its elements are unknown, since the resulting order depends on the element values, and that loses the length information. I imagine we might be able to change that so the known length is preserved.
It would be great to have providers lazy-load when a value they are waiting on has not yet been initialized. That way you could create a cluster and provision resources with the Kubernetes provider in the same apply, in the same workspace. #1102

I think this issue also affects that case: for example, the first apply fails when creating many resources at once, whereas applying again afterwards works just fine.
I wanted to add a note here which may or may not be related, depending on what solutions we find. There is one place where unknown values are allowed but can drastically alter the behavior of the plan: resource attributes which force replacement. Feeding an unknown value into an attribute of this type always forces replacement, which itself may cause more values to become unknown, cascading replacements through the configuration even when the end result may have little to no difference from the initial state.

Technically a solution to the problem posed here would generally have no negative effect on this type of resource replacement, but we need to keep it in mind if we start allowing users to throw unknown values around more haphazardly. (A solution here may even help with this problem, but we can brainstorm about that and other possibilities in other threads.)
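A hypothetical sketch of that cascade (the resource types and the claim that `subnet_id` forces replacement are illustrative assumptions, not taken from this thread):

```hcl
# Suppose this resource is being replaced, so its computed attributes
# become unknown during planning.
resource "example_network" "net" {
  # ...
}

# If "subnet_id" is an attribute that forces replacement, feeding it an
# unknown value forces this resource to be planned for replacement too,
# which in turn makes *its* computed attributes unknown, and so on
# downstream, even if the final values end up unchanged.
resource "example_server" "app" {
  subnet_id = example_network.net.subnet_id # unknown -> forces replace
}
```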
Thanks for adding that, @jbardin. I think you're describing the situation where, if the value of an existing argument becomes unknown later, we'll conservatively assume it will have a different value than it had before. That will at least cause the provider to plan an update, but may also sometimes cause the provider to plan a replacement instead, if it's an argument that can't be updated in place.

The specific case folks find confusing and frustrating is where the new value happens to end up the same as the old value, so Terraform needlessly updates the object to the values it already had, or replaces the object with one configured the same way as the previous one. This tends to be exacerbated today by the fact that our old SDK design made it awkward for providers to offer predictions of what values attributes would have during planning -- for example, to use the documented syntax for a particular kind of ARN to predict an AWS resource type's `arn` attribute.
My two biggest requests in this area are as follows; I could have listed many more, but these two stand out.
Hi @stevehipwell! Thanks for sharing those. I'm familiar with the behavior you're describing.

I'm asking because I'd like to see more detail on what's going on, so we can understand whether this is a situation where the provider must return unknown for some reason -- in which case this issue is relevant as a way to make Terraform handle that better -- or whether the AWS provider should already have enough information to avoid producing an unknown value in that case, in which case we could potentially fix it inside the AWS provider with Terraform as it exists today.

For all of the AWS provider resource types I'm familiar with, changing a tag can be achieved with an in-place update, which should therefore avoid anything else becoming unknown. But I don't have any direct experience with the EKS resource types in particular, and there may be something different about the EKS API that means the AWS provider needs to treat tagging differently there.
@apparentlymart I'll add something here next time I see this, but there is nothing special about an EKS cluster that makes it need custom behaviour. We see this elsewhere, but for us the EKS cluster is the resource at the top of the biggest graphs. We spent significant engineering effort changing how we use Terraform to limit the number of incorrect "known after apply" fields showing up in the plan. Our code is much harder to understand and develop, but our end users get a cleaner plan; that said, there are still a significant number of cases where this happens. This has also reminded me of a number of other very obvious issues in this area.
IAM policy documents are a particularly tricky case because building one folds a bunch of separate identifiers into a single string, which must therefore be unknown if any part of it is unknown. The AWS provider could potentially help with this by generating ARNs during the planning step based on the syntax specified in the AWS documentation, though I'm aware that isn't possible in every case.

This issue is focused only on situations where unknown values cause Terraform to return an error, so the problem of limiting the number of "known after apply" values in the plan is not in scope here. But if you see situations where a provider is returning an unknown value even though it should have all of the information needed to return a known value, I'd suggest opening an enhancement request with the relevant provider, because providers are already empowered to return final values during planning if they include the logic necessary to do so.

For the sake of this issue, we'd benefit from there being fewer situations where providers return unknown values, but it will never be possible to eliminate them entirely -- some values really are determined by the remote system only during the apply step -- and so this issue is about making sure that those situations cannot cause Terraform to fail planning altogether, so that it's always possible to make progress towards a converged result.
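A sketch of the folding problem (the specific bucket and policy names are illustrative; the point is that one unknown ARN makes the whole rendered document unknown):

```hcl
# If the bucket is not yet created, its "arn" attribute is unknown
# during planning.
resource "aws_s3_bucket" "logs" {
  bucket = "example-logs-bucket"
}

# jsonencode() folds every identifier into one string, so the entire
# policy value becomes unknown as soon as any single ARN within it is
# unknown.
resource "aws_iam_policy" "logs_access" {
  name = "logs-access"
  policy = jsonencode({
    Version = "2012-10-17"
    Statement = [{
      Effect   = "Allow"
      Action   = ["s3:PutObject"]
      Resource = ["${aws_s3_bucket.logs.arn}/*"]
    }]
  })
}
```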
@apparentlymart whenever we've opened issues about these problems, they get closed, ignored, or both.

If you want another scenario where this error happens: if you create a new resource in the plan and compare an input value with the resource's identifier to build a list of distinct identifiers, planning errors even though Terraform should be able to determine that the input value can't equal the new resource's identifier. This usually happens in a multi-module scenario, but the following convoluted example should have the same behaviour.

```hcl
variable "identifier" {
  type = string
}

resource "resource_type_a" "my_resource" {
}

resource "resource_type_b" "my_other_resource" {
  count = length(concat(
    [resource_type_a.my_resource.id],
    var.identifier != "" && var.identifier != resource_type_a.my_resource.id ? [var.identifier] : [],
  ))
}
```
Thanks @stevehipwell. For the purposes of the framing of this issue, the answer in that case would be for Terraform to accept the unknown `count` and still produce a plan.

Making the Terraform language's analysis of unknown values more precise is also possible, but is something separate from what this issue is about. What you have there is essentially a Terraform language version of the situation we were originally describing, where something is unknown even though it could technically be known. Feel free to open a new issue in this repository about that case if you like, and we could see about addressing it separately from what this issue is covering, although I will say that at first glance it seems correct to me that this would return unknown while the new resource's `id` is not yet known.

The intended scope of this issue is that there should be no situation where an unknown value leads to an error saying that the value must be known, regardless of why the unknown value is present. Situations where a value turns out unknown but needn't be are also good separate enhancement requests (either in this repository for the core language, or in provider repositories for providers), but are not in scope for this issue.
I updated the example slightly to check that the var is set for completeness, but that doesn't change the behaviour. The point here is that if data about the types isn't lost (the identifier of a resource which doesn't exist can't be passed in) this should evaluate correctly, but if the type system falls back to the lowest common denominator (known or unknown) it's a tricky problem. Shouldn't the type system be evaluated and "fixed" if possible before changing the behaviour of the system as a whole? A potential outcome might be the lack of need to change the system, but a more likely outcome would be to change the scope or shift the context. |
To be clear, I'm not saying that both aren't valid improvements to make, just that I don't want this issue to grow into an overly generalized "everything that's wrong with unknown values" situation which could never be resolved. I'm more than happy to discuss improvements to the precision of Terraform's handling of unknown values in other issues, but this one is intended to have very specific acceptance criteria, after which it'd be time to close it: Terraform doesn't block planning when `count` or `for_each` is unknown.
+1 for this.

In the case of Kubernetes, the behavior I'm looking for is pretty simple: if the cluster that the provider depends on needs to be created or replaced, we can safely assume that all of the resources defined using that provider will also need to be created from scratch, and the plan can accurately reflect this. There should be a way of passing a provider the known information that a planned operation will destroy all of its existing resources.

In the more complex case, where we can't be certain whether a resource will still exist when some of its provider's values are unknown, I'd be fine with Terraform expressing that uncertainty in the plan as "may be replaced", etc. Having some uncertainty in the plan, and letting the end user decide whether to accept it, is infinitely preferable to a perfectly valid configuration being impossible to apply! If some users want to always have a 100% deterministic plan, what about letting them opt in to that stricter behavior via configuration?
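The dependency shape being described looks roughly like this (the EKS resource and the hypothetical variables are illustrative; the hashicorp/kubernetes provider arguments shown are the real ones):

```hcl
# If the cluster is being created or replaced, its endpoint and CA data
# are unknown during planning...
resource "aws_eks_cluster" "main" {
  name     = "example"
  role_arn = var.cluster_role_arn # hypothetical variable
  vpc_config {
    subnet_ids = var.subnet_ids # hypothetical variable
  }
}

data "aws_eks_cluster_auth" "main" {
  name = aws_eks_cluster.main.name
}

# ...so the provider is configured with unknown values, which is one of
# the two problem cases this issue describes.
provider "kubernetes" {
  host                   = aws_eks_cluster.main.endpoint
  cluster_ca_certificate = base64decode(aws_eks_cluster.main.certificate_authority[0].data)
  token                  = data.aws_eks_cluster_auth.main.token
}
```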
Hi, I have a module which optionally assigns a Key Vault role to a group, depending on whether a Key Vault ID is specified when the module is called. However, because the Key Vault does not yet exist, I get the error described above. I guess the other option would be to do what the error message says and first apply only the Key Vault with `-target`.

EDIT: Using an object variable with a `null` default lets the module test whether the value is set without depending on unknown attributes:

```hcl
variable "pipeline_key_vault" {
  type = object({
    name = string
    id   = string
  })
  description = "Pipeline Key Vault object"
  default     = null
}
```
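A sketch of why this helps (module structure, role name, and variables are hypothetical): whether the variable is `null` is decided by the caller's configuration, so it is known during planning even while the Key Vault's `id` attribute is not.

```hcl
# Fails when the Key Vault is created in the same plan: the id is
# unknown, so comparing it to "" yields unknown, and count cannot be
# unknown.
#   count = var.key_vault_id != "" ? 1 : 0

# Works: null-ness of the whole object is known before the vault
# exists, even if its "id" attribute is not.
resource "azurerm_role_assignment" "pipeline" {
  count                = var.pipeline_key_vault != null ? 1 : 0
  scope                = var.pipeline_key_vault.id
  role_definition_name = "Key Vault Secrets User"
  principal_id         = var.group_object_id # hypothetical variable
}
```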
Any progress on this issue?
Yes, there has been considerable progress on this issue. It's under active development. |
Is hashicorp/terraform-provider-kubernetes#1775 a side effect of this issue? Would it make sense to mark that one as a duplicate? |
Yes, although they may have work to do to close that issue after the core functionality is finished here, and that issue is already marked accordingly.
@apparentlymart This is great to hear! Are you able to provide a sneak peek into what changes are being introduced and how they might affect the options we have for unknown values being used in providers? Are you implementing a mechanism to allow Terraform to converge on the desired state over multiple applies, or are you solving this another way?

I'm at a fork in the road for some refactoring I'm working on: either provision an EKS cluster in one root module and bootstrap it with the kubernetes provider in another, or provision and bootstrap the cluster in the same root module with some hacky workarounds for the provider issues described in this ticket. If there are changes coming soon which will mean that provisioning and bootstrapping a cluster can be done cleanly under the same root module, then I'm inclined to go for the hacky, single-module option now and clean it up once these improvements are available, to avoid needing the separate root modules.

Also, is there any expectation that this work will make it possible for child resources to be deleted using the old provider configuration and created using the new one?
The idea of "unknown values" is a crucial part of how Terraform implements planning as a separate step from applying.
An unknown value is a placeholder for a value that Terraform (most often, a Terraform provider) cannot know until the apply step. Unknown values allow Terraform to still keep track of type information where possible, even if the exact values aren't known, and allow Terraform to be explicit in its proposed plan output about which values it can predict and which values it cannot.
Internally, Terraform performs checks to ensure that the final arguments for a resource instance at the apply step conform to the arguments previously shown in the plan: known values must remain exactly equal, while unknown values must be replaced by known values matching the unknown value's type constraint. Through this mechanism, Terraform aims to promise that the apply phase will use the same settings as were used during planning, or Terraform will return an error explaining that it could not.
The design goal for unknown values is that Terraform should always be able to produce some sort of plan, even if parts of it are not yet known. It's then up to the user to review the plan and decide either to accept the risk that the unknown values might not be what's expected, or to apply changes from a smaller part of the configuration (e.g. using `-target`) in order to learn more final values and thus produce a plan with fewer unknowns.

However, Terraform currently falls short of that goal in a couple of different situations:

1. The Terraform language runtime does not allow an unknown value to be assigned to either of the two resource repetition meta-arguments, `count` and `for_each`. In that situation, Terraform cannot even predict how many instances of a resource are being declared, and it isn't clear how exactly Terraform should explain that degenerate situation in a plan, so currently Terraform gives up and returns an error.
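A minimal configuration that triggers this today (the resource types are just convenient stand-ins): `random_pet` generates its value only during apply, so its `id` is unknown while planning.

```hcl
resource "random_pet" "name" {
}

resource "null_resource" "example" {
  # Unknown string -> unknown length -> unknown count.
  count = length(random_pet.name.id)
}
```

Planning this fails with an error along the lines of `Invalid count argument: The "count" value depends on resource attributes that cannot be determined until apply`.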
2. If any unknown values appear in a `provider` block for configuring a provider, Terraform will pass those unknown values to the provider's "Configure" function. Although Terraform Core handles this in an arguably-reasonable way, we've never defined how exactly a provider ought to react to crucial arguments being unknown, and so existing providers tend to fail or behave strangely in that situation. For example, some providers (due to quirks of the old Terraform SDK) end up treating an unknown value the same as an unset value, causing the provider to try to connect to somewhere weird like a port on localhost. Providers built using the modern Plugin Framework don't run into that particular malfunction, but it still isn't really clear what a provider ought to do when a crucial argument is unknown, so e.g. the AWS Cloud Control provider -- a flagship use of the new framework -- reacts to unknown provider arguments by returning an error, causing a similar effect as we see for `count` and `for_each` above.

Although the underlying causes of the errors in these two cases are different, they both lead to a similar problem: planning is blocked entirely by the resulting error, and the user has to manually puzzle out either how to change the configuration to avoid unknown values appearing in "the wrong places", or what exactly to pass to `-target` to select a subset of the configuration that will cause the problematic values to be known in a subsequent untargeted plan.

Terraform should ideally treat unknown values in these locations the same way it does elsewhere: it should successfully produce a plan which describes what's certain and is explicit about what isn't known yet. The user can then review that plan and decide whether to proceed.
Ideally, in each situation where an unknown value appears there should be clear feedback about which unknown value source it was originally derived from, so that when the user doesn't feel comfortable proceeding without further information they can more easily determine how to use `-target` (or some similar capability yet to be designed) to deal with only a subset of resources at first and thus create a more complete subsequent plan.

This issue is intended as a statement of a problem to be solved, not as a particular proposed solution to that problem. However, there are some specific questions for us to consider on the path to designing a solution:

- Is it acceptable for Terraform to produce a plan which can't even say how many instances of a particular resource will be created? That's a line we've been loath to cross so far, because the difference between a couple of instances and tens of instances can be quite an expensive bill, but the same could be said for other values that Terraform is okay with leaving unknown in the plan output, such as the "desired count" of an EC2 autoscaling group. Maybe it's okay as long as Terraform is explicit about it in the plan output?

  A particularly "interesting" case to consider here is when some instances of a resource already exist and then subsequent changes to the configuration cause the `count` or `for_each` to become retroactively unknown. In that case, the final result of `count` or `for_each` could mean that there should be more instances of the resource (create), fewer instances of the resource (destroy), or no change to the number of instances (no-op). I personally would feel uncomfortable applying a plan that can't say for certain whether it will destroy existing objects.

- Conversely, is it acceptable for Terraform to automatically produce a plan which explicitly covers only a subset of the configuration, leaving the user to run `terraform apply` again to pick up where it left off? This was the essence of the earlier proposal Partial/Progressive Configuration Changes #4149, which is now closed due to its age and decreasing relevance to modern Terraform. That proposal made the observation that, since we currently suggest folks work around unknown value errors by using `-target`, Terraform could effectively synthesize its own `-target` settings to carve out the maximum possible set of actions that can be taken without tripping over the two problematic situations above.

- Should providers (probably with some help from the Plugin Framework) be permitted to return an entirely-unknown response to the `UpgradeResourceState`, `ReadResource`, `ReadDataSource`, and `PlanResourceChange` operations in situations where the provider isn't configured completely enough to even attempt them? These are the four operations that Terraform needs to be able to ask a partially-configured provider to perform. If we allow a provider to signal that it isn't configured enough to even try, what should Terraform Core do in order to proceed with that incomplete or stale information?
  We most frequently encounter large numbers of unknown values when planning the initial creation of a configuration, when nothing exists yet. That is definitely the most common scenario where these problems arise, but a provider can potentially return unknown values even as part of an in-place update, if that is the best representation of the remote API's behavior -- for example, perhaps one of the output attributes is derived from an updated argument in a way that the provider cannot predict or simulate.

- Do we need to take any extra care with the situation where an unknown value cascades downstream from an updated or replaced resource instance? For example, if I've used an attribute from a vendor-specific Kubernetes cluster resource to provide an API URL to the `hashicorp/kubernetes` provider, and the user changes the configuration of the cluster in a way that causes the API URL to change, how should Terraform and the Kubernetes provider react to the cluster URL being unknown, given that there are existing objects bound to resources belonging to that provider which we will need to refresh and plan?

- What sort of analysis would we need to implement in order to answer questions like "why is this value unknown?" and "what subset of actions could I take in order to make this value known?"?
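The "retroactively unknown" repetition case from the first question above can be sketched like this (the variable and resource names are illustrative):

```hcl
variable "instance_names" {
  type = list(string)
}

# First apply: the caller passes literal strings, so compact() can run
# during planning, count is known, and instances are created and
# tracked in state.
#
# Later: some of the names come from not-yet-created resources, so the
# length of compact()'s result -- and therefore count -- is unknown for
# a resource that already has instances. The plan can no longer say
# whether applying will create, destroy, or leave them unchanged.
resource "null_resource" "named" {
  count = length(compact(var.instance_names))
}
```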