non-obvious ordering of functions and resources #25162

Open
danieldreier opened this issue Jun 5, 2020 · 7 comments

Comments

@danieldreier
Contributor

Current Terraform Version

Terraform v0.12.26, also applies to Terraform v0.13.0-beta1

Use-cases

I'm creating this as a tracking issue for a broad category of issues where people do not realize that functions run early and don't participate in the resource dependency graph, and encounter non-obvious behavior as a result.

Specific examples are things like trying to create a template file and then reference it using templatefile.
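
As an illustration of that example, here is a minimal sketch of the pattern that trips people up. The resource name, file name, and template content are hypothetical, and it assumes the hashicorp/local provider is available. On a fresh checkout where greeting.tftpl does not yet exist, this is expected to fail during planning, because templatefile reads the file when the expression is evaluated rather than waiting for local_file to create it.

```hcl
# Hypothetical names throughout. This is the pattern that does NOT work.
resource "local_file" "greeting_template" {
  filename = "${path.module}/greeting.tftpl"
  # "$${" is an escape that writes a literal "${" into the file.
  content  = "Hello, $${name}!"
}

output "greeting" {
  # The filename is already known while planning, so templatefile tries to
  # read the file immediately. Referring to the resource here orders the
  # expression after the resource in the graph, but it does not delay the
  # file read until the file has actually been written.
  value = templatefile(local_file.greeting_template.filename, {
    name = "world"
  })
}
```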

Attempted Solutions

No specific workarounds listed yet - this is just to gather evidence to support some other refactor.

Proposal

No proposal yet - this is just a tracking issue to link issues of this nature as they come in.

References

@apparentlymart
Contributor

One thing I notice regarding this is that the documentation does talk about it specifically, but perhaps not using terms that are clear enough to the target audience, and also the text is pretty buried at the end of a wall of other text:

This function can be used only with files that already exist on disk at the beginning of a Terraform run. Functions do not participate in the dependency graph, so this function cannot be used with files that are generated dynamically during a Terraform operation. We do not recommend using dynamic templates in Terraform configurations, but in rare situations where this is necessary you can use the template_file data source to render templates while respecting resource dependencies.

Perhaps a way to start here would be to try to improve the docs situation around this to draw more attention to this behavior.

Most functions don't interact with the surrounding environment and so the time they are evaluated isn't significant. file, templatefile, and the other functions of this type are in a sense "bad" functions because they don't stick within the expected constraints, but they are there because extra resource files shipped as part of the configuration is a very common case. The way this is documented between all of them is inconsistent, and some of them don't even mention it directly at all, so I think there's lots of room for improvement of the docs even if we don't implement any changes to behavior for right now.

For easy reference, the set of functions whose result depends on environmental stuff that can be modified by terraform apply side-effects at the time of writing includes:

@tculp

tculp commented Nov 29, 2021

Is there any plan to make some functions operate in the dependency graph, or to otherwise add functionality to render a template that can use calculated (applied) values, while also not being stuck to the primitives limitation of template_file?

@apparentlymart
Contributor

Hi @tculp,

The templatefile function will accept dynamic values as part of the set of symbols to define in the template, given in the second argument. It's only the template source code itself which must be fixed as part of the configuration on disk. If you are interested in templates generated dynamically then you might be interested in #26838, though I also want to be explicit that as far as I know nobody is currently working on that, and because it likely involves a new language feature its next step is more detailed research and a design document rather than implementation.
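
A minimal sketch of that distinction, assuming the hashicorp/random provider and a hypothetical template file motd.tftpl that is already on disk as part of the configuration: the template source is fixed, while the value passed in the second argument is only known after apply.

```hcl
# Assumption: motd.tftpl is shipped with the configuration and contains
# something like:  Welcome to ${server_name}
resource "random_pet" "server" {}

output "motd" {
  # The template path is static, so the file can be read during planning.
  # random_pet.server.id is unknown until apply, so the rendered result is
  # reported as "(known after apply)" in the plan rather than as an error.
  value = templatefile("${path.module}/motd.tftpl", {
    server_name = random_pet.server.id
  })
}
```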

@tculp

tculp commented Nov 30, 2021

@apparentlymart Thanks for the help. I had tried using that before but I kept getting "... values will be known after apply", but now it's working... I'm not sure what changed, but I'll keep a lookout to see if I get that behavior again

Edit: I think I figured it out. I believe I had messed up the templating. The "will be known after apply" message is only informational, but I mistook it for the cause of the error.

@dekelpilli

I have a local-exec that creates a file, and then I use the sha of that file (filesha256) in a later resource. Just adding my 2 cents to say these functions respecting the dependency graph would be useful for me

@apparentlymart
Contributor

apparentlymart commented Jun 6, 2024

Returning to this some time later, I think it might be helpful to be a bit more specific about what exactly it might mean for functions to "respect the dependency graph":

I know from related discussion about providers being configured during the planning phase that many participants have different ideas about what exactly Terraform's "dependency graph" is used for and what it guarantees.

Terraform uses its dependency graph primarily to ensure that expressions are evaluated in the correct order relative to one another. During the apply phase in particular it also affects the order of the planned side-effects, but that addition is specific to the apply phase and so isn't helpful for anything that happens before the apply phase.

In particular, the dependency graph has very little involvement in deciding what can happen in the planning phase vs. what must wait until the apply phase, or even until a future plan/apply round, because each phase has its own dependency graph used for its own evaluation process.

Terraform has some other mechanisms, separate from the dependency graph, that it uses to model the fact that some decisions need to be made before Terraform has performed the actions that would decide the inputs:

  • Unknown values: this is Terraform's main tool in this area, allowing us to insert a placeholder for a value that Terraform cannot predict yet. This is what renders as "(known after apply)" in the plan output. Terraform still evaluates all of the expressions to create a plan, but if an input to an expression is unknown then the result of that expression is likely to be unknown too.

    As far as possible Terraform tries to complete planning in the presence of unknown values, but there are some situations where Terraform cannot proceed today. Those are discussed in "Unknown values should not block successful planning" (#30937), where we are working to reduce those situations and turn the remaining ones into deferred actions that can be planned and applied in a future plan/apply round.

  • Explicit deferrals: in some cases Terraform is required to decide whether something should be done during the planning phase or should be deferred until the apply phase. The new idea of "deferred actions" is extending that further to include the idea of certain actions not even being plannable in the current round and so deferred for planning in a future round.

    The rules for this tend to be derived from either unknown values or dependencies, and so this is one way that the dependency graph can potentially indirectly affect the outcome, but this is a separate mechanism from the main execution dependency graph.

    For example, Terraform is designed to read from data sources during the planning phase if possible. "Possible" is currently defined as: the configuration does not contain any unknown values, and none of the data resource's direct dependencies already have planned changes. This indirectly uses the dependency graph because Terraform must already have planned the dependencies before planning the data resource, but the decision here is part of the data resource evaluation logic, not part of the graph walk itself.
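
The unknown-value mechanism described above can be sketched as follows. This assumes Terraform v1.4 or later for the built-in terraform_data resource, and the names are hypothetical. On the first plan the resource's id has not been generated yet, so every expression derived from it is also unknown.

```hcl
resource "terraform_data" "example" {
  input = "hello"
}

locals {
  # The function call is still evaluated during planning, but because its
  # argument is unknown on first creation, its result is unknown too.
  shouted_id = upper(terraform_data.example.id)
}

output "shouted_id" {
  # Expected to render as "(known after apply)" in the first plan.
  value = local.shouted_id
}
```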

Taking this feature request at face value then, it is asking for Terraform to have a rule for deciding that some functions should have their execution deferred until the apply phase under certain circumstances, and thus anything that would depend on the result of that function must also be somehow deferred until the apply phase.

We already have a few functions that have simple deferral behavior. For example, the timestamp function is defined as returning the time when the function was evaluated during the apply phase and so by definition it cannot be decided during the plan phase. Therefore it returns an unknown value during the planning phase, which then propagates through expression evaluation as with any unknown value and so downstream objects react to that however they are designed to.
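
A small sketch of that behavior, with hypothetical output names: both values below should show as "(known after apply)" in the plan, because the unknown result of timestamp propagates through formatdate.

```hcl
output "applied_at" {
  # Unknown during planning; resolved to the current time during apply.
  value = timestamp()
}

output "applied_date" {
  # formatdate receives an unknown argument during planning, so its result
  # is unknown as well and is only computed during apply.
  value = formatdate("YYYY-MM-DD", timestamp())
}
```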

A naive interpretation of this request would have us make the filesystem access functions also always return unknown values during the planning phase, so that the file contents read are always those present during the apply phase. But that would make these functions considerably less useful for their currently-intended purpose, and even if not it would be a breaking change and so not possible under the Terraform v1.x Compatibility Promises.

A more subtle compromise would be to change the functions to return unknown values only if they would otherwise have failed with an error, on the assumption that the error will probably go away during the apply phase. However:

  • That's a faulty assumption. It's also pretty common for these functions to fail because the author made a mistake in specifying a path in the arguments, in which case this would just move the correct error from the planning phase to the apply phase, making the iteration cycle longer and potentially risking a failure after other actions were already taken. (As far as possible Terraform aims to anticipate failures during the planning phase.)
  • It is arguably also a breaking change, although one that is more debatable than the naive idea. We typically don't consider replacing an error with a success as a breaking change, but we do still need to be somewhat cautious because the try and can functions allow authors to "program with errors" and if a particular pattern is in wide use (despite these functions' documented warnings against non-simple usage) we would probably err towards retaining compatibility out of pragmatism.
  • Not all problems with these functions would manifest as actual errors. For example, if a directory passed to fileset already exists during planning but would have extra files added to it during apply then the planning phase would succeed but produce an incomplete result. Terraform would then detect the inconsistency and raise a confusing error during the apply phase.
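
To make the compatibility concern in the second point concrete, here is a hedged sketch of the kind of "programming with errors" an existing configuration might contain. The file name and the fallback value are hypothetical. Today a missing file produces an error during planning that try turns into the fallback; if file instead returned an unknown value in that case, this expression would stop selecting the fallback at plan time.

```hcl
locals {
  # Hypothetical optional override file. try() catches the error that
  # file() raises when the path does not exist and uses "{}" instead.
  override_json = try(file("${path.module}/override.json"), "{}")

  # Downstream expressions currently get a known value during planning,
  # whether or not the override file is present.
  override = jsondecode(local.override_json)
}
```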

So that idea doesn't seem quite right either. I'm not going to enumerate any more possibilities here, since this comment is already long, but suffice it to say that every alternative I've considered so far has encountered some significant drawback that either makes it not solve the assumed problem or comes at a significant cost to those already using these functions in the way they were intended to be used -- for reading files that are distributed along with the configuration and not modified at all during the apply step.

Therefore this issue remains open in large part because it isn't clear that there is any satisfactory solution to it, even though we acknowledge that these functions are an attractive nuisance for someone new to Terraform who might misunderstand them as the way to read files in all cases.

In practice, the hashicorp/local provider already offers data-source-based equivalents of many of these functions, which therefore do allow ordering of operations relative to other changes in the same way as any other data resource would. Those that aren't available could be added if there is sufficient interest, although I'm not responsible for that provider myself so I can't commit on behalf of the team that maintains it.
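
A minimal sketch of that approach, assuming the hashicorp/local provider and Terraform v1.4 or later for terraform_data. The resource names, the file name, and the shell command are hypothetical. Because the data resource depends on a resource with planned changes, its read is expected to be deferred to the apply phase on the first run, after the provisioner has written the file.

```hcl
resource "terraform_data" "generate" {
  # Hypothetical stand-in for whatever script generates the file.
  provisioner "local-exec" {
    command = "echo generated > ${path.module}/generated.txt"
  }
}

data "local_file" "generated" {
  filename = "${path.module}/generated.txt"

  # Orders the read after the generating resource. While that resource has
  # pending changes, the read waits for the apply phase.
  depends_on = [terraform_data.generate]
}

output "generated_sha256" {
  # Takes the place of filesha256("generated.txt"), which would be
  # evaluated before the file exists.
  value = sha256(data.local_file.generated.content)
}
```

On later runs, when terraform_data.generate has no planned changes, the data source can be read during planning as usual, provided the file is still present on disk.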

With all of this said then, it seems to me that finding a way to clarify the situation better in our documentation remains the most plausible possibility for this issue. Documentation is not in my area of responsibility and so I cannot say exactly what ought to change, but it does seem that the current docs are not getting the point across sufficiently.

(There is also room for changing the error messages that Terraform returns when these functions fail, but we are typically more constrained in that context due to error messages needing to be relatively concise and highly relevant or else people tend to just not read them at all.)

@dekelpilli

Thank you for the extremely detailed response. I think this explains very well the difficulty/drawbacks of making this change. For what it's worth, I solved my specific problem using the local_file data source as you mentioned, so while it would have been convenient for it to work with the filesha256 function, it wasn't a blocker or even more than a minor inconvenience. Given the presence of workarounds here and the complexity of potential solutions, I can see why this has been open for 4+ years. That said, explicit deferrals do sound like a positive step.

Also, just as an aside, I thought the error message returned by terraform in this instance was clear and concise -- terraform can't be expected to know the file was missing because I expected some bash script to generate it beforehand.
