-
Notifications
You must be signed in to change notification settings - Fork 9.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Stablize the yamlencode function output #29436
Comments
We are seeing issues where applying a k8s configmap once, results in terraform wanting to make a change to apply the exact same configmap in a second run of |
Hi @mhulscher, From the behavior you described it sounds like something outside of Terraform is rewriting your data values into a different shape, which the Kubernetes provider is therefore detecting as drift. I must admit to not being super familiar with config maps, but my understanding is that config maps are mappings from string keys to arbitrary blobs/strings, and so I would expect the server to just store what you submitted verbatim, rather than parsing it as YAML and then reserializing it as YAML (with different formatting details) on the next read. Therefore I'm not sure I understand what would be causing the inconsistency you're describing. The concern of classifying detected changes as normalization rather than drift is the responsibility of providers rather than Terraform itself, so if these arguments are defined as being YAML specifically, rather than just raw strings to be stored and returned verbatim, then that sounds like a missing normalization-detection rule in the Kubernetes provider. I think something else might be going on in your particular case though, so I'd like to learn more if you have more to share. I'm not sure exactly what to ask due to my only-surface-level Kubernetes knowledge, but if you know of anything else in your system that might be rewriting these YAML blobs, or if you know that my assumption above about them being arbitrary blobs is incorrect, please let me know! And of course, if you know of anything else that might be relevant, even though I don't know to ask about it, please let me know that too! |
Hi @apparentlymart , Thank you for taking the time to write out such a detailed response. I am still investigating the issue that we are facing, but I don't believe the problem to be related to the k8s provider or the way the config_map resource is implemented. Our configmap has 1 key-value pair. The value is the string result of an array that is Unless you consider it possible that The diff can be found below:
|
Hi @mhulscher, If I'm reading that diff correctly, it seems like the value you are encoding is a list of objects (or, equivalently for Another possibility is that the input here is a set of objects. Terraform sets don't have any inherent ordering and so Terraform converts them to YAML sequences using lexical ordering for sets of strings or an unspecified-but-consistent ordering for other types. Again that should mean that the same input to In both cases then I don't really see how the ordering of a If you are using Terraform v1.0 or later then we might be able to get a little extra information to debug this: if the problem is something else modifying this data outside of Terraform then Terraform v1.0 should detect that during planning and announce "Note: Objects changed outside of Terraform" and then describe essentially the opposite of the diff you're seeing, and then Terraform will propose to undo that change to restore what you wrote in your configuration just as you're seeing today. Giving more information to help understand what's causing non-converging configurations like this is a key use-case of that new feature. With that said though, with the information we have right now I'm inclined to believe that something outside of the Terraform language is modifying your YAML documents here, and so I'm going to move forward with that assumption for the moment but if you are able to demonstrate otherwise then I'd definitely like to hear about it! Thanks again for sharing this strange behavior with me. |
As I mentioned in #23322 (comment), I am hard-pressed to think of an example resource API that accepts YAML that actually extends beyond JSON-compatible, and would not therefore be better-served by using actual JSON, besides the aforementioned The So I'd say |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. |
Back when we originally introduced
yamlencode
, circa the v0.12 release, we included a note in its documentation that the exact layout of its output was subject to change in future releases and so not to use it in situations that would react poorly to such changes:The motivation for this leeway arose from the fact that YAML is a relatively complex language (in comparison to, say, JSON) and so we expected that we'd need to make some pragmatic adjustments to allow for quirks in other YAML parser implementations, to try to maximize interop with other systems that consume YAML. In the intervening time we have made some small adjustments to both
yamlencode
andyamldecode
, but nothing particularly substantial.In practice, we're now treating
yamlencode
as largely frozen, because although the letter of our documented exception would allow it we don't actually have any wish to churn existing uses ofyamlencode
for any purely-stylistic reasons, and so the bar for changing it at this point is high. Specifically, I expect we'd only change it if we found that it either had a significant bug where it's producing a result that isn't valid per the YAML spec, or if there were a YAML parser elsewhere that is commonly used within software relevant to Terraform's use-cases (infrastructure tools) which can't reliably parse ouryamldecode
output even if it's valid per the YAML spec.This issue represents further raising that bar to say that the current output of
yamlencode
for a given input is effectively frozen under the Terraform v1.0 Compatibility Promises unless the change is to address behavior that's inconsistent with the YAML specification and the benefit of fixing it outweighs the costs when considering the behavior of existing third-party YAML parsers. To make that more specific:yamlencode
produces something that isn't valid per the YAML spec then our first order of business would be to understand why we hadn't previously been aware of that. If the outcome is that common YAML parsers in the wild are more tolerant than the YAML spec requires then we would likely conclude that the benefit of fixing the bug is outweighed by the cost of breaking compatibility. If the outcome is that the bug only occurs in a rarely-encountered edge case and that common YAML parser in the wild can't parse the buggy output then we would likely move to fix the bug, even at the risk of impacting modules that are already generating invalid YAML in that edge case.yamlencode
produces something that is valid per the YAML spec but yet still can't be parsed correctly by some particular third-party YAML parser, we would now typically not change theyamlencode
behavior and instead propose that the third party update their parser implementation to be more compliant with the specification.Although the experiment warning in the documentation isn't specific about what situations it's talking about, I want to be clear here that one major motivating use-case for this stabilization is to de-risk the use of
yamlencode
in userdata/metadata features in common cloud providers when producing cloud-init's "Cloud config" language, which typically requires including an additional YAML comment and thus isn't a suitable situation for using JSON instead:user_data
in EC2, and its equivalent in most/all other cloud VM systems, is an argument which can be changed only by replacing the corresponding virtual machine with a fresh one, because cloud-init processes it only on system boot. However, from the cloud provider level of abstraction it's just an opaque string or sequence of bytes, and thus there's no scope for any classification of two inputs as functionally-equivalent. It's therefore a particularly sensitive situation when it comes to churn, and so a configuration like the above would be against the current recommendation in the docs.However, before we do this we ought to first try to conclude some due diligence related to the existing issue #23322.
That was originally opened to discuss making the
yamlencode
output configurable, but as I already noted over there we don't intend to do that because YAML encoding is a relatively-minor programmatic integration point from Terraform's perspective, rather than a big part of it's scope, and so the cost of implementing and maintaining a highly-configurable YAML encoder (particularly under 1.0 compatibility promises as discussed here) is too high.However, there was some other commentary in that issue about how some
yamlencode
output is causing non-specific problems with specific external software. I asked the participants in that issue to share some more concrete examples, but at the time of writing nobody has done so.Given that, I think our best path forward would be to spend some time with the specific software mentioned (Ansible, and Kubernetes config maps) to see if we can reproduce any such problems ourselves. If we can then we can discuss here any tactical changes we might want to make to
yamlencode
's current behavior (while preserving our goal of minimizing unjustified churn).If we can't reproduce any such problems, and if nobody in the other issue comes forward with a specific example in the meantime, then I suggest we move forward with stabilization of the current implementation and then respond to any late-arriving feedback per the principles I described above, and within the principles of our 1.0 Compatibility Promises.
The text was updated successfully, but these errors were encountered: