configurable output format for yamlencode #23322
Comments
+1, I think we should produce some nice looking YAML :) |
Ran into this problem today. The quotations are causing weird issues with Kubernetes config maps (I have to embed a YAML into a config map key) |
+1. Want to be able to create config maps from terraform maps. |
This causes problems in a number of environments where downstream applications consume YAML but dislike the "quote everything" + "alphabetically sorted" output of the yamlencode function. |
Yes, it would be nice to keep the original ordering of fields in the template file + remove the quotes. You could use a beautifier for that within Terraform after converting to YAML. |
+1, encountered this issue just now |
+1, Also encountered this issue today |
I have a workaround for this. It's working for me, but beware: it's kinda hacky. Having this variable:
some_map_var = {
  foo = ["bar", "baz"]
  dofoo = true
}
Wrap it with a regex replace function:
replace(yamlencode(var.some_map_var), "/((?:^|\n)[\\s-]*)\"([\\w-]+)\":/", "$1$2:")
Results in this output:
foo:
- "bar"
- "baz"
dofoo: true |
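A minimal, self-contained sketch of the workaround above, reusing the same map and regex. The local and output names are illustrative assumptions, and the yamldecode comparison is an added sanity check that is not part of the original workaround:

```hcl
locals {
  some_map_var = {
    foo   = ["bar", "baz"]
    dofoo = true
  }

  # Default yamlencode output: all keys and string values are quoted.
  quoted_yaml = yamlencode(local.some_map_var)

  # Strip the quotes from mapping keys only; string values stay quoted.
  unquoted_keys_yaml = replace(
    local.quoted_yaml,
    "/((?:^|\n)[\\s-]*)\"([\\w-]+)\":/",
    "$1$2:"
  )
}

output "unquoted_keys_yaml" {
  value = local.unquoted_keys_yaml
}

# Assumed check: both documents should decode to the same data.
output "same_data" {
  value = yamldecode(local.unquoted_keys_yaml) == yamldecode(local.quoted_yaml)
}
```

Note that the pattern only matches keys made of word characters and hyphens, so keys containing characters such as . or / keep their quotes.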
Hi all! Sorry for the slow response here. I was just reviewing the comments here and it seems like while some of the comments could be considered just a matter of style preference (some folks prefer the unquoted YAML style, which is fair enough), I also see several of you talking about situations where other software has refused to process the yamlencode output.
To summarize I see:
When we first introduced yamlencode ...
However, I expect we would make some different tradeoffs if it turned out that what yamlencode ...
Since generating YAML is only an ancillary use-case for Terraform and not its primary purpose, I don't expect that we would invest in a highly-configurable yamlencode ...
Thanks!
I do want to note that there's a key difference here between a purely stylistic tradeoff like string quoting compared to the functional difference of specifying map keys in a particular order. For the latter, it's not ...
Dealing with these various little differences between type systems is part of the game when it comes to cross-language serialization formats, so I'd hope that anyone writing a YAML parser would be pragmatic and realize that there are plenty of languages which (like Terraform) don't have order-preserving mapping types. If not though, unfortunately I don't think we can really help much with that because the original ordering information just isn't there, and often wasn't inherent in the source data in the first place if e.g. the map was constructed dynamically using a function.
If you need that level of control, you'd need to use a different strategy to generate YAML mechanically yourself, such as generating it from a template where you can dictate exactly which punctuation, whitespace, and ordering the result would have. |
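One way to read the "generate it yourself" suggestion is sketched below. It is only an illustration: the role data, ARN, and local names are made-up placeholders, and it builds the YAML line by line with join and flatten instead of a template file, so that key order, quoting, and indentation are all explicit:

```hcl
locals {
  # Hypothetical input data; the ARN and names are placeholders.
  roles = [
    {
      rolearn  = "arn:aws:iam::111122223333:role/example-nodes"
      username = "example-nodes"
      groups   = ["system:bootstrappers", "system:nodes"]
    },
  ]

  # Each list item is emitted in a fixed key order with no quoting.
  map_roles_yaml = join("\n", flatten([
    for r in local.roles : [
      "- rolearn: ${r.rolearn}",
      "  username: ${r.username}",
      "  groups:",
      [for g in r.groups : "    - ${g}"],
    ]
  ]))
}

output "map_roles_yaml" {
  value = local.map_roles_yaml
}
```

The tradeoff is that nothing here escapes or quotes values, so this approach is only safe when the values are known not to contain characters that are special in YAML.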
@apparentlymart I think the point is the yamlencode function does not produce valid YAML, at all, for anything. No YAML should have the maps, keys, and lists in quotes. That is not the standard anywhere and parsers that encode or lint proper YAML syntax would have a problem with this.
For example, the AWS EKS Terraform module created by AWS uses yamlencode to render data for the aws-auth configMap in Kubernetes. This defines the mapping of AWS IAM accounts, roles, and users to Kubernetes groups and users for access control to the entire cluster. The YAML data in the configMap should look like this, as shown in their documentation. This is a standard Kubernetes manifest:
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    - rolearn: <ARN of instance role (not instance profile)>
      username: system:node:{{EC2PrivateDNSName}}
      groups:
        - system:bootstrappers
        - system:nodes
However, this is what you get as a result of using the Terraform EKS module, which aggregates maps and lists in locals and then uses the yamlencode function to render the YAML for the data in the configMap that is created with the Kubernetes provider configMap resource. None of these quotes should have been added; they are not required, and they go against the entire point of YAML being more human-readable. I don't think I've seen any YAML parser that puts everything in quotes like this.
mapRoles: |
  - "groups":
    - "system:bootstrappers"
    - "system:nodes"
    "rolearn": "arn:aws:iam::{redacted}:role/eks-workers-role"
    "username": "system:node:{{EC2PrivateDNSName}}"
  - "groups":
    - "system:masters"
    "rolearn": "arn:aws:iam::{redacted}:role/AWSReservedSSO_AdministratorAccess"
    "username": "AWSReservedSSO_AdministratorAccess"
  - "groups":
    - "system:bootstrappers"
    - "system:nodes"
    "rolearn": "arn:aws:iam::{redacted}:role/eks-node-role"
    "username": "system:node:{{EC2PrivateDNSName}}"
The yamlencode function is useless unless you use replace functions to remove the quotes, which is not ideal and not always possible. I might not want to replace every " character with an empty string, because sometimes the quotes are deliberate: a boolean (e.g. key: true) may need to be cast to a string (e.g. key: "true"), and likewise a number such as 123456 may need to be the string "123456".
Another example would be to look at any Kubernetes manifest YAML or use the Helm template command to render a chart into manifest YAML. The only time you'd have quotes around the value is for things like numbers that you want to be treated as a string type. You don't even need quotes or escapes when a key name has . or / within it, as long as it's before the :
kind: ConfigMap
metadata:
  creationTimestamp: "2021-08-28T04:45:12Z"
  labels:
    eks.amazonaws.com/component: coredns
    k8s-app: kube-dns
  name: coredns
  namespace: kube-system
  resourceVersion: "10831359"
The reason why this is so extremely important and why you do not want this is that, unlike JSON, whitespace matters in YAML. The number of spaces (not tabs), and therefore the indentation of maps and lists and how data is nested, is important for it to be valid in most cases. |
+1 for non-quoted yaml keys and most values |
don't get me wrong, I would 100% prefer a way to only use quoted keys/values when it's required, but...
@fitchtech fyi the YAML 1.1 spec does actually have examples of quoted keys being valid YAML, which they totally are. it's just not super well spelled out imo. the YAML 1.2 spec has a slightly different example, but demonstrates the same validity of quoted keys. |
@joshsleeper while it may be valid, it does cause issues. Also, it does not follow proper YAML styling. Using quotes has a specific meaning in YAML, unlike JSON or HCL. For example, if I have locals { number = 12345 }, that's specifying a number data type. So I would expect the YAML equivalent to be:
number: 12345
And not this:
"number": "12345"
That's not what I declared or want as the output. It should be the same data type and only cast to string when set that way, for example if it were number: "12345".
It just doesn't make sense to put all the keys, values, and maps in quotes like this. It's not useful in practical application and I always avoid it. An easier approach with cleaner YAML is to use the templatefile function with a map of maps variable that inserts your YAML blocks within a template file using a string template for each expression. Nesting that within yamldecode in locals then lets you pass it to other blocks easily, like the data block of a Kubernetes ConfigMap resource. |
while I agree that arbitrarily quoting numbers and boolean values would be a problem, I'm not seeing such behavior in yamlencode:
# sample.tf
locals {
  test_yamlencode = yamlencode({
    string_key : "string_value"
    simple_number : 123
    complex_number : 1e+3
    123 : 123
    bool_key : false
    map : [
      "map_string", 456, true,
    ]
  })
}
output "test_yamlencode" {
  value = local.test_yamlencode
}
string, boolean, and number values passed to yamlencode ...
the only change I'm really seeing is it forcing key quoting (which really should be considered a style thing since all keys act like strings and it's perfectly valid according to the spec) and forcing string value quoting (which again is perfectly valid and often recommended to avoid special characters behaving oddly). |
@joshsleeper didn't realize it was not quoting numbers and bool type values at least. Still seems strange that all the other keys and string values are in quotes despite that being unnecessary. IMHO the only times it should be quoted in the YAML is when you want number or bool cast as string, e.g. "12345" or "true" |
Thanks for raising the question about the use of quotes, and for the efforts here to uncover whether it represents a practical problem for interoperability with other software.
Using quoted strings universally is therefore a compromise that ensures that most other parsers (of both YAML versions) will interpret the value as a string without incurring the high readability cost of writing out explicit type tags. We intend the result to follow the YAML 1.2 core schema while also being unambiguous to a YAML 1.1 parser (as far as possible, given that YAML 1.1 intentionally treats various parsing rules as application-defined). Based on what we've seen so far, this seems like an example of a style preference rather than an interoperability problem and thus not within the scope of changes we'd consider making to yamlencode. |
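The effect of the quoting rule described above can be checked with a small round trip. This is only a sketch with made-up attribute names: it encodes a string that looks like a number, a string that looks like a boolean, and a real number, then decodes the result again.

```hcl
locals {
  # Hypothetical values: two strings that look like other types, one number.
  encoded = yamlencode({
    id       = "12345"
    enabled  = "true"
    replicas = 12345
  })

  decoded = yamldecode(local.encoded)
}

# Expected to hold because the quoted scalars decode as strings again,
# while the unquoted number stays a number.
output "types_survive_round_trip" {
  value = (
    local.decoded.id == "12345" &&
    local.decoded.enabled == "true" &&
    local.decoded.replicas == 12345
  )
}
```

If string values were emitted unquoted, "12345" and "true" would be indistinguishable from the number and the boolean when read back.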
I have a case where I am using Terraform + SaltStack + Consul. I have SaltStack setup to read pillar information from Consul:
In Terraform I would write a key named
This writes the key as follows:
The key's value has the quotes when I look at Consul. However it seems that Saltstack is able to handle it just fine and remove the quotes:
Even though for my use case it seems to work, it's odd because normally you wouldn't put quotes around those items if you were defining this in a local YAML file. Most people would probably be thrown off by this behavior (I was initially). |
I use this workaround:
set {
  name = "config"
  value = replace(yamlencode(
    {
      region : "eu-west-1",
      set_timestamp : "false",
      period_seconds : "240",
      metrics : [
        {
          aws_namespace : "AWS/RDS",
          aws_metric_name : "ReadLatency",
          aws_dimensions : "[DBInstanceIdentifier]",
          aws_dimension_select : "{DBInstanceIdentifier : [db-complete-mysql-444105]}",
          aws_statistics : "[Average]"
        },
      ]
    }
  ), "\"", "")
} |
In addition to the things above, this causes configuration drift for the Terraform rancher_app_v2 values input, which seems to format YAML in a different way; as a result, there is always configuration drift when using yamlencode output as the rancher_app_v2 values input. |
Hi @herrbpl, In Terraform's architecture, part of the responsibility of a provider is to include rules to recognize the difference between two values that are materially different -- that is, the meaning has changed -- vs. two values that are just two different ways to write down the same information. There are already lots of examples of providers handling this for JSON, where remote APIs will often accept JSON as input but store the data internally in some other format, re-serializing it to JSON on read and therefore potentially producing a different serialization. Although this is the first example I've seen of a system doing this with YAML -- and surprising, because presumably that means it will also discard any comments you included in the input, thus defeating a main benefit of YAML over JSON -- I think the same architectural principle still applies: the Rancher provider ought to have a rule to detect when two values are serializations of the same data and classify that as an immaterial change, to allow the configuration and state to converge. I'd suggest recording that as a feature request for the provider. Unfortunately since I think this is the first example of doing it for YAML in particular, rather than for e.g. JSON, it'll take some extra up-front work to write a comparison function for YAML, whereas in JSON situations there is one built into the SDK which can handle many simple situations. However, I assume the same principle will apply as for the JSON equivalent: parse both the old and the new to discard the irrelevant syntax details, and then compare them to see if there are any remaining differences beyond just syntax. |
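The "parse both, then compare" principle can be illustrated in the Terraform language itself. A real provider would implement this in its own code, so the snippet below is only a sketch of the idea, with made-up values standing in for the configured and refreshed documents:

```hcl
locals {
  # What the configuration produced (quoted, keys sorted).
  configured = yamlencode({ image = "nginx", replicas = 2 })

  # Hypothetical re-serialization returned by a remote system:
  # different key order, no quotes.
  refreshed = "replicas: 2\nimage: nginx\n"

  # The strings differ, but the decoded data should be identical.
  strings_differ      = local.configured != local.refreshed
  only_syntax_differs = yamldecode(local.configured) == yamldecode(local.refreshed)
}

output "only_syntax_differs" {
  value = local.strings_differ && local.only_syntax_differs
}
```

A comparison on the raw strings would report a change here, which is the drift described in the Rancher example; a comparison on the decoded values would not.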
Thanks for the detailed reply. Now that I think of it, I seem to recall rancher_app and app_v2 use a string for the values input. Even an extra line feed causes drift. I'll post this to their provider tracker. |
My 2c: providers should never* deal with YAML directly. There is very rarely a situation where JSON wouldn't be better: you can reasonably normalize JSON for most applications, thereby preventing drift without having to parse it and compare the parsed tree. And JSON is a subset of YAML these days, so all YAML-compliant apps should be able to handle it. If at any point along the chain anything re-encodes the YAML, you're almost certainly going to lose stylistic information anyway: AFAIK there exists no YAML re-encode process that perfectly preserves stylistic info (all whitespace, all quote styles, all comments). So if your application only deals with the subset of YAML structure that is JSON-compatible, you may as well use JSON because your YAML's going to get mangled anyway. Style-preserving YAML is almost a fundamentally separate type to we-only-care-about-data YAML. For example, take the
Indeed, when it loads Anyway, my point is: I'm guessing YAML re-encoding stability is not actually that necessary in practice because no real API actually wants an (* Exception might be when the output is meant for human consumption and you need to preserve its exact stylistic structure, comments, etc. but I'm hard-pressed to think of an example of that in the Terraform realm.) |
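The claim that JSON can stand in for YAML can be tried directly, since yamldecode accepts a JSON document. A minimal sketch with made-up data:

```hcl
locals {
  # Hypothetical data structure.
  data = {
    name    = "example"
    ports   = [80, 443]
    enabled = true
  }

  as_json = jsonencode(local.data)

  # Parse the JSON text with the YAML decoder.
  decoded_as_yaml = yamldecode(local.as_json)
}

# Re-encoding the decoded value should reproduce the same JSON text.
output "json_is_readable_as_yaml" {
  value = jsonencode(local.decoded_as_yaml) == local.as_json
}
```

Because jsonencode output has a single, stable form, comparing it as a string is usually enough to detect real changes, which is the normalization advantage argued for above.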
One counterexample would be cloud-init. I would argue that you could just store the shebang-style comment and the body separately, then mix them together in YAML for the user when writing to the API. |
I agree that it would be weird for a provider to itself be dealing with YAML. I think the main situations for I do find the Rancher example surprising for this reason, but I'm not familiar enough with Rancher to understand the details of what's going on there. It seems like either the Rancher provider or the Rancher API are directly using the YAML but are reflecting it back in a normalized form, which is pretty unusual as I mentioned above and I've still not encountered another example of such a design. I'd rather keep discussions about the designs of specific providers in those providers' own issue trackers though, so that their authors (who know far more about the underlying systems than I do) can be the ones to make the necessary tradeoffs. For our purposes with this issue, if a provider has behavior like discussed above where it (or the API it interacts with) accepts YAML and normalizes it then it would be the provider's responsibility to classify that normalization as normalization, so that Terraform will not report it as a meaningful change. Whether the provider should be doing that is a matter for the provider developers to consider for themselves, but the previous situation is one of the consequences they should consider when making that decision. |
Yep, I was just using I think you're right: it's up to the provider to know its resource API details and avoid drift where there isn't a meaningful change. Namely, it should not be up to the user, via normalization flags to So I'd say the solution to this particular issue is just a clear Terraform policy around that, that users and provider devs can be pointed at when this comes up. That said, my advice as a provider dev is to never do API calls with raw YAML if it can be avoided. |
Hi @apparentlymart -- I am assuming you're affiliated with Hashicorp and Terraform. Thank you for your answers and for your effort here. The nature of this thread reveals a core truth of Terraform, namely that it is a semantically correct and pure software tool. There are a multitude of use-cases for producing YAML (and JSON) as these are the primary data interchange mechanisms used by modern software. While you assert that it's not a primary function of the software, that cannot really be true, as a fundamental purpose of Terraform is to interoperate with other software. If it is the case that the latest version of YAML allows for quoting, that's delightful, but it's not anyone's current reality. It may be pure, but it ain't real :-) I try to do things right as often as possible in the software I work on. But I work in reality. I hope you and other Terraformers will understand the day to day challenges those of us who do battle daily are faced with and think about ways to be right by default, and be flexible as an option. A I have great respect for and appreciation of the Terraform tool and team. Thanks for listening. |
Hi @tomharrisonjr! My request above was to share specific examples of software that doesn't implement YAML in a way that supports the format that Terraform is generating, in which case we would review whether it is either Terraform or the other software that is incorrect and adjust Terraform if appropriate. I'm still willing to do that, and it does sound like you have a potential example to share. Can you say a little more about what's going on with Buildkite that is causing you problems? I understand that Buildkite is closed-source SaaS software and so not possible for you to describe details about its implementation, but if you can show the input you tried to send to Buildkite (with |
The produced YAML causes issues when the ordering in the Terraform code is not preserved. For example, here is some Terraform code:
And that produced this YAML:
In the above, the |
Hi @georgikoemdzhiev! Thanks for sharing that. Do you know which software is ultimately parsing and decoding that YAML document? I see that you are passing it to an AWS provider resource type, but I'm not sure whether it's the AWS provider which parses it or if it just sends that whole string to some other system which then parses it. I'd like to identify who owns the parser so we can understand the impact of this difference.
This situation is unfortunately more fundamental than just customizing the output format, because listing source before destination here requires information that Terraform doesn't have. As is the case in several other languages, maps in Terraform are not an order-preserving data type and so the order of definition of elements in a constructor is only a source code artifact and has no effect on the behavior at runtime. I don't think allowing a caller to control the serialization order for map elements will be possible with yamlencode ...
I wonder how this YAML structure would be described in other languages that similarly do not retain the declaration order of a constructed map. 🤔 |
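The point about declaration order can be seen with two constructors that differ only in the order their attributes are written. The attribute names follow the source/destination example from the discussion; the values are placeholders:

```hcl
locals {
  # Same attributes, written in a different order.
  written_source_first      = yamlencode({ source = "a.txt", destination = "b.txt" })
  written_destination_first = yamlencode({ destination = "b.txt", source = "a.txt" })
}

# Both expressions describe the same value, so the encoded text is the same.
output "declaration_order_is_not_kept" {
  value = local.written_source_first == local.written_destination_first
}
```

In both cases the keys come out in sorted order, so destination is emitted before source no matter how the constructor was written.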
Hello, thank you for addressing my comment.
I believe the software that parses the YAML is AWSTOE and it is used by AWS Image Builder itself but I am not sure. Looking at the Image Builder docs it certainly sounds like that is the software parsing the YAML This is an extract from the docs: |
Thanks for that information, @georgikoemdzhiev. After following a few links I believe you are right and that in particular the specific part you raised here is the action-specific input arguments for the I wasn't able to find anything in the documentation stating that this format requires the |
Hi Martin, I tried to replicate the issue I was having today but it appears that I can no longer replicate it. The issue with using the |
One of those languages is YAML itself! The YAML spec is clear on this. Anything that requires a particular ordering of keys in a YAML mapping is not processing YAML correctly. Link to YAML 1.2 spec: 3.2.2.1. Mapping Key Order. (The YAML 1.1 spec has almost identical text.) |
Hi all! We left this issue open for a few years to try to gather specific details on any situations where Terraform's yamlencode ...
My read of the discussion above is that several participants still have the reasonable style preference to produce unquoted mapping keys, although that particular change is not something we intend to make for the reasons I described earlier. Other than that, it doesn't seem like we've found any significant cases where yamlencode ...
For that reason, we're planning to move forward with treating yamlencode ...
I'm going to close this issue now specifically to represent that we do not intend to produce a configurable yamlencode ...
This issue also touched on a concern which isn't really part of yamlencode ...
If you find situations where a provider seems to refresh YAML into a shape that doesn't match the input -- regardless of whether that input was produced using the yamlencode ...
Thanks for the discussion here, and in particular to those who shared specific potential concerning examples for us to study. That has been helpful in confirming that the yamlencode ...
We do intend to eventually support functions contributed by provider plugins as a new extension point for the Terraform language, although that is still subject to some research and design work before it's ready to go. Once we reach that point, those who have an interest in generating a particular YAML style that doesn't match the yamlencode ...
Thanks again! |
I'm going to lock this issue because it has been closed for 30 days ⏳. This helps our maintainers find and focus on the active issues. |
Current Terraform Version
Use-cases
I want to use terraform to generate yaml formatted configuration files for an ansible based installation.
Attempted Solutions
Currently we dump a jsonencode(${var.some_map_var}) file to the target system, and then use remote-exec that runs a Python script that parses the .json file to generate the desired config.yaml
With a map of
This will generate a nice yaml that Ansible can use, i.e.
Having discovered the yamlencode function in 0.12 this seems like a really nice option to avoid the escape hatch of the remote-exec python script and stay truer to Terraform native end-to-end.
However, the current yamlencode function seems to produce a file like this
where all the keys are quoted (I guess because they are strings), rather than giving us a nice UTF-8 unquoted yaml file as we get with our Python parser.
This seems to create some issues for Ansible.
Proposal
Allow (at least via a config switch) generating yaml files that do not quote keys and values
References