-
Notifications
You must be signed in to change notification settings - Fork 520
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Numbers in scientific notation without dot are parsed as string #173
Comments
The sign is optional. You were quoting the canonical form. |
Oh, my mistake. So I guess both changes are justified. |
I ran into the same issue (which I consider a bug). The stackoverflow solution provides a simple fix that I hope will find its way upstream sooner than later (unfortunately it was not adopted in the step from 3.12 to 5.1) |
For anyone running into similar issues, the derived package ruamel.yaml resolves issues like the one stated above. |
It seems like it's not just the presence or absence or a decimal point that affects things, but also whether there is an explicit sign (positive or negative) preceeding the exponent. I'm unsure if this is also addressed by PR #174 or if it could be included. In [1]: import yaml
In [2]: yaml_str = ('''
...: hello:
...: - 5.0E6
...: - 5.0e6
...: - 5.E6
...: - 5.e6
...: - 5E6
...: - 5e6
...: ''')
In [3]: yaml.safe_load(yaml_str)
Out[3]: {'hello': ['5.0E6', '5.0e6', '5.E6', '5.e6', '5E6', '5e6']}
In [4]: yaml_str = ('''
...: hello:
...: - 5.0E-6
...: - 5.0e-6
...: - 5.E-6
...: - 5.e-6
...: - 5E-6
...: - 5e-6
...: ''')
In [5]: yaml.safe_load(yaml_str)
Out[5]: {'hello': [5e-06, 5e-06, 5e-06, 5e-06, '5E-6', '5e-6']}
In [6]: yaml_str = ('''
...: hello:
...: - 5.0E+6
...: - 5.0e+6
...: - 5.E+6
...: - 5.e+6
...: - 5E+6
...: - 5e+6
...: ''')
In [7]: yaml.safe_load(yaml_str)
Out[7]: {'hello': [5000000.0, 5000000.0, 5000000.0, 5000000.0, '5E+6', '5e+6']} Reference to original discussion: microscopium/microscopium#159 (comment) |
Issue is still present, I use the fix found on Stackoverflow, manually adding the new implicit resolver. |
I stumbled into this one today. we use a python script to alter a kubernetes manifest. it uses yaml.safe_load() to read in the manifest file, alters the dictionary and yal.dump() 's it back to the yaml. Today that manifest had an annotation of The following is a trimmed down recreation of what happens
pyyaml appears to be considering this as a string and therefore removing the quotes and the output yaml ends up being when it should be since it was quoted in the origional yaml file. I can work around it using the following
which outputs |
Since pyyaml implements YAML 1.1, this behaviour is actually correct. |
In scenarios where an interface name is composed of numbers in scientific notation without a dot, the application panics with a failure to extract a string type from a Go interface (which is actually a float64). In order to fix the problem, the parsing of the interface name is performed using a dedicated interface-state structure. Note: this change does not solve the problem of incorrectly representing valid scientific numeric values like `10e+02` as strings (they are now represented back as full integers: "1000"). This is due to the YAML-to-JSON conversion which does not obey to the defined member type. ref: yaml/pyyaml#173 Signed-off-by: Edward Haas <[email protected]>
* state: Interpret basic state through an explicit structure In order to ease the interpretation of the state yaml data, use a type structure that represents the basic root items of state: - interfaces - routes: - config - running In order to keep the generic nature of these items content, they use as value a generic Go interface{}. The filter helper functions, with this change, have been also made immutable. They now return the filtered values. Signed-off-by: Edward Haas <[email protected]> * state, filter: Introduce unit tests for numeric iface names Interfaces names with numeric values are accepted by the kernel. Introduce tests that checks the behavior when using numeric names. Signed-off-by: Edward Haas <[email protected]> * state, filter: Do not panic on interpretation of ifaces names In scenarios where an interface name is composed of numbers in scientific notation without a dot, the application panics with a failure to extract a string type from a Go interface (which is actually a float64). In order to fix the problem, the parsing of the interface name is performed using a dedicated interface-state structure. Note: this change does not solve the problem of incorrectly representing valid scientific numeric values like `10e+02` as strings (they are now represented back as full integers: "1000"). This is due to the YAML-to-JSON conversion which does not obey to the defined member type. ref: yaml/pyyaml#173 Signed-off-by: Edward Haas <[email protected]> * state, filter: Fix interpretation of scientific numeric iface names This change solves the problem of incorrectly representing valid scientific numeric valuesi without a dot in them (e.g. `10e20`) as strings (e.g. represented back as `1000`). The problem originates from the YAML-to-JSON conversion which does not obey to the defined member type. Fixed by using the go-yaml package (which does not perform such a conversion from YAML to JSON). The go-yaml is used just to unmarshal the name from the original raw byte stream and update the state structure with the proper name string. Signed-off-by: Edward Haas <[email protected]>
It appears @ingydotnet has not committed to this repository in 3 months now, and there are a number of increasingly pressing quality-of-life issues stemming from the lack of YAML 1.2 support. What is the strategy for landing YAML 1.2 support in PyYAML? |
I'm not speaking for everyone, but IMO: it'll land when it's ready. There are no paid contributors or maintainers on this project, and since this is generally considered the "stable" Python YAML parser, if the API changes required to support YAML 1.2 break the world, it's going to be hell for everyone involved (even if we have "well, it's a major release, just deal" to hide behind). @perlpunk has been doing some great work recently to get the functionality in there, and I'm still trying to find some time to really sit down and bash around with it to see what we can do to preserve backward compatibility for at least the common use cases. |
@nitzmahone thanks for that context, and thanks to @perlpunk for all the hard work as well. The main reason I was asking was to see if the release planning process was stuck waiting for input from an absent contributor. Is there anything specific that external contributors can do to speed up the process (PRs that could be accepted for specific features or refactors)? Even an interim release where the 1.2 grammar is available, but off by default, would be terrific. |
That's the current plan. I think defaulting to 1.2, if ever, needs to wait for a while. It would break a lot of stuff =) I'm trying to figure out a syntax with which people can use the 1.2 schema, but without adding too much classes into pyyaml itself. #512 is a draft that adds a lot of classes, but I'm working on top of that branch for an alternative. You say "1.2 grammar". Do you think of something specific in the grammar? Because I'm really only working on the schema. #512 has some details on that. edit: fixed PR number |
Superseded by #555. Edit: Sorry, wanted to close the associated PR instead. |
Hi,
it appears that PyYAML requires a dot in the mantissa as well as a sign in the exponent in order to parse numbers in scientific notation as numbers and otherwise parses them as string:
According to YAML 1.2 spec the dot is optional as in JSON:
Furthermore, JSON makes even the sign in the exponent optional: https://www.json.org/number.gif
Since YAML 1.2 aspires to be a superset of JSON, I believe it should make the sign optional as well.
Others than ran into similar problems:
https://stackoverflow.com/questions/30458977/yaml-loads-5e-6-as-string-and-not-a-number
Best, Thomas
The text was updated successfully, but these errors were encountered: