-
-
Notifications
You must be signed in to change notification settings - Fork 60
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Force double quotes on integers and null in string #112
Conversation
Hm, I think that it was to avoid |
Yes, that's what I thought. Any suggestion to fix? Perhaps parsing a substring would be sufficient. |
I just ran into an issue that was caused by this issue yesterday. To my utter amazement this PR had just been opened a few hours before. I want to shoutout @jnystad 🥇 your timing couldn't have been more perfect. I was literally cloning down the repo to look into fixing this issue when I saw this PR. I'd like to suggest that this PR not wait for any further performance enhancement modifications since it addresses the main issue of YAML non-compliance. Instead a new PR be created to improve the performance (if possible) to investigate those concerns. |
This behavior has been around for the past 5 years, so there is no urgency to land this PR.
Could you run a benchmark before/after using BenchmarkDotNet on a significant yml file to see the impact? |
Given the customizable nature of SharpYaml, and specifically since Schema in SerializerContext can be overridden I think any optimizations added here may come into conflict with custom Schema handling and therefore do not belong here. Also, I'm not sure I have the overview of the library to be certain I'm creating useful performance tests. A preliminary investigation using Stopwatch and loops indicate that the actual Serialize step has a negligible performance cost compared to initializing a new Serializer (i.e. creating a new Serializer and running Serialize takes ~500ms on a large object while the next thousand with the same Serializer takes ~5ms in total). Also, I don't think the regexes used by the real parser are significantly slower than any other check one can do, at least not in a scale that matters. They probably return fast enough regardless of string length considering the limited matches they produce. But as I said, I don't know the internal mechanics enough to determine if I'm actually testing this properly (as in, is there some caching where feeding the same deserialized object to the same Serializer somehow skips the actual serialization?). Regarding the urgency, I wonder how existing users of SharpYaml and YamlDotNet (which has the same flaw with at least integer values) live with this. Perhaps they override the default behaviour (like I have done where I'm currently using SharpYaml until this fix is merged). |
Afair, It is recommended to use a Serializer per thread and keep it around, as it is storing metadata information that you really don't want to recompute.
Right, this could be also further optimized with .NET 7+ regex optimizations in the future. Anyway, thanks for the fix, I'm gonna merge it. The regression in perf might be negligeable and if someone notice it, we will have a candidate to fix it 😅 |
Thanks! |
Fixes #106
Also fixes quoting for integer values in strings.
I guess the double.TryParse was an optimization to avoid parsing long string values if unnecessary? If so, this might impact performance when serializing a lot of longer string values. Might add similar parsing attempts for bools and integers (null check is trivial), but not sure if C# parsers match the YAML spec exactly.