Clarify Unicode and UTF-8 references #929
@pradyunsg Here are the isolated Unicode and UTF-8 language changes.
Looks good! Two minor nits. Also, should we specify a minimum Unicode version? IIRC, UTF-8 was only introduced in version 2. Most public, free, or simply built-in Unicode support is typically at version 5 (for instance, .NET Framework on Win7) or higher.
For most things, the version wouldn’t matter. Only the introduction of the higher planes (V2) and surrogates (also V2) is relevant, I guess.
Both have been around for ages, so I’m not sure how much it’s worth over-specifying here. Just thinking out loud.
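To make the point about higher planes and surrogates concrete, here is a minimal Python sketch. It is not part of the PR or the specification, and the helper names `is_scalar_value` and `utf8_length` are hypothetical, chosen only for illustration. It shows that code points above U+FFFF need four bytes in UTF-8, while surrogate code points cannot be encoded or decoded as UTF-8 at all.

```python
# Illustrative sketch only; helper names are hypothetical, not from the spec.

def is_scalar_value(code_point: int) -> bool:
    """True for Unicode scalar values: U+0000..U+10FFFF minus surrogates."""
    in_range = 0 <= code_point <= 0x10FFFF
    is_surrogate = 0xD800 <= code_point <= 0xDFFF
    return in_range and not is_surrogate


def utf8_length(code_point: int) -> int:
    """Number of bytes UTF-8 uses for a scalar value (1 to 4)."""
    if not is_scalar_value(code_point):
        raise ValueError(f"U+{code_point:04X} is not a Unicode scalar value")
    return len(chr(code_point).encode("utf-8"))


def is_valid_utf8(data: bytes) -> bool:
    """True if the bytes decode under Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False


# A character from a higher plane (U+1F600) needs four bytes.
assert utf8_length(0x1F600) == 4
# A surrogate code point is not a scalar value...
assert not is_scalar_value(0xD800)
# ...and its would-be three-byte encoding is rejected as invalid UTF-8.
assert not is_valid_utf8(b"\xed\xa0\x80")
```

Python's built-in codec already enforces these rules, so the sketch relies on it rather than reimplementing the byte-level checks.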
The new preamble is very clear, and the rest of the improvements remove ambiguity. Language is hard, especially spec language, and I really do appreciate you taking the time to get this right.
TLDR: LGTM!
Couldn't have done it without you guys, @abelbraaksma @ChristianSi! @pradyunsg Here's the Unicode and UTF-8 language cleanup that was moved out of #924 and greatly refined. I think these changes will improve the specification a lot. What do you think?
@pradyunsg Please review these changes and make a decision. Our work coalesced two months ago, and we sent a reminder to review this at that time. What do you think?
LGTM! This is really nice!
One minor nit-pick again. 😅
Thanks @eksortso!
Very happy this got approved. Considering the discussions that came with it (in the other threads), I'm glad this is in! 👏
Last-minute changes are always dangerous. While it's good this was merged, in the last minute (without anyone being able to re-review it) the sentence
was changed to:
Hmm! So now it seems that every TOML file "will be rejected (preferably)" which was probably not the intent! I'd suggest changing this to:
Or just return to the original wording. @eksortso: Will you open a new PR for that? Or should I?
@ChristianSi I'll open the new PR. Even though @pradyunsg wrote the change, I accepted it thinking that all it was doing was restating the requirement in positive terms. I take responsibility for missing the wrong conjunction and the subsequent change in meaning. The new wording will be as follows. I kept the first MUST clause in its own sentence. The consequences of violating this clause are specified in the second MUST ("otherwise") sentence. Together, these two sentences are semantically identical to the original wording.
This PR applies the changes that clarify Unicode and UTF-8 references, originally made for #924 while considering relaxations on control codes in comments. These are standalone changes, separate from the original scope of #924. Many thanks to @abelbraaksma and @ChristianSi for their work on this.