Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refine vertical punctuation logic based on Unicode standard #3608

Merged
merged 2 commits into from
Nov 14, 2016

Conversation

1ec5
Copy link
Contributor

@1ec5 1ec5 commented Nov 14, 2016

The list of Unicode character blocks is now based on the official Unicode 9.0 character database, whereas the previous table was based on an older Unicode standard.

The logic distinguishing upright, rotated, and neutral characters has been refined based on Unicode Technical Report #50 (proposed revision 16, which corresponds with Unicode 9.0).

Here are some example effects of this PR:

  • In a vertical “地圖©地圖盒子”, “©” is displayed upright instead of rotated.
  • In a vertical “地……圖”, “…” is displayed as an upright “︙” instead of a rotated “…”, so it is centered horizontally and no longer runs into the following CJK character.
  • In a vertical “地圖!”, “!” is displayed as an upright “︕”.

To keep things simple, the Letterlike Symbols and Number Forms blocks as a whole are considered neutrally oriented. We can refine that portion further if this simplification turns out to be problematic.

Depends on mapbox/mapbox-gl-test-suite#169 (although technically no test suite change was required because the differences were so minor).

/cc @lucaswoj @nickidlugash

@1ec5 1ec5 added the bug 🐞 label Nov 14, 2016
@1ec5 1ec5 self-assigned this Nov 14, 2016
Base the Unicode character blocks off of the official Unicode 9.0 character database.

Refined the logic distinguishing upright, rotated, and neutral characters based on Unicode Technical Report 50 (with some simplifications). In particular, not everything in the General Punctuation block is treated as having neutral orientation; instead, the vertical punctuation table is consulted.
@1ec5 1ec5 force-pushed the 1ec5-vertical-punctuation-tr branch from d62da3e to 0dc9bad Compare November 14, 2016 07:15
Copy link
Contributor

@lucaswoj lucaswoj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing. Thank you @1ec5.

@@ -64,6 +65,7 @@ module.exports = {
// 'Batak': (char) => char >= 0x1BC0 && char <= 0x1BFF,
// 'Lepcha': (char) => char >= 0x1C00 && char <= 0x1C4F,
// 'Ol Chiki': (char) => char >= 0x1C50 && char <= 0x1C7F,
// 'Cyrillic Extended-C': (char) => char >= 0x1C80 && char <= 0x1C8F,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any idea how I might've missed this and other blocks the first time around?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block is new to Unicode 9.0, whereas the file you consulted was based on Unicode 8.0. I've updated this file to point to the Unicode Character Database, which will make it easier for us to keep this file up-to-date in the future (by diffing between Unicode releases).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

//if (isChar['CJK Unified Ideographs Extension C'](char)) return true;
//if (isChar['CJK Unified Ideographs Extension D'](char)) return true;
//if (isChar['CJK Unified Ideographs Extension E'](char)) return true;
//if (isChar['CJK Compatibility Ideographs Supplement'](char)) return true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer if this code block were copied into mapbox/DEPRECATED-mapbox-gl#29 than left commented out in the codebase.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants