Consolidation of Glyph Sharing Suggestions (See Issue #179) #98

Closed
kenlunde opened this issue Apr 18, 2015 · 46 comments

Comments

@kenlunde
Contributor

This Issue will be used to consolidate suggestions for glyphs that can be shared across more than one language, with the intent to purge one or more glyphs per code point, which will free up CIDs for accommodating additional glyphs. Please report any future glyph sharing suggestions to this Issue.

@kenlunde kenlunde self-assigned this Apr 18, 2015
@kenlunde
Contributor Author

Issues #92 and #97 are consolidated here.

@tamcy

tamcy commented Apr 21, 2015

Glad to know that SHS 1.002 has been released, thanks for the hard work.

I'd propose to unify the 辶 component so that certain glyphs of CN and JP can be shared.

KR's 辶 always has two dots at the top left corner, while CN's always has one. JP is a mix of the two: its 辶 may have either one or two dots at the top left corner.

Given there is no difference in other components, the two-dot version glyph can be shared among JP and KR. Theoretically this is also true for CN and JP. However, currently the glyphs can't be shared because CN uses a slightly different design for the same component, which doesn't seem necessary.

[Image: jpcn2]

2015/4/23 : updated image

@kenlunde
Contributor Author

Thank you for the suggestion. Also, I just edited the sentence in the first post to this Issue.

@hfhchan

hfhchan commented Apr 21, 2015

I second @tamcy's suggestion. If we compare SimHei and Microsoft YaHei, both of which are built to the PRC standards, it is clear that the design on the left is acceptable (and likely more modern-looking and preferable).

@hfhchan

hfhchan commented Apr 21, 2015

FYI, I have talked with Ms Lu Qin about the stroke connection for components such as 口, 辶 and 又, and she indicated that the exact stroke connection point is deemed a matter of aesthetics, which is likely to be codified in the document for the upcoming registration of the Hong Kong IVD Collection. The main concern for Hong Kong glyphs is only that the type of stroke be correct. For example, 請 may map directly to the CN glyph, because the connection between the first two strokes and the connection point for 口 are deemed aesthetic issues and not requirements. However, the character cannot map directly to the TW glyph, as the first stroke of the bottom-right "月" must be stroke 2 instead of stroke 3. This should dramatically reduce the number of glyphs that need to be redesigned.

@tamcy

tamcy commented Apr 21, 2015

Also
丸, 飄, 凝, 摜 CN -> JP
多 TW -> JP
忍 CN -> KR
牆, 壯, 伴, 光, 心, 乾, 晚, 色, 怎, 慣, 馮, 載, 踢 CN/TW -> JP
抱, 勤 CN/TW -> KR

@jungshik

I third tamcy's suggestion about 辶. As for Korean, as long as the shape is consistent across characters (that is, every character with that radical takes the same shape in KR font instances), except for pairs of characters for which the only difference is 'one dot vs. two dots' in that radical, either form should be fine.

@hfhchan

hfhchan commented Apr 21, 2015

For one dot + horizontal line, e.g. 應, 言, etc., I think having the dot just touch the horizontal line is a good design for unifying the CN and TW (and possibly HK) glyphs. Among Song designs, the TW MOE Song design must always touch. The PRC Song design does not usually touch unless there is not enough space. I think we can get away with using dot "touch" line instead of dot "separated from" line or dot "join" line.

@RyanChng

For 透, I think it may not be that simple - the 乃 component is a bit different in there.

@tamcy

tamcy commented Apr 22, 2015

@RyanChng You are correct, I'll revise the screenshot when I have time (EDIT: updated).

@kenlunde
Contributor Author

While we very much appreciate any and all feedback about the extent to which glyphs can be shared across the supported languages, the decision will ultimately be made by the typeface designer, Ryoko Nishizuka, in consultation with Changzhou SinoType's typeface designer, and to some extent, with me.

I am planning to begin this process by preparing a list of candidate glyphs, in an effort to reduce the size of the candidate list. I will separately list particular components for consideration. These will be used as Ryoko's guide.

Also, please try to refrain from using this "issue" as a discussion thread. If a discussion is necessary, I suggest it be done elsewhere, and the results posted to this issue.

@jimmymasaru

Adobe-Japan1-6 contains many variants, but some of them are designed for serif fonts, e.g. 父, CID+3541 vs CID+13497. I found that Source Han Sans also includes these glyphs, and they have the same design. Is it possible to remove them, since they look like duplicates?

@kenlunde
Contributor Author

kenlunde commented Jun 6, 2015

@jimmymasaru: We have a requirement to support the Adobe-Japan1 IVD collection, which in turn requires unique CIDs. We are fully aware that some of the glyph distinctions cannot be represented in a sans serif (aka gothic) typeface design.

BTW, your comment would also apply to Kozuka Gothic.

@jimmymasaru

U+5840 塀
U+584F 塏
I suggest that those two glyphs, as well as others containing 䒑, be unified, because the differences are purely a matter of design. I checked Meiryo and Microsoft YaHei and found that the second stroke may be either fully linked or not fully linked with the third stroke.

U+5852 塒
The JP glyph of this character can be unified with the CN one, because 土 and 寸 are linked in other characters such as 寺時侍持.

㪽㪿㫀
Can the glyphs containing 斤 be unified between JP and CN?

All the characters listed above have the same glyphs in Hiragino Kaku Gothic and Hiragino Sans GB.

@kenlunde
Contributor Author

Thank you for the continued suggestions. I am starting to ramp up preparations for Version 2.000, which will take a few months to complete due to the various things that I'd like to accomplish, one of which is better support for Hong Kong. This particular glyph-sharing issue is important because the intent is to free up enough CIDs to make the Hong Kong support possible.

@tamcy

tamcy commented Jun 14, 2015

I would like to add that the case of 斤 can also be applied to characters of same or similar strokes, like below:

[Image: shs2]
(Top: JP, Bottom: TW)

后臼興 (also implies 盾垢揑舁) can be shared if unified.
學段劉 are examples of non-shareable but affected glyphs (but 壆覺 can be shared).

@kenlunde
Contributor Author

To demonstrate progress on this issue, and as the first real step toward Version 2.000, I spent a solid three days last week compiling glyph pairs that can potentially be shared across languages, and came up with 832 candidates. (I use the term "candidate" because the final determination is made by the designers, but I am the best person to come up with a list of candidates.) I plan to spend part of this weekend and Monday finding additional glyph pairs, though I suspect that the figure will be one or two digits at most.

The designers will use the materials that I am preparing, which show the pairs side-by-side at the weight extremes (ExtraLight and Heavy) and at an intermediate weight (Medium), and also overlaid at the Regular weight, to determine which glyph pairs can be reduced to a single glyph.

Remember that the primary purpose of this particular exercise is to free up enough CIDs to provide proper, or at least adequate, support for Hong Kong SCS (HKSCS). (Though Hong Kong is moving toward a new standard abbreviated HKCS, and thus represents somewhat of a moving target.)

Just FYI, the differences shown in @tamcy's 2015-04-21 and 2015-06-14 posts above will not be shared, because these language-based differences were intentionally established at the designers' discretion.

@kenlunde
Contributor Author

@jimmymasaru: About your 2015-06-14 post, the CN and JP glyphs for 塀, 塏, 塒, 寺, and 持 are already among the candidates for sharing, which were captured by the "fine-tooth comb" work that I did last week. I am simply confirming that they were detected as candidates through my systematic efforts. Note that 時 and 侍 are not candidates, because JP and CN already share the glyphs (the former is a CN glyph that JP uses, and the latter is a JP glyph that CN uses).

@tamcy

tamcy commented May 1, 2016

Not sure whether the following glyphs are already on the list, but I'll post them anyway.

U+51BD  冽:     11410 (J/K) = 11412 (T/C)
U+52A0  加:     11783 (J/K) = 11784 (T/C)
U+53FB  叻:     12356 (J/K) = 12357 (T/C)
U+5420  吠:     12409 (J/K) = 12410 (T/C)
U+5FCC  忌:     17914 (J/K) = 17915 (T/C)
U+6028  怨:     18062 (J/K) = 18064 (T)
U+606A  恪:     18169 (J/K) = 18170 (T/C)
U+60B2  悲:     18288 (J/K) = 18290 (T)
U+617C  慼:     18656 (J/K) = 18657 (C)
U+64BC  撼:     20061 (J/K) = 20062 (C)
U+67F1  柱:     21464 (J/K) = 21465 (T/C)
U+6BB2  殲:     23207 (J/K) = 23209 (T)
U+6CD7  泗:     23684 (J/K) = 23685 (T/C)
U+7199  熙:     25841 (J) = 25843 (C)  (Added 2 May)
U+74E6  瓦:     27313 (J/K) = 27315 (T)
U+7765  睥:     28445 (J/K) = 28446 (T/C)
U+7BD9  篙:     30527 (J/K) = 30528 (C)
U+8304  茄:     34230 (J/K) = 34231 (C)
U+9D60  鵠:     61949 (K) = 46386 (T/C)
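
A list in this shape is easy to process mechanically. The following is only a sketch of how such rows could be parsed for cross-checking against another candidate list; the record type, the regular expression, and the "one CID freed per accepted pair" figure are assumptions made for illustration, not part of the project's tooling, and the final decision on each pair rests with the designers.

```python
import re
from typing import List, NamedTuple, Tuple


class SharingCandidate(NamedTuple):
    """One row of the list: a code point and the two CIDs proposed for sharing."""
    codepoint: int
    char: str
    cid_a: int
    regions_a: Tuple[str, ...]
    cid_b: int
    regions_b: Tuple[str, ...]


# A row looks like: "U+51BD  冽:     11410 (J/K) = 11412 (T/C)"
# Trailing notes such as "(Added 2 May)" are ignored because match() only
# anchors at the start of the line.
ROW_RE = re.compile(
    r"U\+([0-9A-F]{4,6})\s+(\S):\s+(\d+)\s+\(([A-Z/]+)\)\s+=\s+(\d+)\s+\(([A-Z/]+)\)"
)

# A few rows copied verbatim from the list above.
SAMPLE = """\
U+51BD  冽:     11410 (J/K) = 11412 (T/C)
U+6028  怨:     18062 (J/K) = 18064 (T)
U+7199  熙:     25841 (J) = 25843 (C)  (Added 2 May)
U+9D60  鵠:     61949 (K) = 46386 (T/C)
"""


def parse_candidates(text: str) -> List[SharingCandidate]:
    """Parse rows into records, silently skipping lines that do not match."""
    found = []
    for line in text.splitlines():
        m = ROW_RE.match(line.strip())
        if m is None:
            continue
        cp, char, cid_a, reg_a, cid_b, reg_b = m.groups()
        found.append(
            SharingCandidate(
                int(cp, 16), char,
                int(cid_a), tuple(reg_a.split("/")),
                int(cid_b), tuple(reg_b.split("/")),
            )
        )
    return found


candidates = parse_candidates(SAMPLE)
# Assumption: every accepted pair collapses to a single glyph, freeing one CID.
max_cids_freed = len(candidates)

for c in candidates:
    print(f"U+{c.codepoint:04X} {c.char}: CID {c.cid_a} {c.regions_a} "
          f"vs CID {c.cid_b} {c.regions_b}")
print("upper bound on CIDs freed:", max_cids_freed)
```

The count is an upper bound only: a pair that the designers reject frees nothing.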

@kenlunde
Contributor Author

kenlunde commented May 4, 2016

@tamcy: Thank you. I will check this list against my current notes, but at least U+7BD9 篙 cannot be unified between J/K and C due to its seventh stroke.

@kenlunde
Contributor Author

@tamcy: I finally had time to compare your list to my list of sharing candidates. All of them were included in my own data, except for U+7BD9 篙 that cannot be shared for reasons explained above. It was reassuring to confirm that your list was a pure subset of what I independently came up with.

@tamcy

tamcy commented Jun 10, 2016

U+56CD 囍: 13554 (J/K) = 13555 (T/C)
U+50D6 僖: 11011 (J/K) = 11012 (T/C)

Actually, there is a subtle difference between the two sets, which is how "口" touches the last two strokes of the "壴" component. But this should be a matter of designer's preference.

@kenlunde
Contributor Author

@tamcy: Both of these characters are included in my list of sharing candidates.

@Rameshdaspam

@kenlunde: How can I convert the Simplified Chinese fonts to .eot?

@kenlunde
Contributor Author

kenlunde commented Aug 1, 2016

@Rameshdaspam: Converting to EOT, using the command-line ttf2eot tool, first requires a TTF version of the font, which is a DIY affair. We have no plans to deploy these fonts as TTFs.

@Rameshdaspam

Actually, we need web fonts for Simplified Chinese:
SourceHanSansSC
SourceHanSansHWSC

@kenlunde
Contributor Author

kenlunde commented Aug 1, 2016

@Rameshdaspam: If the fonts are not in a format that you can use, you will need to convert them into the desired format. I do not know enough about your request to advise further, other than to note that supplying EOTs is a non-starter due to the TTF requirement.

@acuteaccent

You don't need to assign two different glyphs for U+115F and U+1160, as they are just fillers. In fact, CID+461 is currently shared by U+1160 and U+3164. Instead of this, you can use CID+460 for U+115F, U+1160, and U+3164; and use CID+461 for something else.

@acuteaccent

acuteaccent commented Nov 7, 2016

Also, you don't actually need these:
63752 Hangul OldHangul-LeadingConsonants uni115F.ljmo01
63877 Hangul OldHangul-LeadingConsonants uni115F.ljmo02
64002 Hangul OldHangul-LeadingConsonants uni115F.ljmo03
64127 Hangul OldHangul-LeadingConsonants uni115F.ljmo04
64252 Hangul OldHangul-LeadingConsonants uni115F.ljmo05
64377 Hangul OldHangul-LeadingConsonants uni115F.ljmo06
64502 Hangul OldHangul-Vowels uni1160.vjmo02

My understanding is that the glyphs with .ljmo0[1-6] at the end have the width of 920, and ones with .vjmo0[1-2] or .tjmo0[1-4] at the end have the width of zero.

For the first six of the above, you can just simply use CID+740 (uni115F; nominal form of U+115F), as CID+740 is already a spacing glyph. You don't need seven (including the nominal form) 920-width blank glyphs; one is good enough.
For the last one, you can just simply use CID+64407 (uni1160.vjmo01). You don't need two zero-width glyphs with nothing in them; one is good enough.

You can save seven glyphs (eight if you count the comment right above this one).
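
The proposed substitution can be written down as a simple name-to-name mapping. This is a minimal sketch using only the glyph names quoted in this comment; the helper functions and the small example glyph set are hypothetical, and the real change would also have to be reflected in the GSUB features.

```python
# Redundant blank glyphs and the existing glyph that could stand in for each.
# Glyph names come from the comment above; the mapping itself is illustrative.
REDUNDANT_TO_REPLACEMENT = {
    # Six 920-width blanks -> the nominal (spacing) form of U+115F.
    **{f"uni115F.ljmo0{i}": "uni115F" for i in range(1, 7)},
    # Second zero-width blank -> the first zero-width blank.
    "uni1160.vjmo02": "uni1160.vjmo01",
}


def remap(glyph_names):
    """Replace each redundant glyph name with its proposed stand-in."""
    return [REDUNDANT_TO_REPLACEMENT.get(name, name) for name in glyph_names]


def glyphs_saved(glyph_names):
    """Number of distinct glyphs that disappear after remapping."""
    return len(set(glyph_names)) - len(set(remap(glyph_names)))


# Hypothetical glyph set: the seven redundant glyphs plus their two stand-ins.
glyph_set = list(REDUNDANT_TO_REPLACEMENT) + ["uni115F", "uni1160.vjmo01"]
print(glyphs_saved(glyph_set))
```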

@kenlunde
Contributor Author

kenlunde commented Nov 7, 2016

While I have a preference to keep these glyphs, because their presence makes debugging the three GSUB features an easier process, I am willing to build a test font that includes only the nominal and combining forms, along with the space (U+0020), but excludes the eight glyphs that you indicated above (and substitutes them with the appropriate glyphs in both the 'cmap' table and GSUB features).

@kenlunde
Contributor Author

kenlunde commented Nov 7, 2016

@acuteaccent: If I were to build such a test font, would you be willing to test it?

@acuteaccent

@kenlunde Yes.

@kenlunde
Contributor Author

kenlunde commented Nov 8, 2016

Excellent. This will be my weekend project.

@kenlunde
Contributor Author

kenlunde commented Nov 30, 2016

@acuteaccent: Apologies for the delay. I spent the evening building test fonts. The one named CombiningJamoTestAll-ExtraLight.otf includes only the glyphs necessary for combining jamo (the nominal and combining forms of 1100-11FF, A960-A97C, D7B0-D7C6, and D7CB-D7FB) plus U+0020 (space). The one named CombiningJamoTest-ExtraLight.otf is the same, but excludes the eight glyphs mentioned above (uni1160, uni115F.ljmo01, uni115F.ljmo02, uni115F.ljmo03, uni115F.ljmo04, uni115F.ljmo05, uni115F.ljmo06, and uni1160.vjmo02), and modifies the 'cmap' table and GSUB features accordingly. Please test at your earliest convenience.

@kenlunde
Contributor Author

In terms of a test file, please use this one, which includes all 30,222 two- and three-character sequences (among the possible 1,638,750) that include U+115F or U+1160.
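
Both figures can be reproduced with a little arithmetic. The sketch below assumes that a sequence is either leading consonant + vowel or leading consonant + vowel + trailing consonant, and that the 1100-11FF range splits into the three classes as in the Unicode Hangul Jamo block (leading 1100-115F, vowel 1160-11A7, trailing 11A8-11FF); the variable names are chosen for illustration.

```python
# Inclusive code point ranges per jamo class (see the assumptions above).
LEADING = [(0x1100, 0x115F), (0xA960, 0xA97C)]
VOWEL = [(0x1160, 0x11A7), (0xD7B0, 0xD7C6)]
TRAILING = [(0x11A8, 0x11FF), (0xD7CB, 0xD7FB)]


def size(ranges):
    """Total number of code points covered by a list of inclusive ranges."""
    return sum(last - first + 1 for first, last in ranges)


n_l, n_v, n_t = size(LEADING), size(VOWEL), size(TRAILING)

# All two-character (L+V) and three-character (L+V+T) sequences.
total = n_l * n_v + n_l * n_v * n_t

# L+V pairs containing a filler: L is U+115F (any vowel) or V is U+1160
# (any leading consonant), counting the filler+filler pair only once.
lv_with_filler = n_v + n_l - 1

# Each such pair can stand alone or take any trailing consonant.
with_filler = lv_with_filler * (1 + n_t)

print(n_l, n_v, n_t)       # 125 95 137
print(total, with_filler)  # 1638750 30222
```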

While I think that we can get away with removing uni115F.ljmo0[1-6] and uni1160.vjmo02, which will save seven glyphs, I think that we need to keep uni1160. Let me explain. When rendering, my initial testing suggests that it is okay to use the same glyph for U+115F and U+1160, but when a PDF is created, any instance of U+1160 in the original text will be converted to U+115F when the text is copied from the PDF. I will build a third test font later this morning that keeps uni1160.
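
The copy-and-paste behavior follows from the fact that a glyph shared by two code points cannot be mapped back to both. The following is a simplified model, not a description of how any particular PDF producer works: it assumes the reverse (glyph-to-Unicode) table keeps the lowest code point for each glyph, which is consistent with the U+1160 to U+115F conversion described above. The function names and glyph-name strings are illustrative.

```python
def glyph_to_unicode(cmap):
    """Build a reverse table, keeping the lowest code point per glyph (assumed)."""
    reverse = {}
    for codepoint in sorted(cmap):
        reverse.setdefault(cmap[codepoint], codepoint)
    return reverse


def round_trip(text, cmap):
    """Map text to glyphs and back, as a copy from rendered output might."""
    reverse = glyph_to_unicode(cmap)
    return "".join(chr(reverse[cmap[ord(ch)]]) for ch in text)


# U+115F and U+1160 share one glyph: U+1160 does not survive the round trip.
shared = {0x115F: "uni115F", 0x1160: "uni115F"}
# Each filler keeps its own nominal glyph: the round trip is lossless.
separate = {0x115F: "uni115F", 0x1160: "uni1160"}

print(f"U+{ord(round_trip(chr(0x1160), shared)):04X}")    # U+115F
print(f"U+{ord(round_trip(chr(0x1160), separate)):04X}")  # U+1160
```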

@kenlunde
Contributor Author

The test fonts link above now corresponds to a ZIP file that contains a third test font, CombiningJamoTest1160-ExtraLight.otf, which is identical to CombiningJamoTest-ExtraLight.otf except that it retains the nominal glyph for U+1160 (uni1160), both in terms of the 'cmap' table and GSUB features. Only seven glyphs—uni115F.ljmo01, uni115F.ljmo02, uni115F.ljmo03, uni115F.ljmo04, uni115F.ljmo05, uni115F.ljmo06, and uni1160.vjmo02—have been removed.

@hfhchan

hfhchan commented Apr 7, 2017

image
CN glyphs for U+611F, U+61BE, U+64BC can be shared with the JP glyphs, since that is done for U+8F57 anyway.

Personally, the JP one feels "more Chinese" than CN with its asymmetric balance.

@hfhchan

hfhchan commented Apr 7, 2017

Incidentally, U+6FB8, U+9C64, U+9CE1, U+3673, U+40ED, U+425E, U+4717, and U+4AF2 need to be redesigned if the JP version is preferred.

@hfhchan

hfhchan commented Apr 7, 2017

image
U+501F CN/TW === JP/KR

@hfhchan

hfhchan commented Apr 7, 2017

image
U+503B TW === JP/KR

@hfhchan

hfhchan commented Apr 8, 2017

U+4E51 U+4E5A

image
The differences between these two characters in SHSans are unnecessary compared with those in SHSerif.

@hfhchan

hfhchan commented Apr 8, 2017

Per the following two posts on 心,
#98 (comment)
#98 (comment)

it seems that the difference between 心 in CN/TW and JP/KR is a widespread phenomenon rather than one confined to compounds containing 感.

My personal opinion is that the difference in the placement of the 點 above 豎彎鉤 is much more minor than the stroke joining of 又 and 叉. I think the former is an aesthetic design issue only, while the latter is a stroke-level mandatory requirement by the MOE. I would personally prefer the glyphs to be allocated to solve the latter instead of the former.

@hfhchan

hfhchan commented Apr 8, 2017

image

The CN/TW glyph of U+4E1E should use the JP/KR glyph.

image

As far as the code charts are concerned, the starting position of the 捺 (5th stroke) should be the 豎鉤 (2nd stroke) for TW, which is markedly more similar to the JP/KR glyph. The CN glyph looks similar to the one in the code chart though.

image

For PMingLiU and DFKai-SB, the starting position of the 捺 (5th stroke) is also the 豎鉤 (2nd stroke). For Microsoft Jhenghei, the starting point is exactly at the intersection of 橫折 (1st stroke) and 豎鉤 (2nd stroke).

image

The situation for CN fonts is similar. For Kaiti and SimSun, the starting position of the 捺 (5th stroke) is also the 豎鉤 (2nd stroke). For Microsoft Yahei and SimHei, the starting point is exactly at the intersection of 橫折 (1st stroke) and 豎鉤 (2nd stroke).

Therefore, both the CN/TW glyphs can be safely remapped from the JP/KR glyph.

@acuteaccent

(I am writing this comment to show that I did not ignore Ken's request. I already responded to him via email, but forgot to write a comment here.)

#98 (comment)
About the test font without the seven blank glyphs: There was no problem on my end. Everything was okay.

@kenlunde
Contributor Author

kenlunde commented Apr 9, 2017

@acuteaccent: We also confirmed this via Source Han Serif.

@kenlunde kenlunde changed the title from "Consolidation of Glyph Sharing Suggestions" to "Consolidation of Glyph Sharing Suggestions (TO CLOSE)" May 26, 2017
@kenlunde kenlunde changed the title from "Consolidation of Glyph Sharing Suggestions (TO CLOSE)" to "Consolidation of Glyph Sharing Suggestions (See Issue #179)" May 26, 2017
@kenlunde
Contributor Author

Consolidated with Issue #179.

@adobe-fonts adobe-fonts locked as resolved and limited conversation to collaborators Nov 20, 2018