Consolidation of Glyph Sharing Suggestions (See Issue #179) #98

kenlunde · 2015-04-18T16:11:29Z

This Issue will be used to consolidate suggestions for glyphs that can be shared across more than one language, with the intent to purge one or more glyphs per code point, which will free up CIDs for accommodating additional glyphs. Please report any future glyph sharing suggestions to this Issue.

kenlunde · 2015-04-21T03:01:25Z

Issues #92 and #97 are consolidated here.

tamcy · 2015-04-21T14:03:55Z

Glad to know that SHS 1.002 has been released, thanks for the hard work.

I'd propose to unify the 辶 component so that certain glyphs of CN and JP can be shared.

KR's 辶 always has two dots at the top-left corner, while CN's always has one. JP is a mix of the two: its 辶 may have one or two dots at the top-left corner.

Given there is no difference in other components, the two-dot version glyph can be shared among JP and KR. Theoretically this is also true for CN and JP. However, currently the glyphs can't be shared because CN uses a slightly different design for the same component, which doesn't seem necessary.

2015/4/23 : updated image

kenlunde · 2015-04-21T14:20:46Z

Thank you for the suggestion. Also, I just edited the sentence in the first post to this Issue.

hfhchan · 2015-04-21T16:07:13Z

I second @tamcy's suggestion. If we compare SimHei and Microsoft YaHei, which are both fonts built to the PRC standards, it is obvious that using the design on the left is acceptable (and likely more modern and preferable).

hfhchan · 2015-04-21T16:16:18Z

FYI, I have talked with Ms. Lu Qin about stroke connections in components such as 口, 辶 and 又, and she indicated that the exact stroke connection point is deemed a matter of aesthetics, which is likely to be codified in the document for the upcoming registration of the Hong Kong IVD Collection. The main concern for the Hong Kong glyphs is only that the type of each stroke is correct. For example, 請 may map directly to the CN glyph, because the connection between the first two strokes and the connection points of 口 are deemed aesthetic issues rather than requirements. However, the character cannot map directly to the TW glyph, because the first stroke of the bottom-right "月" must be stroke 2 instead of stroke 3. This should dramatically reduce the number of glyphs that need to be redesigned.

tamcy · 2015-04-21T16:46:11Z

Also
丸, 飄, 凝, 摜 CN -> JP
多 TW -> JP
忍 CN -> KR
牆, 壯, 伴, 光, 心, 乾, 晚, 色, 怎, 慣, 馮, 載, 踢 CN/TW -> JP
抱, 勤 CN/TW -> KR

jungshik · 2015-04-21T18:19:38Z

I third tamcy's suggestion about 辶. As for Korean, either form should be fine as long as it is consistent across characters (that is, every character with that radical takes the same shape in KR font instances), except for pairs of characters for which the only difference is 'one dot vs two dots' in that radical.

hfhchan · 2015-04-21T19:03:02Z

For a dot above a horizontal line, e.g. 應, 言, etc., I think having the dot just touch the horizontal line is a good design for unifying the CN and TW (and possibly HK) glyphs. In the TW MOE Song design, the dot must always touch. In the PRC Song design, the dot does not usually touch unless there is not enough space. I think we can get away with using dot "touch" line instead of dot "separated from" line or dot "join" line.

RyanChng · 2015-04-22T02:37:25Z

For 透, I think it may not be that simple - the 乃 component is a bit different in there.

tamcy · 2015-04-22T17:15:41Z

@RyanChng You are correct, I'll revise the screenshot when I have time (EDIT: updated).

kenlunde · 2015-04-24T15:03:53Z

While we very much appreciate any and all feedback about the extent to which glyphs can be shared across the supported languages, the decision will ultimately be made by the typeface designer, Ryoko Nishizuka, in consultation with Changzhou SinoType's typeface designer, and to some extent, with me.

I am planning to begin this process by preparing a list of candidate glyphs, in an effort to reduce the size of the candidate list. I will separately list particular components for consideration. These will be used as Ryoko's guide.

Also, please try to refrain from using this "issue" as a discussion thread. If a discussion is necessary, I suggest it be done elsewhere, and the results posted to this issue.

jimmymasaru · 2015-06-06T00:46:49Z

Adobe-Japan1-6 contains many variants, but some of them are designed for serif fonts, e.g. 父, CID+3541 vs CID+13497. I found that Source Han Sans also includes these glyphs, and that they have the same design. Is it possible to remove them, since they look like duplicates?

kenlunde · 2015-06-06T02:07:57Z

@jimmymasaru: We have a requirement to support the Adobe-Japan1 IVD collection, which means having unique CIDs. We are fully aware that some of the glyph distinctions cannot be represented in a sans serif (aka gothic) typeface design.

BTW, your comment would also apply to Kozuka Gothic.

jimmymasaru · 2015-06-14T10:41:52Z

U+5840 塀
U+584F 塏
I suggest that those two glyphs, as well as others containing 䒑, can be unified because the differences are all design differences. I checked Meiryo and Microsoft YaHei and found that the second stroke may or may not be fully linked with the third stroke.

U+5852 塒
The JP glyph of this character can be unified with the CN one, because the 土 and 寸 are linked in other characters like 寺時侍持.

㪽㪿㫀
Can the glyphs containing 斤 be unified between JP and CN?

All the characters listed above have the same glyphs in Hiragino Kaku Gothic and Hiragino Sans GB.

kenlunde · 2015-06-14T12:58:03Z

Thank you for the continued suggestions. I am starting to ramp up preparations for Version 2.000, which will take a few months to complete due to the various things that I'd like to accomplish, one of which is better support for Hong Kong. This particular glyph-sharing issue is important because the intent is to free up enough CIDs to make the Hong Kong support possible.

tamcy · 2015-06-14T17:45:48Z

I would like to add that the case of 斤 can also be applied to characters with the same or similar strokes, like below:

(Top: JP, Bottom: TW)

后臼興 (also implies 盾垢揑舁) can be shared if unified.
學段劉 are examples of non-shareable but affected glyphs (but 壆覺 can be shared).

kenlunde · 2015-07-18T13:23:48Z

To demonstrate progress on this issue, and as the first real step toward Version 2.000, I spent a solid three days last week compiling glyph pairs that can potentially be shared across languages, and came up with 832 candidates. (I use the term "candidate" because the final determination is made by the designers, but I am the best person to come up with a list of candidates.) I plan to spend part of this weekend and Monday to find additional glyph pairs, though I suspect that the figure will be one or two digits at most.

The designers will use the materials that I am preparing, which show the pairs side-by-side at the weight extremes (ExtraLight and Heavy) and at an intermediate weight (Medium), and also overlaid at the Regular weight, to determine which glyph pairs can be reduced to a single glyph.

Remember that the primary purpose of this particular exercise is to free up enough CIDs to provide proper, or at least adequate support, for Hong Kong SCS (HKSCS). (Though Hong Kong is moving toward a new standard abbreviated HKCS, and thus represents somewhat of a moving target.)

Just FYI, the differences shown in @tamcy's 2015-04-21 and 2015-06-14 posts above will not be shared, because these language-based differences were intentionally established at the designers' discretion.

kenlunde · 2015-07-18T13:30:58Z

@jimmymasaru: About your 2015-06-14 post, the CN and JP glyphs for 塀, 塏, 塒, 寺, and 持 are already among the candidates for sharing, which were captured by the "fine-tooth comb" work that I did last week. I am simply confirming that they were detected as candidates through my systematic efforts. Note that 時 and 侍 are not candidates, because JP and CN already share the glyphs (the former is a CN glyph that JP uses, and the latter is a JP glyph that CN uses).

tamcy · 2016-05-01T04:55:27Z

Not sure whether the following glyphs are already on the list, but I'll post them anyway.

U+51BD  冽:     11410 (J/K) = 11412 (T/C)
U+52A0  加:     11783 (J/K) = 11784 (T/C)
U+53FB  叻:     12356 (J/K) = 12357 (T/C)
U+5420  吠:     12409 (J/K) = 12410 (T/C)
U+5FCC  忌:     17914 (J/K) = 17915 (T/C)
U+6028  怨:     18062 (J/K) = 18064 (T)
U+606A  恪:     18169 (J/K) = 18170 (T/C)
U+60B2  悲:     18288 (J/K) = 18290 (T)
U+617C  慼:     18656 (J/K) = 18657 (C)
U+64BC  撼:     20061 (J/K) = 20062 (C)
U+67F1  柱:     21464 (J/K) = 21465 (T/C)
U+6BB2  殲:     23207 (J/K) = 23209 (T)
U+6CD7  泗:     23684 (J/K) = 23685 (T/C)
U+7199  熙:     25841 (J) = 25843 (C)  (Added 2 May)
U+74E6  瓦:     27313 (J/K) = 27315 (T)
U+7765  睥:     28445 (J/K) = 28446 (T/C)
U+7BD9  篙:     30527 (J/K) = 30528 (C)
U+8304  茄:     34230 (J/K) = 34231 (C)
U+9D60  鵠:     61949 (K) = 46386 (T/C)
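
The pair notation used in the list above is regular enough to check mechanically against another candidate list. The following is a minimal sketch, assuming each line has the shape `U+XXXX char: CID (langs) = CID (langs)`; the function name `parse_pair` and the `left`/`right` field names are hypothetical and are not part of any Source Han Sans tooling.

```python
import re

# Matches lines such as "U+52A0  加:     11783 (J/K) = 11784 (T/C)".
# Trailing notes such as "(Added 2 May)" are ignored.
PAIR_RE = re.compile(
    r"U\+(?P<cp>[0-9A-F]{4,6})\s+(?P<char>\S)\s*:\s*"
    r"(?P<cid_a>\d+)\s*\((?P<langs_a>[A-Z/]+)\)\s*=\s*"
    r"(?P<cid_b>\d+)\s*\((?P<langs_b>[A-Z/]+)\)"
)


def parse_pair(line):
    """Return the code point, character, and both (CID, languages) sides, or None."""
    m = PAIR_RE.search(line)
    if m is None:
        return None
    return {
        "codepoint": int(m["cp"], 16),
        "char": m["char"],
        "left": (int(m["cid_a"]), m["langs_a"].split("/")),
        "right": (int(m["cid_b"]), m["langs_b"].split("/")),
    }


pair = parse_pair("U+52A0  加:     11783 (J/K) = 11784 (T/C)")
print(pair["left"], pair["right"])
```

Comparing `chr(pair["codepoint"])` with `pair["char"]` is a cheap way to catch a mistyped code point before a list is posted.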

kenlunde · 2016-05-04T21:27:27Z

@tamcy: Thank you. I will check this list against my current notes, but at least U+7BD9 篙 cannot be unified between J/K and C due to its seventh stroke.

kenlunde · 2016-05-25T03:21:50Z

@tamcy: I finally had time to compare your list to my list of sharing candidates. All of them were included in my own data, except for U+7BD9 篙, which cannot be shared for the reason explained above. It was reassuring to confirm that your list was a pure subset of what I independently came up with.

tamcy · 2016-06-10T14:18:19Z

U+56CD 囍: 13554 (J/K) = 13555 (T/C)
U+50D6 僖: 11011 (J/K) = 11012 (T/C)

Actually, there is a subtle difference between the two sets, which is how "口" touches the last two strokes of the "壴" component. But this should be a matter of designer's preference.

kenlunde · 2016-06-10T14:22:26Z

@tamcy: Both of these characters are included in my list of sharing candidates.

Rameshdaspam · 2016-08-01T15:23:33Z

@kenlunde : How to convert Simplified Chinese to .eot?

kenlunde · 2016-08-01T17:05:52Z

@Rameshdaspam: Converting to EOT, using the command-line ttf2eot tool, first requires a TTF version of the font, which is a DIY affair. We have no plans to deploy these fonts as TTFs.

Rameshdaspam · 2016-08-01T17:17:01Z

Actually we need Webfont for Simplified Chinese.
SourceHanSansSC
SourceHanSansHWSC

kenlunde · 2016-08-01T17:22:26Z

@Rameshdaspam: If the fonts are not in a format that you can use, you will need to convert them into the desired format. I do not know enough about your request to advise further, other than to say that supplying EOTs is a non-starter due to the TTF requirement.

acuteaccent · 2016-11-07T08:06:15Z

You don't need to assign two different glyphs for U+115F and U+1160, as they are just fillers. In fact, CID+461 is currently shared by U+1160 and U+3164. Instead of this, you can use CID+460 for U+115F, U+1160, and U+3164; and use CID+461 for something else.

acuteaccent · 2016-11-07T08:34:00Z

Also, you don't actually need these:
63752 Hangul OldHangul-LeadingConsonants uni115F.ljmo01
63877 Hangul OldHangul-LeadingConsonants uni115F.ljmo02
64002 Hangul OldHangul-LeadingConsonants uni115F.ljmo03
64127 Hangul OldHangul-LeadingConsonants uni115F.ljmo04
64252 Hangul OldHangul-LeadingConsonants uni115F.ljmo05
64377 Hangul OldHangul-LeadingConsonants uni115F.ljmo06
64502 Hangul OldHangul-Vowels uni1160.vjmo02

My understanding is that the glyphs with .ljmo0[1-6] at the end have the width of 920, and ones with .vjmo0[1-2] or .tjmo0[1-4] at the end have the width of zero.

For the first six of the above, you can just simply use CID+740 (uni115F; nominal form of U+115F), as CID+740 is already a spacing glyph. You don't need seven (including the nominal form) 920-width blank glyphs; one is good enough.
For the last one, you can just simply use CID+64407 (uni1160.vjmo01). You don't need two zero-width glyphs with nothing in them; one is good enough.

You can save seven glyphs (eight if you count the comment right above this one).
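
The saving proposed in this comment can be sketched as a simple alias table. This is only an illustration under stated assumptions: the glyph names are the ones quoted above, while `ALIASES` and `remap` are hypothetical names, and a real build would also have to update the 'cmap' table and the GSUB features.

```python
# Each redundant blank glyph is mapped to the glyph proposed as its stand-in.
ALIASES = {f"uni115F.ljmo0{i}": "uni115F" for i in range(1, 7)}
ALIASES["uni1160.vjmo02"] = "uni1160.vjmo01"


def remap(glyph_names):
    """Return the remapped glyph list and the number of distinct glyphs saved."""
    remapped = [ALIASES.get(name, name) for name in glyph_names]
    saved = len(set(glyph_names)) - len(set(remapped))
    return remapped, saved


glyphs = ["uni115F", "uni1160.vjmo01", *ALIASES]
remapped, saved = remap(glyphs)
print(saved)
```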

kenlunde · 2016-11-07T14:26:23Z

While I have a preference to keep these glyphs, because their presence makes debugging the three GSUB features an easier process, I am willing to build a test font that includes only the nominal and combining forms, along with the space (U+0020), but excludes the eight glyphs that you indicated above (and substitutes them with the appropriate glyphs in both the 'cmap' table and GSUB features).

kenlunde · 2016-11-07T15:05:17Z

@acuteaccent: If I were to build such a test font, would you be willing to test it?

acuteaccent · 2016-11-08T06:15:45Z

@kenlunde Yes.

kenlunde · 2016-11-08T13:25:32Z

Excellent. This will be my weekend project.

kenlunde · 2016-11-30T05:31:36Z

@acuteaccent: Apologies for the delay. I spent the evening building test fonts. The one named CombiningJamoTestAll-ExtraLight.otf includes only the glyphs necessary for combining jamo (the nominal and combining forms of 1100-11FF, A960-A97C, D7B0-D7C6, and D7CB-D7FB) plus U+0020 (space). The one named CombiningJamoTest-ExtraLight.otf is the same, but excludes the eight glyphs mentioned above (uni1160, uni115F.ljmo01, uni115F.ljmo02, uni115F.ljmo03, uni115F.ljmo04, uni115F.ljmo05, uni115F.ljmo06, and uni1160.vjmo02), and modifies the 'cmap' table and GSUB features accordingly. Please test at your earliest convenience.

kenlunde · 2016-11-30T13:36:53Z

In terms of a test file, please use this one, which includes all 30,222 two- and three-character sequences (among the 1,638,750 possible ones) that include U+115F or U+1160.
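
Both figures can be reproduced with a short calculation. This is a minimal sketch, assuming the ranges quoted earlier (1100-11FF, A960-A97C, D7B0-D7C6, and D7CB-D7FB) split into leading consonants, vowels, and trailing consonants as in the Unicode Hangul Jamo blocks, and that a sequence is either leading + vowel or leading + vowel + trailing; the variable names are hypothetical.

```python
def span(first, last):
    """Number of code points in the inclusive range first..last."""
    return last - first + 1


leading = span(0x1100, 0x115F) + span(0xA960, 0xA97C)   # includes U+115F
vowels = span(0x1160, 0x11A7) + span(0xD7B0, 0xD7C6)    # includes U+1160
trailing = span(0x11A8, 0x11FF) + span(0xD7CB, 0xD7FB)

# All two-character (L+V) and three-character (L+V+T) sequences.
total = leading * vowels * (1 + trailing)

# Sequences whose leading consonant is U+115F or whose vowel is U+1160,
# counted by inclusion-exclusion so the overlap is not counted twice.
with_lead_filler = vowels * (1 + trailing)
with_vowel_filler = leading * (1 + trailing)
with_both = 1 + trailing
with_filler = with_lead_filler + with_vowel_filler - with_both

print(total, with_filler)  # 1638750 30222
```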

While I think that we can get away with removing uni115F.ljmo0[1-6] and uni1160.vjmo02, which will save seven glyphs, I think that we need to keep uni1160. Let me explain. When rendering, my initial testing suggests that it is okay to use the same glyph for U+115F and U+1160, but when a PDF is created, any instance of U+1160 in the original text will be converted to U+115F when the text is copied from the PDF. I will build a third test font later this morning that keeps uni1160.

kenlunde · 2016-11-30T13:59:02Z

The test fonts link above now corresponds to a ZIP file that contains a third test font, CombiningJamoTest1160-ExtraLight.otf, which is identical to CombiningJamoTest-ExtraLight.otf except that it retains the nominal glyph for U+1160 (uni1160), both in terms of the 'cmap' table and GSUB features. Only seven glyphs—uni115F.ljmo01, uni115F.ljmo02, uni115F.ljmo03, uni115F.ljmo04, uni115F.ljmo05, uni115F.ljmo06, and uni1160.vjmo02—have been removed.

hfhchan · 2017-04-07T19:14:14Z

CN glyphs for U+611F, U+61BE, U+64BC can be shared with the JP glyphs, since that is done for U+8F57 anyway.

Personally, the JP one feels "more Chinese" than CN with its asymmetric balance.

hfhchan · 2017-04-07T19:19:08Z

Incidentally, U+6FB8, U+9C64, U+9CE1, U+3673, U+40ED, U+425E, U+4717, and U+4AF2 need to be redesigned if the JP version is preferred.

hfhchan · 2017-04-07T19:53:14Z

U+501F CN/TW === JP/KR

hfhchan · 2017-04-07T19:57:43Z

U+503B TW === JP/KR

hfhchan · 2017-04-08T07:07:10Z

U+4E51 U+4E5A

The differences between these two characters in SHSans are unnecessary compared with those in SHSerif.

hfhchan · 2017-04-08T20:08:41Z

Per the following two posts on 心,
#98 (comment)
#98 (comment)

it seems that the difference between 心 in CN/TW and JP/KR is a widespread phenomenon rather than one confined to compounds containing 感.

My personal opinion is that the difference in the placement of the 點 above 豎彎鉤 is much more minor than the stroke joining of 又 and 叉. I think the former is an aesthetic design issue only, while the latter is a stroke-level mandatory requirement by the MOE. I would personally prefer the glyphs to be allocated to solve the latter instead of the former.

hfhchan · 2017-04-08T20:36:21Z

The CN/TW glyph of U+4E1E should use the JP/KR glyph.

As far as the code charts are concerned, the starting position of the 捺 (5th stroke) should be the 豎鉤 (2nd stroke) for TW, which is markedly more similar to the JP/KR glyph. The CN glyph looks similar to the one in the code chart though.

For PMingLiU and DFKai-SB, the starting position of the 捺 (5th stroke) is also the 豎鉤 (2nd stroke). For Microsoft JhengHei, the starting point is exactly at the intersection of 橫折 (1st stroke) and 豎鉤 (2nd stroke).

The situation for CN fonts is similar. For Kaiti and SimSun, the starting position of the 捺 (5th stroke) is also the 豎鉤 (2nd stroke). For Microsoft YaHei and SimHei, the starting point is exactly at the intersection of 橫折 (1st stroke) and 豎鉤 (2nd stroke).

Therefore, both the CN and TW glyphs can safely be remapped to the JP/KR glyph.

acuteaccent · 2017-04-09T04:45:54Z

(I am writing this comment to show that I did not ignore Ken's request. I already responded to him via email, but forgot to write a comment here.)

#98 (comment)
About the test font without the seven blank glyphs: There was no problem on my end. Everything was okay.

kenlunde · 2017-04-09T12:31:54Z

@acuteaccent: We also confirmed this via Source Han Serif.

kenlunde · 2017-05-26T17:55:08Z

Consolidated with Issue #179.

kenlunde added the enhancement label Apr 18, 2015

kenlunde self-assigned this Apr 18, 2015

This was referenced Apr 21, 2015

U660E (明) - Different glyph for TC&SC, is this necessary? #97

Closed

(Question) Why does the glyph of u5220 look the same as u522A in TW? #92

Closed

kenlunde changed the title ~~Consolidation of Glyph Sharing Suggestions~~ Consolidation of Glyph Sharing Suggestions (TO CLOSE) May 26, 2017

kenlunde changed the title ~~Consolidation of Glyph Sharing Suggestions (TO CLOSE)~~ Consolidation of Glyph Sharing Suggestions (See Issue #179) May 26, 2017

kenlunde added the consolidated label May 26, 2017

kenlunde closed this as completed May 26, 2017

kenlunde mentioned this issue May 27, 2017

Consolidation of Glyph Sharing Suggestions #179

Closed

adobe-fonts locked as resolved and limited conversation to collaborators Nov 20, 2018
