Consolidation of Post-V2 Glyph Correction Suggestions #204
Comments
U+4EFD 份 - HK should use CN version, currently mapped to TW. |
So overall, maybe fix 62853 (uni8FEAuE0101-JP), 62854 (uni8FEBuE0101-JP), 62861 (uni8FFDuE0101-JP), 62867 (uni900EuE0101-JP), 62882 (uni9041uE0101-JP), 62891 (uni9052uE0101-JP) and 62892 (uni9053uE0101-JP, already mentioned), which have feet. Decide if you want more consistency to those hidden glyphs, some of them used as KR mappings. Font Book only gave me glyph numbers (in the Repertoire section, hover and you see the glyph numbers, they are in order), but I checked the glyph numbers against the mapping file, for example, 造, which is correct on my end, so should be correct for everything else. |
There are many characters with "feet" remaining in the KR glyph where they are absent in the other versions, a few of them are: 迫、追、逭 (feet in CN glyph as well)、适、迪、這 It seems that the different versions are rather inconsistent with the "feet" in characters with the 辶 radical (I don't know if this is an intended design difference or not.) |
ref http://www.gb688.cn/bzgk/gb/newGbInfo?hcno=BCBF3BC7DCED3629F5E41CE02D9CFD55 page 112 |
@CNMan For U+9FE9, I would claim that Unicode's representative glyph should also be corrected, especially when checking the original proposal (L2/13-009). In particular, see the kaishu form on page 5 (page 6 of the PDF) that is ⿰魯爾. For U+9FA9 龩, U+9FB3 龳, and U+9FCA 鿊, no action is necessary, because these are single-source (HK) ideographs. Although no action is necessary for a similar reason, you missed U+9FEA (KR). And yes, I am fully aware that some single-source ideographs have multiple region-specific glyphs. When push comes to shove, which was necessary for the Version 2.000 update, such glyphs are removed in order to make room for higher-priority glyphs. Also, I'd like to point out that following standards such as GB/T 22321.1-2018 is not within the scope of the Source Han projects. I think of such standards as attempts to hammer square pegs into round holes, meaning that regional conventions are applied to ideographs that are not actually used in that particular region. It would be nice to do, but when dealing with a glyph set that is already full, practicality becomes necessary. |
For Source Han Sans Hong Kong, I noticed that the 胡 part of 鬍 is not consistent with other characters that have a 胡 component. Here is an example of the difference between the 胡 part of 鬍 when contrasted with 胡 as a standalone character as well as other characters with a 胡 component: In addition, it seems that the 鬍 character in Hong Kong's version of Source Han Sans may not conform with the "standard" found in 香港小學學習字詞表. I have attached a picture of the character 鬍 based on the online version of 香港小學學習字詞表 here: Finally... on this Thanksgiving Day in the USA, I wanted to take a moment and thank @kenlunde and everyone involved in the creation of Source Han Sans and Source Han Serif for bringing these fonts to life for the community! |
@pan-asian-wok Thank you, and you're welcome. Adding a new HK glyph for U+9B0D 鬍, uni9B0D-HK, is now noted in Issue #206. BTW, I am using 香港電腦漢字參考字形 as the reference for HK glyphs. Its scope is the same as for the Source Han typefaces, specifically Big Five plus HKSCS-2016. The scope of 香港小學學習字詞表 is far more limited. In fact, I found an error in the former during Source Han Sans Version 2.000 development that I reported, which has been corrected, and is noted on its last page (page 1014). |
For Source Han Sans Hong Kong, I think the 系 of 遜 should have a hook. Reference: 香港電腦漢字參考字形 And I think the 山 part of CN glyph for 﨑 should be unified. The size of dot in fullwidth ? and ! also should be unified. |
@Buernia Adding a new HK glyph for U+905C 遜, uni905C-HK, is now noted in Issue #206. Adjusting the CN glyph for U+FA11 﨑, uniFA11-CN, is now noted in the table in this issue. The dot in the glyphs for U+FF1F ?, uniFF1F and uniFF1F-CN, and U+FE16 ︖, uniFE16, is not compatible in the ExtraLight and Heavy masters, which means that the ExtraLight and Heavy weights look okay, but the five intermediate weights have inconsistent or seemingly random weights. This is also now noted in the table in this issue. |
For HK glyph variant, according to my feeling (need further verification) |
@c933103 Before dousing this issue with potentially meaningless comments, please first check against 香港電腦漢字參考字形. And, after checking them all, I confirmed that none of your comments are actionable. Also, please don't post issues based on feelings. As I wrote above, please check the standard that is referenced in the previous paragraph. If you have an issue with that standard, take it up with the organization that is responsible for it, not with this project. Apologies if I seem to be a bit harsh. I was in the midst of reading a new book entitled Zero Sum Game. |
For Source Han Sans Hong Kong, the middle 八 in radical ⿋ should be 丷. Reference: 香港電腦漢字參考字形 And HK glyph 戠 should use CN version. |
@Marcus98T I am still thinking about this, hence the lack of a reply to you and to @lapomme. I am also technically on vacation. 🥃 |
@kenlunde I also think the CN glyph of ⿋ should be replaced by the future HK glyph. |
@c933103 Congratulations, you have discovered that the standard forms as provided by EDB or provided in 香港電腦漢字參考字形 don't really match what many Hongkongers have learned since they were young. 🤷🏻♂️ May I suggest you contact 教育局課程發展處中國語文教育組 at the email [email protected] and/or Chinese Language Interface Advisory Committee (CLIAC) at [email protected] to complain about this issue. |
@Buernia You're quite right. Actually, the CN glyph for U+9EF9 黹, uni9EF9-CN, was modified for V2, and needs to be reverted to its pre-V2 form, and also used for HK. U+2FCB ⿋ needs similar treatment, in terms of its HK mapping override. These changes are now reflected in the tables of the appropriate issues. |
(謊 included for reference)
(此 and 死 included for reference) |
I thought the question mark was adjusted in response to this issue? It may sound ironic, but I actually liked the change when I saw it. It's fine for ExtraLight and Heavy to have larger dots (they're usually used in short text and meant to be catchy), but for Light, Regular, and Bold, which are usually used in documents or long text, I would like the dot to be less eye-catching (incidentally, I also had the dots of the full-width glyphs shrunk in my fork; the original version looks a little too large). I was amazed when I saw this interpolation magic, and even planned to suggest similar changes to the full-width exclamation mark (but didn't, as I need some more time to experiment with it). |
@tamcy Corrections for the CN glyph for U+4E31 丱 and the HK glyphs for U+3C54 㱔, U+451D 䔝, and U+585F 塟 are now reflected in the table at the beginning of this issue. Adding new HK glyphs for U+5DDF 巟, U+614C 慌, and U+819A 膚 is now reflected in the table at the beginning of Issue #206. And, adjusting the TW and HK mappings for U+55D7 嗗 is now reflected in the table at the beginning of Issue #202. (Just FYI, I edited your post, to change "U+819A 丱" to "U+4E31 丱" in the first column of the first row of the first table.) |
@tamcy With regard to the glyphs for U+FF01 ! and U+FF1F ?, to include their vertical presentation forms, U+FE15 ︕ and U+FE16 ︖, I will ask our designer to make the dot smaller, though I thought that I did as part of the Version 2.000 update. This is independent of making the dot in the glyphs for U+FE16 ︖ and U+FF1F ? compatible between the ExtraLight and Heavy masters. See the table at the beginning of this issue. @extc The corrections for the HK glyphs for U+212FE 𡋾, U+235CE 𣗎, and U+27C12 𧰒 are reflected in the table at the beginning of this issue. |
@Marcus98T & @lapomme I finally had time to deep-dive into the "foot" issue for ideographs that use Radical 162 (辶). The following are the 17 affected glyphs, according to your list, and confirmed by me: uni9002-JP, uni9005-JP, uni9005-CN, uni900E-JP, uni9019-JP, uni902D-JP, uni902D-CN, uni9041-JP, uni9052-JP, uni8FEAuE0101-JP, uni8FEBuE0101-JP, uni8FFDuE0101-JP, uni900EuE0101-JP, uni9020uE0101-JP, uni9041uE0101-JP, uni9052uE0101-JP & uni9053uE0101-JP (and shown below in the same order). I think that you missed three Extension A ideographs and one Extension B ideograph: uni4894-CN, uni48AE-CN (uni48AE-HK, shown in green, is okay), uni48B0-CN & u28455-JP. |
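The working glyph names listed above follow a visible pattern, so a list like this can be checked mechanically. The following is a minimal sketch, assuming the naming pattern inferred only from the names quoted in this thread (`uni` plus four hex digits or `u` plus five or six hex digits, an optional `uE01xx` variation selector, and an optional region suffix). The function name `parse_glyph_name` is hypothetical and is not part of the project's tooling.

```python
import re

# Pattern inferred from glyph names quoted in this thread (an assumption,
# not an official grammar): uni9053uE0101-JP, u28455-JP, uni9B0D-HK, uniFF1F.
GLYPH_NAME = re.compile(
    r"^(?:uni(?P<bmp>[0-9A-F]{4})|u(?P<sup>[0-9A-F]{5,6}))"  # base code point
    r"(?:u(?P<vs>E01[0-9A-F]{2}))?"                          # optional variation selector
    r"(?:-(?P<region>JP|KR|CN|TW|HK))?$"                     # optional region suffix
)

def parse_glyph_name(name):
    """Split a working glyph name into code point, selector, and region."""
    m = GLYPH_NAME.match(name)
    if m is None:
        raise ValueError("unrecognized glyph name: " + name)
    codepoint = int(m.group("bmp") or m.group("sup"), 16)
    selector = int(m.group("vs"), 16) if m.group("vs") else None
    return {
        "char": chr(codepoint),
        "codepoint": codepoint,
        "selector": selector,
        "region": m.group("region"),
    }

if __name__ == "__main__":
    for name in ["uni9002-JP", "uni9053uE0101-JP", "u28455-JP"]:
        info = parse_glyph_name(name)
        print(name, "U+%04X" % info["codepoint"], info["char"], info["region"])
```

Names that do not fit the assumed pattern raise `ValueError` rather than being guessed at, which seems safer when cross-checking a hand-written list.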
@cathree3 Intentional. The KR glyph is actually a variant of the JP glyph, and maps to Adobe-Japan1-7 CID+13666, which corresponds to Adobe-Japan1 IVS <U+82B1,U+E0101>. A separate CN glyph is included for subtle balance adjustments. |
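For readers unfamiliar with the notation, an IVS such as <U+82B1,U+E0101> is simply the base ideograph followed by a variation selector in the text stream. Here is a small sketch of building and printing that sequence. The helper name `format_ivs` is hypothetical, and whether a given font actually renders the variant depends on its own IVS support.

```python
# Build the IVS mentioned above: base ideograph U+82B1 followed by
# variation selector U+E0101.
BASE = "\u82B1"
SELECTOR = "\U000E0101"
IVS = BASE + SELECTOR

def format_ivs(sequence):
    """Render a string as <U+XXXX,U+XXXX> notation (hypothetical helper)."""
    return "<" + ",".join("U+%04X" % ord(ch) for ch in sequence) + ">"

if __name__ == "__main__":
    print(format_ivs(IVS))   # the sequence in the notation used in this thread
    print(len(IVS))          # two code points: base plus selector
```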
I suggest adjusting the JP and KR versions to match the balance of the CN version for glyph sharing (EDIT: in the case of the KR version), so we have one less glyph to worry about. |
We'll take it under advisement. |
@Marcus98T In Heavy weights, it is not uncommon that a stroke becomes thinner when it passes through a box-like component. In fact, this is optically preferred (because making the stroke uniform, aka “mathematically correct”, actually looks uneven). Here are some examples showing the character U+8679 虹:
From this point of view, U+906D 遭 may be a non-issue after all. The CN glyph of U+89E6 触, however, does seem to have an interpolation problem. Perhaps this is the same issue as the already documented U+720B 爋. |
@RuixiZhang42 Yes, the issue that @Marcus98T pointed out about the KR glyph for U+906D 遭 is intentional, and a non-issue. CN glyph of U+89E6 触, on the other hand, is a real issue, and will be fixed like we did for the CN glyph for U+720B 爋. |
@kenlunde @RuixiZhang42 Yes, your argument about the optical stroke thinness is correct, but the other U+906D 遭 glyphs (especially the JP glyph) do not have the thinner-stroke-within-a-box, and that bothers me. All I need is better consistency between the glyphs. Perhaps I suggest thinning the top two strokes of 曹 in the KR version so that it matches the JP version? Anyway I have found out that the top two strokes in the KR version are actually thicker than in the JP version, and this is an issue, because they are inconsistent. I also found out Kozuka Gothic is affected. And for U+89E6 触, as said before, your argument about the optical stroke thinness is still correct, but yet the JP glyph has no thinner-stroke-within-a-box within 虫. Therefore, as to how the CN version can be fixed, I want to bring over the JP component of 虫 to the CN version, like this mockup below. I’ll be ok with having a thinner-stroke-within-a-box look, just that ALL locales need to have consistency, whether it’s JP, KR, CN, or any hidden Kangxi-style glyph. We can’t have a situation where one locale has the thinner-stroke-within-a-box while the others have uniform strokes. |
@Marcus98T We have limited time and limited resources, and while perfection is a worthwhile goal, there are reasons, practical and otherwise, why it's not going to happen in the near future. If we had unlimited time and unlimited resources, it would be a different story, but alas we do not. I suggest that you focus your energies elsewhere. With that said, the JP glyph for U+906D 遭 is based on Adobe-Japan1-7 CID+2810 (Supplement 0), and its working glyph name is uni906D-JP, and the KR glyph is based on Adobe-Japan1-7 CID+13896 (Supplement 4), and its working glyph name is uni906DuE0101-JP. The reason why you see the same difference in Kozuka Gothic is because the Source Han Sans glyphs were derived from that typeface. The Kozuka glyphs date back almost 20 years. While it is an inconsistency, fixing it is nowhere close to being a high priority. |
I didn’t know even huge companies have limited resources. If so, the most important thing I’d like to see in v2.002 is completing the removal of the feet as mentioned earlier. Especially the CN/TW/HK 隹 components placed on top, and a few others like 糙. (EDIT: I think maybe drop the "confusion" reaction and we do one thing at a time, like what I said above. I think I will bring up the unification of glyphs much later when the time is right) |
@S-Asakoto Being in CNS 11643 Plane 3 and not present in HKSCS, the ideograph U+7742 睂 is outside the Traditional Chinese scope of this project. |
Some dot strokes become a straight line in the HK/TW version of the font, which is inconsistent with the Unicode standard. They're basically characters used in Simplified Chinese, but despite having no JP/KR source they appear in JP/KR style. For example, U+4EB2 亲 and U+5E90 庐: I know these are in CNS 11643 Plane 3 and are not present in HKSCS, so they are out of scope. On the other hand, U+5E9D 庝, which is also in CNS 11643 Plane 3, has no JP/KR source, and is not present in HKSCS, has its dot preserved; rather, the glyph is shared among all 5 fonts: I would hope this can be an exception to the "out-of-scope" policy, and that a glyph shared by the HK/TW fonts be created with the dot replacing the vertical bar. |
@S-Asakoto I think it isn't very accurate to say that “dot strokes become a straight line” for the characters you mentioned. A better way to say it would be that the characters are considered “out of scope”, so adherence to the corresponding regional convention is not guaranteed. As you know, the scope of Traditional Chinese for Taiwan in this project is limited to the characters defined in Big5 (i.e. CNS 11643 Planes 1 & 2), which means that only Big5 characters are guaranteed to adhere (mostly) to Taiwan MoE's conventions in the TW version. But the codepoint coverage of Source Han Sans is not limited to Big5, so it is very easy for a character beyond Big5 not to adhere to Taiwan MoE's standard. It looks to me that the developer tries to find the best match from other regions for any out-of-scope characters. Take 擵 as an example: 擵 has two glyphs, namely uni64F5-JP (for the JP region) and uni64F5-CN (for CN). This character falls beyond Big5, so there is no dedicated glyph for TW or HK, and unfortunately none of the existing glyphs conform to the TW standard (the JP glyph uses a straight line for the dot; in the CN glyph 𣏟 is unified with 林). The developer decided to map the JP glyph rather than the CN one for TW, probably because the difference in the “𣏟” component is more apparent, so the JP glyph is considered the closest match. Now, back to the three characters you mentioned. U+4EB2 亲: JP and CN glyphs exist. JP is chosen for TW, probably because the design difference in 木 is more apparent (which I agree with). U+5E90 庐: JP and CN glyphs exist. JP is chosen for TW, probably because the design difference in 戶 is more apparent (which I also agree with for TW, but HK should use the CN glyph instead. It isn't the case, probably for the historical reason that a separate HK version didn't exist before v2.000. Before that, TW was named TWHK, so the same mapping for out-of-scope characters was used). U+5E9D 庝: Only a CN glyph exists. There is no choice but to map all other regions to the CN glyph. 
Since the design of the “dot” is the same for CN and TW, the effect is that 庝 adheres to the MoE standard even though it isn't in the supported range. So yes, the dot is kind of “preserved” for U+5E9D 庝, but this is purely coincidental. The same goes for why U+5E0D 帍 adheres to the MoE standard (戶's first stroke is 丿) but U+5554 啔 and U+623B 戻 don't, even though all of these are out of scope: U+5E0D 帍 has a JP glyph to map from, and coincidentally the JP design of 帍 is the same as (or very close to) that required by TW. |
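To make the fallback reasoning above concrete, here is a minimal sketch of how an out-of-scope character might end up mapped to another region's glyph. Everything here is an assumption for illustration: the fixed order in `TW_FALLBACK`, the function `pick_glyph`, and the constructed glyph names are hypothetical, and the comment above suggests the real choice is made per character by closest visual match rather than by a fixed order.

```python
# Hypothetical fallback order for the TW font (an assumption; the actual
# mapping appears to be decided per character, not by a fixed list).
TW_FALLBACK = ["TW", "JP", "CN"]

def pick_glyph(codepoint, available_regions, fallback=TW_FALLBACK):
    """Return the first available region-specific glyph name in fallback order."""
    for region in fallback:
        if region in available_regions:
            # Name format assumed from the glyph names quoted in this thread.
            prefix = "uni%04X" if codepoint <= 0xFFFF else "u%X"
            return (prefix % codepoint) + "-" + region
    raise LookupError("no glyph available for U+%04X" % codepoint)

if __name__ == "__main__":
    # U+64F5 has JP and CN glyphs, so TW falls back to JP.
    print(pick_glyph(0x64F5, {"JP", "CN"}))
    # U+5E9D has only a CN glyph, so every region ends up on CN.
    print(pick_glyph(0x5E9D, {"CN"}))
```

Under this sketch, a character matches the TW convention only when the fallback glyph happens to share that design, which is the coincidence described above for 庝 and 帍.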
@tamcy Oh actually I had thought that Unicode standard would be followed anyway even for out-of-scope characters if T-/H-source is present. So that's why the character 睂 also looks that way for TW/HK font. |
I suggest to adjust the following six HK glyphs:
The design of the 米 component for these 6 glyphs follows the TW form, where the two slanted strokes at the bottom do not touch the middle of 十, which is not consistent with other HK glyphs. For this component, HK, CN, JP and KR share the same form, where the two slanted strokes at the bottom are touching the middle of 十: While at it, I hope that u25E49-HK 𥹉 can be adjusted further to enhance the proportion of the left and right components, and the spacing of the 又 component, as shown: |
I'm not sure it is intended, but the JP version of 龜 (left: Source Han Sans v2.001, right: Source Han Serif v1.001) 龝 |
Hi, I'd like to ask if a definitive decision was made regarding making quotation marks (U+2018, U+2019, U+201C, U+201D) proportional width for Chinese? Most of the discussion for this was in notofonts/noto-cjk#5; it seemed like this change was under consideration for 2.000, however there was no follow-up comment and the issue was closed for being stale. I have skimmed through the various consolidation issues and the only mention I found was a brief discussion ending at #99 (comment). My (very limited) understanding regarding quotation marks in Chinese is that European-style quotation marks are used for Simplified Chinese but not Traditional Chinese. If these quotation marks cannot be made proportional for all Chinese because of this, then I would suggest at least making them proportional for Traditional Chinese. Mixing Chinese and English text is somewhat common in Hong Kong, e.g. most HK websites will have Traditional Chinese and English versions. Having full-width quotation marks in English text is quite an eyesore and (from a web design/development standpoint) it means another font like Roboto or Helvetica needs to be included to display English text. |
For issues related to HKSCS-2016 and Hong Kong in general, please be sure to reference 香港電腦漢字參考字形 before posting an issue here.
The following table shows the glyphs that will be corrected in the next update, and unless otherwise noted, the corrections are from my own notes:
The following table shows the glyphs that were corrected in the Version 2.001 update, and unless otherwise noted, the corrections are from my own notes: