Commit
Two small form CJK characters for Chinese (#666)
* UnicodeData.txt lines reconstructed from L2/23-284

* LineBreak.txt as described by L2/23-284

* script

* Ideographic Extender

* Regenerate UCD

* ideohack

* Remove changelog

* gc=Lm for the small ers

* Regenerate UCD

* Not lowercase
eggrobin authored Nov 15, 2024
1 parent 25ab030 commit c255ee4
Showing 16 changed files with 67 additions and 39 deletions.
5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/DerivedAge.txt
@@ -1,5 +1,5 @@
# DerivedAge-17.0.0.txt
# Date: 2024-11-14, 22:50:46 GMT
# Date: 2024-11-15, 13:43:01 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2091,7 +2091,8 @@ FDC8..FDCE ; 17.0 # [7] ARABIC LIGATURE RAHIMAHU ALLAAH TAAALAA..ARABIC LIG
16DA0..16DA9 ; 17.0 # [10] CHISOI DIGIT ZERO..CHISOI DIGIT NINE
16EA0..16EB8 ; 17.0 # [25] BERIA ERFE CAPITAL LETTER ARKAB..BERIA ERFE CAPITAL LETTER AY
16EBB..16ED3 ; 17.0 # [25] BERIA ERFE SMALL LETTER ARKAB..BERIA ERFE SMALL LETTER AY
16FF2..16FF3 ; 17.0 # [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER

# Total code points: 259
# Total code points: 261

# EOF
23 changes: 15 additions & 8 deletions unicodetools/data/ucd/dev/DerivedCoreProperties.txt
@@ -1,5 +1,5 @@
# DerivedCoreProperties-17.0.0.txt
# Date: 2024-11-14, 22:51:10 GMT
# Date: 2024-11-15, 13:43:27 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1335,6 +1335,7 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
16FE0..16FE1 ; Alphabetic # Lm [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
16FE3 ; Alphabetic # Lm OLD CHINESE ITERATION MARK
16FF0..16FF1 ; Alphabetic # Mc [2] VIETNAMESE ALTERNATE READING MARK CA..VIETNAMESE ALTERNATE READING MARK NHAY
16FF2..16FF3 ; Alphabetic # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
17000..187F7 ; Alphabetic # Lo [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7
18800..18CD5 ; Alphabetic # Lo [1238] TANGUT COMPONENT-001..KHITAN SMALL SCRIPT CHARACTER-18CD5
18CFF..18D08 ; Alphabetic # Lo [10] KHITAN SMALL SCRIPT CHARACTER-18CFF..TANGUT IDEOGRAPH-18D08
@@ -1456,7 +1457,7 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
30000..3134A ; Alphabetic # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; Alphabetic # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF

# Total code points: 142933
# Total code points: 142935

# ================================================

@@ -3492,6 +3493,7 @@ FFF9..FFFB ; Case_Ignorable # Cf [3] INTERLINEAR ANNOTATION ANCHOR..INTERLI
16FE0..16FE1 ; Case_Ignorable # Lm [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
16FE3 ; Case_Ignorable # Lm OLD CHINESE ITERATION MARK
16FE4 ; Case_Ignorable # Mn KHITAN SMALL SCRIPT FILLER
16FF2..16FF3 ; Case_Ignorable # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
1AFF0..1AFF3 ; Case_Ignorable # Lm [4] KATAKANA LETTER MINNAN TONE-2..KATAKANA LETTER MINNAN TONE-5
1AFF5..1AFFB ; Case_Ignorable # Lm [7] KATAKANA LETTER MINNAN TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-5
1AFFD..1AFFE ; Case_Ignorable # Lm [2] KATAKANA LETTER MINNAN NASALIZED TONE-7..KATAKANA LETTER MINNAN NASALIZED TONE-8
@@ -3533,7 +3535,7 @@ E0001 ; Case_Ignorable # Cf LANGUAGE TAG
E0020..E007F ; Case_Ignorable # Cf [96] TAG SPACE..CANCEL TAG
E0100..E01EF ; Case_Ignorable # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 2790
# Total code points: 2792

# ================================================

@@ -6904,6 +6906,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
16F93..16F9F ; ID_Start # Lm [13] MIAO LETTER TONE-2..MIAO LETTER REFORMED TONE-8
16FE0..16FE1 ; ID_Start # Lm [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
16FE3 ; ID_Start # Lm OLD CHINESE ITERATION MARK
16FF2..16FF3 ; ID_Start # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
17000..187F7 ; ID_Start # Lo [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7
18800..18CD5 ; ID_Start # Lo [1238] TANGUT COMPONENT-001..KHITAN SMALL SCRIPT CHARACTER-18CD5
18CFF..18D08 ; ID_Start # Lo [10] KHITAN SMALL SCRIPT CHARACTER-18CFF..TANGUT IDEOGRAPH-18D08
@@ -7014,7 +7017,7 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
30000..3134A ; ID_Start # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; ID_Start # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF

# Total code points: 141432
# Total code points: 141434

# ================================================

@@ -8293,6 +8296,7 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
16FE3 ; ID_Continue # Lm OLD CHINESE ITERATION MARK
16FE4 ; ID_Continue # Mn KHITAN SMALL SCRIPT FILLER
16FF0..16FF1 ; ID_Continue # Mc [2] VIETNAMESE ALTERNATE READING MARK CA..VIETNAMESE ALTERNATE READING MARK NHAY
16FF2..16FF3 ; ID_Continue # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
17000..187F7 ; ID_Continue # Lo [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7
18800..18CD5 ; ID_Continue # Lo [1238] TANGUT COMPONENT-001..KHITAN SMALL SCRIPT CHARACTER-18CD5
18CFF..18D08 ; ID_Continue # Lo [10] KHITAN SMALL SCRIPT CHARACTER-18CFF..TANGUT IDEOGRAPH-18D08
@@ -8441,7 +8445,7 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
31350..323AF ; ID_Continue # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
E0100..E01EF ; ID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 144765
# Total code points: 144767

# ================================================

@@ -9117,6 +9121,7 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
16F93..16F9F ; XID_Start # Lm [13] MIAO LETTER TONE-2..MIAO LETTER REFORMED TONE-8
16FE0..16FE1 ; XID_Start # Lm [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
16FE3 ; XID_Start # Lm OLD CHINESE ITERATION MARK
16FF2..16FF3 ; XID_Start # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
17000..187F7 ; XID_Start # Lo [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7
18800..18CD5 ; XID_Start # Lo [1238] TANGUT COMPONENT-001..KHITAN SMALL SCRIPT CHARACTER-18CD5
18CFF..18D08 ; XID_Start # Lo [10] KHITAN SMALL SCRIPT CHARACTER-18CFF..TANGUT IDEOGRAPH-18D08
@@ -9227,7 +9232,7 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
30000..3134A ; XID_Start # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; XID_Start # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF

# Total code points: 141409
# Total code points: 141411

# ================================================

@@ -10507,6 +10512,7 @@ FFDA..FFDC ; XID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HA
16FE3 ; XID_Continue # Lm OLD CHINESE ITERATION MARK
16FE4 ; XID_Continue # Mn KHITAN SMALL SCRIPT FILLER
16FF0..16FF1 ; XID_Continue # Mc [2] VIETNAMESE ALTERNATE READING MARK CA..VIETNAMESE ALTERNATE READING MARK NHAY
16FF2..16FF3 ; XID_Continue # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
17000..187F7 ; XID_Continue # Lo [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7
18800..18CD5 ; XID_Continue # Lo [1238] TANGUT COMPONENT-001..KHITAN SMALL SCRIPT CHARACTER-18CD5
18CFF..18D08 ; XID_Continue # Lo [10] KHITAN SMALL SCRIPT CHARACTER-18CFF..TANGUT IDEOGRAPH-18D08
@@ -10655,7 +10661,7 @@ FFDA..FFDC ; XID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HA
31350..323AF ; XID_Continue # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
E0100..E01EF ; XID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 144746
# Total code points: 144748

# ================================================

@@ -12727,6 +12733,7 @@ FFFC..FFFD ; Grapheme_Base # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEME
16FE0..16FE1 ; Grapheme_Base # Lm [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
16FE2 ; Grapheme_Base # Po OLD CHINESE HOOK MARK
16FE3 ; Grapheme_Base # Lm OLD CHINESE ITERATION MARK
16FF2..16FF3 ; Grapheme_Base # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
17000..187F7 ; Grapheme_Base # Lo [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7
18800..18CD5 ; Grapheme_Base # Lo [1238] TANGUT COMPONENT-001..KHITAN SMALL SCRIPT CHARACTER-18CD5
18CFF..18D08 ; Grapheme_Base # Lo [10] KHITAN SMALL SCRIPT CHARACTER-18CFF..TANGUT IDEOGRAPH-18D08
@@ -12932,7 +12939,7 @@ FFFC..FFFD ; Grapheme_Base # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEME
30000..3134A ; Grapheme_Base # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; Grapheme_Base # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF

# Total code points: 152951
# Total code points: 152953

# ================================================

3 changes: 2 additions & 1 deletion unicodetools/data/ucd/dev/EastAsianWidth.txt
@@ -1,5 +1,5 @@
# EastAsianWidth-17.0.0.txt
# Date: 2024-11-14, 22:51:17 GMT
# Date: 2024-11-15, 13:43:35 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2382,6 +2382,7 @@ FFFD ; A # So REPLACEMENT CHARACTER
16FE3 ; W # Lm OLD CHINESE ITERATION MARK
16FE4 ; W # Mn KHITAN SMALL SCRIPT FILLER
16FF0..16FF1 ; W # Mc [2] VIETNAMESE ALTERNATE READING MARK CA..VIETNAMESE ALTERNATE READING MARK NHAY
16FF2..16FF3 ; N # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
17000..187F7 ; W # Lo [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7
18800..18AFF ; W # Lo [768] TANGUT COMPONENT-001..TANGUT COMPONENT-768
18B00..18CD5 ; W # Lo [470] KHITAN SMALL SCRIPT CHARACTER-18B00..KHITAN SMALL SCRIPT CHARACTER-18CD5
3 changes: 2 additions & 1 deletion unicodetools/data/ucd/dev/LineBreak.txt
@@ -1,5 +1,5 @@
# LineBreak-17.0.0.txt
# Date: 2024-11-14, 22:51:18 GMT
# Date: 2024-11-15, 13:43:36 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -3293,6 +3293,7 @@ FFFD ; AI # So REPLACEMENT CHARACTER
16FE3 ; NS # Lm OLD CHINESE ITERATION MARK
16FE4 ; GL # Mn KHITAN SMALL SCRIPT FILLER
16FF0..16FF1 ; CM # Mc [2] VIETNAMESE ALTERNATE READING MARK CA..VIETNAMESE ALTERNATE READING MARK NHAY
16FF2..16FF3 ; NS # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
17000..187F7 ; ID # Lo [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7
18800..18AFF ; ID # Lo [768] TANGUT COMPONENT-001..TANGUT COMPONENT-768
18B00..18CD5 ; AL # Lo [470] KHITAN SMALL SCRIPT CHARACTER-18B00..KHITAN SMALL SCRIPT CHARACTER-18CD5
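The property files touched in this commit share one line layout: a code point or `first..last` range, a semicolon, the property value, then a `#` comment. As a rough sketch of reading such a line (the helper name `parse_property_line` is hypothetical and not part of unicodetools), something like the following could work:

```python
def parse_property_line(line):
    """Parse one UCD property line into (first, last, value).

    Returns None for blank lines and comment-only lines.
    Hypothetical helper for illustration only.
    """
    # Everything after '#' is a comment (general category, count, names).
    data, _, _comment = line.partition("#")
    if not data.strip():
        return None
    # The data part is "<code or range> ; <property value>".
    code_range, _, value = data.partition(";")
    first, _, last = code_range.strip().partition("..")
    # A single code point is treated as a one-element range.
    last = last or first
    return int(first, 16), int(last, 16), value.strip()


# The new LineBreak.txt entry added by this commit.
entry = parse_property_line(
    "16FF2..16FF3 ; NS # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER"
)
```

Here `entry` holds the numeric bounds of the range and the Line_Break value `NS`; header lines such as `# Total code points: 261` yield `None`.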
8 changes: 5 additions & 3 deletions unicodetools/data/ucd/dev/PropList.txt
@@ -1,5 +1,5 @@
# PropList-17.0.0.txt
# Date: 2024-11-14, 22:51:30 GMT
# Date: 2024-11-15, 13:43:48 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -878,6 +878,7 @@ FB1E ; Other_Alphabetic # Mn HEBREW POINT JUDEO-SPANISH VARIKA
F900..FA6D ; Ideographic # Lo [366] CJK COMPATIBILITY IDEOGRAPH-F900..CJK COMPATIBILITY IDEOGRAPH-FA6D
FA70..FAD9 ; Ideographic # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COMPATIBILITY IDEOGRAPH-FAD9
16FE4 ; Ideographic # Mn KHITAN SMALL SCRIPT FILLER
16FF2..16FF3 ; Ideographic # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
17000..187F7 ; Ideographic # Lo [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7
18800..18CD5 ; Ideographic # Lo [1238] TANGUT COMPONENT-001..KHITAN SMALL SCRIPT CHARACTER-18CD5
18CFF..18D08 ; Ideographic # Lo [10] KHITAN SMALL SCRIPT CHARACTER-18CFF..TANGUT IDEOGRAPH-18D08
@@ -892,7 +893,7 @@ FA70..FAD9 ; Ideographic # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COM
30000..3134A ; Ideographic # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; Ideographic # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF

# Total code points: 106477
# Total code points: 106479

# ================================================

@@ -1207,11 +1208,12 @@ FF70 ; Extender # Lm HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND
16B42..16B43 ; Extender # Lm [2] PAHAWH HMONG SIGN VOS NRUA..PAHAWH HMONG SIGN IB YAM
16FE0..16FE1 ; Extender # Lm [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
16FE3 ; Extender # Lm OLD CHINESE ITERATION MARK
16FF2..16FF3 ; Extender # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
1E13C..1E13D ; Extender # Lm [2] NYIAKENG PUACHUE HMONG SIGN XW XW..NYIAKENG PUACHUE HMONG SYLLABLE LENGTHENER
1E5EF ; Extender # Mn OL ONAL SIGN IKIR
1E944..1E946 ; Extender # Mn [3] ADLAM ALIF LENGTHENER..ADLAM GEMINATION MARK

# Total code points: 61
# Total code points: 63

# ================================================

5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/Scripts.txt
@@ -1,5 +1,5 @@
# Scripts-17.0.0.txt
# Date: 2024-11-14, 22:51:49 GMT
# Date: 2024-11-15, 13:44:08 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1597,6 +1597,7 @@ FA70..FAD9 ; Han # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COMPATIBILI
16FE2 ; Han # Po OLD CHINESE HOOK MARK
16FE3 ; Han # Lm OLD CHINESE ITERATION MARK
16FF0..16FF1 ; Han # Mc [2] VIETNAMESE ALTERNATE READING MARK CA..VIETNAMESE ALTERNATE READING MARK NHAY
16FF2..16FF3 ; Han # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
20000..2A6DF ; Han # Lo [42720] CJK UNIFIED IDEOGRAPH-20000..CJK UNIFIED IDEOGRAPH-2A6DF
2A700..2B739 ; Han # Lo [4154] CJK UNIFIED IDEOGRAPH-2A700..CJK UNIFIED IDEOGRAPH-2B739
2B740..2B81D ; Han # Lo [222] CJK UNIFIED IDEOGRAPH-2B740..CJK UNIFIED IDEOGRAPH-2B81D
@@ -1607,7 +1608,7 @@ FA70..FAD9 ; Han # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COMPATIBILI
30000..3134A ; Han # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; Han # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF

# Total code points: 99030
# Total code points: 99032

# ================================================

2 changes: 2 additions & 0 deletions unicodetools/data/ucd/dev/UnicodeData.txt
@@ -30539,6 +30539,8 @@ FFFD;REPLACEMENT CHARACTER;So;0;ON;;;;;N;;;;;
16FE4;KHITAN SMALL SCRIPT FILLER;Mn;0;NSM;;;;;N;;;;;
16FF0;VIETNAMESE ALTERNATE READING MARK CA;Mc;6;L;;;;;N;;;;;
16FF1;VIETNAMESE ALTERNATE READING MARK NHAY;Mc;6;L;;;;;N;;;;;
16FF2;CHINESE SMALL SIMPLIFIED ER;Lm;0;L;;;;;N;;;;;
16FF3;CHINESE SMALL TRADITIONAL ER;Lm;0;L;;;;;N;;;;;
17000;<Tangut Ideograph, First>;Lo;0;L;;;;;N;;;;;
187F7;<Tangut Ideograph, Last>;Lo;0;L;;;;;N;;;;;
18800;TANGUT COMPONENT-001;Lo;0;L;;;;;N;;;;;
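For readers unfamiliar with the UnicodeData.txt layout, each record is a row of 15 semicolon-separated fields, so the two new lines assign gc=Lm, ccc=0 and bc=L to the small ER characters. A minimal sketch of splitting such a record follows; the field labels are informal names chosen here, and `parse_unicode_data_line` is a hypothetical helper rather than anything in unicodetools:

```python
# Informal labels for the 15 UnicodeData.txt fields (fields 0 through 14).
FIELDS = (
    "code", "name", "general_category", "canonical_combining_class",
    "bidi_class", "decomposition", "decimal", "digit", "numeric",
    "bidi_mirrored", "unicode_1_name", "iso_comment",
    "simple_uppercase", "simple_lowercase", "simple_titlecase",
)


def parse_unicode_data_line(line):
    """Split one UnicodeData.txt record into a dict keyed by FIELDS."""
    parts = line.rstrip("\n").split(";")
    if len(parts) != len(FIELDS):
        raise ValueError(f"expected {len(FIELDS)} fields, got {len(parts)}")
    record = dict(zip(FIELDS, parts))
    # Store the code point as an integer for easier comparison.
    record["code"] = int(record["code"], 16)
    return record


# One of the two records added by this commit.
small_er = parse_unicode_data_line(
    "16FF2;CHINESE SMALL SIMPLIFIED ER;Lm;0;L;;;;;N;;;;;"
)
```

The empty case-mapping fields at the end of the record are consistent with the "Not lowercase" item in the commit message.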
5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/VerticalOrientation.txt
@@ -1,5 +1,5 @@
# VerticalOrientation-17.0.0.txt
# Date: 2024-11-14, 22:51:52 GMT
# Date: 2024-11-15, 13:44:11 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2211,7 +2211,8 @@ FFFC..FFFD ; U # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEMENT CHARA
16FE4 ; U # Mn KHITAN SMALL SCRIPT FILLER
16FE5..16FEF ; U # Cn [11] <reserved-16FE5>..<reserved-16FEF>
16FF0..16FF1 ; U # Mc [2] VIETNAMESE ALTERNATE READING MARK CA..VIETNAMESE ALTERNATE READING MARK NHAY
16FF2..16FFF ; U # Cn [14] <reserved-16FF2>..<reserved-16FFF>
16FF2..16FF3 ; U # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
16FF4..16FFF ; U # Cn [12] <reserved-16FF4>..<reserved-16FFF>
17000..187F7 ; U # Lo [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7
187F8..187FF ; U # Cn [8] <reserved-187F8>..<reserved-187FF>
18800..18AFF ; U # Lo [768] TANGUT COMPONENT-001..TANGUT COMPONENT-768
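In the VerticalOrientation.txt hunk, the reserved range `16FF2..16FFF` (14 code points) is replaced by the assigned range `16FF2..16FF3` (2) plus a shorter reserved range `16FF4..16FFF` (12). The arithmetic can be checked with a small sketch; the function names below are hypothetical and only illustrate the bookkeeping:

```python
def parse_range(spec):
    """Turn 'XXXX..YYYY' (or a single 'XXXX') into integer bounds."""
    first, _, last = spec.partition("..")
    return int(first, 16), int(last or first, 16)


def range_size(spec):
    """Number of code points in an inclusive range."""
    first, last = parse_range(spec)
    return last - first + 1


def splits_cleanly(old, pieces):
    """True if the pieces, in order, cover the old range with no gaps or overlaps."""
    low, high = parse_range(old)
    expected = low
    for piece in pieces:
        first, last = parse_range(piece)
        if first != expected or last < first:
            return False
        expected = last + 1
    return expected == high + 1


ok = splits_cleanly("16FF2..16FFF", ["16FF2..16FF3", "16FF4..16FFF"])
sizes = [range_size(r) for r in ("16FF2..16FFF", "16FF2..16FF3", "16FF4..16FFF")]
```

The sizes match the bracketed counts `[14]`, `[2]` and `[12]` shown in the file comments.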
5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/auxiliary/SentenceBreakProperty.txt
@@ -1,5 +1,5 @@
# SentenceBreakProperty-17.0.0.txt
# Date: 2024-11-14, 22:51:50 GMT
# Date: 2024-11-15, 13:44:09 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -2534,6 +2534,7 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
16F93..16F9F ; OLetter # Lm [13] MIAO LETTER TONE-2..MIAO LETTER REFORMED TONE-8
16FE0..16FE1 ; OLetter # Lm [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
16FE3 ; OLetter # Lm OLD CHINESE ITERATION MARK
16FF2..16FF3 ; OLetter # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
17000..187F7 ; OLetter # Lo [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7
18800..18CD5 ; OLetter # Lo [1238] TANGUT COMPONENT-001..KHITAN SMALL SCRIPT CHARACTER-18CD5
18CFF..18D08 ; OLetter # Lo [10] KHITAN SMALL SCRIPT CHARACTER-18CFF..TANGUT IDEOGRAPH-18D08
@@ -2609,7 +2610,7 @@ FFDA..FFDC ; OLetter # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
30000..3134A ; OLetter # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; OLetter # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF

# Total code points: 137017
# Total code points: 137019

# ================================================

5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/extracted/DerivedBidiClass.txt
@@ -1,5 +1,5 @@
# DerivedBidiClass-17.0.0.txt
# Date: 2024-11-14, 22:51:08 GMT
# Date: 2024-11-15, 13:43:25 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
@@ -1114,6 +1114,7 @@ FFDA..FFDC ; L # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER
16FE0..16FE1 ; L # Lm [2] TANGUT ITERATION MARK..NUSHU ITERATION MARK
16FE3 ; L # Lm OLD CHINESE ITERATION MARK
16FF0..16FF1 ; L # Mc [2] VIETNAMESE ALTERNATE READING MARK CA..VIETNAMESE ALTERNATE READING MARK NHAY
16FF2..16FF3 ; L # Lm [2] CHINESE SMALL SIMPLIFIED ER..CHINESE SMALL TRADITIONAL ER
17000..187F7 ; L # Lo [6136] TANGUT IDEOGRAPH-17000..TANGUT IDEOGRAPH-187F7
18800..18CD5 ; L # Lo [1238] TANGUT COMPONENT-001..KHITAN SMALL SCRIPT CHARACTER-18CD5
18CFF..18D08 ; L # Lo [10] KHITAN SMALL SCRIPT CHARACTER-18CFF..TANGUT IDEOGRAPH-18D08
@@ -1225,7 +1226,7 @@ FFDA..FFDC ; L # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL LETTER
F0000..FFFFD ; L # Co [65534] <private-use-F0000>..<private-use-FFFFD>
100000..10FFFD; L # Co [65534] <private-use-100000>..<private-use-10FFFD>

# The above property value applies to 815161 code points not listed here.
# The above property value applies to 815159 code points not listed here.
# Total code points: 1095476

# ================================================