Skip to content

Commit

Permalink
Merge remote-tracking branch 'la-vache/unihan-17' into UNC-disunifica…
Browse files Browse the repository at this point in the history
…tion-of-峀
  • Loading branch information
eggrobin committed Nov 14, 2024
2 parents 170fd50 + b0c8783 commit b25a3cc
Show file tree
Hide file tree
Showing 52 changed files with 11,489 additions and 213 deletions.
1 change: 1 addition & 0 deletions unicodetools/data/ucd/dev/Blocks.txt
Original file line number Diff line number Diff line change
Expand Up @@ -367,6 +367,7 @@ FFF0..FFFF; Specials
2F800..2FA1F; CJK Compatibility Ideographs Supplement
30000..3134F; CJK Unified Ideographs Extension G
31350..323AF; CJK Unified Ideographs Extension H
323B0..3347F; CJK Unified Ideographs Extension J
E0000..E007F; Tags
E0100..E01EF; Variation Selectors Supplement
F0000..FFFFF; Supplementary Private Use Area-A
Expand Down
5 changes: 3 additions & 2 deletions unicodetools/data/ucd/dev/DerivedAge.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# DerivedAge-17.0.0.txt
# Date: 2024-11-13, 12:26:32 GMT
# Date: 2024-11-14, 15:47:44 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -2066,7 +2066,8 @@ A7DA..A7DC ; 16.0 # [3] LATIN CAPITAL LETTER LAMBDA..LATIN CAPITAL LETTER L
# Newly assigned in Unicode 17.0.0 (September, 2025)

2B73A..2B73E ; 17.0 # [5] CJK UNIFIED IDEOGRAPH-2B73A..CJK UNIFIED IDEOGRAPH-2B73E
323B0..33479 ; 17.0 # [4298] CJK UNIFIED IDEOGRAPH-323B0..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 5
# Total code points: 4303

# EOF
26 changes: 13 additions & 13 deletions unicodetools/data/ucd/dev/DerivedCoreProperties.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# DerivedCoreProperties-17.0.0.txt
# Date: 2024-11-13, 12:26:49 GMT
# Date: 2024-11-14, 15:48:02 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -1439,9 +1439,9 @@ FFDA..FFDC ; Alphabetic # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANG
2EBF0..2EE5D ; Alphabetic # Lo [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
2F800..2FA1D ; Alphabetic # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
30000..3134A ; Alphabetic # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; Alphabetic # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
31350..33479 ; Alphabetic # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 142764
# Total code points: 147062

# ================================================

Expand Down Expand Up @@ -6960,9 +6960,9 @@ FFDA..FFDC ; ID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGUL
2EBF0..2EE5D ; ID_Start # Lo [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
2F800..2FA1D ; ID_Start # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
30000..3134A ; ID_Start # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; ID_Start # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
31350..33479 ; ID_Start # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 141274
# Total code points: 145572

# ================================================

Expand Down Expand Up @@ -8367,10 +8367,10 @@ FFDA..FFDC ; ID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HAN
2EBF0..2EE5D ; ID_Continue # Lo [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
2F800..2FA1D ; ID_Continue # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
30000..3134A ; ID_Continue # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; ID_Continue # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
31350..33479 ; ID_Continue # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
E0100..E01EF ; ID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 144546
# Total code points: 148844

# ================================================

Expand Down Expand Up @@ -9146,9 +9146,9 @@ FFDA..FFDC ; XID_Start # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HANGU
2EBF0..2EE5D ; XID_Start # Lo [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
2F800..2FA1D ; XID_Start # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
30000..3134A ; XID_Start # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; XID_Start # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
31350..33479 ; XID_Start # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 141251
# Total code points: 145549

# ================================================

Expand Down Expand Up @@ -10554,10 +10554,10 @@ FFDA..FFDC ; XID_Continue # Lo [3] HALFWIDTH HANGUL LETTER EU..HALFWIDTH HA
2EBF0..2EE5D ; XID_Continue # Lo [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
2F800..2FA1D ; XID_Continue # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
30000..3134A ; XID_Continue # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; XID_Continue # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
31350..33479 ; XID_Continue # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
E0100..E01EF ; XID_Continue # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256

# Total code points: 144527
# Total code points: 148825

# ================================================

Expand Down Expand Up @@ -12810,9 +12810,9 @@ FFFC..FFFD ; Grapheme_Base # So [2] OBJECT REPLACEMENT CHARACTER..REPLACEME
2EBF0..2EE5D ; Grapheme_Base # Lo [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
2F800..2FA1D ; Grapheme_Base # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
30000..3134A ; Grapheme_Base # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; Grapheme_Base # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
31350..33479 ; Grapheme_Base # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 152735
# Total code points: 157033

# ================================================

Expand Down
4 changes: 2 additions & 2 deletions unicodetools/data/ucd/dev/DerivedNormalizationProps.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# DerivedNormalizationProps-16.0.0.txt
# Date: 2024-04-30, 21:48:18 GMT
# DerivedNormalizationProps-17.0.0.txt
# Date: 2024-11-14, 15:48:05 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down
6 changes: 3 additions & 3 deletions unicodetools/data/ucd/dev/EastAsianWidth.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# EastAsianWidth-17.0.0.txt
# Date: 2024-11-13, 12:26:54 GMT
# Date: 2024-11-14, 15:48:07 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -2675,8 +2675,8 @@ FFFD ; A # So REPLACEMENT CHARACTER
2FA20..2FFFD ; W # Cn [1502] <reserved-2FA20>..<reserved-2FFFD>
30000..3134A ; W # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
3134B..3134F ; W # Cn [5] <reserved-3134B>..<reserved-3134F>
31350..323AF ; W # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
323B0..3FFFD ; W # Cn [56398] <reserved-323B0>..<reserved-3FFFD>
31350..33479 ; W # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
3347A..3FFFD ; W # Cn [52100] <reserved-3347A>..<reserved-3FFFD>
E0001 ; N # Cf LANGUAGE TAG
E0020..E007F ; N # Cf [96] TAG SPACE..CANCEL TAG
E0100..E01EF ; A # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
Expand Down
6 changes: 3 additions & 3 deletions unicodetools/data/ucd/dev/LineBreak.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# LineBreak-17.0.0.txt
# Date: 2024-11-13, 12:26:54 GMT
# Date: 2024-11-14, 15:48:07 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -3659,8 +3659,8 @@ FFFD ; AI # So REPLACEMENT CHARACTER
2FA20..2FFFD ; ID # Cn [1502] <reserved-2FA20>..<reserved-2FFFD>
30000..3134A ; ID # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
3134B..3134F ; ID # Cn [5] <reserved-3134B>..<reserved-3134F>
31350..323AF ; ID # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
323B0..3FFFD ; ID # Cn [56398] <reserved-323B0>..<reserved-3FFFD>
31350..33479 ; ID # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479
3347A..3FFFD ; ID # Cn [52100] <reserved-3347A>..<reserved-3FFFD>
E0001 ; CM # Cf LANGUAGE TAG
E0020..E007F ; CM # Cf [96] TAG SPACE..CANCEL TAG
E0100..E01EF ; CM # Mn [240] VARIATION SELECTOR-17..VARIATION SELECTOR-256
Expand Down
10 changes: 5 additions & 5 deletions unicodetools/data/ucd/dev/PropList.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# PropList-17.0.0.txt
# Date: 2024-11-13, 12:27:01 GMT
# Date: 2024-11-14, 15:48:14 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -883,9 +883,9 @@ FA70..FAD9 ; Ideographic # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COM
2EBF0..2EE5D ; Ideographic # Lo [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
2F800..2FA1D ; Ideographic # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
30000..3134A ; Ideographic # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; Ideographic # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
31350..33479 ; Ideographic # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 106482
# Total code points: 110780

# ================================================

Expand Down Expand Up @@ -1365,9 +1365,9 @@ FA27..FA29 ; Unified_Ideograph # Lo [3] CJK COMPATIBILITY IDEOGRAPH-FA27..C
2CEB0..2EBE0 ; Unified_Ideograph # Lo [7473] CJK UNIFIED IDEOGRAPH-2CEB0..CJK UNIFIED IDEOGRAPH-2EBE0
2EBF0..2EE5D ; Unified_Ideograph # Lo [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
30000..3134A ; Unified_Ideograph # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; Unified_Ideograph # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
31350..33479 ; Unified_Ideograph # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 97685
# Total code points: 101983

# ================================================

Expand Down
3 changes: 2 additions & 1 deletion unicodetools/data/ucd/dev/PropertyValueAliases.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# PropertyValueAliases-17.0.0.txt
# Date: 2024-09-11, 23:38:17 GMT
# Date: 2024-10-16, 13:48:47 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -212,6 +212,7 @@ blk; CJK_Ext_F ; CJK_Unified_Ideographs_Extension_F
blk; CJK_Ext_G ; CJK_Unified_Ideographs_Extension_G
blk; CJK_Ext_H ; CJK_Unified_Ideographs_Extension_H
blk; CJK_Ext_I ; CJK_Unified_Ideographs_Extension_I
blk; CJK_Ext_J ; CJK_Unified_Ideographs_Extension_J
blk; CJK_Radicals_Sup ; CJK_Radicals_Supplement
blk; CJK_Strokes ; CJK_Strokes
blk; CJK_Symbols ; CJK_Symbols_And_Punctuation
Expand Down
6 changes: 3 additions & 3 deletions unicodetools/data/ucd/dev/Scripts.txt
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
# Scripts-17.0.0.txt
# Date: 2024-11-13, 12:27:13 GMT
# Date: 2024-11-14, 15:48:27 GMT
# © 2024 Unicode®, Inc.
# Unicode and the Unicode Logo are registered trademarks of Unicode, Inc. in the U.S. and other countries.
# For terms of use and license, see https://www.unicode.org/terms_of_use.html
Expand Down Expand Up @@ -1602,9 +1602,9 @@ FA70..FAD9 ; Han # Lo [106] CJK COMPATIBILITY IDEOGRAPH-FA70..CJK COMPATIBILI
2EBF0..2EE5D ; Han # Lo [622] CJK UNIFIED IDEOGRAPH-2EBF0..CJK UNIFIED IDEOGRAPH-2EE5D
2F800..2FA1D ; Han # Lo [542] CJK COMPATIBILITY IDEOGRAPH-2F800..CJK COMPATIBILITY IDEOGRAPH-2FA1D
30000..3134A ; Han # Lo [4939] CJK UNIFIED IDEOGRAPH-30000..CJK UNIFIED IDEOGRAPH-3134A
31350..323AF ; Han # Lo [4192] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-323AF
31350..33479 ; Han # Lo [8490] CJK UNIFIED IDEOGRAPH-31350..CJK UNIFIED IDEOGRAPH-33479

# Total code points: 99035
# Total code points: 103333

# ================================================

Expand Down
2 changes: 2 additions & 0 deletions unicodetools/data/ucd/dev/UnicodeData.txt
Original file line number Diff line number Diff line change
Expand Up @@ -39775,6 +39775,8 @@ FFFD;REPLACEMENT CHARACTER;So;0;ON;;;;;N;;;;;
3134A;<CJK Ideograph Extension G, Last>;Lo;0;L;;;;;N;;;;;
31350;<CJK Ideograph Extension H, First>;Lo;0;L;;;;;N;;;;;
323AF;<CJK Ideograph Extension H, Last>;Lo;0;L;;;;;N;;;;;
323B0;<CJK Ideograph Extension J, First>;Lo;0;L;;;;;N;;;;;
33479;<CJK Ideograph Extension J, Last>;Lo;0;L;;;;;N;;;;;
E0001;LANGUAGE TAG;Cf;0;BN;;;;;N;;;;;
E0020;TAG SPACE;Cf;0;BN;;;;;N;;;;;
E0021;TAG EXCLAMATION MARK;Cf;0;BN;;;;;N;;;;;
Expand Down
42 changes: 0 additions & 42 deletions unicodetools/data/ucd/dev/Unihan/kGB7.txt

This file was deleted.

4 changes: 2 additions & 2 deletions unicodetools/data/ucd/dev/Unihan/kHanYu.txt
Original file line number Diff line number Diff line change
Expand Up @@ -108,7 +108,7 @@
34A5 10238.090
34A6 10238.100
34A7 10239.090
34A8 10239.060
34A8 10239.050
34A9 10240.080
34AB 10266.050
34AD 10273.120
Expand Down Expand Up @@ -27570,7 +27570,7 @@
20452 10238.150
20453 80009.150
20454 80009.160
20457 10239.050
20457 10239.060
20458 10239.020
20459 10239.110
2045A 10239.010
Expand Down
Loading

0 comments on commit b25a3cc

Please sign in to comment.