Some character mappings that may need to be handled differently when searching #1970

Open
20 of 37 tasks
Yorwba opened this issue Oct 4, 2019 · 8 comments

Yorwba commented Oct 4, 2019

Wall thread: https://tatoeba.org/eng/wall/show_message/33106#message_33106

Alan suggested I create a GitHub ticket and add more information, so here it is. I used spoilers to hide most of the gruesome details by default and added checkboxes to each group of characters so we have some chance of keeping track of the progress that will hopefully be made.

I'd like to apologize to anyone who receives an email notification about this in a client that doesn't support the spoiler tags.

EDIT: Since GitHub doesn't like it when people post huge amounts of text in the issue description, I had to abbreviate a bit. An annotation such as (ex:6674905,uses:16) means that the character appears in 16 different sentences, one of which is sentence 6674905.
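
For anyone who wants to tally the counts below mechanically, the abbreviated annotations can be parsed with a short script. This is only a sketch: the function name `parse_usages` and the regular expression are assumptions based on how the entries are written in this issue, not part of any Tatoeba tooling.

```python
import re

# Matches one "lang (ex:ID,uses:N)" entry as written in this issue.
# The pattern is an assumption derived from the entries below.
USAGE_RE = re.compile(r"\b([a-z]{3}) \(ex:(\d+),uses:(\d+)\)")


def parse_usages(text):
    """Return {language code: (example sentence id, number of sentences)}."""
    return {
        lang: (int(example), int(uses))
        for lang, example, uses in USAGE_RE.findall(text)
    }


line = "fra (ex:7962232,uses:2), vie (ex:7027190,uses:27)"
usages = parse_usages(line)
total = sum(uses for _, uses in usages.values())
print(usages)
print(total)
```

Summing the `uses` values per group gives a rough idea of how many sentences a given mapping would touch.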

Duplicate Encodings a.k.a. Unicode NFC

  • ά → ά έ → έ ή → ή ί → ί ό → ό ύ → ύ ώ → ώ Affects: Ancient Greek [grc]
    • ‌ά‌ (U+1f71 GREEK SMALL LETTER ALPHA WITH OXIA) → ‌ά‌ (U+3ac GREEK SMALL LETTER ALPHA WITH TONOS) grc (ex:6674905,uses:16)
    • ‌έ‌ (U+1f73 GREEK SMALL LETTER EPSILON WITH OXIA) → ‌έ‌ (U+3ad GREEK SMALL LETTER EPSILON WITH TONOS) grc (ex:6919731,uses:23)
    • ‌ή‌ (U+1f75 GREEK SMALL LETTER ETA WITH OXIA) → ‌ή‌ (U+3ae GREEK SMALL LETTER ETA WITH TONOS) grc (ex:6919731,uses:10)
    • ‌ί‌ (U+1f77 GREEK SMALL LETTER IOTA WITH OXIA) → ‌ί‌ (U+3af GREEK SMALL LETTER IOTA WITH TONOS) grc (ex:6919723,uses:16)
    • ‌ό‌ (U+1f79 GREEK SMALL LETTER OMICRON WITH OXIA) → ‌ό‌ (U+3cc GREEK SMALL LETTER OMICRON WITH TONOS) grc (ex:6919728,uses:17)
    • ‌ύ‌ (U+1f7b GREEK SMALL LETTER UPSILON WITH OXIA) → ‌ύ‌ (U+3cd GREEK SMALL LETTER UPSILON WITH TONOS) grc (ex:6919728,uses:15)
    • ‌ώ‌ (U+1f7d GREEK SMALL LETTER OMEGA WITH OXIA) → ‌ώ‌ (U+3ce GREEK SMALL LETTER OMEGA WITH TONOS) grc (ex:6919726,uses:8)
  • 不 → 不 粒 → 粒 行 → 行 Affects: Literary Chinese [lzh], Cantonese [yue]
    • ‌不‌ (U+f967 CJK COMPATIBILITY IDEOGRAPH-F967) → ‌不‌ (U+4e0d CJK UNIFIED IDEOGRAPH-4E0D) lzh (ex:2929191,uses:3)
    • ‌粒‌ (U+f9f9 CJK COMPATIBILITY IDEOGRAPH-F9F9) → ‌粒‌ (U+7c92 CJK UNIFIED IDEOGRAPH-7C92) yue (ex:2600627,uses:1)
    • ‌行‌ (U+fa08 CJK COMPATIBILITY IDEOGRAPH-FA08) → ‌行‌ (U+884c CJK UNIFIED IDEOGRAPH-884C) lzh (ex:2929051,uses:1)
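
To double-check these single-codepoint mappings, here is a minimal Python sketch using the standard `unicodedata` module. The helper names `describe` and `nfc_mapping` are hypothetical; they print code points in roughly the notation used in this issue. Results depend on the Unicode version bundled with the Python interpreter, although these particular mappings are long-standing.

```python
import unicodedata


def describe(text):
    """Render each code point as "(U+xxxx NAME)", similar to the lists above."""
    return "".join(
        f"(U+{ord(ch):x} {unicodedata.name(ch, 'UNNAMED')})" for ch in text
    )


def nfc_mapping(text):
    """Return a before/after description of NFC-normalizing `text`."""
    normalized = unicodedata.normalize("NFC", text)
    return f"{describe(text)} → {describe(normalized)}"


# U+1F71 (alpha with oxia) and U+F967 (compatibility ideograph), from the lists above.
for ch in ("\u1f71", "\uf967"):
    print(nfc_mapping(ch))
```

Since each of these characters maps to exactly one other character, normalizing does not change the length of a sentence, only which code point is stored.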

Duplicate Encodings (multiple codepoints)

  • à → à á → á â → â ã → ã ä → ä ả → ả å → å ạ → ạ ć → ć ĉ → ĉ ç → ç è → è é → é ê → ê ẹ → ẹ ę → ę ĝ → ĝ ḥ → ḥ ì → ì í → í ỉ → ỉ ị → ị ĵ → ĵ ň → ň ò → ò ó → ó õ → õ ö → ö ỏ → ỏ ọ → ọ ǫ → ǫ ṛ → ṛ ŝ → ŝ ṣ → ṣ ş → ş ṭ → ṭ ù → ù ú → ú ũ → ũ ŭ → ŭ ü → ü ủ → ủ ụ → ụ ý → ý ẓ → ẓ ầ → ầ ấ → ấ ẫ → ẫ ậ → ậ ề → ề ế → ế ễ → ễ ệ → ệ ố → ố ỗ → ỗ ổ → ổ ằ → ằ ắ → ắ ẳ → ẳ ặ → ặ ờ → ờ ớ → ớ ở → ở ợ → ợ ừ → ừ ứ → ứ ữ → ữ ử → ử ự → ự Affects: Finnish [fin], Interlingue [ile], Spanish [spa], Turkmen [tuk], Russian [rus], Esperanto [epo], Swedish [swe], Yoruba [yor], Tatar [tat], Shuswap [shs], Hungarian [hun], Italian [ita], Lingala [lin], Cayuga [cay], French [fra], Vietnamese [vie], Berber [ber], Navajo [nav], Serbian [srp], Kabyle [kab], Turkish [tur]
    • ‌à‌ (U+61 LATIN SMALL LETTER A)(U+300 COMBINING GRAVE ACCENT) → ‌à‌ (U+e0 LATIN SMALL LETTER A WITH GRAVE) fra (ex:7962232,uses:2), vie (ex:7027190,uses:27)
    • ‌á‌ (U+61 LATIN SMALL LETTER A)(U+301 COMBINING ACUTE ACCENT) → ‌á‌ (U+e1 LATIN SMALL LETTER A WITH ACUTE) nav (ex:7269805,uses:3), lin (ex:3649579,uses:4), cay (ex:7678161,uses:1), vie (ex:7027190,uses:6), hun (ex:3087045,uses:1), ile (ex:2822845,uses:1), spa (ex:6107095,uses:2)
    • ‌â‌ (U+61 LATIN SMALL LETTER A)(U+302 COMBINING CIRCUMFLEX ACCENT) → ‌â‌ (U+e2 LATIN SMALL LETTER A WITH CIRCUMFLEX) fra (ex:7962232,uses:1)
    • ‌ã‌ (U+61 LATIN SMALL LETTER A)(U+303 COMBINING TILDE) → ‌ã‌ (U+e3 LATIN SMALL LETTER A WITH TILDE) vie (ex:3349718,uses:2)
    • ‌ä‌ (U+61 LATIN SMALL LETTER A)(U+308 COMBINING DIAERESIS) → ‌ä‌ (U+e4 LATIN SMALL LETTER A WITH DIAERESIS) swe (ex:7046127,uses:7), fin (ex:917714,uses:2)
    • ‌ả‌ (U+61 LATIN SMALL LETTER A)(U+309 COMBINING HOOK ABOVE) → ‌ả‌ (U+1ea3 LATIN SMALL LETTER A WITH HOOK ABOVE) vie (ex:5106168,uses:1)
    • ‌å‌ (U+61 LATIN SMALL LETTER A)(U+30a COMBINING RING ABOVE) → ‌å‌ (U+e5 LATIN SMALL LETTER A WITH RING ABOVE) swe (ex:7046127,uses:10)
    • ‌ạ‌ (U+61 LATIN SMALL LETTER A)(U+323 COMBINING DOT BELOW) → ‌ạ‌ (U+1ea1 LATIN SMALL LETTER A WITH DOT BELOW) vie (ex:6588610,uses:5)
    • ‌ć‌ (U+63 LATIN SMALL LETTER C)(U+301 COMBINING ACUTE ACCENT) → ‌ć‌ (U+107 LATIN SMALL LETTER C WITH ACUTE) srp (ex:6196047,uses:34)
    • ‌ĉ‌ (U+63 LATIN SMALL LETTER C)(U+302 COMBINING CIRCUMFLEX ACCENT) → ‌ĉ‌ (U+109 LATIN SMALL LETTER C WITH CIRCUMFLEX) epo (ex:7274473,uses:1)
    • ‌ç‌ (U+63 LATIN SMALL LETTER C)(U+327 COMBINING CEDILLA) → ‌ç‌ (U+e7 LATIN SMALL LETTER C WITH CEDILLA) fra (ex:7962232,uses:1)
    • ‌è‌ (U+65 LATIN SMALL LETTER E)(U+300 COMBINING GRAVE ACCENT) → ‌è‌ (U+e8 LATIN SMALL LETTER E WITH GRAVE) fra (ex:2158753,uses:1), ita (ex:7104143,uses:1), vie (ex:4111384,uses:4)
    • ‌é‌ (U+65 LATIN SMALL LETTER E)(U+301 COMBINING ACUTE ACCENT) → ‌é‌ (U+e9 LATIN SMALL LETTER E WITH ACUTE) fra (ex:7962232,uses:4), spa (ex:6804574,uses:6), vie (ex:6588607,uses:3)
    • ‌ê‌ (U+65 LATIN SMALL LETTER E)(U+302 COMBINING CIRCUMFLEX ACCENT) → ‌ê‌ (U+ea LATIN SMALL LETTER E WITH CIRCUMFLEX) fra (ex:7962232,uses:1)
    • ‌ẹ‌ (U+65 LATIN SMALL LETTER E)(U+323 COMBINING DOT BELOW) → ‌ẹ‌ (U+1eb9 LATIN SMALL LETTER E WITH DOT BELOW) vie (ex:2807128,uses:1)
    • ‌ę‌ (U+65 LATIN SMALL LETTER E)(U+328 COMBINING OGONEK) → ‌ę‌ (U+119 LATIN SMALL LETTER E WITH OGONEK) cay (ex:7678161,uses:3)
    • ‌ĝ‌ (U+67 LATIN SMALL LETTER G)(U+302 COMBINING CIRCUMFLEX ACCENT) → ‌ĝ‌ (U+11d LATIN SMALL LETTER G WITH CIRCUMFLEX) epo (ex:7274473,uses:1)
    • ‌ḥ‌ (U+68 LATIN SMALL LETTER H)(U+323 COMBINING DOT BELOW) → ‌ḥ‌ (U+1e25 LATIN SMALL LETTER H WITH DOT BELOW) ber (ex:8117598,uses:4), kab (ex:7079327,uses:5)
    • ‌ì‌ (U+69 LATIN SMALL LETTER I)(U+300 COMBINING GRAVE ACCENT) → ‌ì‌ (U+ec LATIN SMALL LETTER I WITH GRAVE) vie (ex:6552568,uses:7)
    • ‌í‌ (U+69 LATIN SMALL LETTER I)(U+301 COMBINING ACUTE ACCENT) → ‌í‌ (U+ed LATIN SMALL LETTER I WITH ACUTE) vie (ex:6552468,uses:3), lin (ex:3649591,uses:3), tat (ex:3285464,uses:1)
    • ‌ỉ‌ (U+69 LATIN SMALL LETTER I)(U+309 COMBINING HOOK ABOVE) → ‌ỉ‌ (U+1ec9 LATIN SMALL LETTER I WITH HOOK ABOVE) vie (ex:2807010,uses:1)
    • ‌ị‌ (U+69 LATIN SMALL LETTER I)(U+323 COMBINING DOT BELOW) → ‌ị‌ (U+1ecb LATIN SMALL LETTER I WITH DOT BELOW) vie (ex:7027190,uses:4)
    • ‌ĵ‌ (U+6a LATIN SMALL LETTER J)(U+302 COMBINING CIRCUMFLEX ACCENT) → ‌ĵ‌ (U+135 LATIN SMALL LETTER J WITH CIRCUMFLEX) epo (ex:3162694,uses:1)
    • ‌ň‌ (U+6e LATIN SMALL LETTER N)(U+30c COMBINING CARON) → ‌ň‌ (U+148 LATIN SMALL LETTER N WITH CARON) tuk (ex:3260109,uses:2)
    • ‌ò‌ (U+6f LATIN SMALL LETTER O)(U+300 COMBINING GRAVE ACCENT) → ‌ò‌ (U+f2 LATIN SMALL LETTER O WITH GRAVE) vie (ex:6552458,uses:5)
    • ‌ó‌ (U+6f LATIN SMALL LETTER O)(U+301 COMBINING ACUTE ACCENT) → ‌ó‌ (U+f3 LATIN SMALL LETTER O WITH ACUTE) spa (ex:6152443,uses:1), lin (ex:3649459,uses:2), rus (ex:3063650,uses:1), vie (ex:6552493,uses:17)
    • ‌õ‌ (U+6f LATIN SMALL LETTER O)(U+303 COMBINING TILDE) → ‌õ‌ (U+f5 LATIN SMALL LETTER O WITH TILDE) vie (ex:2808772,uses:1)
    • ‌ö‌ (U+6f LATIN SMALL LETTER O)(U+308 COMBINING DIAERESIS) → ‌ö‌ (U+f6 LATIN SMALL LETTER O WITH DIAERESIS) swe (ex:7045932,uses:1)
    • ‌ỏ‌ (U+6f LATIN SMALL LETTER O)(U+309 COMBINING HOOK ABOVE) → ‌ỏ‌ (U+1ecf LATIN SMALL LETTER O WITH HOOK ABOVE) vie (ex:6552468,uses:1)
    • ‌ọ‌ (U+6f LATIN SMALL LETTER O)(U+323 COMBINING DOT BELOW) → ‌ọ‌ (U+1ecd LATIN SMALL LETTER O WITH DOT BELOW) vie (ex:6588610,uses:7), yor (ex:3714005,uses:2)
    • ‌ǫ‌ (U+6f LATIN SMALL LETTER O)(U+328 COMBINING OGONEK) → ‌ǫ‌ (U+1eb LATIN SMALL LETTER O WITH OGONEK) cay (ex:7678161,uses:3)
    • ‌ṛ‌ (U+72 LATIN SMALL LETTER R)(U+323 COMBINING DOT BELOW) → ‌ṛ‌ (U+1e5b LATIN SMALL LETTER R WITH DOT BELOW) ber (ex:8218753,uses:6), kab (ex:8176260,uses:35)
    • ‌ŝ‌ (U+73 LATIN SMALL LETTER S)(U+302 COMBINING CIRCUMFLEX ACCENT) → ‌ŝ‌ (U+15d LATIN SMALL LETTER S WITH CIRCUMFLEX) epo (ex:3164015,uses:1)
    • ‌ṣ‌ (U+73 LATIN SMALL LETTER S)(U+323 COMBINING DOT BELOW) → ‌ṣ‌ (U+1e63 LATIN SMALL LETTER S WITH DOT BELOW) kab (ex:7076810,uses:2), yor (ex:3713991,uses:1)
    • ‌ş‌ (U+73 LATIN SMALL LETTER S)(U+327 COMBINING CEDILLA) → ‌ş‌ (U+15f LATIN SMALL LETTER S WITH CEDILLA) tur (ex:8129794,uses:1)
    • ‌ṭ‌ (U+74 LATIN SMALL LETTER T)(U+323 COMBINING DOT BELOW) → ‌ṭ‌ (U+1e6d LATIN SMALL LETTER T WITH DOT BELOW) kab (ex:7027791,uses:1)
    • ‌ù‌ (U+75 LATIN SMALL LETTER U)(U+300 COMBINING GRAVE ACCENT) → ‌ù‌ (U+f9 LATIN SMALL LETTER U WITH GRAVE) vie (ex:3349738,uses:2), yor (ex:3713991,uses:1)
    • ‌ú‌ (U+75 LATIN SMALL LETTER U)(U+301 COMBINING ACUTE ACCENT) → ‌ú‌ (U+fa LATIN SMALL LETTER U WITH ACUTE) spa (ex:3324250,uses:1), shs (ex:3160201,uses:1), vie (ex:6588610,uses:5)
    • ‌ũ‌ (U+75 LATIN SMALL LETTER U)(U+303 COMBINING TILDE) → ‌ũ‌ (U+169 LATIN SMALL LETTER U WITH TILDE) vie (ex:6588607,uses:1)
    • ‌ŭ‌ (U+75 LATIN SMALL LETTER U)(U+306 COMBINING BREVE) → ‌ŭ‌ (U+16d LATIN SMALL LETTER U WITH BREVE) epo (ex:3163980,uses:1)
    • ‌ü‌ (U+75 LATIN SMALL LETTER U)(U+308 COMBINING DIAERESIS) → ‌ü‌ (U+fc LATIN SMALL LETTER U WITH DIAERESIS) tur (ex:8129794,uses:1), hun (ex:3087045,uses:1)
    • ‌ủ‌ (U+75 LATIN SMALL LETTER U)(U+309 COMBINING HOOK ABOVE) → ‌ủ‌ (U+1ee7 LATIN SMALL LETTER U WITH HOOK ABOVE) vie (ex:6552524,uses:2)
    • ‌ụ‌ (U+75 LATIN SMALL LETTER U)(U+323 COMBINING DOT BELOW) → ‌ụ‌ (U+1ee5 LATIN SMALL LETTER U WITH DOT BELOW) vie (ex:6552458,uses:2)
    • ‌ý‌ (U+79 LATIN SMALL LETTER Y)(U+301 COMBINING ACUTE ACCENT) → ‌ý‌ (U+fd LATIN SMALL LETTER Y WITH ACUTE) vie (ex:4278430,uses:1), tuk (ex:3141157,uses:3)
    • ‌ẓ‌ (U+7a LATIN SMALL LETTER Z)(U+323 COMBINING DOT BELOW) → ‌ẓ‌ (U+1e93 LATIN SMALL LETTER Z WITH DOT BELOW) kab (ex:7027633,uses:1)
    • ‌ầ‌ (U+e2 LATIN SMALL LETTER A WITH CIRCUMFLEX)(U+300 COMBINING GRAVE ACCENT) → ‌ầ‌ (U+1ea7 LATIN SMALL LETTER A WITH CIRCUMFLEX AND GRAVE) vie (ex:6588613,uses:8)
    • ‌ấ‌ (U+e2 LATIN SMALL LETTER A WITH CIRCUMFLEX)(U+301 COMBINING ACUTE ACCENT) → ‌ấ‌ (U+1ea5 LATIN SMALL LETTER A WITH CIRCUMFLEX AND ACUTE) vie (ex:6588613,uses:7)
    • ‌ẫ‌ (U+e2 LATIN SMALL LETTER A WITH CIRCUMFLEX)(U+303 COMBINING TILDE) → ‌ẫ‌ (U+1eab LATIN SMALL LETTER A WITH CIRCUMFLEX AND TILDE) vie (ex:2808776,uses:1)
    • ‌ậ‌ (U+e2 LATIN SMALL LETTER A WITH CIRCUMFLEX)(U+323 COMBINING DOT BELOW) → ‌ậ‌ (U+1ead LATIN SMALL LETTER A WITH CIRCUMFLEX AND DOT BELOW) vie (ex:2808749,uses:3)
    • ‌ề‌ (U+ea LATIN SMALL LETTER E WITH CIRCUMFLEX)(U+300 COMBINING GRAVE ACCENT) → ‌ề‌ (U+1ec1 LATIN SMALL LETTER E WITH CIRCUMFLEX AND GRAVE) vie (ex:3323949,uses:2)
    • ‌ế‌ (U+ea LATIN SMALL LETTER E WITH CIRCUMFLEX)(U+301 COMBINING ACUTE ACCENT) → ‌ế‌ (U+1ebf LATIN SMALL LETTER E WITH CIRCUMFLEX AND ACUTE) vie (ex:6552568,uses:8)
    • ‌ễ‌ (U+ea LATIN SMALL LETTER E WITH CIRCUMFLEX)(U+303 COMBINING TILDE) → ‌ễ‌ (U+1ec5 LATIN SMALL LETTER E WITH CIRCUMFLEX AND TILDE) vie (ex:5106134,uses:1)
    • ‌ệ‌ (U+ea LATIN SMALL LETTER E WITH CIRCUMFLEX)(U+323 COMBINING DOT BELOW) → ‌ệ‌ (U+1ec7 LATIN SMALL LETTER E WITH CIRCUMFLEX AND DOT BELOW) vie (ex:7027190,uses:2)
    • ‌ố‌ (U+f4 LATIN SMALL LETTER O WITH CIRCUMFLEX)(U+301 COMBINING ACUTE ACCENT) → ‌ố‌ (U+1ed1 LATIN SMALL LETTER O WITH CIRCUMFLEX AND ACUTE) vie (ex:6552468,uses:5)
    • ‌ỗ‌ (U+f4 LATIN SMALL LETTER O WITH CIRCUMFLEX)(U+303 COMBINING TILDE) → ‌ỗ‌ (U+1ed7 LATIN SMALL LETTER O WITH CIRCUMFLEX AND TILDE) vie (ex:6552468,uses:2)
    • ‌ổ‌ (U+f4 LATIN SMALL LETTER O WITH CIRCUMFLEX)(U+309 COMBINING HOOK ABOVE) → ‌ổ‌ (U+1ed5 LATIN SMALL LETTER O WITH CIRCUMFLEX AND HOOK ABOVE) vie (ex:2808713,uses:2)
    • ‌ằ‌ (U+103 LATIN SMALL LETTER A WITH BREVE)(U+300 COMBINING GRAVE ACCENT) → ‌ằ‌ (U+1eb1 LATIN SMALL LETTER A WITH BREVE AND GRAVE) vie (ex:2808812,uses:2)
    • ‌ắ‌ (U+103 LATIN SMALL LETTER A WITH BREVE)(U+301 COMBINING ACUTE ACCENT) → ‌ắ‌ (U+1eaf LATIN SMALL LETTER A WITH BREVE AND ACUTE) vie (ex:2808725,uses:1)
    • ‌ẳ‌ (U+103 LATIN SMALL LETTER A WITH BREVE)(U+309 COMBINING HOOK ABOVE) → ‌ẳ‌ (U+1eb3 LATIN SMALL LETTER A WITH BREVE AND HOOK ABOVE) vie (ex:6552524,uses:1)
    • ‌ặ‌ (U+103 LATIN SMALL LETTER A WITH BREVE)(U+323 COMBINING DOT BELOW) → ‌ặ‌ (U+1eb7 LATIN SMALL LETTER A WITH BREVE AND DOT BELOW) vie (ex:2808819,uses:2)
    • ‌ờ‌ (U+1a1 LATIN SMALL LETTER O WITH HORN)(U+300 COMBINING GRAVE ACCENT) → ‌ờ‌ (U+1edd LATIN SMALL LETTER O WITH HORN AND GRAVE) vie (ex:7027190,uses:9)
    • ‌ớ‌ (U+1a1 LATIN SMALL LETTER O WITH HORN)(U+301 COMBINING ACUTE ACCENT) → ‌ớ‌ (U+1edb LATIN SMALL LETTER O WITH HORN AND ACUTE) vie (ex:7027190,uses:9)
    • ‌ở‌ (U+1a1 LATIN SMALL LETTER O WITH HORN)(U+309 COMBINING HOOK ABOVE) → ‌ở‌ (U+1edf LATIN SMALL LETTER O WITH HORN AND HOOK ABOVE) vie (ex:7027190,uses:1)
    • ‌ợ‌ (U+1a1 LATIN SMALL LETTER O WITH HORN)(U+323 COMBINING DOT BELOW) → ‌ợ‌ (U+1ee3 LATIN SMALL LETTER O WITH HORN AND DOT BELOW) vie (ex:7027190,uses:4)
    • ‌ừ‌ (U+1b0 LATIN SMALL LETTER U WITH HORN)(U+300 COMBINING GRAVE ACCENT) → ‌ừ‌ (U+1eeb LATIN SMALL LETTER U WITH HORN AND GRAVE) vie (ex:1697289,uses:1)
    • ‌ứ‌ (U+1b0 LATIN SMALL LETTER U WITH HORN)(U+301 COMBINING ACUTE ACCENT) → ‌ứ‌ (U+1ee9 LATIN SMALL LETTER U WITH HORN AND ACUTE) vie (ex:2808789,uses:3)
    • ‌ữ‌ (U+1b0 LATIN SMALL LETTER U WITH HORN)(U+303 COMBINING TILDE) → ‌ữ‌ (U+1eef LATIN SMALL LETTER U WITH HORN AND TILDE) vie (ex:6552493,uses:3)
    • ‌ử‌ (U+1b0 LATIN SMALL LETTER U WITH HORN)(U+309 COMBINING HOOK ABOVE) → ‌ử‌ (U+1eed LATIN SMALL LETTER U WITH HORN AND HOOK ABOVE) vie (ex:6588607,uses:1)
    • ‌ự‌ (U+1b0 LATIN SMALL LETTER U WITH HORN)(U+323 COMBINING DOT BELOW) → ‌ự‌ (U+1ef1 LATIN SMALL LETTER U WITH HORN AND DOT BELOW) vie (ex:5394226,uses:1)
  • й → й Affects: Bashkir [bak]
    • ‌й‌ (U+438 CYRILLIC SMALL LETTER I)(U+306 COMBINING BREVE) → ‌й‌ (U+439 CYRILLIC SMALL LETTER SHORT I) bak (ex:2839371,uses:2)
  • آ → آ أ → أ ؤ → ؤ Affects: Arabic [ara], Urdu [urd], Persian [pes]
    • ‌آ‌ (U+627 ARABIC LETTER ALEF)(U+653 ARABIC MADDAH ABOVE) → ‌آ‌ (U+622 ARABIC LETTER ALEF WITH MADDA ABOVE) ara (ex:1990503,uses:1), urd (ex:7997087,uses:2)
    • ‌أ‌ (U+627 ARABIC LETTER ALEF)(U+654 ARABIC HAMZA ABOVE) → ‌أ‌ (U+623 ARABIC LETTER ALEF WITH HAMZA ABOVE) pes (ex:8192346,uses:41)
    • ‌ؤ‌ (U+648 ARABIC LETTER WAW)(U+654 ARABIC HAMZA ABOVE) → ‌ؤ‌ (U+624 ARABIC LETTER WAW WITH HAMZA ABOVE) pes (ex:8046406,uses:13)
  • ऱ → ऱ क़ → क़ ख़ → ख़ ग़ → ग़ ज़ → ज़ ड़ → ड़ ढ़ → ढ़ फ़ → फ़ Affects: Marathi [mar], Hindi [hin], Garhwali [gbm]
    • ‌ऱ‌ (U+930 DEVANAGARI LETTER RA)(U+93c DEVANAGARI SIGN NUKTA) → ‌ऱ‌ (U+931 DEVANAGARI LETTER RRA) mar (ex:3946809,uses:1)
    • ‌क़‌ (U+958 DEVANAGARI LETTER QA) → ‌क़‌ (U+915 DEVANAGARI LETTER KA)(U+93c DEVANAGARI SIGN NUKTA) hin (ex:5683375,uses:4)
    • ‌ख़‌ (U+959 DEVANAGARI LETTER KHHA) → ‌ख़‌ (U+916 DEVANAGARI LETTER KHA)(U+93c DEVANAGARI SIGN NUKTA) hin (ex:4907683,uses:19), gbm (ex:4674743,uses:1)
    • ‌ग़‌ (U+95a DEVANAGARI LETTER GHHA) → ‌ग़‌ (U+917 DEVANAGARI LETTER GA)(U+93c DEVANAGARI SIGN NUKTA) hin (ex:3801996,uses:6)
    • ‌ज़‌ (U+95b DEVANAGARI LETTER ZA) → ‌ज़‌ (U+91c DEVANAGARI LETTER JA)(U+93c DEVANAGARI SIGN NUKTA) mar (ex:7731191,uses:4), hin (ex:5690258,uses:45)
    • ‌ड़‌ (U+95c DEVANAGARI LETTER DDDHA) → ‌ड़‌ (U+921 DEVANAGARI LETTER DDA)(U+93c DEVANAGARI SIGN NUKTA) hin (ex:6472450,uses:46), gbm (ex:4674624,uses:2)
    • ‌ढ़‌ (U+95d DEVANAGARI LETTER RHA) → ‌ढ़‌ (U+922 DEVANAGARI LETTER DDHA)(U+93c DEVANAGARI SIGN NUKTA) hin (ex:4163491,uses:13)
    • ‌फ़‌ (U+95e DEVANAGARI LETTER FA) → ‌फ़‌ (U+92b DEVANAGARI LETTER PHA)(U+93c DEVANAGARI SIGN NUKTA) hin (ex:7031114,uses:11), gbm (ex:4674621,uses:2)
  • ড় → ড় ঢ় → ঢ় য় → য় Affects: Bengali [ben], Assamese [asm]
    • ‌ড়‌ (U+9dc BENGALI LETTER RRA) → ‌ড়‌ (U+9a1 BENGALI LETTER DDA)(U+9bc BENGALI SIGN NUKTA) ben (ex:7199468,uses:370)
    • ‌ঢ়‌ (U+9dd BENGALI LETTER RHA) → ‌ঢ়‌ (U+9a2 BENGALI LETTER DDHA)(U+9bc BENGALI SIGN NUKTA) asm (ex:6406035,uses:1)
    • ‌য়‌ (U+9df BENGALI LETTER YYA) → ‌য়‌ (U+9af BENGALI LETTER YA)(U+9bc BENGALI SIGN NUKTA) ben (ex:7808350,uses:1032), asm (ex:6443024,uses:12)
  • ਸ਼ → ਸ਼ ਖ਼ → ਖ਼ ਗ਼ → ਗ਼ ਜ਼ → ਜ਼ ਫ਼ → ਫ਼ Affects: Punjabi (Eastern) [pan]
    • ‌ਸ਼‌ (U+a36 GURMUKHI LETTER SHA) → ‌ਸ਼‌ (U+a38 GURMUKHI LETTER SA)(U+a3c GURMUKHI SIGN NUKTA) pan (ex:6830453,uses:28)
    • ‌ਖ਼‌ (U+a59 GURMUKHI LETTER KHHA) → ‌ਖ਼‌ (U+a16 GURMUKHI LETTER KHA)(U+a3c GURMUKHI SIGN NUKTA) pan (ex:3797178,uses:3)
    • ‌ਗ਼‌ (U+a5a GURMUKHI LETTER GHHA) → ‌ਗ਼‌ (U+a17 GURMUKHI LETTER GA)(U+a3c GURMUKHI SIGN NUKTA) pan (ex:3797261,uses:3)
    • ‌ਜ਼‌ (U+a5b GURMUKHI LETTER ZA) → ‌ਜ਼‌ (U+a1c GURMUKHI LETTER JA)(U+a3c GURMUKHI SIGN NUKTA) pan (ex:6092572,uses:12)
    • ‌ਫ਼‌ (U+a5e GURMUKHI LETTER FA) → ‌ਫ਼‌ (U+a2b GURMUKHI LETTER PHA)(U+a3c GURMUKHI SIGN NUKTA) pan (ex:3827715,uses:6)
  • ோ → ோ Affects: Tamil [tam]
    • ‌ோ‌ (U+bc7 TAMIL VOWEL SIGN EE)(U+bbe TAMIL VOWEL SIGN AA) → ‌ோ‌ (U+bcb TAMIL VOWEL SIGN OO) tam (ex:4267930,uses:3)
  • ೀ → ೀ ೊ → ೊ ೋ → ೋ ೇ → ೇ Affects: Kannada [kan]
    • ‌ೀ‌ (U+cbf KANNADA VOWEL SIGN I)(U+cd5 KANNADA LENGTH MARK) → ‌ೀ‌ (U+cc0 KANNADA VOWEL SIGN II) kan (ex:4774611,uses:1)
    • ‌ೊ‌ (U+cc6 KANNADA VOWEL SIGN E)(U+cc2 KANNADA VOWEL SIGN UU) → ‌ೊ‌ (U+cca KANNADA VOWEL SIGN O) kan (ex:4969643,uses:6)
    • ‌ೋ‌ (U+cc6 KANNADA VOWEL SIGN E)(U+cc2 KANNADA VOWEL SIGN UU)(U+cd5 KANNADA LENGTH MARK) → ‌ೋ‌ (U+ccb KANNADA VOWEL SIGN OO) kan (ex:4776865,uses:1)
    • ‌ೇ‌ (U+cc6 KANNADA VOWEL SIGN E)(U+cd5 KANNADA LENGTH MARK) → ‌ೇ‌ (U+cc7 KANNADA VOWEL SIGN EE) kan (ex:4969643,uses:1)
  • ോ → ോ Affects: Malayalam [mal]
    • ‌ോ‌ (U+d47 MALAYALAM VOWEL SIGN EE)(U+d3e MALAYALAM VOWEL SIGN AA) → ‌ോ‌ (U+d4b MALAYALAM VOWEL SIGN OO) mal (ex:3964449,uses:1)
  • יִ → יִ ײַ → ײַ שׂ → שׂ אַ → אַ אָ → אָ וּ → וּ כּ → כּ פּ → פּ תּ → תּ בֿ → בֿ כֿ → כֿ פֿ → פֿ Affects: Hebrew [heb], Yiddish [yid]
    • ‌יִ‌ (U+fb1d HEBREW LETTER YOD WITH HIRIQ) → ‌יִ‌ (U+5d9 HEBREW LETTER YOD)(U+5b4 HEBREW POINT HIRIQ) yid (ex:1559698,uses:1)
    • ‌ײַ‌ (U+fb1f HEBREW LIGATURE YIDDISH YOD YOD PATAH) → ‌ײַ‌ (U+5f2 HEBREW LIGATURE YIDDISH DOUBLE YOD)(U+5b7 HEBREW POINT PATAH) yid (ex:8106226,uses:103)
    • ‌שׂ‌ (U+fb2b HEBREW LETTER SHIN WITH SIN DOT) → ‌שׂ‌ (U+5e9 HEBREW LETTER SHIN)(U+5c2 HEBREW POINT SIN DOT) yid (ex:1557269,uses:1)
    • ‌אַ‌ (U+fb2e HEBREW LETTER ALEF WITH PATAH) → ‌אַ‌ (U+5d0 HEBREW LETTER ALEF)(U+5b7 HEBREW POINT PATAH) heb (ex:583487,uses:1), yid (ex:8106222,uses:229)
    • ‌אָ‌ (U+fb2f HEBREW LETTER ALEF WITH QAMATS) → ‌אָ‌ (U+5d0 HEBREW LETTER ALEF)(U+5b8 HEBREW POINT QAMATS) yid (ex:8106222,uses:210)
    • ‌וּ‌ (U+fb35 HEBREW LETTER VAV WITH DAGESH) → ‌וּ‌ (U+5d5 HEBREW LETTER VAV)(U+5bc HEBREW POINT DAGESH OR MAPIQ) yid (ex:1583142,uses:11)
    • ‌כּ‌ (U+fb3b HEBREW LETTER KAF WITH DAGESH) → ‌כּ‌ (U+5db HEBREW LETTER KAF)(U+5bc HEBREW POINT DAGESH OR MAPIQ) heb (ex:3044647,uses:1)
    • ‌פּ‌ (U+fb44 HEBREW LETTER PE WITH DAGESH) → ‌פּ‌ (U+5e4 HEBREW LETTER PE)(U+5bc HEBREW POINT DAGESH OR MAPIQ) yid (ex:1585540,uses:43)
    • ‌תּ‌ (U+fb4a HEBREW LETTER TAV WITH DAGESH) → ‌תּ‌ (U+5ea HEBREW LETTER TAV)(U+5bc HEBREW POINT DAGESH OR MAPIQ) yid (ex:1559127,uses:1)
    • ‌בֿ‌ (U+fb4c HEBREW LETTER BET WITH RAFE) → ‌בֿ‌ (U+5d1 HEBREW LETTER BET)(U+5bf HEBREW POINT RAFE) yid (ex:1564997,uses:2)
    • ‌כֿ‌ (U+fb4d HEBREW LETTER KAF WITH RAFE) → ‌כֿ‌ (U+5db HEBREW LETTER KAF)(U+5bf HEBREW POINT RAFE) yid (ex:392343,uses:1)
    • ‌פֿ‌ (U+fb4e HEBREW LETTER PE WITH RAFE) → ‌פֿ‌ (U+5e4 HEBREW LETTER PE)(U+5bf HEBREW POINT RAFE) heb (ex:583487,uses:1), yid (ex:8106222,uses:118)
  • ָֹ → ָֹ ְּ → ְּ ֳּ → ֳּ ִּ → ִּ ֵּ → ֵּ ֶּ → ֶּ ַּ → ַּ ָּ → ָּ ֹּ → ֹּ ֻּ → ֻּ ְׁ → ְׁ ִׁ → ִׁ ֶׁ → ֶׁ ַׁ → ַׁ ָׁ → ָׁ ֹׁ → ֹׁ ֻׁ → ֻׁ ְׂ → ְׂ ִׂ → ִׂ ֵׂ → ֵׂ ָׂ → ָׂ ֹׂ → ֹׂ َّ → َّ ُّ → ُّ ِّ → ِّ ़् → ़् ့် → ့် Affects: Arabic [ara], Persian [pes], North Levantine Arabic [apc], Hindi [hin], Yiddish [yid], Hebrew [heb], Algerian Arabic [arq], Burmese [mya]
    • ‌ָֹ‌ (U+5b9 HEBREW POINT HOLAM)(U+5b8 HEBREW POINT QAMATS) → ‌ָֹ‌ (U+5b8 HEBREW POINT QAMATS)(U+5b9 HEBREW POINT HOLAM) heb (ex:2106929,uses:2)
    • ‌ְּ‌ (U+5bc HEBREW POINT DAGESH OR MAPIQ)(U+5b0 HEBREW POINT SHEVA) → ‌ְּ‌ (U+5b0 HEBREW POINT SHEVA)(U+5bc HEBREW POINT DAGESH OR MAPIQ) heb (ex:8100773,uses:66)
    • ‌ֳּ‌ (U+5bc HEBREW POINT DAGESH OR MAPIQ)(U+5b3 HEBREW POINT HATAF QAMATS) → ‌ֳּ‌ (U+5b3 HEBREW POINT HATAF QAMATS)(U+5bc HEBREW POINT DAGESH OR MAPIQ) heb (ex:3198074,uses:1)
    • ‌ִּ‌ (U+5bc HEBREW POINT DAGESH OR MAPIQ)(U+5b4 HEBREW POINT HIRIQ) → ‌ִּ‌ (U+5b4 HEBREW POINT HIRIQ)(U+5bc HEBREW POINT DAGESH OR MAPIQ) heb (ex:8100773,uses:67)
    • ‌ֵּ‌ (U+5bc HEBREW POINT DAGESH OR MAPIQ)(U+5b5 HEBREW POINT TSERE) → ‌ֵּ‌ (U+5b5 HEBREW POINT TSERE)(U+5bc HEBREW POINT DAGESH OR MAPIQ) heb (ex:8100773,uses:45)
    • ‌ֶּ‌ (U+5bc HEBREW POINT DAGESH OR MAPIQ)(U+5b6 HEBREW POINT SEGOL) → ‌ֶּ‌ (U+5b6 HEBREW POINT SEGOL)(U+5bc HEBREW POINT DAGESH OR MAPIQ) heb (ex:8170851,uses:39)
    • ‌ַּ‌ (U+5bc HEBREW POINT DAGESH OR MAPIQ)(U+5b7 HEBREW POINT PATAH) → ‌ַּ‌ (U+5b7 HEBREW POINT PATAH)(U+5bc HEBREW POINT DAGESH OR MAPIQ) heb (ex:8100773,uses:69)
    • ‌ָּ‌ (U+5bc HEBREW POINT DAGESH OR MAPIQ)(U+5b8 HEBREW POINT QAMATS) → ‌ָּ‌ (U+5b8 HEBREW POINT QAMATS)(U+5bc HEBREW POINT DAGESH OR MAPIQ) heb (ex:8096277,uses:100), yid (ex:3867227,uses:1)
    • ‌ֹּ‌ (U+5bc HEBREW POINT DAGESH OR MAPIQ)(U+5b9 HEBREW POINT HOLAM) → ‌ֹּ‌ (U+5b9 HEBREW POINT HOLAM)(U+5bc HEBREW POINT DAGESH OR MAPIQ) heb (ex:8096277,uses:22)
    • ‌ֻּ‌ (U+5bc HEBREW POINT DAGESH OR MAPIQ)(U+5bb HEBREW POINT QUBUTS) → ‌ֻּ‌ (U+5bb HEBREW POINT QUBUTS)(U+5bc HEBREW POINT DAGESH OR MAPIQ) heb (ex:3217509,uses:1)
    • ‌ְׁ‌ (U+5c1 HEBREW POINT SHIN DOT)(U+5b0 HEBREW POINT SHEVA) → ‌ְׁ‌ (U+5b0 HEBREW POINT SHEVA)(U+5c1 HEBREW POINT SHIN DOT) heb (ex:8087703,uses:24)
    • ‌ִׁ‌ (U+5c1 HEBREW POINT SHIN DOT)(U+5b4 HEBREW POINT HIRIQ) → ‌ִׁ‌ (U+5b4 HEBREW POINT HIRIQ)(U+5c1 HEBREW POINT SHIN DOT) heb (ex:8087703,uses:14)
    • ‌ֶׁ‌ (U+5c1 HEBREW POINT SHIN DOT)(U+5b6 HEBREW POINT SEGOL) → ‌ֶׁ‌ (U+5b6 HEBREW POINT SEGOL)(U+5c1 HEBREW POINT SHIN DOT) heb (ex:5454503,uses:35)
    • ‌ַׁ‌ (U+5c1 HEBREW POINT SHIN DOT)(U+5b7 HEBREW POINT PATAH) → ‌ַׁ‌ (U+5b7 HEBREW POINT PATAH)(U+5c1 HEBREW POINT SHIN DOT) heb (ex:2200420,uses:6)
    • ‌ָׁ‌ (U+5c1 HEBREW POINT SHIN DOT)(U+5b8 HEBREW POINT QAMATS) → ‌ָׁ‌ (U+5b8 HEBREW POINT QAMATS)(U+5c1 HEBREW POINT SHIN DOT) heb (ex:8087703,uses:29)
    • ‌ֹׁ‌ (U+5c1 HEBREW POINT SHIN DOT)(U+5b9 HEBREW POINT HOLAM) → ‌ֹׁ‌ (U+5b9 HEBREW POINT HOLAM)(U+5c1 HEBREW POINT SHIN DOT) heb (ex:2786402,uses:2)
    • ‌ֻׁ‌ (U+5c1 HEBREW POINT SHIN DOT)(U+5bb HEBREW POINT QUBUTS) → ‌ֻׁ‌ (U+5bb HEBREW POINT QUBUTS)(U+5c1 HEBREW POINT SHIN DOT) heb (ex:2119699,uses:1)
    • ‌ְׂ‌ (U+5c2 HEBREW POINT SIN DOT)(U+5b0 HEBREW POINT SHEVA) → ‌ְׂ‌ (U+5b0 HEBREW POINT SHEVA)(U+5c2 HEBREW POINT SIN DOT) heb (ex:5241527,uses:9)
    • ‌ִׂ‌ (U+5c2 HEBREW POINT SIN DOT)(U+5b4 HEBREW POINT HIRIQ) → ‌ִׂ‌ (U+5b4 HEBREW POINT HIRIQ)(U+5c2 HEBREW POINT SIN DOT) heb (ex:8038317,uses:6)
    • ‌ֵׂ‌ (U+5c2 HEBREW POINT SIN DOT)(U+5b5 HEBREW POINT TSERE) → ‌ֵׂ‌ (U+5b5 HEBREW POINT TSERE)(U+5c2 HEBREW POINT SIN DOT) heb (ex:8100773,uses:5)
    • ‌ָׂ‌ (U+5c2 HEBREW POINT SIN DOT)(U+5b8 HEBREW POINT QAMATS) → ‌ָׂ‌ (U+5b8 HEBREW POINT QAMATS)(U+5c2 HEBREW POINT SIN DOT) heb (ex:3180725,uses:10)
    • ‌ֹׂ‌ (U+5c2 HEBREW POINT SIN DOT)(U+5b9 HEBREW POINT HOLAM) → ‌ֹׂ‌ (U+5b9 HEBREW POINT HOLAM)(U+5c2 HEBREW POINT SIN DOT) heb (ex:2141632,uses:1)
    • ‌َّ‌ (U+651 ARABIC SHADDA)(U+64e ARABIC FATHA) → ‌َّ‌ (U+64e ARABIC FATHA)(U+651 ARABIC SHADDA) ara (ex:8158682,uses:88), arq (ex:8129876,uses:35), pes (ex:1792619,uses:1), apc (ex:3493954,uses:1)
    • ‌ُّ‌ (U+651 ARABIC SHADDA)(U+64f ARABIC DAMMA) → ‌ُّ‌ (U+64f ARABIC DAMMA)(U+651 ARABIC SHADDA) ara (ex:7816072,uses:43), arq (ex:8129667,uses:7)
    • ‌ِّ‌ (U+651 ARABIC SHADDA)(U+650 ARABIC KASRA) → ‌ِّ‌ (U+650 ARABIC KASRA)(U+651 ARABIC SHADDA) ara (ex:7756213,uses:65), arq (ex:8129881,uses:28)
    • ‌़्‌ (U+94d DEVANAGARI SIGN VIRAMA)(U+93c DEVANAGARI SIGN NUKTA) → ‌़्‌ (U+93c DEVANAGARI SIGN NUKTA)(U+94d DEVANAGARI SIGN VIRAMA) hin (ex:3803319,uses:5)
    • ‌့်‌ (U+103a MYANMAR SIGN ASAT)(U+1037 MYANMAR SIGN DOT BELOW) → ‌့်‌ (U+1037 MYANMAR SIGN DOT BELOW)(U+103a MYANMAR SIGN ASAT) mya (ex:7239157,uses:11)
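
The multi-codepoint cases above fall into three behaviours of NFC: composing a base letter with a combining mark, splitting a precomposed character that Unicode excludes from composition, and reordering adjacent combining marks into canonical order. A minimal Python sketch with examples of each, taken from the lists above (the helper names `codepoints` and `nfc` are hypothetical):

```python
import unicodedata


def codepoints(text):
    """List the code points of `text` as lowercase hex strings."""
    return [f"U+{ord(ch):x}" for ch in text]


def nfc(text):
    """Shorthand for NFC normalization."""
    return unicodedata.normalize("NFC", text)


# 1. Composition: base + combining mark becomes one code point.
composed = nfc("a\u0300")            # a + COMBINING GRAVE ACCENT
tamil = nfc("\u0bc7\u0bbe")          # TAMIL VOWEL SIGN EE + AA

# 2. Composition exclusion: the precomposed form is split instead.
qa = nfc("\u0958")                   # DEVANAGARI LETTER QA
alef_patah = nfc("\ufb2e")           # HEBREW LETTER ALEF WITH PATAH

# 3. Canonical reordering: marks are sorted by combining class.
shadda_fatha = nfc("\u0651\u064e")   # ARABIC SHADDA + FATHA

for result in (composed, tamil, qa, alef_patah, shadda_fatha):
    print(codepoints(result))
```

Because both spellings normalize to the same sequence, applying the same normalization to stored sentences and to search queries should make them match regardless of which form was typed.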

Near Duplicates a.k.a. Unicode NFKC

  • ª → a º → o Affects: Finnish [fin], Esperanto [epo], Lingua Franca Nova [lfn], German [deu], English [eng], Japanese [jpn], French [fra], Italian [ita], Turkish [tur], Danish [dan], Ukrainian [ukr], Spanish [spa], Interlingua [ina], Portuguese [por], Russian [rus]
    • ‌ª‌ (U+aa FEMININE ORDINAL INDICATOR) → ‌a‌ (U+61 LATIN SMALL LETTER A) spa (ex:2941724,uses:1), por (ex:8194357,uses:10)
    • ‌º‌ (U+ba MASCULINE ORDINAL INDICATOR) → ‌o‌ (U+6f LATIN SMALL LETTER O) fin (ex:7992721,uses:2), epo (ex:790556,uses:1), lfn (ex:5426144,uses:1), deu (ex:3560497,uses:4), eng (ex:8196897,uses:10), jpn (ex:2748281,uses:1), fra (ex:3559736,uses:2), ita (ex:790561,uses:2), tur (ex:5781216,uses:3), dan (ex:658770,uses:1), ukr (ex:8037753,uses:2), spa (ex:8006406,uses:19), ina (ex:2775888,uses:1), por (ex:8196926,uses:49), rus (ex:5156461,uses:4)
  • ⁰ → 0 ⁸ → 8 ⁿ → n Affects: Danish [dan], Russian [rus], Portuguese [por], French [fra], German [deu], Finnish [fin], Esperanto [epo], Ukrainian [ukr], Japanese [jpn], English [eng], Choctaw [cho]
    • ‌⁰‌ (U+2070 SUPERSCRIPT ZERO) → ‌0‌ (U+30 DIGIT ZERO) dan (ex:6128328,uses:2), deu (ex:2748520,uses:5), eng (ex:6555555,uses:1), epo (ex:8234680,uses:1), fin (ex:7992705,uses:4), jpn (ex:2748281,uses:1), por (ex:2775865,uses:1), rus (ex:2774845,uses:1), ukr (ex:8037753,uses:1)
    • ‌⁸‌ (U+2078 SUPERSCRIPT EIGHT) → ‌8‌ (U+38 DIGIT EIGHT) deu (ex:2554485,uses:1)
    • ‌ⁿ‌ (U+207f SUPERSCRIPT LATIN SMALL LETTER N) → ‌n‌ (U+6e LATIN SMALL LETTER N) cho (ex:4652117,uses:1), deu (ex:2717554,uses:1), epo (ex:2721120,uses:1), fra (ex:6195321,uses:1), jpn (ex:5056562,uses:1)
  • ₁ → 1 ₂ → 2 ₃ → 3 ₄ → 4 ₈ → 8 ₙ → n Affects: Danish [dan], Thai [tha], Esperanto [epo], Macedonian [mkd], Hungarian [hun], French [fra], Turkish [tur], Italian [ita], Czech [ces], Japanese [jpn], Dutch [nld], Finnish [fin], English [eng], Marathi [mar], Spanish [spa], Russian [rus], Kabyle [kab], Interlingua [ina], Portuguese [por], Welsh [cym], German [deu], Basque [eus], Ukrainian [ukr], Vietnamese [vie]
    • ‌₁‌ (U+2081 SUBSCRIPT ONE) → ‌1‌ (U+31 DIGIT ONE) deu (ex:8210770,uses:1)
    • ‌₂‌ (U+2082 SUBSCRIPT TWO) → ‌2‌ (U+32 DIGIT TWO) ces (ex:3064235,uses:1), cym (ex:8269515,uses:1), dan (ex:2698443,uses:2), deu (ex:1644862,uses:9), eng (ex:270775,uses:10), epo (ex:1645589,uses:5), eus (ex:7886473,uses:1), fin (ex:1042895,uses:3), fra (ex:2698602,uses:4), hun (ex:2368316,uses:1), ina (ex:2698473,uses:1), ita (ex:2769910,uses:1), jpn (ex:143791,uses:2), kab (ex:7154519,uses:1), mar (ex:2513344,uses:1), mkd (ex:4074770,uses:1), nld (ex:7902219,uses:2), por (ex:818402,uses:3), rus (ex:2698442,uses:2), spa (ex:1432548,uses:2), tha (ex:8703161,uses:1), tur (ex:1811230,uses:3), ukr (ex:6744700,uses:1), vie (ex:3356187,uses:1)
    • ‌₃‌ (U+2083 SUBSCRIPT THREE) → ‌3‌ (U+33 DIGIT THREE) deu (ex:8285152,uses:1)
    • ‌₄‌ (U+2084 SUBSCRIPT FOUR) → ‌4‌ (U+34 DIGIT FOUR) deu (ex:8285152,uses:1)
    • ‌₈‌ (U+2088 SUBSCRIPT EIGHT) → ‌8‌ (U+38 DIGIT EIGHT) deu (ex:8285152,uses:1)
    • ‌ₙ‌ (U+2099 LATIN SUBSCRIPT SMALL LETTER N) → ‌n‌ (U+6e LATIN SMALL LETTER N) deu (ex:588230,uses:2), rus (ex:5237885,uses:1)
  • ① → 1 ② → 2 Affects: Japanese [jpn]
    • ‌①‌ (U+2460 CIRCLED DIGIT ONE) → ‌1‌ (U+31 DIGIT ONE) jpn (ex:75097,uses:1)
    • ‌②‌ (U+2461 CIRCLED DIGIT TWO) → ‌2‌ (U+32 DIGIT TWO) jpn (ex:75097,uses:1)
  • 𝑎 → a 𝑏 → b 𝑐 → c 𝑒 → e 𝑖 → i 𝑘 → k 𝑚 → m 𝑛 → n 𝑟 → r 𝑥 → x 𝑦 → y 𝘨 → g 𝜀 → ε 𝜋 → π Affects: Spanish [spa], Esperanto [epo], Russian [rus], German [deu]
    • ‌𝑎‌ (U+1d44e MATHEMATICAL ITALIC SMALL A) → ‌a‌ (U+61 LATIN SMALL LETTER A) deu (ex:6061506,uses:2)
    • ‌𝑏‌ (U+1d44f MATHEMATICAL ITALIC SMALL B) → ‌b‌ (U+62 LATIN SMALL LETTER B) deu (ex:6287665,uses:2)
    • ‌𝑐‌ (U+1d450 MATHEMATICAL ITALIC SMALL C) → ‌c‌ (U+63 LATIN SMALL LETTER C) rus (ex:5320437,uses:1), deu (ex:5320325,uses:1)
    • ‌𝑒‌ (U+1d452 MATHEMATICAL ITALIC SMALL E) → ‌e‌ (U+65 LATIN SMALL LETTER E) rus (ex:7469997,uses:1), deu (ex:8213980,uses:3)
    • ‌𝑖‌ (U+1d456 MATHEMATICAL ITALIC SMALL I) → ‌i‌ (U+69 LATIN SMALL LETTER I) deu (ex:6287670,uses:1)
    • ‌𝑘‌ (U+1d458 MATHEMATICAL ITALIC SMALL K) → ‌k‌ (U+6b LATIN SMALL LETTER K) epo (ex:1748814,uses:1), deu (ex:8215767,uses:3)
    • ‌𝑚‌ (U+1d45a MATHEMATICAL ITALIC SMALL M) → ‌m‌ (U+6d LATIN SMALL LETTER M) deu (ex:6287670,uses:2)
    • ‌𝑛‌ (U+1d45b MATHEMATICAL ITALIC SMALL N) → ‌n‌ (U+6e LATIN SMALL LETTER N) deu (ex:5325889,uses:2)
    • ‌𝑟‌ (U+1d45f MATHEMATICAL ITALIC SMALL R) → ‌r‌ (U+72 LATIN SMALL LETTER R) deu (ex:6287670,uses:2)
    • ‌𝑥‌ (U+1d465 MATHEMATICAL ITALIC SMALL X) → ‌x‌ (U+78 LATIN SMALL LETTER X) spa (ex:6893531,uses:1), rus (ex:6541831,uses:1), deu (ex:8215767,uses:7)
    • ‌𝑦‌ (U+1d466 MATHEMATICAL ITALIC SMALL Y) → ‌y‌ (U+79 LATIN SMALL LETTER Y) rus (ex:6541831,uses:1), deu (ex:4916798,uses:2)
    • ‌𝘨‌ (U+1d628 MATHEMATICAL SANS-SERIF ITALIC SMALL G) → ‌g‌ (U+67 LATIN SMALL LETTER G) epo (ex:6967447,uses:1), deu (ex:6954060,uses:1)
    • ‌𝜀‌ (U+1d700 MATHEMATICAL ITALIC SMALL EPSILON) → ‌ε‌ (U+3b5 GREEK SMALL LETTER EPSILON) deu (ex:5309361,uses:1)
    • ‌𝜋‌ (U+1d70b MATHEMATICAL ITALIC SMALL PI) → ‌π‌ (U+3c0 GREEK SMALL LETTER PI) deu (ex:8213980,uses:1)
  • ℎ → h Affects: German [deu]
    • ‌ℎ‌ (U+210e PLANCK CONSTANT) → ‌h‌ (U+68 LATIN SMALL LETTER H) deu (ex:6287670,uses:1)
  • ℵ → א Affects: German [deu]
    • ‌ℵ‌ (U+2135 ALEF SYMBOL) → ‌א‌ (U+5d0 HEBREW LETTER ALEF) deu (ex:8210770,uses:1)
  • ʰ → h ʷ → w ⵯ → ⵡ Affects: Kabyle [kab], Waray [war], Berber [ber], English [eng], Khmer [khm], Ngeq [ngt]
    • ‌ʰ‌ (U+2b0 MODIFIER LETTER SMALL H) → ‌h‌ (U+68 LATIN SMALL LETTER H) eng (ex:867278,uses:1), khm (ex:3773482,uses:3), ngt (ex:3942520,uses:2)
    • ‌ʷ‌ (U+2b7 MODIFIER LETTER SMALL W) → ‌w‌ (U+77 LATIN SMALL LETTER W) ber (ex:2520170,uses:732), kab (ex:7749982,uses:12), war (ex:7838212,uses:1)
    • ‌ⵯ‌ (U+2d6f TIFINAGH MODIFIER LETTER LABIALIZATION MARK) → ‌ⵡ‌ (U+2d61 TIFINAGH LETTER YAW) ber (ex:7206307,uses:4)
  • ſ → s Affects: Middle French [frm]
    • ‌ſ‌ (U+17f LATIN SMALL LETTER LONG S) → ‌s‌ (U+73 LATIN SMALL LETTER S) frm (ex:3229995,uses:3)
  • ﮐ → ک ﺋ → ئ ﺎ → ا ﺣ → ح ﺹ → ص ﻊ → ع ﻋ → ع ﻞ → ل ﻠ → ل ﻣ → م ﻪ → ه Affects: Ottoman Turkish [ota]
    • ‌ﮐ‌ (U+fb90 ARABIC LETTER KEHEH INITIAL FORM) → ‌ک‌ (U+6a9 ARABIC LETTER KEHEH) ota (ex:7882707,uses:1)
    • ‌ﺋ‌ (U+fe8b ARABIC LETTER YEH WITH HAMZA ABOVE INITIAL FORM) → ‌ئ‌ (U+626 ARABIC LETTER YEH WITH HAMZA ABOVE) ota (ex:8133264,uses:2)
    • ‌ﺎ‌ (U+fe8e ARABIC LETTER ALEF FINAL FORM) → ‌ا‌ (U+627 ARABIC LETTER ALEF) ota (ex:8133264,uses:3)
    • ‌ﺣ‌ (U+fea3 ARABIC LETTER HAH INITIAL FORM) → ‌ح‌ (U+62d ARABIC LETTER HAH) ota (ex:7882707,uses:1)
    • ‌ﺹ‌ (U+feb9 ARABIC LETTER SAD ISOLATED FORM) → ‌ص‌ (U+635 ARABIC LETTER SAD) ota (ex:7882707,uses:1)
    • ‌ﻊ‌ (U+feca ARABIC LETTER AIN FINAL FORM) → ‌ع‌ (U+639 ARABIC LETTER AIN) ota (ex:8133264,uses:2)
    • ‌ﻋ‌ (U+fecb ARABIC LETTER AIN INITIAL FORM) → ‌ع‌ (U+639 ARABIC LETTER AIN) ota (ex:8133264,uses:2)
    • ‌ﻞ‌ (U+fede ARABIC LETTER LAM FINAL FORM) → ‌ل‌ (U+644 ARABIC LETTER LAM) ota (ex:7882707,uses:1)
    • ‌ﻠ‌ (U+fee0 ARABIC LETTER LAM MEDIAL FORM) → ‌ل‌ (U+644 ARABIC LETTER LAM) ota (ex:8133264,uses:2)
    • ‌ﻣ‌ (U+fee3 ARABIC LETTER MEEM INITIAL FORM) → ‌م‌ (U+645 ARABIC LETTER MEEM) ota (ex:8133264,uses:2)
    • ‌ﻪ‌ (U+feea ARABIC LETTER HEH FINAL FORM) → ‌ه‌ (U+647 ARABIC LETTER HEH) ota (ex:8133264,uses:2)
  • ⺟ → 母 ⼀ → 一 ⾯ → 面 ⾷ → 食 Affects: Min Nan Chinese [nan]
    • ‌⺟‌ (U+2e9f CJK RADICAL MOTHER) → ‌母‌ (U+6bcd CJK UNIFIED IDEOGRAPH-6BCD) nan (ex:6142180,uses:1)
    • ‌⼀‌ (U+2f00 KANGXI RADICAL ONE) → ‌一‌ (U+4e00 CJK UNIFIED IDEOGRAPH-4E00) nan (ex:6142178,uses:1)
    • ‌⾯‌ (U+2faf KANGXI RADICAL FACE) → ‌面‌ (U+9762 CJK UNIFIED IDEOGRAPH-9762) nan (ex:6142178,uses:1)
    • ‌⾷‌ (U+2fb7 KANGXI RADICAL EAT) → ‌食‌ (U+98df CJK UNIFIED IDEOGRAPH-98DF) nan (ex:6142178,uses:1)
  • µ → μ Affects: Greek [ell]
    • ‌µ‌ (U+b5 MICRO SIGN) → ‌μ‌ (U+3bc GREEK SMALL LETTER MU) ell (ex:1384306,uses:1)
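
As a quick sanity check, the single-codepoint mappings above can be compared against Unicode NFKC. This is only a sketch using Python's standard `unicodedata` module, assuming this group is meant to coincide with NFKC. The `PAIRS` sample and the `nfkc` helper are illustrative names, not part of the Tatoeba code base.

```python
import unicodedata

# A sample of (character found in the corpus, expected replacement),
# copied from the list above.
PAIRS = [
    ("\U0001d628", "g"),    # MATHEMATICAL SANS-SERIF ITALIC SMALL G
    ("\u210e", "h"),        # PLANCK CONSTANT
    ("\u2135", "\u05d0"),   # ALEF SYMBOL -> HEBREW LETTER ALEF
    ("\u02b0", "h"),        # MODIFIER LETTER SMALL H
    ("\u02b7", "w"),        # MODIFIER LETTER SMALL W
    ("\u2d6f", "\u2d61"),   # TIFINAGH MODIFIER LETTER LABIALIZATION MARK
    ("\u017f", "s"),        # LATIN SMALL LETTER LONG S
    ("\ufe8e", "\u0627"),   # ARABIC LETTER ALEF FINAL FORM
    ("\u2f00", "\u4e00"),   # KANGXI RADICAL ONE
    ("\u00b5", "\u03bc"),   # MICRO SIGN -> GREEK SMALL LETTER MU
]


def nfkc(text):
    """Return the NFKC normalization of text."""
    return unicodedata.normalize("NFKC", text)


# Collect every pair where NFKC disagrees with the listed replacement.
mismatches = [(src, dst) for src, dst in PAIRS if nfkc(src) != dst]
print(mismatches)
```

An empty list means NFKC alone already produces the listed replacement for each sampled character.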

Near Duplicates (multiple codepoints)

  • ij → ij և → եւ fi → fi ﻹ → لإ ﻻ → لا ﻼ → لا Affects: Arabic [ara], Armenian [hye], Ottoman Turkish [ota], Dutch [nld], Irish [gle]
    • ‌ij‌ (U+133 LATIN SMALL LIGATURE IJ) → ‌ij‌ (U+69 LATIN SMALL LETTER I)(U+6a LATIN SMALL LETTER J) nld (ex:7786633,uses:1)
    • ‌և‌ (U+587 ARMENIAN SMALL LIGATURE ECH YIWN) → ‌եւ‌ (U+565 ARMENIAN SMALL LETTER ECH)(U+582 ARMENIAN SMALL LETTER YIWN) hye (ex:5155812,uses:98)
    • ‌fi‌ (U+fb01 LATIN SMALL LIGATURE FI) → ‌fi‌ (U+66 LATIN SMALL LETTER F)(U+69 LATIN SMALL LETTER I) gle (ex:873069,uses:1)
    • ‌ﻹ‌ (U+fef9 ARABIC LIGATURE LAM WITH ALEF WITH HAMZA BELOW ISOLATED FORM) → ‌لإ‌ (U+644 ARABIC LETTER LAM)(U+625 ARABIC LETTER ALEF WITH HAMZA BELOW) ara (ex:6570578,uses:1)
    • ‌ﻻ‌ (U+fefb ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM) → ‌لا‌ (U+644 ARABIC LETTER LAM)(U+627 ARABIC LETTER ALEF) ara (ex:1609717,uses:1)
    • ‌ﻼ‌ (U+fefc ARABIC LIGATURE LAM WITH ALEF FINAL FORM) → ‌لا‌ (U+644 ARABIC LETTER LAM)(U+627 ARABIC LETTER ALEF) ota (ex:7882707,uses:1)
  • ㌔ → キロ ㌘ → グラム Affects: Japanese [jpn]
    • ‌㌔‌ (U+3314 SQUARE KIRO) → ‌キロ‌ (U+30ad KATAKANA LETTER KI)(U+30ed KATAKANA LETTER RO) jpn (ex:1490062,uses:1)
    • ‌㌘‌ (U+3318 SQUARE GURAMU) → ‌グラム‌ (U+30b0 KATAKANA LETTER GU)(U+30e9 KATAKANA LETTER RA)(U+30e0 KATAKANA LETTER MU) jpn (ex:75133,uses:1)
  • ำ → ํา Affects: Thai [tha]
    • ‌ำ‌ (U+e33 THAI CHARACTER SARA AM) → ‌ํา‌ (U+e4d THAI CHARACTER NIKHAHIT)(U+e32 THAI CHARACTER SARA AA) tha (ex:7920753,uses:185)
  • ໜ → ຫນ ໝ → ຫມ Affects: Lao [lao]
    • ‌ໜ‌ (U+edc LAO HO NO) → ‌ຫນ‌ (U+eab LAO LETTER HO SUNG)(U+e99 LAO LETTER NO) lao (ex:3791461,uses:3)
    • ‌ໝ‌ (U+edd LAO HO MO) → ‌ຫມ‌ (U+eab LAO LETTER HO SUNG)(U+ea1 LAO LETTER MO) lao (ex:3791443,uses:1)
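
The multi-codepoint cases above matter for search because the normalized text is longer than the original, so query and sentence both have to go through the same normalization before they are compared. A minimal sketch, assuming NFKC is applied on both sides; `search_key`, `contains` and the sample strings are made up for illustration and do not reflect how the search engine is actually configured.

```python
import unicodedata


def search_key(text):
    """Normalize to NFKC so ligatures and their spelled-out forms compare equal."""
    return unicodedata.normalize("NFKC", text)


def contains(sentence, query):
    """Substring match on the normalized forms of both arguments."""
    return search_key(query) in search_key(sentence)


# Expansions copied from the list above.
EXPANSIONS = {
    "\u0133": "ij",                # LATIN SMALL LIGATURE IJ
    "\u0587": "\u0565\u0582",      # ARMENIAN SMALL LIGATURE ECH YIWN
    "\ufb01": "fi",                # LATIN SMALL LIGATURE FI
    "\ufefb": "\u0644\u0627",      # ARABIC LIGATURE LAM WITH ALEF ISOLATED FORM
    "\u3314": "\u30ad\u30ed",      # SQUARE KIRO
    "\u0e33": "\u0e4d\u0e32",      # THAI CHARACTER SARA AM
    "\u0edc": "\u0eab\u0e99",      # LAO HO NO
}
for src, dst in EXPANSIONS.items():
    assert search_key(src) == dst

# A query typed with plain letters finds a sentence containing the ligature.
print(contains("\ufb01n", "fin"))                 # True
# One codepoint (SQUARE GURAMU) becomes three after normalization.
print(len("\u3318"), len(search_key("\u3318")))   # 1 3
```

Because the length changes, any character offsets computed on the normalized text (for highlighting, say) no longer line up with the stored sentence.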

Case Alternatives a.k.a. fixed points under iterative application of Unicode NFKC, uppercasing and lowercasing using ICU

  • H → h I → ı J → j U → u W → w Á → á Â → â Ä → ä Å → å É → é Ú → ú Ā → ā Č → č Ē → ē Ġ → ġ Ĥ → ĥ Ī → ī İ → i ı → i Ĵ → ĵ Ļ → ļ Ľ → ľ Ł → ł Ņ → ņ Ŝ → ŝ Ū → ū ℂ → c ℃ → c ℕ → n ℝ → r Ꞌ → ꞌ 𝐴 → a 𝐵 → b 𝐾 → k 𝑁 → n 𝑋 → x Affects: Polish [pol], Finnish [fin], Ottoman Turkish [ota], English [eng], Japanese [jpn], Kashmiri [kas], Ido [ido], Dutch [nld], Danish [dan], Spanish [spa], Lojban [jbo], Portuguese [por], Russian [rus], Turkmen [tuk], Bashkir [bak], Esperanto [epo], Old East Slavic [orv], Latvian [lvs], Croatian [hrv], Talysh [tly], Latin [lat], Tatar [tat], Hungarian [hun], Unknown Language, Italian [ita], Lower Sorbian [dsb], Greek [ell], Chamorro [cha], Zaza [zza], German [deu], French [fra], Kashubian [csb], Czech [ces], Berber [ber], Slovak [slk], Navajo [nav], Upper Sorbian [hsb], Azerbaijani [aze], Turkish [tur], Crimean Tatar [crh], Chuvash [chv]
    • ‌H‌ (U+48 LATIN CAPITAL LETTER H) → ‌h‌ (U+68 LATIN SMALL LETTER H) jbo (ex:5181286,uses:36)
    • ‌I‌ (U+49 LATIN CAPITAL LETTER I) → ‌ı‌ (U+131 LATIN SMALL LETTER DOTLESS I) aze (ex:8174218,uses:3855)
    • ‌J‌ (U+4a LATIN CAPITAL LETTER J) → ‌j‌ (U+6a LATIN SMALL LETTER J) lat (ex:8214141,uses:501)
    • ‌U‌ (U+55 LATIN CAPITAL LETTER U) → ‌u‌ (U+75 LATIN SMALL LETTER U) lat (ex:8227281,uses:27379)
    • ‌W‌ (U+57 LATIN CAPITAL LETTER W) → ‌w‌ (U+77 LATIN SMALL LETTER W) lat (ex:8107101,uses:16)
    • ‌Á‌ (U+c1 LATIN CAPITAL LETTER A WITH ACUTE) → ‌á‌ (U+e1 LATIN SMALL LETTER A WITH ACUTE) tur (ex:6029054,uses:2)
    • ‌Â‌ (U+c2 LATIN CAPITAL LETTER A WITH CIRCUMFLEX) → ‌â‌ (U+e2 LATIN SMALL LETTER A WITH CIRCUMFLEX) tur (ex:7767029,uses:18)
    • ‌Ä‌ (U+c4 LATIN CAPITAL LETTER A WITH DIAERESIS) → ‌ä‌ (U+e4 LATIN SMALL LETTER A WITH DIAERESIS) tur (ex:5716114,uses:1)
    • ‌Å‌ (U+c5 LATIN CAPITAL LETTER A WITH RING ABOVE) → ‌å‌ (U+e5 LATIN SMALL LETTER A WITH RING ABOVE) tur (ex:5074739,uses:2)
    • ‌É‌ (U+c9 LATIN CAPITAL LETTER E WITH ACUTE) → ‌é‌ (U+e9 LATIN SMALL LETTER E WITH ACUTE) tur (ex:5153425,uses:1)
    • ‌Ú‌ (U+da LATIN CAPITAL LETTER U WITH ACUTE) → ‌ú‌ (U+fa LATIN SMALL LETTER U WITH ACUTE) tur (ex:5429697,uses:1)
    • ‌Ā‌ (U+100 LATIN CAPITAL LETTER A WITH MACRON) → ‌ā‌ (U+101 LATIN SMALL LETTER A WITH MACRON) lat (ex:4424729,uses:1)
    • ‌Č‌ (U+10c LATIN CAPITAL LETTER C WITH CARON) → ‌č‌ (U+10d LATIN SMALL LETTER C WITH CARON) tur (ex:4770152,uses:1)
    • ‌Ē‌ (U+112 LATIN CAPITAL LETTER E WITH MACRON) → ‌ē‌ (U+113 LATIN SMALL LETTER E WITH MACRON) lat (ex:1194237,uses:1)
    • ‌Ġ‌ (U+120 LATIN CAPITAL LETTER G WITH DOT ABOVE) → ‌ġ‌ (U+121 LATIN SMALL LETTER G WITH DOT ABOVE) tur (ex:7526803,uses:2)
    • ‌Ĥ‌ (U+124 LATIN CAPITAL LETTER H WITH CIRCUMFLEX) → ‌ĥ‌ (U+125 LATIN SMALL LETTER H WITH CIRCUMFLEX) lat (ex:7108484,uses:1)
    • ‌Ī‌ (U+12a LATIN CAPITAL LETTER I WITH MACRON) → ‌ī‌ (U+12b LATIN SMALL LETTER I WITH MACRON) lat (ex:5876456,uses:3)
    • ‌İ‌ (U+130 LATIN CAPITAL LETTER I WITH DOT ABOVE) → ‌i‌ (U+69 LATIN SMALL LETTER I) aze (ex:7705499,uses:264)
    • ‌ı‌ (U+131 LATIN SMALL LETTER DOTLESS I) → ‌i‌ (U+69 LATIN SMALL LETTER I) pol (ex:5079161,uses:1), ota (ex:8230612,uses:317), eng (ex:7917234,uses:8), kas (ex:6788229,uses:1), nld (ex:7462518,uses:1), spa (ex:8152943,uses:3), tuk (ex:7914908,uses:1), bak (ex:4177744,uses:1), por (ex:5883585,uses:1), epo (ex:7867822,uses:2), orv (ex:4808980,uses:308), hrv (ex:5860202,uses:1), tly (ex:5998610,uses:25), tat (ex:8115462,uses:529), hun (ex:8205444,uses:2), ita (ex:3883868,uses:2), ell (ex:2292407,uses:1), cha (ex:5312349,uses:1), zza (ex:6096271,uses:474), deu (ex:4559774,uses:4), fra (ex:7905958,uses:1), ber (ex:5994152,uses:5), slk (ex:5555679,uses:1), crh (ex:6742585,uses:188), chv (ex:3765546,uses:1)
    • ‌Ĵ‌ (U+134 LATIN CAPITAL LETTER J WITH CIRCUMFLEX) → ‌ĵ‌ (U+135 LATIN SMALL LETTER J WITH CIRCUMFLEX) lat (ex:7108485,uses:1)
    • ‌Ļ‌ (U+13b LATIN CAPITAL LETTER L WITH CEDILLA) → ‌ļ‌ (U+13c LATIN SMALL LETTER L WITH CEDILLA) lvs (ex:3366306,uses:3)
    • ‌Ľ‌ (U+13d LATIN CAPITAL LETTER L WITH CARON) → ‌ľ‌ (U+13e LATIN SMALL LETTER L WITH CARON) slk (ex:7584386,uses:11)
    • ‌Ł‌ (U+141 LATIN CAPITAL LETTER L WITH STROKE) → ‌ł‌ (U+142 LATIN SMALL LETTER L WITH STROKE) pol (ex:7970846,uses:123), nav (ex:7561712,uses:1), dsb (ex:3798231,uses:4), epo (ex:1317640,uses:2), hsb (ex:3798230,uses:4), deu (ex:2219281,uses:3), eng (ex:919743,uses:2), csb (ex:6918751,uses:2), ces (ex:1683422,uses:1), ido (ex:921523,uses:3), tur (ex:921680,uses:5), hun (ex:3256785,uses:4), slk (ex:1045020,uses:4)
    • ‌Ņ‌ (U+145 LATIN CAPITAL LETTER N WITH CEDILLA) → ‌ņ‌ (U+146 LATIN SMALL LETTER N WITH CEDILLA) lvs (ex:5935039,uses:2)
    • ‌Ŝ‌ (U+15c LATIN CAPITAL LETTER S WITH CIRCUMFLEX) → ‌ŝ‌ (U+15d LATIN SMALL LETTER S WITH CIRCUMFLEX) lat (ex:7108485,uses:2)
    • ‌Ū‌ (U+16a LATIN CAPITAL LETTER U WITH MACRON) → ‌ū‌ (U+16b LATIN SMALL LETTER U WITH MACRON) lat (ex:3616273,uses:2)
    • ‌ℂ‌ (U+2102 DOUBLE-STRUCK CAPITAL C) → ‌c‌ (U+63 LATIN SMALL LETTER C) fra (ex:638468,uses:1)
    • ‌℃‌ (U+2103 DEGREE CELSIUS) → ‌c‌ (U+63 LATIN SMALL LETTER C) spa (ex:5031254,uses:1), jpn (ex:4043564,uses:7), rus (ex:6398412,uses:1)
    • ‌ℕ‌ (U+2115 DOUBLE-STRUCK CAPITAL N) → ‌n‌ (U+6e LATIN SMALL LETTER N) deu (ex:5309361,uses:1)
    • ‌ℝ‌ (U+211d DOUBLE-STRUCK CAPITAL R) → ‌r‌ (U+72 LATIN SMALL LETTER R) fin (ex:7263060,uses:2), epo (ex:7370483,uses:2), deu (ex:7362006,uses:4), fra (ex:5311532,uses:2), jpn (ex:5067208,uses:3), dan (ex:7370673,uses:2)
    • ‌Ꞌ‌ (U+a78b LATIN CAPITAL LETTER SALTILLO) → ‌ꞌ‌ (U+a78c LATIN SMALL LETTER SALTILLO) Unknown Language (ex:6473120,uses:1)
    • ‌𝐴‌ (U+1d434 MATHEMATICAL ITALIC CAPITAL A) → ‌a‌ (U+61 LATIN SMALL LETTER A) rus (ex:5320437,uses:1), deu (ex:5320325,uses:1)
    • ‌𝐵‌ (U+1d435 MATHEMATICAL ITALIC CAPITAL B) → ‌b‌ (U+62 LATIN SMALL LETTER B) rus (ex:5320437,uses:1), deu (ex:5320325,uses:1)
    • ‌𝐾‌ (U+1d43e MATHEMATICAL ITALIC CAPITAL K) → ‌k‌ (U+6b LATIN SMALL LETTER K) deu (ex:6287665,uses:1)
    • ‌𝑁‌ (U+1d441 MATHEMATICAL ITALIC CAPITAL N) → ‌n‌ (U+6e LATIN SMALL LETTER N) deu (ex:5309361,uses:1)
    • ‌𝑋‌ (U+1d44b MATHEMATICAL ITALIC CAPITAL X) → ‌x‌ (U+78 LATIN SMALL LETTER X) deu (ex:5565039,uses:2)
  • Ԑ → ԑ Affects: Kabyle [kab]
    • ‌Ԑ‌ (U+510 CYRILLIC CAPITAL LETTER REVERSED ZE) → ‌ԑ‌ (U+511 CYRILLIC SMALL LETTER REVERSED ZE) kab (ex:8224182,uses:169)
  • ¨ → ̈ ´ → ́ ˙ → ̇ ˚ → ̊ Affects: Finnish [fin], Guarani [grn], Low German (Low Saxon) [nds], English [eng], Dutch [nld], Spanish [spa], Portuguese [por], Esperanto [epo], Old Tupi [tpw], Ukrainian [ukr], Italian [ita], Catalan [cat], Greek [ell], Mandarin Chinese [cmn], German [deu], French [fra], Czech [ces], Berber [ber], Slovak [slk], Ancient Greek [grc], Turkish [tur], Occitan [oci]
    • ‌¨‌ (U+a8 DIAERESIS) → ‌̈‌ (U+308 COMBINING DIAERESIS) epo (ex:5560373,uses:2), cmn (ex:805701,uses:1), deu (ex:2672188,uses:1), ces (ex:8160109,uses:9), spa (ex:8084904,uses:3), ber (ex:7557144,uses:1), por (ex:2672187,uses:1)
    • ‌´‌ (U+b4 ACUTE ACCENT) → ‌́‌ (U+301 COMBINING ACUTE ACCENT) fin (ex:2815914,uses:2), epo (ex:3860188,uses:1), cat (ex:4647629,uses:1), grn (ex:2241790,uses:9), nds (ex:808169,uses:3), ell (ex:2558180,uses:2), deu (ex:6554418,uses:16), tpw (ex:6690405,uses:1), eng (ex:5363976,uses:9), fra (ex:3982330,uses:3), tur (ex:4492717,uses:4), nld (ex:1056975,uses:1), spa (ex:7870105,uses:2), oci (ex:5801694,uses:10), slk (ex:5142227,uses:5), por (ex:2672187,uses:1), ita (ex:4371606,uses:5)
    • ‌˙‌ (U+2d9 DOT ABOVE) → ‌̇‌ (U+307 COMBINING DOT ABOVE) ces (ex:4202006,uses:1), grc (ex:4107692,uses:1)
    • ‌˚‌ (U+2da RING ABOVE) → ‌̊‌ (U+30a COMBINING RING ABOVE) ukr (ex:5655947,uses:1)
  • 𑢩 → 𑣉 𑢮 → 𑣎 𑢯 → 𑣏 Affects: Ho [hoc]
    • ‌𑢩‌ (U+118a9 WARANG CITI CAPITAL LETTER O) → ‌𑣉‌ (U+118c9 WARANG CITI SMALL LETTER O) hoc (ex:4712781,uses:3)
    • ‌𑢮‌ (U+118ae WARANG CITI CAPITAL LETTER YUJ) → ‌𑣎‌ (U+118ce WARANG CITI SMALL LETTER YUJ) hoc (ex:4690415,uses:1)
    • ‌𑢯‌ (U+118af WARANG CITI CAPITAL LETTER UC) → ‌𑣏‌ (U+118cf WARANG CITI SMALL LETTER UC) hoc (ex:4712757,uses:2)
  • ͅ → ι ΄ → ́ Ά → α Έ → ε Ή → ή Ί → ι Ό → ο Ύ → υ Ώ → ω ΐ → ϊ ά → α έ → ε ί → ι ς → σ ό → ο ύ → υ ώ → ω ἀ → α ἁ → α ἄ → α Ἀ → ἀ Ἄ → α Ἄ → ἄ Ἆ → ἆ ἐ → ε ἔ → ε ἕ → ε Ἐ → ἐ Ἑ → ἑ Ἓ → ε Ἓ → ἓ Ἔ → ε Ἔ → ἔ ἠ → η ἡ → η ἦ → ή Ἡ → ἡ Ἢ → ἢ Ἥ → ἥ Ἦ → ἦ ἰ → ι ἱ → ι ἶ → ι Ἰ → ἰ Ἱ → ἱ ὁ → ο ὅ → ο Ὀ → ὀ Ὁ → ο Ὁ → ὁ Ὃ → ὃ Ὄ → ὄ Ὅ → ὅ ὐ → υ ὔ → υ Ὑ → ὑ Ὕ → ὕ ὠ → ω ὡ → ω Ὡ → ὡ Ὤ → ὤ Ὦ → ὦ ὰ → α ὲ → ε έ → ε ὴ → ή ὶ → ι ί → ι ὸ → ο ὺ → υ ὼ → ω ώ → ω ᾶ → α ᾽ → ̓ ᾿ → ̓ ῆ → ή ῖ → ι ῦ → υ ῶ → ω ῾ → ̔ Affects: Ancient Greek [grc], Greek [ell], Portuguese [por]
    • ‌ͅ‌ (U+345 COMBINING GREEK YPOGEGRAMMENI) → ‌ι‌ (U+3b9 GREEK SMALL LETTER IOTA) grc (ex:5095334,uses:1)
    • ‌΄‌ (U+384 GREEK TONOS) → ‌́‌ (U+301 COMBINING ACUTE ACCENT) ell (ex:5613134,uses:91), grc (ex:3105129,uses:1)
    • ‌Ά‌ (U+386 GREEK CAPITAL LETTER ALPHA WITH TONOS) → ‌α‌ (U+3b1 GREEK SMALL LETTER ALPHA) ell (ex:8080260,uses:207)
    • ‌Έ‌ (U+388 GREEK CAPITAL LETTER EPSILON WITH TONOS) → ‌ε‌ (U+3b5 GREEK SMALL LETTER EPSILON) ell (ex:8231056,uses:1339)
    • ‌Ή‌ (U+389 GREEK CAPITAL LETTER ETA WITH TONOS) → ‌ή‌ (U+3ae GREEK SMALL LETTER ETA WITH TONOS) ell (ex:8190378,uses:480)
    • ‌Ί‌ (U+38a GREEK CAPITAL LETTER IOTA WITH TONOS) → ‌ι‌ (U+3b9 GREEK SMALL LETTER IOTA) ell (ex:7796656,uses:26)
    • ‌Ό‌ (U+38c GREEK CAPITAL LETTER OMICRON WITH TONOS) → ‌ο‌ (U+3bf GREEK SMALL LETTER OMICRON) ell (ex:8188267,uses:383)
    • ‌Ύ‌ (U+38e GREEK CAPITAL LETTER UPSILON WITH TONOS) → ‌υ‌ (U+3c5 GREEK SMALL LETTER UPSILON) ell (ex:7773402,uses:1)
    • ‌Ώ‌ (U+38f GREEK CAPITAL LETTER OMEGA WITH TONOS) → ‌ω‌ (U+3c9 GREEK SMALL LETTER OMEGA) ell (ex:7323854,uses:6)
    • ‌ΐ‌ (U+390 GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS) → ‌ϊ‌ (U+3ca GREEK SMALL LETTER IOTA WITH DIALYTIKA) ell (ex:7797195,uses:18)
    • ‌ά‌ (U+3ac GREEK SMALL LETTER ALPHA WITH TONOS) → ‌α‌ (U+3b1 GREEK SMALL LETTER ALPHA) ell (ex:8232454,uses:13416)
    • ‌έ‌ (U+3ad GREEK SMALL LETTER EPSILON WITH TONOS) → ‌ε‌ (U+3b5 GREEK SMALL LETTER EPSILON) ell (ex:8231042,uses:13333)
    • ‌ί‌ (U+3af GREEK SMALL LETTER IOTA WITH TONOS) → ‌ι‌ (U+3b9 GREEK SMALL LETTER IOTA) ell (ex:8232454,uses:16784)
    • ‌ς‌ (U+3c2 GREEK SMALL LETTER FINAL SIGMA) → ‌σ‌ (U+3c3 GREEK SMALL LETTER SIGMA) ell (ex:8232454,uses:13412), grc (ex:8207837,uses:469)
    • ‌ό‌ (U+3cc GREEK SMALL LETTER OMICRON WITH TONOS) → ‌ο‌ (U+3bf GREEK SMALL LETTER OMICRON) ell (ex:8232454,uses:11259)
    • ‌ύ‌ (U+3cd GREEK SMALL LETTER UPSILON WITH TONOS) → ‌υ‌ (U+3c5 GREEK SMALL LETTER UPSILON) ell (ex:8232454,uses:6393)
    • ‌ώ‌ (U+3ce GREEK SMALL LETTER OMEGA WITH TONOS) → ‌ω‌ (U+3c9 GREEK SMALL LETTER OMEGA) ell (ex:8231045,uses:4906)
    • ‌ἀ‌ (U+1f00 GREEK SMALL LETTER ALPHA WITH PSILI) → ‌α‌ (U+3b1 GREEK SMALL LETTER ALPHA) ell (ex:3577893,uses:7)
    • ‌ἁ‌ (U+1f01 GREEK SMALL LETTER ALPHA WITH DASIA) → ‌α‌ (U+3b1 GREEK SMALL LETTER ALPHA) ell (ex:3577869,uses:1)
    • ‌ἄ‌ (U+1f04 GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA) → ‌α‌ (U+3b1 GREEK SMALL LETTER ALPHA) ell (ex:3577873,uses:4)
    • ‌Ἀ‌ (U+1f08 GREEK CAPITAL LETTER ALPHA WITH PSILI) → ‌ἀ‌ (U+1f00 GREEK SMALL LETTER ALPHA WITH PSILI) grc (ex:7088471,uses:32), por (ex:2457794,uses:1)
    • ‌Ἄ‌ (U+1f0c GREEK CAPITAL LETTER ALPHA WITH PSILI AND OXIA) → ‌α‌ (U+3b1 GREEK SMALL LETTER ALPHA) ell (ex:7796568,uses:1)
    • ‌Ἄ‌ (U+1f0c GREEK CAPITAL LETTER ALPHA WITH PSILI AND OXIA) → ‌ἄ‌ (U+1f04 GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA) grc (ex:7796578,uses:8)
    • ‌Ἆ‌ (U+1f0e GREEK CAPITAL LETTER ALPHA WITH PSILI AND PERISPOMENI) → ‌ἆ‌ (U+1f06 GREEK SMALL LETTER ALPHA WITH PSILI AND PERISPOMENI) grc (ex:7412000,uses:12)
    • ‌ἐ‌ (U+1f10 GREEK SMALL LETTER EPSILON WITH PSILI) → ‌ε‌ (U+3b5 GREEK SMALL LETTER EPSILON) ell (ex:8169676,uses:4)
    • ‌ἔ‌ (U+1f14 GREEK SMALL LETTER EPSILON WITH PSILI AND OXIA) → ‌ε‌ (U+3b5 GREEK SMALL LETTER EPSILON) ell (ex:3577873,uses:3)
    • ‌ἕ‌ (U+1f15 GREEK SMALL LETTER EPSILON WITH DASIA AND OXIA) → ‌ε‌ (U+3b5 GREEK SMALL LETTER EPSILON) ell (ex:3577856,uses:1)
    • ‌Ἐ‌ (U+1f18 GREEK CAPITAL LETTER EPSILON WITH PSILI) → ‌ἐ‌ (U+1f10 GREEK SMALL LETTER EPSILON WITH PSILI) grc (ex:7234271,uses:40)
    • ‌Ἑ‌ (U+1f19 GREEK CAPITAL LETTER EPSILON WITH DASIA) → ‌ἑ‌ (U+1f11 GREEK SMALL LETTER EPSILON WITH DASIA) grc (ex:3103818,uses:1)
    • ‌Ἓ‌ (U+1f1b GREEK CAPITAL LETTER EPSILON WITH DASIA AND VARIA) → ‌ε‌ (U+3b5 GREEK SMALL LETTER EPSILON) ell (ex:2395416,uses:1)
    • ‌Ἓ‌ (U+1f1b GREEK CAPITAL LETTER EPSILON WITH DASIA AND VARIA) → ‌ἓ‌ (U+1f13 GREEK SMALL LETTER EPSILON WITH DASIA AND VARIA) grc (ex:2652865,uses:2)
    • ‌Ἔ‌ (U+1f1c GREEK CAPITAL LETTER EPSILON WITH PSILI AND OXIA) → ‌ε‌ (U+3b5 GREEK SMALL LETTER EPSILON) ell (ex:4980695,uses:1)
    • ‌Ἔ‌ (U+1f1c GREEK CAPITAL LETTER EPSILON WITH PSILI AND OXIA) → ‌ἔ‌ (U+1f14 GREEK SMALL LETTER EPSILON WITH PSILI AND OXIA) grc (ex:7075924,uses:8)
    • ‌ἠ‌ (U+1f20 GREEK SMALL LETTER ETA WITH PSILI) → ‌η‌ (U+3b7 GREEK SMALL LETTER ETA) ell (ex:3577869,uses:1)
    • ‌ἡ‌ (U+1f21 GREEK SMALL LETTER ETA WITH DASIA) → ‌η‌ (U+3b7 GREEK SMALL LETTER ETA) ell (ex:8169676,uses:2)
    • ‌ἦ‌ (U+1f26 GREEK SMALL LETTER ETA WITH PSILI AND PERISPOMENI) → ‌ή‌ (U+3ae GREEK SMALL LETTER ETA WITH TONOS) ell (ex:7796568,uses:1)
    • ‌Ἡ‌ (U+1f29 GREEK CAPITAL LETTER ETA WITH DASIA) → ‌ἡ‌ (U+1f21 GREEK SMALL LETTER ETA WITH DASIA) grc (ex:7101615,uses:18)
    • ‌Ἢ‌ (U+1f2a GREEK CAPITAL LETTER ETA WITH PSILI AND VARIA) → ‌ἢ‌ (U+1f22 GREEK SMALL LETTER ETA WITH PSILI AND VARIA) grc (ex:1811821,uses:1)
    • ‌Ἥ‌ (U+1f2d GREEK CAPITAL LETTER ETA WITH DASIA AND OXIA) → ‌ἥ‌ (U+1f25 GREEK SMALL LETTER ETA WITH DASIA AND OXIA) grc (ex:5095347,uses:2)
    • ‌Ἦ‌ (U+1f2e GREEK CAPITAL LETTER ETA WITH PSILI AND PERISPOMENI) → ‌ἦ‌ (U+1f26 GREEK SMALL LETTER ETA WITH PSILI AND PERISPOMENI) grc (ex:3227274,uses:2)
    • ‌ἰ‌ (U+1f30 GREEK SMALL LETTER IOTA WITH PSILI) → ‌ι‌ (U+3b9 GREEK SMALL LETTER IOTA) ell (ex:3577873,uses:2)
    • ‌ἱ‌ (U+1f31 GREEK SMALL LETTER IOTA WITH DASIA) → ‌ι‌ (U+3b9 GREEK SMALL LETTER IOTA) ell (ex:3577872,uses:2)
    • ‌ἶ‌ (U+1f36 GREEK SMALL LETTER IOTA WITH PSILI AND PERISPOMENI) → ‌ι‌ (U+3b9 GREEK SMALL LETTER IOTA) ell (ex:3577870,uses:3)
    • ‌Ἰ‌ (U+1f38 GREEK CAPITAL LETTER IOTA WITH PSILI) → ‌ἰ‌ (U+1f30 GREEK SMALL LETTER IOTA WITH PSILI) grc (ex:7086480,uses:13)
    • ‌Ἱ‌ (U+1f39 GREEK CAPITAL LETTER IOTA WITH DASIA) → ‌ἱ‌ (U+1f31 GREEK SMALL LETTER IOTA WITH DASIA) grc (ex:6633416,uses:3)
    • ‌ὁ‌ (U+1f41 GREEK SMALL LETTER OMICRON WITH DASIA) → ‌ο‌ (U+3bf GREEK SMALL LETTER OMICRON) ell (ex:8169676,uses:4)
    • ‌ὅ‌ (U+1f45 GREEK SMALL LETTER OMICRON WITH DASIA AND OXIA) → ‌ο‌ (U+3bf GREEK SMALL LETTER OMICRON) ell (ex:3577869,uses:2)
    • ‌Ὀ‌ (U+1f48 GREEK CAPITAL LETTER OMICRON WITH PSILI) → ‌ὀ‌ (U+1f40 GREEK SMALL LETTER OMICRON WITH PSILI) grc (ex:5095069,uses:1)
    • ‌Ὁ‌ (U+1f49 GREEK CAPITAL LETTER OMICRON WITH DASIA) → ‌ο‌ (U+3bf GREEK SMALL LETTER OMICRON) ell (ex:3577872,uses:1)
    • ‌Ὁ‌ (U+1f49 GREEK CAPITAL LETTER OMICRON WITH DASIA) → ‌ὁ‌ (U+1f41 GREEK SMALL LETTER OMICRON WITH DASIA) grc (ex:8207830,uses:63)
    • ‌Ὃ‌ (U+1f4b GREEK CAPITAL LETTER OMICRON WITH DASIA AND VARIA) → ‌ὃ‌ (U+1f43 GREEK SMALL LETTER OMICRON WITH DASIA AND VARIA) grc (ex:2724721,uses:1)
    • ‌Ὄ‌ (U+1f4c GREEK CAPITAL LETTER OMICRON WITH PSILI AND OXIA) → ‌ὄ‌ (U+1f44 GREEK SMALL LETTER OMICRON WITH PSILI AND OXIA) grc (ex:2951372,uses:1)
    • ‌Ὅ‌ (U+1f4d GREEK CAPITAL LETTER OMICRON WITH DASIA AND OXIA) → ‌ὅ‌ (U+1f45 GREEK SMALL LETTER OMICRON WITH DASIA AND OXIA) grc (ex:6674880,uses:2)
    • ‌ὐ‌ (U+1f50 GREEK SMALL LETTER UPSILON WITH PSILI) → ‌υ‌ (U+3c5 GREEK SMALL LETTER UPSILON) ell (ex:8169676,uses:5)
    • ‌ὔ‌ (U+1f54 GREEK SMALL LETTER UPSILON WITH PSILI AND OXIA) → ‌υ‌ (U+3c5 GREEK SMALL LETTER UPSILON) ell (ex:3577893,uses:3)
    • ‌Ὑ‌ (U+1f59 GREEK CAPITAL LETTER UPSILON WITH DASIA) → ‌ὑ‌ (U+1f51 GREEK SMALL LETTER UPSILON WITH DASIA) grc (ex:5095347,uses:1)
    • ‌Ὕ‌ (U+1f5d GREEK CAPITAL LETTER UPSILON WITH DASIA AND OXIA) → ‌ὕ‌ (U+1f55 GREEK SMALL LETTER UPSILON WITH DASIA AND OXIA) grc (ex:6705467,uses:1)
    • ‌ὠ‌ (U+1f60 GREEK SMALL LETTER OMEGA WITH PSILI) → ‌ω‌ (U+3c9 GREEK SMALL LETTER OMEGA) ell (ex:2652576,uses:1)
    • ‌ὡ‌ (U+1f61 GREEK SMALL LETTER OMEGA WITH DASIA) → ‌ω‌ (U+3c9 GREEK SMALL LETTER OMEGA) ell (ex:3577871,uses:1)
    • ‌Ὡ‌ (U+1f69 GREEK CAPITAL LETTER OMEGA WITH DASIA) → ‌ὡ‌ (U+1f61 GREEK SMALL LETTER OMEGA WITH DASIA) grc (ex:7088133,uses:3)
    • ‌Ὤ‌ (U+1f6c GREEK CAPITAL LETTER OMEGA WITH PSILI AND OXIA) → ‌ὤ‌ (U+1f64 GREEK SMALL LETTER OMEGA WITH PSILI AND OXIA) grc (ex:2657032,uses:1)
    • ‌Ὦ‌ (U+1f6e GREEK CAPITAL LETTER OMEGA WITH PSILI AND PERISPOMENI) → ‌ὦ‌ (U+1f66 GREEK SMALL LETTER OMEGA WITH PSILI AND PERISPOMENI) grc (ex:5095104,uses:2)
    • ‌ὰ‌ (U+1f70 GREEK SMALL LETTER ALPHA WITH VARIA) → ‌α‌ (U+3b1 GREEK SMALL LETTER ALPHA) ell (ex:3577893,uses:6)
    • ‌ὲ‌ (U+1f72 GREEK SMALL LETTER EPSILON WITH VARIA) → ‌ε‌ (U+3b5 GREEK SMALL LETTER EPSILON) ell (ex:3577892,uses:6)
    • ‌έ‌ (U+1f73 GREEK SMALL LETTER EPSILON WITH OXIA) → ‌ε‌ (U+3b5 GREEK SMALL LETTER EPSILON) ell (ex:4980695,uses:1)
    • ‌ὴ‌ (U+1f74 GREEK SMALL LETTER ETA WITH VARIA) → ‌ή‌ (U+3ae GREEK SMALL LETTER ETA WITH TONOS) ell (ex:3577872,uses:3)
    • ‌ὶ‌ (U+1f76 GREEK SMALL LETTER IOTA WITH VARIA) → ‌ι‌ (U+3b9 GREEK SMALL LETTER IOTA) ell (ex:3577873,uses:4)
    • ‌ί‌ (U+1f77 GREEK SMALL LETTER IOTA WITH OXIA) → ‌ι‌ (U+3b9 GREEK SMALL LETTER IOTA) ell (ex:6787936,uses:1)
    • ‌ὸ‌ (U+1f78 GREEK SMALL LETTER OMICRON WITH VARIA) → ‌ο‌ (U+3bf GREEK SMALL LETTER OMICRON) ell (ex:3577892,uses:7)
    • ‌ὺ‌ (U+1f7a GREEK SMALL LETTER UPSILON WITH VARIA) → ‌υ‌ (U+3c5 GREEK SMALL LETTER UPSILON) ell (ex:3577893,uses:3)
    • ‌ὼ‌ (U+1f7c GREEK SMALL LETTER OMEGA WITH VARIA) → ‌ω‌ (U+3c9 GREEK SMALL LETTER OMEGA) ell (ex:3577870,uses:1)
    • ‌ώ‌ (U+1f7d GREEK SMALL LETTER OMEGA WITH OXIA) → ‌ω‌ (U+3c9 GREEK SMALL LETTER OMEGA) ell (ex:4980695,uses:1)
    • ‌ᾶ‌ (U+1fb6 GREEK SMALL LETTER ALPHA WITH PERISPOMENI) → ‌α‌ (U+3b1 GREEK SMALL LETTER ALPHA) ell (ex:3577859,uses:1)
    • ‌᾽‌ (U+1fbd GREEK KORONIS) → ‌̓‌ (U+313 COMBINING COMMA ABOVE) grc (ex:5108730,uses:3)
    • ‌᾿‌ (U+1fbf GREEK PSILI) → ‌̓‌ (U+313 COMBINING COMMA ABOVE) grc (ex:6674905,uses:6)
    • ‌ῆ‌ (U+1fc6 GREEK SMALL LETTER ETA WITH PERISPOMENI) → ‌ή‌ (U+3ae GREEK SMALL LETTER ETA WITH TONOS) ell (ex:3577893,uses:3)
    • ‌ῖ‌ (U+1fd6 GREEK SMALL LETTER IOTA WITH PERISPOMENI) → ‌ι‌ (U+3b9 GREEK SMALL LETTER IOTA) ell (ex:8169676,uses:3)
    • ‌ῦ‌ (U+1fe6 GREEK SMALL LETTER UPSILON WITH PERISPOMENI) → ‌υ‌ (U+3c5 GREEK SMALL LETTER UPSILON) ell (ex:3577873,uses:7)
    • ‌ῶ‌ (U+1ff6 GREEK SMALL LETTER OMEGA WITH PERISPOMENI) → ‌ω‌ (U+3c9 GREEK SMALL LETTER OMEGA) ell (ex:8169676,uses:3)
    • ‌῾‌ (U+1ffe GREEK DASIA) → ‌̔‌ (U+314 COMBINING REVERSED COMMA ABOVE) grc (ex:3220532,uses:1)
  • Ա → ա Բ → բ Գ → գ Դ → դ Ե → ե Զ → զ Է → է Ը → ը Թ → թ Ժ → ժ Ի → ի Լ → լ Խ → խ Ծ → ծ Կ → կ Հ → հ Ձ → ձ Ղ → ղ Ճ → ճ Մ → մ Յ → յ Ն → ն Շ → շ Ո → ո Չ → չ Պ → պ Ջ → ջ Ս → ս Վ → վ Տ → տ Ց → ց Ւ → ւ Փ → փ Ք → ք Օ → օ Ֆ → ֆ Affects: Armenian [hye]
    • ‌Ա‌ (U+531 ARMENIAN CAPITAL LETTER AYB) → ‌ա‌ (U+561 ARMENIAN SMALL LETTER AYB) hye (ex:8108601,uses:197)
    • ‌Բ‌ (U+532 ARMENIAN CAPITAL LETTER BEN) → ‌բ‌ (U+562 ARMENIAN SMALL LETTER BEN) hye (ex:8096000,uses:93)
    • ‌Գ‌ (U+533 ARMENIAN CAPITAL LETTER GIM) → ‌գ‌ (U+563 ARMENIAN SMALL LETTER GIM) hye (ex:5157473,uses:29)
    • ‌Դ‌ (U+534 ARMENIAN CAPITAL LETTER DA) → ‌դ‌ (U+564 ARMENIAN SMALL LETTER DA) hye (ex:5157514,uses:164)
    • ‌Ե‌ (U+535 ARMENIAN CAPITAL LETTER ECH) → ‌ե‌ (U+565 ARMENIAN SMALL LETTER ECH) hye (ex:5157524,uses:251)
    • ‌Զ‌ (U+536 ARMENIAN CAPITAL LETTER ZA) → ‌զ‌ (U+566 ARMENIAN SMALL LETTER ZA) hye (ex:5157478,uses:33)
    • ‌Է‌ (U+537 ARMENIAN CAPITAL LETTER EH) → ‌է‌ (U+567 ARMENIAN SMALL LETTER EH) hye (ex:5157503,uses:5)
    • ‌Ը‌ (U+538 ARMENIAN CAPITAL LETTER ET) → ‌ը‌ (U+568 ARMENIAN SMALL LETTER ET) hye (ex:3620998,uses:2)
    • ‌Թ‌ (U+539 ARMENIAN CAPITAL LETTER TO) → ‌թ‌ (U+569 ARMENIAN SMALL LETTER TO) hye (ex:8108398,uses:383)
    • ‌Ժ‌ (U+53a ARMENIAN CAPITAL LETTER ZHE) → ‌ժ‌ (U+56a ARMENIAN SMALL LETTER ZHE) hye (ex:5157469,uses:8)
    • ‌Ի‌ (U+53b ARMENIAN CAPITAL LETTER INI) → ‌ի‌ (U+56b ARMENIAN SMALL LETTER INI) hye (ex:5157383,uses:116)
    • ‌Լ‌ (U+53c ARMENIAN CAPITAL LETTER LIWN) → ‌լ‌ (U+56c ARMENIAN SMALL LETTER LIWN) hye (ex:8108390,uses:33)
    • ‌Խ‌ (U+53d ARMENIAN CAPITAL LETTER XEH) → ‌խ‌ (U+56d ARMENIAN SMALL LETTER XEH) hye (ex:5157454,uses:38)
    • ‌Ծ‌ (U+53e ARMENIAN CAPITAL LETTER CA) → ‌ծ‌ (U+56e ARMENIAN SMALL LETTER CA) hye (ex:4903846,uses:9)
    • ‌Կ‌ (U+53f ARMENIAN CAPITAL LETTER KEN) → ‌կ‌ (U+56f ARMENIAN SMALL LETTER KEN) hye (ex:5635307,uses:67)
    • ‌Հ‌ (U+540 ARMENIAN CAPITAL LETTER HO) → ‌հ‌ (U+570 ARMENIAN SMALL LETTER HO) hye (ex:7774704,uses:103)
    • ‌Ձ‌ (U+541 ARMENIAN CAPITAL LETTER JA) → ‌ձ‌ (U+571 ARMENIAN SMALL LETTER JA) hye (ex:8108404,uses:21)
    • ‌Ղ‌ (U+542 ARMENIAN CAPITAL LETTER GHAD) → ‌ղ‌ (U+572 ARMENIAN SMALL LETTER GHAD) hye (ex:5157415,uses:2)
    • ‌Ճ‌ (U+543 ARMENIAN CAPITAL LETTER CHEH) → ‌ճ‌ (U+573 ARMENIAN SMALL LETTER CHEH) hye (ex:5157467,uses:5)
    • ‌Մ‌ (U+544 ARMENIAN CAPITAL LETTER MEN) → ‌մ‌ (U+574 ARMENIAN SMALL LETTER MEN) hye (ex:5157536,uses:244)
    • ‌Յ‌ (U+545 ARMENIAN CAPITAL LETTER YI) → ‌յ‌ (U+575 ARMENIAN SMALL LETTER YI) hye (ex:5156293,uses:5)
    • ‌Ն‌ (U+546 ARMENIAN CAPITAL LETTER NOW) → ‌ն‌ (U+576 ARMENIAN SMALL LETTER NOW) hye (ex:8108395,uses:162)
    • ‌Շ‌ (U+547 ARMENIAN CAPITAL LETTER SHA) → ‌շ‌ (U+577 ARMENIAN SMALL LETTER SHA) hye (ex:5157533,uses:86)
    • ‌Ո‌ (U+548 ARMENIAN CAPITAL LETTER VO) → ‌ո‌ (U+578 ARMENIAN SMALL LETTER VO) hye (ex:5157489,uses:84)
    • ‌Չ‌ (U+549 ARMENIAN CAPITAL LETTER CHA) → ‌չ‌ (U+579 ARMENIAN SMALL LETTER CHA) hye (ex:8108397,uses:21)
    • ‌Պ‌ (U+54a ARMENIAN CAPITAL LETTER PEH) → ‌պ‌ (U+57a ARMENIAN SMALL LETTER PEH) hye (ex:5157513,uses:40)
    • ‌Ջ‌ (U+54b ARMENIAN CAPITAL LETTER JHEH) → ‌ջ‌ (U+57b ARMENIAN SMALL LETTER JHEH) hye (ex:5978194,uses:8)
    • ‌Ս‌ (U+54d ARMENIAN CAPITAL LETTER SEH) → ‌ս‌ (U+57d ARMENIAN SMALL LETTER SEH) hye (ex:5157510,uses:72)
    • ‌Վ‌ (U+54e ARMENIAN CAPITAL LETTER VEW) → ‌վ‌ (U+57e ARMENIAN SMALL LETTER VEW) hye (ex:8108390,uses:53)
    • ‌Տ‌ (U+54f ARMENIAN CAPITAL LETTER TIWN) → ‌տ‌ (U+57f ARMENIAN SMALL LETTER TIWN) hye (ex:8108592,uses:25)
    • ‌Ց‌ (U+551 ARMENIAN CAPITAL LETTER CO) → ‌ց‌ (U+581 ARMENIAN SMALL LETTER CO) hye (ex:5157357,uses:8)
    • ‌Ւ‌ (U+552 ARMENIAN CAPITAL LETTER YIWN) → ‌ւ‌ (U+582 ARMENIAN SMALL LETTER YIWN) hye (ex:3356645,uses:1)
    • ‌Փ‌ (U+553 ARMENIAN CAPITAL LETTER PIWR) → ‌փ‌ (U+583 ARMENIAN SMALL LETTER PIWR) hye (ex:8108392,uses:13)
    • ‌Ք‌ (U+554 ARMENIAN CAPITAL LETTER KEH) → ‌ք‌ (U+584 ARMENIAN SMALL LETTER KEH) hye (ex:8108388,uses:24)
    • ‌Օ‌ (U+555 ARMENIAN CAPITAL LETTER OH) → ‌օ‌ (U+585 ARMENIAN SMALL LETTER OH) hye (ex:5157508,uses:6)
    • ‌Ֆ‌ (U+556 ARMENIAN CAPITAL LETTER FEH) → ‌ֆ‌ (U+586 ARMENIAN SMALL LETTER FEH) hye (ex:4911469,uses:3)
  • Ꭰ → ꭰ Ꭱ → ꭱ Ꭴ → ꭴ Ꭶ → ꭶ Ꭷ → ꭷ Ꭸ → ꭸ Ꭹ → ꭹ Ꭺ → ꭺ Ꭼ → ꭼ Ꭽ → ꭽ Ꭿ → ꭿ Ꮂ → ꮂ Ꮃ → ꮃ Ꮅ → ꮅ Ꮆ → ꮆ Ꮈ → ꮈ Ꮎ → ꮎ Ꮑ → ꮑ Ꮒ → ꮒ Ꮓ → ꮓ Ꮕ → ꮕ Ꮖ → ꮖ Ꮗ → ꮗ Ꮙ → ꮙ Ꮛ → ꮛ Ꮜ → ꮜ Ꮝ → ꮝ Ꮟ → ꮟ Ꮡ → ꮡ Ꮢ → ꮢ Ꮣ → ꮣ Ꮤ → ꮤ Ꮥ → ꮥ Ꮧ → ꮧ Ꮨ → ꮨ Ꮩ → ꮩ Ꮪ → ꮪ Ꮭ → ꮭ Ꮰ → ꮰ Ꮱ → ꮱ Ꮲ → ꮲ Ꮳ → ꮳ Ꮵ → ꮵ Ꮷ → ꮷ Ꮸ → ꮸ Ꮹ → ꮹ Ꮺ → ꮺ Ꮻ → ꮻ Ꮼ → ꮼ Ꮿ → ꮿ Ᏸ → ᏸ Ᏹ → ᏹ Ᏺ → ᏺ Ᏼ → ᏼ Affects: Cherokee [chr]
    • ‌Ꭰ‌ (U+13a0 CHEROKEE LETTER A) → ‌ꭰ‌ (U+ab70 CHEROKEE SMALL LETTER A) chr (ex:4254558,uses:9)
    • ‌Ꭱ‌ (U+13a1 CHEROKEE LETTER E) → ‌ꭱ‌ (U+ab71 CHEROKEE SMALL LETTER E) chr (ex:5418749,uses:3)
    • ‌Ꭴ‌ (U+13a4 CHEROKEE LETTER U) → ‌ꭴ‌ (U+ab74 CHEROKEE SMALL LETTER U) chr (ex:3158211,uses:6)
    • ‌Ꭶ‌ (U+13a6 CHEROKEE LETTER GA) → ‌ꭶ‌ (U+ab76 CHEROKEE SMALL LETTER GA) chr (ex:5418749,uses:11)
    • ‌Ꭷ‌ (U+13a7 CHEROKEE LETTER KA) → ‌ꭷ‌ (U+ab77 CHEROKEE SMALL LETTER KA) chr (ex:4254558,uses:4)
    • ‌Ꭸ‌ (U+13a8 CHEROKEE LETTER GE) → ‌ꭸ‌ (U+ab78 CHEROKEE SMALL LETTER GE) chr (ex:2422758,uses:2)
    • ‌Ꭹ‌ (U+13a9 CHEROKEE LETTER GI) → ‌ꭹ‌ (U+ab79 CHEROKEE SMALL LETTER GI) chr (ex:3176269,uses:9)
    • ‌Ꭺ‌ (U+13aa CHEROKEE LETTER GO) → ‌ꭺ‌ (U+ab7a CHEROKEE SMALL LETTER GO) chr (ex:5418759,uses:6)
    • ‌Ꭼ‌ (U+13ac CHEROKEE LETTER GV) → ‌ꭼ‌ (U+ab7c CHEROKEE SMALL LETTER GV) chr (ex:5418759,uses:7)
    • ‌Ꭽ‌ (U+13ad CHEROKEE LETTER HA) → ‌ꭽ‌ (U+ab7d CHEROKEE SMALL LETTER HA) chr (ex:5418759,uses:9)
    • ‌Ꭿ‌ (U+13af CHEROKEE LETTER HI) → ‌ꭿ‌ (U+ab7f CHEROKEE SMALL LETTER HI) chr (ex:3158198,uses:3)
    • ‌Ꮂ‌ (U+13b2 CHEROKEE LETTER HV) → ‌ꮂ‌ (U+ab82 CHEROKEE SMALL LETTER HV) chr (ex:2424258,uses:1)
    • ‌Ꮃ‌ (U+13b3 CHEROKEE LETTER LA) → ‌ꮃ‌ (U+ab83 CHEROKEE SMALL LETTER LA) chr (ex:2422749,uses:1)
    • ‌Ꮅ‌ (U+13b5 CHEROKEE LETTER LI) → ‌ꮅ‌ (U+ab85 CHEROKEE SMALL LETTER LI) chr (ex:5418749,uses:12)
    • ‌Ꮆ‌ (U+13b6 CHEROKEE LETTER LO) → ‌ꮆ‌ (U+ab86 CHEROKEE SMALL LETTER LO) chr (ex:3158211,uses:3)
    • ‌Ꮈ‌ (U+13b8 CHEROKEE LETTER LV) → ‌ꮈ‌ (U+ab88 CHEROKEE SMALL LETTER LV) chr (ex:3158198,uses:3)
    • ‌Ꮎ‌ (U+13be CHEROKEE LETTER NA) → ‌ꮎ‌ (U+ab8e CHEROKEE SMALL LETTER NA) chr (ex:3592632,uses:3)
    • ‌Ꮑ‌ (U+13c1 CHEROKEE LETTER NE) → ‌ꮑ‌ (U+ab91 CHEROKEE SMALL LETTER NE) chr (ex:2423583,uses:1)
    • ‌Ꮒ‌ (U+13c2 CHEROKEE LETTER NI) → ‌ꮒ‌ (U+ab92 CHEROKEE SMALL LETTER NI) chr (ex:3176269,uses:6)
    • ‌Ꮓ‌ (U+13c3 CHEROKEE LETTER NO) → ‌ꮓ‌ (U+ab93 CHEROKEE SMALL LETTER NO) chr (ex:2422745,uses:1)
    • ‌Ꮕ‌ (U+13c5 CHEROKEE LETTER NV) → ‌ꮕ‌ (U+ab95 CHEROKEE SMALL LETTER NV) chr (ex:2422757,uses:2)
    • ‌Ꮖ‌ (U+13c6 CHEROKEE LETTER QUA) → ‌ꮖ‌ (U+ab96 CHEROKEE SMALL LETTER QUA) chr (ex:4254558,uses:4)
    • ‌Ꮗ‌ (U+13c7 CHEROKEE LETTER QUE) → ‌ꮗ‌ (U+ab97 CHEROKEE SMALL LETTER QUE) chr (ex:3633475,uses:2)
    • ‌Ꮙ‌ (U+13c9 CHEROKEE LETTER QUO) → ‌ꮙ‌ (U+ab99 CHEROKEE SMALL LETTER QUO) chr (ex:2422745,uses:1)
    • ‌Ꮛ‌ (U+13cb CHEROKEE LETTER QUV) → ‌ꮛ‌ (U+ab9b CHEROKEE SMALL LETTER QUV) chr (ex:3633475,uses:1)
    • ‌Ꮜ‌ (U+13cc CHEROKEE LETTER SA) → ‌ꮜ‌ (U+ab9c CHEROKEE SMALL LETTER SA) chr (ex:2422746,uses:1)
    • ‌Ꮝ‌ (U+13cd CHEROKEE LETTER S) → ‌ꮝ‌ (U+ab9d CHEROKEE SMALL LETTER S) chr (ex:3633475,uses:9)
    • ‌Ꮟ‌ (U+13cf CHEROKEE LETTER SI) → ‌ꮟ‌ (U+ab9f CHEROKEE SMALL LETTER SI) chr (ex:3176269,uses:3)
    • ‌Ꮡ‌ (U+13d1 CHEROKEE LETTER SU) → ‌ꮡ‌ (U+aba1 CHEROKEE SMALL LETTER SU) chr (ex:2423583,uses:2)
    • ‌Ꮢ‌ (U+13d2 CHEROKEE LETTER SV) → ‌ꮢ‌ (U+aba2 CHEROKEE SMALL LETTER SV) chr (ex:2423586,uses:3)
    • ‌Ꮣ‌ (U+13d3 CHEROKEE LETTER DA) → ‌ꮣ‌ (U+aba3 CHEROKEE SMALL LETTER DA) chr (ex:3158100,uses:4)
    • ‌Ꮤ‌ (U+13d4 CHEROKEE LETTER TA) → ‌ꮤ‌ (U+aba4 CHEROKEE SMALL LETTER TA) chr (ex:3592632,uses:4)
    • ‌Ꮥ‌ (U+13d5 CHEROKEE LETTER DE) → ‌ꮥ‌ (U+aba5 CHEROKEE SMALL LETTER DE) chr (ex:3633475,uses:3)
    • ‌Ꮧ‌ (U+13d7 CHEROKEE LETTER DI) → ‌ꮧ‌ (U+aba7 CHEROKEE SMALL LETTER DI) chr (ex:3633475,uses:6)
    • ‌Ꮨ‌ (U+13d8 CHEROKEE LETTER TI) → ‌ꮨ‌ (U+aba8 CHEROKEE SMALL LETTER TI) chr (ex:5418759,uses:3)
    • ‌Ꮩ‌ (U+13d9 CHEROKEE LETTER DO) → ‌ꮩ‌ (U+aba9 CHEROKEE SMALL LETTER DO) chr (ex:3633475,uses:4)
    • ‌Ꮪ‌ (U+13da CHEROKEE LETTER DU) → ‌ꮪ‌ (U+abaa CHEROKEE SMALL LETTER DU) chr (ex:4254558,uses:2)
    • ‌Ꮭ‌ (U+13dd CHEROKEE LETTER TLA) → ‌ꮭ‌ (U+abad CHEROKEE SMALL LETTER TLA) chr (ex:3633475,uses:4)
    • ‌Ꮰ‌ (U+13e0 CHEROKEE LETTER TLO) → ‌ꮰ‌ (U+abb0 CHEROKEE SMALL LETTER TLO) chr (ex:3633475,uses:2)
    • ‌Ꮱ‌ (U+13e1 CHEROKEE LETTER TLU) → ‌ꮱ‌ (U+abb1 CHEROKEE SMALL LETTER TLU) chr (ex:2423576,uses:1)
    • ‌Ꮲ‌ (U+13e2 CHEROKEE LETTER TLV) → ‌ꮲ‌ (U+abb2 CHEROKEE SMALL LETTER TLV) chr (ex:3158166,uses:5)
    • ‌Ꮳ‌ (U+13e3 CHEROKEE LETTER TSA) → ‌ꮳ‌ (U+abb3 CHEROKEE SMALL LETTER TSA) chr (ex:2422758,uses:1)
    • ‌Ꮵ‌ (U+13e5 CHEROKEE LETTER TSI) → ‌ꮵ‌ (U+abb5 CHEROKEE SMALL LETTER TSI) chr (ex:5418749,uses:8)
    • ‌Ꮷ‌ (U+13e7 CHEROKEE LETTER TSU) → ‌ꮷ‌ (U+abb7 CHEROKEE SMALL LETTER TSU) chr (ex:2422745,uses:1)
    • ‌Ꮸ‌ (U+13e8 CHEROKEE LETTER TSV) → ‌ꮸ‌ (U+abb8 CHEROKEE SMALL LETTER TSV) chr (ex:3158166,uses:1)
    • ‌Ꮹ‌ (U+13e9 CHEROKEE LETTER WA) → ‌ꮹ‌ (U+abb9 CHEROKEE SMALL LETTER WA) chr (ex:5418759,uses:5)
    • ‌Ꮺ‌ (U+13ea CHEROKEE LETTER WE) → ‌ꮺ‌ (U+abba CHEROKEE SMALL LETTER WE) chr (ex:2422746,uses:1)
    • ‌Ꮻ‌ (U+13eb CHEROKEE LETTER WI) → ‌ꮻ‌ (U+abbb CHEROKEE SMALL LETTER WI) chr (ex:4254558,uses:3)
    • ‌Ꮼ‌ (U+13ec CHEROKEE LETTER WO) → ‌ꮼ‌ (U+abbc CHEROKEE SMALL LETTER WO) chr (ex:3176269,uses:5)
    • ‌Ꮿ‌ (U+13ef CHEROKEE LETTER YA) → ‌ꮿ‌ (U+abbf CHEROKEE SMALL LETTER YA) chr (ex:3633475,uses:3)
    • ‌Ᏸ‌ (U+13f0 CHEROKEE LETTER YE) → ‌ᏸ‌ (U+13f8 CHEROKEE SMALL LETTER YE) chr (ex:3158211,uses:1)
    • ‌Ᏹ‌ (U+13f1 CHEROKEE LETTER YI) → ‌ᏹ‌ (U+13f9 CHEROKEE SMALL LETTER YI) chr (ex:5418759,uses:3)
    • ‌Ᏺ‌ (U+13f2 CHEROKEE LETTER YO) → ‌ᏺ‌ (U+13fa CHEROKEE SMALL LETTER YO) chr (ex:3158211,uses:1)
    • ‌Ᏼ‌ (U+13f4 CHEROKEE LETTER YV) → ‌ᏼ‌ (U+13fc CHEROKEE SMALL LETTER YV) chr (ex:2422757,uses:1)
  • ゜ → ゚ Affects: Japanese [jpn]
    • ‌゜‌ (U+309c KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK) → ‌゚‌ (U+309a COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK) jpn (ex:236102,uses:1)
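
For what it's worth, the Cherokee pairs above appear to match Unicode's simple lowercase mapping, which Python exposes through `str.lower()`. Below is a minimal check, assuming a Python build whose Unicode data includes the Cherokee small letters (Unicode 8.0 or later). `CHEROKEE_PAIRS`, `lowercase_matches` and `casefold_target` are hypothetical names for this sketch, and the pairs are only a hand-copied sample of the list, not the full table.

```python
# Sample of (capital, small) code point pairs copied from the list above.
CHEROKEE_PAIRS = [
    (0x13D7, 0xABA7),  # DI
    (0x13E9, 0xABB9),  # WA
    (0x13F0, 0x13F8),  # YE
    (0x13F4, 0x13FC),  # YV
]


def lowercase_matches(upper_cp: int, lower_cp: int) -> bool:
    """Return True if str.lower() maps the first code point to the second."""
    return chr(upper_cp).lower() == chr(lower_cp)


def casefold_target(cp: int) -> int:
    """Return the single code point that str.casefold() maps a character to."""
    folded = chr(cp).casefold()
    if len(folded) != 1:
        raise ValueError(f"U+{cp:04X} folds to more than one code point")
    return ord(folded)


if __name__ == "__main__":
    for upper_cp, lower_cp in CHEROKEE_PAIRS:
        ok = lowercase_matches(upper_cp, lower_cp)
        print(f"U+{upper_cp:04X} -> U+{lower_cp:04X}: {ok}")
```

One thing to keep in mind: Unicode case folding goes in the opposite direction for Cherokee (the small letters fold to the capital ones), so `casefold_target(0xABA7)` gives `0x13D7`. Whichever direction the search index picks, it has to be applied to both the indexed text and the query.
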
@Yorwba
Contributor Author

Yorwba commented Oct 4, 2019

And GitHub has a character limit on the issue text, so here's the second part:

Case Alternatives (multiple codepoints)

  • ß → ss í → i̇́ İ → i̇ ẞ → ss Affects: Danish [dan], Kölsch [ksh], Swabian [swg], Low German (Low Saxon) [nds], Esperanto [epo], Crimean Tatar [crh], Venetian [vec], Galician [glg], Unknown Language, Latin [lat], Hungarian [hun], Arabic [ara], Hindi [hin], Bavarian [bar], French [fra], Hebrew [heb], Italian [ita], Ottoman Turkish [ota], Czech [ces], Mandarin Chinese [cmn], Japanese [jpn], Dutch [nld], Turkish [tur], Ido [ido], Slovenian [slv], Talossan [tzl], Finnish [fin], Berber [ber], Afrikaans [afr], English [eng], Spanish [spa], Lithuanian [lit], Talysh [tly], Zaza [zza], Russian [rus], Kabyle [kab], Interlingua [ina], Polish [pol], Portuguese [por], Toki Pona [toki], German [deu], Basque [eus], Tatar [tat]
    • ‌ß‌ (U+0xdf LATIN SMALL LETTER SHARP S) → ‌ss‌ (U+0x73 LATIN SMALL LETTER S)(U+0x73 LATIN SMALL LETTER S) Affects: afr (ex:2332641,uses:1), ara (ex:6564379,uses:1), bar (ex:5184415,uses:12), ber (ex:7527656,uses:1), ces (ex:2262091,uses:1), cmn (ex:1489800,uses:1), dan (ex:4632044,uses:2), deu (ex:89,uses:36638), eng (ex:556059,uses:8), epo (ex:590275,uses:12), eus (ex:920679,uses:1), fin (ex:3812353,uses:1), fra (ex:556061,uses:5), glg (ex:1727537,uses:1), heb (ex:590637,uses:1), hin (ex:4063579,uses:1), hun (ex:5604124,uses:2), ina (ex:3049206,uses:1), ita (ex:828363,uses:4), jpn (ex:2481158,uses:1), kab (ex:7090847,uses:1), ksh (ex:3089066,uses:12), nds (ex:807724,uses:58), nld (ex:683643,uses:5), pol (ex:937244,uses:1), por (ex:734642,uses:6), rus (ex:556075,uses:1), slv (ex:3811360,uses:1), spa (ex:587278,uses:4), swg (ex:5481300,uses:157), toki (ex:5898696,uses:1), tur (ex:937778,uses:4), tzl (ex:5176671,uses:17)
    • ‌í‌ (U+0xed LATIN SMALL LETTER I WITH ACUTE) → ‌i̇́‌ (U+0x69 LATIN SMALL LETTER I)(U+0x307 COMBINING DOT ABOVE)(U+0x301 COMBINING ACUTE ACCENT) Affects: lit (ex:7069241,uses:1)
    • ‌İ‌ (U+0x130 LATIN CAPITAL LETTER I WITH DOT ABOVE) → ‌i̇‌ (U+0x69 LATIN SMALL LETTER I)(U+0x307 COMBINING DOT ABOVE) Affects: Unknown Language (ex:7524825,uses:6), crh (ex:3212830,uses:3), eng (ex:4796838,uses:3), ido (ex:7456954,uses:1), lat (ex:7987987,uses:1), nld (ex:7535123,uses:1), ota (ex:7765625,uses:77), tat (ex:474461,uses:27), tly (ex:5741594,uses:1), vec (ex:8080814,uses:1), zza (ex:4434802,uses:8)
    • ‌ẞ‌ (U+0x1e9e LATIN CAPITAL LETTER SHARP S) → ‌ss‌ (U+0x73 LATIN SMALL LETTER S)(U+0x73 LATIN SMALL LETTER S) Affects: deu (ex:1651474,uses:1)
  • ᾄ → ἄι ᾐ → ἠι ᾔ → ἤι ᾕ → ἥι ᾖ → ἦι ᾗ → ἧι ᾧ → ὧι ᾳ → αι ᾷ → αι ᾷ → ᾶι ῂ → ὴι ῃ → ηι ῄ → ήι ῇ → ῆι ῞ → ̔́ ῳ → ωι ῴ → ώι ῷ → ῶι Affects: Ancient Greek [grc], Greek [ell]
    • ‌ᾄ‌ (U+0x1f84 GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA AND YPOGEGRAMMENI) → ‌ἄι‌ (U+0x1f04 GREEK SMALL LETTER ALPHA WITH PSILI AND OXIA)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:8491858,uses:1)
    • ‌ᾐ‌ (U+0x1f90 GREEK SMALL LETTER ETA WITH PSILI AND YPOGEGRAMMENI) → ‌ἠι‌ (U+0x1f20 GREEK SMALL LETTER ETA WITH PSILI)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:5096206,uses:1)
    • ‌ᾔ‌ (U+0x1f94 GREEK SMALL LETTER ETA WITH PSILI AND OXIA AND YPOGEGRAMMENI) → ‌ἤι‌ (U+0x1f24 GREEK SMALL LETTER ETA WITH PSILI AND OXIA)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:5096212,uses:2)
    • ‌ᾕ‌ (U+0x1f95 GREEK SMALL LETTER ETA WITH DASIA AND OXIA AND YPOGEGRAMMENI) → ‌ἥι‌ (U+0x1f25 GREEK SMALL LETTER ETA WITH DASIA AND OXIA)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:3041767,uses:1)
    • ‌ᾖ‌ (U+0x1f96 GREEK SMALL LETTER ETA WITH PSILI AND PERISPOMENI AND YPOGEGRAMMENI) → ‌ἦι‌ (U+0x1f26 GREEK SMALL LETTER ETA WITH PSILI AND PERISPOMENI)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:5095129,uses:4)
    • ‌ᾗ‌ (U+0x1f97 GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI) → ‌ἧι‌ (U+0x1f27 GREEK SMALL LETTER ETA WITH DASIA AND PERISPOMENI)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:6674886,uses:1)
    • ‌ᾧ‌ (U+0x1fa7 GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI AND YPOGEGRAMMENI) → ‌ὧι‌ (U+0x1f67 GREEK SMALL LETTER OMEGA WITH DASIA AND PERISPOMENI)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:3103771,uses:3)
    • ‌ᾳ‌ (U+0x1fb3 GREEK SMALL LETTER ALPHA WITH YPOGEGRAMMENI) → ‌αι‌ (U+0x3b1 GREEK SMALL LETTER ALPHA)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:3103803,uses:9)
    • ‌ᾷ‌ (U+0x1fb7 GREEK SMALL LETTER ALPHA WITH PERISPOMENI AND YPOGEGRAMMENI) → ‌αι‌ (U+0x3b1 GREEK SMALL LETTER ALPHA)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: ell (ex:3577873,uses:1)
    • ‌ᾷ‌ (U+0x1fb7 GREEK SMALL LETTER ALPHA WITH PERISPOMENI AND YPOGEGRAMMENI) → ‌ᾶι‌ (U+0x1fb6 GREEK SMALL LETTER ALPHA WITH PERISPOMENI)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:5094979,uses:18)
    • ‌ῂ‌ (U+0x1fc2 GREEK SMALL LETTER ETA WITH VARIA AND YPOGEGRAMMENI) → ‌ὴι‌ (U+0x1f74 GREEK SMALL LETTER ETA WITH VARIA)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:6064076,uses:1)
    • ‌ῃ‌ (U+0x1fc3 GREEK SMALL LETTER ETA WITH YPOGEGRAMMENI) → ‌ηι‌ (U+0x3b7 GREEK SMALL LETTER ETA)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:2730756,uses:28)
    • ‌ῄ‌ (U+0x1fc4 GREEK SMALL LETTER ETA WITH OXIA AND YPOGEGRAMMENI) → ‌ήι‌ (U+0x3ae GREEK SMALL LETTER ETA WITH TONOS)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:5096325,uses:4)
    • ‌ῇ‌ (U+0x1fc7 GREEK SMALL LETTER ETA WITH PERISPOMENI AND YPOGEGRAMMENI) → ‌ῆι‌ (U+0x1fc6 GREEK SMALL LETTER ETA WITH PERISPOMENI)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:2950602,uses:35)
    • ‌῞‌ (U+0x1fde GREEK DASIA AND OXIA) → ‌̔́‌ (U+0x314 COMBINING REVERSED COMMA ABOVE)(U+0x301 COMBINING ACUTE ACCENT) Affects: grc (ex:3012465,uses:1)
    • ‌ῳ‌ (U+0x1ff3 GREEK SMALL LETTER OMEGA WITH YPOGEGRAMMENI) → ‌ωι‌ (U+0x3c9 GREEK SMALL LETTER OMEGA)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:2458033,uses:37)
    • ‌ῴ‌ (U+0x1ff4 GREEK SMALL LETTER OMEGA WITH OXIA AND YPOGEGRAMMENI) → ‌ώι‌ (U+0x3ce GREEK SMALL LETTER OMEGA WITH TONOS)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:6632591,uses:3)
    • ‌ῷ‌ (U+0x1ff7 GREEK SMALL LETTER OMEGA WITH PERISPOMENI AND YPOGEGRAMMENI) → ‌ῶι‌ (U+0x1ff6 GREEK SMALL LETTER OMEGA WITH PERISPOMENI)(U+0x3b9 GREEK SMALL LETTER IOTA) Affects: grc (ex:2724723,uses:54)
  • ﷺ → صلىاللهعليهوسلم Affects: Turkish [tur]
    • ‌ﷺ‌ (U+0xfdfa ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM) → ‌صلىاللهعليهوسلم‌ (U+0x635 ARABIC LETTER SAD)(U+0x644 ARABIC LETTER LAM)(U+0x649 ARABIC LETTER ALEF MAKSURA)(U+0x627 ARABIC LETTER ALEF)(U+0x644 ARABIC LETTER LAM)(U+0x644 ARABIC LETTER LAM)(U+0x647 ARABIC LETTER HEH)(U+0x639 ARABIC LETTER AIN)(U+0x644 ARABIC LETTER LAM)(U+0x64a ARABIC LETTER YEH)(U+0x647 ARABIC LETTER HEH)(U+0x648 ARABIC LETTER WAW)(U+0x633 ARABIC LETTER SEEN)(U+0x644 ARABIC LETTER LAM)(U+0x645 ARABIC LETTER MEEM) Affects: tur (ex:8099167,uses:1)
  • № → no ™ → tm Affects: Russian [rus], Kazakh [kaz], Belarusian [bel], Bulgarian [bul], Meadow Mari [mhr], French [fra], Tatar [tat], English [eng], Spanish [spa]
    • ‌№‌ (U+0x2116 NUMERO SIGN) → ‌no‌ (U+0x6e LATIN SMALL LETTER N)(U+0x6f LATIN SMALL LETTER O) Affects: bel (ex:409043,uses:2), bul (ex:2860457,uses:1), kaz (ex:1649477,uses:1), mhr (ex:2878124,uses:1), rus (ex:407514,uses:18), spa (ex:1436726,uses:1), tat (ex:2879788,uses:1)
    • ‌™‌ (U+0x2122 TRADE MARK SIGN) → ‌tm‌ (U+0x74 LATIN SMALL LETTER T)(U+0x6d LATIN SMALL LETTER M) Affects: eng (ex:1270412,uses:1), fra (ex:1270415,uses:1)
  • ¼ → 14 ½ → 12 ⅓ → 13 Affects: Danish [dan], English [eng], German [deu]
    • ‌¼‌ (U+0xbc VULGAR FRACTION ONE QUARTER) → ‌14‌ (U+0x31 DIGIT ONE)(U+0x34 DIGIT FOUR) Affects: deu (ex:6310288,uses:1), eng (ex:6310278,uses:1)
    • ‌½‌ (U+0xbd VULGAR FRACTION ONE HALF) → ‌12‌ (U+0x31 DIGIT ONE)(U+0x32 DIGIT TWO) Affects: dan (ex:6310967,uses:1), deu (ex:6310250,uses:7), eng (ex:6310241,uses:5)
    • ‌⅓‌ (U+0x2153 VULGAR FRACTION ONE THIRD) → ‌13‌ (U+0x31 DIGIT ONE)(U+0x33 DIGIT THREE) Affects: deu (ex:8044695,uses:1)
  • Mή → mη Ẹ̀ → ẹ̀ Άι → αϊ Άσ → ας Έί → εϊ Έι → εϊ Βή → βη Ζή → ζη Λή → λη Μή → μη Μῆ → μη Νή → νη Πή → πη Ρή → ρη Σή → ση Τὴ → τη Χή → χη Ψή → ψη άι → αϊ άσ → ας άυ → αϋ έι → εϊ έσ → ες έυ → εϋ ήµ → ημ ήι → ηϊ ίσ → ις όι → οϊ ύι → υϊ ώι → ωϊ ώσ → ως ᾷς → ᾶις ῃς → ηις ῇς → ῆις Affects: Ancient Greek [grc], Greek [ell]
    • ‌Mή‌ (U+0x4d LATIN CAPITAL LETTER M)(U+0x3ae GREEK SMALL LETTER ETA WITH TONOS) → ‌mη‌ (U+0x6d LATIN SMALL LETTER M)(U+0x3b7 GREEK SMALL LETTER ETA) Affects: ell (ex:1290852,uses:1)
    • ‌Ẹ̀‌ (U+0xc8 LATIN CAPITAL LETTER E WITH GRAVE)(U+0x323 COMBINING DOT BELOW) → ‌ẹ̀‌ (U+0x1eb9 LATIN SMALL LETTER E WITH DOT BELOW)(U+0x300 COMBINING GRAVE ACCENT) Affects:
    • ‌Άι‌ (U+0x386 GREEK CAPITAL LETTER ALPHA WITH TONOS)(U+0x3b9 GREEK SMALL LETTER IOTA) → ‌αϊ‌ (U+0x3b1 GREEK SMALL LETTER ALPHA)(U+0x3ca GREEK SMALL LETTER IOTA WITH DIALYTIKA) Affects: ell (ex:1515707,uses:4)
    • ‌Άσ‌ (U+0x386 GREEK CAPITAL LETTER ALPHA WITH TONOS)(U+0x3c3 GREEK SMALL LETTER SIGMA) → ‌ας‌ (U+0x3b1 GREEK SMALL LETTER ALPHA)(U+0x3c2 GREEK SMALL LETTER FINAL SIGMA) Affects: ell (ex:1285457,uses:34)
    • ‌Έί‌ (U+0x388 GREEK CAPITAL LETTER EPSILON WITH TONOS)(U+0x3af GREEK SMALL LETTER IOTA WITH TONOS) → ‌εϊ‌ (U+0x3b5 GREEK SMALL LETTER EPSILON)(U+0x3ca GREEK SMALL LETTER IOTA WITH DIALYTIKA) Affects: ell (ex:1389708,uses:1)
    • ‌Έι‌ (U+0x388 GREEK CAPITAL LETTER EPSILON WITH TONOS)(U+0x3b9 GREEK SMALL LETTER IOTA) → ‌εϊ‌ (U+0x3b5 GREEK SMALL LETTER EPSILON)(U+0x3ca GREEK SMALL LETTER IOTA WITH DIALYTIKA) Affects: ell (ex:3399178,uses:7)
    • ‌Βή‌ (U+0x392 GREEK CAPITAL LETTER BETA)(U+0x3ae GREEK SMALL LETTER ETA WITH TONOS) → ‌βη‌ (U+0x3b2 GREEK SMALL LETTER BETA)(U+0x3b7 GREEK SMALL LETTER ETA) Affects: ell (ex:5456768,uses:1)
    • ‌Ζή‌ (U+0x396 GREEK CAPITAL LETTER ZETA)(U+0x3ae GREEK SMALL LETTER ETA WITH TONOS) → ‌ζη‌ (U+0x3b6 GREEK SMALL LETTER ZETA)(U+0x3b7 GREEK SMALL LETTER ETA) Affects: ell (ex:1402347,uses:15)
    • ‌Λή‌ (U+0x39b GREEK CAPITAL LETTER LAMDA)(U+0x3ae GREEK SMALL LETTER ETA WITH TONOS) → ‌λη‌ (U+0x3bb GREEK SMALL LETTER LAMDA)(U+0x3b7 GREEK SMALL LETTER ETA) Affects: ell (ex:5485787,uses:1)
    • ‌Μή‌ (U+0x39c GREEK CAPITAL LETTER MU)(U+0x3ae GREEK SMALL LETTER ETA WITH TONOS) → ‌μη‌ (U+0x3bc GREEK SMALL LETTER MU)(U+0x3b7 GREEK SMALL LETTER ETA) Affects: ell (ex:539702,uses:8)
    • ‌Μῆ‌ (U+0x39c GREEK CAPITAL LETTER MU)(U+0x1fc6 GREEK SMALL LETTER ETA WITH PERISPOMENI) → ‌μη‌ (U+0x3bc GREEK SMALL LETTER MU)(U+0x3b7 GREEK SMALL LETTER ETA) Affects: ell (ex:3577873,uses:1)
    • ‌Νή‌ (U+0x39d GREEK CAPITAL LETTER NU)(U+0x3ae GREEK SMALL LETTER ETA WITH TONOS) → ‌νη‌ (U+0x3bd GREEK SMALL LETTER NU)(U+0x3b7 GREEK SMALL LETTER ETA) Affects: ell (ex:3000443,uses:1)
    • ‌Πή‌ (U+0x3a0 GREEK CAPITAL LETTER PI)(U+0x3ae GREEK SMALL LETTER ETA WITH TONOS) → ‌πη‌ (U+0x3c0 GREEK SMALL LETTER PI)(U+0x3b7 GREEK SMALL LETTER ETA) Affects: ell (ex:716029,uses:112)
    • ‌Ρή‌ (U+0x3a1 GREEK CAPITAL LETTER RHO)(U+0x3ae GREEK SMALL LETTER ETA WITH TONOS) → ‌ρη‌ (U+0x3c1 GREEK SMALL LETTER RHO)(U+0x3b7 GREEK SMALL LETTER ETA) Affects: ell (ex:3577900,uses:1)
    • ‌Σή‌ (U+0x3a3 GREEK CAPITAL LETTER SIGMA)(U+0x3ae GREEK SMALL LETTER ETA WITH TONOS) → ‌ση‌ (U+0x3c3 GREEK SMALL LETTER SIGMA)(U+0x3b7 GREEK SMALL LETTER ETA) Affects: ell (ex:699020,uses:80)
    • ‌Τὴ‌ (U+0x3a4 GREEK CAPITAL LETTER TAU)(U+0x1f74 GREEK SMALL LETTER ETA WITH VARIA) → ‌τη‌ (U+0x3c4 GREEK SMALL LETTER TAU)(U+0x3b7 GREEK SMALL LETTER ETA) Affects: ell (ex:3577858,uses:1)
    • ‌Χή‌ (U+0x3a7 GREEK CAPITAL LETTER CHI)(U+0x3ae GREEK SMALL LETTER ETA WITH TONOS) → ‌χη‌ (U+0x3c7 GREEK SMALL LETTER CHI)(U+0x3b7 GREEK SMALL LETTER ETA) Affects: ell (ex:1401873,uses:1)
    • ‌Ψή‌ (U+0x3a8 GREEK CAPITAL LETTER PSI)(U+0x3ae GREEK SMALL LETTER ETA WITH TONOS) → ‌ψη‌ (U+0x3c8 GREEK SMALL LETTER PSI)(U+0x3b7 GREEK SMALL LETTER ETA) Affects: ell (ex:5444012,uses:4)
    • ‌άι‌ (U+0x3ac GREEK SMALL LETTER ALPHA WITH TONOS)(U+0x3b9 GREEK SMALL LETTER IOTA) → ‌αϊ‌ (U+0x3b1 GREEK SMALL LETTER ALPHA)(U+0x3ca GREEK SMALL LETTER IOTA WITH DIALYTIKA) Affects: ell (ex:948472,uses:105)
    • ‌άσ‌ (U+0x3ac GREEK SMALL LETTER ALPHA WITH TONOS)(U+0x3c3 GREEK SMALL LETTER SIGMA) → ‌ας‌ (U+0x3b1 GREEK SMALL LETTER ALPHA)(U+0x3c2 GREEK SMALL LETTER FINAL SIGMA) Affects: ell (ex:448678,uses:1323)
    • ‌άυ‌ (U+0x3ac GREEK SMALL LETTER ALPHA WITH TONOS)(U+0x3c5 GREEK SMALL LETTER UPSILON) → ‌αϋ‌ (U+0x3b1 GREEK SMALL LETTER ALPHA)(U+0x3cb GREEK SMALL LETTER UPSILON WITH DIALYTIKA) Affects: ell (ex:8251807,uses:1)
    • ‌έι‌ (U+0x3ad GREEK SMALL LETTER EPSILON WITH TONOS)(U+0x3b9 GREEK SMALL LETTER IOTA) → ‌εϊ‌ (U+0x3b5 GREEK SMALL LETTER EPSILON)(U+0x3ca GREEK SMALL LETTER IOTA WITH DIALYTIKA) Affects: ell (ex:715898,uses:74)
    • ‌έσ‌ (U+0x3ad GREEK SMALL LETTER EPSILON WITH TONOS)(U+0x3c3 GREEK SMALL LETTER SIGMA) → ‌ες‌ (U+0x3b5 GREEK SMALL LETTER EPSILON)(U+0x3c2 GREEK SMALL LETTER FINAL SIGMA) Affects: ell (ex:338375,uses:1545)
    • ‌έυ‌ (U+0x3ad GREEK SMALL LETTER EPSILON WITH TONOS)(U+0x3c5 GREEK SMALL LETTER UPSILON) → ‌εϋ‌ (U+0x3b5 GREEK SMALL LETTER EPSILON)(U+0x3cb GREEK SMALL LETTER UPSILON WITH DIALYTIKA) Affects: ell (ex:4296991,uses:4)
    • ‌ήµ‌ (U+0x3ae GREEK SMALL LETTER ETA WITH TONOS)(U+0xb5 MICRO SIGN) → ‌ημ‌ (U+0x3b7 GREEK SMALL LETTER ETA)(U+0x3bc GREEK SMALL LETTER MU) Affects: ell (ex:1384306,uses:1)
    • ‌ήι‌ (U+0x3ae GREEK SMALL LETTER ETA WITH TONOS)(U+0x3b9 GREEK SMALL LETTER IOTA) → ‌ηϊ‌ (U+0x3b7 GREEK SMALL LETTER ETA)(U+0x3ca GREEK SMALL LETTER IOTA WITH DIALYTIKA) Affects: ell (ex:1386552,uses:1)
    • ‌ίσ‌ (U+0x3af GREEK SMALL LETTER IOTA WITH TONOS)(U+0x3c3 GREEK SMALL LETTER SIGMA) → ‌ις‌ (U+0x3b9 GREEK SMALL LETTER IOTA)(U+0x3c2 GREEK SMALL LETTER FINAL SIGMA) Affects: ell (ex:425923,uses:2236)
    • ‌όι‌ (U+0x3cc GREEK SMALL LETTER OMICRON WITH TONOS)(U+0x3b9 GREEK SMALL LETTER IOTA) → ‌οϊ‌ (U+0x3bf GREEK SMALL LETTER OMICRON)(U+0x3ca GREEK SMALL LETTER IOTA WITH DIALYTIKA) Affects: ell (ex:998991,uses:34)
    • ‌ύι‌ (U+0x3cd GREEK SMALL LETTER UPSILON WITH TONOS)(U+0x3b9 GREEK SMALL LETTER IOTA) → ‌υϊ‌ (U+0x3c5 GREEK SMALL LETTER UPSILON)(U+0x3ca GREEK SMALL LETTER IOTA WITH DIALYTIKA) Affects: ell (ex:5316087,uses:1)
    • ‌ώι‌ (U+0x3ce GREEK SMALL LETTER OMEGA WITH TONOS)(U+0x3b9 GREEK SMALL LETTER IOTA) → ‌ωϊ‌ (U+0x3c9 GREEK SMALL LETTER OMEGA)(U+0x3ca GREEK SMALL LETTER IOTA WITH DIALYTIKA) Affects: ell (ex:2403860,uses:2)
    • ‌ώσ‌ (U+0x3ce GREEK SMALL LETTER OMEGA WITH TONOS)(U+0x3c3 GREEK SMALL LETTER SIGMA) → ‌ως‌ (U+0x3c9 GREEK SMALL LETTER OMEGA)(U+0x3c2 GREEK SMALL LETTER FINAL SIGMA) Affects: ell (ex:715872,uses:716)
    • ‌ᾷς‌ (U+0x1fb7 GREEK SMALL LETTER ALPHA WITH PERISPOMENI AND YPOGEGRAMMENI)(U+0x3c2 GREEK SMALL LETTER FINAL SIGMA) → ‌ᾶις‌ (U+0x1fb6 GREEK SMALL LETTER ALPHA WITH PERISPOMENI)(U+0x3b9 GREEK SMALL LETTER IOTA)(U+0x3c2 GREEK SMALL LETTER FINAL SIGMA) Affects: grc (ex:5094979,uses:4)
    • ‌ῃς‌ (U+0x1fc3 GREEK SMALL LETTER ETA WITH YPOGEGRAMMENI)(U+0x3c2 GREEK SMALL LETTER FINAL SIGMA) → ‌ηις‌ (U+0x3b7 GREEK SMALL LETTER ETA)(U+0x3b9 GREEK SMALL LETTER IOTA)(U+0x3c2 GREEK SMALL LETTER FINAL SIGMA) Affects: grc (ex:3229996,uses:4)
    • ‌ῇς‌ (U+0x1fc7 GREEK SMALL LETTER ETA WITH PERISPOMENI AND YPOGEGRAMMENI)(U+0x3c2 GREEK SMALL LETTER FINAL SIGMA) → ‌ῆις‌ (U+0x1fc6 GREEK SMALL LETTER ETA WITH PERISPOMENI)(U+0x3b9 GREEK SMALL LETTER IOTA)(U+0x3c2 GREEK SMALL LETTER FINAL SIGMA) Affects: grc (ex:5095228,uses:4)

Other Mappings Currently in Use

  • Ά → ά · → έ Έ → ή Ή → ί Ό → ό Ύ → ώ Affects: Greek [ell]
    • ‌Ά‌ (U+386 GREEK CAPITAL LETTER ALPHA WITH TONOS) → ‌ά‌ (U+3ac GREEK SMALL LETTER ALPHA WITH TONOS) ell (ex:8080260,uses:207)
    • ‌·‌ (U+387 GREEK ANO TELEIA) → ‌έ‌ (U+3ad GREEK SMALL LETTER EPSILON WITH TONOS) ell (ex:5142461,uses:2)
    • ‌Έ‌ (U+388 GREEK CAPITAL LETTER EPSILON WITH TONOS) → ‌ή‌ (U+3ae GREEK SMALL LETTER ETA WITH TONOS) ell (ex:8231056,uses:1339)
    • ‌Ή‌ (U+389 GREEK CAPITAL LETTER ETA WITH TONOS) → ‌ί‌ (U+3af GREEK SMALL LETTER IOTA WITH TONOS) ell (ex:8190378,uses:480)
    • ‌Ό‌ (U+38c GREEK CAPITAL LETTER OMICRON WITH TONOS) → ‌ό‌ (U+3cc GREEK SMALL LETTER OMICRON WITH TONOS) ell (ex:8188267,uses:383)
    • ‌Ύ‌ (U+38e GREEK CAPITAL LETTER UPSILON WITH TONOS) → ‌ώ‌ (U+3ce GREEK SMALL LETTER OMEGA WITH TONOS) ell (ex:7773402,uses:1)
  • J → i U → v W → v j → i u → v w → v á → a é → e í → i ó → o Ā → a ā → a Ē → e ē → e ĕ → e Ī → i ī → i ĭ → i ō → o Ū → v ū → v Affects: Latin [lat]
    • ‌J‌ (U+4a LATIN CAPITAL LETTER J) → ‌i‌ (U+69 LATIN SMALL LETTER I) lat (ex:8214141,uses:501)
    • ‌U‌ (U+55 LATIN CAPITAL LETTER U) → ‌v‌ (U+76 LATIN SMALL LETTER V) lat (ex:8227281,uses:27379)
    • ‌W‌ (U+57 LATIN CAPITAL LETTER W) → ‌v‌ (U+76 LATIN SMALL LETTER V) lat (ex:8107101,uses:16)
    • ‌j‌ (U+6a LATIN SMALL LETTER J) → ‌i‌ (U+69 LATIN SMALL LETTER I) lat (ex:8214141,uses:501)
    • ‌u‌ (U+75 LATIN SMALL LETTER U) → ‌v‌ (U+76 LATIN SMALL LETTER V) lat (ex:8227281,uses:27379)
    • ‌w‌ (U+77 LATIN SMALL LETTER W) → ‌v‌ (U+76 LATIN SMALL LETTER V) lat (ex:8107101,uses:16)
    • ‌á‌ (U+e1 LATIN SMALL LETTER A WITH ACUTE) → ‌a‌ (U+61 LATIN SMALL LETTER A) lat (ex:3577855,uses:3)
    • ‌é‌ (U+e9 LATIN SMALL LETTER E WITH ACUTE) → ‌e‌ (U+65 LATIN SMALL LETTER E) lat (ex:7366641,uses:6)
    • ‌í‌ (U+ed LATIN SMALL LETTER I WITH ACUTE) → ‌i‌ (U+69 LATIN SMALL LETTER I) lat (ex:3577855,uses:3)
    • ‌ó‌ (U+f3 LATIN SMALL LETTER O WITH ACUTE) → ‌o‌ (U+6f LATIN SMALL LETTER O) lat (ex:3577855,uses:3)
    • ‌Ā‌ (U+100 LATIN CAPITAL LETTER A WITH MACRON) → ‌a‌ (U+61 LATIN SMALL LETTER A) lat (ex:4424729,uses:1)
    • ‌ā‌ (U+101 LATIN SMALL LETTER A WITH MACRON) → ‌a‌ (U+61 LATIN SMALL LETTER A) lat (ex:8091615,uses:170)
    • ‌Ē‌ (U+112 LATIN CAPITAL LETTER E WITH MACRON) → ‌e‌ (U+65 LATIN SMALL LETTER E) lat (ex:1194237,uses:1)
    • ‌ē‌ (U+113 LATIN SMALL LETTER E WITH MACRON) → ‌e‌ (U+65 LATIN SMALL LETTER E) lat (ex:8091615,uses:162)
    • ‌ĕ‌ (U+115 LATIN SMALL LETTER E WITH BREVE) → ‌e‌ (U+65 LATIN SMALL LETTER E) lat (ex:1151573,uses:2)
    • ‌Ī‌ (U+12a LATIN CAPITAL LETTER I WITH MACRON) → ‌i‌ (U+69 LATIN SMALL LETTER I) lat (ex:5876456,uses:3)
    • ‌ī‌ (U+12b LATIN SMALL LETTER I WITH MACRON) → ‌i‌ (U+69 LATIN SMALL LETTER I) lat (ex:8091615,uses:140)
    • ‌ĭ‌ (U+12d LATIN SMALL LETTER I WITH BREVE) → ‌i‌ (U+69 LATIN SMALL LETTER I) lat (ex:3278929,uses:2)
    • ‌ō‌ (U+14d LATIN SMALL LETTER O WITH MACRON) → ‌o‌ (U+6f LATIN SMALL LETTER O) lat (ex:8091615,uses:156)
    • ‌Ū‌ (U+16a LATIN CAPITAL LETTER U WITH MACRON) → ‌v‌ (U+76 LATIN SMALL LETTER V) lat (ex:3616273,uses:2)
    • ‌ū‌ (U+16b LATIN SMALL LETTER U WITH MACRON) → ‌v‌ (U+76 LATIN SMALL LETTER V) lat (ex:8091615,uses:59)
  • ł → Ń Affects: Polish [pol], Navajo [nav], Lower Sorbian [dsb], Esperanto [epo], Mandarin Chinese [cmn], Upper Sorbian [hsb], German [deu], English [eng], Belarusian [bel], Kashubian [csb], Dutch [nld], Danish [dan], Bavarian [bar], Hungarian [hun], Spanish [spa], Berber [ber], Slovak [slk], Portuguese [por], Italian [ita], Indonesian [ind]
    • ‌ł‌ (U+142 LATIN SMALL LETTER L WITH STROKE) → ‌Ń‌ (U+143 LATIN CAPITAL LETTER N WITH ACUTE) pol (ex:8231498,uses:47420), nav (ex:7570627,uses:30), dsb (ex:5499999,uses:276), epo (ex:8152747,uses:8), cmn (ex:919685,uses:1), hsb (ex:5302415,uses:235), deu (ex:6330832,uses:8), eng (ex:6819949,uses:15), bel (ex:5145432,uses:7), csb (ex:6920795,uses:268), nld (ex:3115870,uses:3), dan (ex:6676968,uses:1), bar (ex:5113727,uses:1), hun (ex:3174924,uses:1), spa (ex:5883608,uses:3), ber (ex:1630774,uses:1), slk (ex:1045027,uses:5), por (ex:4982378,uses:4), ita (ex:4982027,uses:2), ind (ex:3878526,uses:1)
  • ם → מ ף → פ Affects: Hebrew [heb], Yiddish [yid]
    • ‌ם‌ (U+5dd HEBREW LETTER FINAL MEM) → ‌מ‌ (U+5de HEBREW LETTER MEM) heb (ex:8207535,uses:89098), yid (ex:8232829,uses:365)
    • ‌ף‌ (U+5e3 HEBREW LETTER FINAL PE) → ‌פ‌ (U+5e4 HEBREW LETTER PE) heb (ex:8207528,uses:9396), yid (ex:8181047,uses:83)
  • ņ → Ň Affects: Esperanto [epo], Lithuanian [lit], Latvian [lvs], English [eng], French [fra], Livonian [liv], Unknown Language, Portuguese [por], Italian [ita]
    • ‌ņ‌ (U+146 LATIN SMALL LETTER N WITH CEDILLA) → ‌Ň‌ (U+147 LATIN CAPITAL LETTER N WITH CARON) epo (ex:1175950,uses:1), lit (ex:8222969,uses:1), lvs (ex:7790689,uses:381), eng (ex:1175971,uses:1), fra (ex:1175977,uses:1), liv (ex:3665328,uses:4), por (ex:1176097,uses:1), ita (ex:3349755,uses:1)
  • H → ' h → ' Affects: Lojban [jbo]
    • ‌H‌ (U+48 LATIN CAPITAL LETTER H) → ‌'‌ (U+27 APOSTROPHE) jbo (ex:5181286,uses:36)
    • ‌h‌ (U+68 LATIN SMALL LETTER H) → ‌'‌ (U+27 APOSTROPHE) jbo (ex:5181286,uses:36)
  • ĺ → Ļ Affects: Spanish [spa], Slovak [slk], Danish [dan], Hungarian [hun]
    • ‌ĺ‌ (U+13a LATIN SMALL LETTER L WITH ACUTE) → ‌Ļ‌ (U+13b LATIN CAPITAL LETTER L WITH CEDILLA) spa (ex:3520542,uses:1), slk (ex:4959254,uses:3), dan (ex:7208279,uses:1), hun (ex:6455494,uses:2)
  • ľ → Ŀ Affects: Czech [ces], Slovak [slk], Veps [vep], Romani [rom]
    • ‌ľ‌ (U+13e LATIN SMALL LETTER L WITH CARON) → ‌Ŀ‌ (U+13f LATIN CAPITAL LETTER L WITH MIDDLE DOT) ces (ex:4404806,uses:1), slk (ex:8217533,uses:383), vep (ex:7918868,uses:1), rom (ex:6717589,uses:9)
  • Â → a â → a î → ı û → u Affects: Turkish [tur]
    • ‌Â‌ (U+c2 LATIN CAPITAL LETTER A WITH CIRCUMFLEX) → ‌a‌ (U+61 LATIN SMALL LETTER A) tur (ex:7767029,uses:18)
    • ‌â‌ (U+e2 LATIN SMALL LETTER A WITH CIRCUMFLEX) → ‌a‌ (U+61 LATIN SMALL LETTER A) tur (ex:8228866,uses:6126)
    • ‌î‌ (U+ee LATIN SMALL LETTER I WITH CIRCUMFLEX) → ‌ı‌ (U+131 LATIN SMALL LETTER DOTLESS I) tur (ex:8172014,uses:84)
    • ‌û‌ (U+fb LATIN SMALL LETTER U WITH CIRCUMFLEX) → ‌u‌ (U+75 LATIN SMALL LETTER U) tur (ex:8063569,uses:114)
  • Ơ → ơ Affects: Vietnamese [vie]
    • ‌Ơ‌ (U+1a0 LATIN CAPITAL LETTER O WITH HORN) → ‌ơ‌ (U+1a1 LATIN SMALL LETTER O WITH HORN) vie (ex:6552524,uses:2)
  • ך → כ Affects: Hebrew [heb], Yiddish [yid], Old Aramaic [oar]
    • ‌ך‌ (U+5da HEBREW LETTER FINAL KAF) → ‌כ‌ (U+5db HEBREW LETTER KAF) heb (ex:8207523,uses:31311), yid (ex:8223428,uses:551), oar (ex:8107628,uses:1)
  • È → è Affects: Yoruba [yor]
    • ‌È‌ (U+c8 LATIN CAPITAL LETTER E WITH GRAVE) → ‌è‌ (U+e8 LATIN SMALL LETTER E WITH GRAVE) yor (ex:3352025,uses:1)
  • ń → Ņ Affects: Polish [pol], Wolof [wol], Lower Sorbian [dsb], Esperanto [epo], Upper Sorbian [hsb], Yoruba [yor], German [deu], English [eng], Belarusian [bel], Hungarian [hun], Spanish [spa], Berber [ber], Slovak [slk]
    • ‌ń‌ (U+144 LATIN SMALL LETTER N WITH ACUTE) → ‌Ņ‌ (U+145 LATIN CAPITAL LETTER N WITH CEDILLA) pol (ex:8231482,uses:4158), wol (ex:4617438,uses:1), dsb (ex:3834455,uses:49), epo (ex:3798149,uses:2), hsb (ex:5499962,uses:23), yor (ex:3352030,uses:1), deu (ex:3798148,uses:2), eng (ex:6197105,uses:3), bel (ex:4017893,uses:1), hun (ex:1059441,uses:1), spa (ex:5447409,uses:3), ber (ex:1735847,uses:1), slk (ex:1045007,uses:2)
  • ץ → צ Affects: English [eng], Hebrew [heb], Yiddish [yid]
    • ‌ץ‌ (U+5e5 HEBREW LETTER FINAL TSADI) → ‌צ‌ (U+5e6 HEBREW LETTER TSADI) eng (ex:5242417,uses:1), heb (ex:8183237,uses:4354), yid (ex:8177266,uses:66)
  • ן → נ Affects: English [eng], Yiddish [yid], Old Aramaic [oar], Hebrew [heb], Ladino [lad]
    • ‌ן‌ (U+5df HEBREW LETTER FINAL NUN) → ‌נ‌ (U+5e0 HEBREW LETTER NUN) eng (ex:5242417,uses:1), yid (ex:8223428,uses:1061), oar (ex:5285789,uses:1), heb (ex:8201962,uses:38490), lad (ex:7079461,uses:1)
  • ļ → Ľ Affects: Unknown Language, Lithuanian [lit], Livonian [liv], Latvian [lvs]
    • ‌ļ‌ (U+13c LATIN SMALL LETTER L WITH CEDILLA) → ‌Ľ‌ (U+13d LATIN CAPITAL LETTER L WITH CARON) Affects: Unknown Language (ex:8222277,uses:1), lit (ex:8222969,uses:1), liv (ex:7918865,uses:4), lvs (ex:8217251,uses:151)
  • ň → ʼn Affects: Czech [ces], Slovak [slk], Turkmen [tuk], Romani [rom]
    • ‌ň‌ (U+148 LATIN SMALL LETTER N WITH CARON) → ‌ʼn‌ (U+149 LATIN SMALL LETTER N PRECEDED BY APOSTROPHE) ces (ex:8223389,uses:445), slk (ex:8214878,uses:131), tuk (ex:8073715,uses:2045), rom (ex:6715909,uses:7)
  • I → i Affects: Azerbaijani [aze]
    • ‌I‌ (U+49 LATIN CAPITAL LETTER I) → ‌i‌ (U+69 LATIN SMALL LETTER I) aze (ex:8174218,uses:3855)
  • İ → ı Affects: Ottoman Turkish [ota], Zaza [zza], Talysh [tly], Tatar [tat], English [eng], Azerbaijani [aze], Ido [ido], Dutch [nld], Venetian [vec], Crimean Tatar [crh]
    • ‌İ‌ (U+130 LATIN CAPITAL LETTER I WITH DOT ABOVE) → ‌ı‌ (U+131 LATIN SMALL LETTER DOTLESS I) ota (ex:8105906,uses:70), zza (ex:6092147,uses:8), tly (ex:5741594,uses:1), tat (ex:8086832,uses:26), eng (ex:6119444,uses:3), aze (ex:7705499,uses:264), ido (ex:7456954,uses:1), nld (ex:7535123,uses:1), vec (ex:8080814,uses:1), crh (ex:4477569,uses:3)
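
To narrow down which of the rows above deserve a closer look, one option is to compare each mapping against Unicode's own lowercase mapping. This is only a rough sketch: `MAPPINGS` holds a handful of pairs copied by hand from the list, and `is_simple_lowercase` and `not_plain_lowercase` are hypothetical names. A mismatch only means the row is not a plain lowercase mapping; some rows, such as the Latin J → i and U → v ones, may well be deliberate.

```python
# (source, target) code points copied from a few rows above.
MAPPINGS = [
    (0x0386, 0x03AC),  # Ά -> ά
    (0x0387, 0x03AD),  # GREEK ANO TELEIA -> έ
    (0x0388, 0x03AE),  # Έ -> ή
    (0x038C, 0x03CC),  # Ό -> ό
    (0x038E, 0x03CE),  # Ύ -> ώ
    (0x0142, 0x0143),  # ł -> Ń
    (0x01A0, 0x01A1),  # Ơ -> ơ
]


def is_simple_lowercase(src: int, dst: int) -> bool:
    """Return True if str.lower() maps src exactly to dst."""
    return chr(src).lower() == chr(dst)


def not_plain_lowercase(mappings):
    """Return the pairs whose target is not the lowercase form of the source."""
    return [(s, d) for s, d in mappings if not is_simple_lowercase(s, d)]


if __name__ == "__main__":
    for src, dst in not_plain_lowercase(MAPPINGS):
        print(f"U+{src:04X} -> U+{dst:04X} is not a plain lowercase mapping")
```
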

Punctuation and Symbols

  • ՛ ՜ ՝ ՞ ՟ ։ Affects: Armenian [hye]
    • ‌՛‌ (U+55b ARMENIAN EMPHASIS MARK) hye (ex:5157407,uses:183)
    • ‌՜‌ (U+55c ARMENIAN EXCLAMATION MARK) hye (ex:8108392,uses:40)
    • ‌՝‌ (U+55d ARMENIAN COMMA) hye (ex:5155631,uses:27)
    • ‌՞‌ (U+55e ARMENIAN QUESTION MARK) hye (ex:8108398,uses:197)
    • ‌՟‌ (U+55f ARMENIAN ABBREVIATION MARK) hye (ex:2618354,uses:1)
    • ‌։‌ (U+589 ARMENIAN FULL STOP) hye (ex:8108601,uses:49)
  • 〈 〉 【 】 〔 〕 〜 Affects: Japanese [jpn]
    • ‌〈‌ (U+3008 LEFT ANGLE BRACKET) jpn (ex:75567,uses:1)
    • ‌〉‌ (U+3009 RIGHT ANGLE BRACKET) jpn (ex:75567,uses:1)
    • ‌【‌ (U+3010 LEFT BLACK LENTICULAR BRACKET) jpn (ex:75968,uses:3)
    • ‌】‌ (U+3011 RIGHT BLACK LENTICULAR BRACKET) jpn (ex:75968,uses:3)
    • ‌〔‌ (U+3014 LEFT TORTOISE SHELL BRACKET) jpn (ex:200469,uses:1)
    • ‌〕‌ (U+3015 RIGHT TORTOISE SHELL BRACKET) jpn (ex:200469,uses:1)
    • ‌〜‌ (U+301c WAVE DASH) jpn (ex:5592680,uses:6)
  • 「 」 Affects: Mandarin Chinese [cmn], Cantonese [yue], Japanese [jpn], Ainu [ain], Korean [kor], Literary Chinese [lzh], Russian [rus], Shanghainese [wuu]
    • ‌「‌ (U+300c LEFT CORNER BRACKET) cmn (ex:7980260,uses:130), yue (ex:7999883,uses:87), jpn (ex:8210853,uses:2069), ain (ex:2850225,uses:1), kor (ex:6568041,uses:1), lzh (ex:4623520,uses:58), rus (ex:769335,uses:1), wuu (ex:1264952,uses:18)
    • ‌」‌ (U+300d RIGHT CORNER BRACKET) cmn (ex:7980260,uses:130), yue (ex:7999883,uses:87), jpn (ex:8210853,uses:2071), ain (ex:2850225,uses:1), kor (ex:6568041,uses:1), lzh (ex:4623520,uses:58), rus (ex:769335,uses:1), wuu (ex:1264952,uses:18)
  • 』 Affects: Ancient Greek [grc], Mandarin Chinese [cmn], Cantonese [yue], Japanese [jpn], Literary Chinese [lzh]
    • ‌』‌ (U+300f RIGHT WHITE CORNER BRACKET) grc (ex:3103770,uses:1), cmn (ex:2757150,uses:6), yue (ex:2226556,uses:1), jpn (ex:8210817,uses:73), lzh (ex:3239574,uses:7)
  • (IDEOGRAPHIC SPACE) Affects: Mandarin Chinese [cmn], German [deu], English [eng], Japanese [jpn], Turkish [tur], Ainu [ain], Literary Chinese [lzh]
    • ‌ ‌ (U+3000 IDEOGRAPHIC SPACE) cmn (ex:923457,uses:12), deu (ex:3989506,uses:2), eng (ex:1655179,uses:1), jpn (ex:8210818,uses:104), tur (ex:5472733,uses:1), ain (ex:7368423,uses:17), lzh (ex:1259362,uses:1)
  • _ Affects: Polish [pol], Finnish [fin], Uyghur [uig], English [eng], Japanese [jpn], Dutch [nld], Spanish [spa], Portuguese [por], Russian [rus], Bulgarian [bul], Esperanto [epo], Macedonian [mkd], Swedish [swe], Tatar [tat], Hungarian [hun], Italian [ita], Arabic [ara], Mandarin Chinese [cmn], German [deu], French [fra], Czech [ces], Berber [ber], Georgian [kat], Serbian [srp], Kabyle [kab], Belarusian [bel], Basque [eus], Turkish [tur]
    • ‌_‌ (U+5f LOW LINE) pol (ex:8231498,uses:99436), fin (ex:8230498,uses:101471), uig (ex:8009343,uses:7591), eng (ex:8232961,uses:1225118), jpn (ex:8225029,uses:187409), nld (ex:8232962,uses:100355), spa (ex:8232965,uses:311525), por (ex:8232904,uses:339177), rus (ex:8232551,uses:716993), bul (ex:8213039,uses:24226), epo (ex:8232953,uses:604352), mkd (ex:8216522,uses:77778), swe (ex:8229074,uses:34335), tat (ex:8221991,uses:13513), hun (ex:8232902,uses:259781), ita (ex:8232706,uses:730250), ara (ex:8226940,uses:33455), cmn (ex:8215755,uses:60891), deu (ex:8232944,uses:478260), fra (ex:8232876,uses:394070), ces (ex:8232406,uses:34965), ber (ex:8232954,uses:210513), kat (ex:8192728,uses:1283), srp (ex:8227074,uses:30547), kab (ex:8232827,uses:106606), bel (ex:8223467,uses:11008), eus (ex:7886476,uses:6060), tur (ex:8232730,uses:677367)
  • 『 Affects: Japanese [jpn], Literary Chinese [lzh], Mandarin Chinese [cmn], Cantonese [yue]
    • ‌『‌ (U+300e LEFT WHITE CORNER BRACKET) jpn (ex:8210817,uses:73), lzh (ex:3239574,uses:7), cmn (ex:2757150,uses:6), yue (ex:2226556,uses:1)
  • 《 》 Affects: Literary Chinese [lzh], Mandarin Chinese [cmn], Cantonese [yue], Shanghainese [wuu]
    • ‌《‌ (U+300a LEFT DOUBLE ANGLE BRACKET) lzh (ex:3334390,uses:12), cmn (ex:6486650,uses:26), yue (ex:5734977,uses:2), wuu (ex:348486,uses:1)
    • ‌》‌ (U+300b RIGHT DOUBLE ANGLE BRACKET) lzh (ex:3334390,uses:12), cmn (ex:6486650,uses:26), yue (ex:5734977,uses:2), wuu (ex:348486,uses:1)
  • ・ Affects: English [eng], Japanese [jpn]
    • ‌・‌ (U+30fb KATAKANA MIDDLE DOT) eng (ex:328235,uses:1), jpn (ex:8210814,uses:803)
  • 。 Affects: Hakka Chinese [hak], Xiang Chinese [hsn], Bulgarian [bul], Chinese (Jin) [cjy], Mandarin Chinese [cmn], Cantonese [yue], Japanese [jpn], Sumerian [sux], Gan Chinese [gan], Irish [gle], Ainu [ain], Korean [kor], Literary Chinese [lzh], Lojban [jbo], Min Nan Chinese [nan], Chavacano [cbk], Italian [ita], Shanghainese [wuu]
    • ‌。‌ (U+3002 IDEOGRAPHIC FULL STOP) hak (ex:6958765,uses:1), hsn (ex:5077880,uses:1), bul (ex:779726,uses:1), cjy (ex:7914365,uses:7), cmn (ex:8215755,uses:50491), yue (ex:8223095,uses:4645), jpn (ex:8225029,uses:180036), sux (ex:5844025,uses:1), gan (ex:5079259,uses:2), gle (ex:4654601,uses:1), ain (ex:7368423,uses:19), kor (ex:1643628,uses:1), lzh (ex:7980357,uses:1562), jbo (ex:5072726,uses:2), nan (ex:6401586,uses:8), cbk (ex:4546942,uses:1), ita (ex:7731252,uses:1), wuu (ex:6080234,uses:3396)
  • $ Affects: Polish [pol], Finnish [fin], Marathi [mar], Lingua Franca Nova [lfn], Bengali [ben], CycL [cycl], English [eng], Japanese [jpn], Hindi [hin], Ilocano [ilo], Dutch [nld], Danish [dan], Spanish [spa], Portuguese [por], Russian [rus], Turkmen [tuk], Maltese [mlt], Esperanto [epo], Ukrainian [ukr], Tagalog [tgl], Hebrew [heb], Italian [ita], Catalan [cat], Greek [ell], German [deu], French [fra], Romanian [ron], Berber [ber], Interlingua [ina], Estonian [est], Georgian [kat], Kabyle [kab], Belarusian [bel], Turkish [tur], Indonesian [ind]
    • ‌$‌ (U+24 DOLLAR SIGN) pol (ex:983833,uses:4), fin (ex:3994706,uses:2), mar (ex:7951354,uses:9), lfn (ex:5492984,uses:7), ben (ex:3779272,uses:1), cycl (ex:473752,uses:1), eng (ex:8227111,uses:566), jpn (ex:181889,uses:1), hin (ex:3783709,uses:1), ilo (ex:4662418,uses:1), nld (ex:7900534,uses:8), dan (ex:1239223,uses:1), spa (ex:5157620,uses:39), por (ex:7695601,uses:73), rus (ex:7785379,uses:12), tuk (ex:7940828,uses:7), mlt (ex:2318702,uses:1), epo (ex:6621800,uses:20), ukr (ex:5954216,uses:1), tgl (ex:1856494,uses:2), heb (ex:6064704,uses:24), ita (ex:5504352,uses:54), cat (ex:1518931,uses:1), ell (ex:7579916,uses:1), deu (ex:8016954,uses:30), fra (ex:7745634,uses:23), ron (ex:1186837,uses:1), ber (ex:7875854,uses:7), ina (ex:821929,uses:1), est (ex:2694616,uses:1), kat (ex:487374,uses:1), kab (ex:7464925,uses:1), bel (ex:4070958,uses:2), tur (ex:6207798,uses:13), ind (ex:4582706,uses:3)
  • ၌ ၍ ၏ Affects: Burmese [mya]
    • ‌၌‌ (U+0x104c MYANMAR SYMBOL LOCATIVE) Affects: mya (ex:8273818,uses:1)
    • ‌၍‌ (U+0x104d MYANMAR SYMBOL COMPLETED) Affects: mya (ex:3388562,uses:2)
    • ‌၏‌ (U+0x104f MYANMAR SYMBOL GENITIVE) Affects: mya (ex:8273751,uses:6)
  • ' Affects: Lojban [jbo]
    • ‌'‌ (U+27 APOSTROPHE) jbo (ex:8220991,uses:8711)
  • 、 Affects: Mandarin Chinese [cmn], Cantonese [yue], Japanese [jpn], Ainu [ain], Spanish [spa], Literary Chinese [lzh], Italian [ita], Shanghainese [wuu]
    • ‌、‌ (U+3001 IDEOGRAPHIC COMMA) cmn (ex:7781526,uses:286), yue (ex:7999878,uses:20), jpn (ex:8225019,uses:29876), ain (ex:2850225,uses:1), spa (ex:4058729,uses:1), lzh (ex:4492017,uses:47), ita (ex:7731252,uses:1), wuu (ex:1692200,uses:41)
  • · Affects: Greek [ell]
    • ‌·‌ (U+387 GREEK ANO TELEIA) ell (ex:5142461,uses:2)
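
The entries above use the `(ex:ID,uses:N)` shorthand from the issue description. As a rough sketch of how such a tally could be produced (not the script actually used for this issue), the snippet below assumes a hypothetical iterable of `(sentence_id, lang, text)` tuples; `char_inventory` and `format_entry` are illustrative names, and the three sample sentences are made up.

```python
import unicodedata
from collections import defaultdict


def char_inventory(sentences, wanted):
    """Count, per character and language, how many sentences contain it.

    sentences: iterable of (sentence_id, lang, text) tuples (hypothetical input).
    wanted: set of characters to look for.
    Returns {char: {lang: (example_id, uses)}}; example_id is the last
    sentence seen that contains the character.
    """
    stats = defaultdict(dict)
    for sentence_id, lang, text in sentences:
        # set(text) makes each sentence count at most once per character
        for ch in set(text) & wanted:
            _, uses = stats[ch].get(lang, (None, 0))
            stats[ch][lang] = (sentence_id, uses + 1)
    return stats


def format_entry(ch, per_lang):
    """Render one entry in the style used in this issue."""
    name = unicodedata.name(ch, "UNNAMED")
    langs = ", ".join(
        f"{lang} (ex:{ex},uses:{n})" for lang, (ex, n) in sorted(per_lang.items())
    )
    return f"{ch} (U+{ord(ch):x} {name}) {langs}"


# Made-up sample data, only for illustration
sample = [
    (1, "jpn", "これは、本です。"),
    (2, "jpn", "はい。"),
    (3, "cmn", "你好。"),
]
inventory = char_inventory(sample, {"。", "、"})
print(format_entry("、", inventory["、"]))
# prints: 、 (U+3001 IDEOGRAPHIC COMMA) jpn (ex:1,uses:1)
```

Counting sentences rather than raw occurrences matches the "appearing in 16 different sentences" wording of the description.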

@Yorwba
Contributor Author

Yorwba commented Oct 4, 2019

And the third:

Other Unsearchable Characters

  • ؠ ً ٌ ٍ َ ُ ِ ّ ْ ٓ ٔ ٕ ٖ ٗ ٘ ٚ ٛ ٠ ١ ٢ ٣ ٤ ٥ ٦ ٧ ٨ ٩ ٰ ۜ ۰ ۱ ۲ ۳ ۴ ۵ ۶ ۷ ۸ ۹ Affects: Algerian Arabic [arq], Egyptian Arabic [arz], Unknown Language, Swedish [swe], Iraqi Arabic [acm], Hungarian [hun], Persian [pes], Arabic [ara], Ottoman Turkish [ota], German [deu], North Levantine Arabic [apc], Tatar [tat], Gulf Arabic [afb], English [eng], Urdu [urd], Kashmiri [kas], Punjabi (Western) [pnb]
    • ‌ؠ‌ (U+620 ARABIC LETTER KASHMIRI YEH) Affects: kas (ex:7051643,uses:2)
    • ‌ً‌ (U+64b ARABIC FATHATAN) Affects: acm (ex:7306181,uses:3), afb (ex:5770821,uses:4), ara (ex:332186,uses:2514), arq (ex:8125336,uses:1), arz (ex:382213,uses:10), ota (ex:4070912,uses:11), pes (ex:510563,uses:813), urd (ex:2849520,uses:6)
    • ‌ٌ‌ (U+64c ARABIC DAMMATAN) Affects: ara (ex:370747,uses:191), arz (ex:5531464,uses:1), ota (ex:8335852,uses:1), pes (ex:888250,uses:4)
    • ‌ٍ‌ (U+64d ARABIC KASRATAN) Affects: ara (ex:374751,uses:175), swe (ex:8619465,uses:1)
    • ‌َ‌ (U+64e ARABIC FATHA) Affects: afb (ex:5771139,uses:2), apc (ex:3493954,uses:1), ara (ex:370576,uses:1205), arq (ex:2440297,uses:305), arz (ex:332220,uses:5), hun (ex:8369094,uses:1), kas (ex:7051623,uses:14), pes (ex:949781,uses:21), tat (ex:4695098,uses:2), urd (ex:4367343,uses:2)
    • ‌ُ‌ (U+64f ARABIC DAMMA) Affects: acm (ex:5926376,uses:1), ara (ex:332223,uses:1704), arq (ex:2440343,uses:130), arz (ex:425237,uses:7), kas (ex:7051621,uses:13), ota (ex:8335852,uses:1), pes (ex:1792619,uses:20), pnb (ex:1754651,uses:5), tat (ex:4547482,uses:1), urd (ex:2608278,uses:32)
    • ‌ِ‌ (U+650 ARABIC KASRA) Affects: ara (ex:332222,uses:1122), arq (ex:2440285,uses:145), arz (ex:332220,uses:5), deu (ex:4221168,uses:1), hun (ex:8369094,uses:1), kas (ex:7051628,uses:9), pes (ex:511892,uses:55), pnb (ex:3797222,uses:1), tat (ex:4695098,uses:4), urd (ex:1460373,uses:29)
    • ‌ّ‌ (U+651 ARABIC SHADDA) Affects: afb (ex:5016418,uses:4), apc (ex:3074432,uses:2), ara (ex:370576,uses:6723), arq (ex:2440285,uses:770), arz (ex:332220,uses:59), eng (ex:7307500,uses:1), ota (ex:4801567,uses:5), pes (ex:521639,uses:46), urd (ex:1470291,uses:8)
    • ‌ْ‌ (U+652 ARABIC SUKUN) Affects: Unknown Language (ex:8761451,uses:2), ara (ex:370943,uses:269), arq (ex:2691010,uses:360), arz (ex:586404,uses:1), hun (ex:8369094,uses:1), pes (ex:4738130,uses:1), urd (ex:4992336,uses:2)
    • ‌ٓ‌ (U+653 ARABIC MADDAH ABOVE) Affects: ara (ex:3190717,uses:1)
    • ‌ٔ‌ (U+654 ARABIC HAMZA ABOVE) Affects: kas (ex:7051624,uses:6), ota (ex:7825124,uses:6), pes (ex:3793581,uses:514)
    • ‌ٕ‌ (U+655 ARABIC HAMZA BELOW) Affects: kas (ex:7051621,uses:21)
    • ‌ٖ‌ (U+656 ARABIC SUBSCRIPT ALEF) Affects: kas (ex:7051645,uses:2)
    • ‌ٗ‌ (U+657 ARABIC INVERTED DAMMA) Affects: kas (ex:7051619,uses:4), urd (ex:4992345,uses:8)
    • ‌٘‌ (U+658 ARABIC MARK NOON GHUNNA) Affects: urd (ex:1477832,uses:1)
    • ‌ٚ‌ (U+65a ARABIC VOWEL SIGN SMALL V ABOVE) Affects: kas (ex:7051619,uses:9)
    • ‌ٛ‌ (U+65b ARABIC VOWEL SIGN INVERTED SMALL V ABOVE) Affects: kas (ex:7051624,uses:3)
    • ‌٠‌ (U+660 ARABIC-INDIC DIGIT ZERO) Affects: ara (ex:376356,uses:12), ota (ex:4855005,uses:1), pes (ex:7373256,uses:1)
    • ‌١‌ (U+661 ARABIC-INDIC DIGIT ONE) Affects: ara (ex:370773,uses:23), arz (ex:3251817,uses:1), ota (ex:4855005,uses:1)
    • ‌٢‌ (U+662 ARABIC-INDIC DIGIT TWO) Affects: ara (ex:448705,uses:9), arz (ex:3251817,uses:1)
    • ‌٣‌ (U+663 ARABIC-INDIC DIGIT THREE) Affects: ara (ex:377062,uses:7)
    • ‌٤‌ (U+664 ARABIC-INDIC DIGIT FOUR) Affects: ara (ex:372059,uses:8)
    • ‌٥‌ (U+665 ARABIC-INDIC DIGIT FIVE) Affects: ara (ex:370773,uses:7)
    • ‌٦‌ (U+666 ARABIC-INDIC DIGIT SIX) Affects: ara (ex:370773,uses:7)
    • ‌٧‌ (U+667 ARABIC-INDIC DIGIT SEVEN) Affects: ara (ex:2743145,uses:3)
    • ‌٨‌ (U+668 ARABIC-INDIC DIGIT EIGHT) Affects: ara (ex:370773,uses:6)
    • ‌٩‌ (U+669 ARABIC-INDIC DIGIT NINE) Affects: ara (ex:370973,uses:15)
    • ‌ٰ‌ (U+670 ARABIC LETTER SUPERSCRIPT ALEF) Affects: ara (ex:2346904,uses:2), urd (ex:6112934,uses:1)
    • ‌ۜ‌ (U+6dc ARABIC SMALL HIGH SEEN) Affects: pes (ex:913399,uses:1)
    • ‌۰‌ (U+6f0 EXTENDED ARABIC-INDIC DIGIT ZERO) Affects: ara (ex:370973,uses:5), ota (ex:8331067,uses:1), pes (ex:3840007,uses:20), urd (ex:1440822,uses:10)
    • ‌۱‌ (U+6f1 EXTENDED ARABIC-INDIC DIGIT ONE) Affects: ara (ex:493327,uses:1), ota (ex:7929129,uses:2), pes (ex:6546746,uses:24), urd (ex:1438154,uses:18)
    • ‌۲‌ (U+6f2 EXTENDED ARABIC-INDIC DIGIT TWO) Affects: ara (ex:610571,uses:1), ota (ex:8331067,uses:1), pes (ex:3840007,uses:15), urd (ex:1497720,uses:2)
    • ‌۳‌ (U+6f3 EXTENDED ARABIC-INDIC DIGIT THREE) Affects: ota (ex:7855863,uses:3), pes (ex:3840029,uses:10), urd (ex:1438154,uses:4)
    • ‌۴‌ (U+6f4 EXTENDED ARABIC-INDIC DIGIT FOUR) Affects: pes (ex:6977441,uses:6)
    • ‌۵‌ (U+6f5 EXTENDED ARABIC-INDIC DIGIT FIVE) Affects: pes (ex:6841391,uses:10), urd (ex:1432835,uses:3)
    • ‌۶‌ (U+6f6 EXTENDED ARABIC-INDIC DIGIT SIX) Affects: pes (ex:3840489,uses:11)
    • ‌۷‌ (U+6f7 EXTENDED ARABIC-INDIC DIGIT SEVEN) Affects: pes (ex:7316933,uses:5), urd (ex:1604237,uses:2)
    • ‌۸‌ (U+6f8 EXTENDED ARABIC-INDIC DIGIT EIGHT) Affects: ota (ex:8331067,uses:1), pes (ex:3808554,uses:7), urd (ex:1473840,uses:6)
    • ‌۹‌ (U+6f9 EXTENDED ARABIC-INDIC DIGIT NINE) Affects: ara (ex:493327,uses:2), ota (ex:7929129,uses:1), pes (ex:6960699,uses:13), urd (ex:1600193,uses:6)
  • à ã è ê ë ì ï ð ñ ò ô ù ü ý Affects: Latin [lat], Turkish [tur]
    • ‌à‌ (U+e0 LATIN SMALL LETTER A WITH GRAVE) Affects: tur (ex:8759700,uses:1)
    • ‌ã‌ (U+e3 LATIN SMALL LETTER A WITH TILDE) Affects: tur (ex:1090466,uses:12)
    • ‌è‌ (U+e8 LATIN SMALL LETTER E WITH GRAVE) Affects: lat (ex:7366571,uses:2), tur (ex:1940603,uses:9)
    • ‌ê‌ (U+ea LATIN SMALL LETTER E WITH CIRCUMFLEX) Affects: tur (ex:4757395,uses:2)
    • ‌ë‌ (U+eb LATIN SMALL LETTER E WITH DIAERESIS) Affects: lat (ex:6481588,uses:10), tur (ex:4811548,uses:6)
    • ‌ì‌ (U+ec LATIN SMALL LETTER I WITH GRAVE) Affects: tur (ex:4417490,uses:1)
    • ‌ï‌ (U+ef LATIN SMALL LETTER I WITH DIAERESIS) Affects: lat (ex:6473456,uses:1)
    • ‌ð‌ (U+f0 LATIN SMALL LETTER ETH) Affects: tur (ex:5250301,uses:1)
    • ‌ñ‌ (U+f1 LATIN SMALL LETTER N WITH TILDE) Affects: tur (ex:3783623,uses:5)
    • ‌ò‌ (U+f2 LATIN SMALL LETTER O WITH GRAVE) Affects: lat (ex:7366873,uses:1), tur (ex:8759700,uses:5)
    • ‌ô‌ (U+f4 LATIN SMALL LETTER O WITH CIRCUMFLEX) Affects: tur (ex:3990800,uses:3)
    • ‌ù‌ (U+f9 LATIN SMALL LETTER U WITH GRAVE) Affects: lat (ex:7371152,uses:1)
    • ‌ü‌ (U+fc LATIN SMALL LETTER U WITH DIAERESIS) Affects: lat (ex:6739551,uses:1)
    • ‌ý‌ (U+fd LATIN SMALL LETTER Y WITH ACUTE) Affects: tur (ex:6299107,uses:2)
  • ἂ ἃ ἅ ἣ ἳ ἴ ἵ ἷ ὓ ὖ ὗ ὢ ὣ ὥ ᾱ ῑ ῡ ῥ Affects: Ancient Greek [grc]
    • ‌ἂ‌ (U+1f02 GREEK SMALL LETTER ALPHA WITH PSILI AND VARIA) grc (ex:5096325,uses:29)
    • ‌ἃ‌ (U+1f03 GREEK SMALL LETTER ALPHA WITH DASIA AND VARIA) grc (ex:6600414,uses:3)
    • ‌ἅ‌ (U+1f05 GREEK SMALL LETTER ALPHA WITH DASIA AND OXIA) grc (ex:6632888,uses:11)
    • ‌ἣ‌ (U+1f23 GREEK SMALL LETTER ETA WITH DASIA AND VARIA) grc (ex:7847069,uses:2)
    • ‌ἳ‌ (U+1f33 GREEK SMALL LETTER IOTA WITH DASIA AND VARIA) grc (ex:3103853,uses:1)
    • ‌ἴ‌ (U+1f34 GREEK SMALL LETTER IOTA WITH PSILI AND OXIA) grc (ex:7088466,uses:37)
    • ‌ἵ‌ (U+1f35 GREEK SMALL LETTER IOTA WITH DASIA AND OXIA) grc (ex:7088027,uses:8)
    • ‌ἷ‌ (U+1f37 GREEK SMALL LETTER IOTA WITH DASIA AND PERISPOMENI) grc (ex:5096308,uses:4)
    • ‌ὓ‌ (U+1f53 GREEK SMALL LETTER UPSILON WITH DASIA AND VARIA) grc (ex:2730831,uses:1)
    • ‌ὖ‌ (U+1f56 GREEK SMALL LETTER UPSILON WITH PSILI AND PERISPOMENI) grc (ex:7051181,uses:16)
    • ‌ὗ‌ (U+1f57 GREEK SMALL LETTER UPSILON WITH DASIA AND PERISPOMENI) grc (ex:7095771,uses:8)
    • ‌ὢ‌ (U+1f62 GREEK SMALL LETTER OMEGA WITH PSILI AND VARIA) grc (ex:6633395,uses:2)
    • ‌ὣ‌ (U+1f63 GREEK SMALL LETTER OMEGA WITH DASIA AND VARIA) grc (ex:3105129,uses:1)
    • ‌ὥ‌ (U+1f65 GREEK SMALL LETTER OMEGA WITH DASIA AND OXIA) grc (ex:5096268,uses:9)
    • ‌ᾱ‌ (U+1fb1 GREEK SMALL LETTER ALPHA WITH MACRON) grc (ex:5096323,uses:51)
    • ‌ῑ‌ (U+1fd1 GREEK SMALL LETTER IOTA WITH MACRON) grc (ex:5096200,uses:15)
    • ‌ῡ‌ (U+1fe1 GREEK SMALL LETTER UPSILON WITH MACRON) grc (ex:5096331,uses:23)
    • ‌ῥ‌ (U+1fe5 GREEK SMALL LETTER RHO WITH DASIA) grc (ex:6633410,uses:12)
  • ҃ ꙗ Affects: Old East Slavic [orv]
    • ‌҃‌ (U+483 COMBINING CYRILLIC TITLO) orv (ex:6818968,uses:1)
    • ‌ꙗ‌ (U+a657 CYRILLIC SMALL LETTER IOTIFIED A) orv (ex:577386,uses:2)
  • 𒀀 𒀉 𒀊 𒀕 𒀖 𒀜 𒀝 𒀠 𒀪 𒀭 𒀲 𒀳 𒀴 𒀸 𒀾 𒁀 𒁄 𒁇 𒁉 𒁍 𒁕 𒁮 𒁯 𒁲 𒁳 𒁶 𒁹 𒁺 𒁻 𒁾 𒂊 𒂍 𒂗 𒂠 𒂦 𒂵 𒂷 𒂼 𒃮 𒃲 𒃶 𒃸 𒃻 𒄀 𒄄 𒄑 𒄘 𒄠 𒄢 𒄦 𒄨 𒄩 𒄭 𒄯 𒄰 𒄴 𒄷 𒄾 𒄿 𒅁 𒅅 𒅆 𒅇 𒅍 𒅎 𒅔 𒅗 𒅘 𒅥 𒅴 𒆕 𒆗 𒆜 𒆟 𒆠 𒆪 𒆬 𒆳 𒆷 𒇇 𒇉 𒇯 𒇳 𒇴 𒇷 𒇻 𒇽 𒈜 𒈝 𒈠 𒈣 𒈤 𒈧 𒈨 𒈪 𒈫 𒈬 𒈭 𒈾 𒉆 𒉈 𒉌 𒉘 𒉡 𒉪 𒉺 𒉽 𒉿 𒊏 𒊑 𒊒 𒊕 𒊩 𒊬 𒊭 𒊮 𒊷 𒋀 𒋃 𒋗 𒋛 𒋢 𒋤 𒋧 𒋫 𒋺 𒋻 𒋼 𒋾 𒌀 𒌅 𒌆 𒌇 𒌈 𒌉 𒌋 𒌌 𒌍 𒌒 𒌓 𒌝 𒌤 𒌦 𒌨 𒌶 𒌷 𒍂 𒍇 𒍜 𒍝 𒍠 𒍢 𒍣 𒍪 𒍼 𒐈 𒐊 𒐋 𒐼 𒑂 𒑄 𒑆 𒑏 Affects: Unknown Language, Sumerian [sux]
    • ‌𒀀‌ (U+12000 CUNEIFORM SIGN A) sux (ex:7277890,uses:49)
    • ‌𒀉‌ (U+12009 CUNEIFORM SIGN A2) sux (ex:4974890,uses:1)
    • ‌𒀊‌ (U+1200a CUNEIFORM SIGN AB) sux (ex:7278086,uses:17)
    • ‌𒀕‌ (U+12015 CUNEIFORM SIGN AB GUNU) sux (ex:4552595,uses:1)
    • ‌𒀖‌ (U+12016 CUNEIFORM SIGN AB2) sux (ex:4982877,uses:1)
    • ‌𒀜‌ (U+1201c CUNEIFORM SIGN AD) Affects: Unknown Language (ex:7460541,uses:1), sux (ex:3864969,uses:2)
    • ‌𒀝‌ (U+1201d CUNEIFORM SIGN AK) Affects: Unknown Language (ex:7460541,uses:1), sux (ex:5038513,uses:2)
    • ‌𒀠‌ (U+12020 CUNEIFORM SIGN AL) sux (ex:5178019,uses:4)
    • ‌𒀪‌ (U+1202a CUNEIFORM SIGN ALEPH) Affects: Unknown Language (ex:7460541,uses:1)
    • ‌𒀭‌ (U+1202d CUNEIFORM SIGN AN) sux (ex:7277890,uses:31)
    • ‌𒀲‌ (U+12032 CUNEIFORM SIGN ANSHE) sux (ex:5065460,uses:2)
    • ‌𒀳‌ (U+12033 CUNEIFORM SIGN APIN) sux (ex:2167102,uses:1)
    • ‌𒀴‌ (U+12034 CUNEIFORM SIGN ARAD) sux (ex:5064837,uses:1)
    • ‌𒀸‌ (U+12038 CUNEIFORM SIGN ASH) Affects: Unknown Language (ex:7457556,uses:1), sux (ex:5103826,uses:5)
    • ‌𒀾‌ (U+1203e CUNEIFORM SIGN ASH2) sux (ex:4974890,uses:1)
    • ‌𒁀‌ (U+12040 CUNEIFORM SIGN BA) sux (ex:7277890,uses:23)
    • ‌𒁄‌ (U+12044 CUNEIFORM SIGN BAL) sux (ex:7277890,uses:2)
    • ‌𒁇‌ (U+12047 CUNEIFORM SIGN BAR) sux (ex:4482897,uses:2)
    • ‌𒁉‌ (U+12049 CUNEIFORM SIGN BI) sux (ex:7278086,uses:9)
    • ‌𒁍‌ (U+1204d CUNEIFORM SIGN BU) sux (ex:5178180,uses:1)
    • ‌𒁕‌ (U+12055 CUNEIFORM SIGN DA) sux (ex:5178146,uses:9)
    • ‌𒁮‌ (U+1206e CUNEIFORM SIGN DAM) sux (ex:5065460,uses:3)
    • ‌𒁯‌ (U+1206f CUNEIFORM SIGN DAR) sux (ex:5028492,uses:1)
    • ‌𒁲‌ (U+12072 CUNEIFORM SIGN DI) sux (ex:4980704,uses:4)
    • ‌𒁳‌ (U+12073 CUNEIFORM SIGN DIB) sux (ex:2162714,uses:2)
    • ‌𒁶‌ (U+12076 CUNEIFORM SIGN DIM2) sux (ex:5065460,uses:5)
    • ‌𒁹‌ (U+12079 CUNEIFORM SIGN DISH) sux (ex:5844025,uses:2)
    • ‌𒁺‌ (U+1207a CUNEIFORM SIGN DU) sux (ex:5178180,uses:18)
    • ‌𒁻‌ (U+1207b CUNEIFORM SIGN DU OVER DU) sux (ex:3861599,uses:1)
    • ‌𒁾‌ (U+1207e CUNEIFORM SIGN DUB) sux (ex:5055800,uses:2)
    • ‌𒂊‌ (U+1208a CUNEIFORM SIGN E) Affects: Unknown Language (ex:7457556,uses:1), sux (ex:7277890,uses:33)
    • ‌𒂍‌ (U+1208d CUNEIFORM SIGN E2) sux (ex:5065425,uses:8)
    • ‌𒂗‌ (U+12097 CUNEIFORM SIGN EN) sux (ex:7278086,uses:45)
    • ‌𒂠‌ (U+120a0 CUNEIFORM SIGN ESH2) sux (ex:7278086,uses:15)
    • ‌𒂦‌ (U+120a6 CUNEIFORM SIGN EZEN TIMES BAD) sux (ex:5093687,uses:1)
    • ‌𒂵‌ (U+120b5 CUNEIFORM SIGN GA) Affects: Unknown Language (ex:7460541,uses:1), sux (ex:5103007,uses:18)
    • ‌𒂷‌ (U+120b7 CUNEIFORM SIGN GA2) sux (ex:5090735,uses:17)
    • ‌𒂼‌ (U+120bc CUNEIFORM SIGN GA2 TIMES AN) sux (ex:5070166,uses:4)
    • ‌𒃮‌ (U+120ee CUNEIFORM SIGN GABA) sux (ex:5103828,uses:4)
    • ‌𒃲‌ (U+120f2 CUNEIFORM SIGN GAL) sux (ex:3864969,uses:1)
    • ‌𒃶‌ (U+120f6 CUNEIFORM SIGN GAN) sux (ex:5978532,uses:7)
    • ‌𒃸‌ (U+120f8 CUNEIFORM SIGN GAN2 TENU) sux (ex:2162714,uses:1)
    • ‌𒃻‌ (U+120fb CUNEIFORM SIGN GAR) sux (ex:7277890,uses:8)
    • ‌𒄀‌ (U+12100 CUNEIFORM SIGN GI) sux (ex:5093687,uses:4)
    • ‌𒄄‌ (U+12104 CUNEIFORM SIGN GI4) sux (ex:5064838,uses:3)
    • ‌𒄑‌ (U+12111 CUNEIFORM SIGN GISH) sux (ex:7277890,uses:8)
    • ‌𒄘‌ (U+12118 CUNEIFORM SIGN GU2) sux (ex:7278086,uses:2)
    • ‌𒄠‌ (U+12120 CUNEIFORM SIGN GUD TIMES KUR) sux (ex:5844025,uses:1)
    • ‌𒄢‌ (U+12122 CUNEIFORM SIGN GUL) sux (ex:5055678,uses:1)
    • ‌𒄦‌ (U+12126 CUNEIFORM SIGN GUR7) sux (ex:4948751,uses:1)
    • ‌𒄨‌ (U+12128 CUNEIFORM SIGN GURUSH) sux (ex:3859388,uses:1)
    • ‌𒄩‌ (U+12129 CUNEIFORM SIGN HA) sux (ex:4692660,uses:2)
    • ‌𒄭‌ (U+1212d CUNEIFORM SIGN HI) Affects: Unknown Language (ex:7460541,uses:1), sux (ex:4548271,uses:1)
    • ‌𒄯‌ (U+1212f CUNEIFORM SIGN HI TIMES ASH2) sux (ex:5103826,uses:1)
    • ‌𒄰‌ (U+12130 CUNEIFORM SIGN HI TIMES BAD) sux (ex:5070166,uses:2)
    • ‌𒄴‌ (U+12134 CUNEIFORM SIGN HI TIMES NUN) sux (ex:4967495,uses:2)
    • ‌𒄷‌ (U+12137 CUNEIFORM SIGN HU) sux (ex:5103828,uses:1)
    • ‌𒄾‌ (U+1213e CUNEIFORM SIGN HUL2) sux (ex:5070166,uses:3)
    • ‌𒄿‌ (U+1213f CUNEIFORM SIGN I) sux (ex:5978532,uses:1)
    • ‌𒅁‌ (U+12141 CUNEIFORM SIGN IB) sux (ex:4948751,uses:1)
    • ‌𒅅‌ (U+12145 CUNEIFORM SIGN IG) sux (ex:5065369,uses:5)
    • ‌𒅆‌ (U+12146 CUNEIFORM SIGN IGI) sux (ex:5178146,uses:9)
    • ‌𒅇‌ (U+12147 CUNEIFORM SIGN IGI DIB) sux (ex:5028491,uses:4)
    • ‌𒅍‌ (U+1214d CUNEIFORM SIGN IL2) sux (ex:4948751,uses:1)
    • ‌𒅎‌ (U+1214e CUNEIFORM SIGN IM) sux (ex:5178180,uses:6)
    • ‌𒅔‌ (U+12154 CUNEIFORM SIGN IN) sux (ex:5070166,uses:5)
    • ‌𒅗‌ (U+12157 CUNEIFORM SIGN KA) sux (ex:7277890,uses:19)
    • ‌𒅘‌ (U+12158 CUNEIFORM SIGN KA TIMES A) sux (ex:5090807,uses:2)
    • ‌𒅥‌ (U+12165 CUNEIFORM SIGN KA TIMES GAR) sux (ex:4983178,uses:1)
    • ‌𒅴‌ (U+12174 CUNEIFORM SIGN KA TIMES ME) sux (ex:7278086,uses:2)
    • ‌𒆕‌ (U+12195 CUNEIFORM SIGN KAK) sux (ex:5055662,uses:3)
    • ‌𒆗‌ (U+12197 CUNEIFORM SIGN KAL) sux (ex:5015337,uses:3)
    • ‌𒆜‌ (U+1219c CUNEIFORM SIGN KASKAL) sux (ex:4948719,uses:1)
    • ‌𒆟‌ (U+1219f CUNEIFORM SIGN KESH2) sux (ex:5103828,uses:2)
    • ‌𒆠‌ (U+121a0 CUNEIFORM SIGN KI) sux (ex:5178180,uses:6)
    • ‌𒆪‌ (U+121aa CUNEIFORM SIGN KU) sux (ex:5178019,uses:3)
    • ‌𒆬‌ (U+121ac CUNEIFORM SIGN KU3) sux (ex:5055678,uses:1)
    • ‌𒆳‌ (U+121b3 CUNEIFORM SIGN KUR) sux (ex:4974869,uses:3)
    • ‌𒆷‌ (U+121b7 CUNEIFORM SIGN LA) sux (ex:5090782,uses:2)
    • ‌𒇇‌ (U+121c7 CUNEIFORM SIGN LAGAB TIMES GUD PLUS GUD) sux (ex:4609515,uses:1)
    • ‌𒇉‌ (U+121c9 CUNEIFORM SIGN LAGAB TIMES HAL) sux (ex:5015337,uses:1)
    • ‌𒇯‌ (U+121ef CUNEIFORM SIGN LAGAR GUNU) sux (ex:2162714,uses:1)
    • ‌𒇳‌ (U+121f3 CUNEIFORM SIGN LAL TIMES LAL) sux (ex:5178019,uses:2)
    • ‌𒇴‌ (U+121f4 CUNEIFORM SIGN LAM) sux (ex:3861599,uses:1)
    • ‌𒇷‌ (U+121f7 CUNEIFORM SIGN LI) sux (ex:5178180,uses:4)
    • ‌𒇻‌ (U+121fb CUNEIFORM SIGN LU) sux (ex:3874065,uses:2)
    • ‌𒇽‌ (U+121fd CUNEIFORM SIGN LU2) Affects: Unknown Language (ex:7457556,uses:1), sux (ex:5103007,uses:5)
    • ‌𒈜‌ (U+1221c CUNEIFORM SIGN LUL) sux (ex:4861896,uses:2)
    • ‌𒈝‌ (U+1221d CUNEIFORM SIGN LUM) sux (ex:5055947,uses:1)
    • ‌𒈠‌ (U+12220 CUNEIFORM SIGN MA) sux (ex:5093701,uses:14)
    • ‌𒈣‌ (U+12223 CUNEIFORM SIGN MA2) sux (ex:5094288,uses:1)
    • ‌𒈤‌ (U+12224 CUNEIFORM SIGN MAH) sux (ex:4974817,uses:2)
    • ‌𒈧‌ (U+12227 CUNEIFORM SIGN MASH2) sux (ex:5178180,uses:1)
    • ‌𒈨‌ (U+12228 CUNEIFORM SIGN ME) sux (ex:5978153,uses:34)
    • ‌𒈪‌ (U+1222a CUNEIFORM SIGN MI) Affects: Unknown Language (ex:7457556,uses:1), sux (ex:2158188,uses:1)
    • ‌𒈫‌ (U+1222b CUNEIFORM SIGN MIN) sux (ex:4655824,uses:1)
    • ‌𒈬‌ (U+1222c CUNEIFORM SIGN MU) sux (ex:7277890,uses:37)
    • ‌𒈭‌ (U+1222d CUNEIFORM SIGN MU OVER MU) sux (ex:4983447,uses:3)
    • ‌𒈾‌ (U+1223e CUNEIFORM SIGN NA) Affects: Unknown Language (ex:7460541,uses:1), sux (ex:7278086,uses:19)
    • ‌𒉆‌ (U+12246 CUNEIFORM SIGN NAM) sux (ex:5978151,uses:14)
    • ‌𒉈‌ (U+12248 CUNEIFORM SIGN NE) sux (ex:5844025,uses:26)
    • ‌𒉌‌ (U+1224c CUNEIFORM SIGN NI) sux (ex:5844025,uses:17)
    • ‌𒉘‌ (U+12258 CUNEIFORM SIGN NINDA2 TIMES NE) sux (ex:5978153,uses:4)
    • ‌𒉡‌ (U+12261 CUNEIFORM SIGN NU) sux (ex:5978551,uses:19)
    • ‌𒉪‌ (U+1226a CUNEIFORM SIGN NUN OVER NUN) sux (ex:4974925,uses:1)
    • ‌𒉺‌ (U+1227a CUNEIFORM SIGN PA) sux (ex:4983409,uses:1)
    • ‌𒉽‌ (U+1227d CUNEIFORM SIGN PAP) sux (ex:4983093,uses:3)
    • ‌𒉿‌ (U+1227f CUNEIFORM SIGN PI) sux (ex:4550094,uses:1)
    • ‌𒊏‌ (U+1228f CUNEIFORM SIGN RA) sux (ex:7277890,uses:21)
    • ‌𒊑‌ (U+12291 CUNEIFORM SIGN RI) sux (ex:5028495,uses:3)
    • ‌𒊒‌ (U+12292 CUNEIFORM SIGN RU) sux (ex:7278086,uses:3)
    • ‌𒊕‌ (U+12295 CUNEIFORM SIGN SAG) sux (ex:5055662,uses:3)
    • ‌𒊩‌ (U+122a9 CUNEIFORM SIGN SAL) sux (ex:5090793,uses:5)
    • ‌𒊬‌ (U+122ac CUNEIFORM SIGN SAR) sux (ex:4974817,uses:2)
    • ‌𒊭‌ (U+122ad CUNEIFORM SIGN SHA) Affects: Unknown Language (ex:7460541,uses:1)
    • ‌𒊮‌ (U+122ae CUNEIFORM SIGN SHA3) sux (ex:5028492,uses:5)
    • ‌𒊷‌ (U+122b7 CUNEIFORM SIGN SHA6) sux (ex:5091853,uses:1)
    • ‌𒋀‌ (U+122c0 CUNEIFORM SIGN SHESH) sux (ex:5028491,uses:2)
    • ‌𒋃‌ (U+122c3 CUNEIFORM SIGN SHID) sux (ex:5844025,uses:1)
    • ‌𒋗‌ (U+122d7 CUNEIFORM SIGN SHU) sux (ex:5178180,uses:4)
    • ‌𒋛‌ (U+122db CUNEIFORM SIGN SI) sux (ex:5028502,uses:2)
    • ‌𒋢‌ (U+122e2 CUNEIFORM SIGN SU) sux (ex:4648278,uses:1)
    • ‌𒋤‌ (U+122e4 CUNEIFORM SIGN SUD) sux (ex:5178019,uses:1)
    • ‌𒋧‌ (U+122e7 CUNEIFORM SIGN SUM) sux (ex:5093701,uses:6)
    • ‌𒋫‌ (U+122eb CUNEIFORM SIGN TA) Affects: Unknown Language (ex:7460541,uses:1), sux (ex:5065425,uses:5)
    • ‌𒋺‌ (U+122fa CUNEIFORM SIGN TAK4) sux (ex:5055947,uses:1)
    • ‌𒋻‌ (U+122fb CUNEIFORM SIGN TAR) sux (ex:5038529,uses:6)
    • ‌𒋼‌ (U+122fc CUNEIFORM SIGN TE) sux (ex:5070166,uses:1)
    • ‌𒋾‌ (U+122fe CUNEIFORM SIGN TI) sux (ex:5978153,uses:6)
    • ‌𒌀‌ (U+12300 CUNEIFORM SIGN TIL) sux (ex:5037369,uses:1)
    • ‌𒌅‌ (U+12305 CUNEIFORM SIGN TU) sux (ex:4980704,uses:2)
    • ‌𒌆‌ (U+12306 CUNEIFORM SIGN TUG2) sux (ex:4550094,uses:1)
    • ‌𒌇‌ (U+12307 CUNEIFORM SIGN TUK) sux (ex:7277890,uses:9)
    • ‌𒌈‌ (U+12308 CUNEIFORM SIGN TUM) sux (ex:3851116,uses:2)
    • ‌𒌉‌ (U+12309 CUNEIFORM SIGN TUR) sux (ex:5055947,uses:6)
    • ‌𒌋‌ (U+1230b CUNEIFORM SIGN U) sux (ex:5844025,uses:3)
    • ‌𒌌‌ (U+1230c CUNEIFORM SIGN U GUD) sux (ex:4948719,uses:1)
    • ‌𒌍‌ (U+1230d CUNEIFORM SIGN U U U) Affects: Unknown Language (ex:7457556,uses:1), sux (ex:2158188,uses:1)
    • ‌𒌒‌ (U+12312 CUNEIFORM SIGN UB) sux (ex:2158092,uses:1)
    • ‌𒌓‌ (U+12313 CUNEIFORM SIGN UD) sux (ex:5844025,uses:8)
    • ‌𒌝‌ (U+1231d CUNEIFORM SIGN UM) sux (ex:5103828,uses:3)
    • ‌𒌤‌ (U+12324 CUNEIFORM SIGN UMUM TIMES KASKAL) sux (ex:4995693,uses:1)
    • ‌𒌦‌ (U+12326 CUNEIFORM SIGN UN) sux (ex:5065460,uses:11)
    • ‌𒌨‌ (U+12328 CUNEIFORM SIGN UR) sux (ex:5978551,uses:9)
    • ‌𒌶‌ (U+12336 CUNEIFORM SIGN URI3) sux (ex:5178146,uses:1)
    • ‌𒌷‌ (U+12337 CUNEIFORM SIGN URU) sux (ex:5178180,uses:3)
    • ‌𒍂‌ (U+12342 CUNEIFORM SIGN URU TIMES IGI) sux (ex:5038513,uses:1)
    • ‌𒍇‌ (U+12347 CUNEIFORM SIGN URU TIMES MIN) sux (ex:5055910,uses:1)
    • ‌𒍜‌ (U+1235c CUNEIFORM SIGN UZU) sux (ex:4983409,uses:1)
    • ‌𒍝‌ (U+1235d CUNEIFORM SIGN ZA) sux (ex:5090762,uses:8)
    • ‌𒍠‌ (U+12360 CUNEIFORM SIGN ZAG) sux (ex:5090793,uses:3)
    • ‌𒍢‌ (U+12362 CUNEIFORM SIGN ZE2) sux (ex:4967661,uses:3)
    • ‌𒍣‌ (U+12363 CUNEIFORM SIGN ZI) sux (ex:5103007,uses:3)
    • ‌𒍪‌ (U+1236a CUNEIFORM SIGN ZU) sux (ex:7278086,uses:25)
    • ‌𒍼‌ (U+1237c CUNEIFORM SIGN GIG) sux (ex:4650204,uses:3)
    • ‌𒐈‌ (U+12408 CUNEIFORM NUMERIC SIGN THREE DISH) sux (ex:5065460,uses:2)
    • ‌𒐊‌ (U+1240a CUNEIFORM NUMERIC SIGN FIVE DISH) sux (ex:4655824,uses:1)
    • ‌𒐋‌ (U+1240b CUNEIFORM NUMERIC SIGN SIX DISH) sux (ex:4655824,uses:1)
    • ‌𒐼‌ (U+1243c CUNEIFORM NUMERIC SIGN FOUR VARIANT FORM LIMMU) sux (ex:4655824,uses:1)
    • ‌𒑂‌ (U+12442 CUNEIFORM NUMERIC SIGN SEVEN VARIANT FORM IMIN A) sux (ex:4655824,uses:1)
    • ‌𒑄‌ (U+12444 CUNEIFORM NUMERIC SIGN EIGHT VARIANT FORM USSU) sux (ex:4655824,uses:1)
    • ‌𒑆‌ (U+12446 CUNEIFORM NUMERIC SIGN NINE VARIANT FORM ILIMMU) sux (ex:4655824,uses:1)
    • ‌𒑏‌ (U+1244f CUNEIFORM NUMERIC SIGN ONE BAN2) sux (ex:5103828,uses:2)
  • ֑ ֔ ֖ ֗ ֘ ֙ ֝ ֡ ֣ ֤ ֥ ֨ ֪ ְ ֱ ֲ ֳ ִ ֵ ֶ ַ ָ ֹ ֻ ּ ֽ ֿ ׁ ׂ ׇ ﬞ Affects: Unknown Language, Old Aramaic [oar], Hebrew [heb], Jewish Babylonian Aramaic [tmr], Yiddish [yid], English [eng], Ladino [lad]
    • ‌֑‌ (U+591 HEBREW ACCENT ETNAHTA) Affects: heb (ex:532015,uses:4)
    • ‌֔‌ (U+594 HEBREW ACCENT ZAQEF QATAN) Affects: heb (ex:2144310,uses:2)
    • ‌֖‌ (U+596 HEBREW ACCENT TIPEHA) Affects: heb (ex:2144310,uses:3)
    • ‌֗‌ (U+597 HEBREW ACCENT REVIA) Affects: heb (ex:532015,uses:1)
    • ‌֘‌ (U+598 HEBREW ACCENT ZARQA) Affects: heb (ex:532015,uses:1)
    • ‌֙‌ (U+599 HEBREW ACCENT PASHTA) Affects: heb (ex:2144310,uses:2)
    • ‌֝‌ (U+59d HEBREW ACCENT GERESH MUQDAM) Affects: heb (ex:532015,uses:1)
    • ‌֡‌ (U+5a1 HEBREW ACCENT PAZER) Affects: heb (ex:532015,uses:1)
    • ‌֣‌ (U+5a3 HEBREW ACCENT MUNAH) Affects: heb (ex:532015,uses:3)
    • ‌֤‌ (U+5a4 HEBREW ACCENT MAHAPAKH) Affects: heb (ex:532015,uses:3)
    • ‌֥‌ (U+5a5 HEBREW ACCENT MERKHA) Affects: heb (ex:532015,uses:4)
    • ‌֨‌ (U+5a8 HEBREW ACCENT QADMA) Affects: heb (ex:532015,uses:2)
    • ‌֪‌ (U+5aa HEBREW ACCENT YERAH BEN YOMO) Affects: heb (ex:532015,uses:1)
    • ‌ְ‌ (U+5b0 HEBREW POINT SHEVA) Affects: Unknown Language (ex:8749665,uses:2), heb (ex:382152,uses:319), oar (ex:5285789,uses:11), tmr (ex:8292087,uses:4)
    • ‌ֱ‌ (U+5b1 HEBREW POINT HATAF SEGOL) Affects: heb (ex:1048531,uses:26)
    • ‌ֲ‌ (U+5b2 HEBREW POINT HATAF PATAH) Affects: heb (ex:532015,uses:62), oar (ex:8290584,uses:8), tmr (ex:8292087,uses:3)
    • ‌ֳ‌ (U+5b3 HEBREW POINT HATAF QAMATS) Affects: heb (ex:2366564,uses:2), oar (ex:8290584,uses:1)
    • ‌ִ‌ (U+5b4 HEBREW POINT HIRIQ) Affects: Unknown Language (ex:8749665,uses:2), heb (ex:532015,uses:298), oar (ex:5285789,uses:10), tmr (ex:8292087,uses:3), yid (ex:908462,uses:77)
    • ‌ֵ‌ (U+5b5 HEBREW POINT TSERE) Affects: Unknown Language (ex:8749677,uses:1), heb (ex:532015,uses:270), oar (ex:5285789,uses:8), tmr (ex:8292087,uses:3)
    • ‌ֶ‌ (U+5b6 HEBREW POINT SEGOL) Affects: Unknown Language (ex:8749665,uses:2), heb (ex:532015,uses:228), oar (ex:8290587,uses:2)
    • ‌ַ‌ (U+5b7 HEBREW POINT PATAH) Affects: Unknown Language (ex:8749665,uses:2), heb (ex:382152,uses:333), oar (ex:5285789,uses:11), tmr (ex:8292087,uses:4), yid (ex:392335,uses:2177)
    • ‌ָ‌ (U+5b8 HEBREW POINT QAMATS) Affects: Unknown Language (ex:8749665,uses:2), heb (ex:382152,uses:334), oar (ex:5285789,uses:11), tmr (ex:8292087,uses:3), yid (ex:392330,uses:1714)
    • ‌ֹ‌ (U+5b9 HEBREW POINT HOLAM) Affects: Unknown Language (ex:8749677,uses:1), heb (ex:382152,uses:186), oar (ex:8290584,uses:8), tmr (ex:8292087,uses:3), yid (ex:1561150,uses:1)
    • ‌ֻ‌ (U+5bb HEBREW POINT QUBUTS) Affects: heb (ex:532015,uses:17)
    • ‌ּ‌ (U+5bc HEBREW POINT DAGESH OR MAPIQ) Affects: Unknown Language (ex:8749665,uses:2), heb (ex:382152,uses:515), oar (ex:5285789,uses:11), tmr (ex:8292087,uses:3), yid (ex:569458,uses:587)
    • ‌ֽ‌ (U+5bd HEBREW POINT METEG) Affects: heb (ex:532015,uses:10)
    • ‌ֿ‌ (U+5bf HEBREW POINT RAFE) Affects: heb (ex:569617,uses:2), lad (ex:8304069,uses:6), yid (ex:392333,uses:995)
    • ‌ׁ‌ (U+5c1 HEBREW POINT SHIN DOT) Affects: Unknown Language (ex:8749677,uses:1), eng (ex:2147088,uses:8), heb (ex:382152,uses:118), oar (ex:5285789,uses:9), tmr (ex:8408240,uses:2)
    • ‌ׂ‌ (U+5c2 HEBREW POINT SIN DOT) Affects: Unknown Language (ex:8749665,uses:2), eng (ex:4041901,uses:3), heb (ex:569615,uses:53), oar (ex:8292091,uses:1), yid (ex:1557269,uses:20)
    • ‌ׇ‌ (U+5c7 HEBREW POINT QAMATS QATAN) Affects: tmr (ex:8699650,uses:1)
    • ‌ﬞ‌ (U+fb1e HEBREW POINT JUDEO-SPANISH VARIKA) Affects: lad (ex:8602811,uses:29)
  • ៗ ៝ Affects: Central Mnong [cmo], Khmer [khm]
    • ‌ៗ‌ (U+17d7 KHMER SIGN LEK TOO) cmo (ex:5843368,uses:1), khm (ex:6987325,uses:35)
    • ‌៝‌ (U+17dd KHMER SIGN ATTHACAN) cmo (ex:5845556,uses:41)
  • ຽ ໆ Affects: Lao [lao]
    • ‌ຽ‌ (U+ebd LAO SEMIVOWEL SIGN NYO) lao (ex:3791448,uses:6)
    • ‌ໆ‌ (U+ec6 LAO KO LA) lao (ex:2699763,uses:1)
  • ʻ ʼ ʿ ˀ ˈ ˌ ː Affects: English [eng], Tongan [ton], Spanish [spa], Russian [rus], Esperanto [epo], Ngeq [ngt], Hawaiian [haw], Ukrainian [ukr], Hebrew [heb], Italian [ita], Samoan [smo], Cayuga [cay], Breton [bre], Tahitian [tah], German [deu], French [fra], Uzbek [uzb], Navajo [nav], Ancient Greek [grc], Kabyle [kab], Niuean [niu], Belarusian [bel]
    • ‌ʻ‌ (U+2bb MODIFIER LETTER TURNED COMMA) uzb (ex:8097840,uses:1), kab (ex:7263128,uses:1), niu (ex:4606531,uses:1), deu (ex:7793846,uses:4), haw (ex:7265272,uses:105), eng (ex:7907392,uses:3), ton (ex:3615015,uses:3), tah (ex:4896695,uses:1), spa (ex:7864414,uses:2), smo (ex:6824286,uses:4)
    • ‌ʼ‌ (U+2bc MODIFIER LETTER APOSTROPHE) nav (ex:7563558,uses:32), grc (ex:5096327,uses:65), bre (ex:5362018,uses:8), eng (ex:7888103,uses:4), bel (ex:7852262,uses:23), ukr (ex:7790060,uses:1)
    • ‌ʿ‌ (U+2bf MODIFIER LETTER LEFT HALF RING) deu (ex:1817698,uses:1)
    • ‌ˀ‌ (U+2c0 MODIFIER LETTER GLOTTAL STOP) cay (ex:7828736,uses:3)
    • ‌ˈ‌ (U+2c8 MODIFIER LETTER VERTICAL LINE) epo (ex:5491585,uses:2), ngt (ex:3174787,uses:4), deu (ex:2869872,uses:2), eng (ex:1974821,uses:1), fra (ex:990440,uses:1), ita (ex:1101598,uses:1), heb (ex:1974822,uses:1), rus (ex:2204370,uses:1)
    • ‌ˌ‌ (U+2cc MODIFIER LETTER LOW VERTICAL LINE) fra (ex:990440,uses:1), ita (ex:1101598,uses:1), epo (ex:995614,uses:1), deu (ex:988467,uses:1)
    • ‌ː‌ (U+2d0 MODIFIER LETTER TRIANGULAR COLON) epo (ex:995614,uses:1), ngt (ex:3942564,uses:12), deu (ex:988467,uses:1), fra (ex:990440,uses:1), ita (ex:1101598,uses:1)
  • ൺ ൻ ർ ൽ ൾ Affects: Malayalam [mal]
    • ‌ൺ‌ (U+d7a MALAYALAM LETTER CHILLU NN) mal (ex:792902,uses:5)
    • ‌ൻ‌ (U+d7b MALAYALAM LETTER CHILLU N) mal (ex:5494550,uses:166)
    • ‌ർ‌ (U+d7c MALAYALAM LETTER CHILLU RR) mal (ex:5494545,uses:65)
    • ‌ൽ‌ (U+d7d MALAYALAM LETTER CHILLU L) mal (ex:3964449,uses:89)
    • ‌ൾ‌ (U+d7e MALAYALAM LETTER CHILLU LL) mal (ex:5494136,uses:100)
  • ᠠ ᠨ ᠩ ᠪ ᠮ ᠰ ᡝ ᡠ ᡤ ᡥ ᡩ ᡳ ᡵ Affects: Manchu [mnc]
    • ‌ᠠ‌ (U+1820 MONGOLIAN LETTER A) mnc (ex:6742672,uses:1)
    • ‌ᠨ‌ (U+1828 MONGOLIAN LETTER NA) mnc (ex:6742672,uses:1)
    • ‌ᠩ‌ (U+1829 MONGOLIAN LETTER ANG) mnc (ex:6742672,uses:1)
    • ‌ᠪ‌ (U+182a MONGOLIAN LETTER BA) mnc (ex:6742672,uses:1)
    • ‌ᠮ‌ (U+182e MONGOLIAN LETTER MA) mnc (ex:6742672,uses:1)
    • ‌ᠰ‌ (U+1830 MONGOLIAN LETTER SA) mnc (ex:6742672,uses:1)
    • ‌ᡝ‌ (U+185d MONGOLIAN LETTER SIBE E) mnc (ex:6742672,uses:1)
    • ‌ᡠ‌ (U+1860 MONGOLIAN LETTER SIBE UE) mnc (ex:6742672,uses:1)
    • ‌ᡤ‌ (U+1864 MONGOLIAN LETTER SIBE GA) mnc (ex:6742672,uses:1)
    • ‌ᡥ‌ (U+1865 MONGOLIAN LETTER SIBE HA) mnc (ex:6742672,uses:1)
    • ‌ᡩ‌ (U+1869 MONGOLIAN LETTER SIBE DA) mnc (ex:6742672,uses:1)
    • ‌ᡳ‌ (U+1873 MONGOLIAN LETTER MANCHU I) mnc (ex:6742672,uses:1)
    • ‌ᡵ‌ (U+1875 MONGOLIAN LETTER MANCHU RA) mnc (ex:6742672,uses:1)
  • 𑣁 𑣂 𑣅 𑣈 𑣋 𑣌 𑣓 𑣖 𑣗 𑣘 𑣙 𑣜 Affects: Ho [hoc]
    • ‌𑣁‌ (U+118c1 WARANG CITI SMALL LETTER A) hoc (ex:4712781,uses:4)
    • ‌𑣂‌ (U+118c2 WARANG CITI SMALL LETTER WI) hoc (ex:4712757,uses:2)
    • ‌𑣅‌ (U+118c5 WARANG CITI SMALL LETTER YO) hoc (ex:4712781,uses:2)
    • ‌𑣈‌ (U+118c8 WARANG CITI SMALL LETTER E) hoc (ex:4712781,uses:5)
    • ‌𑣋‌ (U+118cb WARANG CITI SMALL LETTER GA) hoc (ex:4712755,uses:1)
    • ‌𑣌‌ (U+118cc WARANG CITI SMALL LETTER KO) hoc (ex:4712781,uses:3)
    • ‌𑣓‌ (U+118d3 WARANG CITI SMALL LETTER NUNG) hoc (ex:4712780,uses:1)
    • ‌𑣖‌ (U+118d6 WARANG CITI SMALL LETTER AM) hoc (ex:4712779,uses:1)
    • ‌𑣗‌ (U+118d7 WARANG CITI SMALL LETTER BU) hoc (ex:4712780,uses:1)
    • ‌𑣘‌ (U+118d8 WARANG CITI SMALL LETTER PU) hoc (ex:4712781,uses:1)
    • ‌𑣙‌ (U+118d9 WARANG CITI SMALL LETTER HIYO) hoc (ex:4690415,uses:1)
    • ‌𑣜‌ (U+118dc WARANG CITI SMALL LETTER HAR) hoc (ex:4712781,uses:4)
  • 𐰀 𐰃 𐰆 𐰇 𐰉 𐰋 𐰍 𐰓 𐰕 𐰖 𐰘 𐰚 𐰞 𐰢 𐰣 𐰲 𐰸 𐰺 𐰼 𐰾 𐱃 𐱅 Affects: Old Turkish [otk]
    • ‌𐰀‌ (U+10c00 OLD TURKIC LETTER ORKHON A) otk (ex:6782743,uses:2)
    • ‌𐰃‌ (U+10c03 OLD TURKIC LETTER ORKHON I) otk (ex:6778975,uses:2)
    • ‌𐰆‌ (U+10c06 OLD TURKIC LETTER ORKHON O) otk (ex:6778975,uses:3)
    • ‌𐰇‌ (U+10c07 OLD TURKIC LETTER ORKHON OE) otk (ex:6778975,uses:1)
    • ‌𐰉‌ (U+10c09 OLD TURKIC LETTER ORKHON AB) otk (ex:6778971,uses:1)
    • ‌𐰋‌ (U+10c0b OLD TURKIC LETTER ORKHON AEB) otk (ex:6782743,uses:3)
    • ‌𐰍‌ (U+10c0d OLD TURKIC LETTER ORKHON AG) otk (ex:6778971,uses:1)
    • ‌𐰓‌ (U+10c13 OLD TURKIC LETTER ORKHON AED) otk (ex:6782743,uses:2)
    • ‌𐰕‌ (U+10c15 OLD TURKIC LETTER YENISEI EZ) otk (ex:6778975,uses:1)
    • ‌𐰖‌ (U+10c16 OLD TURKIC LETTER ORKHON AY) otk (ex:6778982,uses:1)
    • ‌𐰘‌ (U+10c18 OLD TURKIC LETTER ORKHON AEY) otk (ex:6778975,uses:1)
    • ‌𐰚‌ (U+10c1a OLD TURKIC LETTER ORKHON AEK) otk (ex:6778975,uses:1)
    • ‌𐰞‌ (U+10c1e OLD TURKIC LETTER ORKHON AL) otk (ex:6778975,uses:1)
    • ‌𐰢‌ (U+10c22 OLD TURKIC LETTER ORKHON EM) otk (ex:6782743,uses:2)
    • ‌𐰣‌ (U+10c23 OLD TURKIC LETTER ORKHON AN) otk (ex:6778975,uses:1)
    • ‌𐰲‌ (U+10c32 OLD TURKIC LETTER ORKHON EC) otk (ex:6778975,uses:1)
    • ‌𐰸‌ (U+10c38 OLD TURKIC LETTER ORKHON OQ) otk (ex:6778982,uses:2)
    • ‌𐰺‌ (U+10c3a OLD TURKIC LETTER ORKHON AR) otk (ex:6778971,uses:1)
    • ‌𐰼‌ (U+10c3c OLD TURKIC LETTER ORKHON AER) otk (ex:6782743,uses:3)
    • ‌𐰾‌ (U+10c3e OLD TURKIC LETTER ORKHON AES) otk (ex:6778975,uses:1)
    • ‌𐱃‌ (U+10c43 OLD TURKIC LETTER ORKHON AT) otk (ex:6778975,uses:2)
    • ‌𐱅‌ (U+10c45 OLD TURKIC LETTER ORKHON AET) otk (ex:6782743,uses:3)
  • 𐌰 𐌱 𐌲 𐌳 𐌴 𐌵 𐌶 𐌷 𐌸 𐌹 𐌺 𐌻 𐌼 𐌽 𐌾 𐌿 𐍀 𐍂 𐍃 𐍄 𐍅 𐍆 𐍈 𐍉 Affects: Gothic [got]
    • ‌𐌰‌ (U+10330 GOTHIC LETTER AHSA) got (ex:7962560,uses:235)
    • ‌𐌱‌ (U+10331 GOTHIC LETTER BAIRKAN) got (ex:7962524,uses:71)
    • ‌𐌲‌ (U+10332 GOTHIC LETTER GIBA) got (ex:7962390,uses:95)
    • ‌𐌳‌ (U+10333 GOTHIC LETTER DAGS) got (ex:7962560,uses:85)
    • ‌𐌴‌ (U+10334 GOTHIC LETTER AIHVUS) got (ex:7962560,uses:111)
    • ‌𐌵‌ (U+10335 GOTHIC LETTER QAIRTHRA) got (ex:7798734,uses:28)
    • ‌𐌶‌ (U+10336 GOTHIC LETTER IUJA) got (ex:7962524,uses:14)
    • ‌𐌷‌ (U+10337 GOTHIC LETTER HAGL) got (ex:7960347,uses:71)
    • ‌𐌸‌ (U+10338 GOTHIC LETTER THIUTH) got (ex:7962393,uses:133)
    • ‌𐌹‌ (U+10339 GOTHIC LETTER EIS) got (ex:7962560,uses:242)
    • ‌𐌺‌ (U+1033a GOTHIC LETTER KUSMA) got (ex:7960337,uses:77)
    • ‌𐌻‌ (U+1033b GOTHIC LETTER LAGUS) got (ex:7962390,uses:93)
    • ‌𐌼‌ (U+1033c GOTHIC LETTER MANNA) got (ex:7962560,uses:113)
    • ‌𐌽‌ (U+1033d GOTHIC LETTER NAUTHS) got (ex:7962560,uses:150)
    • ‌𐌾‌ (U+1033e GOTHIC LETTER JER) got (ex:7962560,uses:67)
    • ‌𐌿‌ (U+1033f GOTHIC LETTER URUS) got (ex:7962560,uses:132)
    • ‌𐍀‌ (U+10340 GOTHIC LETTER PAIRTHRA) got (ex:7960330,uses:10)
    • ‌𐍂‌ (U+10342 GOTHIC LETTER RAIDA) got (ex:7962524,uses:103)
    • ‌𐍃‌ (U+10343 GOTHIC LETTER SAUIL) got (ex:7962560,uses:182)
    • ‌𐍄‌ (U+10344 GOTHIC LETTER TEIWS) got (ex:7962560,uses:146)
    • ‌𐍅‌ (U+10345 GOTHIC LETTER WINJA) got (ex:7962524,uses:87)
    • ‌𐍆‌ (U+10346 GOTHIC LETTER FAIHU) got (ex:7807834,uses:42)
    • ‌𐍈‌ (U+10348 GOTHIC LETTER HWAIR) got (ex:7962560,uses:45)
    • ‌𐍉‌ (U+10349 GOTHIC LETTER OTHAL) got (ex:7807834,uses:96)
  • ꦁ ꦂ ꦃ ꦏ ꦒ ꦔ ꦕ ꦗ ꦚ ꦠ ꦡ ꦢ ꦣ ꦤ ꦥ ꦧ ꦩ ꦪ ꦫ ꦭ ꦮ ꦰ ꦱ ꦲ ꦴ ꦶ ꦸ ꦺ ꦼ ꧀ Affects: Javanese [jav]
    • ‌ꦁ‌ (U+a981 JAVANESE SIGN CECAK) jav (ex:4560200,uses:7)
    • ‌ꦂ‌ (U+a982 JAVANESE SIGN LAYAR) jav (ex:4546277,uses:2)
    • ‌ꦃ‌ (U+a983 JAVANESE SIGN WIGNYAN) jav (ex:4549792,uses:2)
    • ‌ꦏ‌ (U+a98f JAVANESE LETTER KA) jav (ex:4568164,uses:10)
    • ‌ꦒ‌ (U+a992 JAVANESE LETTER GA) jav (ex:4546281,uses:2)
    • ‌ꦔ‌ (U+a994 JAVANESE LETTER NGA) jav (ex:4560194,uses:2)
    • ‌ꦕ‌ (U+a995 JAVANESE LETTER CA) jav (ex:4554585,uses:3)
    • ‌ꦗ‌ (U+a997 JAVANESE LETTER JA) jav (ex:4560194,uses:4)
    • ‌ꦚ‌ (U+a99a JAVANESE LETTER NYA) jav (ex:4560194,uses:1)
    • ‌ꦠ‌ (U+a9a0 JAVANESE LETTER TA) jav (ex:4560200,uses:10)
    • ‌ꦡ‌ (U+a9a1 JAVANESE LETTER TA MURDA) jav (ex:4560200,uses:2)
    • ‌ꦢ‌ (U+a9a2 JAVANESE LETTER DA) jav (ex:4554579,uses:3)
    • ‌ꦣ‌ (U+a9a3 JAVANESE LETTER DA MAHAPRANA) jav (ex:4568164,uses:4)
    • ‌ꦤ‌ (U+a9a4 JAVANESE LETTER NA) jav (ex:4560200,uses:11)
    • ‌ꦥ‌ (U+a9a5 JAVANESE LETTER PA) jav (ex:4556453,uses:2)
    • ‌ꦧ‌ (U+a9a7 JAVANESE LETTER BA) jav (ex:4568164,uses:4)
    • ‌ꦩ‌ (U+a9a9 JAVANESE LETTER MA) jav (ex:4560200,uses:11)
    • ‌ꦪ‌ (U+a9aa JAVANESE LETTER YA) jav (ex:4554571,uses:2)
    • ‌ꦫ‌ (U+a9ab JAVANESE LETTER RA) jav (ex:4568164,uses:6)
    • ‌ꦭ‌ (U+a9ad JAVANESE LETTER LA) jav (ex:4560194,uses:7)
    • ‌ꦮ‌ (U+a9ae JAVANESE LETTER WA) jav (ex:4568164,uses:6)
    • ‌ꦰ‌ (U+a9b0 JAVANESE LETTER SA MAHAPRANA) jav (ex:3937561,uses:1)
    • ‌ꦱ‌ (U+a9b1 JAVANESE LETTER SA) jav (ex:4568164,uses:9)
    • ‌ꦲ‌ (U+a9b2 JAVANESE LETTER HA) jav (ex:4560194,uses:6)
    • ‌ꦴ‌ (U+a9b4 JAVANESE VOWEL SIGN TARUNG) jav (ex:4560200,uses:7)
    • ‌ꦶ‌ (U+a9b6 JAVANESE VOWEL SIGN WULU) jav (ex:4568164,uses:11)
    • ‌ꦸ‌ (U+a9b8 JAVANESE VOWEL SIGN SUKU) jav (ex:4560194,uses:10)
    • ‌ꦺ‌ (U+a9ba JAVANESE VOWEL SIGN TALING) jav (ex:4568164,uses:12)
    • ‌ꦼ‌ (U+a9bc JAVANESE VOWEL SIGN PEPET) jav (ex:4560200,uses:12)
    • ‌꧀‌ (U+a9c0 JAVANESE PANGKON) jav (ex:4568164,uses:14)
  • ꀁ ꀃ ꀐ ꀕ ꁧ ꂘ ꂯ ꃀ ꆍ ꆏ ꆹ ꇩ ꇬ ꇿ ꈍ ꉡ ꉬ ꊿ ꋋ ꋙ ꋠ ꌕ ꍏ ꏃ ꐥ ꑋ ꑍ ꑬ Affects: Unknown Language
    • ‌ꀁ‌ (U+a001 YI SYLLABLE IX) Affects: Unknown Language (ex:8191359,uses:1)
    • ‌ꀃ‌ (U+a003 YI SYLLABLE IP) Affects: Unknown Language (ex:8191501,uses:2)
    • ‌ꀐ‌ (U+a010 YI SYLLABLE OX) Affects: Unknown Language (ex:8191477,uses:1)
    • ‌ꀕ‌ (U+a015 YI SYLLABLE WU) Affects: Unknown Language (ex:8191477,uses:1)
    • ‌ꁧ‌ (U+a067 YI SYLLABLE BBO) Affects: Unknown Language (ex:8191353,uses:1)
    • ‌ꂘ‌ (U+a098 YI SYLLABLE HMAT) Affects: Unknown Language (ex:8191419,uses:1)
    • ‌ꂯ‌ (U+a0af YI SYLLABLE MIX) Affects: Unknown Language (ex:8191353,uses:1)
    • ‌ꃀ‌ (U+a0c0 YI SYLLABLE MOP) Affects: Unknown Language (ex:8191419,uses:1)
    • ‌ꆍ‌ (U+a18d YI SYLLABLE NOP) Affects: Unknown Language (ex:8191359,uses:1)
    • ‌ꆏ‌ (U+a18f YI SYLLABLE NE) Affects: Unknown Language (ex:8191501,uses:4)
    • ‌ꆹ‌ (U+a1b9 YI SYLLABLE LI) Affects: Unknown Language (ex:8191457,uses:3)
    • ‌ꇩ‌ (U+a1e9 YI SYLLABLE GUOP) Affects: Unknown Language (ex:8191457,uses:2)
    • ‌ꇬ‌ (U+a1ec YI SYLLABLE GO) Affects: Unknown Language (ex:8191359,uses:1)
    • ‌ꇿ‌ (U+a1ff YI SYLLABLE KAT) Affects: Unknown Language (ex:8191353,uses:1)
    • ‌ꈍ‌ (U+a20d YI SYLLABLE KEP) Affects: Unknown Language (ex:8191359,uses:1)
    • ‌ꉡ‌ (U+a261 YI SYLLABLE NGAX) Affects: Unknown Language (ex:8191444,uses:1)
    • ‌ꉬ‌ (U+a26c YI SYLLABLE NGE) Affects: Unknown Language (ex:8191501,uses:2)
    • ‌ꊿ‌ (U+a2bf YI SYLLABLE CO) Affects: Unknown Language (ex:8191457,uses:3)
    • ‌ꋋ‌ (U+a2cb YI SYLLABLE CYX) Affects: Unknown Language (ex:8191457,uses:2)
    • ‌ꋙ‌ (U+a2d9 YI SYLLABLE ZZAX) Affects: Unknown Language (ex:8191477,uses:1)
    • ‌ꋠ‌ (U+a2e0 YI SYLLABLE ZZE) Affects: Unknown Language (ex:8191477,uses:1)
    • ‌ꌕ‌ (U+a315 YI SYLLABLE SUO) Affects: Unknown Language (ex:8191498,uses:1)
    • ‌ꍏ‌ (U+a34f YI SYLLABLE ZHO) Affects: Unknown Language (ex:8191457,uses:2)
    • ‌ꏃ‌ (U+a3c3 YI SYLLABLE SHYP) Affects: Unknown Language (ex:8191501,uses:1)
    • ‌ꐥ‌ (U+a425 YI SYLLABLE JJO) Affects: Unknown Language (ex:8191359,uses:1)
    • ‌ꑋ‌ (U+a44b YI SYLLABLE NYIX) Affects: Unknown Language (ex:8191359,uses:1)
    • ‌ꑍ‌ (U+a44d YI SYLLABLE NYIP) Affects: Unknown Language (ex:8191501,uses:2)
    • ‌ꑬ‌ (U+a46c YI SYLLABLE XYX) Affects: Unknown Language (ex:8191501,uses:2)
  • ㇰ ㇱ ㇷ ㇻ ㇼ ㇽ ㇾ ㇿ Affects: Ainu [ain]
    • ‌ㇰ‌ (U+31f0 KATAKANA LETTER SMALL KU) ain (ex:2850264,uses:3)
    • ‌ㇱ‌ (U+31f1 KATAKANA LETTER SMALL SI) ain (ex:3717727,uses:4)
    • ‌ㇷ‌ (U+31f7 KATAKANA LETTER SMALL HU) ain (ex:2850247,uses:2)
    • ‌ㇻ‌ (U+31fb KATAKANA LETTER SMALL RA) ain (ex:7368402,uses:4)
    • ‌ㇼ‌ (U+31fc KATAKANA LETTER SMALL RI) ain (ex:3717737,uses:1)
    • ‌ㇽ‌ (U+31fd KATAKANA LETTER SMALL RU) ain (ex:2850264,uses:2)
    • ‌ㇾ‌ (U+31fe KATAKANA LETTER SMALL RE) ain (ex:587897,uses:1)
    • ‌ㇿ‌ (U+31ff KATAKANA LETTER SMALL RO) ain (ex:602515,uses:1)
  • ︎ Affects: Japanese [jpn]
    • ‌︎‌ (U+fe0e VARIATION SELECTOR-15) jpn (ex:4149189,uses:1)
  • ᜀ ᜃ ᜄ ᜅ ᜆ ᜇ ᜈ ᜉ ᜊ ᜋ ᜌ ᜎ ᜏ ᜐ ᜑ ᜒ ᜓ ᜔ Affects: Tagalog [tgl]
    • ‌ᜀ‌ (U+1700 TAGALOG LETTER A) tgl (ex:8603745,uses:6)
    • ‌ᜃ‌ (U+1703 TAGALOG LETTER KA) tgl (ex:8603746,uses:4)
    • ‌ᜄ‌ (U+1704 TAGALOG LETTER GA) tgl (ex:8603752,uses:4)
    • ‌ᜅ‌ (U+1705 TAGALOG LETTER NGA) tgl (ex:8603745,uses:5)
    • ‌ᜆ‌ (U+1706 TAGALOG LETTER TA) tgl (ex:8603755,uses:4)
    • ‌ᜇ‌ (U+1707 TAGALOG LETTER DA) tgl (ex:8603762,uses:2)
    • ‌ᜈ‌ (U+1708 TAGALOG LETTER NA) tgl (ex:8603745,uses:7)
    • ‌ᜉ‌ (U+1709 TAGALOG LETTER PA) tgl (ex:8603746,uses:3)
    • ‌ᜊ‌ (U+170a TAGALOG LETTER BA) tgl (ex:8603745,uses:4)
    • ‌ᜋ‌ (U+170b TAGALOG LETTER MA) tgl (ex:8603745,uses:6)
    • ‌ᜌ‌ (U+170c TAGALOG LETTER YA) tgl (ex:8603745,uses:4)
    • ‌ᜎ‌ (U+170e TAGALOG LETTER LA) tgl (ex:8603746,uses:6)
    • ‌ᜏ‌ (U+170f TAGALOG LETTER WA) tgl (ex:8603746,uses:4)
    • ‌ᜐ‌ (U+1710 TAGALOG LETTER SA) tgl (ex:8603745,uses:6)
    • ‌ᜑ‌ (U+1711 TAGALOG LETTER HA) tgl (ex:8603773,uses:1)
    • ‌ᜒ‌ (U+1712 TAGALOG VOWEL SIGN I) tgl (ex:8603745,uses:6)
    • ‌ᜓ‌ (U+1713 TAGALOG VOWEL SIGN U) tgl (ex:8603745,uses:7)
    • ‌᜔‌ (U+1714 TAGALOG SIGN VIRAMA) tgl (ex:8603745,uses:7)
  • 𓀀 𓀁 𓀻 𓁐 𓂋 𓂜 𓂸 𓂻 𓃀 𓄖 𓄤 𓄿 𓅓 𓅨 𓅱 𓆑 𓆓 𓆼 𓇋 𓇌 𓇛 𓈖 𓈗 𓈞 𓊄 𓊪 𓋴 𓌻 𓍿 𓎛 𓎟 𓎡 𓎢 𓏌 𓏏 𓏥 𓏫 𓏭 𓏲 𓏶 Affects: Unknown Language
    • ‌𓀀‌ (U+13000 EGYPTIAN HIEROGLYPH A001) Affects: Unknown Language (ex:8678389,uses:3)
    • ‌𓀁‌ (U+13001 EGYPTIAN HIEROGLYPH A002) Affects: Unknown Language (ex:8674795,uses:2)
    • ‌𓀻‌ (U+1303b EGYPTIAN HIEROGLYPH A050) Affects: Unknown Language (ex:8688778,uses:1)
    • ‌𓁐‌ (U+13050 EGYPTIAN HIEROGLYPH B001) Affects: Unknown Language (ex:8678535,uses:1)
    • ‌𓂋‌ (U+1308b EGYPTIAN HIEROGLYPH D021) Affects: Unknown Language (ex:8678389,uses:4)
    • ‌𓂜‌ (U+1309c EGYPTIAN HIEROGLYPH D035) Affects: Unknown Language (ex:8688804,uses:1)
    • ‌𓂸‌ (U+130b8 EGYPTIAN HIEROGLYPH D052) Affects: Unknown Language (ex:8678780,uses:1)
    • ‌𓂻‌ (U+130bb EGYPTIAN HIEROGLYPH D054) Affects: Unknown Language (ex:8688804,uses:1)
    • ‌𓃀‌ (U+130c0 EGYPTIAN HIEROGLYPH D058) Affects: Unknown Language (ex:8678698,uses:1)
    • ‌𓄖‌ (U+13116 EGYPTIAN HIEROGLYPH F022) Affects: Unknown Language (ex:8688804,uses:1)
    • ‌𓄤‌ (U+13124 EGYPTIAN HIEROGLYPH F035) Affects: Unknown Language (ex:8678535,uses:1)
    • ‌𓄿‌ (U+1313f EGYPTIAN HIEROGLYPH G001) Affects: Unknown Language (ex:8674795,uses:1)
    • ‌𓅓‌ (U+13153 EGYPTIAN HIEROGLYPH G017) Affects: Unknown Language (ex:8678389,uses:4)
    • ‌𓅨‌ (U+13168 EGYPTIAN HIEROGLYPH G036) Affects: Unknown Language (ex:8678698,uses:1)
    • ‌𓅱‌ (U+13171 EGYPTIAN HIEROGLYPH G043) Affects: Unknown Language (ex:8678389,uses:1)
    • ‌𓆑‌ (U+13191 EGYPTIAN HIEROGLYPH I009) Affects: Unknown Language (ex:8678389,uses:2)
    • ‌𓆓‌ (U+13193 EGYPTIAN HIEROGLYPH I010) Affects: Unknown Language (ex:8678629,uses:1)
    • ‌𓆼‌ (U+131bc EGYPTIAN HIEROGLYPH M012) Affects: Unknown Language (ex:8674795,uses:1)
    • ‌𓇋‌ (U+131cb EGYPTIAN HIEROGLYPH M017) Affects: Unknown Language (ex:8678698,uses:3)
    • ‌𓇌‌ (U+131cc EGYPTIAN HIEROGLYPH M017A) Affects: Unknown Language (ex:8680115,uses:1)
    • ‌𓇛‌ (U+131db EGYPTIAN HIEROGLYPH M029) Affects: Unknown Language (ex:8678780,uses:1)
    • ‌𓈖‌ (U+13216 EGYPTIAN HIEROGLYPH N035) Affects: Unknown Language (ex:8678389,uses:6)
    • ‌𓈗‌ (U+13217 EGYPTIAN HIEROGLYPH N035A) Affects: Unknown Language (ex:8678780,uses:1)
    • ‌𓈞‌ (U+1321e EGYPTIAN HIEROGLYPH N041) Affects: Unknown Language (ex:8678535,uses:1)
    • ‌𓊄‌ (U+13284 EGYPTIAN HIEROGLYPH O035) Affects: Unknown Language (ex:8680115,uses:1)
    • ‌𓊪‌ (U+132aa EGYPTIAN HIEROGLYPH Q003) Affects: Unknown Language (ex:8678389,uses:2)
    • ‌𓋴‌ (U+132f4 EGYPTIAN HIEROGLYPH S029) Affects: Unknown Language (ex:8674795,uses:2)
    • ‌𓌻‌ (U+1333b EGYPTIAN HIEROGLYPH U007) Affects: Unknown Language (ex:8680154,uses:1)
    • ‌𓍿‌ (U+1337f EGYPTIAN HIEROGLYPH V013) Affects: Unknown Language (ex:8680154,uses:1)
    • ‌𓎛‌ (U+1339b EGYPTIAN HIEROGLYPH V028) Affects: Unknown Language (ex:8688804,uses:1)
    • ‌𓎟‌ (U+1339f EGYPTIAN HIEROGLYPH V030) Affects: Unknown Language (ex:8678535,uses:1)
    • ‌𓎡‌ (U+133a1 EGYPTIAN HIEROGLYPH V031) Affects: Unknown Language (ex:8678629,uses:3)
    • ‌𓎢‌ (U+133a2 EGYPTIAN HIEROGLYPH V031A) Affects: Unknown Language (ex:8680115,uses:1)
    • ‌𓏌‌ (U+133cc EGYPTIAN HIEROGLYPH W024) Affects: Unknown Language (ex:8688778,uses:1)
    • ‌𓏏‌ (U+133cf EGYPTIAN HIEROGLYPH X001) Affects: Unknown Language (ex:8678535,uses:4)
    • ‌𓏥‌ (U+133e5 EGYPTIAN HIEROGLYPH Z002) Affects: Unknown Language (ex:8678389,uses:1)
    • ‌𓏫‌ (U+133eb EGYPTIAN HIEROGLYPH Z003A) Affects: Unknown Language (ex:8674795,uses:1)
    • ‌𓏭‌ (U+133ed EGYPTIAN HIEROGLYPH Z004) Affects: Unknown Language (ex:8678535,uses:3)
    • ‌𓏲‌ (U+133f2 EGYPTIAN HIEROGLYPH Z007) Affects: Unknown Language (ex:8678698,uses:2)
    • ‌𓏶‌ (U+133f6 EGYPTIAN HIEROGLYPH Z011) Affects: Unknown Language (ex:8678389,uses:1)

Ignored Intentionally

  • (SOFT HYPHEN) ́ Affects: Russian [rus], Latin [lat]
    • ‌­‌ (U+ad SOFT HYPHEN)
    • ‌́‌ (U+301 COMBINING ACUTE ACCENT) rus (ex:8188479,uses:364)
  • (SOFT HYPHEN) ְ ֱ ֲ ֳ ִ ֵ ֶ ַ ָ ֹ ֺ ֻ ּ ֽ ־ ֿ ׀ ׁ ׂ ׃ ׄ ׅ ׇ Affects: All Languages [all]
    • ‌­‌ (U+ad SOFT HYPHEN)
    • ‌ְ‌ (U+5b0 HEBREW POINT SHEVA) heb (ex:8100773,uses:304), oar (ex:5285789,uses:1)
    • ‌ֱ‌ (U+5b1 HEBREW POINT HATAF SEGOL) heb (ex:8096277,uses:20)
    • ‌ֲ‌ (U+5b2 HEBREW POINT HATAF PATAH) heb (ex:8096277,uses:54)
    • ‌ֳ‌ (U+5b3 HEBREW POINT HATAF QAMATS) heb (ex:3198074,uses:2)
    • ‌ִ‌ (U+5b4 HEBREW POINT HIRIQ) heb (ex:8157521,uses:282), yid (ex:8210619,uses:23), oar (ex:5285789,uses:1)
    • ‌ֵ‌ (U+5b5 HEBREW POINT TSERE) heb (ex:8157521,uses:258), oar (ex:5285789,uses:1)
    • ‌ֶ‌ (U+5b6 HEBREW POINT SEGOL) heb (ex:8170851,uses:216)
    • ‌ַ‌ (U+5b7 HEBREW POINT PATAH) heb (ex:8157521,uses:320), yid (ex:8232829,uses:546), oar (ex:5285789,uses:1)
    • ‌ָ‌ (U+5b8 HEBREW POINT QAMATS) heb (ex:8135690,uses:318), yid (ex:8232829,uses:409), oar (ex:5285789,uses:1)
    • ‌ֹ‌ (U+5b9 HEBREW POINT HOLAM) heb (ex:8100773,uses:171), yid (ex:1561150,uses:1)
    • ‌ֺ‌ (U+5ba HEBREW POINT HOLAM HASER FOR VAV)
    • ‌ֻ‌ (U+5bb HEBREW POINT QUBUTS) heb (ex:8100773,uses:16)
    • ‌ּ‌ (U+5bc HEBREW POINT DAGESH OR MAPIQ) heb (ex:8170851,uses:498), yid (ex:8226280,uses:141), oar (ex:5285789,uses:1)
    • ‌ֽ‌ (U+5bd HEBREW POINT METEG) heb (ex:8038317,uses:3)
    • ‌־‌ (U+5be HEBREW PUNCTUATION MAQAF)
    • ‌ֿ‌ (U+5bf HEBREW POINT RAFE) heb (ex:2139673,uses:2), yid (ex:8223484,uses:291)
    • ‌׀‌ (U+5c0 HEBREW PUNCTUATION PASEQ)
    • ‌ׁ‌ (U+5c1 HEBREW POINT SHIN DOT) eng (ex:5358365,uses:8), heb (ex:8087703,uses:108), oar (ex:5285789,uses:1)
    • ‌ׂ‌ (U+5c2 HEBREW POINT SIN DOT) eng (ex:5706261,uses:3), heb (ex:8100773,uses:48), yid (ex:7885650,uses:3)
    • ‌׃‌ (U+5c3 HEBREW PUNCTUATION SOF PASUQ)
    • ‌ׄ‌ (U+5c4 HEBREW MARK UPPER DOT)
    • ‌ׅ‌ (U+5c5 HEBREW MARK LOWER DOT)
    • ‌ׇ‌ (U+5c7 HEBREW POINT QAMATS QATAN)

Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Oct 5, 2019
Those two characters break the alternating pattern of
uppercase-lowercase pairs. See issue Tatoeba#1970, section "Other Mappings
Currently in Use"
@alanfgh
Contributor

alanfgh commented Oct 5, 2019

The amount of work you've put into researching this is truly stunning. Are you planning to implement any changes regarding these suggestions?

@Yorwba
Contributor Author

Yorwba commented Oct 6, 2019

It's about time I started fixing issues rather than just piling them up.

Fortunately, this one only affects a well-delineated part of the code base, so I can work on it without having to figure out how the rest of it fits together. Well, except for everything involving multiple codepoints, which will require Unicode normalization (either NFC or NFKC) to happen at some point.

@alanfgh
Contributor

alanfgh commented Oct 6, 2019

It's great that you're planning to do this work yourself. If you were going to ask someone else to do it, you would probably have to break it up and/or scale it down.

Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Oct 11, 2019
Those two characters break the alternating pattern of
uppercase-lowercase pairs. See issue Tatoeba#1970, section "Other Mappings
Currently in Use"
trang pushed a commit that referenced this issue Oct 11, 2019
Those two characters break the alternating pattern of
uppercase-lowercase pairs. See issue #1970, section "Other Mappings
Currently in Use"
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Oct 22, 2019
The characters from U+31F0 ㇰ to U+31FF ㇿ are used to write Ainu.
Unicode Block: https://www.unicode.org/charts/PDF/U31F0.pdf

See issue Tatoeba#1970, section "Other Unsearchable Characters".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Oct 22, 2019
The characters from U+A000 ꀀ to U+A48C ꒌ are used to write Yi
languages like Nuosu (iii). No Yi language has been added to Tatoeba
yet, and the person who added the sentences using Yi syllables has not
responded to [my attempt at making
contact](https://tatoeba.org/eng/sentences/show/8191359#comment-1126914) so far.
Adding the script anyway probably won't hurt.

Unicode Block: http://unicode.org/charts/PDF/UA000.pdf

See issue Tatoeba#1970, section "Other Unsearchable Characters".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Oct 22, 2019
These characters were historically used to write Javanese (jav).
There are a few punctuation marks, which I have excluded.

Unicode Block: https://www.unicode.org/charts/PDF/UA980.pdf

See issue Tatoeba#1970, section "Other Unsearchable Characters".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Oct 22, 2019
A handful of characters seem to have been missed when the Lao script was
added.

Unicode Block: https://www.unicode.org/charts/PDF/U0E80.pdf

See issue Tatoeba#1970, section "Other Unsearchable Characters".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Oct 22, 2019
Cuneiform is used to write Sumerian (sux).
There are three Unicode blocks:
  - Cuneiform: https://www.unicode.org/charts/PDF/U12000.pdf
  - Cuneiform Numbers and Punctuation: https://www.unicode.org/charts/PDF/U12400.pdf
  - Early Dynastic Cuneiform: https://www.unicode.org/charts/PDF/U12480.pdf

I omitted the punctuation.

See issue Tatoeba#1970, section "Other Unsearchable Characters".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Nov 7, 2019
The characters used to write Gothic (got). The existing Gothic
sentences seem to use spaces between words, so using charset_table is
likely appropriate.
Unicode Block: https://www.unicode.org/charts/PDF/U10330.pdf

See issue Tatoeba#1970, section "Other Unsearchable Characters".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Nov 7, 2019
These characters are used to write Old Turkish (otk). The existing Old
Turkish sentences separate words with a colon, so using charset_table is
likely appropriate. Writing direction is right-to-left.
Unicode Block: https://www.unicode.org/charts/PDF/U10C00.pdf

See issue Tatoeba#1970, section "Other Unsearchable Characters".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Nov 7, 2019
Warang Citi is used by some to write Ho (hoc). According to Wikipedia,
Latin punctuation including spaces is used, so using charset_table is
likely appropriate.
The script has a range of uppercase characters, which I mapped to the
lowercase ones.
Unicode Block: https://www.unicode.org/charts/PDF/U118A0.pdf

See issue Tatoeba#1970, sections "Case Alternatives" and
"Other Unsearchable Characters".
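
The uppercase-to-lowercase mapping described in this commit message can be sketched in Python. This is only an illustration under assumptions: `lowercase_mappings` is a hypothetical helper, not part of the Tatoeba code base, it relies on the Unicode data bundled with the Python interpreter, and its output merely imitates the `U+XXXX..U+YYYY->U+ZZZZ..U+WWWW` range notation used in charset_table definitions.

```python
from typing import List


def lowercase_mappings(first: int, last: int) -> List[str]:
    """Build charset_table-style entries that fold [first, last] to lowercase."""
    runs = []  # each run: [src_start, src_end, dst_start, dst_end]
    for cp in range(first, last + 1):
        lower = chr(cp).lower()
        # Skip uncased code points and multi-character lowercase forms.
        if len(lower) != 1 or ord(lower) == cp:
            continue
        dst = ord(lower)
        if runs and runs[-1][1] + 1 == cp and runs[-1][3] + 1 == dst:
            runs[-1][1], runs[-1][3] = cp, dst  # extend the current run
        else:
            runs.append([cp, cp, dst, dst])  # start a new run
    entries = []
    for src_start, src_end, dst_start, dst_end in runs:
        if src_start == src_end:
            entries.append(f"U+{src_start:X}->U+{dst_start:X}")
        else:
            entries.append(
                f"U+{src_start:X}..U+{src_end:X}->U+{dst_start:X}..U+{dst_end:X}"
            )
    return entries


# Warang Citi capital letters occupy U+118A0..U+118BF.
print(lowercase_mappings(0x118A0, 0x118BF))
```

Grouping consecutive code points into runs keeps the generated definition short for scripts whose uppercase and lowercase letters sit in parallel ranges, and falls back to single-character entries where the pattern is broken.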
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Nov 7, 2019
Mongolian script is used to write Mongolian (mon) and Manchu (mnc).
The existing Mongolian sentences all seem to use Cyrillic, though.
Words are separated with spaces, so using charset_table is likely appropriate.
There are a few punctuation marks, which I have excluded.
Traditional writing direction is vertical, but supporting that would
really mess with the layout.
Unicode Block: https://www.unicode.org/charts/PDF/U1800.pdf

See issue Tatoeba#1970, section "Other Unsearchable Characters".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Nov 7, 2019
Some non-punctuation characters used to write Malayalam (mal) were missing from
the end of the Unicode range. Probably a typo.
Unicode Block: https://www.unicode.org/charts/PDF/U0D00.pdf

See issue Tatoeba#1970, section "Other Unsearchable Characters".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Nov 7, 2019
Of the Spacing Modifier Letters already present in the Tatoeba corpus,
  - ʻ (U+2BB MODIFIER LETTER TURNED COMMA) is used to write the Hawaiian
  Okina, so I added it.
  - ʼ (U+2BC MODIFIER LETTER APOSTROPHE) is used as a stand-in for a
  "regular" apostrophe, so I left it out.
  - ʿ (U+2BF MODIFIER LETTER LEFT HALF RING) is used to transcribe the Arabic
  Ayin, so I added it.
  - ˀ (U+2C0 MODIFIER LETTER GLOTTAL STOP) is used to write the glottal
  stop in Cayuga, so I added it.
  - ˈ (U+2C8 MODIFIER LETTER VERTICAL LINE) and
  - ˌ (U+2CC MODIFIER LETTER LOW VERTICAL LINE) are used in IPA
  transcriptions to mark primary and secondary stress, which I think
  means they're better left out.
  - ː (U+2D0 MODIFIER LETTER TRIANGULAR COLON) is used to mark vowel
  length in Ngeq, so I added it.
Unicode Block: https://www.unicode.org/charts/PDF/U02B0.pdf

See issue Tatoeba#1970, section "Other Unsearchable Characters".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Nov 7, 2019
There are five Cyrillic Unicode blocks:
- Cyrillic (some combining characters in this range were missing): https://www.unicode.org/charts/PDF/U0400.pdf
- Cyrillic Supplement: https://www.unicode.org/charts/PDF/U0500.pdf
- Cyrillic Extended-A: https://www.unicode.org/charts/PDF/U2DE0.pdf
- Cyrillic Extended-B: https://www.unicode.org/charts/PDF/UA640.pdf
- Cyrillic Extended-C: https://www.unicode.org/charts/PDF/U1C80.pdf

See issue Tatoeba#1970, section "Other Unsearchable Characters".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Nov 7, 2019
The Greek Extended character set contains accented characters used to
write Ancient Greek (grc) with polytonic orthography.

Unicode Block: https://www.unicode.org/charts/PDF/U1F00.pdf

There were also some misalignments to fix in the Greek and Coptic block.

See issue Tatoeba#1970, sections "Duplicate Encodings", "Near Duplicates",
"Case Alternatives", "Other Mappings Currently in Use" and
"Other Unsearchable Characters".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Nov 7, 2019
Most Unicode blocks containing Latin characters were already covered,
but with some missing characters here and there.

There are also two new Unicode blocks:
- IPA Extensions: https://www.unicode.org/charts/PDF/U0250.pdf
- Phonetic Extensions: https://www.unicode.org/charts/PDF/U1D00.pdf

See issue Tatoeba#1970, section "Other Unsearchable Characters".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Nov 7, 2019
Lowercase Cherokee characters were added to Unicode in 2015.
There are two Unicode blocks now:
  - Cherokee: https://www.unicode.org/charts/PDF/U13A0.pdf
  - Cherokee Supplement: https://www.unicode.org/charts/PDF/UAB70.pdf

See issue Tatoeba#1970, section "Case Alternatives".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Nov 7, 2019
The Armenian block was previously listed as a single range. However, it
contains both uppercase and lowercase characters, as well as some
punctuation (which I have excluded in the updated definition).

Unicode block: https://www.unicode.org/charts/PDF/U0530.pdf

See issue Tatoeba#1970, section "Case Alternatives".
@jiru
Member

jiru commented Mar 14, 2020

@Yorwba What’s the status of this issue? Are there some remaining mappings that you want to implement? Is the status of checkboxes relevant?

By the way, Andreas already Unicode-normalized the sentences, but I think we still need to normalize the query text that is sent to Manticore when searching.
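
To illustrate why the query side matters too, here is a minimal Python sketch. It assumes stored sentences are in NFC; `normalize_query` is a hypothetical name, and the real search code path may look quite different.

```python
import unicodedata


def normalize_query(query: str) -> str:
    """Bring a search query into NFC, the form assumed for stored sentences."""
    return unicodedata.normalize("NFC", query)


stored = "Kh\u00e1ch"   # "Khách" with a precomposed á (NFC)
typed = "Kha\u0301ch"   # same word typed as a + COMBINING ACUTE ACCENT

print(typed == stored)                   # False: the code points differ
print(normalize_query(typed) == stored)  # True once the query is normalized
```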

@Yorwba
Copy link
Contributor Author

Yorwba commented Mar 14, 2020

The checkboxes reflect where I've created a PR or decided that the current behavior probably doesn't need to be changed.

I'd been working my way up the list of "Other Unsearchable Characters", because those were mostly scripts that were missing entirely. Once I hit the missing characters from the Arabic script, things got a bit complicated. I asked a few speakers of affected languages which behavior they'd prefer and got feedback regarding Arabic, Persian and Ottoman Turkish. In the case of Arabic vowel marks, the ideal behavior would be that a word with vowel marks matches one without, but not another word with different vowel marks. But that's not a transitive relation, so it can't be implemented with a simple index lookup. Also, the set of characters that are considered equivalent is different across different languages. Parts of this would probably best be handled by a stemmer. Since we now have stemming for Arabic, the situation should have improved a bit, but I need to check. See also issues #1595 (Arabic) and #1880 (Ottoman Turkish).
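
The non-transitivity can be made concrete with a small Python sketch. The matching rule below is an assumption that paraphrases the preference described above (an unmarked word matches any marking of the same letters, two marked words match only if identical); `strip_marks` and `matches` are hypothetical helpers, and treating every combining character as a vowel mark is a simplification.

```python
import unicodedata


def strip_marks(word: str) -> str:
    """Drop all combining characters, leaving the bare letters."""
    return "".join(ch for ch in word if not unicodedata.combining(ch))


def matches(a: str, b: str) -> bool:
    """Assumed rule: same letters, and marks only have to agree if both have them."""
    if strip_marks(a) != strip_marks(b):
        return False
    either_unmarked = a == strip_marks(a) or b == strip_marks(b)
    return either_unmarked or a == b


kataba = "\u0643\u064E\u062A\u064E\u0628\u064E"  # k-t-b with fatha marks
bare = "\u0643\u062A\u0628"                      # k-t-b without vowel marks
kutub = "\u0643\u064F\u062A\u064F\u0628"         # k-t-b with damma marks

print(matches(kataba, bare))   # True
print(matches(bare, kutub))    # True
print(matches(kataba, kutub))  # False, so the relation is not transitive
```

Because no single normalized key can make `kataba` and `kutub` both equal to `bare` without also making them equal to each other, mapping characters in the index alone cannot express this rule.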

Unicode normalization to NFC would take care of all duplicate encodings in one fell swoop, but it seems like the cleaning function hasn't been applied to sentences already in the database. E.g. the Khách in https://tatoeba.org/eng/sentences/show/7027190 still has an á composed of two codepoints.
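
A quick way to check whether a given sentence is affected is sketched below in Python. The helper names `codepoints` and `needs_nfc` are hypothetical, and the sample string is a hand-written decomposed spelling rather than data pulled from the database.

```python
import unicodedata
from typing import List


def codepoints(text: str) -> List[str]:
    """List the code points of a string in U+XXXX notation."""
    return [f"U+{ord(ch):04X}" for ch in text]


def needs_nfc(text: str) -> bool:
    """True if the text would change when normalized to NFC."""
    return unicodedata.normalize("NFC", text) != text


decomposed = "Kha\u0301ch"  # a followed by COMBINING ACUTE ACCENT
print(needs_nfc(decomposed))
print(codepoints(decomposed))
print(codepoints(unicodedata.normalize("NFC", decomposed)))
```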

Then there's the issue of near duplicates that are only the same in NFKC. Those are considered compatibility equivalents (not canonical equivalents) by Unicode, so they may have a different appearance and are sometimes used differently. Some of them also normalize to multiple codepoints, e.g. the Dutch ĳ (one codepoint, U+0133) vs. ij (two codepoints). I don't think we can store sentences in NFKC, because that would erase too much information, but maybe a normalization step could be inserted somewhere in the Manticore pipeline. (Probably not that easy.)
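
The difference between the two normalization forms can be seen in a few lines of Python. This is a sketch only: `search_key` is a hypothetical function showing the idea of keeping the stored text in NFC while deriving a separate NFKC key for matching, and it says nothing about where such a step would fit in Manticore.

```python
import unicodedata


def search_key(text: str) -> str:
    """Derive an NFKC key for matching; the stored text itself stays untouched."""
    return unicodedata.normalize("NFKC", text)


ligature = "\u0133"  # LATIN SMALL LIGATURE IJ, a single code point

print(unicodedata.normalize("NFC", ligature) == ligature)  # NFC keeps the ligature
print(search_key(ligature) == "ij")                        # NFKC splits it into i + j
print(len(search_key(ligature)))
```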

Some of the other parts aren't that technically difficult, but I'm not sure what the best option is. E.g. punctuation is less problematic for languages that use the ngram_chars mechanism, which is more robust to the presence of additional characters. Maybe some people actually want to be able to search for punctuation?

I do plan to eventually get this issue fully cleaned up, but I don't have a specific timeline planned.

Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Jun 11, 2020
Unicode Block: https://www.unicode.org/charts/PDF/U1700.pdf

See issue Tatoeba#1970, section "Other Unsearchable Characters".
Yorwba added a commit to Yorwba/tatoeba2 that referenced this issue Jun 11, 2020
Unicode Block: https://www.unicode.org/charts/PDF/U13000.pdf

See issue Tatoeba#1970, section "Other Unsearchable Characters".
@fdietze

fdietze commented Nov 6, 2020

I just started working with the data and stumbled upon this Unicode normalization problem. Along the way I created a simple script that detects duplicate sentences. I hope this is helpful somehow.

#!/usr/bin/env bash
set -Eeuo pipefail # https://vaneyckt.io/posts/safer_bash_scripts_with_set_euxo_pipefail/#:~:text=set%20%2Du,is%20often%20highly%20desirable%20behavior.
set -x # print all commands
shopt -s expand_aliases

export LC_ALL=en_US.UTF-8 

# https://en.wikipedia.org/wiki/Unicode_equivalence#Normalization
# https://www.effectiveperlprogramming.com/2011/09/normalize-your-perl-source/
alias nfd="perl -MUnicode::Normalize -CS -ne 'print NFD(\$_)'" 

# Normalize different unicode space characters to the same space
# https://stackoverflow.com/a/43640405
alias normalize_spaces="perl -CSDA -plE 's/[^\\S\\t]/ /g'"

function normalize_unicode() {
    cat - | normalize_spaces | nfd
}

OUT="out"
TRANS_OUT="$OUT/translations"
mkdir -p "$TRANS_OUT"

# https://tatoeba.org (Multilingual collaborative sentence translation database)
# https://tatoeba.org/eng/downloads
(cd "$TRANS_OUT"; wget --no-verbose --show-progress --timestamping "https://downloads.tatoeba.org/exports/sentences.tar.bz2")
(cd "$TRANS_OUT"; wget --no-verbose --show-progress --timestamping "https://downloads.tatoeba.org/exports/links.tar.bz2")

[ -s "$TRANS_OUT/sentences.tsv" ] || (tar xOjf "$TRANS_OUT/sentences.tar.bz2" sentences.csv | normalize_unicode > "$TRANS_OUT/sentences.tsv")
[ -s "$TRANS_OUT/links.tsv" ] || (tar xOjf "$TRANS_OUT/links.tar.bz2" links.csv | normalize_unicode > "$TRANS_OUT/links.tsv")

SQLITEDB="$TRANS_OUT/translations.sqlite"

if [ ! -s "$SQLITEDB" ]; then
# some sentences referenced by links might be invalid. That's ok, because some sentences were deduplicated, for example https://tatoeba.org/eng/sentences/show/3094
rm -f "$SQLITEDB"
cat << EOF | sqlite3 -batch "$SQLITEDB"

.bail on
PRAGMA foreign_keys = ON;


SELECT "importing all sentences...";
CREATE TABLE sentences(
  sentenceid INTEGER NOT NULL PRIMARY KEY,
  lang TEXT NOT NULL,
  sentence TEXT NOT NULL
);
CREATE INDEX sentences_sentence_lang ON sentences (lang);
CREATE INDEX sentences_sentence_sentence ON sentences (sentence);
.mode ascii
.separator "\t" "\n"
.import '$TRANS_OUT/sentences.tsv' sentences



SELECT "importing all links...";
CREATE TABLE links(
  sentenceid INTEGER NOT NULL,
  translationid INTEGER NOT NULL,

  PRIMARY KEY (sentenceid, translationid)
  FOREIGN KEY (sentenceid)
      REFERENCES sentences (sentenceid)
      ON UPDATE CASCADE
      ON DELETE CASCADE
  FOREIGN KEY (translationid)
      REFERENCES sentences (sentenceid)
      ON UPDATE CASCADE
      ON DELETE CASCADE
);
.mode ascii
.separator "\t" "\n"
.import '$TRANS_OUT/links.tsv' links



.headers off
SELECT "vacuum...";
VACUUM;

SELECT "checking database integrity...";
PRAGMA integrity_check;
EOF

fi


# translate single sentence
# select * from sentences s JOIN links l ON s.sentenceid = l.sentenceid JOIN sentences s2 ON s2.sentenceid = l.translationid where s.sentence='D''accord.' and s2.lang = 'deu' limit 10;

# finds duplicate sentences
sqlite3 -batch "$SQLITEDB" "select sentence, GROUP_CONCAT(sentenceid) from sentences GROUP BY sentence,lang HAVING COUNT(sentenceid) > 1 LIMIT 50"
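
For readers who prefer to see the core idea without the download and SQLite steps, the grouping can be sketched in Python. This is an approximation of the script's final query, not a replacement for it: `find_duplicates` is a hypothetical helper, the rows are made-up examples in the `(sentenceid, lang, sentence)` column order used above, and the whitespace normalization step is left out.

```python
import unicodedata
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def find_duplicates(rows: Iterable[Tuple[int, str, str]]) -> List[List[int]]:
    """Group sentence IDs whose text is identical after NFD normalization."""
    groups: Dict[Tuple[str, str], List[int]] = defaultdict(list)
    for sentence_id, lang, text in rows:
        key = (lang, unicodedata.normalize("NFD", text))
        groups[key].append(sentence_id)
    return [ids for ids in groups.values() if len(ids) > 1]


# Made-up rows: IDs 1 and 2 differ only in how the á is encoded.
rows = [
    (1, "vie", "Kh\u00e1ch"),
    (2, "vie", "Kha\u0301ch"),
    (3, "eng", "Hello"),
]
print(find_duplicates(rows))
```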
