feat: add swiss german as a language #164

Open: wants to merge 2 commits into base: main
290 changes: 204 additions & 86 deletions README.md

Large diffs are not rendered by default.

76 changes: 40 additions & 36 deletions accuracy-reports/aggregated-accuracy-values.csv

Large diffs are not rendered by default.

14 changes: 7 additions & 7 deletions accuracy-reports/lingua/Afrikaans.txt
@@ -2,18 +2,18 @@

Legend: 'low accuracy mode | high accuracy mode'

>>> Accuracy on average: 64.00% | 78.63%
>>> Accuracy on average: 63.70% | 78.53%

>> Detection of 1000 single words (average length: 8 chars)
Accuracy: 37.00% | 58.30%
Erroneously classified as DUTCH: 12.80% | 14.10%, GERMAN: 2.70% | 2.30%, LATIN: 2.60% | 2.00%, DANISH: 2.10% | 1.90%, ENGLISH: 2.10% | 1.90%, BOKMAL: 2.70% | 1.60%, WELSH: 0.70% | 1.10%, NYNORSK: 1.50% | 1.00%, ESTONIAN: 1.20% | 0.90%, SWEDISH: 1.50% | 0.90%, TSWANA: 1.50% | 0.70%, ITALIAN: 0.40% | 0.70%, ZULU: 1.20% | 0.70%, LITHUANIAN: 1.30% | 0.70%, BASQUE: 1.40% | 0.60%, SOTHO: 1.00% | 0.60%, OROMO: 2.00% | 0.60%, FRENCH: 1.40% | 0.60%, GANDA: 1.10% | 0.60%, TURKISH: 0.50% | 0.60%, SWAHILI: 0.60% | 0.50%, ROMANIAN: 0.90% | 0.50%, ESPERANTO: 1.00% | 0.50%, PORTUGUESE: 0.90% | 0.50%, XHOSA: 0.40% | 0.50%, TSONGA: 1.40% | 0.50%, YORUBA: 0.70% | 0.40%, POLISH: 1.10% | 0.40%, FINNISH: 2.30% | 0.40%, LATVIAN: 1.30% | 0.40%, SHONA: 0.80% | 0.40%, INDONESIAN: 0.40% | 0.30%, ICELANDIC: 0.90% | 0.30%, MALAY: 0.80% | 0.30%, SOMALI: 1.00% | 0.30%, IRISH: 0.40% | 0.30%, MAORI: 0.60% | 0.20%, CATALAN: 0.40% | 0.20%, TAGALOG: 1.30% | 0.10%, VIETNAMESE: 0.40% | 0.10%, CROATIAN: 0.50% | 0.10%, SLOVAK: 0.50% | 0.10%, SPANISH: 0.70% | 0.10%, BOSNIAN: 0.10% | 0.10%, HUNGARIAN: 0.60% | 0.10%, CZECH: 0.40% | 0.00%, ALBANIAN: 0.40% | 0.00%, AZERBAIJANI: 0.40% | 0.00%, SLOVENE: 0.10% | 0.00%
Accuracy: 36.60% | 58.10%
Erroneously classified as DUTCH: 12.70% | 13.90%, GERMAN: 2.60% | 2.10%, DANISH: 2.00% | 1.90%, ENGLISH: 2.00% | 1.90%, LATIN: 2.60% | 1.90%, BOKMAL: 2.60% | 1.50%, SWISS_GERMAN: 1.60% | 1.20%, WELSH: 0.70% | 1.10%, NYNORSK: 1.40% | 1.00%, ESTONIAN: 1.20% | 0.90%, SWEDISH: 1.50% | 0.90%, TSWANA: 1.50% | 0.70%, ITALIAN: 0.40% | 0.70%, ZULU: 1.10% | 0.70%, LITHUANIAN: 1.30% | 0.70%, BASQUE: 1.40% | 0.60%, SOTHO: 1.00% | 0.60%, OROMO: 1.90% | 0.60%, FRENCH: 1.40% | 0.60%, GANDA: 1.10% | 0.60%, TURKISH: 0.50% | 0.60%, ROMANIAN: 0.90% | 0.50%, ESPERANTO: 1.00% | 0.50%, PORTUGUESE: 0.90% | 0.50%, XHOSA: 0.40% | 0.50%, TSONGA: 1.40% | 0.50%, SWAHILI: 0.60% | 0.40%, YORUBA: 0.70% | 0.40%, POLISH: 1.00% | 0.40%, FINNISH: 2.30% | 0.40%, LATVIAN: 1.30% | 0.40%, MALAY: 0.70% | 0.30%, SHONA: 0.80% | 0.30%, SOMALI: 1.00% | 0.30%, IRISH: 0.40% | 0.30%, INDONESIAN: 0.40% | 0.20%, MAORI: 0.60% | 0.20%, ICELANDIC: 0.80% | 0.20%, CATALAN: 0.40% | 0.20%, TAGALOG: 1.30% | 0.10%, CROATIAN: 0.50% | 0.10%, SLOVAK: 0.50% | 0.10%, VIETNAMESE: 0.30% | 0.10%, SPANISH: 0.70% | 0.10%, BOSNIAN: 0.10% | 0.10%, HUNGARIAN: 0.60% | 0.10%, CZECH: 0.40% | 0.00%, ALBANIAN: 0.40% | 0.00%, AZERBAIJANI: 0.40% | 0.00%, SLOVENE: 0.10% | 0.00%

>> Detection of 1000 word pairs (average length: 16 chars)
Accuracy: 62.20% | 80.80%
Erroneously classified as DUTCH: 13.30% | 11.00%, ENGLISH: 0.80% | 1.30%, GERMAN: 2.70% | 1.10%, LATIN: 1.00% | 0.80%, DANISH: 1.30% | 0.70%, BOKMAL: 1.90% | 0.40%, ESTONIAN: 1.40% | 0.30%, YORUBA: 0.80% | 0.30%, NYNORSK: 0.40% | 0.30%, SOTHO: 0.60% | 0.30%, FINNISH: 1.70% | 0.20%, TSONGA: 0.70% | 0.20%, SWEDISH: 1.40% | 0.20%, WELSH: 0.70% | 0.20%, GANDA: 0.40% | 0.20%, ITALIAN: 0.40% | 0.20%, OROMO: 0.40% | 0.20%, FRENCH: 0.50% | 0.10%, CATALAN: 0.10% | 0.10%, PORTUGUESE: 0.80% | 0.10%, HUNGARIAN: 0.20% | 0.10%, ESPERANTO: 0.50% | 0.10%, MALAY: 0.20% | 0.10%, SWAHILI: 0.00% | 0.10%, SHONA: 0.50% | 0.10%, TAGALOG: 0.40% | 0.10%, TURKISH: 0.10% | 0.10%, TSWANA: 0.20% | 0.10%, SPANISH: 0.10% | 0.10%, BOSNIAN: 0.10% | 0.10%, ROMANIAN: 0.40% | 0.00%, SLOVENE: 0.10% | 0.00%, MAORI: 0.30% | 0.00%, ZULU: 0.10% | 0.00%, LITHUANIAN: 0.70% | 0.00%, INDONESIAN: 0.30% | 0.00%, POLISH: 0.40% | 0.00%, BASQUE: 0.40% | 0.00%, AZERBAIJANI: 0.20% | 0.00%, XHOSA: 0.20% | 0.00%, LATVIAN: 0.40% | 0.00%, CZECH: 0.10% | 0.00%, SOMALI: 0.20% | 0.00%, ALBANIAN: 0.30% | 0.00%, SLOVAK: 0.10% | 0.00%
Accuracy: 61.70% | 80.70%
Erroneously classified as DUTCH: 13.10% | 11.00%, ENGLISH: 0.80% | 1.30%, GERMAN: 2.60% | 1.00%, LATIN: 0.90% | 0.80%, DANISH: 1.30% | 0.60%, SWISS_GERMAN: 1.10% | 0.50%, BOKMAL: 1.90% | 0.40%, ESTONIAN: 1.40% | 0.30%, YORUBA: 0.80% | 0.30%, SOTHO: 0.60% | 0.30%, FINNISH: 1.70% | 0.20%, TSONGA: 0.70% | 0.20%, SWEDISH: 1.40% | 0.20%, WELSH: 0.60% | 0.20%, GANDA: 0.40% | 0.20%, ITALIAN: 0.40% | 0.20%, OROMO: 0.40% | 0.20%, FRENCH: 0.50% | 0.10%, CATALAN: 0.10% | 0.10%, PORTUGUESE: 0.80% | 0.10%, HUNGARIAN: 0.20% | 0.10%, NYNORSK: 0.40% | 0.10%, ESPERANTO: 0.50% | 0.10%, MALAY: 0.20% | 0.10%, SWAHILI: 0.00% | 0.10%, SHONA: 0.50% | 0.10%, TAGALOG: 0.40% | 0.10%, TURKISH: 0.10% | 0.10%, TSWANA: 0.20% | 0.10%, SPANISH: 0.10% | 0.10%, BOSNIAN: 0.10% | 0.10%, ROMANIAN: 0.40% | 0.00%, SLOVENE: 0.10% | 0.00%, MAORI: 0.30% | 0.00%, ZULU: 0.10% | 0.00%, LITHUANIAN: 0.70% | 0.00%, INDONESIAN: 0.30% | 0.00%, POLISH: 0.40% | 0.00%, BASQUE: 0.40% | 0.00%, AZERBAIJANI: 0.20% | 0.00%, XHOSA: 0.20% | 0.00%, LATVIAN: 0.40% | 0.00%, CZECH: 0.10% | 0.00%, SOMALI: 0.10% | 0.00%, ALBANIAN: 0.30% | 0.00%, SLOVAK: 0.10% | 0.00%

>> Detection of 1000 sentences (average length: 102 chars)
Accuracy: 92.80% | 96.80%
Erroneously classified as DUTCH: 5.10% | 2.60%, GERMAN: 0.20% | 0.20%, LATIN: 0.10% | 0.10%, ENGLISH: 0.20% | 0.10%, DANISH: 0.00% | 0.10%, SOTHO: 0.00% | 0.10%, GANDA: 0.10% | 0.00%, BOKMAL: 0.20% | 0.00%, WELSH: 0.20% | 0.00%, ESTONIAN: 0.40% | 0.00%, CATALAN: 0.10% | 0.00%, TSWANA: 0.10% | 0.00%, OROMO: 0.10% | 0.00%, FINNISH: 0.10% | 0.00%, HUNGARIAN: 0.10% | 0.00%, TSONGA: 0.10% | 0.00%, YORUBA: 0.10% | 0.00%
Erroneously classified as DUTCH: 5.10% | 2.60%, LATIN: 0.10% | 0.10%, ENGLISH: 0.20% | 0.10%, GERMAN: 0.20% | 0.10%, DANISH: 0.00% | 0.10%, SOTHO: 0.00% | 0.10%, SWISS_GERMAN: 0.10% | 0.10%, GANDA: 0.10% | 0.00%, BOKMAL: 0.20% | 0.00%, WELSH: 0.20% | 0.00%, ESTONIAN: 0.40% | 0.00%, CATALAN: 0.10% | 0.00%, TSWANA: 0.10% | 0.00%, OROMO: 0.10% | 0.00%, HUNGARIAN: 0.10% | 0.00%, TSONGA: 0.10% | 0.00%, YORUBA: 0.10% | 0.00%

>> Exact values: 64.0 37.0 62.2 92.80000000000001 78.63333333333334 58.3 80.80000000000001 96.8
>> Exact values: 63.70000000000001 36.6 61.7 92.80000000000001 78.53333333333335 58.099999999999994 80.7 96.8
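Note on the `>> Exact values:` lines: each appears to hold eight numbers, namely the average followed by the single-word, word-pair and sentence accuracies, first for low accuracy mode and then for high accuracy mode. This layout is inferred from the figures in this report, not taken from the project's documentation, and the function and key names below are hypothetical. A minimal Python sketch, under that assumption, that parses such a line and checks that each average is the mean of its three categories:

```python
import math

PREFIX = ">> Exact values:"
# Assumed column order within each mode (inferred from the report, not documented).
KEYS = ("average", "single_words", "word_pairs", "sentences")


def parse_exact_values(line: str) -> dict:
    """Split an 'Exact values' line into low and high accuracy mode figures."""
    if not line.startswith(PREFIX):
        raise ValueError("not an 'Exact values' line")
    values = [float(token) for token in line[len(PREFIX):].split()]
    if len(values) != 8:
        raise ValueError(f"expected 8 values, got {len(values)}")
    return {
        "low": dict(zip(KEYS, values[:4])),
        "high": dict(zip(KEYS, values[4:])),
    }


def mean_of_categories(mode: dict) -> float:
    """Average the three per-category accuracies of one mode."""
    return (mode["single_words"] + mode["word_pairs"] + mode["sentences"]) / 3


def is_consistent(line: str) -> bool:
    """True if both reported averages match the mean of their categories."""
    report = parse_exact_values(line)
    return all(
        math.isclose(mode["average"], mean_of_categories(mode), abs_tol=1e-9)
        for mode in report.values()
    )


if __name__ == "__main__":
    old = ">> Exact values: 64.0 37.0 62.2 92.80000000000001 78.63333333333334 58.3 80.80000000000001 96.8"
    new = ">> Exact values: 63.70000000000001 36.6 61.7 92.80000000000001 78.53333333333335 58.099999999999994 80.7 96.8"
    for label, line in (("old", old), ("new", new)):
        print(label, is_consistent(line), parse_exact_values(line)["low"]["average"])
```

The comparison uses a tolerance rather than `==` because the report values carry floating-point noise such as `92.80000000000001`.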
8 changes: 4 additions & 4 deletions accuracy-reports/lingua/Albanian.txt
@@ -2,11 +2,11 @@

Legend: 'low accuracy mode | high accuracy mode'

>>> Accuracy on average: 79.67% | 87.60%
>>> Accuracy on average: 79.63% | 87.57%

>> Detection of 1000 single words (average length: 8 chars)
Accuracy: 54.10% | 68.50%
Erroneously classified as LATIN: 1.70% | 3.00%, ESPERANTO: 2.00% | 1.80%, ITALIAN: 1.30% | 1.50%, BASQUE: 3.10% | 1.50%, ROMANIAN: 1.10% | 1.40%, ENGLISH: 0.80% | 1.20%, FINNISH: 0.80% | 1.00%, PORTUGUESE: 1.40% | 1.00%, CATALAN: 0.90% | 1.00%, TSWANA: 1.80% | 0.90%, LITHUANIAN: 1.60% | 0.90%, GERMAN: 0.90% | 0.90%, MALAY: 1.20% | 0.90%, SWAHILI: 1.50% | 0.90%, BOKMAL: 1.20% | 0.90%, ESTONIAN: 1.20% | 0.80%, SHONA: 1.20% | 0.70%, TSONGA: 1.20% | 0.70%, TURKISH: 1.10% | 0.70%, SOTHO: 1.40% | 0.60%, TAGALOG: 1.10% | 0.60%, OROMO: 0.70% | 0.60%, ZULU: 0.80% | 0.60%, BOSNIAN: 0.60% | 0.60%, FRENCH: 1.60% | 0.60%, DANISH: 0.50% | 0.60%, INDONESIAN: 0.50% | 0.50%, YORUBA: 0.50% | 0.50%, SPANISH: 0.80% | 0.40%, ICELANDIC: 0.60% | 0.40%, WELSH: 0.60% | 0.40%, HUNGARIAN: 0.50% | 0.30%, AFRIKAANS: 0.60% | 0.30%, LATVIAN: 0.50% | 0.30%, SLOVENE: 0.80% | 0.30%, SWEDISH: 1.00% | 0.30%, CROATIAN: 0.30% | 0.30%, POLISH: 0.60% | 0.30%, MAORI: 1.60% | 0.20%, XHOSA: 0.80% | 0.20%, IRISH: 0.20% | 0.20%, GANDA: 0.00% | 0.20%, NYNORSK: 0.40% | 0.10%, SLOVAK: 0.20% | 0.10%, CZECH: 0.30% | 0.10%, SOMALI: 0.60% | 0.10%, AZERBAIJANI: 0.70% | 0.10%, DUTCH: 0.70% | 0.00%, VIETNAMESE: 0.40% | 0.00%
Accuracy: 54.00% | 68.40%
Erroneously classified as LATIN: 1.70% | 3.00%, ESPERANTO: 2.00% | 1.80%, ITALIAN: 1.30% | 1.50%, BASQUE: 3.10% | 1.50%, ROMANIAN: 1.10% | 1.40%, ENGLISH: 0.80% | 1.20%, FINNISH: 0.80% | 1.00%, PORTUGUESE: 1.40% | 1.00%, CATALAN: 0.90% | 1.00%, TSWANA: 1.80% | 0.90%, LITHUANIAN: 1.60% | 0.90%, MALAY: 1.20% | 0.90%, SWAHILI: 1.50% | 0.90%, ESTONIAN: 1.20% | 0.80%, GERMAN: 0.80% | 0.80%, BOKMAL: 1.10% | 0.80%, SHONA: 1.20% | 0.70%, TSONGA: 1.20% | 0.70%, TURKISH: 1.10% | 0.70%, SOTHO: 1.40% | 0.60%, TAGALOG: 1.10% | 0.60%, OROMO: 0.70% | 0.60%, ZULU: 0.80% | 0.60%, BOSNIAN: 0.60% | 0.60%, DANISH: 0.50% | 0.60%, INDONESIAN: 0.50% | 0.50%, SWISS_GERMAN: 0.30% | 0.50%, FRENCH: 1.60% | 0.50%, YORUBA: 0.50% | 0.50%, SPANISH: 0.80% | 0.40%, ICELANDIC: 0.60% | 0.40%, WELSH: 0.60% | 0.40%, HUNGARIAN: 0.50% | 0.30%, LATVIAN: 0.50% | 0.30%, SLOVENE: 0.80% | 0.30%, SWEDISH: 1.00% | 0.30%, CROATIAN: 0.30% | 0.30%, POLISH: 0.60% | 0.30%, MAORI: 1.60% | 0.20%, XHOSA: 0.80% | 0.20%, IRISH: 0.20% | 0.20%, AFRIKAANS: 0.60% | 0.20%, GANDA: 0.00% | 0.20%, NYNORSK: 0.40% | 0.10%, SLOVAK: 0.20% | 0.10%, CZECH: 0.30% | 0.10%, SOMALI: 0.60% | 0.10%, AZERBAIJANI: 0.70% | 0.10%, DUTCH: 0.70% | 0.00%, VIETNAMESE: 0.40% | 0.00%

>> Detection of 1000 word pairs (average length: 15 chars)
Accuracy: 86.20% | 94.80%
@@ -16,4 +16,4 @@ Erroneously classified as LATIN: 1.10% | 0.60%, ENGLISH: 0.40% | 0.60%, ESPERANT
Accuracy: 98.70% | 99.50%
Erroneously classified as LATIN: 0.30% | 0.20%, FRENCH: 0.10% | 0.10%, ESPERANTO: 0.20% | 0.10%, TSONGA: 0.10% | 0.10%, SPANISH: 0.10% | 0.00%, TAGALOG: 0.10% | 0.00%, BASQUE: 0.10% | 0.00%, TURKISH: 0.10% | 0.00%, SWAHILI: 0.10% | 0.00%, ITALIAN: 0.10% | 0.00%

>> Exact values: 79.66666666666667 54.1 86.2 98.7 87.60000000000001 68.5 94.8 99.5
>> Exact values: 79.63333333333333 54.0 86.2 98.7 87.56666666666666 68.4 94.8 99.5
2 changes: 1 addition & 1 deletion accuracy-reports/lingua/Amharic.txt
@@ -14,6 +14,6 @@ Erroneously classified as TIGRINYA: 1.50% | 0.30%

>> Detection of 998 sentences (average length: 150 chars)
Accuracy: 94.39% | 96.09%
Erroneously classified as TIGRINYA: 4.01% | 2.40%, ENGLISH: 0.10% | 0.50%, FRENCH: 0.20% | 0.10%, SWAHILI: 0.10% | 0.10%, YORUBA: 0.00% | 0.10%, LATIN: 0.10% | 0.10%, MAORI: 0.20% | 0.10%, MALAY: 0.00% | 0.10%, FINNISH: 0.00% | 0.10%, XHOSA: 0.00% | 0.10%, ZULU: 0.10% | 0.10%, GERMAN: 0.00% | 0.10%, UNKNOWN: 0.30% | 0.00%, SOMALI: 0.10% | 0.00%, AFRIKAANS: 0.10% | 0.00%, IRISH: 0.10% | 0.00%, BOSNIAN: 0.10% | 0.00%, AZERBAIJANI: 0.10% | 0.00%
Erroneously classified as TIGRINYA: 4.01% | 2.40%, ENGLISH: 0.10% | 0.50%, FRENCH: 0.20% | 0.10%, SWAHILI: 0.10% | 0.10%, YORUBA: 0.00% | 0.10%, LATIN: 0.10% | 0.10%, MALAY: 0.00% | 0.10%, FINNISH: 0.00% | 0.10%, XHOSA: 0.00% | 0.10%, ZULU: 0.10% | 0.10%, SWISS_GERMAN: 0.00% | 0.10%, GERMAN: 0.00% | 0.10%, UNKNOWN: 0.30% | 0.00%, SOMALI: 0.10% | 0.00%, AFRIKAANS: 0.10% | 0.00%, MAORI: 0.20% | 0.00%, IRISH: 0.10% | 0.00%, BOSNIAN: 0.10% | 0.00%, AZERBAIJANI: 0.10% | 0.00%

>> Exact values: 95.32959251837008 93.10000000000001 98.5 94.38877755511022 97.69739478957916 97.3 99.7 96.09218436873748
8 changes: 4 additions & 4 deletions accuracy-reports/lingua/Azerbaijani.txt
@@ -2,11 +2,11 @@

Legend: 'low accuracy mode | high accuracy mode'

>>> Accuracy on average: 81.93% | 89.50%
>>> Accuracy on average: 81.90% | 89.50%

>> Detection of 1000 single words (average length: 8 chars)
Accuracy: 70.90% | 77.20%
Erroneously classified as TURKISH: 9.10% | 8.00%, LATIN: 0.60% | 0.90%, BASQUE: 1.00% | 0.80%, ALBANIAN: 0.80% | 0.80%, ENGLISH: 0.40% | 0.70%, TAGALOG: 0.90% | 0.70%, OROMO: 1.10% | 0.60%, ZULU: 0.90% | 0.60%, ESPERANTO: 0.90% | 0.60%, LITHUANIAN: 0.70% | 0.60%, SOMALI: 0.60% | 0.50%, DANISH: 0.10% | 0.50%, XHOSA: 0.30% | 0.50%, SWAHILI: 0.10% | 0.50%, MALAY: 0.60% | 0.40%, YORUBA: 0.80% | 0.40%, GANDA: 0.40% | 0.40%, TSWANA: 0.80% | 0.40%, ESTONIAN: 0.50% | 0.40%, TSONGA: 0.60% | 0.40%, PORTUGUESE: 0.60% | 0.30%, ITALIAN: 0.40% | 0.30%, ROMANIAN: 0.20% | 0.30%, SPANISH: 0.40% | 0.30%, BOSNIAN: 0.40% | 0.30%, INDONESIAN: 0.30% | 0.20%, SHONA: 0.30% | 0.20%, NYNORSK: 0.20% | 0.20%, SWEDISH: 0.30% | 0.20%, SLOVENE: 0.30% | 0.20%, WELSH: 0.20% | 0.20%, DUTCH: 0.00% | 0.20%, FINNISH: 0.10% | 0.20%, SOTHO: 0.60% | 0.10%, MAORI: 0.40% | 0.10%, IRISH: 0.30% | 0.10%, FRENCH: 0.10% | 0.10%, BOKMAL: 0.50% | 0.10%, CROATIAN: 0.20% | 0.10%, ICELANDIC: 0.40% | 0.10%, GERMAN: 0.50% | 0.10%, AFRIKAANS: 0.20% | 0.10%, LATVIAN: 0.20% | 0.10%, HUNGARIAN: 0.30% | 0.00%, POLISH: 0.20% | 0.00%, CZECH: 0.10% | 0.00%, SLOVAK: 0.10% | 0.00%, CATALAN: 0.10% | 0.00%
Accuracy: 70.80% | 77.20%
Erroneously classified as TURKISH: 9.10% | 8.00%, LATIN: 0.60% | 0.90%, BASQUE: 0.90% | 0.80%, ALBANIAN: 0.80% | 0.80%, ENGLISH: 0.40% | 0.70%, TAGALOG: 0.90% | 0.70%, OROMO: 1.10% | 0.60%, ZULU: 0.90% | 0.60%, ESPERANTO: 0.90% | 0.60%, LITHUANIAN: 0.70% | 0.60%, SOMALI: 0.60% | 0.50%, XHOSA: 0.30% | 0.50%, SWAHILI: 0.10% | 0.50%, MALAY: 0.60% | 0.40%, YORUBA: 0.80% | 0.40%, GANDA: 0.40% | 0.40%, TSWANA: 0.80% | 0.40%, ESTONIAN: 0.50% | 0.40%, TSONGA: 0.60% | 0.40%, DANISH: 0.10% | 0.40%, PORTUGUESE: 0.60% | 0.30%, ITALIAN: 0.40% | 0.30%, ROMANIAN: 0.20% | 0.30%, SPANISH: 0.30% | 0.30%, BOSNIAN: 0.40% | 0.30%, INDONESIAN: 0.30% | 0.20%, SHONA: 0.30% | 0.20%, NYNORSK: 0.20% | 0.20%, SWEDISH: 0.30% | 0.20%, SLOVENE: 0.30% | 0.20%, WELSH: 0.20% | 0.20%, DUTCH: 0.00% | 0.20%, FINNISH: 0.10% | 0.20%, SOTHO: 0.60% | 0.10%, SWISS_GERMAN: 0.40% | 0.10%, MAORI: 0.40% | 0.10%, IRISH: 0.30% | 0.10%, FRENCH: 0.00% | 0.10%, BOKMAL: 0.50% | 0.10%, CROATIAN: 0.20% | 0.10%, ICELANDIC: 0.40% | 0.10%, GERMAN: 0.50% | 0.10%, AFRIKAANS: 0.20% | 0.10%, LATVIAN: 0.20% | 0.10%, HUNGARIAN: 0.30% | 0.00%, POLISH: 0.20% | 0.00%, CZECH: 0.10% | 0.00%, SLOVAK: 0.10% | 0.00%, CATALAN: 0.10% | 0.00%

>> Detection of 1000 word pairs (average length: 16 chars)
Accuracy: 78.50% | 92.30%
@@ -16,4 +16,4 @@ Erroneously classified as TURKISH: 12.00% | 4.70%, ITALIAN: 0.20% | 0.30%, SWAHI
Accuracy: 96.40% | 99.00%
Erroneously classified as TURKISH: 2.40% | 0.80%, TAGALOG: 0.20% | 0.10%, SOTHO: 0.00% | 0.10%, XHOSA: 0.20% | 0.00%, DANISH: 0.20% | 0.00%, SPANISH: 0.10% | 0.00%, BASQUE: 0.10% | 0.00%, LITHUANIAN: 0.10% | 0.00%, SWEDISH: 0.10% | 0.00%, PORTUGUESE: 0.10% | 0.00%, YORUBA: 0.10% | 0.00%

>> Exact values: 81.93333333333332 70.89999999999999 78.5 96.39999999999999 89.5 77.2 92.30000000000001 99.0
>> Exact values: 81.89999999999999 70.8 78.5 96.39999999999999 89.5 77.2 92.30000000000001 99.0