Multiple dictionaries per file #1069

forthrin · 2016-01-20T18:33:29Z

I use Sublime Text to write fiction in which the same text file may contain several languages, especially in lines spoken by characters of different nationalities, for example Norwegian, English and Japanese.

When using spell checking, it seems I must settle on a single language, which means text in other languages is flagged as "incorrectly spelled" and marked in red. Annoying and confusing.

What would be the best way to overcome this? Can support be added for multiple dictionaries per file, for example?

I do most of this writing in the Fountainhead package, if that makes it easier to come up with a solution (such as handling spoken lines in a particular way).

titoBouzout · 2016-01-21T02:13:24Z

The issue with this is that there is no API to spell-check, but basically everything else (selecting lines, highlighting, changing words) is possible.

oliva · 2016-08-10T11:14:39Z

The same happens for me when programming or writing translation files where the variables are in English and the strings are in a different language.

evandrocoan · 2016-10-10T23:38:19Z

As I write in both English and Portuguese, I combined the English dictionary with the Portuguese dictionary. So now I get spell checking in both languages. You may find this dictionary here:

https://github.com/evandrocoan/SublimeTextStudio/tree/develop/MultiLingual%20Dictionary

Dxhs · 2016-11-02T11:39:00Z

@evandrocoan I just did this with my Norwegian and English. 322044 lines + 470122. Took me 5 minutes to c+p it with kate on linux. Lol

titoBouzout · 2016-11-02T17:08:39Z

Can you please briefly but precisely describe how you do that? I'm interested. Thanks!

evandrocoan · 2016-11-02T17:52:35Z

Steps to merge 2 dictionary files:

1. Download new dictionaries from: https://github.com/titoBouzout/Dictionaries
2. Duplicate the file EN_US.txt as EN_US_MY_LANG.txt
3. Duplicate the file EN_US.aff as EN_US_MY_LANG.aff
4. Duplicate the file EN_US.dic as EN_US_MY_LANG.dic
5. Open your MY_LANG.txt and append its contents to EN_US_MY_LANG.txt.
6. Open your MY_LANG.aff and merge its contents into EN_US_MY_LANG.aff by hand, using your own judgment.
7. Open your MY_LANG.dic and append its contents to EN_US_MY_LANG.dic, then update the first line of EN_US_MY_LANG.dic with the correct number of words in the new file.
8. Set EN_US_MY_LANG as your default spell-checking language.

Now you have spell checking in 2 languages. But there are some downsides:

1. When merging the .aff files, you need to take care how you do it, otherwise it may crash Sublime Text.
2. The misspelling suggestions will often be inaccurate, as you now have distinct languages bound by the same spelling and suggestion rules.
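The .dic part of the steps above can be sketched in Python. This is only an illustration, not the procedure anyone in this thread actually ran: the function names `merge_dic` and `merge_dic_files` are made up for the example, and it assumes both files share one encoding and that the affix flags after `/` have already been reconciled in the merged .aff file (which this sketch does not attempt).

```python
from pathlib import Path


def merge_dic(first: str, second: str) -> str:
    """Merge the text of two Hunspell .dic files into one (hypothetical helper).

    The first line of a .dic file holds the entry count, so that line is
    dropped from both inputs and recomputed for the merged result.
    """
    def entries(text: str) -> list[str]:
        lines = [line.strip() for line in text.splitlines()]
        lines = [line for line in lines if line]
        # Skip the leading count line if there is one.
        if lines and lines[0].isdigit():
            lines = lines[1:]
        return lines

    merged = []
    seen = set()
    for entry in entries(first) + entries(second):
        if entry not in seen:  # drop exact duplicates, keep original order
            seen.add(entry)
            merged.append(entry)
    return "\n".join([str(len(merged)), *merged]) + "\n"


def merge_dic_files(a: Path, b: Path, out: Path, encoding: str = "utf-8") -> None:
    """Read two .dic files, merge them, and write the result to `out`.

    Assumption: both inputs use `encoding`; check the SET line of each .aff.
    """
    out.write_text(
        merge_dic(a.read_text(encoding=encoding), b.read_text(encoding=encoding)),
        encoding=encoding,
    )
```

For example, `merge_dic_files(Path("EN_US.dic"), Path("MY_LANG.dic"), Path("EN_US_MY_LANG.dic"))` would cover step 7, including the updated word count on the first line.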

Dxhs · 2016-11-02T18:44:46Z

@titoBouzout On Linux, I used the cat command with shell redirection (>) into my output file:
cat english.dic norwegian.dic > output.dic
If I use my standard text editor, it crashes. The shell did it in milliseconds. I suggest using the shell or Cmd.

titoBouzout · 2016-11-03T13:27:29Z

Thanks for the explanation! I'll try :)

gustavobittencourt · 2016-11-10T03:33:51Z

Thank you, @evandrocoan!

I'll try to install your EN_PT dictionary here!

Kristinita · 2016-11-15T11:06:50Z

I cannot merge the Russian and English dictionaries; I get bad results. See this part of a Hunspell contributor's answer:

Bilingual spellchecker is not supported, at least not in reliable way. Merging dictionaries should be out of question.

Thanks.

evandrocoan · 2016-11-15T15:26:23Z

@dimztimz is correct on this:

Instead, at API level you can instantiate multiple objects of the spellchecker with different languages. Then you can check the word in each object. This is the most reliable way for now.

That, however, has to be done by the Sublime Text spell-checking core, so we need to wait for them to implement this feature to get the best functionality.

For now I get good results, with some disadvantages, from merging the EN and PT dictionaries. However, these two languages are fairly similar. English and Russian would not be easy to merge, if it is possible at all.
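The approach quoted above (one spell-checker object per language, each word checked against all of them) can be sketched as follows. This is a minimal illustration, not Hunspell's or Sublime Text's API: `WordListChecker` is a hypothetical stand-in that only looks words up in a set, where a real checker would also apply affix rules.

```python
import re


class WordListChecker:
    """Hypothetical stand-in for one spell-checker object (one per language)."""

    def __init__(self, words):
        self._words = {w.casefold() for w in words}

    def check(self, word: str) -> bool:
        """Return True if this language's word list contains the word."""
        return word.casefold() in self._words


def misspelled(text: str, checkers) -> list:
    """Return the words in `text` that none of the checkers accept."""
    # [^\W\d_]+ matches runs of letters in any script (Latin, Cyrillic, ...).
    words = re.findall(r"[^\W\d_]+", text)
    return [w for w in words if not any(c.check(w) for c in checkers)]


english = WordListChecker(["is", "this", "in", "colors"])
russian = WordListChecker(["как", "дела"])

# A word is flagged only when every language rejects it.
print(misspelled("Is htis in colors, как дела", [english, russian]))
```

Because each language keeps its own object, no affix rules have to be merged; the trade-off is that a typo in one language that happens to be a valid word in another will not be flagged.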

Kristinita · 2017-02-07T17:47:05Z

Other programs use Hunspell too, just as Sublime Text does, so you can try to find a dictionary extension made for another program. I tried using the Firefox Russian-English bilingual add-on in Sublime Text, and it worked for me.

But for single-language spell checking, the LanguageTool package is the best solution, with many nice features.

Thanks.

ghost · 2017-07-12T22:40:23Z

And I thought I had it with

{
    "dictionary":
    [
        "Packages/Language - English/en_US.dic",
        "Packages/Language - Other/Portuguese (European).dic"
    ]
}

Alas, no.

BenjaminSchaaf · 2021-12-07T03:23:04Z

Fixed in build 4123. "dictionary" can now be provided a list.

eugenesvk · 2021-12-09T13:58:53Z

Fixed in build 4123. "dictionary" can now be provided a list.

This doesn't work very reliably. Please see the quick check below of different dictionary combos (taken from here) and note how en_US.dic is a spoiler, though only for the Russian one :)
(changing the order of dictionaries doesn't seem to matter)

This is my complete settings file in a new portable Sublime on Windows; scroll horizontally to see the results for different dictionary combos.

{
    "ignored_packages": ["Vintage",],
    "spell_check": true,
    "dictionary": [
        "Packages/Language - English/en_US.dic",       // 1
        "Packages/Language - English/en_GB.dic",       // 2
        "Packages/User/Dictionaries/German_de_DE.dic", // 3
        "Packages/User/Dictionaries/Russian.dic",      // 4
    ]
}
// Ln Text                                	1US	2GB	3DE	4Ru	1+2	1+3	1+4	2+3	2+4	3+4	1+3+4	2+3+4	1+2+3+4
// US "Is htis in colors?  That's insane!"	+  	+  	*  	*  	+  	+  	+  	+  	+  	*  	+    	+    	+
// GB "Is htis in colours? That's insane!"	+  	+  	*  	*  	+  	+  	+  	+  	+  	*  	+    	+    	+
// De "Rechtschreibe/stylistsische Fehler"	*  	*  	+  	*  	*  	+  	*  	+  	*  	+  	+    	+    	+
// Ru "Превед, медвед, как дела"          	-  	*  	*  	+  	-  	-  	-  	*  	+  	+  	-    	+    	-

// + highlights spellchecking errors in the Row's language (from the perspective of the Column's dictionary/ies)
// * same as +, but for a mismatching language (~the whole line is highlighted)
// - nothing is highlighted, even spellchecking errors

BenjaminSchaaf · 2021-12-10T05:07:17Z

It's working fine here with the same dictionaries:

eugenesvk · 2021-12-10T05:32:48Z

And by "working fine" you mean that it's not highlighting any spelling mistakes in the Russian line? All the example lines contain spelling errors, so no line should ever be free from red no matter the dictionary/ies

eugenesvk · 2021-12-10T05:40:11Z

I think it's due to a wrong first line in the dictionary's affix file: it says SET ISO8859-1, but it should be UTF8.
I'm not sure whether the dictionary has to be regenerated. It doesn't seem to be saved in UTF8, so likely yes, though a simple text replace seems to fix the surface issue of no highlights in the Ru line (otherwise I haven't done any tests of how well any of these combos work).
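To look at what an affix file declares, something like the following Python sketch could help. It is an assumption-laden heuristic, not part of Hunspell or Sublime Text: the function names are invented for the example, and "looks like UTF-8" only means the .dic bytes contain non-ASCII characters and decode cleanly as UTF-8.

```python
from typing import Optional


def declared_encoding(aff_bytes: bytes) -> Optional[str]:
    """Return the encoding named on the first SET line of an .aff file, if any."""
    # Strip a UTF-8 byte order mark so it cannot hide a leading SET line.
    if aff_bytes.startswith(b"\xef\xbb\xbf"):
        aff_bytes = aff_bytes[3:]
    for raw in aff_bytes.splitlines():
        # The SET line itself is ASCII, so Latin-1 decoding never fails here.
        parts = raw.decode("latin-1").split()
        if len(parts) >= 2 and parts[0] == "SET":
            return parts[1]
    return None


def looks_like_utf8(data: bytes) -> bool:
    """True if `data` has non-ASCII bytes and is valid UTF-8 (heuristic)."""
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return False
    return not data.isascii()


def encoding_mismatch(aff_bytes: bytes, dic_bytes: bytes) -> bool:
    """Flag a .dic that looks like UTF-8 while the .aff declares something else."""
    declared = declared_encoding(aff_bytes)
    declares_utf8 = declared is not None and declared.upper() in {"UTF-8", "UTF8"}
    return looks_like_utf8(dic_bytes) and not declares_utf8
```

Reading both files in binary mode (for example with `Path("en_US.aff").read_bytes()`) avoids guessing an encoding before the SET line has been inspected.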

Out of curiosity, are you combining the language files behind the scenes (having to deal with different affix schemes), or are you using a simpler API, sending each text to each dic and then combining the results somehow?

BenjaminSchaaf · 2021-12-10T05:43:55Z

Ah yes, the en_US dictionary seems to be wrongly configured. It's unrelated to this issue, but I'll put in a fix.

Out of curiosity, are you combining the language files behind the scenes (having to deal with different affix schemes), or are you using a simpler API, sending each text to each dic and then combining the results somehow?

There's no way to combine the languages (easily), so we just check if a sub-word is spelt correctly according to any of the listed dictionaries.

BenjaminSchaaf · 2021-12-10T05:58:27Z

Upon further inspection we just seem to be handling the encoding wrong.

MarllonMenezes · 2022-04-06T18:43:27Z

As I write in both English and Portuguese, I combined the English dictionary with the Portuguese dictionary. So now I have spell checking in both languages. You can find this dictionary here:

https://github.com/evandrocoan/SublimeTextStudio/tree/develop/MultiLingual%20Dictionary

Do you still have this dictionary?

evandrocoan · 2022-04-06T20:41:14Z

I moved it to this link: https://github.com/evandrocoan/LanguageEnglishAndPortuguese

stdedos · 2022-05-25T21:34:55Z

Is this considered to be solved? Because I still have problems with it.

I have also tried with a UTF8 dic, no luck 😕

BenjaminSchaaf · 2022-05-26T01:37:16Z

@stdedos the encoding problem was fixed in build 4125, so yes this should all be working.

stdedos · 2022-05-26T05:44:24Z

I am using https://github.com/titoBouzout/Dictionaries/blob/master/Greek.dic with

	"dictionary": [
		"Packages/Language - English/en_US.dic", // 1
		"Packages/Language - English/en_GB.dic", // 2
		"Packages/Greek.dic",                    // 3
		"Packages/Greek_UTF8.dic",               // 3
		"Packages/User/Greek.dic",               // 3
	],

and text

What is Lorem Ipsum?
Lorem Ipsum is simply dummy text of the printing and typesetting industry. Lorem Ipsum has been the industry's standard dummy text ever since the 1500s, when an unknown printer took a galley of type and scrambled it to make a type specimen book. It has survived not only five centuries, but also the leap into electronic typesetting, remaining essentially unchanged. It was popularised in the 1960s with the release of Letraset sheets containing Lorem Ipsum passages, and more recently with desktop publishing software like Aldus PageMaker including versions of Lorem Ipsum.

Γιατί το χρησιμοποιούμε;
Είναι πλέον κοινά παραδεκτό ότι ένας αναγνώστης αποσπάται από το περιεχόμενο που διαβάζει, όταν εξετάζει τη διαμόρφωση μίας σελίδας. Η ουσία της χρήσης του Lorem Ipsum είναι ότι έχει λίγο-πολύ μία ομαλή κατανομή γραμμάτων, αντίθετα με το να βάλει κανείς κείμενο όπως 'Εδώ θα μπει κείμενο, εδώ θα μπει κείμενο', κάνοντάς το να φαίνεται σαν κανονικό κείμενο. Πολλά λογισμικά πακέτα ηλεκτρονικής σελιδοποίησης και επεξεργαστές ιστότοπων πλέον χρησιμοποιούν το Lorem Ipsum σαν προκαθορισμένο δείγμα κειμένου, και η αναζήτησ για τις λέξεις 'lorem ipsum' στο διαδίκτυο θα αποκαλύψει πολλά web site που βρίσκονται στο στάδιο της δημιουργίας. Διάφορες εκδοχές έχουν προκύψει με το πέρασμα των χρόνων, άλλες φορές κατά λάθος, άλλες φορές σκόπιμα (με σκοπό το χιούμορ και άλλα συναφή).

Where does it come from?
Contrary to popular belief, Lorem Ipsum is not simply random text. It has roots in a piece of classical Latin literature from 45 BC, making it over 2000 years old. Richard McClintock, a Latin professor at Hampden-Sydney College in Virginia, looked up one of the more obscure Latin words, consectetur, from a Lorem Ipsum passage, and going through the cites of the word in classical literature, discovered the undoubtable source. Lorem Ipsum comes from sections 1.10.32 and 1.10.33 of "de Finibus Bonorum et Malorum" (The Extremes of Good and Evil) by Cicero, written in 45 BC. This book is a treatise on the theory of ethics, very popular during the Renaissance. The first line of Lorem Ipsum, "Lorem ipsum dolor sit amet..", comes from a line in section 1.10.32.

The standard chunk of Lorem Ipsum used since the 1500s is reproduced below for those interested. Sections 1.10.32 and 1.10.33 from "de Finibus Bonorum et Malorum" by Cicero are also reproduced in their exact original form, accompanied by English versions from the 1914 translation by H. Rackham.

Που μπορώ να βρω μερικές;
Υπάρχουν πολλές εκδοχές των αποσπασμάτων του Lorem Ipsum διαθέσιμες, αλλά η πλειοψηφία τους έχει δεχθεί κάποιας μορφής αλλοιώσεις, με ενσωματωμένους αστεεισμούς, ή τυχαίες λέξεις που δεν γίνονται καν πιστευτές. Εάν πρόκειται να χρησιμοποιήσετε ένα κομμάτι του Lorem Ipsum, πρέπει να είστε βέβαιοι πως δεν βρίσκεται κάτι προσβλητικό κρυμμένο μέσα στο κείμενο. Όλες οι γεννήτριες Lorem Ipsum στο διαδίκτυο τείνουν να επαναλαμβάνουν προκαθορισμένα κομμάτια του Lorem Ipsum κατά απαίτηση, καθιστώνας την παρούσα γεννήτρια την πρώτη πραγματική γεννήτρια στο διαδίκτυο. Χρησιμοποιεί ένα λεξικό με πάνω από 200 λατινικές λέξεις, συνδυασμένες με ένα εύχρηστο μοντέλο σύνταξης προτάσεων, ώστε να παράγει Lorem Ipsum που δείχνει λογικό. Από εκεί και πέρα, το Lorem Ipsum παραμένει πάντα ανοιχτό σε επαναλήψεις, ενσωμάτωση χιούμορ, μη κατανοητές λέξεις κλπ.

Half of it lights up like a Christmas tree.

BenjaminSchaaf · 2022-05-26T06:07:36Z

It's working fine here with the linked dictionary. I suggest double checking you've correctly installed that dictionary - both the aff and dic files are required and must not have their encodings modified.

ghost · 2022-09-15T13:18:01Z

Hi @BenjaminSchaaf

Can you tell me what is wrong with the Ukrainian dictionary?

When I add it to the dictionary list, the English one stops working.

{
	"dictionary": [
		"Packages/Language - English/en_US.dic",
		"Packages/User/Dictionaries/uk_UA.dic",
	],
}

Thanks!

Sublime Text v4134
Windows 10

uk_UA.zip

BenjaminSchaaf · 2022-09-15T15:19:22Z

@ihor-oleks I suggest making a separate issue, thanks.

ghost · 2022-09-15T16:04:16Z

Done, thanks.

titoBouzout added T: enhancement C: Spellcheck C: i18n S: minor labels Jan 21, 2016

FichteFoll changed the title ~~Multiple dictionaries per text file~~ Multiple dictionaries per file Jan 21, 2016

This was referenced Nov 14, 2016

[Bug] Can not begin to use issue arty-name/hunspell-merge#6

Open

[Question] I can not create English-Russian Dictionary hunspell/hunspell#425

Closed

bartosz-antosik mentioned this issue Dec 2, 2017

No language available bartosz-antosik/vscode-spellright#53

Closed

BenjaminSchaaf self-assigned this Nov 8, 2021

BenjaminSchaaf closed this as completed Dec 7, 2021

BenjaminSchaaf added the R: fixed label Dec 7, 2021

BenjaminSchaaf added this to the Build 4123 milestone Dec 7, 2021
