BIP39 French Wordlist - My proposal #152

Merged
merged 1 commit into from
May 26, 2015


@Kirvx Kirvx commented May 8, 2015

@voisine

Here are my restrictions:

  1. High priority on simple and common French words.
  2. Only words with 5-8 letters.
  3. A word is fully recognizable by typing its first 4 letters (the special French characters "é" and "è" are considered equal to "e"; for example, "museau" and "musée" cannot both be in the list).
  4. Only infinitive verbs, adjectives and nouns.
  5. No pronouns, no adverbs, no prepositions, no conjunctions, no interjections (unless the noun/adjective is as popular as its interjection, like "mince;chouette").
  6. No numeral adjectives.
  7. No words in the plural (except invariable words like "univers", or words spelled the same as the singular like "heureux").
  8. No feminine adjectives (except words with the same spelling in the masculine and feminine, like "magique").
  9. No words that sound the same but have different meanings AND different spellings, like "verre-vert", unless one word is much more popular than the other, like "perle" and "pairle".
  10. No very similar words with 1 letter of difference.
  11. No essentially reflexive verbs (unless the verb is also a noun, like "souvenir").
  12. No words with "ô;â;ç;ê;œ;æ;î;ï;û;ù;à;ë;ÿ".
  13. No words ending in "é;ée;è;et;ai;ait".
  14. No demonyms.
  15. No words in conflict with the spelling corrections of 1990 (http://goo.gl/Y8DU4z).
  16. No embarrassing words (in a very, very broad sense) or words belonging to a particular religion.
  17. No words identical to words in the Spanish wordlist (as Y75QMO wants).
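
Some of these restrictions (2, 3, 12 and 13) are mechanical enough to be checked by a script. The following is only a minimal Python sketch under stated assumptions: the wordlist is available as a list of strings, the function names are illustrative and not part of this pull request, and all accents are stripped for the prefix comparison (the rule itself only mentions "é" and "è").

```python
import unicodedata

# Rule 12: characters that must not appear anywhere in a word.
FORBIDDEN_CHARS = set(unicodedata.normalize("NFC", "ôâçêœæîïûùàëÿ"))
# Rule 13: endings that are not allowed.
FORBIDDEN_ENDINGS = tuple(
    unicodedata.normalize("NFC", e) for e in ("é", "ée", "è", "et", "ai", "ait")
)


def strip_accents(word):
    """Remove combining marks so that 'é' and 'è' compare equal to 'e'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def check_wordlist(words):
    """Return (word, problem) pairs for violations of rules 2, 3, 12 and 13."""
    problems = []
    first_with_prefix = {}  # accent-free 4-letter prefix -> first word seen
    for raw in words:
        word = unicodedata.normalize("NFC", raw.strip())
        if not 5 <= len(word) <= 8:
            problems.append((word, "rule 2: length is not 5-8 letters"))
        if FORBIDDEN_CHARS & set(word):
            problems.append((word, "rule 12: forbidden character"))
        if word.endswith(FORBIDDEN_ENDINGS):
            problems.append((word, "rule 13: forbidden ending"))
        prefix = strip_accents(word)[:4]
        if prefix in first_with_prefix:
            other = first_with_prefix[prefix]
            problems.append((word, "rule 3: same first 4 letters as " + other))
        else:
            first_with_prefix[prefix] = word
    return problems


# The example from rule 3: both words start with "muse" once accents are removed.
for word, problem in check_wordlist(["museau", "musée"]):
    print(word, "->", problem)
```

The remaining rules (part of speech, popularity, embarrassing words) depend on human judgment and are not covered by this sketch.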

4 wordlists used:

Spelling verified with the Hunspell French Dictionary (1990 and Classique) in Notepad++, and meaning verified with https://fr.wiktionary.org and http://www.larousse.fr/ for hundreds of words.

Guys can review:
@ecdsa @NicolasDorier @EricLarch @NicolasBigot @pollastri-pierre

Thanks to Thomas Voegtlin for his wordlist!

Please wait before merging.

--- The following message is partially outdated because of the evolution of the wordlist. ---

I defined as many "reasonable" restrictions as possible so that a person can guess one of their words as easily as possible if they forget it (or remember it easily).

By "embarrassing" words, I mean words that could be taken as a nasty insult, and certain words relating to serious illness, death, poverty, crime, violence, the medical field, attitudes and many others.

I did my best to remove words that resembled another word, whether spoken or written.
Several hundred words that differed from another word by 1 letter (one extra letter or one changed letter) were removed.
I consider the result fairly satisfactory: far from perfect, but entirely acceptable.
Restrictions 6 and 10 also complement this.

I estimate that 1% of the words may be unknown to the public (such as "quantum"), and that 5% have meanings the public may be unsure of (such as "fongible").
I consider these margins acceptable.

Note that some chemical elements from the periodic table are included, the most popular ones.

For a more pleasant review, here is the printable version (5-page A4 PDF):
https://www.dropbox.com/sh/xlq3x2anb706uw1/AADUYAqcBvkvUPdhwC2uLWmEa?dl=0

If you want to check in a single read-through, focus on restrictions 2, 3, 5, 8 and 11.
Given the homogeneity of the list (and the common sense it should reflect), words that violate restrictions 1, 4 and 13 should jump out at you.
Allow 15 minutes of reading per page.
I still recommend a second read-through.

I hope you will like this wordlist; it took more than 70 hours of work that I had not planned on doing at first, given the scale and responsibility of the task.

If a word seems inappropriate to you, or if you have comments on the restrictions, feel free to let me know.

Note also that if it suits you, it will be included in one of the upcoming versions of breadwallet along with the other foreign wordlists.

@EricLarch

Is there a restriction requiring every pair of words to differ by more than one letter? For instance, we had the case of someone writing down "fog" instead of "frog" (or the other way around, I'm not sure), and by chance both were valid words in the dictionary.
Since this is a new dictionary, I think we have the opportunity to check this rule with an algorithm, just to reduce the possibility of very costly mistakes.

@Kirvx
Author

Kirvx commented May 8, 2015

I did not use a script for that; I eliminated the most glaring similarities.
And I do not have the skills to write one :/

@NicolasDorier
Contributor

Good idea, I will also review it. For the similar words, I will check the Levenshtein distance for every combination of words; it will be quick. Also, I think your word list is not in NFKD normalization (not a big deal, I'll fix that).
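
A check of this kind could look roughly like the sketch below. It is not the code that was actually run for this review; it is a plain dynamic-programming Levenshtein distance applied to every pair of words after removing accents, and the function names are illustrative.

```python
import unicodedata
from itertools import combinations


def strip_accents(word):
    """Remove combining marks so that accented letters compare as plain ones."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def levenshtein(a, b):
    """Edit distance counting single-letter insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                       # delete char_a
                current[j - 1] + 1,                    # insert char_b
                previous[j - 1] + (char_a != char_b),  # substitute (free if equal)
            ))
        previous = current
    return previous[-1]


def similar_pairs(words, max_distance=1):
    """Return all pairs of words whose accent-free forms are within max_distance."""
    plain = {w: strip_accents(w) for w in words}
    return [
        (a, b)
        for a, b in combinations(sorted(words), 2)
        if levenshtein(plain[a], plain[b]) <= max_distance
    ]


print(similar_pairs(["fog", "frog", "maison", "saison", "abandon"]))
```

Comparing all pairs of a 2048-word list is about two million distance computations on short strings, which is why a brute-force approach is acceptable here.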

@NicolasDorier
Contributor

Ah, one more question. I'm not sure about some words which are either very little known or often misspelled (like zircon and wapiti, which are the only ones I have seen after a quick scan, and maybe the only ones).

Do you think we should change such words ?

@Kirvx
Author

Kirvx commented May 9, 2015

@NicolasDorier Thanks for the help :)

I used

perl nfkd.pl < wordlist.txt > nfkdworldlist.txt

and nfkd.pl is

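#!/usr/bin/perl

use Unicode::Normalize;       # provides NFKD()
use strict;
use warnings;
use open qw(:std :utf8);      # treat the standard streams as UTF-8

# Read the wordlist line by line and print each line in NFKD form.
while (<>) {
    print NFKD($_);
}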

Thanks to Aaron Voisine for this!
Is that ok?
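
For readers who want to see what NFKD does to an accented word, here is a small Python illustration using the standard `unicodedata` module. It is only a sketch for comparison with the Perl filter above; `nfkd_lines` is an illustrative name.

```python
import unicodedata


def nfkd_lines(lines):
    """Return each line converted to Unicode normalization form NFKD."""
    return [unicodedata.normalize("NFKD", line) for line in lines]


composed = "b\u00e9nigne"                 # 'é' stored as the single code point U+00E9
decomposed = nfkd_lines([composed])[0]    # 'e' followed by U+0301 COMBINING ACUTE ACCENT

print(len(composed), len(decomposed))     # the decomposed form is one code point longer
print(decomposed == "be\u0301nigne")
```

Both strings render identically, so a file has to be inspected at the code-point level to tell whether it is normalized.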

Of course we can change this kind of words if we find a word that is compatible with the restrictions.

@NicolasDorier
Contributor

I reviewed the first 1024; here are my difficulties:

acerbe: I don't understand the meaning out of context, much less how to spell it
agrafer: "agrapher"? "aggrapher"? easy mistake
azur: "azure" with an 'e'? easy misspelling
bénigne: difficult to pronounce over the phone + rare word
bielle: never heard this word
biopsie: never heard
biotype: never heard
bluffer: easy misspelling "bleuffer"
brome: never heard
bruine: never heard
buccal: easy misspelling "bucal"
cadastre: never heard
caduc: easy misspelling "caduque"
calepin: never heard
caneton: easy misspelling "canneton"
césium: never heard (almost)
cloporte: never heard
cobalt: never heard
coccyx: never heard
cosy: easy misspelling "cosie"
dactylo: did not know it existed, thought it was an abbreviation of "dactylographie"
embryon: easy misspelling "embrillon"
ethnie: "éthnie"?
fakir: easy misspelling "faquir"
fenouil: easy misspelling "fenouille"
filetage: never heard
final: can be confused with "finale"
gallium: never heard
gecko: never heard
grivois: never heard
hydromel: never heard
idylle: never heard
iguane: difficult to spell right

All of that is surely subjective. We don't have to replace if you think I am the only one having those difficulties. I'll review the next 1024 later. Let me know what you think about these words.

@Kirvx
Author

Kirvx commented May 10, 2015

Thanks for the review :)
Have you googled these 17 unknown words?
I think after a search most of the people will say "Ah yes I know this word".
What do you think @EricLarch ?
Anyway, it represents 1.6% of these 1024 words, which is pretty reasonable.
I agree on all of the rest.
"ethnie" is correct :)
"dactylo" is the job http://www.larousse.fr/dictionnaires/francais/dactylo/21484, but there is also "dactylographe", which is the old form according to Larousse, so maybe we should delete this word.
"acerbe" http://www.larousse.fr/dictionnaires/francais/acerbe/604
But before trying to change all these words (if we can; finding extra words is complicated, I will try each day), it seems more logical to me to work on the 1 letter difference first, no?
I'm very curious about how many words are in conflict with another :)

@EricLarch

I agree that bénigne and bluffer can be difficult (I have seen poker players write "bleuffer"...). For the others, I would think that any native French speaker must know them, and I don't think that anyone literate would ever write "embrillon" or "faquir". I understand some people can have trouble with spelling, but then no word would be safe.

@NicolasDorier
Contributor

it seems more logical to me to work on the 1 letter difference first, no?

Don't, I can do that automatically. I will do it once we have agreed on the words. (I'll also code something up to verify that you respected your restrictions.)

For the others I would think that any French native speaker must know them

I am a native speaker, but I admit I am not very good. ;)
If all of you think that the problem is between my screen and my chair, then I have no problem believing it. ;)
I'll review the next one tomorrow.

@Kirvx
Author

Kirvx commented May 10, 2015

Don't, I can do that automatically. I will do it once we have agreed on the words. (I'll also code something up to verify that you respected your restrictions.)

Ok, thanks for your time to code :)

I am a native speaker, but I admit I am not very good. ;)
If all of you think that the problem is between my screen and my chair, then I have no problem believing it. ;)
I'll review the next one tomorrow.

It's also cool to learn words ^^

@NicolasDorier
Contributor

Yeah, it is cool, but I'd just hope people will not have to spell words over the phone, which will happen for unknown words. But I'm fine with it if you think I am one of the few who do not know them.

I expect most service providers using BIP39 will auto-correct words for the user. (I will surely include that in NBitcoin... even if only for me ;D)

@Kirvx
Author

Kirvx commented May 13, 2015

@NicolasDorier Have you had the time to review the second part? :)

@NicolasDorier
Contributor

shit I forgot, working on that sorry

@NicolasDorier
Contributor

Here it is :

iridium: never heard
jacinthe: jacynthe? jacynte? jacinte?
jaloux: jalou?
joyau: joyaux?
lasso: lasseau?
momifier: mommifier?
obturer: never heard
oxyde: oxide?
perdrix: perdrie?
phoque: foque?
pylône: ô???
rhodium: never heard
sextuor: never heard, I understood "sexe tueur" :s
suricate: never heard
thorax: torax?
ubuesque: never heard
vanadium: never heard
wapiti: never heard
zircon: never heard

My remarks are typical spelling mistakes that could be made. Once you have agreed on the words to change, let me know; I'll then run some word analysis on the list (dictionary check, that your rules are satisfied, that 2 words are not too similar).

@Kirvx
Author

Kirvx commented May 13, 2015

Thanks :)
Never heard suricate?
https://i.imgur.com/TE6PlMx.jpg
They have a reserved place in the wordlist ^^
I will try to find many words in the next 2 days.

@NicolasDorier
Contributor

Well, I had heard of suricate, but as far as I was concerned it was a French comedy group on YouTube. :p

@Kirvx
Author

Kirvx commented May 14, 2015

Ok, I propose to change these words:

bénigne bluffer cosy dactylo césium gecko gallium grivois sextuor
cadastre rhodium vanadium iridium jacinthe jaloux brome azur
agrafer caduc zircon lasso momifier fenouil bruine bielle final bise
bibelot fakir

EDIT: "fakir" too

and to replace them by adding:

biberon banlieue financer éthanol prélude taureau slogan punaise sternum sottise burin
tétine filière esquiver binaire festival pyjama opaque pharaon piéton pizza boycott
phobie fémur féodal fissure rituel rallye

And wombat (https://i.imgur.com/scN9gIU.jpg) 🐻

What do you think?

@NicolasDorier
Contributor

pyjama: pijama? (I would have bet it was spelled like that)
rallye: rallie? (like "rallier")

Apart from those, I'm good. I like wombat, but I doubt many people know it.
What do you think? (Once again, if you think it is fine, I'm fine with it too; I just hope people do not stress too much when they don't manage to spell 25 words correctly.)

Tell me when you update the list so that I can run some code on it.

@Kirvx
Author

Kirvx commented May 14, 2015

Thanks :)
I can replace "rallye" with "rallonge", and "pyjama" with "pyrolyse", "poreux" or "pixel" (you decide), or with another word.
Maybe add "yacht" (the boat) instead of wombat, because we don't have any words starting with "y".

@NicolasDorier
Contributor

rallonge and pixel, ok for yacht.

@Kirvx
Author

Kirvx commented May 14, 2015

@NicolasDorier Updated
I also deleted "wapiti" and added "linéaire".
I changed the encoding, but apparently GitHub didn't update the whole file.
So here is the original: https://www.dropbox.com/s/chaxgqotio59rf4/french.txt?dl=0
Note that you are a "collaborator" on Kirvx/bips, so feel free to correct whatever you want if you can (I don't know what a collaborator can do).

@NicolasDorier
Contributor

thanks, I'll run some word analysis to check everything is fine. (Hopefully before sunday)

@Kirvx
Author

Kirvx commented May 15, 2015

poncer -> ponctuel
tréfonds -> trèfle
pâturage (which is more used in the plural) -> gerbille

Same dropbox link
https://www.dropbox.com/s/chaxgqotio59rf4/french.txt?dl=0

@NicolasDorier
Contributor

did you update on github ? I prefer using the github version for my tests, so I'm sure there is no mistake in the modifications. (don't worry about encoding, I'll fix it)

Ps : gerbille => never heard :D

@Kirvx
Author

Kirvx commented May 15, 2015

Yes, updated on GitHub.
OK, I will try to change gerbille.

@Kirvx
Author

Kirvx commented May 15, 2015

"gerboise", "graffiti", "glycémie" or another ? :)

@NicolasDorier
Contributor

ok let's take "graffiti"

@Kirvx
Author

Kirvx commented May 16, 2015

Updated.

@NicolasDorier
Contributor

Here are the similar words (differing by 1 letter, accents removed):

amener,mener //Similar
argent,urgent
banque,barque
baraque,barque
baron,bâton
bolide,solide
bonifier,tonifier
bonus,tonus
céder,coder
choquer,croquer
crayon,rayon
créer,crier
curieux,furieux
défaire,refaire
doyen,moyen
entier,envier
épreuve,preuve
éprouver,prouver //Similar
établir,rétablir //Similar
fermer,ferrer
fièvre,lièvre
figer,fixer
flaque,plaque
génie,genre
herbe,herse
humeur,humour
hyène,hymne
infecter,injecter
léger,loger
local,loyal
loger,louer
loger,lover
louer,lover //Lover is unknown + very similar to other words (4 occurrences)
maison,saison
malade,salade
ministre,sinistre
notaire,notoire
piéton,piston
podium,sodium
préparer,réparer //Similar
proie,prose
rare,rire
redire,réduire
refermer,réformer
rejeter,répéter
réparer,séparer
soupirer,soutirer
toiture,voiture

I noted potential problems.
What do you think ?

Checking other stuff...

@NicolasDorier
Contributor

I also noted the following collision with Spanish. (btw, the Spanish list is not normalized on github)

ceder,céder
enorme,énorme
gemir,gémir
ideal,idéal
serie,série
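
Collisions of this kind only show up when accents are ignored, since restriction 17 as written covers identical words. A rough Python sketch of such a cross-list check follows; the function names and the tiny sample lists are illustrative, not the real wordlists and not the code used for this review.

```python
import unicodedata


def strip_accents(word):
    """Decompose the word and drop combining marks ('é' becomes 'e')."""
    decomposed = unicodedata.normalize("NFKD", word)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def accent_collisions(list_a, list_b):
    """Return (word_a, word_b) pairs that are equal once accents are removed."""
    by_plain_form = {strip_accents(w): w for w in list_b}
    return sorted(
        (w, by_plain_form[strip_accents(w)])
        for w in list_a
        if strip_accents(w) in by_plain_form
    )


# Toy samples only: three words per language.
spanish_sample = ["ceder", "enorme", "gato"]
french_sample = ["c\u00e9der", "\u00e9norme", "chat"]
print(accent_collisions(spanish_sample, french_sample))
```

Because the comparison goes through NFKD, it gives the same answer whether or not the input files are already normalized.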

@voisine
Contributor

voisine commented May 26, 2015

@Kirvx @NicolasDorier @EricLarch Thanks guys, this will be going into the next breadwallet update. Vive la France !

@realindiahotel

I suppose I'd best also add the French list to BIP39.NET

-----Original Message-----
From: "Kirvx"
Sent: 26/05/2015 10:14 PM
To: "bitcoin/bips"
Cc: "Thå Shïz"
Subject: Re: [bips] BIP39 French Wordlist - My proposal (#152)

Awesome :)
Thanks to everyone :)

@realindiahotel

Is now added in BIP39.NET

@Kirvx
Author

Kirvx commented Jun 8, 2015

Yeah thanks :)

On Mon, Jun 8, 2015 at 07:51, Thå Shïz wrote:

Is now added in BIP39.NET

@NicolasDorier
Contributor

same in NBitcoin (in master branch, will be out for the next release)

@wizardofozzie

I had a big problem trying to detect the bip39 language as French shares ~5% of its words with English.

With the test vector (entropy 000000000000000000000000000000000000; english mnemonic = "abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon abandon about") it is incorrectly detected as French. I've changed my code to check all of the following (see below), however I'd implore the list to be made completely different to English (or at the very least, don't make the first word the same)

FRENCH_BIP39_CLASHES = [(1, u'abandon'), (88, u'amateur'), (107, u'angle'), (110, u'animal'), (148, u'aspect'), (190, u'badge'), (230, u'bicycle'), (262, u'bonus'), (277, u'brave'), (323, u'canal'), (328, u'capable'), (347, u'caution'), (403, u'civil'), (409, u'client'), (436, u'concert'), (451, u'correct'), (461, u'coyote'), (478, u'crucial'), (479, u'cruel'), (493, u'cycle'), (498, u'danger'), (562, u'digital'), (573, u'distance'), (594, u'double'), (598, u'dragon'), (631, u'effort'), (725, u'essence'), (757, u'exact'), (763, u'excuse'), (795, u'fatal'), (796, u'fatigue'), (812, u'festival'), (820, u'figure'), (854, u'fortune'), (861, u'fragile'), (880, u'fruit'), (919, u'globe'), (953, u'guide'), (998, u'humble'), (1011, u'image'), (1014, u'immense'), (1017, u'impact'), (1043, u'innocent'), (1053, u'intact'), (1070, u'jaguar'), (1093, u'junior'), (1102, u'label'), (1123, u'lecture'), (1165, u'loyal'), (1178, u'machine'), (1248, u'million'), (1254, u'minute'), (1255, u'miracle'), (1259, u'mobile'), (1286, u'muscle'), (1301, u'nation'), (1302, u'nature'), (1322, u'noble'), (1331, u'notable'), (1381, u'opinion'), (1387, u'orange'), (1409, u'ozone'), (1411, u'palace'), (1416, u'panda'), (1476, u'phrase'), (1478, u'piano'), (1492, u'pizza'), (1524, u'position'), (1548, u'prison'), (1567, u'public'), (1576, u'puzzle'), (1580, u'question'), (1626, u'relief'), (1671, u'rival'), (1674, u'romance'), (1707, u'salon'), (1727, u'science'), (1748, u'sentence'), (1756, u'service'), (1769, u'simple'), (1777, u'social'), (1801, u'source'), (1805, u'spatial'), (1809, u'stable'), (1830, u'surface'), (1833, u'surprise'), (1836, u'suspect'), (1847, u'talent'), (1911, u'train'), (1933, u'tunnel'), (1948, u'unique'), (1954, u'usage'), (1963, u'vague'), (1970, u'valve'), (2008, u'village'), (2014, u'virus'), (2020, u'vital'), (2034, u'volume'), (2039, u'voyage'), (2041, u'wagon')]

@NicolasDorier
Contributor

I'm not convinced by that. The automatic language detection feature is dangerous in itself (Chinese Traditional and Simplified).
Also, finding 2048 well-understood words that are not shared with English or any other language is an impossible task.

@realindiahotel

Hi @simcity4242, thanks for bringing this up, it is interesting. I don't really think there is any great reason for us to auto-detect the mnemonic language; do you have a specific use case in mind? I'm actually thinking of removing this functionality from BIP39.NET because, at the end of the day, we don't really need to know the language of the mnemonic on input. Unless you have a specific task in mind, it may be wise to just avoid automatic language detection altogether. I implemented it before the French list existed. Nicolas is right that if the overlap is only ~5%, chances are a mnemonic will nearly always contain mostly French-only words, so it shouldn't be an issue, but you will need to account for the edge cases, I guess.

@schildbach
Contributor

If auto-detection is not possible, you'd need to add to the 12 words the information about which wordlist is used. So effectively it would be a 13th word.

@realindiahotel

Why do you need to know what language is used tho?

@schildbach
Contributor

  • For offering auto-completion of words.
  • For using the right type of space.

@realindiahotel

Surely you would detect the locale from the system for auto-detection, just as any other app/program does now? The correct spaces are whatever the user puts in; the conversion from ideographic to normal space happens during normalization anyway, so it doesn't matter which spaces are put in.

@realindiahotel

Also if you are inputting the words you can't auto detect language as you type the words in!

@schildbach
Contributor

On mobile devices, you generally don't type spaces. Everything is auto-completed. This is especially true if there are well-defined dictionaries. You can't use the system locale reliably, as phrases should be exchangeable between devices.

@schildbach
Contributor

If all encodings use the same type of space then we're good. But I heard that's not the case?

@realindiahotel

On mobile devices the OS handles the auto-complete based on a localized dictionary in most cases. Yes, the space is different for JP; however, the normalization process turns the ideographic space into an ASCII space regardless of what is input, so it doesn't matter which space is auto-added.
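
The claim about the ideographic space can be checked directly with Python's standard `unicodedata` module. This is just a small illustration; `normalize_mnemonic` is an illustrative helper name and the input words are placeholders, not entries from any wordlist.

```python
import unicodedata

IDEOGRAPHIC_SPACE = "\u3000"


def normalize_mnemonic(text):
    """Apply NFKD, which maps U+3000 IDEOGRAPHIC SPACE to an ASCII space."""
    return unicodedata.normalize("NFKD", text)


typed = "abc" + IDEOGRAPHIC_SPACE + "def"   # placeholder words joined by U+3000
print(normalize_mnemonic(typed) == "abc def")
```

So after NFKD the two kinds of separator are indistinguishable, whichever one the input method inserted.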

@dabura667

Japanese phones don't auto-insert spaces at all, in fact.

@schildbach
Contributor

Well, I will use a customized auto-complete. Otherwise it will insert words not contained in the word lists, or maybe it's even missing words from the lists. I assume I will be able to append the space myself.

@dabura667

That is probably best.

I like Mycelium's setup.

Japanese list is unique with the first 3 characters so it should be easy to auto-complete

@schildbach
Contributor

FWIW, for the first word I plan to auto-complete against all the supported word lists at the same time, so essentially the dictionary is a union of the wordlists. For all subsequent words, I exclude the word lists that can't match anymore. If after the 12th word there are still multiple word lists matching, I may ask the user which list to use (if that's needed, I'm not sure).
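
The narrowing strategy described above might be sketched as follows. This is a hypothetical illustration, not code from any wallet; the function names are made up and the two wordlists are toy samples of three words each.

```python
def narrow_wordlists(wordlists, typed_words):
    """Keep only the wordlists that contain every word typed so far."""
    candidates = set(wordlists)
    for word in typed_words:
        candidates = {name for name in candidates if word in wordlists[name]}
    return candidates


def completions(wordlists, candidates, prefix):
    """Offer completions from the union of the wordlists still in play."""
    return sorted({
        word
        for name in candidates
        for word in wordlists[name]
        if word.startswith(prefix)
    })


# Toy samples only; real BIP39 lists have 2048 words each.
wordlists = {
    "english": {"abandon", "about", "zoo"},
    "french": {"abandon", "abaisser", "zoologie"},
}

print(completions(wordlists, set(wordlists), "aba"))    # union of both lists
print(narrow_wordlists(wordlists, ["abandon"]))         # still ambiguous
print(narrow_wordlists(wordlists, ["abandon", "about"]))
```

If more than one candidate remains after the last word, the caller has to fall back on asking the user (or on checksum validation, which is outside this sketch).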

@gurnec

gurnec commented Jun 11, 2015

FWIW, I use auto-detection in seedrecover. It's just a UI nicety.

The French word list isn't really that much of a problem; the likelihood of an entire (random) 12-word mnemonic being ambiguous between English and French is less than 1 in 5 × 10^15.

As NicolasDorier already pointed out, It's the Chinese Simplified and Traditional wordlists which are problematic if you want to do auto-detection, they share 62% of their words. That's a 1 in 295 likelihood of ambiguity for a 12-word mnemonic, 1 in 4720 if you also require the checksum be valid.
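
These figures can be reproduced approximately with a few lines of Python. The sketch assumes that each word of a random mnemonic is drawn independently and uniformly from the 2048-word list, and that 100 French words are shared with English (the size of the clash list posted earlier in this thread). The 62% figure is the rounded share quoted above, so the Chinese result comes out slightly different from the quoted 295, which presumably was computed from the exact count of shared words.

```python
WORDLIST_SIZE = 2048
WORDS = 12
CHECKSUM_BITS = 4   # 12 words x 11 bits = 128 bits of entropy + 4 checksum bits


def ambiguity_odds(shared_fraction, words=WORDS):
    """Return N such that a random mnemonic is ambiguous with probability 1 in N."""
    return 1 / shared_fraction ** words


french_english = ambiguity_odds(100 / WORDLIST_SIZE)
chinese = ambiguity_odds(0.62)
# Requiring a valid checksum in the other list as well adds a factor of 2^4 = 16.
chinese_with_checksum = chinese * 2 ** CHECKSUM_BITS

print(f"French/English: 1 in {french_english:.2e}")
print(f"Chinese Simplified/Traditional: 1 in {chinese:.0f}")
print(f"...with valid checksum: 1 in {chinese_with_checksum:.0f}")
```

The factor of 16 matches the ratio between the two quoted Chinese figures (4720 / 295 = 16).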

This problem (if it even is one) could have been solved by requiring that for each new word list, if it shares a word with an existing word list, that word must be placed in the same position as it is in the existing word list (or just use Electrum 2.x's method).

@wizardofozzie

@Thashiznets I initially flagged French because the first test vector contains "abandon" (11/12 words) and my code was just checking the first word (like Electrum), so the English test vectors were returning "French" as language; I've used a workaround

Basically, I've been trying to differentiate mnemonic phrases without needing to know if it's BIP39, or Electrum 2.x (or Electrum 1.x, which is much harder). I just think it's prudent to have certainty in knowing what type of mnemonic it is by the words alone.

My reasoning also extends to this (which @gurnec answered).

@Kirvx
Author

Kirvx commented Jun 17, 2015

Sorry for not responding to this problem, I'm not a tech guy :/
Anyway, even if it's still a problem, since the wordlist has been merged I think it would be wrong to change it for a non-critical issue (because of the problem of having 2 versions of a wordlist).

@realindiahotel

Agreed, leave as is, I think trying to guess the spec used i.e. BIP39, Electrum etc could end in tears.

TheBlueMatt added a commit to TheBlueMatt/bips that referenced this pull request May 5, 2016
@brenorb
Contributor

brenorb commented Aug 13, 2018

Hey, I was taking a look at BIP0039 to add Portuguese, and then I saw that the French wordlist has a lot of words matching the English list. I know it is not in the proposed rules, but I believe it is important not to have words already used in another language's mnemonic set.

These are the ones identical to the English list:
'french.txt': ['abandon', 'amateur', 'angle', 'animal', 'aspect', 'badge', 'bicycle', 'bonus', 'brave', 'canal', 'capable', 'caution', 'civil', 'client', 'concert', 'correct', 'coyote', 'crucial', 'cruel', 'cycle', 'danger', 'digital', 'distance', 'double', 'dragon', 'effort', 'essence', 'exact', 'excuse', 'fatal', 'fatigue', 'festival', 'figure', 'fortune', 'fragile', 'fruit', 'globe', 'guide', 'humble', 'image', 'immense', 'impact', 'innocent', 'intact', 'jaguar', 'junior', 'label', 'lecture', 'loyal', 'machine', 'million', 'minute', 'miracle', 'mobile', 'muscle', 'nation', 'nature', 'noble', 'notable', 'opinion', 'orange', 'ozone', 'palace', 'panda', 'phrase', 'piano', 'pizza', 'position', 'prison', 'public', 'puzzle', 'question', 'relief', 'rival', 'romance', 'salon', 'science', 'sentence', 'service', 'simple', 'social', 'source', 'spatial', 'stable', 'surface', 'surprise', 'suspect', 'talent', 'train', 'tunnel', 'unique', 'usage', 'vague', 'valve', 'village', 'virus', 'vital', 'volume', 'voyage', 'wagon'],

@Kirvx
Author

Kirvx commented Aug 13, 2018

You're right, that was a concern during the creation of the wordlist (https://en.wikipedia.org/wiki/List_of_English_words_of_French_origin), but it wasn't a priority for me, and I think it wasn't easily possible to apply this additional restriction together with the other rules.
