Can't find model 'fr_core_news_sm' #35

Closed
JJMeric opened this issue Jun 29, 2021 · 6 comments


JJMeric commented Jun 29, 2021

Hello! I am just starting to explore spacy-lefff. I have been trying to run the examples given here and on spacy.io,
but I keep hitting the same error (see below).
I assume the fix is simple, but I did not find help for it in the available documentation (I may not be looking in the right place, though).

I intend to use spacy-lefff to replace the MElt Perl utilities for our Bamanan-French parallel corpus. Bamanan, or Bambara, is a language spoken in West Africa. See http://cormand.huma-num.fr/

Thanks for any help with this.
Jean-Jacques

python3 testleff2.py
Traceback (most recent call last):
  File "testleff2.py", line 9, in <module>
    nlp = spacy.load('fr_core_news_sm')
  File "/usr/local/lib/python3.8/dist-packages/spacy/__init__.py", line 47, in load
    return util.load_model(name, disable=disable, exclude=exclude, config=config)
  File "/usr/local/lib/python3.8/dist-packages/spacy/util.py", line 329, in load_model
    raise IOError(Errors.E050.format(name=name))
OSError: [E050] Can't find model 'fr_core_news_sm'. It doesn't seem to be a Python package or a valid path to a data directory.

JJMeric (Author) commented Jun 29, 2021

OK, I found the answer by searching for the name "fr_core_news_sm" here on GitHub:
after pip install spacy-lefff, I also need to run this command:
python3 -m spacy download fr_core_news_sm

Now it works.
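
For readers who want to script this step, here is a minimal sketch, assuming that a spaCy model such as fr_core_news_sm is installed as an importable Python package of the same name (which is what the E050 message above suggests). The helper names model_is_installed, download_command and ensure_model are hypothetical and are not part of spaCy or spacy-lefff.

```python
import importlib.util
import subprocess
import sys
from typing import List


def model_is_installed(name: str) -> bool:
    """Return True if a top-level package called `name` can be imported."""
    return importlib.util.find_spec(name) is not None


def download_command(name: str) -> List[str]:
    """Build the same command as `python3 -m spacy download <name>`,
    using the interpreter that is running this script."""
    return [sys.executable, "-m", "spacy", "download", name]


def ensure_model(name: str) -> None:
    """Download the model only when it is missing (hypothetical helper)."""
    if not model_is_installed(name):
        subprocess.run(download_command(name), check=True)
```

Calling ensure_model("fr_core_news_sm") before spacy.load("fr_core_news_sm") should avoid the error on a fresh machine, provided spaCy itself is installed and the network is reachable.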

JJMeric (Author) commented Jun 29, 2021

Still, I am not 100% happy:

  1. With the example sentence in French:
    Apple cherche a acheter une startup anglaise pour 1 milliard de dollard
    This sentence contains two errors ("a" for "à" and "dollard" for "dollars"), so I corrected it to:
    Apple cherche à acheter une startup anglaise pour 1 milliard de dollars

  2. With the results:
    "cherche" is identified as a noun!? That is wrong: it is the verb "chercher" in the present tense.
    If I replace it with "cherchait" (past tense), it is identified correctly.

JJMeric (Author) commented Jun 30, 2021

When I try the other example given, which uses the melt_tagger, I get the following error. Please advise:

python3 testleff3.py
Traceback (most recent call last):
  File "testleff3.py", line 14, in <module>
    nlp.add_pipe('melt_tagger', after='parser')
  File "/usr/local/lib/python3.8/dist-packages/spacy/language.py", line 768, in add_pipe
    pipe_component = self.create_pipe(
  File "/usr/local/lib/python3.8/dist-packages/spacy/language.py", line 659, in create_pipe
    resolved = registry.resolve(cfg, validate=validate)
  File "/usr/local/lib/python3.8/dist-packages/thinc/config.py", line 727, in resolve
    resolved, _ = cls._make(
  File "/usr/local/lib/python3.8/dist-packages/thinc/config.py", line 776, in _make
    filled, _, resolved = cls._fill(
  File "/usr/local/lib/python3.8/dist-packages/thinc/config.py", line 848, in _fill
    getter_result = getter(*args, **kwargs)
  File "testleff3.py", line 11, in create_melt_tagger
    return POSTagger()
  File "/usr/local/lib/python3.8/dist-packages/spacy_lefff/melt_tagger.py", line 102, in __init__
    super(
  File "/usr/local/lib/python3.8/dist-packages/spacy_lefff/downloader.py", line 20, in __init__
    os.mkdir(os.path.join(download_dir, pkg))
PermissionError: [Errno 13] Permission denied: '/usr/local/lib/python3.8/dist-packages/spacy_lefff/data/tagger'

JJMeric (Author) commented Jun 30, 2021

I had to use chown on /usr/local/lib/python3.8/dist-packages/spacy_lefff/data,
but that does not make installation easy (we potentially have several platforms).
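
To diagnose this kind of PermissionError before it happens, one option is to check whether the current user can create files under the package's data folder. The sketch below assumes, based only on the traceback above, that spacy_lefff stores its downloads in a data subfolder of its installation directory. The helper names package_data_dir and can_create_in are hypothetical.

```python
import importlib.util
import os
from typing import Optional


def package_data_dir(package: str, subdir: str = "data") -> Optional[str]:
    """Return <package install dir>/<subdir>, or None if the package is missing."""
    spec = importlib.util.find_spec(package)
    if spec is None or not spec.submodule_search_locations:
        return None
    return os.path.join(list(spec.submodule_search_locations)[0], subdir)


def can_create_in(path: str) -> bool:
    """Check whether the nearest existing ancestor of `path` is a writable directory."""
    current = os.path.abspath(path)
    while not os.path.exists(current):
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root without a match
            return False
        current = parent
    return os.path.isdir(current) and os.access(current, os.W_OK | os.X_OK)


if __name__ == "__main__":
    data_dir = package_data_dir("spacy_lefff")
    if data_dir is None:
        print("spacy_lefff is not installed in this interpreter")
    else:
        print(data_dir, "writable:", can_create_in(data_dir))
```

If the check reports that the folder is not writable, installing the package somewhere the user owns (for example in a virtual environment) avoids the need for chown.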

sammous (Owner) commented Jul 5, 2021

Hello,
Regarding your first point, you have to install the spaCy French model (fr_core_news_sm) if you wish to use it.
Also, I think you are confusing what the spaCy POS tagger and the MElt tagger return. You can refer to this example:

import spacy
from spacy_lefff import LefffLemmatizer, POSTagger
from spacy.language import Language

# Register the Lefff lemmatizer as a pipeline component factory.
@Language.factory('french_lemmatizer')
def create_french_lemmatizer(nlp, name):
    return LefffLemmatizer(after_melt=True, default=True)

# Register the MElt POS tagger as a pipeline component factory.
@Language.factory('melt_tagger')
def create_melt_tagger(nlp, name):
    return POSTagger()

nlp = spacy.load('fr_core_news_sm')
# The MElt tagger runs after the parser; the lemmatizer runs after the MElt tagger.
nlp.add_pipe('melt_tagger', after='parser')
nlp.add_pipe('french_lemmatizer', after='melt_tagger')
doc = nlp("Apple cherche a acheter une startup anglaise pour 1 milliard de dollard")
for d in doc:
    # Columns: text, spaCy POS, MElt tag, Lefff lemma, spaCy fine-grained tag, spaCy lemma
    print(d.text, d.pos_, d._.melt_tagger, d._.lefff_lemma, d.tag_, d.lemma_)

Regarding your permission error: the package downloads its model and writes it into its own installation folder, which in your case is a system folder that your user cannot write to. You should use a virtual environment rather than your default Python.
You can use virtualenv or any other Python virtual environment tool (you can refer to these steps).
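
As a quick way to confirm which situation a script is running in, the following sketch checks whether the current interpreter belongs to a virtual environment. It relies only on the standard sys module; the function name in_virtual_environment is hypothetical.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment.

    The standard venv module makes sys.prefix point at the environment while
    sys.base_prefix keeps pointing at the base installation. Older virtualenv
    releases set sys.real_prefix instead, so both cases are checked.
    """
    base_prefix = getattr(sys, "base_prefix", sys.prefix)
    return hasattr(sys, "real_prefix") or sys.prefix != base_prefix


if __name__ == "__main__":
    if in_virtual_environment():
        print("Running inside a virtual environment:", sys.prefix)
    else:
        print("Running with the system Python:", sys.prefix)
```

A typical setup would be to create the environment with python3 -m venv, activate it, and run pip install spacy-lefff inside it, so that the downloaded data lands in a folder the user owns.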

@sammous sammous closed this as completed Jul 12, 2021
@ezynsegnane

If you use French, follow these steps:
Step 1:
python -m spacy download fr_core_news_sm
Step 2:
spacy.load('fr_core_news_sm')

If you use English, follow these steps:
Step 1:
python -m spacy download en_core_web_sm
Step 2:
spacy.load('en_core_web_sm')

This comment is for people who hit the same error in the future!
