I'm having trouble setting up the environment for this. I'm using a conda environment on Windows and get the same problem with Python 3.9, 3.10, and 3.11. I also made sure to pip install from the requirements.txt here before running pip install newspaper4k.
I first hit this error on import:
  File "c:\Users...\scrape_from_urls.py", line 1, in <module>
    import newspaper
  File "C:\Users...\site-packages\newspaper\__init__.py", line 17, in <module>
    from .api import (
  File "C:\Users...\site-packages\newspaper\api.py", line 11, in <module>
    from newspaper.article import Article
  File "C:\Users...\site-packages\newspaper\article.py", line 28, in <module>
    from .extractors import ContentExtractor
  File "C:\Users...\site-packages\newspaper\extractors\__init__.py", line 8, in <module>
    from newspaper.extractors.content_extractor import ContentExtractor
  File "C:\Users...\site-packages\newspaper\extractors\content_extractor.py", line 8, in <module>
    from newspaper.extractors.articlebody_extractor import ArticleBodyExtractor
  File "C:\Users...\site-packages\newspaper\extractors\articlebody_extractor.py", line 8, in <module>
    import newspaper.extractors.defines as defines
  File "C:\Users...\site-packages\newspaper\extractors\defines.py", line 2, in <module>
    from typing_extensions import TypedDict, NotRequired
ModuleNotFoundError: No module named 'typing_extensions'
No biggie: after running pip install typing-extensions the import works. But then I get another error when I call newspaper.article with any URL:
  File "c:\Users...\scrape_from_urls.py", line 7, in <module>
    article = newspaper.article(url)
  File "C:\Users...\site-packages\newspaper\__init__.py", line 61, in article
    a = Article(url, language=language, **kwargs)
  File "C:\Users...\site-packages\newspaper\article.py", line 195, in __init__
    scheme = urls.get_scheme(url)
  File "C:\Users...\site-packages\newspaper\urls.py", line 370, in get_scheme
    return urlparse(abs_url, **kwargs).scheme
  File "c:\Users...\lib\urllib\parse.py", line 399, in urlparse
    url, scheme, _coerce_result = _coerce_args(url, scheme)
  File "c:\Users...\lib\urllib\parse.py", line 136, in _coerce_args
    return _decode_args(args) + (_encode_result,)
  File "c:\Users...\lib\urllib\parse.py", line 120, in _decode_args
    return tuple(x.decode(encoding, errors) if x else '' for x in args)
  File "c:\Users...\lib\urllib\parse.py", line 120, in <genexpr>
    return tuple(x.decode(encoding, errors) if x else '' for x in args)
AttributeError: 'builtin_function_or_method' object has no attribute 'decode'
I also tried newspaper3k and get a similar AttributeError, so I'm wondering if I should be using a different urllib version (urllib3==1.26.18).
It would be great if the missing dependencies (at least typing-extensions) could be added to the requirements.txt. Thank you.
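For anyone reproducing this, here is a minimal sketch for checking which modules are importable in the active environment before running the script. The helper name missing_modules is made up for illustration and is not part of newspaper4k; the module names passed in are the ones that appear in the tracebacks above.

```python
import importlib.util


def missing_modules(names):
    """Return the names from `names` that cannot be found in this environment."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# typing_extensions is the module named in the ModuleNotFoundError above.
print(missing_modules(["typing_extensions", "newspaper"]))
```

An empty list means both modules can be found. Anything listed still needs a pip install in that same conda environment.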
I encountered the error ModuleNotFoundError: No module named 'typing_extensions' while using M1 / Miniconda 3.10. However, I was able to resolve it by executing pip install typing_extensions. Following this, I did not encounter the error AttributeError: 'builtin_function_or_method' object has no attribute 'decode'.
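One possible explanation for the second error, offered as a guess since scrape_from_urls.py isn't shown: the frames that fail are in the standard library's urllib.parse, not urllib3, and urlparse only calls .decode on its argument when that argument is not a str. A 'builtin_function_or_method' there suggests url is a method that was never called, for example line.strip instead of line.strip(). The sketch below reproduces the same AttributeError under that assumption; check_url and line are hypothetical names.

```python
from urllib.parse import urlparse


def check_url(url):
    """Return the URL scheme, failing early with a readable error for non-str input."""
    if not isinstance(url, str):
        raise TypeError(f"expected url as str, got {type(url).__name__}")
    return urlparse(url).scheme


line = "https://example.com/story\n"
bad = line.strip     # missing parentheses: a builtin method object, not a str
good = line.strip()  # the actual URL string

try:
    urlparse(bad)    # urlparse assumes bytes-like input and tries bad.decode(...)
except AttributeError as exc:
    print("reproduced:", exc)

print(check_url(good))
```

If that matches, the fix would be in the script rather than in the installed packages. Printing type(url) right before the newspaper.article(url) call should confirm it either way.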