LXML 5.2.0 breaks import #532

Closed
marban opened this issue Mar 31, 2024 · 14 comments
Labels
bug Something isn't working

Comments

@marban

marban commented Mar 31, 2024

ImportError: lxml.html.clean module is now a separate project lxml_html_clean.

@rithvikshetty

I tried installing lxml_html_clean separately, and the error no longer comes up. However, bare_extraction() stops working.
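
For code that has to run on either side of the lxml 5.2.0 split, one option is to probe both module locations and use whichever imports. This is only a minimal sketch: `import_first_available` is a hypothetical helper, not part of lxml or Trafilatura, and it assumes the standalone project is importable as `lxml_html_clean`, as the error message suggests.

```python
import importlib


def import_first_available(candidates):
    """Return the first module in `candidates` that imports cleanly, else None."""
    for name in candidates:
        try:
            return importlib.import_module(name)
        except ImportError:
            # Covers a missing package as well as the ImportError quoted above.
            continue
    return None


# Try the legacy location first, then the standalone project.
clean_module = import_first_available(["lxml.html.clean", "lxml_html_clean"])
print("cleaner module:", clean_module.__name__ if clean_module else "not available")
```

If neither module can be imported, the helper returns None instead of raising, so the caller can decide how to degrade.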

@cyber-emreclskn

Same problem. Has anyone solved this?
ImportError: lxml.html.clean module is now a separate project lxml_html_clean.

@romanthegentle

I downgraded to lxml==5.1.0, and that solved it for me.
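
To confirm which versions actually ended up in the environment after a downgrade, something like the following may help. It is a rough sketch using only the standard library; `installed_version` is a hypothetical helper, and the distribution names in the loop are assumed to match what is installed.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name):
    """Return the installed version string of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


# Report the packages discussed in this thread.
for name in ("lxml", "trafilatura", "justext"):
    print(name, installed_version(name) or "not installed")
```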

@cyber-emreclskn

But I use the trafilatura package, which depends on lxml. I tried this solution, and it did not work for me.

@UmairAhmadBaltoro

Using lxml==5.1.0 will fix your problem.

@adbar
Owner

adbar commented Apr 2, 2024

Alternatively, you can update Trafilatura to its latest version (1.8.0), in which the use of lxml.html.clean has been removed.

@adbar adbar closed this as completed Apr 2, 2024
@adbar
Owner

adbar commented Apr 2, 2024

No, this isn't correct: there is a problem with the justext dependency (miso-belica/jusText#47).

@adbar adbar reopened this Apr 2, 2024
@adbar adbar added the bug Something isn't working label Apr 2, 2024
@adbar
Owner

adbar commented Apr 2, 2024

This is now fixed in #535; I'll issue a new Trafilatura release shortly.

@adbar
Owner

adbar commented Apr 3, 2024

The issue is now fixed in release 1.8.1.

@adbar adbar closed this as completed Apr 3, 2024
@marban
Author

marban commented Apr 3, 2024

setup.py would still need an update:
requires lxml (>=4.9.4,<5.2.0).

@adbar
Owner

adbar commented Apr 4, 2024

It's already done, or can you be more specific?

@marban
Author

marban commented Apr 7, 2024

I can't upgrade lxml to 5.2.1 because setup.py wants:
"lxml >= 4.9.4, < 5.2.0; platform_system != 'Darwin' or python_version > '3.8'",
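
To illustrate why 5.2.1 is rejected, here is a simplified check of the version range in that requirement. It is a sketch under stated assumptions: `parse_release` and `in_pinned_range` are hypothetical helpers that only handle plain three-part numeric versions, they ignore the environment marker after the semicolon, and real installers apply the full PEP 440 rules instead.

```python
def parse_release(text):
    """Turn '5.2.1' into (5, 2, 1); plain numeric releases only."""
    return tuple(int(part) for part in text.split("."))


def in_pinned_range(candidate, lower="4.9.4", upper="5.2.0"):
    """True if lower <= candidate < upper, mirroring '>= 4.9.4, < 5.2.0'."""
    return parse_release(lower) <= parse_release(candidate) < parse_release(upper)


# Compare a few lxml versions mentioned in this thread against the pin.
for candidate in ("5.1.0", "5.2.0", "5.2.1"):
    print(candidate, in_pinned_range(candidate))
```

Under this pin, 5.1.0 is accepted while 5.2.0 and 5.2.1 both fall outside the upper bound.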

@adbar
Owner

adbar commented Apr 8, 2024

That's expected: the version pin is the bug fix for now.

vmttn added a commit to gip-inclusion/data-inclusion that referenced this issue Apr 16, 2024
@adbar
Owner

adbar commented May 13, 2024

LXML will be up-to-date in the next version, see #593.
