introduce MAX_REDIRECTS config setting and fix urllib3 redirect handling #461
Tests are failing with the following:
I can't reproduce this locally, and I don't see how my changes could have affected this part 🤔
Oh no... lxml 5.0 was released just yesterday: https://pypi.org/project/lxml/#history
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files
@@            Coverage Diff             @@
##           master     #461      +/-   ##
==========================================
+ Coverage   96.76%   96.91%   +0.15%
==========================================
  Files          22       22
  Lines        3367     3370       +3
==========================================
+ Hits         3258     3266       +8
+ Misses        109      104       -5
It works, thanks, but does it really fix the issue urllib3/urllib3#2475? Merging this PR would close it.
It doesn't, but will it close it? I wouldn't think merging a PR could close an issue living in another repository.
You could unlink it to be sure (I cannot).
done, I edited the description to quote the issue URL |
Fixes issue #450

After setting `MAX_REDIRECTS` to 5, I could fetch the original URL from the issue:
`trafilatura -u https://www.hydrogeninsight.com/production/breaking-us-reveals-the-seven-regional-hydrogen-hubs-to-receive-7bn-of-government-funding/2-1-1534596`

I also fixed this old issue: #128

The underlying urllib3 bug has not been fixed: github.com/urllib3/urllib3/issues/2475

I had to pass the retry strategy to the actual request method: it doesn't propagate from the pool manager.
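
For readers following along, here is a minimal sketch of the workaround described above, not the actual diff from this PR. It assumes the setting lives in a `[DEFAULT]` section of an INI-style config. The helper names `build_retry` and `fetch`, the inline config text, and the 30-second timeout are hypothetical and only there for illustration. The idea is to build one `Retry` object from `MAX_REDIRECTS` and hand it to `request()` explicitly, rather than relying on the pool manager to pass it along.

```python
import configparser

import urllib3
from urllib3.util import Retry

# Hypothetical config text; the section name and value are assumptions.
CONFIG_TEXT = """
[DEFAULT]
MAX_REDIRECTS = 5
"""


def build_retry(config_text=CONFIG_TEXT):
    """Build a urllib3 Retry strategy from the MAX_REDIRECTS setting."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    max_redirects = parser.getint("DEFAULT", "MAX_REDIRECTS")
    # `total` takes precedence over the other counters, so it must be
    # at least as large as `redirect` for the redirect budget to apply.
    return Retry(
        total=max_redirects,
        redirect=max_redirects,
        raise_on_redirect=False,
    )


def fetch(url, retry):
    """Fetch a URL, passing the retry strategy on the request itself."""
    http = urllib3.PoolManager(retries=retry)
    # Passing `retries` here as well is the workaround: the strategy is
    # given to the request directly instead of only to the pool manager.
    return http.request("GET", url, retries=retry, timeout=30)
```

With `raise_on_redirect=False`, exhausting the redirect budget returns the last 3xx response instead of raising `MaxRetryError`, so the caller can inspect the status code and decide what to do.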