AttributeError: 'NoneType' object has no attribute 'find_all' #154
Comments
The error persists even when supplying single dates.
@ryanelittle Can you share the code or CLI command that is triggering the error?
I am using Site.search_by_date in a custom class. This is my function:
@ryanelittle Great. Can you also provide the date ranges you're using? Sounds like it may generally be broken, but I wouldn't mind trying to test with the exact parameters you've tried so far.
@ryanelittle Oh, also, if you could supply the value stored in
I tried a few. None of them worked. Just tried 'ok_atoka', '2020-03-01', '2020-03-01', and it did not work.
@ryanelittle The bug appears to be due to the OSCN site now rejecting web requests that carry the default Python User-Agent supplied by the requests library. This must be new(ish) behavior, since the code was working a few months back when we created it. Anyhow, the site now treats such requests as unauthorized and returns a 403 error page, which does not contain the expected elements and therefore triggers the error we're seeing at the BeautifulSoup layer. Providing a realistic User-Agent header appears to fix the problem. Updating the code in search.py to pass a User-Agent header that mimics a more realistic browser should fix the issue. In the short term, if you need to press forward on your project, I would just fork and hard-code a User-Agent.
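A minimal sketch of the workaround described above, assuming the scraper fetches search pages with requests.get; the URL and the hard-coded User-Agent string are illustrative placeholders, not the exact code in search.py:

```python
import requests

# Hypothetical example: supply a realistic browser User-Agent so OSCN
# does not answer with its 403 error page instead of search results.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
        "AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0 Safari/537.36"
    )
}

# Placeholder search URL; the real query parameters live in search.py.
url = "https://www.oscn.net/dockets/Results.aspx"
response = requests.get(url, headers=HEADERS)
response.raise_for_status()  # raises if OSCN still returns 403
```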
Thank you for the fix @zstumgoren.
@ryanelittle Sure thing. We'll try to ship a proper release to PyPI containing the bug fix in the near future. We'll leave this ticket open until then. Meantime, thanks for bringing it to our attention!
@zstumgoren I've used fake-useragent (https://pypi.org/project/fake-useragent/) to randomize my User-Agents in the past. It might be a good solution so court-scraper doesn't send the same header for everyone who uses it.
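A short sketch of that suggestion, using fake-useragent's UserAgent().random property to pick a different real-world header on each request; the request URL is again a placeholder:

```python
import requests
from fake_useragent import UserAgent

ua = UserAgent()

# Each access to ua.random returns a different browser User-Agent string,
# so repeated scraper runs do not all share one identical header.
headers = {"User-Agent": ua.random}
response = requests.get("https://www.oscn.net/dockets/Results.aspx", headers=headers)
```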
I have been using Court Scraper to scrape OSCN. Counties that do not use DailyFilings will not return a list of case numbers when searching for all case numbers in a given year (start_date = 20TK-1-1, end_date = 20TK-12-31).
Looking in the code, I found this note: "Always limit query to a single filing date, to minimize chances of truncate results." I did not expect this behavior based on the documentation. Could the code be changed to behave in the same way as DailyFilings? I.e., when provided a date range, Search queries each date individually and returns results for the whole range?
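A minimal sketch of the per-date behavior being requested, assuming a site object exposing a search_by_date(start_date, end_date) method that accepts ISO date strings and returns a list (both assumptions, not confirmed from the library's code); the loop splits a large range into single-day queries and merges the results:

```python
from datetime import date, timedelta

def search_range_one_day_at_a_time(site, start_date, end_date):
    """Query each filing date separately and merge the results.

    `site` is assumed to expose search_by_date(start_date, end_date);
    querying one day at a time avoids truncated multi-day result pages.
    """
    results = []
    current = start_date
    while current <= end_date:
        day = current.isoformat()  # e.g. "2020-03-01"
        results.extend(site.search_by_date(start_date=day, end_date=day))
        current += timedelta(days=1)
    return results

# Example: all of March 2020 for a single county site object.
# results = search_range_one_day_at_a_time(site, date(2020, 3, 1), date(2020, 3, 31))
```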