Wrapper Still Maintained? #116

Open
tangunner opened this issue Aug 17, 2022 · 10 comments

Comments

@tangunner

Hi - I was just wondering if this wrapper is still maintained? I tried a few different endpoints but received a couple of different types of errors, and I wasn't sure if they were caused by a bad local installation on my end or if the wrapper is just no longer supported and CL has been updated since. Thanks!

@irahorecka
Contributor

Do you have a reproducible snippet of code that threw the error(s)?

@jsudano

jsudano commented Sep 8, 2022

Getting similar issues here. This is the error I see whenever I try to search:

Traceback (most recent call last):
  File "cl_scraper.py", line 39, in <module>
    found_posts.update({ result['id'] : result for result in CL_query.get_results() })
  File "cl_scraper.py", line 39, in <dictcomp>
    found_posts.update({ result['id'] : result for result in CL_query.get_results() })
  File "/home/jsudano/projects/cl_scraper/venv/lib/python3.7/site-packages/craigslist/base.py", line 192, in get_results
    for row in rows.find_all('li', {'class': 'result-row'},
AttributeError: 'NoneType' object has no attribute 'find_all'

Was able to reproduce it quite simply:

Python 3.7.3 (default, Jul 25 2020, 13:03:44)
[GCC 8.3.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from craigslist import CraigslistForSale
>>> CL_query = CraigslistForSale(site='sfbay', category='mca')
>>> for e in CL_query.get_results():
...     print(e)
...
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/jsudano/projects/cl_scraper/venv/lib/python3.7/site-packages/craigslist/base.py", line 191, in get_results
    for row in rows.find_all('li', {'class': 'result-row'},
AttributeError: 'NoneType' object has no attribute 'find_all'
>>>

I dug a little deeper and looked at the HTML response from craigslist. It looks like CL won't respond to requests unless you have a "browser with javascript enabled":

<noscript id="no-js"><div>
<p>We've detected that JavaScript is not enabled in your browser.</p>
<p>You must enable JavaScript to use craigslist.</p>
</div></noscript>
<div id="unsupported-browser">
<p>We've detected you are using a browser that is missing critical features.</p>
<p>Please visit craigslist from a modern browser.</p>
</div>

I'm guessing this would break the library in most cases.
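
For anyone trying to confirm they are hitting the same thing, here is a minimal sketch of a check on the raw response body. It uses only the standard library and looks for the two element ids from the HTML above (`no-js`, `unsupported-browser`) plus the `li.result-row` elements the traceback shows the wrapper searching for. The names `diagnose_response` and `_MarkerScanner` and the three return labels are made up for illustration; they are not part of the wrapper.

```python
from html.parser import HTMLParser


class _MarkerScanner(HTMLParser):
    """Collects element ids and counts <li class="result-row"> elements."""

    def __init__(self):
        super().__init__()
        self.ids = set()
        self.result_rows = 0

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("id"):
            self.ids.add(attrs["id"])
        # An attribute without a value is reported as None, hence the `or ""`.
        classes = (attrs.get("class") or "").split()
        if tag == "li" and "result-row" in classes:
            self.result_rows += 1


def diagnose_response(html):
    """Classify a search page body (hypothetical helper, not library API).

    Returns "ok" if result rows are present, "js-wall" if the page carries
    the JavaScript / unsupported-browser notice, else "no-results-markup".
    """
    scanner = _MarkerScanner()
    scanner.feed(html)
    if scanner.result_rows:
        return "ok"
    if {"no-js", "unsupported-browser"} & scanner.ids:
        return "js-wall"
    return "no-results-markup"
```

If this reports `"js-wall"` for the body you get back, the `AttributeError` above is most likely the parser failing to find the results container in that page, not a problem with your local install.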

@pancho-villa

I'm having the same issue. I just downloaded this yesterday and it worked fine for a bunch of queries. Today I fired it up again and get this every time.

@natez311

Hi,

Also having the same issue

Traceback (most recent call last):
  File "forSale.py", line 1, in <module>
    from craigslist import CraigslistJobs, CraigslistForSale
  File "/Users/nathanielhurwitz/.pyenv/versions/3.8.10/lib/python3.8/site-packages/craigslist/__init__.py", line 1, in <module>
    from .craigslist import (
  File "/Users/nathanielhurwitz/.pyenv/versions/3.8.10/lib/python3.8/site-packages/craigslist/craigslist.py", line 1, in <module>
    from .base import CraigslistBase
  File "/Users/nathanielhurwitz/.pyenv/versions/3.8.10/lib/python3.8/site-packages/craigslist/base.py", line 17, in <module>
    ALL_SITES = utils.get_all_sites() # All the Craiglist sites
  File "/Users/nathanielhurwitz/.pyenv/versions/3.8.10/lib/python3.8/site-packages/craigslist/utils.py", line 40, in get_all_sites
    response = requests.get(ALL_SITES_URL)
  File "/Users/nathanielhurwitz/.pyenv/versions/3.8.10/lib/python3.8/site-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/Users/nathanielhurwitz/.pyenv/versions/3.8.10/lib/python3.8/site-packages/requests/api.py", line 61, in request
    return session.request(method=method, url=url, **kwargs)
  File "/Users/nathanielhurwitz/.pyenv/versions/3.8.10/lib/python3.8/site-packages/requests/sessions.py", line 529, in request
    resp = self.send(prep, **send_kwargs)
  File "/Users/nathanielhurwitz/.pyenv/versions/3.8.10/lib/python3.8/site-packages/requests/sessions.py", line 645, in send
    r = adapter.send(request, **kwargs)
  File "/Users/nathanielhurwitz/.pyenv/versions/3.8.10/lib/python3.8/site-packages/requests/adapters.py", line 519, in send
    raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: HTTPConnectionPool(host='www.craigslist.org', port=80): Max retries exceeded with url: /about/sites (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x10b0cf430>: Failed to establish a new connection: [Errno 60] Operation timed out'))

@moozzyk

moozzyk commented Oct 3, 2022

I came across this yesterday and it doesn't seem like the wrapper can work anymore. I used a headless browser to work around this. See https://blog.3d-logic.com/2022/10/02/craigslist-automation for more details and a prototype.

@brandomando

+1

@f3mshep

f3mshep commented Mar 21, 2023

@irahorecka @juliomalegria

I put together a fork of this project that uses selenium + a chrome webdriver that seems to bypass the craigslist bot detection. I also had to change the CSS classes in the scraping bit to get things working. If there is interest, I can clean up the code a bit and submit a formal PR.
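
For readers unfamiliar with the approach, here is a rough sketch of the general idea of fetching a page through headless Chrome with Selenium so that the site's JavaScript runs before the HTML is read. This is not the fork's code. The function names, the URL pattern in `search_url`, and the timeout default are assumptions for illustration, and it needs the `selenium` package plus a local Chrome install.

```python
def search_url(site, category):
    """Build a search URL. The URL pattern is an assumption for illustration."""
    return f"https://{site}.craigslist.org/search/{category}"


def fetch_rendered_html(url, timeout_seconds=30):
    """Load `url` in headless Chrome and return the page source (hypothetical helper)."""
    # Imported lazily so search_url() works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.options import Options

    options = Options()
    # Recent Chrome versions accept "--headless=new"; older ones use "--headless".
    options.add_argument("--headless=new")
    driver = webdriver.Chrome(options=options)
    try:
        driver.set_page_load_timeout(timeout_seconds)
        driver.get(url)
        # page_source reflects the DOM at the time of the call; content that
        # is rendered later by scripts may need an explicit wait first.
        return driver.page_source
    finally:
        # Always shut the browser down, even if the page load fails.
        driver.quit()
```

Usage would look something like `html = fetch_rendered_html(search_url("sfbay", "mca"))`, after which the HTML still has to be parsed with whatever selectors match the current page markup.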

@pancho-villa

Sounds good to me, I doubt you'd even have to submit a pull request, I'd just follow your fork.

@f3mshep

f3mshep commented Mar 30, 2023

Fork is here for anyone curious:
https://github.com/f3mshep/python-craigslist-headless

@genialtechie

Fork is here for anyone curious: https://github.com/f3mshep/python-craigslist-headless

@f3mshep Do you know if this fork still works? I am a bit new to Python, and I am trying to use it for a little project, but I keep getting the same NoneType error referenced in the comments above. A screenshot of my code is below as well. Any help would be appreciated.

...
AttributeError: 'NoneType' object has no attribute 'find_all'

Screenshot


9 participants