A web scraper for Airbnb. This script extracts information such as title, price, rating, and number of bedrooms for a given location and stores it in a CSV file. You can use it to track your next holiday destination or to collect data for analytics.
This project is inspired by X-technology. If you want to get a deeper understanding, visit the blog posts or webinar videos below.
Be aware that Airbnb may change the tag IDs referenced in airbnb_parser.py at any time. If your extracted file is missing data, you need to update them there.
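A quick way to notice that the tag IDs have gone stale is to count empty values per column in the extracted CSV file. The following is a minimal sketch, not part of this project: the column names are assumed from the fields listed above, the helper `missing_ratio` is hypothetical, and the sample rows are made up for illustration.

```python
import csv
import io

# Assumed column names, based on the fields mentioned in this README;
# the real CSV header may differ.
FIELDS = ["title", "price", "rating", "bedrooms"]


def missing_ratio(csv_file, fields=FIELDS):
    """Return the share of rows with an empty value, per column."""
    reader = csv.DictReader(csv_file)
    empty = {field: 0 for field in fields}
    total = 0
    for row in reader:
        total += 1
        for field in fields:
            # Treat a missing column, None, or whitespace as empty.
            if not (row.get(field) or "").strip():
                empty[field] += 1
    return {f: (empty[f] / total if total else 0.0) for f in fields}


# Made-up sample data standing in for an extracted file.
sample = io.StringIO(
    "title,price,rating,bedrooms\n"
    "Cozy loft,120,4.8,1\n"
    "Beach house,,4.9,3\n"
)
report = missing_ratio(sample)
for field, ratio in report.items():
    print(f"{field}: {ratio:.0%} empty")
```

To check a real file, pass an open file handle instead of the `io.StringIO` sample. A column that is mostly empty suggests its tag ID needs updating.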
- Clone the repository
- Create a virtual environment and activate it
virtualenv venv
source venv/bin/activate
- Install all required packages
pip install -r requirements.txt
- Run airbnb_run.py
python airbnb_run.py
Selenium requires a browser such as Google Chrome.
For deployment to a server, a headless version of Google Chrome is required, as well as a ChromeDriver.
Here is a nice guide for installing the headless version of Google Chrome.
Check your google-chrome version
google-chrome --version
Go to the ChromeDriver homepage and navigate to the driver file that matches your Chrome version and OS. For example, for Chrome version 95 on Linux this would be:
wget https://chromedriver.storage.googleapis.com/95.0.4638.69/chromedriver_linux64.zip
unzip chromedriver_linux64.zip
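The version lookup and download URL above can also be sketched in Python. This is only an illustration: it assumes the version command prints something like "Google Chrome 95.0.4638.69", it reuses the URL pattern from the wget example, and the helper names are hypothetical. The driver build number does not always equal the browser build number, so verify on the ChromeDriver homepage that the file exists.

```python
import re

# URL pattern taken from the wget example above.
BASE_URL = "https://chromedriver.storage.googleapis.com"


def parse_chrome_version(output):
    """Extract a four-part version such as '95.0.4638.69' from CLI output."""
    match = re.search(r"\d+\.\d+\.\d+\.\d+", output)
    if match is None:
        raise ValueError(f"no version number found in: {output!r}")
    return match.group(0)


def major_version(version):
    """Return the major version as an integer, e.g. 95."""
    return int(version.split(".")[0])


def driver_url(driver_version, platform="linux64"):
    """Build the download URL for a given ChromeDriver version."""
    return f"{BASE_URL}/{driver_version}/chromedriver_{platform}.zip"


# Assumed output format of `google-chrome --version`.
version = parse_chrome_version("Google Chrome 95.0.4638.69 ")
print(major_version(version))
print(driver_url(version))
```

To use real output instead of the hard-coded string, capture it with `subprocess.run(["google-chrome", "--version"], capture_output=True, text=True).stdout`.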
Test it by inserting the path to your extracted ChromeDriver into the following script:
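from selenium import webdriver
from selenium.webdriver.chrome.service import Service

# Run Chrome without a visible window
chrome_options = webdriver.ChromeOptions()
chrome_options.add_argument('--headless')
# Selenium 4 style: the driver path and its arguments go through a Service object
service = Service('YOUR_CHROMEDRIVER_PATH', service_args=['--verbose'])
driver = webdriver.Chrome(service=service, options=chrome_options)
driver.get('https://google.org')
print(driver.title)
driver.quit()  # close the browser and stop the driver process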
If you don't see any errors, the installation was successful. Now enable the commented-out part inside the function extract_listings_dynamic in airbnb_parser.py.
If you run into any error messages, please open an issue!
Series of articles on Medium:
- Part 0 - Intro to the project
- Part 1 - Scrape the data from Airbnb website
- Part 2 - More details to Web Scraping