- clone the repository using git
git clone [email protected]:DeLeb86/immoscraper.git
- create a virtual environment using venv
python3 -m venv ~/.venv/eliza_scraping
- activate the virtual environment
source ~/.venv/eliza_scraping/bin/activate
- install required libraries with pip
pip install -r requirements.txt
- run the scrapy command
scrapy crawl immowescraper -o data/output.json
The output path matters: the post-processing step, which runs when the spider finishes, reads its data from that file.
- raw dataset: 176910 properties
- after removing null prices and postal codes: 116999 properties
- after removing postal codes that are not in Belgium: 115070 properties
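
The two filtering steps above could look roughly like the sketch below. It is only an illustration, not the project's actual post-processing code: the field names `price` and `postal_code`, the helper names, and the rule that a Belgian postal code is a four-digit number from 1000 to 9999 are all assumptions, and a row is treated as incomplete when either field is missing.

```python
import json


def load_rows(path="data/output.json"):
    """Load the scraped properties from the JSON file written by `-o`."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def is_belgian_postal_code(value):
    """Assumed rule: four ASCII digits, between 1000 and 9999."""
    text = str(value).strip()
    if len(text) != 4 or not (text.isascii() and text.isdigit()):
        return False
    return 1000 <= int(text) <= 9999


def clean_properties(rows, price_key="price", postal_key="postal_code"):
    """Return (rows with price and postal code, rows with a Belgian postal code).

    The key names are hypothetical; adapt them to the fields the spider emits.
    """
    # Step 1: drop rows where the price or the postal code is null.
    with_values = [
        row for row in rows
        if row.get(price_key) is not None and row.get(postal_key) is not None
    ]
    # Step 2: keep only rows whose postal code looks Belgian.
    belgian = [row for row in with_values if is_belgian_postal_code(row[postal_key])]
    return with_values, belgian


# Small made-up sample, just to show the two steps.
sample = [
    {"price": 250000, "postal_code": "1000"},
    {"price": None, "postal_code": "9000"},
    {"price": 180000, "postal_code": None},
    {"price": 320000, "postal_code": "75001"},
    {"price": 99000, "postal_code": 4000},
]
with_values, belgian = clean_properties(sample)
print(len(sample), len(with_values), len(belgian))
```

On the real file, `clean_properties(load_rows())` would give the counts to compare with the figures listed above.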
Now it's your turn to test!