walbuc/Django-Scrapy

Django and Scrapy

An example of how to use the Django ORM to store data obtained by a Scrapy spider in a database, and then expose the data through a REST API.

As an example, this project is set up to scrape Rolling Stone lists/rankings and store them in a relational database with proper data models.
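The heart of the ORM integration is a Scrapy item pipeline that persists each scraped item through a Django model. The sketch below is illustrative only: the model and field names (`RankingEntry`, `title`, `rank`) are hypothetical stand-ins for the project's actual data models.

```python
class DjangoWriterPipeline:
    """Hypothetical Scrapy item pipeline that stores items via the Django ORM.

    `model` is assumed to be a Django model class (e.g. a RankingEntry
    model defined in the project's Django app); the real names differ.
    """

    def __init__(self, model):
        self.model = model

    def process_item(self, item, spider):
        # get_or_create avoids inserting duplicate rows when the same
        # list is scraped more than once
        obj, created = self.model.objects.get_or_create(
            title=item["title"],
            defaults={"rank": item["rank"]},
        )
        return item
```

Scrapy hands every yielded item to `process_item`, so once the pipeline is registered in the Scrapy settings, scraped rows land in the database without any extra glue code.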

Non pip requirements

  • Python 2.7
  • pip
  • virtualenv
  • A message broker compatible with Celery; I use Redis
  • A database compatible with Django; I use SQLite 3 in development and PostgreSQL or MongoDB in production. If you are not familiar with how Django manages databases, go here

Installation

Clone project and install requirements in virtualenv

# install fabric in the global Python environment
pip install fabric
# clone repo
git clone git://github.com/drkloc/rstone_scrapper.git
cd rstone_scrapper
# setup app
fab DEV setup

For OSX users only

You need to install lxml with static dependencies before running pip against the requirements file:

STATIC_DEPS=true pip install lxml

Settings override

Any settings overrides (database config, broker config, etc.) are conveniently made inside settings_local.py. Just copy the demo file:

cp settings_local_demo.py settings_local.py

and start customizing whatever you want/need.
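For instance, a settings_local.py for local development might override the database and the Celery broker. The keys below are common Django/Celery settings, but the exact options available depend on what settings_local_demo.py in this repo actually exposes:

```python
# settings_local.py -- illustrative development overrides; check
# settings_local_demo.py for the options this project actually uses.

DATABASES = {
    "default": {
        "ENGINE": "django.db.backends.sqlite3",  # SQLite 3 for dev
        "NAME": "dev.sqlite3",
    }
}

# Celery broker: Redis running locally on the default port
BROKER_URL = "redis://localhost:6379/0"
```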

Start redis-server and the celery daemon

redis-server
python manage.py celeryd

Initialization

scrapy runspider scrap.py
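The contents of scrap.py are not shown here, but the core of any such spider is a parse callback that turns the list-page markup into items. A hedged sketch of that extraction step as a plain function, assuming a made-up markup shape (the real Rolling Stone HTML will differ):

```python
import re

def extract_entries(html):
    """Extract (rank, title) pairs from a hypothetical ranking page.

    Assumes entries look like <li class="entry">3. Song Title</li>;
    this is a placeholder pattern, not the actual site markup.
    """
    entries = []
    for m in re.finditer(r'<li class="entry">(\d+)\.\s*([^<]+)</li>', html):
        entries.append({"rank": int(m.group(1)), "title": m.group(2).strip()})
    return entries
```

In a real spider this logic would live in the `parse` method and use Scrapy selectors (CSS/XPath) rather than regular expressions, yielding each dict as an item for the pipeline to store.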

Running server

python manage.py runserver
