## fly_offer: a website that helps real programmers get better offers
-
architecture
- a simple search engine
  - crawler_task (spider): crawls the page data
  - parser: parses the crawled pages and saves the result
  - index_builder: builds the index
  - query_man (searcher): provides the query service
- website: written in Python with Flask
-
usage:
-
preparation
- prepare the directories
mkdir data cleaned_data
- start redis
-
crawl data
python crawler_task
Or you can write a custom spider.
For reference, viper-py is a better distributed crawler, also written by me.
The page data will be put in the data directory you created just now.
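
If you write a custom spider, the only contract this step needs is that raw pages end up in the data directory. The following is a minimal sketch of that idea, not the code in crawler_task: the names `fetch`, `save_page` and `crawl`, and the choice to name files by a hash of the URL, are assumptions made for illustration.

```python
import hashlib
import os
import urllib.request


def fetch(url, timeout=10):
    """Download one page and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return resp.read()


def save_page(url, body, out_dir="data"):
    """Write the raw page into out_dir, named by a hash of its URL."""
    os.makedirs(out_dir, exist_ok=True)
    name = hashlib.sha256(url.encode("utf-8")).hexdigest() + ".html"
    path = os.path.join(out_dir, name)
    with open(path, "wb") as f:
        f.write(body)
    return path


def crawl(urls, out_dir="data", fetcher=fetch):
    """Fetch every URL once and return the paths of the saved files."""
    saved = []
    for url in urls:
        try:
            body = fetcher(url)
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            print("skip %s: %s" % (url, exc))
            continue
        saved.append(save_page(url, body, out_dir))
    return saved
```

The `fetcher` parameter is there so the download function can be swapped out, for example for a stub in tests or for a client with retries.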
-
parse data and build index
python index_build.py
The cleaned data has all HTML tags removed and is then put in the cleaned_data directory.
It is better to parse the pages for clean data and build the index in the same step.
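
To illustrate cleaning and indexing in one pass, here is a minimal sketch using only the standard library. It is not the code in index_build.py: `TextExtractor`, `clean_html` and `build_index` are hypothetical names, the index is assumed to be a plain in-memory dict, and the `\w+` tokenizer is a stand-in (it does not segment Chinese text into words).

```python
import re
from collections import defaultdict
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, skipping the contents of <script> and <style>."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def clean_html(html):
    """Return the page text with all tags removed and whitespace collapsed."""
    extractor = TextExtractor()
    extractor.feed(html)
    extractor.close()
    return " ".join(" ".join(extractor.parts).split())


def build_index(pages):
    """Clean each page and index it in the same pass.

    pages maps a document id to raw HTML. The result is (cleaned, index),
    where cleaned maps a document id to its text and index maps a term
    to the set of document ids that contain it.
    """
    cleaned = {}
    index = defaultdict(set)
    for doc_id, html in pages.items():
        text = clean_html(html)
        cleaned[doc_id] = text
        for term in re.findall(r"\w+", text.lower()):
            index[term].add(doc_id)
    return cleaned, dict(index)
```

Doing both in one loop means each page is read and parsed only once, and the cleaned text that is saved is exactly the text that was indexed.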
-
query service
The program for this step is query_man.py.
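
As a rough picture of what a query service does with an inverted index, the sketch below returns the documents that contain every query term. It is an assumption-laden illustration, not the logic of query_man.py: the function name `search` is hypothetical, and the index is assumed to be a dict mapping each term to a set of document ids.

```python
import re


def search(index, query):
    """Return the sorted ids of documents that contain every query term."""
    terms = re.findall(r"\w+", query.lower())
    if not terms:
        return []
    # Intersect the smallest posting sets first to keep the work small.
    postings = sorted((index.get(t, set()) for t in terms), key=len)
    result = set(postings[0])
    for posting in postings[1:]:
        result &= posting
    return sorted(result)


# Usage with a tiny hand-built index:
example_index = {"hello": {"a", "b"}, "world": {"a"}}
print(search(example_index, "Hello world"))  # ['a']
print(search(example_index, "hello"))        # ['a', 'b']
```

A term that is missing from the index yields an empty set, so any query containing it returns no results.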
-
run website
python app.py
-
-
todo
- optimize model.py
- query suggestion function
- optimize the display of query results