Scraping and basic exploratory analysis of game data from metacritic.com
- Scraping - Collecting and cleaning data from the Metacritic website. Currently covers only the "PC games" section. Python code based on the Scrapy framework.
- Data - Datasets produced by the scraping step. The same dataset is available in xls (Excel) and csv formats, along with a log file generated by the spider.
- Analysis - Exploratory analysis of the dataset. The main objectives are to explore the difference between user scores and critic scores, see how scores differ across years, and search for potential sources of bias. Done in R.
- to scrape the data yourself, run
scrapy crawl metacritic
from within /scraping/metacriticbot. See the Scrapy documentation for details.
- to get the already scraped dataset: http://is.gd/metacritic_xls
- to see the results of the analysis: raw | report (Russian) | conference slides (Russian)
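For a quick look at the csv dataset without R, the user-versus-critic comparison can be sketched in plain Python. This is only an illustration, not the analysis from this repository: the column names `year`, `critic_score` and `user_score` are assumptions (check the header of the actual file), and the sample rows are made up. It also assumes critic scores are on a 0-100 scale and user scores on a 0-10 scale, so user scores are multiplied by 10 before comparing.

```python
import csv
import io
from collections import defaultdict
from statistics import mean


def score_gap_by_year(rows):
    """Return {year: mean(critic_score - 10 * user_score)}.

    Column names are hypothetical; adjust them to the real dataset header.
    """
    gaps = defaultdict(list)
    for row in rows:
        try:
            year = int(row["year"])
            critic = float(row["critic_score"])
            user = float(row["user_score"])
        except (KeyError, TypeError, ValueError):
            # Skip rows with missing or non-numeric scores (e.g. "tbd").
            continue
        # Bring the user score to the critic scale before subtracting.
        gaps[year].append(critic - 10 * user)
    return {year: mean(values) for year, values in sorted(gaps.items())}


# Made-up sample standing in for the real csv file.
SAMPLE_CSV = """title,year,critic_score,user_score
Game A,2010,90,8.0
Game B,2010,70,8.0
Game C,2011,60,tbd
Game D,2011,85,7.5
"""

if __name__ == "__main__":
    reader = csv.DictReader(io.StringIO(SAMPLE_CSV))
    for year, gap in score_gap_by_year(reader).items():
        print(year, round(gap, 1))
```

To run it on the real data, replace `io.StringIO(SAMPLE_CSV)` with an opened file, for example `open(path, newline="")`, where `path` points to the csv file from the Data directory. A positive gap means critics rated games from that year higher than users did.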
Python
- Scrapy
- xlwt
R