Common Crawl to sentiment to Chernoff faces to world map superposition
#install lots of stuff, ask me
Install warctools (https://github.com/internetarchive/warctools) for warcfilter and warcextract.
This repo documents our 2nd place finish at the Big Open Data Hackathon, where we dissected the Common Crawl corpus of WARC files.
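
For orientation, here is a minimal sketch of what a WARC file looks like to a program. It is not the code used at the hackathon and it does not use warctools. It assumes an uncompressed WARC stream in which every record carries a Content-Length header. The names `iter_warc_records` and `make_record`, and the example URLs, are made up for illustration.

```python
import io


def iter_warc_records(stream):
    """Yield (headers, body) for each record in an uncompressed WARC stream.

    Illustrative only: assumes well-formed records with a Content-Length header.
    """
    while True:
        line = stream.readline()
        if not line:
            return  # end of stream
        if not line.strip():
            continue  # blank separator lines between records
        if not line.startswith(b"WARC/"):
            raise ValueError("expected WARC version line, got %r" % line)

        # Header lines run until the first empty line.
        headers = {}
        while True:
            line = stream.readline()
            if line in (b"\r\n", b"\n", b""):
                break
            name, _, value = line.decode("utf-8", "replace").partition(":")
            headers[name.strip()] = value.strip()

        # The record body is exactly Content-Length bytes.
        length = int(headers.get("Content-Length", 0))
        yield headers, stream.read(length)


def make_record(uri, payload):
    """Build one synthetic WARC record (hypothetical helper for the demo)."""
    head = (
        "WARC/1.0\r\n"
        "WARC-Type: response\r\n"
        f"WARC-Target-URI: {uri}\r\n"
        f"Content-Length: {len(payload)}\r\n"
        "\r\n"
    ).encode("utf-8")
    return head + payload + b"\r\n\r\n"


# Two fake records stand in for a real crawl segment.
sample = make_record("http://example.com/", b"hello world") + make_record(
    "http://example.org/", b"second page"
)
records = list(iter_warc_records(io.BytesIO(sample)))
for headers, body in records:
    print(headers["WARC-Target-URI"], len(body))
```

Real Common Crawl segments are gzip-compressed, so the stream would likely need to be opened with something like `gzip.open(path, "rb")` first. The body of each response record is what would then go on to sentiment scoring.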
Team Members (in alphabetical order):
Alex Aruj, Andrew Defries, Adam Ericksen, Trent Robbins, Ed Tsang, Amir Youssefi