Using the Disco framework (Chango Search dataset & Million Song Dataset)
https://github.com/ashchristopher/HackReduceToronto
Bartek Ciszkowski (@bartek), Ash Christopher (@ashchristopher)
(Hack/Reduce 2 Toronto)
Bartek and Ash analyzed the search queries made over the course of one day. They grouped the queries into four categories (travel, sex, nerd and cooking) and then looked at how the popularity of each category varied during the day.
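A minimal sketch of how such a per-hour category count could be structured in a map/reduce style. This is plain Python, not the Disco API and not the authors' code. The keyword lists, the `(hour, query)` record format, and all function names are assumptions made for illustration.

```python
from collections import Counter

# Hypothetical keyword lists; the lists the project actually used are not given.
CATEGORY_KEYWORDS = {
    "travel": {"flight", "hotel", "vacation"},
    "sex": {"sex"},
    "nerd": {"linux", "python", "starcraft"},
    "cooking": {"recipe", "bake", "oven"},
}


def categorize(query):
    """Return every category whose keywords share a word with the query."""
    words = set(query.lower().split())
    return [name for name, keywords in CATEGORY_KEYWORDS.items() if words & keywords]


def map_query(record):
    """Map step: emit ((hour, category), 1) for each category a query matches."""
    hour, query = record  # assumed record format: (hour of day, query text)
    for category in categorize(query):
        yield (hour, category), 1


def reduce_counts(pairs):
    """Reduce step: sum the emitted counts per (hour, category) key."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def popularity_by_hour(records):
    """Run the map step over all records, then reduce the results."""
    pairs = (pair for record in records for pair in map_query(record))
    return reduce_counts(pairs)
```

Calling `popularity_by_hour` on a list of `(hour, query)` tuples returns a `Counter` keyed by `(hour, category)`. Queries that match no keyword list are simply not counted.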
https://github.com/joeyrobert/hackreduce
(Hack/Reduce 2 Toronto)
Joel, Johan, Joey and Ian used a 10,000-song subset of the Million Song Dataset. They used the Disco distributed computing framework with Python.
They analyzed:
- The most romantic year, by looking for the word "love" in song titles
- The variation of words in song titles (only 100 words are used in song titles)
- Average song tempo per year
- Song lengths per year
- Saddest tones (turns out D is really sad)
- Recording locations
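Two of these analyses, "love" in titles per year and average tempo per year, could be sketched in a map/reduce style as follows. This is plain Python rather than the Disco API, and it is not the team's code. The dictionary record layout (`title`, `year`, `tempo`), the whole-word match on "love", and the choice to skip songs whose year is 0 or missing are all assumptions.

```python
from collections import defaultdict


def map_song(song):
    """Map step: emit (year, (has_love, tempo)) for one song record."""
    year = song.get("year", 0)
    if year <= 0:
        # Assumption: a year of 0 (or a missing year) means "unknown"; skip it.
        return
    # Whole-word match on whitespace-split words, so "Lovely" does not count
    # and a word with attached punctuation (e.g. "Love,") is missed.
    has_love = 1 if "love" in song["title"].lower().split() else 0
    yield year, (has_love, song["tempo"])


def reduce_by_year(pairs):
    """Reduce step: per year, count 'love' titles and average the tempo."""
    grouped = defaultdict(list)
    for year, value in pairs:
        grouped[year].append(value)

    summary = {}
    for year, values in grouped.items():
        love_titles = sum(flag for flag, _ in values)
        avg_tempo = sum(tempo for _, tempo in values) / len(values)
        summary[year] = {"love_titles": love_titles, "avg_tempo": avg_tempo}
    return summary


def analyze(songs):
    """Run the map step over all songs, then reduce the results by year."""
    pairs = (pair for song in songs for pair in map_song(song))
    return reduce_by_year(pairs)
```

Under these assumptions, the "most romantic year" would be the key with the largest `love_titles` value, for example `max(result, key=lambda y: result[y]["love_titles"])`.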