realtime Twitter trending hashtags computation using RedStorm / Storm

zeroDivisible/tweitgeist

 
 

Tweitgeist v1.2.0

Tweitgeist analyses the Twitter Spritzer hose and computes the top trending hashtags in realtime using RedStorm/Storm. Besides being a cool Storm example, what makes this interesting is that the architecture will work at full Twitter Firehose scale without much modification.

There are three components:

  • The Twitter Spritzer stream reader, which pushes messages onto a Redis queue
  • The RedStorm analyser, which reads the Twitter stream queue, computes the trending hashtags and outputs the top N list to a Redis queue every 5 seconds
  • The viewer UI for the visualization
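The analyser's core idea, counting hashtags and ranking the top N, can be sketched in a few lines of plain Ruby. This is only an illustration under stated assumptions, not the actual topology code: the `HashtagCounter` class, its hashtag regex, and its tie-breaking rule are all hypothetical, and it keeps everything in one process instead of distributing the counts across Storm bolts.

```ruby
# Hypothetical sketch: NOT the actual Tweitgeist bolts, just the counting idea.
class HashtagCounter
  # Assumption: a hashtag is '#' followed by word characters.
  HASHTAG = /#(\w+)/

  def initialize
    @counts = Hash.new(0)
  end

  # Extract the hashtags from one tweet text and increment their counts.
  # Tags are lowercased so "#Storm" and "#storm" count as the same tag.
  def add(text)
    text.scan(HASHTAG).flatten.each { |tag| @counts[tag.downcase] += 1 }
    self
  end

  # Return the n most frequent hashtags as [tag, count] pairs,
  # highest count first; ties are broken alphabetically.
  def top(n)
    @counts.sort_by { |tag, count| [-count, tag] }.first(n)
  end
end

counter = HashtagCounter.new
[
  "Learning #Storm with #JRuby",
  "#storm topologies are fun",
  "more #Storm"
].each { |text| counter.add(text) }

p counter.top(2) # prints [["storm", 3], ["jruby", 1]]
```

A real deployment would also need to expire old counts (for example with a rolling time window) so that the ranking reflects current trends rather than all-time totals.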

Dependencies

This has been tested on OS X 10.6+ and Ubuntu Linux 11.10 and 12.04, using JRuby 1.6.x for the RedStorm topology and Ruby 1.9.x for the Twitter Spritzer hose reader.

Installation

  • Redis is required
  • RVM is highly recommended, as you will need to work with both Ruby and JRuby, each with its own gemset.

Redstorm backend

  • requires JRuby 1.6.x

  • set JRuby in 1.9 mode by default

    export JRUBY_OPTS=--1.9
  • install the RedStorm gem using bundler with the supplied Gemfile

    $ bundle install
  • run RedStorm installation

    $ bundle exec redstorm install
  • package the topology required gems

    $ bundle exec redstorm bundle topology
  • if you plan on running the topology on a cluster, package the topology jar

    $ bundle exec redstorm jar lib/tweitgeist/
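
Before running the steps above, it can help to confirm that the `JRUBY_OPTS` setting took effect. The following is a small sanity-check sketch; the helper name `jruby_19_mode?` is hypothetical and not part of RedStorm. It relies only on the standard `RUBY_ENGINE` and `RUBY_VERSION` constants.

```ruby
# Hypothetical helper: returns true when the interpreter looks like what the
# RedStorm backend expects, i.e. JRuby running in Ruby 1.9 compatibility mode.
def jruby_19_mode?(engine = RUBY_ENGINE, version = RUBY_VERSION)
  engine == "jruby" && version.start_with?("1.9")
end

# Report what the current interpreter says about itself.
puts "engine=#{RUBY_ENGINE} version=#{RUBY_VERSION} ok=#{jruby_19_mode?}"
```

Saved to a file of your choosing and run with `jruby`, this should report `ok=true` once `JRUBY_OPTS=--1.9` is exported in the current shell.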

Twitter Spritzer stream reader

  • requires Ruby 1.9.x

  • install required gems using bundler with the supplied Gemfile

    $ bundle install

Viewer

  • requires Node.js

    $ sudo apt-get install nodejs
  • requires npm

    $ sudo apt-get install npm
  • install CoffeeScript if you want to modify the Node.js server

    $ npm install -g coffee-script
  • install other dependencies

    $ cd lib/viewer
    $ npm install .

Usage overview

Redstorm backend

  • requires JRuby 1.6.x

  • set JRuby in 1.9 mode by default

    export JRUBY_OPTS=--1.9

RedStorm backend in local mode.

$ bundle exec redstorm local lib/tweitgeist/storm/tweitgeist_topology.rb

RedStorm backend in remote cluster mode.

$ bundle exec redstorm cluster lib/tweitgeist/storm/tweitgeist_topology.rb

Twitter Spritzer stream reader

  • requires Ruby 1.9.x

  • edit config/twitter_reader.rb to add your credentials

$ ruby lib/tweitgeist/twitter/twitter_reader.rb
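
The reader and the analyser are decoupled by a Redis queue. As a rough illustration of that hand-off, the sketch below pushes messages on one end of a list and pops them from the other, which gives first-in, first-out order. Everything named here is an assumption rather than Tweitgeist's actual code: the `tweets` key, the JSON encoding, and the `publish`/`consume` helpers are hypothetical, and `FakeQueue` is an in-memory stand-in so the example runs without a Redis server. A `Redis` client from the `redis` gem exposes `lpush` and `rpop` with the same shape.

```ruby
require "json"

# In-memory stand-in for the two Redis list operations used below.
class FakeQueue
  def initialize
    @lists = Hash.new { |hash, key| hash[key] = [] }
  end

  # Push a value onto the head of the list; returns the new list length.
  def lpush(key, value)
    @lists[key].unshift(value)
    @lists[key].size
  end

  # Pop a value from the tail of the list; returns nil when the list is empty.
  def rpop(key)
    @lists[key].pop
  end
end

QUEUE_KEY = "tweets" # hypothetical key name

# Producer side (the stream reader): serialize and enqueue one message.
def publish(queue, tweet)
  queue.lpush(QUEUE_KEY, JSON.generate(tweet))
end

# Consumer side (the analyser): dequeue and parse one message, or nil if empty.
def consume(queue)
  raw = queue.rpop(QUEUE_KEY)
  raw && JSON.parse(raw)
end

queue = FakeQueue.new
publish(queue, "text" => "first #storm")
publish(queue, "text" => "second #jruby")
first_message = consume(queue)
puts first_message["text"]
```

Because the producer and consumer only share the queue, either side can be restarted or scaled without the other noticing, which is what lets the same layout absorb a larger stream.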

Viewer

$ coffee server.coffee --port 8080 --host 127.0.0.1 --redis-port 6379 --redis-host 127.0.0.1

or, with simulated data when Redis is not available:

$ coffee server.coffee --port 8080 --host 127.0.0.1 --mock

Author

Colin Surprenant, @colinsurprenant, https://github.com/colinsurprenant

Contributors

Francois Lafortune, @quickredfox, https://github.com/quickredfox

Nicholas Brochu, @nbrochu, https://github.com/nbrochu

License

Tweitgeist is distributed under the Apache License, Version 2.0.
