chromaticwhale

Twitter bot for bogus travel reports. The bot is currently tweeting to @chromaticwhale.

There are two stages to the process.

  1. Production of a Tracery grammar.
  2. Generation of reports using the grammar.

Grammar Generation

The grammar includes several lists of terminals. These are pulled from DBpedia via its SPARQL endpoint. Current items of interest include Northern Rail stations and lines, European rodents, weather conditions and Japanese monsters.
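As a rough illustration of how lists of terminals might be folded into a Tracery grammar, here is a minimal sketch. The function name `build_grammar`, the symbol names, the sample rule and the placeholder terminals are all assumptions made for the example; in the project the lists come from DBpedia, and the real generation code may be organised quite differently.

```python
import json


def build_grammar(terminals, origin_rules):
    """Combine lists of terminals with top-level rules into a Tracery grammar."""
    grammar = {"origin": list(origin_rules)}
    for symbol, values in terminals.items():
        # Sort and de-duplicate so the generated file is stable between runs.
        grammar[symbol] = sorted(set(values))
    return grammar


# Placeholder data (hypothetical): the real lists are queried from DBpedia.
terminals = {
    "station": ["Leeds", "Hebden Bridge", "Leeds"],
    "rodent": ["bank vole", "edible dormouse"],
}
grammar = build_grammar(
    terminals,
    ["Delays at #station# due to #rodent# on the line."],
)
grammar_json = json.dumps(grammar, indent=2)
```

The resulting JSON text is what would be written out as the grammar file, with `origin` as the production rule that a report is expanded from.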

Report Generation

The grammar is then used to generate short reports using pytracery. These are then tweeted using the python-twitter API.

Hosting

Hosting was initially via cheapbotsdonequick, a really easy to use service that just takes a tracery grammar and will post to a given twitter account.

The bot is now hosted on a free heroku account using bespoke code. This gives more control over the timing of updates (using a heroku scheduler) and opens up the possibility of more sophisticated tweet generation. The downside is that the grammar is now under git control to make things easier at the heroku end. This isn't ideal in version control terms, as the grammar is itself generated by the python code.

Tweeting

The Python script tweetbot.py will generate a tweet from a grammar and attempt to tweet it. Arguments are:

	-c --config <config>
		Configuration file (see below)
	-n --notweet
		Generate and report only, do not actually post a status update.
	-u --noauth
		Don't attempt to authenticate (implies no tweeting)
	-x --override
		Override randomness, i.e. always generate a tweet
	-t --tweets <n>
		Produce n tweets
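
The options above could be declared with `argparse` roughly as follows. This is a sketch, not the code in tweetbot.py: the function names, the help text and the default of one tweet are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the documented tweetbot.py options."""
    parser = argparse.ArgumentParser(
        description="Generate a tweet from a Tracery grammar and post it."
    )
    parser.add_argument("-c", "--config", help="configuration file (JSON)")
    parser.add_argument("-n", "--notweet", action="store_true",
                        help="generate and report only, do not post")
    parser.add_argument("-u", "--noauth", action="store_true",
                        help="do not authenticate (implies no tweeting)")
    parser.add_argument("-x", "--override", action="store_true",
                        help="override randomness, always generate a tweet")
    # The default of 1 is an assumption; the README does not state one.
    parser.add_argument("-t", "--tweets", type=int, default=1, metavar="N",
                        help="produce N tweets")
    return parser


def parse_args(argv=None):
    """Parse arguments and apply the rule that --noauth implies --notweet."""
    args = build_parser().parse_args(argv)
    if args.noauth:
        args.notweet = True
    return args
```

Calling `parse_args(["--config", "config.json", "--noauth"])` would give a namespace with `notweet` set to `True`, even though `--notweet` was not passed.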

The configuration file is a JSON file that contains information about the grammar to be used, the production rule to use, and the frequency of updates. As an example, the @chromaticwhale account configuration is:

{
  "grammar": "grammar/whale.json",
  "production": "origin",
  "frequency": 3
}

Note that this assumes that the script is run from the top level directory, e.g.

python python/tweetbot.py --config config.json
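
Reading such a configuration file might look like the sketch below. The function name `load_config` and the check for missing keys are assumptions added for illustration; only the three key names come from the example configuration.

```python
import json

# The three keys shown in the @chromaticwhale configuration.
REQUIRED_KEYS = ("grammar", "production", "frequency")


def load_config(path):
    """Read the JSON configuration and check the documented keys are present."""
    with open(path, encoding="utf-8") as handle:
        config = json.load(handle)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise KeyError("missing configuration keys: " + ", ".join(missing))
    return config
```

Because the `grammar` value is a relative path, it is resolved against the current working directory, which is why the script needs to be run from the top level directory.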

On occasion, generated tweets may be longer than 140 characters. The tweetbot will regenerate (up to 10 times) until a tweet of the right length is obtained. If, after 10 attempts, a tweet of the correct size has not been generated, the script reports this and finishes. If the grammar is changed, these limits may need revisiting.
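
The regeneration loop can be sketched as below. The name `generate_tweet` and the idea of passing the generator in as a callable are assumptions for the example; in the bot, the callable would be whatever expands the grammar into a report.

```python
MAX_LENGTH = 140    # maximum tweet length described above
MAX_ATTEMPTS = 10   # number of generation attempts described above


def generate_tweet(generate, max_length=MAX_LENGTH, max_attempts=MAX_ATTEMPTS):
    """Call generate() until the text fits, or return None after max_attempts."""
    for _ in range(max_attempts):
        text = generate()
        if len(text) <= max_length:
            return text
    # No attempt produced a short enough tweet; the caller should report this.
    return None
```

Returning `None` rather than raising keeps the "report and finish" behaviour in the caller's hands.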

Otherwise, the frequency value is used to dictate whether or not the bot will tweet on that cycle. A random number is generated between 0 and frequency-1. If that number is 0, then a tweet is issued. If the value is non-zero, then no tweet is generated. Thus a frequency value of 1 guarantees that a tweet will be issued. A heroku scheduler allows for intervals of 10 minutes, hourly or daily, so the frequency mechanism gives us a certain element of randomness, and some flexibility in how often tweets appear. The @chromaticwhale scheduler fires up every hour, so with a frequency of 3 the bot will, on average, tweet once every three hours.
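
The frequency check amounts to a one-in-`frequency` draw, which might be written as follows. The function name `should_tweet` and its `override` and `rng` parameters are assumptions for the sketch; `override` stands in for the --override argument.

```python
import random


def should_tweet(frequency, override=False, rng=random):
    """Return True when a tweet should be issued on this cycle.

    frequency is assumed to be a positive integer, as in the configuration.
    """
    if override:
        return True
    # randrange(frequency) yields 0 .. frequency-1; only 0 triggers a tweet.
    return rng.randrange(frequency) == 0
```

With a frequency of 3, each scheduler run tweets with probability 1/3, so the expected gap between tweets is three runs. With an hourly scheduler that is three hours on average, although any individual gap can be shorter or much longer.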

Authentication to twitter is controlled via four variables: API_KEY, API_SECRET, ACCESS_TOKEN, ACCESS_SECRET. For obvious reasons, these are not held in the repository, but should be set as environment variables, either locally or via the heroku app.
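
Collecting the four variables from the environment might look like the sketch below. The function name `read_credentials` and the choice to fail early when a variable is missing are assumptions, not necessarily what tweetbot.py does.

```python
import os

# The four variable names listed above.
CREDENTIAL_NAMES = ("API_KEY", "API_SECRET", "ACCESS_TOKEN", "ACCESS_SECRET")


def read_credentials(environ=os.environ):
    """Return the four credentials, failing early if any is unset or empty."""
    missing = [name for name in CREDENTIAL_NAMES if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in CREDENTIAL_NAMES}
```

With python-twitter, these values would typically be passed to `twitter.Api` as `consumer_key`, `consumer_secret`, `access_token_key` and `access_token_secret`. Which variable maps to which argument is an assumption here, based on the names.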

If the bot has decided to tweet, then the credentials are used to establish an api connection. The account name is then reported. If the --notweet argument is not present, a status update is then made.