Twitter bot for bogus travel reports. The bot is currently tweeting to @chromaticwhale.
There are two stages to the process.
- Production of a Tracery grammar.
- Generation of reports using the grammar.
The grammar includes several lists of terminals. These are pulled from dbpedia via the SPARQL endpoint. Current items of interest include Northern Rail stations and lines, European rodents, weather conditions and Japanese monsters.
The grammar is then used to generate short reports using pytracery. These are then tweeted using the python-twitter API.
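To make the idea of grammar expansion concrete, here is a minimal sketch of how a Tracery-style grammar turns into a report. It does not use pytracery itself: `flatten` is a hypothetical stand-in that only handles plain `#symbol#` references, and the rules in `RULES` are made-up samples, not the contents of the real grammar.

```python
import random
import re

# Hypothetical sample rules for illustration only; the real grammar is
# generated from dbpedia and is much larger.
RULES = {
    "origin": ["Delays at #station# due to #weather#.",
               "#monster# reported on the line near #station#."],
    "station": ["Leeds", "Sheffield"],
    "weather": ["heavy rain", "fog"],
    "monster": ["Godzilla", "Mothra"],
}


def flatten(rules, text, rng=random):
    """Recursively replace each #symbol# with a random expansion of that rule."""
    def expand(match):
        options = rules[match.group(1)]
        return flatten(rules, rng.choice(options), rng)
    return re.sub(r"#(\w+)#", expand, text)


report = flatten(RULES, "#origin#")
print(report)
```

The real bot relies on pytracery for this step, which also supports modifiers and other Tracery features that this sketch leaves out.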
Hosting was initially via cheapbotsdonequick, a really easy to use service that just takes a tracery grammar and will post to a given twitter account.
The bot is now hosted on a free heroku account using bespoke code. This gives some more control over the timing of updates (using a heroku scheduler) and also opens up the possibility of some more sophisticated tweet generation. As a downside, the grammar is now under git control. This makes things easier at the heroku end, but isn't ideal in version control terms, as the grammar is actually generated by the python code.
The Python script tweetbot.py will generate a tweet from a grammar and attempt to tweet it. Arguments are:
-c --config <config>
Configuration file (see below)
-n --notweet
Generate and report only, do not actually post a status update.
-u --noauth
Don't attempt to authenticate (Implies no tweeting)
-x --override
Override randomness, i.e. always generate a tweet
-t --tweets <n>
Produce n tweets
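As a rough illustration, the arguments above could be declared with argparse as follows. This is a sketch, not the actual code in tweetbot.py: the function name `build_parser`, the help strings and the default of one tweet are assumptions.

```python
import argparse


def build_parser():
    """Build a parser mirroring the arguments listed above (hypothetical sketch)."""
    parser = argparse.ArgumentParser(description="Generate a tweet from a grammar")
    parser.add_argument("-c", "--config", help="Configuration file")
    parser.add_argument("-n", "--notweet", action="store_true",
                        help="Generate and report only, do not post")
    parser.add_argument("-u", "--noauth", action="store_true",
                        help="Don't authenticate (implies no tweeting)")
    parser.add_argument("-x", "--override", action="store_true",
                        help="Override randomness, always generate a tweet")
    # Assumption: a single tweet is produced when -t is not given.
    parser.add_argument("-t", "--tweets", type=int, default=1,
                        help="Produce n tweets")
    return parser


# Parse an example command line instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["--config", "config.json", "-n", "-t", "2"])
print(args.config, args.notweet, args.tweets)
```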
The configuration file is a JSON file that contains information about the grammar to be used, the production rule to use, and the frequency of updates. As an example, the @chromaticwhale account configuration is:
{
"grammar": "grammar/whale.json",
"production": "origin",
"frequency": 3
}
Note that this assumes that the script is run from the top level directory, e.g.
python python/tweetbot.py --config config.json
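A configuration like the one above can be read with the standard json module. The sketch below parses the example inline rather than from disk so that it is self-contained; `parse_config` and the check for missing keys are assumptions, not necessarily what tweetbot.py does.

```python
import json

# The example configuration from above, inlined so the sketch is self-contained.
CONFIG_TEXT = """
{
"grammar": "grammar/whale.json",
"production": "origin",
"frequency": 3
}
"""

REQUIRED_KEYS = ("grammar", "production", "frequency")


def parse_config(text):
    """Parse the configuration and check that the three expected keys are present."""
    config = json.loads(text)
    missing = [key for key in REQUIRED_KEYS if key not in config]
    if missing:
        raise ValueError("Missing configuration keys: " + ", ".join(missing))
    return config


config = parse_config(CONFIG_TEXT)
print(config["grammar"], config["production"], config["frequency"])
```

Because the grammar path is relative, it only resolves correctly when the script is started from the top level directory.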
On occasion, generated tweets may be larger than 140 characters. The tweetbot will regenerate (up to 10 times) until a tweet of the right length is obtained. If, after 10 attempts, a tweet of the correct size has not been generated, the script reports this and finishes. If the grammar is changed, this may need revisiting.
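The regeneration behaviour can be sketched as a bounded retry loop. Here `generate` is a hypothetical callable standing in for the grammar expansion, and `generate_within_limit` is an illustrative name, not a function from the script.

```python
MAX_LENGTH = 140    # maximum tweet length described above
MAX_ATTEMPTS = 10   # number of attempts before giving up


def generate_within_limit(generate, max_length=MAX_LENGTH, max_attempts=MAX_ATTEMPTS):
    """Call generate() until the text fits, or return None after max_attempts tries."""
    for _ in range(max_attempts):
        text = generate()
        if len(text) <= max_length:
            return text
    return None


# Example: a generator whose first two results are too long.
candidates = iter(["x" * 200, "y" * 141, "Delays at Leeds due to fog."])
tweet = generate_within_limit(lambda: next(candidates))
if tweet is None:
    print("No tweet of the right length after", MAX_ATTEMPTS, "attempts")
else:
    print(tweet)
```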
Otherwise, the frequency value is used to dictate whether or not the bot will tweet on that cycle. A random number is generated between 0 and frequency-1. If that number is 0, then a tweet is issued. If the value is non-zero, then no tweet is generated. Thus a frequency value of 1 guarantees that a tweet will be issued. A heroku scheduler allows for intervals of 10 minutes, hourly or daily, so the frequency mechanism gives us a certain element of randomness, and some flexibility in the periodicity of tweets. The @chromaticwhale scheduler fires up every hour, so on average the bot will actually tweet every three hours.
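The frequency check amounts to a one-in-frequency draw. A minimal sketch, assuming a helper named `should_tweet` (hypothetical) and treating the --override flag as bypassing the draw:

```python
import random


def should_tweet(frequency, override=False, rng=random):
    """Return True when the bot should tweet on this cycle."""
    if override:
        return True
    # randrange(frequency) yields an integer from 0 to frequency-1 inclusive.
    return rng.randrange(frequency) == 0


# With frequency 3 and an hourly scheduler, roughly one cycle in three tweets.
rng = random.Random(0)
hits = sum(should_tweet(3, rng=rng) for _ in range(3000))
print(hits, "tweets in 3000 cycles")
```

A frequency of 1 always draws 0, which is why it guarantees a tweet.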
Authentication to twitter is controlled via four variables: API_KEY, API_SECRET, ACCESS_TOKEN, ACCESS_SECRET. For obvious reasons, these are not held in the repository, but should be set as environment variables, either locally or via the heroku app.
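Reading the four variables from the environment could look like the sketch below. `load_credentials` is a hypothetical helper, and failing early when a variable is missing is an assumption about desirable behaviour rather than a description of the script.

```python
import os

CREDENTIAL_NAMES = ("API_KEY", "API_SECRET", "ACCESS_TOKEN", "ACCESS_SECRET")


def load_credentials(environ=os.environ):
    """Collect the four credentials, raising if any of them is not set."""
    missing = [name for name in CREDENTIAL_NAMES if name not in environ]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in CREDENTIAL_NAMES}


# Example with a plain dict standing in for the real environment.
example_env = {name: "placeholder" for name in CREDENTIAL_NAMES}
credentials = load_credentials(example_env)
print(sorted(credentials))
```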
If the bot has decided to tweet, then the credentials are used to establish an api connection. The account name is then reported. If the --notweet argument is not present, a status update is then made.
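The final step can be sketched as follows. The `api` argument is assumed to behave like a python-twitter `Api` object, using its `VerifyCredentials` and `PostUpdate` methods; `report_and_post` is a hypothetical name, and `FakeApi` is a stand-in so the example runs without network access or real credentials.

```python
def report_and_post(api, text, notweet=False):
    """Report the authenticated account, then post unless notweet is set."""
    user = api.VerifyCredentials()
    print("Authenticated as", user.screen_name)
    if notweet:
        return False
    api.PostUpdate(text)
    return True


class FakeUser:
    """Stand-in for the user object returned by the API (illustration only)."""
    screen_name = "example_account"


class FakeApi:
    """Stand-in for a python-twitter Api object that records posts locally."""
    def __init__(self):
        self.posted = []

    def VerifyCredentials(self):
        return FakeUser()

    def PostUpdate(self, status):
        self.posted.append(status)


api = FakeApi()
report_and_post(api, "Delays at Leeds due to fog.", notweet=True)   # dry run
report_and_post(api, "Delays at Leeds due to fog.")                 # posts
print(api.posted)
```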