Incident worker

Part of a collection of tools that provide an API to NSW bushfire data: the Data collector, the Importer (this repo), and the GeoJSON API.

incidentworker imports data from the NSW Rural Fire Service's major incidents GeoJSON feed into a database. The feed contains a collection of current incidents and behaves much like the GeoRSS feed we previously imported. An incident is a fire (or something similar); current incidents are those that have not yet been resolved.

What's going on

In developing incidentworker I've used the noun Incident and introduced Report. An incident has many reports. From this point of view, the RFS feeds actually contain a collection of reports, each of which relates to an incident. To be more precise, the feed contains the most recent report for every incident that hasn't been resolved yet.
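As a rough sketch of that relationship, here are hypothetical Go types; the names are for illustration only, and the real schema lives in the migrations under db/migrations:

type Incident struct {
	ID      string
	Current bool     // true until the RFS resolves the incident
	Reports []Report // an incident has many reports
}

type Report struct {
	ID         string
	IncidentID string // each feed entry is the latest report for one incident
}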

When incidentworker performs an import, it does roughly the following for each entry (or Report) in the feed (sketched in code after the list):

  1. Have we seen the Incident this Report refers to before?
  2. If no, insert the Incident into the database. It will be marked as current upon insertion.
  3. If yes, ensure the existing Incident is marked as current.
  4. If we haven't seen this Report before, insert it into the database too.
  5. Ensure that the only incidents marked as current in the database are the ones from this update.
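In code, that loop looks roughly like the Go sketch below. The Store interface and its method names are hypothetical, invented for illustration; the actual implementation lives in this repo.

// Store is a hypothetical interface over the database.
type Store interface {
	HasIncident(id string) (bool, error)
	InsertIncident(id string) error // new incidents are marked current on insert
	MarkCurrent(id string) error
	HasReport(id string) (bool, error)
	InsertReport(r Report) error
	ClearCurrentExcept(ids []string) error // leave only these incidents marked current
}

func importFeed(store Store, reports []Report) error {
	var current []string
	for _, r := range reports {
		seen, err := store.HasIncident(r.IncidentID) // 1. seen this Incident before?
		if err != nil {
			return err
		}
		if !seen {
			err = store.InsertIncident(r.IncidentID) // 2. no: insert, marked current
		} else {
			err = store.MarkCurrent(r.IncidentID) // 3. yes: ensure it's marked current
		}
		if err != nil {
			return err
		}
		have, err := store.HasReport(r.ID)
		if err != nil {
			return err
		}
		if !have {
			if err := store.InsertReport(r); err != nil { // 4. insert unseen Report
				return err
			}
		}
		current = append(current, r.IncidentID)
	}
	return store.ClearCurrentExcept(current) // 5. only this update's incidents stay current
}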

Usage

Use the command line interface to import data from a local or remote GeoJSON file.

incidentworker imports the data into a PostgreSQL database and makes use of the postgis and uuid-ossp extensions. Database migrations for this project are managed with Goose.

Configure the database connection for Goose by copying the file dbconf.yml.example to dbconf.yml. By default the connection is read from a DATABASE_URL environment variable, e.g. postgres://user:pass@localhost/database_name?sslmode=disable; alternatively, edit dbconf.yml with your database connection details. Ensure the database has been created, then run goose up to apply the migrations in db/migrations.
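For example (a sketch only: the database name and credentials are placeholders, the location of dbconf.yml may differ, and the migrations may create the extensions for you):

$ createdb database_name
$ psql -d database_name -c 'CREATE EXTENSION IF NOT EXISTS postgis'
$ psql -d database_name -c 'CREATE EXTENSION IF NOT EXISTS "uuid-ossp"'
$ cp dbconf.yml.example dbconf.yml
$ export DATABASE_URL='postgres://user:pass@localhost/database_name?sslmode=disable'
$ goose up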

Import a local file

$ incidentworker /path/to/geojson.json

Import a remote file

$ incidentworker http://www.rfs.nsw.gov.au/feeds/majorIncidents.json

Import at an interval

To perform an import repeatedly at an interval, include the --tick option with the number of seconds between each import. This is what I'm using on Heroku to perform regular imports (refer to the Procfile).

This command will import the data every 5 minutes:

$ incidentworker --tick 300 http://www.rfs.nsw.gov.au/feeds/majorIncidents.json

Import a collection of files

I use the following to import the data I've collected. Importing 5 months of hourly GeoRSS feeds currently takes about 5 minutes. If you wish to do this, you'll need an earlier version of this tool, as it has since switched to importing GeoJSON. The better option is to contact me for a dump of the production database.

for file in /path/to/major-incidents-data/*.xml; do ./incidentworker "$file"; done