acteng/osm_historic

osm_historic

Determine if old OSM data on cycling infrastructure could help with uplift modelling

Measure total length of cycling infrastructure in the past

Output: a CSV with one row per Local Authority District (LAD), giving the total length of cycling infrastructure in 2011 and in 2020
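
As a rough illustration of that output shape, here is a minimal sketch that merges two per-year totals into CSV text. The function name `toCsv`, the column names, and the example numbers are all made up for illustration; the README does not specify the header or the length unit.

```javascript
// Hypothetical helper: merge per-LAD totals for two years into CSV text.
// Column names are assumptions, not taken from the repository.
function toCsv(lengths2011, lengths2020) {
  const lads = [
    ...new Set([...Object.keys(lengths2011), ...Object.keys(lengths2020)]),
  ].sort();
  // A LAD missing from one year is written as 0 (nothing mapped there).
  const rows = lads.map((lad) =>
    [lad, lengths2011[lad] ?? 0, lengths2020[lad] ?? 0].join(",")
  );
  return ["lad11cd,length_2011,length_2020", ...rows].join("\n");
}

// Placeholder values, not real measurements.
const example2011 = { E06000001: 1000 };
const example2020 = { E06000001: 2500, E06000002: 300 };
const exampleCsv = toCsv(example2011, example2020);
console.log(exampleCsv);
```

Writing 0 for a LAD that is absent in one year is a design choice; leaving the cell empty would instead distinguish "no data" from "no infrastructure".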

Idea 1: Manually

Setup:

  • Roughly 100 GB of free disk space
  • A connection that can comfortably download 40 GB
  • npm, mapshaper, osmium, and ogr2ogr (from GDAL)

Get an osm.pbf of England as of January 1 2011

Geofabrik hosts old osm.pbf files if you click on "raw directory index", but the archives for England (https://download.geofabrik.de/europe/united-kingdom/england.html) and for Europe only go back to 2014.

From https://wiki.openstreetmap.org/wiki/Planet.osm/full, I found the oldest pbf dump at https://planet.openstreetmap.org/pbf/full-history/history-221219.osm.pbf, which is 112 GB. But then thankfully I found https://planet.osm.org/planet/full-history/ and went for a 2013 history dump, https://planet.osm.org/planet/full-history/2013/history_2013-02-05_1701.osm.bz2, which is only 40 GB (15 minutes to download over gigabit fiber)!

Following https://osmcode.org/osmium-tool/manual.html#working-with-history-files, we can turn the history dump into a regular snapshot as of a given date:

osmium time-filter history_2013-02-05_1701.osm.bz2 2011-01-01T00:00:00Z -o 2011.osm.pbf

That took about 2 hours and shrank the file to 10 GB. Then we can clip to a bounding box of England (cheers bboxfinder.com), using the fastest extraction strategy and removing unneeded metadata:

osmium extract -b -6.020508,49.696062,2.329102,55.949200 2011.osm.pbf -o england_2011.pbf -f pbf,add_metadata=false -s simple

That took just 30 seconds, and the output is only 175 MB. The equivalent extract today is about 1.2 GB, so that gives a quick sense of how sparse OSM data was in 2011!

(TODO: Is it faster to clip first, then time-filter?)

Get an osm.pbf of England as of January 1 2020

This is recent enough that Geofabrik works: http://download.geofabrik.de/europe/united-kingdom/england-200101.osm.pbf. The 200101 in the filename is the date in YYMMDD form (2020-01-01), and osmium fileinfo -e england-200101.osm.pbf confirms the timestamps of the changes it contains.

And likewise, 2016 is http://download.geofabrik.de/europe/united-kingdom/england-160101.osm.pbf

Extract cycling infrastructure from it

There's no single tag for cycling infrastructure in OSM. Ohsome references a thorough query from this paper. osmium can't express a filter that complicated, so we do it ourselves in JS.
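
To give a feel for what such a JS filter might look like, here is a minimal sketch. It is not the repository's filter script and not the full query from the paper: the predicate `isCycleInfra` is a deliberately simplified, hypothetical rule built from a few common OSM tags, and it assumes each input line is one GeoJSON feature with the OSM tags as its properties.

```javascript
// Simplified, hypothetical predicate. The real query is more thorough.
function isCycleInfra(tags) {
  if (!tags) return false;
  if (tags.highway === "cycleway") return true;
  if (tags.bicycle === "designated") return true;
  const cyclewayKeys = ["cycleway", "cycleway:left", "cycleway:right", "cycleway:both"];
  const cyclewayValues = new Set(["lane", "track", "opposite_lane", "opposite_track"]);
  return cyclewayKeys.some((key) => cyclewayValues.has(tags[key]));
}

// Keep only matching features from newline-delimited GeoJSON text.
function filterGeojsonSeq(text) {
  return text
    .split("\n")
    .filter((line) => line.trim() !== "")
    .map((line) => JSON.parse(line))
    .filter((feature) => isCycleInfra(feature.properties));
}

// Tiny made-up sample: a cycleway, a plain residential street,
// and a primary road with a cycle lane on its left side.
const line = (properties) =>
  JSON.stringify({
    type: "Feature",
    properties,
    geometry: { type: "LineString", coordinates: [[0, 51], [0.01, 51]] },
  });
const sample = [
  line({ highway: "cycleway" }),
  line({ highway: "residential" }),
  line({ highway: "primary", "cycleway:left": "lane" }),
].join("\n");

const kept = filterGeojsonSeq(sample);
console.log(kept.length); // 2
```

For a file the size of all of England's highways, reading line by line from a stream would be kinder to memory than splitting one big string.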

I put each extract in its own directory (2011 and 2020) as england.osm.pbf, and repeated the next steps for both.

cd 2011
# Keep only ways that have a highway tag
osmium tags-filter england.osm.pbf w/highway -o highways.osm.pbf
# Export those ways as LineStrings, one GeoJSON feature per line
osmium export highways.osm.pbf --config ../osmium_with_ids.cfg --geometry-type=linestring -f geojsonseq -x print_record_separator=false -o highways.geojson
# Note that the filtered output is captured from stderr (2>), not stdout
npm run filter `pwd`/highways.geojson 2> cycleways.geojson
# Manually remove the trailing comma after the last feature, making it valid JSON
# Then, for convenient use in QGIS, convert to GeoPackage
ogr2ogr -f GPKG cycleways.gpkg cycleways.geojson
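
The manual trailing-comma fix could be scripted. The sketch below is one possible approach, not part of the repository: the helper name `stripTrailingCommas` is made up, and it assumes no string value in the file contains a comma directly followed by a closing bracket or brace.

```javascript
// Remove any comma that sits directly before a closing ] or }
// (ignoring whitespace in between).
// Assumption: no string value contains ",]" or ",}".
function stripTrailingCommas(text) {
  return text.replace(/,(\s*[\]}])/g, "$1");
}

// Made-up example: a FeatureCollection with a stray comma after the last feature.
const broken =
  '{"type":"FeatureCollection","features":[' +
  '{"type":"Feature","properties":{},"geometry":null},\n]}';
const fixed = stripTrailingCommas(broken);
console.log(JSON.parse(fixed).features.length); // 1
```

This loads the whole file as one string, so for a very large cycleways.geojson a streaming approach or the manual edit may still be more practical.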

Split by LAD boundaries

Download the 2011 LAD boundaries as GeoJSON from https://geoportal.statistics.gov.uk/datasets/cf3af807271246f4a8865e30f308fc21_0/explore, and convert them to WGS84:

ogr2ogr lads_2011.geojson -t_srs EPSG:4326 ~/Downloads/Local_Authority_Districts_December_2011_FEB_EW_2022_-7538937248730119669.geojson -sql 'SELECT * FROM "Local_Authority_Districts_December_2011_FEB_EW_2022_-7538937248730119669"'

We want to take the England-wide cycleway GeoJSON file and split it into one file per LAD. First we add the lad11cd property to each LineString using mapshaper:

mapshaper-xl cycleways.geojson -divide ../lads_2011.geojson -o cycleways_grouped.geojson

Then we split into a bunch of files:

mkdir split; cd split; mapshaper-xl -i ../cycleways_grouped.geojson -split lad11cd -o format=geojson
# Leftover out-of-bounds features that fall outside every LAD (e.g. in Scotland)
rm -f null.json
# Also remove everything in Wales (LAD codes starting with W), since it'll only be in the imperfect 2011 clip
rm -fv W*.json

Sum length

Now for each of those split files, we want to sum the length of all the LineStrings inside.

# Back in the main directory
npm run sum 2> cycleway_lengths_by_lad.csv
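
For reference, one way to sum LineString lengths from WGS84 coordinates is the haversine great-circle distance. The sketch below is an assumption about how such a step could work, not the repository's actual sum script; the function names and the metres unit are illustrative choices.

```javascript
const EARTH_RADIUS_M = 6371008.8; // mean Earth radius in metres

// Great-circle distance between two [lon, lat] positions in degrees.
function haversineMetres([lon1, lat1], [lon2, lat2]) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(lat2 - lat1);
  const dLon = toRad(lon2 - lon1);
  const a =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(lat1)) * Math.cos(toRad(lat2)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_M * Math.asin(Math.sqrt(a));
}

// Sum the segment lengths along one list of positions.
function lineStringMetres(coords) {
  let total = 0;
  for (let i = 1; i < coords.length; i++) {
    total += haversineMetres(coords[i - 1], coords[i]);
  }
  return total;
}

// Total length of all LineString and MultiLineString features.
function totalLengthMetres(featureCollection) {
  let total = 0;
  for (const feature of featureCollection.features) {
    const geometry = feature.geometry;
    if (!geometry) continue;
    if (geometry.type === "LineString") {
      total += lineStringMetres(geometry.coordinates);
    } else if (geometry.type === "MultiLineString") {
      for (const part of geometry.coordinates) total += lineStringMetres(part);
    }
  }
  return total;
}

// Made-up example: one line spanning one degree of latitude.
const oneDegree = {
  type: "FeatureCollection",
  features: [
    {
      type: "Feature",
      properties: {},
      geometry: { type: "LineString", coordinates: [[0, 51], [0, 52]] },
    },
  ],
};
const exampleLength = totalLengthMetres(oneDegree);
console.log(Math.round(exampleLength)); // 111195
```

Haversine treats the Earth as a sphere, which is usually accurate enough for comparing totals between years; a geodesic library would be more exact.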

Idea 2: ohsome

Something like https://hex.ohsome.org/#/cycleways_w/2011-06-01T00:00:00Z/8/52.07429015262514/-0.6955267371189224 might just work

Validate whether old OSM data had enough cycling data mapped in the first place

  • Using Ohsome Quality Analyst?
  • Or checking for a sample of schemes known to have been built between 2011 and 2020
