Determine if old OSM data on cycling infrastructure could help with uplift modelling



Measure total length of cycling infrastructure in the past

Output: A CSV with LAD, a total length in 2011, and a total length in 2020

Idea 1: Manually

You'll need:

  • At least 100GB disk or so
  • Connection that can reasonably download 40GB
  • npm, mapshaper, osmium, and GDAL (for ogr2ogr)

Get an osm.pbf of England as of January 1 2011

Geofabrik has old osm.pbfs if you click on "raw directory index", but the ones for Europe only go back to 2014.

I found the oldest pbf dump, which is 112GB. Thankfully, I then found a 2013 history dump that is only 40GB (15 minutes to download with gigabit fiber) and went with that instead!

We can then turn the history dump into a regular dump as of a given date:

osmium time-filter history_2013-02-05_1701.osm.bz2 2011-01-01T00:00:00Z -o 2011.osm.pbf

That took about 2 hours, and the output is down to 10 GB. Then we can clip to a bounding box of England, using the fastest extraction strategy and removing unneeded metadata:

osmium extract -b -6.020508,49.696062,2.329102,55.949200 2011.osm.pbf -o england_2011.pbf -f pbf,add_metadata=false -s simple

That took just 30 seconds, with output just 175 MB. The equivalent extract today is about 1.2 GB, so that's a quick sense of how sparse OSM data was in 2011!

(TODO: Is it faster to clip first, then time-filter?)

Get an osm.pbf of England as of January 1 2020

This is recent enough that Geofabrik has it. The 200101 in the filename appears to be the date in YYMMDD form, and osmium fileinfo -e england-200101.osm.pbf confirms the timestamp of the changes in the file.

And likewise, 2016 is

Extract cycling infrastructure from it

There's no simple tag for cycling infra in OSM. Ohsome references a thorough query from this paper. osmium can't express a filter that complicated, so we do it ourselves in JS.

I put england.osm.pbf in a 2011 and 2020 directory and repeated the next steps for both.

cd 2011
osmium tags-filter england.osm.pbf w/highway -o highways.osm.pbf
osmium export highways.osm.pbf --config ../osmium_with_ids.cfg --geometry-type=linestring -f geojsonseq -x print_record_separator=false -o highways.geojson
npm run filter `pwd`/highways.geojson 2> cycleways.geojson
# Manually remove the trailing comma after the last feature in cycleways.geojson, making it valid JSON
# Then, for convenient use in QGIS, convert to geopackage
ogr2ogr -f GPKG cycleways.gpkg cycleways.geojson

Split by LAD boundaries

Download the 2011 LAD boundaries as GeoJSON and convert them to WGS84:

ogr2ogr lads_2011.geojson -t_srs EPSG:4326 ~/Downloads/Local_Authority_Districts_December_2011_FEB_EW_2022_-7538937248730119669.geojson -sql 'SELECT * FROM "Local_Authority_Districts_December_2011_FEB_EW_2022_-7538937248730119669"'

We want to take the England-wide cycleway GeoJSON file and split it into one file per LAD. First we add the lad11cd property to each LineString using mapshaper:

mapshaper-xl cycleways.geojson -divide ../lads_2011.geojson -o cycleways_grouped.geojson

Then we split into a bunch of files:

mkdir split; cd split; mapshaper-xl -i ../cycleways_grouped.geojson -split lad11cd -o format=geojson
# Leftover out-of-bounds stuff (e.g. in Scotland) that matched no LAD
rm -f null.json
# Also remove everything in Wales (LAD codes starting with W), since it'll only be in the imperfect 2011 clip
rm -fv W*.json

Sum length

Now for each of those split files, we want to sum the length of all the LineStrings inside.

# Back in the main directory
npm run sum 2> cycleway_lengths_by_lad.csv

Idea 2: ohsome

A query against the ohsome API might just work

Validate if old OSM data had enough cycling data mapped in the first place

  • Using Ohsome Quality Analyst?
  • Or checking for a sample of schemes known to have been built between 2011 and 2020

