Consistent feature thinning between tiles #55

Closed
e-n-f opened this issue Jun 5, 2015 · 13 comments

Comments

@e-n-f
Contributor

e-n-f commented Jun 5, 2015

To address the highly visible tile boundaries between automatically thinned out tiles, I think the answer is probably that if a feature is dropped from one tile, it must be dropped from every tile, so that the density transitions happen at the edges of features, not of tiles. The plan:

  • Add a feature sequence number to the serialization format (also needed for #37, “Add a flag to inhibit sorting”) so that features can be related between tiles
  • For each zoom level, preflight the tile sizes. For each tile that is too big, instead of just skipping a fraction of features, explicitly choose which ones will be dropped, and add them to a list.
  • Then do a second pass over that zoom level, actually writing the tiles, and dropping all the features that need to be dropped, for whatever tile they appear in
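
The two-pass plan above can be sketched roughly as follows. This is only an illustration under stated assumptions, not tippecanoe's code: the `tiles` mapping, the `max_features` budget (standing in for the real tile size limit), and the evenly spaced choice of which features to drop are all hypothetical.

```python
def plan_drops(tiles, max_features):
    """First pass: choose explicit feature sequence numbers to drop.

    `tiles` maps a tile key such as (z, x, y) to the list of feature
    sequence numbers that appear in that tile (hypothetical layout).
    """
    dropped = set()
    for key in sorted(tiles):
        # Assumption: features already dropped by an earlier tile
        # do not count against this tile's budget.
        remaining = [f for f in tiles[key] if f not in dropped]
        excess = len(remaining) - max_features
        if excess > 0:
            # Assumed policy: drop evenly spaced features from the list.
            step = len(remaining) / excess
            for i in range(excess):
                dropped.add(remaining[int(i * step)])
    return dropped


def write_tiles(tiles, dropped):
    """Second pass: omit every dropped feature from every tile it touches."""
    return {
        key: [f for f in feats if f not in dropped]
        for key, feats in tiles.items()
    }


tiles = {(1, 0, 0): [1, 2, 3, 4], (1, 1, 0): [3, 4, 5]}
dropped = plan_drops(tiles, max_features=2)
thinned = write_tiles(tiles, dropped)
```

Because the drop list is shared, a feature removed for the sake of one crowded tile also disappears from its neighbors, which is what moves the density transition from the tile edge to the feature edge.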

The downside: this will only help for features that actually correspond to a reasonably long trip. If each feature is just a single segment, it's hardly better than dropping at the tile boundaries, and will just have a fringed transition instead of a sharp edge.

The preflighting also unfortunately makes parallelization more complicated, since the list of features to be dropped has to be consistent between threads.

cc @lxbarth @tmcw

@e-n-f
Contributor Author

e-n-f commented Jun 5, 2015

Probably more manageable if I still do fractional dropping and only track the features that cross tile boundaries explicitly?

@e-n-f
Contributor Author

e-n-f commented Jun 10, 2015

Here is what it looks like at z10 when applied to tl_2010_06037_roads from TIGER (with all metadata preserved). The differences between tiles are still very visible. I hope this is just because roads in TIGER are so much shorter and more distinct than tracks from GPS.

screen shot 2015-06-10 at 10 53 13 am

@e-n-f
Contributor Author

e-n-f commented Jun 10, 2015

Most worrisome part is the sparse regions at the edges, where roads are apparently getting thinned from both sides. Maybe when thinning a tile it needs to consider all tracks in that tile, not just ones that haven't already been eliminated by a neighbor.

@e-n-f
Contributor Author

e-n-f commented Jun 10, 2015

Hmm, no, not just a TIGER problem: there are very visible bands at the edges of many GPS tiles. Debugging now.

@e-n-f
Contributor Author

e-n-f commented Jun 11, 2015

Fixed the major bug and now it is fairly smooth. The gradients from a dense tile into a sparse one are still more visible than I want at moderate zooms (8 to 11 or so) but the abrupt seams are gone at high zooms.

@e-n-f
Contributor Author

e-n-f commented Jun 11, 2015

Example of the low-zooms problem: the Skobbler corpus at z5. Germany is so dense that England falls off to nothing at the Prime Meridian.

screen shot 2015-06-11 at 2 20 29 pm

I think the answer is probably to choose features to keep within each tile so that the features are distributed as evenly as possible by spatial index rather than by index-sorted sequence number.

@e-n-f
Contributor Author

e-n-f commented Jun 13, 2015

At least within a 200K track (~10%) sample of the Skobbler tracks, that seems to be working out pretty well.

After discussion with @virginiayung I'm using the spatial-index distance between feature bounding box centers to measure frequency within each tile, sorting the frequencies, and cutting off a percentile of frequencies at the top end proportional to the fraction of features that need to be dropped.
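
One way to read that description, as a minimal sketch: order the features in a tile by a spatial index of their bounding-box centers, treat a small index gap to the previous feature as "high frequency", and drop the smallest-gap features until the required fraction is gone. The Z-order (Morton) index, the `(feature_id, cx, cy)` tuple layout, and the function names are assumptions made for illustration; the thread does not say which index or data layout is actually used.

```python
def morton_index(x, y, bits=16):
    """Interleave the bits of integer coordinates x and y (Z-order curve)."""
    index = 0
    for i in range(bits):
        index |= ((x >> i) & 1) << (2 * i + 1)
        index |= ((y >> i) & 1) << (2 * i)
    return index


def thin_by_gap(features, drop_fraction):
    """Drop the densest `drop_fraction` of the features in one tile.

    `features` is a list of (feature_id, cx, cy) tuples, where (cx, cy) is
    the bounding-box center in integer coordinates (hypothetical layout).
    """
    ordered = sorted(features, key=lambda f: morton_index(f[1], f[2]))
    gaps = []
    prev = None
    for fid, cx, cy in ordered:
        idx = morton_index(cx, cy)
        # Gap to the previous feature along the curve; small gap = dense area.
        gap = float("inf") if prev is None else idx - prev
        gaps.append((gap, fid))
        prev = idx
    n_drop = int(len(gaps) * drop_fraction)
    gaps.sort(key=lambda g: g[0])  # smallest gaps (highest frequency) first
    dropped = {fid for _, fid in gaps[:n_drop]}
    return [f for f in features if f[0] not in dropped]


features = [(0, 0, 0), (1, 1, 0), (2, 1, 1), (3, 100, 100), (4, 200, 50)]
kept = thin_by_gap(features, drop_fraction=0.4)
```

In this toy input the three features clustered near the origin lose two of their members, while the isolated features survive.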

There's still enough going on in low-frequency rural Germany that the high-frequency tracks in the east half of London completely vanish, but that's probably better than having a dense spot with a big empty area around it. Maybe giving short tracks a boost over long ones would keep a little more of it around.

screen shot 2015-06-12 at 4 47 04 pm

Trying again now with a larger sample. Some of these drop rates are pretty crazy: with the full 2013 Skobbler tracks, tile 4/8/5, which covers most of Europe, only retains 0.26% of its original features.

@e-n-f
Contributor Author

e-n-f commented Jun 15, 2015

Hmm, this is still not really solved. In the complete 2013 Skobbler tracks at z4, you can still see a dramatic drop in density in eastern England and southern Denmark because the data in 4/8/5 is thinned so much more than its neighbors.

screen shot 2015-06-15 at 3 11 02 pm

Maybe the rule needs to be that no tile can retain more than, say, a 50% greater fraction than its immediate neighbors retain.
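
A rough sketch of what such a neighbor rule could look like, assuming the per-tile retained fractions for one zoom level are already known. The `retain` dictionary keyed by `(x, y)`, the eight-neighbor definition of "immediate neighbors", and the function name are illustrative assumptions rather than anything that exists in tippecanoe.

```python
def clamp_retention(retain, max_ratio=1.5):
    """Lower retained fractions until no tile keeps more than `max_ratio`
    times the fraction kept by any of its immediate neighbors.

    `retain` maps (x, y) tile coordinates at one zoom level to the fraction
    of features that tile would keep on its own (hypothetical input).
    """
    result = dict(retain)
    changed = True
    while changed:
        changed = False
        for (x, y) in list(result):
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    if (dx, dy) == (0, 0):
                        continue
                    neighbor = result.get((x + dx, y + dy))
                    if neighbor is None:
                        continue
                    limit = neighbor * max_ratio
                    if result[(x, y)] > limit:
                        # Fractions only ever decrease, so the loop terminates.
                        result[(x, y)] = limit
                        changed = True
    return result


clamped = clamp_retention({(0, 0): 0.1, (1, 0): 1.0, (2, 0): 1.0})
```

Note the side effect: one very heavily thinned tile pulls down its neighbors, then their neighbors, with the allowed fraction growing by the ratio at each step away from it.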

@e-n-f
Contributor Author

e-n-f commented Jun 15, 2015

Dropping a uniform fraction across the whole world would make for many very sparse tiles in sparse areas, but maybe it would work to have the frequency cutoff be uniform?

@e-n-f
Contributor Author

e-n-f commented Jun 16, 2015

A single worldwide frequency cutoff works well for low zoom levels, even though it means that the map is highly biased toward routes that carry long trips.
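
To make the idea of a single worldwide cutoff concrete, here is a hedged sketch. It assumes each feature already carries a spatial-index gap (a larger gap meaning lower frequency) and that the cutoff is chosen so that the most crowded tile just fits a feature budget; the data layout, the `max_features` budget (assumed to be at least 1), and both function names are hypothetical.

```python
def global_cutoff(tiles, max_features):
    """Pick one gap threshold that brings every oversized tile within budget.

    `tiles` maps a tile key to a list of (feature_id, gap) pairs.
    Ties at the threshold can leave a tile slightly over budget.
    """
    cutoff = 0
    for feats in tiles.values():
        if len(feats) > max_features:
            gaps = sorted((gap for _, gap in feats), reverse=True)
            # The max_features-th largest gap in this tile.
            cutoff = max(cutoff, gaps[max_features - 1])
    return cutoff


def apply_cutoff(tiles, cutoff):
    """Keep only features whose gap is at least `cutoff`, in every tile."""
    return {
        key: [(fid, gap) for fid, gap in feats if gap >= cutoff]
        for key, feats in tiles.items()
    }


tiles = {
    "dense": [(1, 1), (2, 2), (3, 3), (4, 50)],
    "sparse": [(5, 40), (6, 2)],
}
cutoff = global_cutoff(tiles, max_features=2)
thinned = apply_cutoff(tiles, cutoff)
```

In this toy input the sparse tile was already within budget but still loses a feature, because the threshold is set by the dense tile.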

screen shot 2015-06-15 at 11 53 05 pm

It loses a lot of detail in most tiles at high zooms, though. There must be some good way to get local near-uniformity without requiring global uniformity.

@lxbarth
Contributor

lxbarth commented Jun 16, 2015

> It loses a lot of detail in most tiles at high zooms, though. There must be some good way to get local near-uniformity without requiring global uniformity.

Changing the aggressiveness of the algorithm by zoom level?

@e-n-f
Contributor Author

e-n-f commented Jun 17, 2015

Maybe that is the thing to do. Cutting out a midrange band of frequencies, as I tried today, instead of only cutting high frequencies doesn't seem to help. Even if being nonuniform at the high zooms only works because the tracks tend to be longer than a tile is wide, it still works, so I'll try easing into it.
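
The thread does not say how the easing would be done. One possible shape, purely as an assumption, is to blend linearly from the worldwide cutoff at low zooms to each tile's own cutoff at high zooms. The zoom bounds 4 and 12 and the function name below are made up for illustration.

```python
def blended_cutoff(zoom, global_cutoff, tile_cutoff, low_zoom=4, high_zoom=12):
    """Interpolate between a worldwide cutoff and a per-tile cutoff by zoom.

    At or below `low_zoom` the global cutoff is used unchanged; at or above
    `high_zoom` the tile's own cutoff is used; in between, a linear blend.
    The zoom bounds are hypothetical tuning parameters.
    """
    if zoom <= low_zoom:
        return global_cutoff
    if zoom >= high_zoom:
        return tile_cutoff
    t = (zoom - low_zoom) / (high_zoom - low_zoom)
    return (1 - t) * global_cutoff + t * tile_cutoff


for z in (4, 8, 12):
    print(z, blended_cutoff(z, global_cutoff=100, tile_cutoff=10))
```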

@e-n-f
Contributor Author

e-n-f commented Dec 14, 2016

This works now with --drop-densest-as-needed and --drop-fraction-as-needed.

@e-n-f e-n-f closed this as completed Dec 14, 2016
mmc1718 pushed a commit to mmc1718/tippecanoe that referenced this issue Jun 21, 2023
* add pmtiles.hpp from github.com/protomaps/PMTiles [mapbox#10]

* tippecanoe main writes pmtiles output. [mapbox#10]

* detect output format using suffix
* after mbtiles is done writing, replace with pmtiles based on map/image tables.
* add method to write_json for writing json sub-object.

* tippecanoe-decode reads pmtiles input. [mapbox#10]

* tile-join reads and writes pmtiles. [mapbox#10]

* pmtiles test suite for decode and tile-join [mapbox#10]

* add base GitHub CI action for compiling and test suite.

* update pmtiles.hpp with z>15 fix

* Fix some ordering problems with pmtiles decode

* Pmtiles should also pass the raw tiles tests

* Eradicate spaces from tileset metadata JSON fields

* Eradicate spaces from more test fixtures

* Update more tests

* Pmtiles tests pass now too

* Remove unnecessary sort (and make indent)

* Update changelog

* The allow-existing test for pmtiles needs -o, not -e

* Declare --allow-existing to be unsupported for pmtiles.

It was always a bad idea even for mbtiles.

Co-authored-by: Brandon Liu