Speed up creation of vector tiles features #75874
Merged
While investigating a StackOverflowError (#75543), I came to the conclusion that the problem is that we run the polygon simplification algorithm (TopologyPreservingSimplifier.simplify) on the original polygon after it has been projected into pixel space. This can produce a geometry with lots of points but few distinct coordinate values, and the algorithm was failing on it.
Therefore I thought that running the algorithm on the original shape, while it is still defined in the Spherical Mercator projection, might help, and indeed it did. There are no more StackOverflowErrors, and the generation of features is quite a bit faster. In my tests, a call to the API that was taking 110 seconds now takes 35 seconds!
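To give a sense of the scales involved when simplifying in Spherical Mercator rather than in pixel space, here is a minimal sketch that computes the width of one tile and of one pixel in Mercator meters at a given zoom level. The class and method names are hypothetical, and the 4096-pixel extent and the idea of deriving a simplification tolerance from the pixel size are assumptions for illustration, not details taken from this change.

```java
public final class TilePixelSize {
    // Half the width of the Spherical Mercator (EPSG:3857) world, in meters.
    static final double MERCATOR_BOUND = Math.PI * 6378137.0;

    /** Width of one tile in Spherical Mercator meters at the given zoom level. */
    public static double tileSize(int zoom) {
        return 2 * MERCATOR_BOUND / (1L << zoom);
    }

    /** Width of one pixel in meters for a tile with `extent` pixels per side (assumed value). */
    public static double pixelSize(int zoom, int extent) {
        return tileSize(zoom) / extent;
    }

    public static void main(String[] args) {
        // A tolerance of roughly one pixel could then be passed to the simplifier (assumption).
        System.out.println("pixel size at zoom 10: " + pixelSize(10, 4096) + " m");
    }
}
```

Details smaller than this distance cannot be distinguished in the rendered tile, which is why a tolerance on the order of one pixel is a plausible choice.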
One caveat is that the simplification might produce invalid polygons, which was causing me to hit some ClassNotFoundExceptions (see #75869), so I added a check to reject invalid polygons. Note that this only seems to happen when the polygon is comparable in size to the pixel scale but big enough not to be rejected, so I think it is fine to just skip it.
There is a second improvement. In my testing I have a few big and very complex polygons, and whenever I zoomed in on them I could see the heap usage going up until I eventually got an OOM. I thought it would be easy to check whether the polygon contains the tile, in which case we can just return the tile geometry. I did that and indeed the issue is gone.
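The containment shortcut can be sketched as follows. This is only an illustration of the idea: it uses `java.awt.geom` from the JDK as a stand-in for the geometry library used in the actual code, handles a single ring without holes, and all class and method names are hypothetical.

```java
import java.awt.geom.Path2D;
import java.awt.geom.Rectangle2D;

public final class TileShortCircuit {
    /** Builds a closed path from a ring of {x, y} vertices (no holes). */
    public static Path2D ring(double[][] vertices) {
        Path2D.Double path = new Path2D.Double();
        path.moveTo(vertices[0][0], vertices[0][1]);
        for (int i = 1; i < vertices.length; i++) {
            path.lineTo(vertices[i][0], vertices[i][1]);
        }
        path.closePath();
        return path;
    }

    /** True when the tile lies entirely inside the polygon, so clipping can be skipped. */
    public static boolean coversTile(Path2D polygon, Rectangle2D tile) {
        return polygon.contains(tile);
    }

    public static void main(String[] args) {
        Path2D polygon = ring(new double[][] {{0, 0}, {100, 0}, {100, 100}, {0, 100}});
        Rectangle2D inner = new Rectangle2D.Double(10, 10, 20, 20);
        Rectangle2D straddling = new Rectangle2D.Double(90, 90, 20, 20);
        // When the tile is covered, the tile rectangle itself is the clipped geometry.
        System.out.println("inner tile covered: " + coversTile(polygon, inner));
        System.out.println("straddling tile covered: " + coversTile(polygon, straddling));
    }
}
```

When the check succeeds, the result for that tile is just the tile rectangle, so the large polygon never has to be clipped or simplified for it.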