indexing #173
I've tinkered with this with the revisions to turf-count and making turf-extent way faster - basically, for a bunch of algorithms where we repeatedly do something expensive like point-in-polygon, doing point-in-extent first gives a serious boost. But the algorithms that benefit from this, or any other indexing strategy, seem somewhat limited for now.
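For anyone following along, the point-in-extent shortcut described above could look roughly like this. It is a minimal sketch, not the actual turf-count or turf-extent code: the function names are made up for illustration, and it only considers a polygon's outer ring (holes are ignored).

```javascript
// Bounding box [minX, minY, maxX, maxY] of a single ring of [x, y] positions.
function ringExtent(ring) {
  let minX = Infinity, minY = Infinity, maxX = -Infinity, maxY = -Infinity;
  for (const [x, y] of ring) {
    if (x < minX) minX = x;
    if (y < minY) minY = y;
    if (x > maxX) maxX = x;
    if (y > maxY) maxY = y;
  }
  return [minX, minY, maxX, maxY];
}

// Cheap test: four comparisons.
function pointInExtent([x, y], [minX, minY, maxX, maxY]) {
  return x >= minX && x <= maxX && y >= minY && y <= maxY;
}

// Expensive test: ray casting against one ring (holes ignored in this sketch).
function pointInRing([x, y], ring) {
  let inside = false;
  for (let i = 0, j = ring.length - 1; i < ring.length; j = i++) {
    const [xi, yi] = ring[i];
    const [xj, yj] = ring[j];
    const crosses = (yi > y) !== (yj > y) &&
      x < ((xj - xi) * (y - yi)) / (yj - yi) + xi;
    if (crosses) inside = !inside;
  }
  return inside;
}

// Count points inside a GeoJSON Polygon geometry. The extent is computed once,
// and the ring test is skipped for every point that falls outside it.
function countPointsInPolygon(points, polygon) {
  const ring = polygon.coordinates[0];
  const extent = ringExtent(ring);
  let count = 0;
  for (const pt of points) {
    if (pointInExtent(pt, extent) && pointInRing(pt, ring)) count++;
  }
  return count;
}
```

The gain depends on how many points fall outside the extent; if nearly all points are inside it, the extra comparisons buy nothing.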
Only for repeated checking of the same polygon - the initial processing of the index would be more expensive than the current pip algorithm. From what I've seen, the biggest perf issue in turf is the jsts'y algorithms, so they're top priority in my mind. I think indexing is useful but there's more low-hanging fruit to get in the short term.
@tmcw automatic indexes on the fly are probably bad (without some finicky heuristics as to when they should happen). What if indexing required a manual call (something like
Agree with this 100% from an advanced user's perspective: people who can chop processing into reasonable chunks using indexes. I am thinking more about someone trying to find which of a million points are within a polygon. Right now, a beginner is going to be using turf like PostGIS minus indexes (unusably slow in common real-world situations). For example, what if turf-inside looked for a tile index? If it's there, then it does a trivial in-out quadkey check. If not, then it goes ahead with the full algorithm. This might be close to best-of-both-worlds.
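To make the "look for a tile index, otherwise fall back" idea concrete, here is a rough sketch. Everything about the index is an assumption: `tileIndex` is a hypothetical structure (`{ zoom, inside, outside }`, with sets of quadkeys for tiles known to be fully inside or fully outside the polygon), and `exactInside` stands in for whatever full algorithm turf-inside runs. Only the lon/lat to quadkey conversion is the standard web-mercator tile math.

```javascript
// Standard web-mercator tile math: [lon, lat] -> quadkey string at a zoom level.
// Edge cases (lon === 180, latitudes beyond the mercator limit) are not handled.
function pointToQuadkey([lon, lat], zoom) {
  const n = Math.pow(2, zoom);
  const latRad = (lat * Math.PI) / 180;
  const x = Math.floor(((lon + 180) / 360) * n);
  const y = Math.floor(
    ((1 - Math.log(Math.tan(latRad) + 1 / Math.cos(latRad)) / Math.PI) / 2) * n
  );
  let key = '';
  for (let z = zoom; z > 0; z--) {
    const mask = 1 << (z - 1);
    let digit = 0;
    if (x & mask) digit += 1;
    if (y & mask) digit += 2;
    key += digit;
  }
  return key;
}

// Hypothetical wrapper, not part of Turf.
// tileIndex (optional): { zoom: number, inside: Set<string>, outside: Set<string> }
// Tiles listed in neither set straddle the boundary and need the exact test.
function insideWithTileIndex(point, polygon, exactInside, tileIndex) {
  if (tileIndex) {
    const key = pointToQuadkey(point, tileIndex.zoom);
    if (tileIndex.inside.has(key)) return true;
    if (tileIndex.outside.has(key)) return false;
  }
  // No index, or a boundary tile: run the full algorithm.
  return exactInside(point, polygon);
}
```

Building the `inside` and `outside` sets is the expensive part and is left out here, which is exactly the trade-off discussed above: it only pays off when the same polygon is queried many times.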
I think the small optimizations I've made for this will give a pretty good bump - I'll try this out later today.
If there are big unknowns or tradeoffs in the choice of index data structure, would it be possible to define algorithms like turf-inside in terms of an index protocol rather than a specific concrete implementation? Turf would then provide implementations of the protocol for rbush, tile-cover, tilebelt, etc., and users would supply their choice when instantiating a particular algorithm.
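A sketch of what such a protocol might look like, purely as an illustration. The method names (`load`, `search`), the item shape (`{ bbox, value }`), and `createWithin` are assumptions made up for this example, not an existing Turf API. The linear scan is the simplest thing that satisfies the protocol; an rbush- or tile-based implementation would expose the same two methods.

```javascript
// Hypothetical index protocol: any object with
//   load(items)  - items are { bbox: [minX, minY, maxX, maxY], value }
//   search(bbox) - returns the items whose bbox intersects the query bbox

function bboxesIntersect(a, b) {
  return a[0] <= b[2] && a[2] >= b[0] && a[1] <= b[3] && a[3] >= b[1];
}

// Simplest implementation of the protocol: no tree, just a linear scan.
function createLinearIndex() {
  let stored = [];
  return {
    load(items) {
      stored = stored.concat(items);
    },
    search(bbox) {
      return stored.filter((item) => bboxesIntersect(item.bbox, bbox));
    }
  };
}

// An algorithm written against the protocol rather than a concrete index.
// `exactTest(point, value)` is the expensive check, supplied by the caller.
function createWithin(index, exactTest) {
  return function within(point) {
    const query = [point[0], point[1], point[0], point[1]];
    return index
      .search(query)                                   // cheap candidate lookup
      .filter((item) => exactTest(point, item.value))  // exact check on candidates only
      .map((item) => item.value);
  };
}
```

Usage would be along the lines of `const within = createWithin(createLinearIndex(), myExactTest)`, with the index swapped for a different implementation without touching the algorithm.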
Hi all! +1 for protocols as @jfirebaugh mentioned (like the rbush implementation in https://github.com/jvrousseau/turf-index).
An old issue, but a very useful topic! So far I've been using RBush for everything index related and it is VERY efficient on extremely large GeoJSON datasets. I had to create my own "Ad Hoc" GeoJSON implementation in It would be useful to have a basic GeoJSON support implementation of RBush called We would need to add some extra default properties while loading the RBush Tree to be able to retrieve the GeoJSON again (

    tree.insert({
      minX: minX,
      minY: minY,
      maxX: maxX,
      maxY: maxY,
      index: index,
      geojson: geojson
    })

It would
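As an illustration of the item shape above, here is a small sketch that turns a FeatureCollection into items carrying both the extent and the original GeoJSON. The helper names are invented for this example, and GeometryCollection is not handled. The output uses RBush's default item format (`minX`, `minY`, `maxX`, `maxY`), so the array could be passed to `tree.load(items)` and the features recovered from search results via `.geojson`; RBush itself is not imported here.

```javascript
// Walk a nested GeoJSON coordinates array and grow `extent` in place.
// `extent` is [minX, minY, maxX, maxY].
function extendExtent(coords, extent) {
  if (typeof coords[0] === 'number') {
    // A single position: [x, y] (any further dimensions are ignored).
    if (coords[0] < extent[0]) extent[0] = coords[0];
    if (coords[1] < extent[1]) extent[1] = coords[1];
    if (coords[0] > extent[2]) extent[2] = coords[0];
    if (coords[1] > extent[3]) extent[3] = coords[1];
    return;
  }
  for (const child of coords) extendExtent(child, extent);
}

// Build one index item per feature, keeping a reference to the feature
// and its position in the collection so it can be retrieved after a search.
function toIndexItems(featureCollection) {
  return featureCollection.features.map((feature, index) => {
    const extent = [Infinity, Infinity, -Infinity, -Infinity];
    extendExtent(feature.geometry.coordinates, extent);
    return {
      minX: extent[0],
      minY: extent[1],
      maxX: extent[2],
      maxY: extent[3],
      index: index,
      geojson: feature
    };
  });
}
```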
Since there are many different ways to implement an index (r-tree, geohashes, etc.), it's best to leave these types of index operations/implementations outside of TurfJS. For example, the r-tree index implementation Without going too much outside the TurfJS scope, we could include the The benefit of having this

GeoJSON BBox attribute example

    {
      "type": "Feature",
      "properties": {},
      "bbox": [-10.0, -10.0, 10.0, 10.0],
      "geometry": {
        "type": "Polygon",
        "coordinates": [[
          [-10.0, -10.0], [10.0, -10.0], [10.0, 10.0], [-10.0, 10.0]
        ]]
      }
      ...
    }

@tmcw @morganherlocker Fair to say we can close this issue?
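One way a precomputed `bbox` member could be used, sketched under assumptions: the function names are hypothetical, only 2D boxes (`[minX, minY, maxX, maxY]`) are considered, and features without a `bbox` are kept as candidates because they cannot be ruled out cheaply.

```javascript
// True when two 2D boxes [minX, minY, maxX, maxY] overlap or touch.
function bboxOverlaps(a, b) {
  return a[0] <= b[2] && a[2] >= b[0] && a[1] <= b[3] && a[3] >= b[1];
}

// Keep only the features that might intersect `queryBBox`, using the optional
// GeoJSON `bbox` member as a cheap rejection test. No geometry is read.
function candidateFeatures(features, queryBBox) {
  return features.filter(
    (feature) => !Array.isArray(feature.bbox) || bboxOverlaps(feature.bbox, queryBBox)
  );
}
```

The expensive geometry operation would then run only on the returned candidates.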
So far I have rejected anything related to indexing. I think of it as a problem for the data layer, rather than the processing layer. That said, I think it should be considered.
pros
cons
candidates for submodules
I am 100% on the fence with this. Input greatly appreciated from anyone who understands how indexing works at a low level (and from people who just need to use indexes). This has huge architecture implications, so we really need both sides.