Introduce nearest neighbor search for points #1528

Merged (62 commits) on Oct 11, 2024

Collaborator

@ullingerc ullingerc commented Sep 30, 2024

QLever now supports a special predicate ?left <nearest-neighbors:k> ?right, where k is a positive integer. The predicate is transformed into a spatial join operation that, for each entry in ?left, returns only the k spatially closest elements in ?right.
For semantically correct usage, it is important that the <nearest-neighbors> join is performed as the last step, e.g. by putting the other constraints for ?left and ?right into subqueries. In the future, we probably want to replace this special predicate with a dedicated construct that better reflects the non-commutativity of the nearest neighbor join.

Currently, the nearest neighbor join (like the <max-distance-in-meters> join) only works for point objects. The join is implemented using Google's S2 library, which delivers the functionality for nearest neighbor search out of the box.
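
To illustrate the join semantics described in this comment, here is a minimal brute-force sketch in Python. It is not how QLever or S2 implement the join: the function names, the haversine distance, and the sample points are assumptions made purely for illustration. It shows that swapping the two sides changes the result, and that applying an extra constraint before the join differs from applying it afterwards.

```python
import math


def haversine_m(p, q):
    """Great-circle distance in meters between two (lat, lng) points in degrees."""
    lat1, lng1, lat2, lng2 = map(math.radians, (*p, *q))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lng2 - lng1) / 2) ** 2)
    return 2 * 6_371_000 * math.asin(math.sqrt(a))


def nearest_neighbors_join(left, right, k):
    """For each left entry, keep only the k closest right entries (brute force)."""
    result = []
    for l_name, l_pt in left.items():
        closest = sorted(right.items(),
                         key=lambda item: haversine_m(l_pt, item[1]))[:k]
        result.extend((l_name, r_name) for r_name, _ in closest)
    return result


# Hypothetical sample data: names mapped to (lat, lng) points.
stations = {"A": (48.000, 7.800), "B": (48.100, 7.800)}
shops = {"bakery": (48.001, 7.800), "kiosk": (48.002, 7.800), "cafe": (48.099, 7.800)}

# The join is not commutative: swapping the sides yields different pairs.
forward = nearest_neighbors_join(stations, shops, k=1)
backward = nearest_neighbors_join(shops, stations, k=1)

# Hypothetical extra constraint on the right side.
allowed = {"kiosk", "cafe"}
# Constraint applied first (the "subquery" pattern), then the join.
filtered_first = nearest_neighbors_join(
    stations, {n: p for n, p in shops.items() if n in allowed}, k=1)
# Join first, constraint afterwards: station A loses its only match.
filtered_last = [(l, r) for l, r in forward if r in allowed]

print(forward)
print(backward)
print(filtered_first)
print(filtered_last)
```

With this sample data, the forward join pairs A with bakery and B with cafe, while the reverse join also returns a row for kiosk. Filtering before the join still gives every station one neighbor, whereas filtering afterwards drops station A entirely.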

codecov bot commented Sep 30, 2024

Codecov Report

Attention: Patch coverage is 95.85799% with 7 lines in your changes missing coverage. Please review.

Project coverage is 88.40%. Comparing base (b97c44c) to head (c4be461).
Report is 1 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/engine/SpatialJoin.cpp | 96.22% | 1 Missing and 5 partials ⚠️ |
| src/engine/QueryPlanner.cpp | 66.66% | 0 Missing and 1 partial ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #1528      +/-   ##
==========================================
+ Coverage   88.33%   88.40%   +0.07%     
==========================================
  Files         362      362              
  Lines       27319    27426     +107     
  Branches     3682     3705      +23     
==========================================
+ Hits        24131    24245     +114     
+ Misses       1952     1945       -7     
  Partials     1236     1236              


Member

@joka921 joka921 left a comment

This already looks very very promising and is also working like a charm (apart from a serious lifetime bug, which we fortunately haven't triggered so far:)).

I have left you an initial round of comments so you can work on improving this really nice feature.

@ullingerc ullingerc requested a review from joka921 October 7, 2024 10:57
@ullingerc ullingerc marked this pull request as ready for review October 10, 2024 13:11
Member

@joka921 joka921 left a comment

Thank you very much, I think we can merge this now.

@joka921 joka921 changed the title from "Introduce SpatialJoin nearest neighbor search" to "Introduce nearest neighbor search for points" Oct 11, 2024
@joka921 joka921 merged commit 309f6b7 into ad-freiburg:master Oct 11, 2024
20 checks passed
Member

hannahbast commented Oct 15, 2024

@ullingerc This is now active on our OSM Planet instance. I just noticed a small problem, which might have an easy fix. For example, consider https://qlever.cs.uni-freiburg.de/osm-planet/N1UUCA . QLever should compute the result of the subquery first (which contains the magic <nearest-neighbors:1> predicate), but it does not. The reason seems to be the huge size estimate for the result of the nearest neighbor join (3,517,552,729,752 according to the "Analysis" tree). Shouldn't the size estimate for this join simply be the size estimate of the left side?
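
The suggestion above can be sketched as follows. Since each row on the left side contributes at most k result rows, k times the left size is an upper bound on the result size, and for k = 1 this is exactly the size of the left side. The function and parameter names below are hypothetical and do not reflect QLever's actual query planner API.

```python
def knn_join_size_estimate(left_size: int, right_size: int, k: int) -> int:
    """Upper bound on the number of result rows of a k-nearest-neighbor join.

    Each left row matches at most min(k, right_size) right rows.
    """
    if k <= 0:
        raise ValueError("k must be a positive integer")
    return left_size * min(k, right_size)


# Hypothetical sizes, chosen only to show the effect of k.
print(knn_join_size_estimate(left_size=1_000, right_size=5_000_000, k=1))
print(knn_join_size_estimate(left_size=1_000, right_size=5_000_000, k=3))
```

Such a bound grows linearly in the left side and is independent of how large the right side is, as long as the right side has at least k entries.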
