Move anonymous Weight implementation in PointRangeQuery to named class #13711

jainankitk · 2024-09-03T15:52:00Z

Description

Moves the anonymous Weight implementation in PointRangeQuery#createWeight to a named class for better extensibility and reusability.

Signed-off-by: Ankit Jain <[email protected]>
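For readers unfamiliar with the pattern, the refactoring described above can be sketched roughly as follows. This is a simplified illustration, not the PR's diff: `SimpleWeight`, `SimpleRangeQuery`, and `RangeWeight` are hypothetical stand-ins and do not mirror Lucene's real `Weight` or `PointRangeQuery` signatures.

```java
// Hypothetical stand-in for Lucene's Weight; NOT the real API.
abstract class SimpleWeight {
    abstract boolean matches(int docValue);
}

// Hypothetical stand-in for a range query over a single int value per document.
class SimpleRangeQuery {
    final int lower;
    final int upper;

    SimpleRangeQuery(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    // Before: the logic lives in an anonymous class, so no other code can name it.
    SimpleWeight createWeightAnonymous() {
        return new SimpleWeight() {
            @Override
            boolean matches(int docValue) {
                return docValue >= lower && docValue <= upper;
            }
        };
    }

    // After: the same logic in a named nested class that other code can reference.
    static class RangeWeight extends SimpleWeight {
        private final int lower;
        private final int upper;

        RangeWeight(int lower, int upper) {
            this.lower = lower;
            this.upper = upper;
        }

        @Override
        boolean matches(int docValue) {
            return docValue >= lower && docValue <= upper;
        }
    }

    SimpleWeight createWeightNamed() {
        return new RangeWeight(lower, upper);
    }
}
```

Both factory methods behave identically; the only difference is that the second returns a type with a name.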

jpountz · 2024-09-03T16:11:44Z

I see from the linked issue that you would like to extend PointRangeQuery, but in general we don't like to think of our queries as being extensible. I wonder if you could do what you need through composition rather than extension?


jainankitk · 2024-09-03T17:48:33Z

> I see from the linked issue that you would like to extend PointRangeQuery, but in general we don't like to think of our queries as being extensible. I wonder if you could do what you need through composition rather than extension?

@jpountz - Thanks for your feedback. I agree with preferring composition over extension for PointRangeQuery itself. Even so, the Weight implementation inside PointRangeQuery becomes more reusable (via composition rather than extension) once it is a named class instead of an anonymous one.
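A minimal sketch of what reuse via composition could look like once an implementation has a name. Everything here is an assumption for illustration: `DocMatcher`, `RangeMatcher`, and `LimitedMatcher` are hypothetical types, not Lucene or OpenSearch classes.

```java
// Hypothetical stand-in for a Weight-like contract; NOT Lucene's real API.
interface DocMatcher {
    boolean matches(int docValue);
}

// A named implementation, analogous to the nested class proposed in this PR.
final class RangeMatcher implements DocMatcher {
    private final int lower;
    private final int upper;

    RangeMatcher(int lower, int upper) {
        this.lower = lower;
        this.upper = upper;
    }

    @Override
    public boolean matches(int docValue) {
        return docValue >= lower && docValue <= upper;
    }
}

// Composition: wraps any DocMatcher and stops accepting after `limit` hits,
// without subclassing RangeMatcher.
final class LimitedMatcher implements DocMatcher {
    private final DocMatcher delegate;
    private final int limit;
    private int accepted = 0;

    LimitedMatcher(DocMatcher delegate, int limit) {
        this.delegate = delegate;
        this.limit = limit;
    }

    @Override
    public boolean matches(int docValue) {
        if (accepted >= limit) {
            return false; // budget exhausted, reject everything else
        }
        if (!delegate.matches(docValue)) {
            return false;
        }
        accepted++;
        return true;
    }
}
```

The wrapper holds a reference to the inner matcher rather than inheriting from it, which is the distinction between composition and extension discussed in this thread.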

jpountz · 2024-09-03T20:58:30Z

Sorry, I don't think we should make Lucene's Weight implementations public.

I looked up the OpenSearch issue. If I understand correctly, the problem you're trying to solve is that it's wasteful for PointRangeQuery to evaluate the whole range when it's only asked for the first 10 doc IDs that match the range query. I agree it's wasteful. We have the same problem with the IntNRQ task on nightly benchmarks. I wonder if there are better ways to do what you're after, e.g. adding a TopDocs Weight#topk(int n, int totalHitsThreshold) API that would default to collecting hits, and that some classes such as PointRangeQuery could override. As I'm writing this, I'm not convinced that it's actually a good idea. Using sparse indexing would likely be a better approach, especially if the index can be sorted, as this would produce good iterators that don't have this huge up-front cost of evaluating the query against the entire segment.
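The `topk` idea floated above (a default that collects hits, with an override that stops early) might be sketched like this. No such method exists in Lucene as far as this thread shows; `SketchWeight`, `SketchRangeWeight`, and the `visited` counter are hypothetical, and the `totalHitsThreshold` parameter is omitted for brevity.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Weight; names and signatures are illustrative only.
abstract class SketchWeight {
    // Returns every matching doc ID in increasing order (the expensive path).
    abstract List<Integer> allMatches();

    // Default behavior: evaluate the whole query, then keep the first n hits.
    List<Integer> topk(int n) {
        List<Integer> all = allMatches();
        return new ArrayList<>(all.subList(0, Math.min(Math.max(n, 0), all.size())));
    }
}

// A range implementation that overrides topk to stop scanning early.
final class SketchRangeWeight extends SketchWeight {
    private final int[] valueByDoc; // valueByDoc[i] is the value of doc i
    private final int lower;
    private final int upper;
    int visited = 0; // docs examined by the last call, for illustration

    SketchRangeWeight(int[] valueByDoc, int lower, int upper) {
        this.valueByDoc = valueByDoc;
        this.lower = lower;
        this.upper = upper;
    }

    @Override
    List<Integer> allMatches() {
        visited = 0;
        List<Integer> hits = new ArrayList<>();
        for (int doc = 0; doc < valueByDoc.length; doc++) {
            visited++;
            if (valueByDoc[doc] >= lower && valueByDoc[doc] <= upper) {
                hits.add(doc);
            }
        }
        return hits;
    }

    @Override
    List<Integer> topk(int n) {
        visited = 0;
        List<Integer> hits = new ArrayList<>();
        // Stop as soon as n hits are found instead of scanning every doc.
        for (int doc = 0; doc < valueByDoc.length && hits.size() < n; doc++) {
            visited++;
            if (valueByDoc[doc] >= lower && valueByDoc[doc] <= upper) {
                hits.add(doc);
            }
        }
        return hits;
    }
}
```

This sketch scans values linearly, so it says nothing about how a real points index would terminate early; it only shows the default-plus-override shape of the proposed API.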

github-actions · 2024-09-18T00:21:19Z

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

jainankitk added 2 commits August 20, 2024 10:50

Moving Weight implementation to nested class from anonymous

0080673

Merge branch 'main' into prq-refactor

cb1c7d5

jainankitk mentioned this pull request Sep 3, 2024

Introduce ApproximateRangeQuery and ApproximateableQuery opensearch-project/OpenSearch#13788

Merged

9 tasks

Fixing spotless violations

ffeced2

Signed-off-by: Ankit Jain <[email protected]>

jainankitk added 2 commits September 3, 2024 09:12

Adding missing javadoc

918add7

Signed-off-by: Ankit Jain <[email protected]>

Fixing spotless violations

e16639b

Signed-off-by: Ankit Jain <[email protected]>

github-actions bot added the Stale label Sep 18, 2024
