LUCENE-9555: Advance conjuction Iterator for two phase iteration #1943
Some collectors provide iterators that can efficiently skip non-competitive docs. When using the DefaultBulkScorer#score function, we create a conjunction of scorerIterator and collectorIterator. Because collectorIterator always starts from docID = -1, and creating a conjunction iterator requires all of its sub-iterators to be on the same doc, creating the conjunction iterator fails if scorerIterator has already been advanced to some other document. This patch ensures that we create a conjunction between scorerIterator and collectorIterator only if scorerIterator has not been advanced yet.
Relates to apache#1725
Relates to apache#1937
This patch also addresses the failure of
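The precondition described above can be illustrated with a minimal, self-contained sketch. It does not use Lucene classes: `SimpleDocIterator` and `ConjunctionGuard` are hypothetical stand-ins invented for this illustration, and the check only mirrors the rule stated in the description (build the conjunction only while the scorer iterator is still unpositioned).

```java
/** Hypothetical stand-in for a doc ID iterator; not Lucene's DocIdSetIterator. */
final class SimpleDocIterator {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;
  private final int[] docs; // sorted, distinct doc IDs
  private int idx = -1;     // -1 means "not positioned yet"

  SimpleDocIterator(int... docs) {
    this.docs = docs;
  }

  int docID() {
    if (idx < 0) {
      return -1;
    }
    return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
  }

  /** Moves past the current doc to the first doc that is >= target. */
  int advance(int target) {
    idx++;
    while (idx < docs.length && docs[idx] < target) {
      idx++;
    }
    return docID();
  }
}

/** Hypothetical helper mirroring the rule in the description above. */
final class ConjunctionGuard {
  /** A conjunction is only safe to build while both iterators are unpositioned. */
  static boolean canBuildConjunction(SimpleDocIterator scorer, SimpleDocIterator collector) {
    return scorer.docID() == -1 && collector.docID() == -1;
  }
}
```

With fresh iterators the guard accepts the pair; once the scorer iterator has been advanced while the collector iterator is still at -1, it rejects the pair, which is the situation the description says makes conjunction creation fail.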
It's a pity to disable this great optimization when scoring ranges of doc IDs versus scoring the entire doc ID space at once. Maybe we can apply the leap-frog logic without using ConjunctionDISI in that method, or alternatively wrap the scorer DISI in a way that makes usage of ConjunctionDISI legal?
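import java.io.IOException;
import org.apache.lucene.search.DocIdSetIterator;
class RangeDISIWrapper extends DocIdSetIterator {
  private final DocIdSetIterator in;
  private final int min, max;
  private int doc = -1;
  RangeDISIWrapper(DocIdSetIterator in, int min, int max) {
    this.in = in; this.min = min; this.max = max;
  }
  @Override
  public int docID() { return doc; }
  @Override
  public int nextDoc() throws IOException { return advance(doc + 1); }
  @Override
  public int advance(int target) throws IOException {
    // Clamp to [min, max); assumes in.docID() < target, as advance() requires.
    target = Math.max(min, target);
    if (target >= max) {
      return doc = NO_MORE_DOCS;
    }
    int next = in.advance(target);
    return doc = next < max ? next : NO_MORE_DOCS;
  }
  @Override
  public long cost() { return in.cost(); }
}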
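The first alternative mentioned in the comment, applying the leap-frog logic directly instead of going through ConjunctionDISI, might look roughly like the sketch below. It is self-contained and does not use Lucene: `ArrayDocIterator` and `LeapFrogSketch` are hypothetical names, and the loop is only an illustration of leap-frogging two iterators over a doc ID range `[min, max)` when they start at different positions.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sorted doc ID iterator; not a Lucene class. */
final class ArrayDocIterator {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;
  private final int[] docs; // sorted, distinct doc IDs
  private int idx = -1;

  ArrayDocIterator(int... docs) {
    this.docs = docs;
  }

  int docID() {
    if (idx < 0) {
      return -1;
    }
    return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
  }

  /** Moves past the current doc to the first doc that is >= target. */
  int advance(int target) {
    idx++;
    while (idx < docs.length && docs[idx] < target) {
      idx++;
    }
    return docID();
  }
}

final class LeapFrogSketch {
  /** Collects docs in [min, max) present in both iterators, wherever each one starts. */
  static List<Integer> leapFrog(ArrayDocIterator scorer, ArrayDocIterator collector, int min, int max) {
    List<Integer> hits = new ArrayList<>();
    // The scorer may already sit at or beyond min; only advance it if it is behind.
    int doc = scorer.docID() < min ? scorer.advance(min) : scorer.docID();
    while (doc < max) {
      // Catch the collector iterator up to the scorer's current doc if it is behind.
      int other = collector.docID() < doc ? collector.advance(doc) : collector.docID();
      if (other == doc) {
        hits.add(doc);
        doc = scorer.advance(doc + 1);
      } else if (other >= max) {
        break; // nothing competitive is left in this range
      } else {
        doc = scorer.advance(other); // skip the scorer over non-competitive docs
      }
    }
    return hits;
  }
}
```

Because neither iterator is required to start at -1, the same pair of iterators can be reused across consecutive ranges, which is the case the comment is concerned about.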
@jpountz Sorry for the noise. I have found the cause of this error, and the latest commit addresses it. My next steps will be the following:
Plan B:
I am interested in your opinion: which plan is better?
PR #1351 introduced a sort optimization where
non-competitive documents can be skipped.
However, there was a bug when a two-phase
approximation was used: the approximation was advanced
without advancing the overall conjunction iterator.
This patch fixes it.
Relates to #1351
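The fix described in the commit message can be pictured with the following self-contained sketch. It does not use Lucene: `SortedDocs`, `TwoWayConjunction` and `TwoPhaseSketch` are hypothetical names, and the `IntPredicate` stands in for the second-phase match check. The point it illustrates is that the scoring loop steps the conjunction, not the approximation on its own, so the competitive iterator stays in sync.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntPredicate;

/** Hypothetical sorted doc ID iterator; not a Lucene class. */
final class SortedDocs {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;
  private final int[] docs; // sorted, distinct doc IDs
  private int idx = -1;

  SortedDocs(int... docs) {
    this.docs = docs;
  }

  int docID() {
    if (idx < 0) {
      return -1;
    }
    return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
  }

  /** Moves past the current doc to the first doc that is >= target. */
  int advance(int target) {
    idx++;
    while (idx < docs.length && docs[idx] < target) {
      idx++;
    }
    return docID();
  }
}

/** Hypothetical conjunction of an approximation and a competitive (collector) iterator. */
final class TwoWayConjunction {
  private final SortedDocs approximation;
  private final SortedDocs competitive;

  TwoWayConjunction(SortedDocs approximation, SortedDocs competitive) {
    if (approximation.docID() != competitive.docID()) {
      throw new IllegalArgumentException("sub-iterators must start on the same doc");
    }
    this.approximation = approximation;
    this.competitive = competitive;
  }

  /** Returns the next doc present in both sub-iterators, or NO_MORE_DOCS. */
  int nextDoc() {
    int doc = approximation.advance(approximation.docID() + 1);
    while (doc != SortedDocs.NO_MORE_DOCS) {
      int other = competitive.docID() < doc ? competitive.advance(doc) : competitive.docID();
      if (other == doc) {
        return doc;
      }
      doc = approximation.advance(other);
    }
    return doc;
  }
}

final class TwoPhaseSketch {
  /** First phase: conjunction over the approximation; second phase: the matches check. */
  static List<Integer> score(SortedDocs approximation, IntPredicate matches, SortedDocs competitive) {
    TwoWayConjunction conjunction = new TwoWayConjunction(approximation, competitive);
    List<Integer> collected = new ArrayList<>();
    // Step the conjunction, never the approximation alone.
    for (int doc = conjunction.nextDoc(); doc != SortedDocs.NO_MORE_DOCS; doc = conjunction.nextDoc()) {
      if (matches.test(doc)) {
        collected.add(doc);
      }
    }
    return collected;
  }
}
```

Stepping `approximation` directly inside the loop would visit docs that the competitive iterator has already ruled out, which is the kind of mismatch the commit message describes.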