LUCENE-8599: Use sparse bitset to store docs in SingleValueDocValuesFieldUpdates #522

Merged: 1 commit, Dec 10, 2018

Conversation

s1monw
Member

@s1monw s1monw commented Dec 10, 2018

Using a sparse bitset in SingleValueDocValuesFieldUpdates allows storing
which documents have an update much more efficiently and removes the need
to sort the docs array altogether, which proved to be a significant bottleneck
in LUCENE-8598. Using the sparse bitset yields another 10x performance improvement
in applying updates compared to the changes proposed in LUCENE-8598.
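
To illustrate why a bitset removes the sort, here is a minimal sketch, not the code from this change. It assumes `java.util.BitSet` as a stand-in for Lucene's sparse bitset, and the class `UpdatedDocsSketch` with its methods is hypothetical. Setting a bit per updated document means doc IDs come back in increasing order just by walking the set bits, so no separate sort pass is needed.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; not Lucene's SingleValueDocValuesFieldUpdates. */
class UpdatedDocsSketch {
    // Stand-in for a sparse bitset: one bit per document that has an update.
    private final BitSet docsWithUpdate;
    // Latest value per updated document (assumption: last update wins).
    private final Map<Integer, Long> latestValue = new HashMap<>();

    UpdatedDocsSketch(int maxDoc) {
        this.docsWithUpdate = new BitSet(maxDoc);
    }

    /** Records an update; a later update to the same doc overwrites the earlier one. */
    void add(int doc, long value) {
        docsWithUpdate.set(doc);
        latestValue.put(doc, value);
    }

    /** Returns the updated doc IDs in increasing order by walking the set bits. */
    List<Integer> docsInOrder() {
        List<Integer> docs = new ArrayList<>();
        for (int doc = docsWithUpdate.nextSetBit(0); doc >= 0; doc = docsWithUpdate.nextSetBit(doc + 1)) {
            docs.add(doc);
        }
        return docs;
    }

    /** Returns the latest value recorded for a doc that has an update. */
    long valueFor(int doc) {
        Long value = latestValue.get(doc);
        if (value == null) {
            throw new IllegalArgumentException("no update recorded for doc " + doc);
        }
        return value;
    }
}
```

In this sketch, updates can arrive in any doc order, and `docsInOrder()` still yields them sorted because the bitset is indexed by doc ID.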

Contributor

@jpountz jpountz left a comment

We could even use DocIdSetBuilder to record docs in this case, which should be more memory-efficient and maybe also faster.

@jpountz
Contributor

jpountz commented Dec 10, 2018

For the record, DocIdSetBuilder needs to sort as well if the number of collected docs is low, but it does it with a pretty fast radix sort.
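
For readers unfamiliar with the technique, a least-significant-digit radix sort over doc IDs can be sketched as below. This is an illustration under stated assumptions, not DocIdSetBuilder's actual implementation: the class name `DocIdRadixSortSketch` is hypothetical, and the sketch assumes non-negative doc IDs and a fixed four passes of 8 bits each.

```java
/** Hypothetical illustration of an LSD radix sort; not Lucene's sorter. */
final class DocIdRadixSortSketch {
    /** Sorts non-negative ints in place using four stable counting passes of 8 bits each. */
    static void sort(int[] docs) {
        int[] src = docs;
        int[] dst = new int[docs.length];
        for (int shift = 0; shift < 32; shift += 8) {
            // Histogram of the current byte, offset by one so the prefix sum
            // below turns counts[b] into the start index of bucket b.
            int[] counts = new int[257];
            for (int doc : src) {
                counts[((doc >>> shift) & 0xFF) + 1]++;
            }
            for (int i = 0; i < 256; i++) {
                counts[i + 1] += counts[i];
            }
            // Stable scatter into the destination buffer.
            for (int doc : src) {
                dst[counts[(doc >>> shift) & 0xFF]++] = doc;
            }
            int[] tmp = src;
            src = dst;
            dst = tmp;
        }
        // After an even number of passes the sorted data is back in `docs`.
    }
}
```

Each pass is linear in the number of docs, which is why this kind of sort can be fast for integer keys compared with a general comparison sort.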

@s1monw
Member Author

s1monw commented Dec 10, 2018

@jpountz I will get this in as is and then explore what we can do as a followup with DocIdSetBuilder. We do need additional stats to make efficient use of it IMO and this already yields a significant improvement.

@jpountz
Contributor

jpountz commented Dec 10, 2018

Sure!
