-
Notifications
You must be signed in to change notification settings - Fork 24.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make use of global ordinals in parent/child queries #5846
Conversation
@@ -70,6 +73,7 @@ public ChildrenConstantScoreQuery(ParentChildIndexFieldData parentChildIndexFiel | |||
this.originalChildQuery = childQuery; | |||
this.shortCircuitParentDocSet = shortCircuitParentDocSet; | |||
this.nonNestedDocsFilter = nonNestedDocsFilter; | |||
this.executionMode = ExecutionMode.fromString(executionHint); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe this should rather be done in the parser?
I think this is a very good start! |
Updated the PR and all has_child/has_parent queries are now using the global ordinals. |
this.parentType = parentType; | ||
private ParentOrdCollector(ParentChildIndexFieldData.WithOrdinals indexFieldData) { | ||
// TODO: look into setting it to maxOrd | ||
this.parentOrds = new OpenBitSet(512); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we could even use LongBitSet
instead of OpenBitSet
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 I didn't know about LongBitSet
, I'll add it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's pretty new. :-)
This looks good, I just left minor comments. |
…BitSet Moved ChildrenQuery and ParentQuery to LongHash
Updated the PR:
|
@@ -91,6 +91,7 @@ public DocIdSet getDocIdSet(final AtomicReaderContext context, final Bits accept | |||
return new DocIdSet() { | |||
@Override | |||
public DocIdSetIterator iterator() throws IOException { | |||
// TODO: this fails with global ordinals! The context.ord is always 0 which picks the wrong AtomicFD in GlobalOrdinalsFieldData |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure to get why context.ord is always 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This happens in the delete by query api, the searcher created there always has a view in one segment and its context.ord is therefor always 0, this messes with the readerIndex in global ordinals.
This is fine, since the p/c queries are going to be disallowed via: #5916
So once this is in, then I'll remove the TODO here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sounds good
LGTM |
… ordinals instead of being bytes values based. Closes #5846
By using global ordinals in p/c the execution will be much more efficient and therefor the time it takes to execute will much lower.
At the moment this PR only cuts over the
has_child
filter and query without score_mode (ChildrenConstantScoreQuery) to use global ordinals (via aexecution_hint
option, which is set toglobal_ordinals
by default).Update:
Some rough numbers regarding the performance improvements for p/c with global ordinals:
has_child
filter and query without score mode the execution time has been reduced around upto 5 times.has_child
query with a score_mode the execution time has been reduced around upto 1.5 times.has_parent
filter and query without score mode the execution time has been reduced around upto 4 times.has_parent
query with score_mode the execution time around upto 1.2 times.In general the transient memory needed at execution time for the mentioned queries has been reduced significantly (upto 50%).
In the case that the
has_child
andhas_parent
queries are executed on an index with 1 segment per shard (optimized index) then these queries will be an additional ~2 times faster.