-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL: Add way for Block
to keepMask
#112160
Conversation
This adds a `Block#keepMask(BooleanVector)` method that will make a new block, keeping all of the values where the vector is `true` and `null`ing all of the velues where the vector is false. This will be useful for implementing partial aggregation application like `| STATS MAX(a WHERE b > 1), MIN(j WHERE b > 2) BY bar`. Or however the syntax ends up being. We already skip `null` group keys and we can evaluate the `b > 2` bits to a mask pretty easily. It should also be useful in optimizing `CASE(a > 2, foo)` - but only when the RHS of the CASE is `null` and the LHS is a constant or constant-like. This is something that's very optimize-able. I haven't really optimized it in this PR, but it should be possible to speed this up a ton and remove a lot of copying. Here's where the benchmarks start: ``` (dataTypeAndBlockKind) Mode Cnt Score Error Units int/array avgt 7 3.705 ± 0.153 ns/op int/vector avgt 7 3.234 ± 0.078 ns/op ``` That's about the same speed as reading the block. In a few of these cases I expect we can get them to constant performance rather than per-record performance.
Pinging @elastic/es-analytical-engine (Team:Analytics) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 for BooleanVector (vs BooleanBlock).
There's plenty of optimization space from compact representation for the filter to convenience methods for skipping nulls (similar to a bitset), or combining this with the internal block BitSet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM and ++ to adding tests for float blocks!
} | ||
return (DoubleBlock) blockFactory().newConstantNullBlock(getPositionCount()); | ||
} | ||
try (DoubleBlock.Builder builder = blockFactory().newDoubleBlockBuilder(getPositionCount())) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the array blocks, couldn't we just incref the underlying vector and create a new nullsmask from the existing one?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. This is what I meant but "we can optimize this". I did the simplest thing that'd get the tests to pass and figured we could grab more later.
I was thinking that we could try and replace the masks on the Block
subclasses with BooleanVector
- or BooleanArrayVector
- and then in some cases this'd be just incRef-ing a few things and returning a new combined block.
try (IntBlock.Builder builder = blockFactory().newIntBlockBuilder(getPositionCount())) { | ||
// TODO if X-ArrayBlock used BooleanVector for it's null mask then we could shuffle references here. | ||
for (int p = 0; p < getPositionCount(); p++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Similarly here; an array block is a vector + nullmask + firstvalueindexes array; we could incref the vector and only provide the nullmask, and the firstvalueindexes should be null as there's no MVs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right - this is one that I think is super optimizeable. I went with the slow, plodding implementation.
x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/data/Block.java
Outdated
Show resolved
Hide resolved
@@ -83,6 +83,11 @@ public ConstantNullBlock filter(int... positions) { | |||
return (ConstantNullBlock) blockFactory().newConstantNullBlock(positions.length); | |||
} | |||
|
|||
@Override | |||
public ConstantNullBlock keepMask(BooleanVector mask) { | |||
return (ConstantNullBlock) blockFactory().newConstantNullBlock(getPositionCount()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we just incref?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I'll have a look at this one on it's own - there are a few cases where I think we can incRef; return this;
in this class that we're not doing.
assert false : "null vector"; | ||
throw new UnsupportedOperationException("null vector"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we throw UOE instead of returning a constant null block?
(This seems like something that could blow up in the future, where due to some optimization we end up having a constant null vector in a place where we apply a mask to the vector.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Vectors can never contain null
anyway. This exists to make the compiler happy. The class level javadoc mentions this - it's mostly so we can return something the implements all of the appropriate interfaces from ConstantNullBlock
. If there's a way to delete this class I'd be happy to.
…pute/data/Block.java Co-authored-by: Alexander Spies <[email protected]>
This has passed but the communication seems to have dropped. Merging on my own. |
This adds a `Block#keepMask(BooleanVector)` method that will make a new block, keeping all of the values where the vector is `true` and `null`ing all of the velues where the vector is false. This will be useful for implementing partial aggregation application like `| STATS MAX(a WHERE b > 1), MIN(j WHERE b > 2) BY bar`. Or however the syntax ends up being. We already skip `null` group keys and we can evaluate the `b > 2` bits to a mask pretty easily. It should also be useful in optimizing `CASE(a > 2, foo)` - but only when the RHS of the CASE is `null` and the LHS is a constant or constant-like. This is something that's very optimize-able. I haven't really optimized it in this PR, but it should be possible to speed this up a ton and remove a lot of copying. Here's where the benchmarks start: ``` (dataTypeAndBlockKind) Mode Cnt Score Error Units int/array avgt 7 3.705 ± 0.153 ns/op int/vector avgt 7 3.234 ± 0.078 ns/op ``` That's about the same speed as reading the block. In a few of these cases I expect we can get them to constant performance rather than per-record performance.
This adds a
Block#keepMask(BooleanVector)
method that will make a new block, keeping all of the values where the vector istrue
andnull
ing all of the values where the vector is false.This will be useful for implementing partial aggregation application like
| STATS MAX(a WHERE b > 1), MIN(j WHERE b > 2) BY bar
. Or however the syntax ends up being. We already skipnull
group keys and we can evaluate theb > 2
bits to a mask pretty easily. It should also be useful in optimizingCASE(a > 2, foo)
- but only when the RHS of the CASE isnull
and the LHS is a constant or constant-like.This is something that's very optimize-able. I haven't really optimized it in this PR, but it should be possible to speed this up a ton and remove a lot of copying. Here's where the benchmarks start:
That's about the same speed as reading the block. In a few of these cases I expect we can get them to constant performance rather than per-record performance.