-
Notifications
You must be signed in to change notification settings - Fork 25k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL: add driverContext to conversion function evaluators and use blockFactory to instantiate blocks #100016
ESQL: add driverContext to conversion function evaluators and use blockFactory to instantiate blocks #100016
Conversation
Pinging @elastic/es-ql (Team:QL) |
Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL) |
// UNORDERED, since whatever ordering there is, it isn't necessarily preserved | ||
: new BooleanArrayBlock(values, positionCount, null, nullsMask, Block.MvOrdering.UNORDERED); | ||
: new BooleanArrayBlock(values, positionCount, null, nullsMask, Block.MvOrdering.UNORDERED, driverContext.blockFactory()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for looking into this Luigi.
Hmm.. we need to be careful here. This constructor assumes that the breaker accountancy has been done. I think what you want is:
blockFactory.newBooleanArrayBlock(..)
And yes... this is a little error prone. We'll restrict access to these constructors once we're further along.
} | ||
builder.endControlFlow(); | ||
} | ||
builder.endControlFlow(); | ||
|
||
builder.addStatement("$T nullsMask = null", BitSet.class); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All this logic (nullMasks, BytesRefArrays) is already managed by BlockBuilders... or maybe I'm missing something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assuming that the direct usage of arrays here is a deliberate optimization, I tried to minimize the impact and just replace
$T values = new $T(positionCount, $T.NON_RECYCLING_INSTANCE)
with
$T values = new $T(positionCount, driverContext.bigArrays())
The change seems logic, but CsvTests don't seem to appreciate it:
org.elasticsearch.xpack.esql.CsvTests > test {string.ConvertFromDatetime} FAILED
java.lang.AssertionError:
Expected: <0L>
but: was <3432L>
at org.elasticsearch.xpack.esql.CsvTests.string.ConvertFromFloats(string.csv-spec:599)
at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
at org.junit.Assert.assertThat(Assert.java:956)
at org.junit.Assert.assertThat(Assert.java:923)
at org.elasticsearch.xpack.esql.CsvTests.executePlan(CsvTests.java:395)
at org.elasticsearch.xpack.esql.CsvTests.doTest(CsvTests.java:232)
and then
java.lang.RuntimeException: 2 arrays have not been released
at org.elasticsearch.common.util.MockBigArrays.ensureAllArraysAreReleased(MockBigArrays.java:114)
at org.elasticsearch.test.ESTestCase.checkStaticState(ESTestCase.java:665)
at org.elasticsearch.test.ESTestCase.after(ESTestCase.java:479)
Is it worth further investigation to keep the optimizations in place or should we just go with the builders?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd switch to the builders to be honest. The builders are going to be easier to easier to manage the memory of.
The direct usage of arrays was an optimization for sure. I think when we don't have to handle null we can use the Vector.FixedBuilder
thing I wrote. When we do have to handle null we could probably use the regular block builder. Later I think we could make Block.FixedBuilder
which gets the same optimizations back.
The goal with the FixedBuilder
s is that the JVM can inline all the method calls on it to be "the same" as when we operate on the arrays directly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case we'll have to handle nulls (conversions can fail), so I guess what this PR does now is good enough. We can review it when we have a bit more time and fine-tune some specific cases (eg. maybe it's safe to assume that conversions to string never fail)
@elasticmachine run elasticsearch-ci/bwc |
@elasticmachine update branch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I think the TODO you have in the description is actually done already.
We would love to make a Block.FixedBuilder
that has a fixed number of slots and has the efficiencies of Vector.FixedBuilder
but can handle nulls. But that's a problem for later.
I do think you have an extra ;
in all the generated files - zap one please!
...lasticsearch/xpack/esql/expression/function/scalar/convert/ToBooleanFromDoubleEvaluator.java
Outdated
Show resolved
Hide resolved
? new BooleanArrayVector(values, positionCount).asBlock() | ||
// UNORDERED, since whatever ordering there is, it isn't necessarily preserved | ||
: new BooleanArrayBlock(values, positionCount, null, nullsMask, Block.MvOrdering.UNORDERED); | ||
return builder.build(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This makes sense to me. :+!:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
++
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Belated LGTM. Thanks Luigi
…ckFactory to instantiate blocks (elastic#100016) Let convert functions (eg. `to_string()`, `to_int()`, `to_ip()` and so on) use DriverContext.blockFactory() to generate new blocks, so that they are properly attached to the circuit breaker.
These tests started failing after a combination of elastic#100029 and elastic#100016 which passed CI individually but aren't happy together. Proper fix incoming, but for now let's unblock some folks.
…ckFactory to instantiate blocks (elastic#100016) Let convert functions (eg. `to_string()`, `to_int()`, `to_ip()` and so on) use DriverContext.blockFactory() to generate new blocks, so that they are properly attached to the circuit breaker.
These tests started failing after a combination of elastic#100029 and elastic#100016 which passed CI individually but aren't happy together. Proper fix incoming, but for now let's unblock some folks.
Let convert functions (eg.
to_string()
,to_int()
,to_ip()
and so on) use DriverContext.blockFactory() to generate new blocks, so that they are properly attached to the circuit breaker.