Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL: add driverContext to conversion function evaluators and use blockFactory to instantiate blocks #100016

Conversation

luigidellaquila
Copy link
Contributor

@luigidellaquila luigidellaquila commented Sep 28, 2023

Let convert functions (eg. to_string(), to_int(), to_ip() and so on) use DriverContext.blockFactory() to generate new blocks, so that they are properly attached to the circuit breaker.

@luigidellaquila luigidellaquila changed the title ESQL: add driverContext to convert functions evaluators and use blockFactory to instantiate blocks ESQL: add driverContext to convert function evaluators and use blockFactory to instantiate blocks Sep 28, 2023
@elasticsearchmachine elasticsearchmachine added Team:QL (Deprecated) Meta label for query languages team v8.11.0 labels Sep 28, 2023
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-ql (Team:QL)

@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/elasticsearch-esql (:Query Languages/ES|QL)

@luigidellaquila luigidellaquila changed the title ESQL: add driverContext to convert function evaluators and use blockFactory to instantiate blocks ESQL: add driverContext to conversion function evaluators and use blockFactory to instantiate blocks Sep 28, 2023
// UNORDERED, since whatever ordering there is, it isn't necessarily preserved
: new BooleanArrayBlock(values, positionCount, null, nullsMask, Block.MvOrdering.UNORDERED);
: new BooleanArrayBlock(values, positionCount, null, nullsMask, Block.MvOrdering.UNORDERED, driverContext.blockFactory());
Copy link
Contributor

@ChrisHegarty ChrisHegarty Sep 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for looking into this Luigi.

Hmm.. we need to be careful here. This constructor assumes that the breaker accountancy has been done. I think what you want is:
blockFactory.newBooleanArrayBlock(..)

And yes... this is a little error prone. We'll restrict access to these constructors once we're further along.

}
builder.endControlFlow();
}
builder.endControlFlow();

builder.addStatement("$T nullsMask = null", BitSet.class);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this logic (nullMasks, BytesRefArrays) is already managed by BlockBuilders... or maybe I'm missing something?

Copy link
Contributor Author

@luigidellaquila luigidellaquila Sep 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming that the direct usage of arrays here is a deliberate optimization, I tried to minimize the impact and just replace

$T values = new $T(positionCount, $T.NON_RECYCLING_INSTANCE)

with

$T values = new $T(positionCount, driverContext.bigArrays())

The change seems logic, but CsvTests don't seem to appreciate it:

org.elasticsearch.xpack.esql.CsvTests > test {string.ConvertFromDatetime} FAILED
    java.lang.AssertionError: 
    Expected: <0L>
         but: was <3432L>
	at org.elasticsearch.xpack.esql.CsvTests.string.ConvertFromFloats(string.csv-spec:599)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
	at org.junit.Assert.assertThat(Assert.java:956)
	at org.junit.Assert.assertThat(Assert.java:923)
	at org.elasticsearch.xpack.esql.CsvTests.executePlan(CsvTests.java:395)
	at org.elasticsearch.xpack.esql.CsvTests.doTest(CsvTests.java:232)

and then

   java.lang.RuntimeException: 2 arrays have not been released
        at org.elasticsearch.common.util.MockBigArrays.ensureAllArraysAreReleased(MockBigArrays.java:114)
        at org.elasticsearch.test.ESTestCase.checkStaticState(ESTestCase.java:665)
        at org.elasticsearch.test.ESTestCase.after(ESTestCase.java:479)

Is it worth further investigation to keep the optimizations in place or should we just go with the builders?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd switch to the builders to be honest. The builders are going to be easier to easier to manage the memory of.

The direct usage of arrays was an optimization for sure. I think when we don't have to handle null we can use the Vector.FixedBuilder thing I wrote. When we do have to handle null we could probably use the regular block builder. Later I think we could make Block.FixedBuilder which gets the same optimizations back.

The goal with the FixedBuilders is that the JVM can inline all the method calls on it to be "the same" as when we operate on the arrays directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case we'll have to handle nulls (conversions can fail), so I guess what this PR does now is good enough. We can review it when we have a bit more time and fine-tune some specific cases (eg. maybe it's safe to assume that conversions to string never fail)

@luigidellaquila
Copy link
Contributor Author

@elasticmachine run elasticsearch-ci/bwc

@luigidellaquila
Copy link
Contributor Author

@elasticmachine update branch

Copy link
Member

@nik9000 nik9000 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I think the TODO you have in the description is actually done already.

We would love to make a Block.FixedBuilder that has a fixed number of slots and has the efficiencies of Vector.FixedBuilder but can handle nulls. But that's a problem for later.

I do think you have an extra ; in all the generated files - zap one please!

? new BooleanArrayVector(values, positionCount).asBlock()
// UNORDERED, since whatever ordering there is, it isn't necessarily preserved
: new BooleanArrayBlock(values, positionCount, null, nullsMask, Block.MvOrdering.UNORDERED);
return builder.build();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me. :+!:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

@luigidellaquila luigidellaquila added the auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) label Sep 29, 2023
@elasticsearchmachine elasticsearchmachine merged commit b30813b into elastic:main Sep 29, 2023
@luigidellaquila luigidellaquila deleted the esql/convert_functions_block_factory branch September 29, 2023 07:55
Copy link
Contributor

@ChrisHegarty ChrisHegarty left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Belated LGTM. Thanks Luigi

nik9000 added a commit that referenced this pull request Sep 29, 2023
These tests started failing after a combination of #100029 and #100016
which passed CI individually but aren't happy together. Proper fix
incoming, but for now let's unblock some folks.
piergm pushed a commit to piergm/elasticsearch that referenced this pull request Oct 2, 2023
…ckFactory to instantiate blocks (elastic#100016)

Let convert functions (eg. `to_string()`, `to_int()`, `to_ip()` and so
on) use DriverContext.blockFactory() to generate new blocks, so that
they are properly attached to the circuit breaker.
piergm pushed a commit to piergm/elasticsearch that referenced this pull request Oct 2, 2023
These tests started failing after a combination of elastic#100029 and elastic#100016
which passed CI individually but aren't happy together. Proper fix
incoming, but for now let's unblock some folks.
jakelandis pushed a commit to jakelandis/elasticsearch that referenced this pull request Oct 2, 2023
…ckFactory to instantiate blocks (elastic#100016)

Let convert functions (eg. `to_string()`, `to_int()`, `to_ip()` and so
on) use DriverContext.blockFactory() to generate new blocks, so that
they are properly attached to the circuit breaker.
jakelandis pushed a commit to jakelandis/elasticsearch that referenced this pull request Oct 2, 2023
These tests started failing after a combination of elastic#100029 and elastic#100016
which passed CI individually but aren't happy together. Proper fix
incoming, but for now let's unblock some folks.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) >non-issue Team:QL (Deprecated) Meta label for query languages team v8.11.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants