Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL: add driverContext to conversion function evaluators and use blockFactory to instantiate blocks #100016

Original file line number Diff line number Diff line change
Expand Up @@ -13,8 +13,6 @@
import com.squareup.javapoet.TypeName;
import com.squareup.javapoet.TypeSpec;

import java.util.BitSet;

import javax.lang.model.element.ExecutableElement;
import javax.lang.model.element.Modifier;
import javax.lang.model.element.TypeElement;
Expand All @@ -23,18 +21,13 @@
import static org.elasticsearch.compute.gen.Methods.appendMethod;
import static org.elasticsearch.compute.gen.Methods.getMethod;
import static org.elasticsearch.compute.gen.Types.ABSTRACT_CONVERT_FUNCTION_EVALUATOR;
import static org.elasticsearch.compute.gen.Types.BIG_ARRAYS;
import static org.elasticsearch.compute.gen.Types.BLOCK;
import static org.elasticsearch.compute.gen.Types.BYTES_REF;
import static org.elasticsearch.compute.gen.Types.BYTES_REF_ARRAY;
import static org.elasticsearch.compute.gen.Types.BYTES_REF_BLOCK;
import static org.elasticsearch.compute.gen.Types.DRIVER_CONTEXT;
import static org.elasticsearch.compute.gen.Types.EXPRESSION_EVALUATOR;
import static org.elasticsearch.compute.gen.Types.SOURCE;
import static org.elasticsearch.compute.gen.Types.VECTOR;
import static org.elasticsearch.compute.gen.Types.arrayBlockType;
import static org.elasticsearch.compute.gen.Types.arrayVectorType;
import static org.elasticsearch.compute.gen.Types.blockType;
import static org.elasticsearch.compute.gen.Types.constantVectorType;
import static org.elasticsearch.compute.gen.Types.vectorType;

public class ConvertEvaluatorImplementer {
Expand Down Expand Up @@ -79,6 +72,8 @@ private TypeSpec type() {
builder.addModifiers(Modifier.PUBLIC, Modifier.FINAL);
builder.superclass(ABSTRACT_CONVERT_FUNCTION_EVALUATOR);

builder.addField(DRIVER_CONTEXT, "driverContext", Modifier.PRIVATE, Modifier.FINAL);

builder.addMethod(ctor());
builder.addMethod(name());
builder.addMethod(evalVector());
Expand All @@ -92,7 +87,9 @@ private MethodSpec ctor() {
MethodSpec.Builder builder = MethodSpec.constructorBuilder().addModifiers(Modifier.PUBLIC);
builder.addParameter(EXPRESSION_EVALUATOR, "field");
builder.addParameter(SOURCE, "source");
builder.addParameter(DRIVER_CONTEXT, "driverContext");
builder.addStatement("super($N, $N)", "field", "source");
builder.addStatement("this.driverContext = driverContext");
return builder.build();
}

Expand Down Expand Up @@ -121,69 +118,44 @@ private MethodSpec evalVector() {
{
builder.beginControlFlow("try");
{
var constVectType = constantVectorType(resultType);
var constVectType = blockType(resultType);
builder.addStatement(
"return new $T($N, positionCount).asBlock()",
"return driverContext.blockFactory().newConstant$TWith($N, positionCount)",
constVectType,
evalValueCall("vector", "0", scratchPadName)
);
}
builder.nextControlFlow("catch (Exception e)");
{
builder.addStatement("registerException(e)");
builder.addStatement("return Block.constantNullBlock(positionCount)");
builder.addStatement("return Block.constantNullBlock(positionCount, driverContext.blockFactory())");
}
builder.endControlFlow();
}
builder.endControlFlow();

builder.addStatement("$T nullsMask = null", BitSet.class);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All this logic (nullMasks, BytesRefArrays) is already managed by BlockBuilders... or maybe I'm missing something?

Copy link
Contributor Author

@luigidellaquila luigidellaquila Sep 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assuming that the direct usage of arrays here is a deliberate optimization, I tried to minimize the impact and just replace

$T values = new $T(positionCount, $T.NON_RECYCLING_INSTANCE)

with

$T values = new $T(positionCount, driverContext.bigArrays())

The change seems logic, but CsvTests don't seem to appreciate it:

org.elasticsearch.xpack.esql.CsvTests > test {string.ConvertFromDatetime} FAILED
    java.lang.AssertionError: 
    Expected: <0L>
         but: was <3432L>
	at org.elasticsearch.xpack.esql.CsvTests.string.ConvertFromFloats(string.csv-spec:599)
	at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:18)
	at org.junit.Assert.assertThat(Assert.java:956)
	at org.junit.Assert.assertThat(Assert.java:923)
	at org.elasticsearch.xpack.esql.CsvTests.executePlan(CsvTests.java:395)
	at org.elasticsearch.xpack.esql.CsvTests.doTest(CsvTests.java:232)

and then

   java.lang.RuntimeException: 2 arrays have not been released
        at org.elasticsearch.common.util.MockBigArrays.ensureAllArraysAreReleased(MockBigArrays.java:114)
        at org.elasticsearch.test.ESTestCase.checkStaticState(ESTestCase.java:665)
        at org.elasticsearch.test.ESTestCase.after(ESTestCase.java:479)

Is it worth further investigation to keep the optimizations in place or should we just go with the builders?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd switch to the builders to be honest. The builders are going to be easier to easier to manage the memory of.

The direct usage of arrays was an optimization for sure. I think when we don't have to handle null we can use the Vector.FixedBuilder thing I wrote. When we do have to handle null we could probably use the regular block builder. Later I think we could make Block.FixedBuilder which gets the same optimizations back.

The goal with the FixedBuilders is that the JVM can inline all the method calls on it to be "the same" as when we operate on the arrays directly.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case we'll have to handle nulls (conversions can fail), so I guess what this PR does now is good enough. We can review it when we have a bit more time and fine-tune some specific cases (eg. maybe it's safe to assume that conversions to string never fail)

if (resultType.equals(BYTES_REF)) {
builder.addStatement(
"$T values = new $T(positionCount, $T.NON_RECYCLING_INSTANCE)", // TODO: see note in MvEvaluatorImplementer
BYTES_REF_ARRAY,
BYTES_REF_ARRAY,
BIG_ARRAYS
);
} else {
builder.addStatement("$T[] values = new $T[positionCount]", resultType, resultType);
}
ClassName returnBlockType = blockType(resultType);
builder.addStatement(
"$T.Builder builder = $T.newBlockBuilder(positionCount, driverContext.blockFactory())",
returnBlockType,
returnBlockType
);
builder.beginControlFlow("for (int p = 0; p < positionCount; p++)");
{
builder.beginControlFlow("try");
{
if (resultType.equals(BYTES_REF)) {
builder.addStatement("values.append($N)", evalValueCall("vector", "p", scratchPadName));
} else {
builder.addStatement("values[p] = $N", evalValueCall("vector", "p", scratchPadName));
}
builder.addStatement("builder.$L($N)", appendMethod(resultType), evalValueCall("vector", "p", scratchPadName));
}
builder.nextControlFlow("catch (Exception e)");
{
builder.addStatement("registerException(e)");
builder.beginControlFlow("if (nullsMask == null)");
{
builder.addStatement("nullsMask = new BitSet(positionCount)");
}
builder.endControlFlow();
builder.addStatement("nullsMask.set(p)");
if (resultType.equals(BYTES_REF)) {
builder.addStatement("values.append($T.NULL_VALUE)", BYTES_REF_BLOCK);
}
builder.addStatement("builder.appendNull()");
}
builder.endControlFlow();
}
builder.endControlFlow();

builder.addStatement(
"""
return nullsMask == null
? new $T(values, positionCount).asBlock()
// UNORDERED, since whatever ordering there is, it isn't necessarily preserved
: new $T(values, positionCount, null, nullsMask, Block.MvOrdering.UNORDERED)""",
arrayVectorType(resultType),
arrayBlockType(resultType)
);
builder.addStatement("return builder.build()");

return builder.build();
}
Expand All @@ -196,7 +168,11 @@ private MethodSpec evalBlock() {
builder.addStatement("$T block = ($T) b", blockType, blockType);
builder.addStatement("int positionCount = block.getPositionCount()");
TypeName resultBlockType = blockType(resultType);
builder.addStatement("$T.Builder builder = $T.newBlockBuilder(positionCount)", resultBlockType, resultBlockType);
builder.addStatement(
"$T.Builder builder = $T.newBlockBuilder(positionCount, driverContext.blockFactory())",
resultBlockType,
resultBlockType
);
String scratchPadName = null;
if (argumentType.equals(BYTES_REF)) {
scratchPadName = "scratchPad";
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,12 @@

import java.lang.Override;
import java.lang.String;
import java.util.BitSet;
import org.elasticsearch.compute.data.Block;
import org.elasticsearch.compute.data.BooleanArrayBlock;
import org.elasticsearch.compute.data.BooleanArrayVector;
import org.elasticsearch.compute.data.BooleanBlock;
import org.elasticsearch.compute.data.ConstantBooleanVector;
import org.elasticsearch.compute.data.DoubleBlock;
import org.elasticsearch.compute.data.DoubleVector;
import org.elasticsearch.compute.data.Vector;
import org.elasticsearch.compute.operator.DriverContext;
import org.elasticsearch.compute.operator.EvalOperator;
import org.elasticsearch.xpack.ql.tree.Source;

Expand All @@ -23,8 +20,12 @@
* This class is generated. Do not edit it.
*/
public final class ToBooleanFromDoubleEvaluator extends AbstractConvertFunction.AbstractEvaluator {
public ToBooleanFromDoubleEvaluator(EvalOperator.ExpressionEvaluator field, Source source) {
private final DriverContext driverContext;

public ToBooleanFromDoubleEvaluator(EvalOperator.ExpressionEvaluator field, Source source,
DriverContext driverContext) {
super(field, source);
this.driverContext = driverContext;
}

@Override
Expand All @@ -38,29 +39,22 @@ public Block evalVector(Vector v) {
int positionCount = v.getPositionCount();
if (vector.isConstant()) {
try {
return new ConstantBooleanVector(evalValue(vector, 0), positionCount).asBlock();
return driverContext.blockFactory().newConstantBooleanBlockWith(evalValue(vector, 0), positionCount);
} catch (Exception e) {
registerException(e);
return Block.constantNullBlock(positionCount);
return Block.constantNullBlock(positionCount, driverContext.blockFactory());
}
}
BitSet nullsMask = null;
boolean[] values = new boolean[positionCount];
BooleanBlock.Builder builder = BooleanBlock.newBlockBuilder(positionCount, driverContext.blockFactory());
for (int p = 0; p < positionCount; p++) {
try {
values[p] = evalValue(vector, p);
builder.appendBoolean(evalValue(vector, p));
} catch (Exception e) {
registerException(e);
if (nullsMask == null) {
nullsMask = new BitSet(positionCount);
}
nullsMask.set(p);
builder.appendNull();
}
}
return nullsMask == null
? new BooleanArrayVector(values, positionCount).asBlock()
// UNORDERED, since whatever ordering there is, it isn't necessarily preserved
: new BooleanArrayBlock(values, positionCount, null, nullsMask, Block.MvOrdering.UNORDERED);
return builder.build();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me. :+!:

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++

}

private static boolean evalValue(DoubleVector container, int index) {
Expand All @@ -72,7 +66,7 @@ private static boolean evalValue(DoubleVector container, int index) {
public Block evalBlock(Block b) {
DoubleBlock block = (DoubleBlock) b;
int positionCount = block.getPositionCount();
BooleanBlock.Builder builder = BooleanBlock.newBlockBuilder(positionCount);
BooleanBlock.Builder builder = BooleanBlock.newBlockBuilder(positionCount, driverContext.blockFactory());
for (int p = 0; p < positionCount; p++) {
int valueCount = block.getValueCount(p);
int start = block.getFirstValueIndex(p);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,12 @@

import java.lang.Override;
import java.lang.String;
import java.util.BitSet;
import org.elasticsearch.compute.data.Block;
import org.elasticsearch.compute.data.BooleanArrayBlock;
import org.elasticsearch.compute.data.BooleanArrayVector;
import org.elasticsearch.compute.data.BooleanBlock;
import org.elasticsearch.compute.data.ConstantBooleanVector;
import org.elasticsearch.compute.data.IntBlock;
import org.elasticsearch.compute.data.IntVector;
import org.elasticsearch.compute.data.Vector;
import org.elasticsearch.compute.operator.DriverContext;
import org.elasticsearch.compute.operator.EvalOperator;
import org.elasticsearch.xpack.ql.tree.Source;

Expand All @@ -23,8 +20,12 @@
* This class is generated. Do not edit it.
*/
public final class ToBooleanFromIntEvaluator extends AbstractConvertFunction.AbstractEvaluator {
public ToBooleanFromIntEvaluator(EvalOperator.ExpressionEvaluator field, Source source) {
private final DriverContext driverContext;

public ToBooleanFromIntEvaluator(EvalOperator.ExpressionEvaluator field, Source source,
DriverContext driverContext) {
super(field, source);
this.driverContext = driverContext;
}

@Override
Expand All @@ -38,29 +39,22 @@ public Block evalVector(Vector v) {
int positionCount = v.getPositionCount();
if (vector.isConstant()) {
try {
return new ConstantBooleanVector(evalValue(vector, 0), positionCount).asBlock();
return driverContext.blockFactory().newConstantBooleanBlockWith(evalValue(vector, 0), positionCount);
} catch (Exception e) {
registerException(e);
return Block.constantNullBlock(positionCount);
return Block.constantNullBlock(positionCount, driverContext.blockFactory());
}
}
BitSet nullsMask = null;
boolean[] values = new boolean[positionCount];
BooleanBlock.Builder builder = BooleanBlock.newBlockBuilder(positionCount, driverContext.blockFactory());
for (int p = 0; p < positionCount; p++) {
try {
values[p] = evalValue(vector, p);
builder.appendBoolean(evalValue(vector, p));
} catch (Exception e) {
registerException(e);
if (nullsMask == null) {
nullsMask = new BitSet(positionCount);
}
nullsMask.set(p);
builder.appendNull();
}
}
return nullsMask == null
? new BooleanArrayVector(values, positionCount).asBlock()
// UNORDERED, since whatever ordering there is, it isn't necessarily preserved
: new BooleanArrayBlock(values, positionCount, null, nullsMask, Block.MvOrdering.UNORDERED);
return builder.build();
}

private static boolean evalValue(IntVector container, int index) {
Expand All @@ -72,7 +66,7 @@ private static boolean evalValue(IntVector container, int index) {
public Block evalBlock(Block b) {
IntBlock block = (IntBlock) b;
int positionCount = block.getPositionCount();
BooleanBlock.Builder builder = BooleanBlock.newBlockBuilder(positionCount);
BooleanBlock.Builder builder = BooleanBlock.newBlockBuilder(positionCount, driverContext.blockFactory());
for (int p = 0; p < positionCount; p++) {
int valueCount = block.getValueCount(p);
int start = block.getFirstValueIndex(p);
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,12 @@

import java.lang.Override;
import java.lang.String;
import java.util.BitSet;
import org.elasticsearch.compute.data.Block;
import org.elasticsearch.compute.data.BooleanArrayBlock;
import org.elasticsearch.compute.data.BooleanArrayVector;
import org.elasticsearch.compute.data.BooleanBlock;
import org.elasticsearch.compute.data.ConstantBooleanVector;
import org.elasticsearch.compute.data.LongBlock;
import org.elasticsearch.compute.data.LongVector;
import org.elasticsearch.compute.data.Vector;
import org.elasticsearch.compute.operator.DriverContext;
import org.elasticsearch.compute.operator.EvalOperator;
import org.elasticsearch.xpack.ql.tree.Source;

Expand All @@ -23,8 +20,12 @@
* This class is generated. Do not edit it.
*/
public final class ToBooleanFromLongEvaluator extends AbstractConvertFunction.AbstractEvaluator {
public ToBooleanFromLongEvaluator(EvalOperator.ExpressionEvaluator field, Source source) {
private final DriverContext driverContext;

public ToBooleanFromLongEvaluator(EvalOperator.ExpressionEvaluator field, Source source,
DriverContext driverContext) {
super(field, source);
this.driverContext = driverContext;
}

@Override
Expand All @@ -38,29 +39,22 @@ public Block evalVector(Vector v) {
int positionCount = v.getPositionCount();
if (vector.isConstant()) {
try {
return new ConstantBooleanVector(evalValue(vector, 0), positionCount).asBlock();
return driverContext.blockFactory().newConstantBooleanBlockWith(evalValue(vector, 0), positionCount);
} catch (Exception e) {
registerException(e);
return Block.constantNullBlock(positionCount);
return Block.constantNullBlock(positionCount, driverContext.blockFactory());
}
}
BitSet nullsMask = null;
boolean[] values = new boolean[positionCount];
BooleanBlock.Builder builder = BooleanBlock.newBlockBuilder(positionCount, driverContext.blockFactory());
for (int p = 0; p < positionCount; p++) {
try {
values[p] = evalValue(vector, p);
builder.appendBoolean(evalValue(vector, p));
} catch (Exception e) {
registerException(e);
if (nullsMask == null) {
nullsMask = new BitSet(positionCount);
}
nullsMask.set(p);
builder.appendNull();
}
}
return nullsMask == null
? new BooleanArrayVector(values, positionCount).asBlock()
// UNORDERED, since whatever ordering there is, it isn't necessarily preserved
: new BooleanArrayBlock(values, positionCount, null, nullsMask, Block.MvOrdering.UNORDERED);
return builder.build();
}

private static boolean evalValue(LongVector container, int index) {
Expand All @@ -72,7 +66,7 @@ private static boolean evalValue(LongVector container, int index) {
public Block evalBlock(Block b) {
LongBlock block = (LongBlock) b;
int positionCount = block.getPositionCount();
BooleanBlock.Builder builder = BooleanBlock.newBlockBuilder(positionCount);
BooleanBlock.Builder builder = BooleanBlock.newBlockBuilder(positionCount, driverContext.blockFactory());
for (int p = 0; p < positionCount; p++) {
int valueCount = block.getValueCount(p);
int start = block.getFirstValueIndex(p);
Expand Down
Loading