ESQL: Compute support for filtering grouping aggs (#112476)
Adds support to the compute engine for filtering which positions are
processed by grouping aggs. This should allow syntax like

```
| STATS
       success = COUNT(*) WHERE 200 <= response_code AND response_code < 300,
      redirect = COUNT(*) WHERE 300 <= response_code AND response_code < 400,
    client_err = COUNT(*) WHERE 400 <= response_code AND response_code < 500,
    server_err = COUNT(*) WHERE 500 <= response_code AND response_code < 600,
   total_count = COUNT(*)
  BY hostname
```

We could translate the WHERE expression into an `ExpressionEvaluator`
and run it, then plug it into the filtering support added in this PR.
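
As a rough illustration of what one filtered count in the query above is meant to compute, here is a minimal plain-Java sketch. It does not use the compute engine at all: `SketchConditionalCount`, `countWhere`, and the array-based inputs are hypothetical names invented for this example, and the `IntPredicate` only stands in for an evaluated WHERE expression.

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.IntPredicate;

/** Hypothetical illustration of `COUNT(*) WHERE ... BY hostname`; not engine code. */
final class SketchConditionalCount {
    /** Counts rows per hostname, but only rows whose response code passes the filter. */
    static Map<String, Long> countWhere(String[] hostnames, int[] responseCodes, IntPredicate where) {
        Map<String, Long> counts = new TreeMap<>();
        for (int p = 0; p < hostnames.length; p++) {
            if (where.test(responseCodes[p])) {
                // Only matching positions contribute to their group.
                counts.merge(hostnames[p], 1L, Long::sum);
            }
        }
        return counts;
    }
}
```

Calling `countWhere(hosts, codes, code -> 200 <= code && code < 300)` would correspond to the `success` column. Hosts with no matching rows are simply absent from this map; how the engine reports such groups is outside the scope of the sketch.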

The actual filtering is done by a `FilteredGroupingAggregatorFunction`,
which wraps a regular `GroupingAggregatorFunction`. It first executes the
filter against the incoming `Page`, then `null`s any positions in the
group that don't match, and finally passes the resulting groups into the
real aggregator. When the real grouping aggregator implementation sees a
`null` value for a group it skips collecting that position.
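
The wrapping idea can be sketched in a few lines of plain Java. This is a simplified model under stated assumptions, not the real implementation: `SketchGroupingAgg`, `SketchCountAgg`, and `SketchFilteredAgg` are hypothetical names, an `Integer[]` with `null` entries stands in for the group block, and an `IntPredicate` stands in for the evaluated filter.

```java
import java.util.function.IntPredicate;

/** Hypothetical stand-in for a grouping aggregator (not the real interface). */
interface SketchGroupingAgg {
    /** groupIds[p] is the group of position p, or null to skip that position. */
    void add(int[] values, Integer[] groupIds);

    long count(int groupId);
}

/** Counts the positions collected into each group. */
final class SketchCountAgg implements SketchGroupingAgg {
    private final long[] counts;

    SketchCountAgg(int maxGroups) {
        counts = new long[maxGroups];
    }

    @Override
    public void add(int[] values, Integer[] groupIds) {
        for (Integer groupId : groupIds) {
            if (groupId == null) {
                continue; // a null group means "do not collect this position"
            }
            counts[groupId]++;
        }
    }

    @Override
    public long count(int groupId) {
        return counts[groupId];
    }
}

/** Wraps another aggregator and nulls the group of every position the filter rejects. */
final class SketchFilteredAgg implements SketchGroupingAgg {
    private final SketchGroupingAgg next;
    private final IntPredicate filter;

    SketchFilteredAgg(SketchGroupingAgg next, IntPredicate filter) {
        this.next = next;
        this.filter = filter;
    }

    @Override
    public void add(int[] values, Integer[] groupIds) {
        Integer[] masked = groupIds.clone(); // leave the caller's groups untouched
        for (int p = 0; p < masked.length; p++) {
            if (filter.test(values[p]) == false) {
                masked[p] = null;
            }
        }
        next.add(values, masked);
    }

    @Override
    public long count(int groupId) {
        return next.count(groupId);
    }
}
```

The wrapped aggregator needs no knowledge of the filter. It only has to honor the "skip `null` groups" rule, which is why the same wrapper can sit in front of any aggregation.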

We had to make two changes to every agg for this to work:

1. Add a method to force local group tracking mode on any aggregator.
   Previously this was only required if the agg encountered `null`
   values, but when we're filtering aggs we can no longer trust the
   `seen` parameter we get when building the result. This local group
   tracking mode lets us track what we've actually seen locally.
2. Add `Releasable` to the `AddInput` thing we use to handle chunked
   pages in grouping aggs. This is required because the results of the
   filter must be closed on completion.

Both of these are fairly trivial changes, but require touching every
aggregation.
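
Both changes can be modeled in a small, self-contained sketch. Everything here is a hypothetical simplification: `SketchTrackedMax` and `SketchAddInput` are invented names, a `BitSet` stands in for the local tracking structure, `AutoCloseable` stands in for `Releasable`, and a `boolean[]` stands in for the filter's results. The diff shows the real call as `state.enableGroupIdTracking(seenGroupIds)`; this sketch drops the argument and assumes tracking is enabled before any input arrives.

```java
import java.util.Arrays;
import java.util.BitSet;

/** Hypothetical max-per-group state with optional local "seen" tracking. */
final class SketchTrackedMax {
    private final long[] max;
    private BitSet seenLocally; // stays null until tracking is forced on

    SketchTrackedMax(int maxGroups) {
        max = new long[maxGroups];
        Arrays.fill(max, Long.MIN_VALUE);
    }

    /** Forces local tracking; in this sketch it must be called before any input. */
    void enableGroupIdTracking() {
        if (seenLocally == null) {
            seenLocally = new BitSet();
        }
    }

    void add(Integer groupId, long value) {
        if (groupId == null) {
            return; // filtered-out position: nothing is collected
        }
        if (seenLocally != null) {
            seenLocally.set(groupId);
        }
        max[groupId] = Math.max(max[groupId], value);
    }

    /** Returns null for a group this aggregator never actually collected. */
    Long result(int groupId) {
        if (seenLocally != null && seenLocally.get(groupId) == false) {
            return null;
        }
        return max[groupId];
    }
}

/** Hypothetical per-page input handle that owns the filter's output. */
final class SketchAddInput implements AutoCloseable {
    private boolean[] mask; // stand-in for the filter's results

    SketchAddInput(boolean[] mask) {
        this.mask = mask;
    }

    boolean keep(int position) {
        return mask[position];
    }

    boolean isClosed() {
        return mask == null;
    }

    @Override
    public void close() {
        mask = null; // release the filter's results once the page is done
    }
}
```

Wrapping the per-page handle in try-with-resources guarantees the filter's results are released even if collecting a chunk throws. Without the local tracking, a group that exists in the grouping but had every position filtered out would report `Long.MIN_VALUE` here instead of "no value".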
nik9000 authored Sep 9, 2024
1 parent c946617 commit 72248e3
Showing 68 changed files with 1,145 additions and 31 deletions.
@@ -182,6 +182,7 @@ private TypeSpec type() {
        builder.addMethod(addRawInputLoop(INT_VECTOR, valueVectorType(init, combine)));
        builder.addMethod(addRawInputLoop(INT_BLOCK, valueBlockType(init, combine)));
        builder.addMethod(addRawInputLoop(INT_BLOCK, valueVectorType(init, combine)));
        builder.addMethod(selectedMayContainUnseenGroups());
        builder.addMethod(addIntermediateInput());
        builder.addMethod(addIntermediateRowInput());
        builder.addMethod(evaluateIntermediate());
@@ -338,6 +339,9 @@ private TypeSpec addInput(Consumer<MethodSpec.Builder> addBlock) {
        addBlock.accept(vector);
        builder.addMethod(vector.build());

        MethodSpec.Builder close = MethodSpec.methodBuilder("close").addAnnotation(Override.class).addModifiers(Modifier.PUBLIC);
        builder.addMethod(close.build());

        return builder.build();
    }

@@ -485,6 +489,14 @@ private void combineRawInputForBytesRef(MethodSpec.Builder builder, String block
        builder.addStatement("$T.combine(state, groupId, $L.getBytesRef($L, scratch))", declarationType, blockVariable, offsetVariable);
    }

    private MethodSpec selectedMayContainUnseenGroups() {
        MethodSpec.Builder builder = MethodSpec.methodBuilder("selectedMayContainUnseenGroups");
        builder.addAnnotation(Override.class).addModifiers(Modifier.PUBLIC);
        builder.addParameter(SEEN_GROUP_IDS, "seenGroupIds");
        builder.addStatement("state.enableGroupIdTracking(seenGroupIds)");
        return builder.build();
    }

    private MethodSpec addIntermediateInput() {
        MethodSpec.Builder builder = MethodSpec.methodBuilder("addIntermediateInput");
        builder.addAnnotation(Override.class).addModifiers(Modifier.PUBLIC);

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
