[8.x] ESQL: Compute support for filtering ungrouped aggs (#112717) #112763

Merged
merged 1 commit into from
Sep 11, 2024

@nik9000 nik9000 commented Sep 11, 2024

Backports the following commits to 8.x:

Adds support to the compute engine for filtering which positions are
processed by ungrouped aggs. This should allow syntax like:

```
| STATS
       success = COUNT(*) WHERE 200 <= response_code AND response_code < 300,
      redirect = COUNT(*) WHERE 300 <= response_code AND response_code < 400,
    client_err = COUNT(*) WHERE 400 <= response_code AND response_code < 500,
    server_err = COUNT(*) WHERE 500 <= response_code AND response_code < 600,
   total_count = COUNT(*)
```

We could translate the WHERE expression into an `ExpressionEvaluator`
and run it, then plug it into the filtering support added in this PR.

The actual filtering is done by creating a `FilteredAggregatorFunction`
which wraps a regular `AggregatorFunction`. It first executes the filter
against the incoming `Page` and then passes the resulting mask to the
wrapped `AggregatorFunction`. To support this we've added a `mask`
parameter to `AggregatorFunction#process`, which each aggregation
function must use for filtering.
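
As a rough illustration of the wrapping described above, here is a minimal, self-contained sketch. It is not the code from this PR: `MaskedAggregator`, `CountAggregator` and `FilteredAggregator` are hypothetical stand-ins, plain `long[]` and `boolean[]` arrays stand in for `Page` and the mask block, and a `LongPredicate` stands in for the `ExpressionEvaluator`.

```java
import java.util.function.LongPredicate;

/** Hypothetical stand-in for an aggregator whose process method takes a mask. */
interface MaskedAggregator {
    void process(long[] values, boolean[] mask);

    long result();
}

/** Counts only the positions whose mask entry is true. */
final class CountAggregator implements MaskedAggregator {
    private long count;

    @Override
    public void process(long[] values, boolean[] mask) {
        for (int i = 0; i < values.length; i++) {
            if (mask[i]) {
                count++;
            }
        }
    }

    @Override
    public long result() {
        return count;
    }
}

/** Wraps another aggregator: runs the filter first, then forwards the resulting mask. */
final class FilteredAggregator implements MaskedAggregator {
    private final MaskedAggregator next;
    private final LongPredicate filter; // assumption: stands in for the evaluated WHERE expression

    FilteredAggregator(MaskedAggregator next, LongPredicate filter) {
        this.next = next;
        this.filter = filter;
    }

    @Override
    public void process(long[] values, boolean[] incomingMask) {
        boolean[] mask = new boolean[values.length];
        for (int i = 0; i < values.length; i++) {
            // A position survives only if it passed the incoming mask and the filter.
            mask[i] = incomingMask[i] && filter.test(values[i]);
        }
        next.process(values, mask);
    }

    @Override
    public long result() {
        return next.result();
    }
}
```

In this sketch, wrapping `new CountAggregator()` with the predicate `code -> 200 <= code && code < 300` plays the role of the `success` column in the `STATS` example.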

We keep the unfiltered behavior by sending a constant block with `true`
in it. Each agg detects this and takes an "unfiltered" path, preserving
the original performance.
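
The "unfiltered" path could look roughly like the sketch below. Again, this is an illustration under assumptions rather than the actual implementation: `Mask` and `MinAggregator` are hypothetical names, and `Mask.constantTrue()` stands in for the constant block containing `true`.

```java
/** Hypothetical mask: either one constant value for all positions or a per-position array. */
final class Mask {
    private final boolean[] bits; // null when the mask is constant
    private final boolean constant;

    private Mask(boolean[] bits, boolean constant) {
        this.bits = bits;
        this.constant = constant;
    }

    static Mask constantTrue() {
        return new Mask(null, true);
    }

    static Mask of(boolean... bits) {
        return new Mask(bits.clone(), false);
    }

    boolean isAllTrue() {
        return bits == null && constant;
    }

    boolean get(int position) {
        return bits == null ? constant : bits[position];
    }
}

/** Tracks the minimum value, with a separate loop for the unfiltered case. */
final class MinAggregator {
    private long min = Long.MAX_VALUE;

    void process(long[] values, Mask mask) {
        if (mask.isAllTrue()) {
            // Unfiltered path: the same tight loop as before masks existed.
            for (long v : values) {
                min = Math.min(min, v);
            }
            return;
        }
        // Filtered path: consult the mask for every position.
        for (int i = 0; i < values.length; i++) {
            if (mask.get(i)) {
                min = Math.min(min, values[i]);
            }
        }
    }

    long result() {
        return min;
    }
}
```

The point of the check is that it happens once per call rather than once per position, so the unfiltered loop has no extra per-value work.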

Importantly, when you don't turn this on, it doesn't affect performance:

```
 (blockType)  (grouping)   (op)  Score    Error -> Score    Error  Units
vector_longs        none  count  0.007 ±  0.001 -> 0.007 ±  0.001  ns/op
vector_longs        none    min  0.123 ±  0.004 -> 0.128 ±  0.005  ns/op
vector_longs       longs  count  4.311 ±  0.192 -> 4.218 ±  0.053  ns/op
vector_longs       longs    min  5.476 ±  0.077 -> 5.451 ±  0.074  ns/op
```
@nik9000 nik9000 requested a review from a team as a code owner September 11, 2024 19:42
@nik9000 nik9000 added :Analytics/ES|QL AKA ESQL >non-issue auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) backport Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) labels Sep 11, 2024
@elasticsearchmachine elasticsearchmachine merged commit 05323e7 into elastic:8.x Sep 11, 2024
15 checks passed
@nik9000 nik9000 deleted the backport/8.x/pr-112717 branch September 11, 2024 20:41
Labels
:Analytics/ES|QL AKA ESQL auto-merge-without-approval Automatically merge pull request when CI checks pass (NB doesn't wait for reviews!) backport >non-issue Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.16.0