Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ESQL: Fix filtered grouping on ords #115312

Merged
merged 8 commits into from
Oct 23, 2024

Conversation

nik9000
Copy link
Member

@nik9000 nik9000 commented Oct 22, 2024

This fixes filtered aggs when they are grouped on a field with ordinals. This looks like:

| STATS max = max(salary) WHERE salary > 0 BY job_positions

when the job_positions field is a keyword field with doc values. In that case we use a faster group-by-segment-ordinals algorithm that needs to be able to merge the results of aggregators from multiple segments. This previously failed with a ClassCastException because of a mistake.

Also! the group-by-segment-ordinals algorithm wasn't properly releasing the closure used to add inputs, causing a breaker size leak. This wasn't really leaking memory, but leaking tracking of memory.

Closes #114897

This fixes filtered aggs when they are grouped on a field with ordinals.
This looks like:
```
| STATS max = max(salary) WHERE salary > 0 BY job_positions
```
when the `job_positions` field is a keyword field with doc values. In
that case we use a faster group-by-segment-ordinals algorithm that needs
to be able to merge the results of aggregators from multiple segments.
This previously failed with a `ClassCastException` because of a mistake.

Also! the group-by-segment-ordinals algorithm wasn't properly releasing
the closure used to add inputs, causing a breaker size leak. This wasn't
really leaking memory, but leaking *tracking* of memory.

Closes elastic#114897
@nik9000 nik9000 added >bug auto-backport Automatically create backport pull requests when merged :Analytics/ES|QL AKA ESQL v9.0.0 v8.17.0 labels Oct 22, 2024
@nik9000 nik9000 requested review from bpintea and ivancea October 22, 2024 12:37
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Oct 22, 2024
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine
Copy link
Collaborator

Hi @nik9000, I've created a changelog YAML for you.

Copy link
Contributor

@bpintea bpintea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lgtm.

Copy link
Contributor

@ivancea ivancea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@nik9000 nik9000 enabled auto-merge (squash) October 22, 2024 17:27
@nik9000
Copy link
Member Author

nik9000 commented Oct 22, 2024

run elasticsearch-ci/rest-compatibility

@nik9000 nik9000 merged commit 60678a1 into elastic:main Oct 23, 2024
16 checks passed
nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Oct 23, 2024
This fixes filtered aggs when they are grouped on a field with ordinals.
This looks like:
```
| STATS max = max(salary) WHERE salary > 0 BY job_positions
```
when the `job_positions` field is a keyword field with doc values. In
that case we use a faster group-by-segment-ordinals algorithm that needs
to be able to merge the results of aggregators from multiple segments.
This previously failed with a `ClassCastException` because of a mistake.

Also! the group-by-segment-ordinals algorithm wasn't properly releasing
the closure used to add inputs, causing a breaker size leak. This wasn't
really leaking memory, but leaking *tracking* of memory.

Closes elastic#114897
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
8.16 Commit could not be cherrypicked due to conflicts
8.x

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 115312

@nik9000
Copy link
Member Author

nik9000 commented Oct 23, 2024

I'm backporting to 8.16 manually.

nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Oct 23, 2024
This fixes filtered aggs when they are grouped on a field with ordinals.
This looks like:
```
| STATS max = max(salary) WHERE salary > 0 BY job_positions
```
when the `job_positions` field is a keyword field with doc values. In
that case we use a faster group-by-segment-ordinals algorithm that needs
to be able to merge the results of aggregators from multiple segments.
This previously failed with a `ClassCastException` because of a mistake.

Also! the group-by-segment-ordinals algorithm wasn't properly releasing
the closure used to add inputs, causing a breaker size leak. This wasn't
really leaking memory, but leaking *tracking* of memory.

Closes elastic#114897
@nik9000
Copy link
Member Author

nik9000 commented Oct 23, 2024

Backport: #115464

elasticsearchmachine pushed a commit that referenced this pull request Oct 23, 2024
This fixes filtered aggs when they are grouped on a field with ordinals.
This looks like:
```
| STATS max = max(salary) WHERE salary > 0 BY job_positions
```
when the `job_positions` field is a keyword field with doc values. In
that case we use a faster group-by-segment-ordinals algorithm that needs
to be able to merge the results of aggregators from multiple segments.
This previously failed with a `ClassCastException` because of a mistake.

Also! the group-by-segment-ordinals algorithm wasn't properly releasing
the closure used to add inputs, causing a breaker size leak. This wasn't
really leaking memory, but leaking *tracking* of memory.

Closes #114897
elasticsearchmachine pushed a commit that referenced this pull request Oct 23, 2024
This fixes filtered aggs when they are grouped on a field with ordinals.
This looks like:
```
| STATS max = max(salary) WHERE salary > 0 BY job_positions
```
when the `job_positions` field is a keyword field with doc values. In
that case we use a faster group-by-segment-ordinals algorithm that needs
to be able to merge the results of aggregators from multiple segments.
This previously failed with a `ClassCastException` because of a mistake.

Also! the group-by-segment-ordinals algorithm wasn't properly releasing
the closure used to add inputs, causing a breaker size leak. This wasn't
really leaking memory, but leaking *tracking* of memory.

Closes #114897
davidkyle pushed a commit to davidkyle/elasticsearch that referenced this pull request Oct 24, 2024
This fixes filtered aggs when they are grouped on a field with ordinals.
This looks like:
```
| STATS max = max(salary) WHERE salary > 0 BY job_positions
```
when the `job_positions` field is a keyword field with doc values. In
that case we use a faster group-by-segment-ordinals algorithm that needs
to be able to merge the results of aggregators from multiple segments.
This previously failed with a `ClassCastException` because of a mistake.

Also! the group-by-segment-ordinals algorithm wasn't properly releasing
the closure used to add inputs, causing a breaker size leak. This wasn't
really leaking memory, but leaking *tracking* of memory.

Closes elastic#114897
georgewallace pushed a commit to georgewallace/elasticsearch that referenced this pull request Oct 25, 2024
This fixes filtered aggs when they are grouped on a field with ordinals.
This looks like:
```
| STATS max = max(salary) WHERE salary > 0 BY job_positions
```
when the `job_positions` field is a keyword field with doc values. In
that case we use a faster group-by-segment-ordinals algorithm that needs
to be able to merge the results of aggregators from multiple segments.
This previously failed with a `ClassCastException` because of a mistake.

Also! the group-by-segment-ordinals algorithm wasn't properly releasing
the closure used to add inputs, causing a breaker size leak. This wasn't
really leaking memory, but leaking *tracking* of memory.

Closes elastic#114897
jfreden pushed a commit to jfreden/elasticsearch that referenced this pull request Nov 4, 2024
This fixes filtered aggs when they are grouped on a field with ordinals.
This looks like:
```
| STATS max = max(salary) WHERE salary > 0 BY job_positions
```
when the `job_positions` field is a keyword field with doc values. In
that case we use a faster group-by-segment-ordinals algorithm that needs
to be able to merge the results of aggregators from multiple segments.
This previously failed with a `ClassCastException` because of a mistake.

Also! the group-by-segment-ordinals algorithm wasn't properly releasing
the closure used to add inputs, causing a breaker size leak. This wasn't
really leaking memory, but leaking *tracking* of memory.

Closes elastic#114897
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged backport pending >bug Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.16.1 v8.17.0 v9.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ESQL: IAE with stats filtering
4 participants