-
Notifications
You must be signed in to change notification settings - Fork 24.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ESQL: Inlinestats improvements #112266
Labels
:Analytics/ES|QL
AKA ESQL
>enhancement
Team:Analytics
Meta label for analytical engine team (ESQL/Aggs/Geo)
Comments
costin
added
:Analytics/ES|QL
AKA ESQL
and removed
needs:triage
Requires assignment of a team area label
labels
Aug 28, 2024
elasticsearchmachine
added
the
Team:Analytics
Meta label for analytical engine team (ESQL/Aggs/Geo)
label
Aug 28, 2024
Pinging @elastic/es-analytical-engine (Team:Analytics) |
Merged
costin
added a commit
to costin/elasticsearch
that referenced
this issue
Aug 29, 2024
Use a nested Aggregate to simplify InlineStats definition and resolution Relates elastic#112266
costin
added a commit
to costin/elasticsearch
that referenced
this issue
Oct 16, 2024
Use a nested Aggregate to simplify InlineStats definition and resolution Relates elastic#112266
costin
added a commit
to costin/elasticsearch
that referenced
this issue
Oct 29, 2024
Previously the Join was modeled as a unary node, embedding the right side as a local relationship inside the node but not exposed as a child. This caused a lot the associated methods (like references, output and inputSet) to misbehave and the physical plan rules to pick incorrect information, such as trying to extract the local relationship fields from the underlying source - the fix was to the local relationship fields as ReferenceAttribute (which of course had its own set of issues). Essentially Join was acting both as a source and as a streaming operator. This PR looks to address this partially by: - refactoring Join into a proper binary node with left and right branches which are used for its references and input/outputSet. - refactoring InlineStats to prefer composition and move the Aggregate on the join right branch. This reuses the Aggregate resolution out of the box; in the process remove the Stats interface. - update some of the planner rules that only worked with Unary nodes. - refactor Mapper into (coordinator) Mapper and LocalMapper. - remove Phased interface by moving its functionality inside the planner (no need to unpack the phased classes, the join already indicates the two branches needed). - massage the Phased execution inside EsqlSession - improve FieldExtractor to handle binary nodes - fix incorrect references in Lookup - generalize ProjectAwayColumns rule Relates elastic#112266 Not all inline and lookup tests are passing: - 2 lookup fields are failing due to name clashes (qualifiers should fix this) - 7 or so inline failures with a similar issue I've disabled the tests for now to have them around once we complete adding the functionality.
costin
added a commit
that referenced
this issue
Oct 31, 2024
First PR that introduces a Join as a first class citizen in the planner. Previously the Join was modeled as a unary node, embedding the right side as a local relationship inside the node but not exposed as a child. This caused a lot the associated methods (like references, output and inputSet) to misbehave and the physical plan rules to pick incorrect information, such as trying to extract the local relationship fields from the underlying source - the fix was to the local relationship fields as ReferenceAttribute (which of course had its own set of issues). Essentially Join was acting both as a source and as a streaming operator. This PR looks to partially address this by: - refactoring Join into a proper binary node with left and right branches which are used for its references and input/outputSet. - refactoring InlineStats to prefer composition and move the Aggregate on the join right branch. This reuses the Aggregate resolution out of the box; in the process remove the Stats interface. - update some of the planner rules that only worked with Unary nodes. - refactor Mapper into (coordinator) Mapper and LocalMapper. - remove Phased interface by moving its functionality inside the planner (no need to unpack the phased classes, the join already indicates the two branches needed). - massage the Phased execution inside EsqlSession - improve FieldExtractor to handle binary nodes - fix incorrect references in Lookup - generalize ProjectAwayColumns rule Relates #112266 Not all inline and lookup tests are passing: - 2 lookup fields are failing due to name clashes (qualifiers should fix this) - 7 or so inline failures with a similar issue I've disabled the tests for now to have them around once we complete adding the functionality.
elasticsearchmachine
pushed a commit
that referenced
this issue
Oct 31, 2024
* ESQL: Refactor Join inside the planner (#115813) First PR that introduces a Join as a first class citizen in the planner. Previously the Join was modeled as a unary node, embedding the right side as a local relationship inside the node but not exposed as a child. This caused a lot the associated methods (like references, output and inputSet) to misbehave and the physical plan rules to pick incorrect information, such as trying to extract the local relationship fields from the underlying source - the fix was to the local relationship fields as ReferenceAttribute (which of course had its own set of issues). Essentially Join was acting both as a source and as a streaming operator. This PR looks to partially address this by: - refactoring Join into a proper binary node with left and right branches which are used for its references and input/outputSet. - refactoring InlineStats to prefer composition and move the Aggregate on the join right branch. This reuses the Aggregate resolution out of the box; in the process remove the Stats interface. - update some of the planner rules that only worked with Unary nodes. - refactor Mapper into (coordinator) Mapper and LocalMapper. - remove Phased interface by moving its functionality inside the planner (no need to unpack the phased classes, the join already indicates the two branches needed). - massage the Phased execution inside EsqlSession - improve FieldExtractor to handle binary nodes - fix incorrect references in Lookup - generalize ProjectAwayColumns rule Relates #112266 Not all inline and lookup tests are passing: - 2 lookup fields are failing due to name clashes (qualifiers should fix this) - 7 or so inline failures with a similar issue I've disabled the tests for now to have them around once we complete adding the functionality. (cherry picked from commit 4ee98e8) * ES|QL: Mute test for #116003 (#116005) (cherry picked from commit 681f509) --------- Co-authored-by: Luigi Dell'Aquila <[email protected]>
jfreden
pushed a commit
to jfreden/elasticsearch
that referenced
this issue
Nov 4, 2024
First PR that introduces a Join as a first class citizen in the planner. Previously the Join was modeled as a unary node, embedding the right side as a local relationship inside the node but not exposed as a child. This caused a lot the associated methods (like references, output and inputSet) to misbehave and the physical plan rules to pick incorrect information, such as trying to extract the local relationship fields from the underlying source - the fix was to the local relationship fields as ReferenceAttribute (which of course had its own set of issues). Essentially Join was acting both as a source and as a streaming operator. This PR looks to partially address this by: - refactoring Join into a proper binary node with left and right branches which are used for its references and input/outputSet. - refactoring InlineStats to prefer composition and move the Aggregate on the join right branch. This reuses the Aggregate resolution out of the box; in the process remove the Stats interface. - update some of the planner rules that only worked with Unary nodes. - refactor Mapper into (coordinator) Mapper and LocalMapper. - remove Phased interface by moving its functionality inside the planner (no need to unpack the phased classes, the join already indicates the two branches needed). - massage the Phased execution inside EsqlSession - improve FieldExtractor to handle binary nodes - fix incorrect references in Lookup - generalize ProjectAwayColumns rule Relates elastic#112266 Not all inline and lookup tests are passing: - 2 lookup fields are failing due to name clashes (qualifiers should fix this) - 7 or so inline failures with a similar issue I've disabled the tests for now to have them around once we complete adding the functionality.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Labels
:Analytics/ES|QL
AKA ESQL
>enhancement
Team:Analytics
Meta label for analytical engine team (ESQL/Aggs/Geo)
Description
WIP
Meta ticket for improving the logical modeling and planning of inlinestats & co. Relates #110923
The text was updated successfully, but these errors were encountered: