ESQL: Extend STATS command to support aggregate expressions #104958

costin · 2024-01-31T00:36:15Z

Previously only aggregate functions (max/sum/etc..) were allowed inside
the stats command. This PR allows expressions involving one or multiple
aggregates to be used, such as:

 stats x = avg(salary % 3) + max(emp_no),
       y = min(emp_no / 3) + 10 - median(salary)
       by z = languages % 2

elasticsearchmachine · 2024-01-31T00:36:39Z

Hi @costin, I've created a changelog YAML for you.

elasticsearchmachine · 2024-01-31T00:36:41Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

costin · 2024-01-31T00:43:36Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

The core of this PR - the changes are larger because:

the ReplaceDuplicateAggWithEval has been removed and incorporated into the new rule.

the behavior of CombineProjections has been fixed when dealing with a Project/Aggregate, simplifying the clean-up.

costin · 2024-01-31T00:45:02Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

-            new ReplaceDuplicateAggWithEval(),
-            // pushing down limits again, because ReplaceDuplicateAggWithEval could create new Project nodes that can still be optimized
-            new PushDownAndCombineLimits(),
-            new ReplaceLimitAndSortAsTopN()


Removed ReplaceDuplicateAggWithEval along with the second round of limit push-down that followed it.

costin · 2024-01-31T00:47:31Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

+            // first extract nested aggs top-level - this simplifies the rest of the rules
+            new ReplaceStatsAggExpressionWithEval(),
+            // second extract nested aggs inside of them
+            new ReplaceStatsNestedExpressionWithEval(),
+            // lastly replace surrogate functions


The new rule breaks down expressions over aggs into an eval so the underlying stats only works on top-level aggregations.
While at it, it also handles duplicates to avoid repeated computation.
This keeps the following rule simple since it guarantees that an Aggregate will only contain AggregateFunctions, not expressions over them.
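To make the split concrete, here is a minimal sketch of the idea on a toy expression tree. It is not the optimizer code: every type below (`Expr`, `Agg`, `Ref`, `Lit`, `BinOp`, `Split`) and the `$$fn$N` temporary naming are assumptions made up for illustration. It only shows the shape of the rewrite: aggregates are pulled out (once each) into the stats, and everything else becomes an eval over references to them.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical, simplified model of the rewrite; none of these types are Elasticsearch classes.
final class StatsSplitSketch {
    interface Expr {}
    record Agg(String fn, String arg) implements Expr {}               // e.g. min(x)
    record Ref(String name) implements Expr {}                         // reference to a stats output column
    record Lit(int value) implements Expr {}                           // integer literal
    record BinOp(String op, Expr left, Expr right) implements Expr {}  // e.g. left + right

    // aggs: what remains in the stats; evals: expressions computed after the stats
    record Split(Map<String, Agg> aggs, Map<String, Expr> evals) {}

    static Split split(Map<String, Expr> stats) {
        Map<Agg, String> seen = new LinkedHashMap<>();   // aggregate -> column that already computes it
        Map<String, Agg> aggs = new LinkedHashMap<>();
        Map<String, Expr> evals = new LinkedHashMap<>();
        int[] counter = { 0 };                           // suffix for temporary names
        for (Map.Entry<String, Expr> entry : stats.entrySet()) {
            Expr e = entry.getValue();
            if (e instanceof Agg agg && seen.containsKey(agg) == false) {
                // first occurrence of a top-level aggregate: it stays in the stats under its own name
                seen.put(agg, entry.getKey());
                aggs.put(entry.getKey(), agg);
            } else {
                // duplicate aggregate or expression over aggregates: becomes an eval
                evals.put(entry.getKey(), rewrite(e, seen, aggs, counter));
            }
        }
        return new Split(aggs, evals);
    }

    // Replace every aggregate inside the expression with a reference, registering new aggregates as needed.
    private static Expr rewrite(Expr e, Map<Agg, String> seen, Map<String, Agg> aggs, int[] counter) {
        if (e instanceof Agg agg) {
            String name = seen.get(agg);
            if (name == null) {
                name = "$$" + agg.fn() + "$" + counter[0]++;  // made-up temporary name
                seen.put(agg, name);
                aggs.put(name, agg);
            }
            return new Ref(name);
        }
        if (e instanceof BinOp b) {
            return new BinOp(b.op(), rewrite(b.left(), seen, aggs, counter), rewrite(b.right(), seen, aggs, counter));
        }
        return e; // literals and references are left untouched
    }
}
```

With the input `a = min(x), b = min(x), s = sum(y) + min(x)`, the sketch keeps `a` and one temporary `sum(y)` column in the stats, and turns `b` and `s` into evals that reference them. The sketch does not guard against a user column that happens to share a temporary name.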

costin · 2024-01-31T00:47:54Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

@@ -239,15 +239,18 @@ protected LogicalPlan rule(Aggregate aggregate) {
                    // project away transient fields and re-enforce the original order using references (not copies) to the original aggs
                    // this works since the replaced aliases have their nameId copied to avoid having to update all references (which has
                    // a cascading effect)
-                    plan = new EsqlProject(source, plan, Expressions.asAttributes(aggs));
+                    plan = new Project(source, plan, Expressions.asAttributes(aggs));


Small tweak - no need to return EsqlProject instead of Project since the tree is already resolved.

costin · 2024-01-31T00:48:33Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

                }
            }

            return plan;
        }

-        static String temporaryName(NamedExpression agg, AggregateFunction af) {
-            return "__" + agg.name() + "_" + af.functionName() + "@" + Integer.toHexString(af.hashCode());
+        static String temporaryName(Expression expression, AggregateFunction af) {


I've updated the temporary name to make it a bit less confusing/repetitive.

costin · 2024-01-31T00:49:22Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

-            if (lit.value() == null) {
+            Object value = lit.value();
+
+            if (value == null) {
                return lit;
            }
-            if (lit.value() instanceof String s) {
+            if (value instanceof String s) {
                return Literal.of(lit, new BytesRef(s));
            }
-            if (lit.value() instanceof List<?> l) {
+            if (value instanceof List<?> l) {
                if (l.isEmpty() || false == l.get(0) instanceof String) {
                    return lit;
                }
-                return Literal.of(lit, l.stream().map(v -> new BytesRef((String) v)).toList());
+                List<BytesRef> byteRefs = new ArrayList<>(l.size());
+                for (Object v : l) {
+                    byteRefs.add(new BytesRef(v.toString()));
+                }
+                return Literal.of(lit, byteRefs);
            }
            return lit;
        }


Unrelated change - couldn't help but correct it: it saves the repeated method invocation and replaces the noisy map with a good ol' reliable iteration.

costin · 2024-01-31T00:53:27Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

-                    return p.withProjections(combineProjections(project.projections(), p.projections()));
-                } else if (child instanceof Aggregate a) {
+                    project = p.withProjections(combineProjections(project.projections(), p.projections()));
+                    child = project.child();
+                    plan = project;
+                    // don't return the plan since the grandchild (now child) might be an aggregate that could not be folded on the way up
+                    // e.g. stats c = count(x) | project c, c as x | project x
+                    // try to apply the rule again opportunistically as another node might be pushed in (a limit might be pushed in)
+                }
+                // check if the projection eliminates certain aggregates
+                // but be mindful of aliases to existing aggregates that we don't want to duplicate to avoid redundant work
+                if (child instanceof Aggregate a) {
                    var aggs = a.aggregates();
-                    var newAggs = combineProjections(project.projections(), aggs);
-                    var newGroups = replacePrunedAliasesUsedInGroupBy(a.groupings(), aggs, newAggs);
-                    return new Aggregate(a.source(), a.child(), newGroups, newAggs);
+                    var tuple = projectAggregations(project.projections(), aggs);
+                    // project can be fully removed
+                    if (tuple.v1().isEmpty()) {
+                        var newAggs = tuple.v2();
+                        var newGroups = replacePrunedAliasesUsedInGroupBy(a.groupings(), aggs, newAggs);
+                        plan = new Aggregate(a.source(), a.child(), newGroups, newAggs);
+                    }


The gist of this change takes care of the scenario created by ReplaceStatsAggExpressionWithEval:

stats x = sum(), y = count() | project x, x as a, y

The combine rule previously would combine the project into the stats, which would duplicate the sum:

stats x = sum(), a = sum(), y = count()

It removes the project (which is cheap) but duplicates the sum (which is expensive and the reason we didn't want to duplicate it in the first place).

The rule thus tracks whether there's any new alias; it still keeps removing unused aggregations and, in the case of a basic aliasing project, removes the project.
So the following

stats x = sum(), y = count() | project x as a

becomes

stats a = sum()
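The decision described above can be sketched roughly as follows. This is an illustrative model only, not the `projectAggregations` method from the PR: `Proj`, `Folded` and `fold` are hypothetical names, aggregates are represented as plain strings, and the sketch assumes every projection refers to an aggregate (groupings are ignored). As in the diff, an empty list of remaining projections means the project can be dropped.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical model: aggregates are name -> function text, projections are (source, as) pairs.
final class ProjectOverStatsSketch {
    record Proj(String source, String as) {}
    // remaining: projections that must stay on top of the stats (empty = project removed)
    record Folded(List<Proj> remaining, Map<String, String> aggs) {}

    static Folded fold(List<Proj> projections, Map<String, String> aggs) {
        // count how many times each aggregate is referenced by the projection
        Map<String, Integer> uses = new HashMap<>();
        for (Proj p : projections) {
            if (aggs.containsKey(p.source()) == false) {
                throw new IllegalArgumentException("unknown aggregate: " + p.source());
            }
            uses.merge(p.source(), 1, Integer::sum);
        }
        boolean reused = uses.values().stream().anyMatch(n -> n > 1);

        Map<String, String> kept = new LinkedHashMap<>();
        if (reused) {
            // folding would compute the same aggregate twice: keep the project, only prune unused aggregates
            for (Map.Entry<String, String> e : aggs.entrySet()) {
                if (uses.containsKey(e.getKey())) {
                    kept.put(e.getKey(), e.getValue());
                }
            }
            return new Folded(projections, kept);
        }
        // every aggregate is used at most once: rename it in place and drop the project
        for (Proj p : projections) {
            kept.put(p.as(), aggs.get(p.source()));
        }
        return new Folded(List.of(), kept);
    }
}
```

For `stats x = sum(), y = count() | project x, x as a, y` the sketch keeps the project and both aggregates; for `stats x = sum(), y = count() | project x as a` it returns the single aggregate `a = sum()` with no remaining projection.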

costin · 2024-01-31T00:53:52Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

-            if (e instanceof Alias a) {
-                return new Alias(a.source(), a.name(), a.qualifier(), trimAliases(a.child()), a.id());
-            }
-            return trimAliases(e);
+            return e instanceof Alias a ? a.replaceChild(trimAliases(a.child())) : trimAliases(e);


Ternary operator ❤️

costin · 2024-01-31T00:54:30Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

-                                Alias newAlias = new Alias(k.source(), temporaryName(agg, af), null, k, null, true);
+                                Alias newAlias = new Alias(k.source(), temporaryName(k, af), null, k, null, true);


Updated the name-generation strategy to be a bit more meaningful.

costin · 2024-01-31T00:55:50Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

+     * becomes
+     * stats a = min(x), c = count(*) by g | eval b = a, d = c | keep a, b, c, d, g
+     */
+    static class ReplaceStatsAggExpressionWithEval extends OptimizerRules.OptimizerRule<Aggregate> {


The core of this PR - breaks down the expression over aggs and adds an eval lazily only for the fields that have an expression over aggregate functions.

We could improve this a bit further for a referencing case: just like we now support | eval x = field, y = x + 1, we could now (but don't yet) support something like: | stats x = max(field), y = x + min(field) -- this now fails because column x isn't known.

That would be a nice little feature - raised #105102 as a follow-up.

costin · 2024-01-31T00:56:33Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

-     * eval b = a, d = c
-     * keep a, b, c, d, g
-     */
-    static class ReplaceDuplicateAggWithEval extends OptimizerRules.OptimizerRule<Aggregate> {


Handled by ReplaceStatsAggExpressionWithEval


costin · 2024-01-31T01:05:01Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/stats.csv-spec

+
+nestedAggsNoGrouping
+FROM employees
+| STATS x = AVG(salary) /2 + MAX(salary), a = AVG(salary), m = MAX(salary)


/cc @leemthompo

@costin I understand that we tag some of these tests to be included as examples in the docs. Just wondering what the workflow was with Abdon to add these tags? Was it just simply a ping to alert the writer that we want this example in the docs? :)

(Might need to be slightly more explicit about the workflow in general, just because I haven't got that rhythm yet)

The tests above are meant for internal consumption, hence the ping - better to create other tests that fit the general dataset and the rest of the examples in the docs, and add them in as a separate PR after this one gets merged.
See the previous PRs authored by Abdon.

elasticsearchmachine · 2024-01-31T17:35:50Z

Hi @costin, I've created a changelog YAML for you.

astefan

LGTM
Left only minor comments. The one about an additional test is perhaps the more important one.

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Verifier.java

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

alex-spies

Gave this a first round, focusing on tests this time. I'll give this another go tomorrow.

Two observations:

I think there are bugs in the following cases: stats max(l) by l=languages (verification exception) and, more severely, stats max(languages) + languages by l = languages (NPE: Cannot invoke "org.elasticsearch.xpack.esql.planner.Layout$ChannelAndType.channel()" because the return value of "org.elasticsearch.xpack.esql.planner.Layout.get(org.elasticsearch.xpack.ql.expression.NameId)" is null); the latter works fine without the alias.
This allows shenanigans like stats max(languages) + languages by languages (using a grouping in the expression), but not stats languages + 1 by languages; although that may be something for a follow-up, if we want to allow this.

Other than that I have mostly minor remarks; the tests do not fully assert the (complex) expressions that are being constructed, maybe we should be stricter there.

x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/AnalyzerTests.java

alex-spies · 2024-02-01T16:06:21Z

x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/AnalyzerTests.java

+            |stats x by 1
+            """));
+
+        assertThat(e.getMessage(), containsString("aggregate function"));


Suuuuper nit:
Shouldn't the error message be "expected an aggregate function or group" here as well? "[x] is not an aggregate function" technically implies that this should be replaced by an agg function.

x-pack/plugin/esql/qa/testFixtures/src/main/resources/stats.csv-spec

alex-spies · 2024-02-01T16:30:52Z

...gin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java

+     * \_Eval[[____x_AVG@9efc3cf3_SUM@daf9f221{r}#18 / ____x_AVG@9efc3cf3_COUNT@53cd08ed{r}#19 AS __x_AVG@9efc3cf3, __x_AVG@
+     * 9efc3cf3{r}#16 / 2[INTEGER] + __x_MAX@475d0e4d{r}#17 AS x]]
+     *   \_Limit[500[INTEGER]]
+     *     \_Aggregate[[],[SUM(salary{f}#11) AS ____x_AVG@9efc3cf3_SUM@daf9f221, COUNT(salary{f}#11) AS ____x_AVG@9efc3cf3_COUNT@53cd0


Nit, low prio: The generated attribute names make optimized plans pretty hard to read/grasp, due to the leading underscores and @9efc3cf3 thingies. Maybe we could streamline the names?

E.g.

\_Aggregate[[],[SUM(salary{f}#11) AS x_AVG_SUM, COUNT(salary{f}#11) AS x_AVG_COUNT, MAX(salary{f}#11) AS x_MAX

We could still append a number in case of conflict.

I've revisited the naming strategy to rely on a counter across the entire rule - hopefully this avoids clashes and improves readability.

alex-spies · 2024-02-01T17:00:23Z

...gin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java

+        // sum/count to compute avg
+        var div = as(fields.get(0).child(), Div.class);
+        // avg + max
+        var add = as(fields.get(1).child(), Add.class);


We're technically not asserting the whole expression here - maybe better to just assert the expression string?

Applies to all added tests in this file.

alex-spies · 2024-02-01T17:08:42Z

...gin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java

+     *     PERCENTILE(salary{f}#1928,50[INTEGER]) AS __y_MEDIAN@705fccec]]
+     *       \_Eval[[languages{f}#1926 % 2[INTEGER] AS z,
+     *               salary{f}#1928 % 3[INTEGER] AS ____x_AVG@e03a7a5c_AVG@e03a7a5c,
+     *               emp_no{f}#1923 / 3[INTEGER] AS ____y_MIN@80cee21c_MIN@80cee21c]]


nit: the attribute names are constructed from the aggs where they will be used, but not from the expression where they will be used; this makes it a bit hard to read this bottom-up (and top-down is also hard). I think it will also lead to tricky attribute names in cases like y = min(emp_no /3) - min(emp_no + 2), as both attributes will be called ____y_MIN@...._MIN@.....
Maybe this would be more consistent?

Suggested change

-     *               emp_no{f}#1923 / 3[INTEGER] AS ____y_MIN@80cee21c_MIN@80cee21c]]
+     *               emp_no{f}#1923 / 3[INTEGER] AS y_SUB1_MIN]]

(resp. ...SUB0... for the left hand side)

fang-xing-esql

LGTM in general; this review was mostly educational for me. I left two minor suggestions on the test cases, and a question on the CombineProjections rule.

fang-xing-esql · 2024-02-01T23:44:24Z

x-pack/plugin/esql/src/test/java/org/elasticsearch/xpack/esql/analysis/AnalyzerTests.java

+            |stats 1
+            """));
+
+        assertThat(e.getMessage(), containsString("expected an aggregate function or group"));


The asserts with the same exception messages can be factored out.

fang-xing-esql · 2024-02-02T06:28:06Z

...gin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java

+                    e = min(salary),
+                    f = max(salary),
+                    g = max(salary)
+                    by w = languages % 2


Perhaps one of these test cases can be modified to use different groupings, for example, by auto_bucket(emp_no, 10, 1, 10000), to increase the coverage a bit more.

fang-xing-esql · 2024-02-02T06:49:22Z

...gin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java

+     *       \_Eval[[languages{f}#37 % 2[INTEGER] AS w]]
+     *         \_EsRelation[test][_meta_field{f}#40, emp_no{f}#34, first_name{f}#35, ..]
+     */
+    public void testStatsExpOverAggsWithScalarAndDuplicateAggs() {


Do we want to support removing duplicated expressions over aggregations? Like below:

| stats x = avg(salary) /2 + max(salary) , y = avg(salary) /2 + max(salary)

It seems like we recalculate the expression part for x and y, while the duplicated aggregations (avg and max) are not recalculated. Detecting equivalent expressions over aggregations could be more complicated than detecting equivalent aggregations. Perhaps it is because CombineProjections does not check the pattern created by ReplaceStatsAggExpressionWithEval: CombineProjections checks project over project, whereas this case has the pattern of project over eval over project over eval. Just noting it here; not sure if it's worth supporting.

I think this would be covered as part of our plan to eliminate common (sub-) expressions: #103301

alex-spies

Gave it another round, focusing on the optimizer code this time. I didn't find anything not already covered by others' remarks. Looks good, very neat feature!

alex-spies · 2024-02-02T17:13:24Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

+     * Replace nested expressions over aggregates with synthetic eval post the aggregation
+     * stats a = sum(a) + min(b) by x
+     * becomes
+     * stats a1 = sum(a), a2 = min(b) by x | eval a = a1 + a2 | keep a, x
+     *
+     * Since the logic is very similar, this rule also handles duplicate aggregate functions to avoid duplicate compute
+     * stats a = min(x), b = min(x), c = count(*), d = count() by g
+     * becomes
+     * stats a = min(x), c = count(*) by g | eval b = a, d = c | keep a, b, c, d, g


++ to the examples in the javadoc, very useful.

bpintea

Left a more relevant note in a comment and some smaller nits.
Otherwise, while not fundamental, this PR together with #104387 make ESQL writing really "liberating" -- very nice.

x-pack/plugin/esql/qa/testFixtures/src/main/resources/stats.csv-spec

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/analysis/Verifier.java

bpintea · 2024-02-02T21:41:17Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

+            String name = expression instanceof NamedExpression ne
+                ? ne.name()
+                : expression.nodeName() + "@" + Integer.toHexString(expression.hashCode());
+            return "__" + name + "_" + af.functionName() + "@" + Integer.toHexString(af.hashCode());


nit: not sure if this makes it easier to read, but an alternative to unify the way the nodes are named:

Suggested change

-            String name = expression instanceof NamedExpression ne
-                ? ne.name()
-                : expression.nodeName() + "@" + Integer.toHexString(expression.hashCode());
-            return "__" + name + "_" + af.functionName() + "@" + Integer.toHexString(af.hashCode());
+            Function<Expression, String> nf = e -> (e == af ? af.functionName() : e.nodeName()) + "@" + Integer.toHexString(e.hashCode());
+            String name = expression instanceof NamedExpression ne ? ne.name() : nf.apply(expression);
+            return "__" + name + "_" + nf.apply(af);

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

costin · 2024-02-06T03:03:10Z

I think there are bugs in the following cases: stats max(l) by l=languages (verification exception) and, more severely, stats max(languages) + languages by l = languages (NPE: Cannot invoke "org.elasticsearch.xpack.esql.planner.Layout$ChannelAndType.channel()" because the return value of "org.elasticsearch.xpack.esql.planner.Layout.get(org.elasticsearch.xpack.ql.expression.NameId)" is null); the latter works fine without the alias.

This allows shenanigans like stats max(languages) + languages by languages (using a grouping in the expression), but not stats languages + 1 by languages; although that may be something for a follow-up, if we want to allow this.

There's a subtle bug when dealing with an aliased grouping which sometimes popped up because the validation allowed the grouping column in some queries.
I've raised #105172 as a follow-up and in the meantime improved the validator to not allow this scenario.

costin · 2024-02-06T03:06:01Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

@@ -205,7 +204,7 @@ protected LogicalPlan rule(Aggregate aggregate) {
                            var attr = aggFuncToAttr.get(af);
                            // the agg doesn't exist in the Aggregate, create an alias for it and save its attribute
                            if (attr == null) {
-                                var temporaryName = temporaryName(agg, af);
+                                var temporaryName = temporaryName(af, agg, counter[0]++);


Changed the strategy to indicate the inner and outer expression, as that can differ across rules - in some an aggregation is the inner expression while in others it is the outer one.

costin · 2024-02-06T03:06:27Z

...k/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizer.java

+        static String temporaryName(Expression inner, Expression outer, int suffix) {
+            String in = toString(inner);
+            String out = toString(outer);
+            return "$$" + in + "$" + out + "$" + suffix;
+        }


Opted for $$a$ instead of _ as that was used to replace spaces.
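For readers who want to play with the naming scheme, here is a small standalone sketch modelled on the `temporaryName(inner, outer, suffix)` snippet above. It works on plain strings rather than `Expression` objects, and the `label` helper is a hypothetical stand-in for the `toString(...)` call whose body is not shown in this thread; the only behaviour assumed for it is the space-to-underscore replacement mentioned in the comment. The names actually produced by the merged code may be formatted differently.

```java
// Hypothetical sketch of a counter-based temporary name; not the optimizer's implementation.
final class TemporaryNameSketch {
    // Assumed stand-in for toString(Expression): trims the text and replaces spaces with underscores.
    static String label(String expressionText) {
        return expressionText.trim().replace(' ', '_');
    }

    // Same concatenation shape as the snippet above: "$$" + inner + "$" + outer + "$" + suffix
    static String temporaryName(String inner, String outer, int suffix) {
        return "$$" + label(inner) + "$" + label(outer) + "$" + suffix;
    }
}
```

Because the suffix comes from a counter shared across the whole rule, two temporaries derived from the same inner and outer text still get distinct names, which is what makes the hash codes in the earlier naming scheme unnecessary.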

costin · 2024-02-06T03:07:07Z

...gin/esql/src/test/java/org/elasticsearch/xpack/esql/optimizer/LogicalPlanOptimizerTests.java

+     * Project[[a{r}#5, b{r}#9, $$max(salary)_+_3>$COUNT$2{r}#46 AS d, $$count(salary)_->$MIN$3{r}#47 AS e, $$avg(salary)_+_m
+     * >$MAX$1{r}#45 AS g]]
+     * \_Eval[[$$$$avg(salary)_+_m>$AVG$0$SUM$0{r}#48 / $$max(salary)_+_3>$COUNT$2{r}#46 AS $$avg(salary)_+_m>$AVG$0, $$avg(
+     * salary)_+_m>$AVG$0{r}#44 + $$avg(salary)_+_m>$MAX$1{r}#45 AS a, $$avg(salary)_+_m>$MAX$1{r}#45 + 3[INTEGER] +
+     * 3.141592653589793[DOUBLE] + $$max(salary)_+_3>$COUNT$2{r}#46 AS b]]
+     *   \_Limit[500[INTEGER]]
+     *     \_Aggregate[[w{r}#28],[SUM(salary{f}#39) AS $$$$avg(salary)_+_m>$AVG$0$SUM$0, MAX(salary{f}#39) AS $$avg(salary)_+_m>$MAX$1
+     * , COUNT(salary{f}#39) AS $$max(salary)_+_3>$COUNT$2, MIN(salary{f}#39) AS $$count(salary)_->$MIN$3]]
+     *       \_Eval[[languages{f}#37 % 2[INTEGER] AS w]]
+     *         \_EsRelation[test][_meta_field{f}#40, emp_no{f}#34, first_name{f}#35, ..]
+     */


An example of the new naming strategy.

## Summary Sync with elastic/elasticsearch#104958 for support of builtin fn in STATS * validation ✅ * autocomplete ✅ * also fixed `STATS BY <field>` syntax ![new_stats](https://github.com/elastic/kibana/assets/924948/735f9842-b1d3-4aa0-9d51-4b2f9b136ed3) Sync with elastic/elasticsearch#104913 for new `log` function * validation ✅ - also warning for negative values * autocomplete ✅ ![add_log](https://github.com/elastic/kibana/assets/924948/146b945d-a23b-45ec-9df2-2d2b291e883b) Sync with elastic/elasticsearch#105064 for removal of `PROJECT` command * validation ✅ (both new and legacy syntax supported) * autocomplete ✅ (will only suggest new syntax) ![remove_project](https://github.com/elastic/kibana/assets/924948/b6f40afe-a26d-4917-b7a1-d8ae97c5368b) Sync with elastic/elasticsearch#105221 for removal of mandatory brackets for `METADATA` command option * validation ✅ (added warning deprecation message when using brackets) * autocomplete ✅ ![fix_metadata](https://github.com/elastic/kibana/assets/924948/c65db176-dd94-45f3-9524-45453e62f51a) Sync with elastic/elasticsearch#105224 for change of syntax for ENRICH ccq mode * validation ✅ * autocomplete ✅ (not directly promoted, the user has to type `_` to trigger it) * hover ✅ * code actions ✅ ![fix_ccq_enrich](https://github.com/elastic/kibana/assets/924948/0900edd9-a0a7-4ac8-bc12-e39a72359984) ![fix_ccq_enrich_2](https://github.com/elastic/kibana/assets/924948/74b0908f-d385-4723-b3d4-c09108f47a73) Do not merge until those 5 get merged. 
Additional things in this PR: * Added more tests for `callbacks` not passed scenario * covered more cases like those with `dissect` * Added more tests for signature params number (calling a function with an extra arg should return an error) * Cleaned up some more unused code * Improved messages on too many arguments for functions ### Checklist - [x] [Unit or functional tests](https://www.elastic.co/guide/en/kibana/master/development-tests.html) were updated or added to match the most common scenarios

Fix #117770 Fix #117784 #117699 made changes to how we plan aggregations which were supposed to only trigger when a query contained a `CATEGORIZE`, but accidentally changed a code path that seems to only be required for interoperability with pre-8.13 nodes. Because of this, we didn't notice failing tests until the periodic bwc tests ran. The code this PR fixes addresses situations where `Aggregate` plan nodes contained _aliases inside the aggregates_. On `main` and `8.x`, this is effectively an illegal state: since #104958, aliases in the aggregates become `Eval` nodes before and after the `Aggregate` node. But here, on 8.x, we'll just fix this code path so that it behaves exactly as before #117699. If this passes the full-bwc test, I plan to forward-port this by removing the obsolete code path on `main`.

costin added >enhancement :Analytics/ES|QL AKA ESQL ES|QL-ui Impacts ES|QL UI v8.13.0 labels Jan 31, 2024

costin requested review from astefan, bpintea, alex-spies and fang-xing-esql January 31, 2024 00:36

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Jan 31, 2024

costin force-pushed the esql/expressions-over-aggs branch 2 times, most recently from 2d8be06 to 9beaca9 Compare January 31, 2024 00:39


costin force-pushed the esql/expressions-over-aggs branch from 9beaca9 to 47389a0 Compare January 31, 2024 00:58


Disable breaking bwc tests

3f39259

costin and others added 2 commits January 31, 2024 19:35

Update docs/changelog/104958.yaml

ad33281

Merge branch 'main' into esql/nested-aggs

47246d4

astefan approved these changes Feb 1, 2024


alex-spies reviewed Feb 1, 2024


fang-xing-esql reviewed Feb 2, 2024


alex-spies reviewed Feb 2, 2024


bpintea approved these changes Feb 2, 2024


costin mentioned this pull request Feb 4, 2024

ESQL: Add relative resolution for stats aggregations #105102

Open

costin added 2 commits February 4, 2024 19:51

wip - improve verification on usage of grouping key inside stats agg

4acfa4a

Improve verifier to not allow scalar functions over grouping

feeedfd

costin added 2 commits February 5, 2024 17:28

Merge branch 'main' into esql/nested-aggs

afc410b

Add missing license

0ada1e3

costin mentioned this pull request Feb 6, 2024

ESQL: Allow scalar expressions over groupings in STATS #105172

Open


costin merged commit ac09d75 into elastic:main Feb 6, 2024
15 checks passed

costin deleted the esql/expressions-over-aggs branch February 6, 2024 03:08

dej611 mentioned this pull request Feb 6, 2024

[ES|QL] New sync with ES changes elastic/kibana#176283

Merged

1 task

alex-spies mentioned this pull request May 22, 2024

[CI] MixedClusterEsqlSpecIT test {stats.SumOfDouble SYNC} failing #108859

Closed

bpintea mentioned this pull request Nov 11, 2024

ESQL: extract common filter from aggs #115678

Merged

This was referenced Dec 2, 2024

ESQL: Fix layout when aggregating with aliases #117832

Merged

ESQL: Remove obsolete code path for aggs #117833

Closed
