ESQL: INLINESTATS #109583

Merged
merged 63 commits into from
Jul 24, 2024
Changes from 49 commits
e2d9da7
ESQL: INLINESTATS
nik9000 Jun 11, 2024
14017d6
Explain
nik9000 Jun 11, 2024
355905a
More nocommit
nik9000 Jun 11, 2024
a634f90
More nocommit
nik9000 Jun 11, 2024
8f04e1b
Spotless
nik9000 Jun 11, 2024
3a3939b
Merge branch 'main' into inlinestats
nik9000 Jul 5, 2024
5483426
Works again
nik9000 Jul 5, 2024
7695f67
Closer
nik9000 Jul 6, 2024
64f858b
Merge branch 'main' into inlinestats
nik9000 Jul 8, 2024
141a63f
More test
nik9000 Jul 8, 2024
d0dc736
Share
nik9000 Jul 9, 2024
6947e1c
Merge branch 'main' into inlinestats
nik9000 Jul 9, 2024
cc44421
More
nik9000 Jul 9, 2024
8466a4e
ungrouped
nik9000 Jul 9, 2024
cc20b73
WIt P
nik9000 Jul 10, 2024
2f9b8af
Merge branch 'main' into inlinestats
nik9000 Jul 10, 2024
c4f1d87
Remove
nik9000 Jul 10, 2024
1408824
Remove unused
nik9000 Jul 10, 2024
2a38bc8
Merge branch 'main' into inlinestats
nik9000 Jul 10, 2024
b786c1c
More test
nik9000 Jul 10, 2024
547e5a5
Merge
nik9000 Jul 10, 2024
0eb2958
More nocommit
nik9000 Jul 10, 2024
32a03b2
explain
nik9000 Jul 11, 2024
966c860
Merge branch 'main' into inlinestats
nik9000 Jul 12, 2024
83252cf
WIP
nik9000 Jul 13, 2024
40d3fe9
Passes now?
nik9000 Jul 15, 2024
50fc6e2
one more exampl
nik9000 Jul 16, 2024
d33a445
Merge branch 'main' into inlinestats
nik9000 Jul 16, 2024
a868e7e
Javadoc
nik9000 Jul 16, 2024
0bbdef4
MOAR JAVADOC
nik9000 Jul 16, 2024
e19c769
Update docs/changelog/109583.yaml
nik9000 Jul 16, 2024
5d019f4
Changelog
nik9000 Jul 16, 2024
434bd9b
WIP
nik9000 Jul 16, 2024
0d5d0da
Merge branch 'main' into inlinestats
nik9000 Jul 16, 2024
e7eb532
Ready?
nik9000 Jul 17, 2024
908dfc9
Update docs
nik9000 Jul 17, 2024
2386fa8
Raname to line up with other stuff
nik9000 Jul 17, 2024
887b9ce
More
nik9000 Jul 17, 2024
2e028ce
Merge branch 'main' into inlinestats
nik9000 Jul 17, 2024
1dfe527
Merge branch 'main' into inlinestats
nik9000 Jul 18, 2024
ab350c4
Apply suggestions from code review
nik9000 Jul 18, 2024
2d90569
Merge remote-tracking branch 'nik9000/inlinestats' into inlinestats
nik9000 Jul 18, 2024
c937834
Update docs
nik9000 Jul 18, 2024
0a9332c
Updates
nik9000 Jul 18, 2024
c30230c
More explain and a couple renames
nik9000 Jul 18, 2024
9d12a29
Format
nik9000 Jul 19, 2024
6f690b7
Merge branch 'main' into inlinestats
nik9000 Jul 19, 2024
18d30f0
Some progress
nik9000 Jul 19, 2024
9ff0024
percentile
nik9000 Jul 19, 2024
c142d61
Better way?
nik9000 Jul 21, 2024
76998b7
Merge branch 'main' into inlinestats
nik9000 Jul 22, 2024
40b13df
techpreview
nik9000 Jul 22, 2024
48253a4
Merge branch 'main' into inlinestats
nik9000 Jul 22, 2024
674d93d
WIP
nik9000 Jul 22, 2024
4489b27
Update
nik9000 Jul 22, 2024
8074a95
Merge branch 'main' into inlinestats
nik9000 Jul 23, 2024
10b1fcd
Link
nik9000 Jul 23, 2024
fdb43d8
Check
nik9000 Jul 23, 2024
cbb1e60
Feature flag it
nik9000 Jul 23, 2024
8fbe301
Merge branch 'main' into inlinestats
nik9000 Jul 23, 2024
09d226a
WI{
nik9000 Jul 24, 2024
0e454f7
Merge branch 'main' into inlinestats
nik9000 Jul 24, 2024
a6ec9be
more skips
nik9000 Jul 24, 2024
29 changes: 29 additions & 0 deletions docs/changelog/109583.yaml
@@ -0,0 +1,29 @@
pr: 109583
summary: "ESQL: INLINESTATS"
area: ES|QL
type: feature
issues:
 - 107589
highlight:
  title: "ESQL: INLINESTATS"
  body: |-
    This adds the `INLINESTATS` command to ESQL which performs a STATS and
    then enriches the results into the output stream. So, this query:

    [source,esql]
    ----
    FROM test
    | INLINESTATS m=MAX(a * b) BY b
    | WHERE m == a * b
    | SORT a DESC, b DESC
    | LIMIT 3
    ----

    Produces output like:

    | a | b | m |
    | --- | --- | ----- |
    | 99 | 999 | 98901 |
    | 99 | 998 | 98802 |
    | 99 | 997 | 98703 |
  notable: true
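The changelog query can be emulated in plain Java to make the "STATS, then enrich" behavior concrete. This is only a sketch under stated assumptions: the `Row` and `Enriched` records, the class name, and the sample data are hypothetical, and the two-pass approach illustrates the semantics described above rather than how ESQL executes the command.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class InlineStatsSketch {
    /** Hypothetical input row with the two columns used by the changelog query. */
    record Row(long a, long b) {}

    /** Hypothetical output row: the input columns plus the aggregate column m. */
    record Enriched(long a, long b, long m) {}

    /** Emulates INLINESTATS m=MAX(a * b) BY b: aggregate per group, then attach the result to every row. */
    static List<Enriched> inlineStatsMaxByB(List<Row> rows) {
        // Pass 1: the STATS step, one MAX(a * b) per distinct value of b.
        Map<Long, Long> maxByB = new HashMap<>();
        for (Row r : rows) {
            maxByB.merge(r.b(), r.a() * r.b(), Long::max);
        }
        // Pass 2: the "inline" step, keep every input row and look up its group's result.
        List<Enriched> out = new ArrayList<>(rows.size());
        for (Row r : rows) {
            out.add(new Enriched(r.a(), r.b(), maxByB.get(r.b())));
        }
        return out;
    }

    /** Emulates the rest of the pipeline: WHERE m == a * b | SORT a DESC, b DESC | LIMIT 3. */
    static List<Enriched> query(List<Row> rows) {
        return inlineStatsMaxByB(rows).stream()
            .filter(e -> e.m() == e.a() * e.b())
            .sorted((x, y) -> x.a() != y.a() ? Long.compare(y.a(), x.a()) : Long.compare(y.b(), x.b()))
            .limit(3)
            .toList();
    }

    public static void main(String[] args) {
        // Made-up sample data, not the test data from the PR.
        List<Row> rows = List.of(new Row(1, 10), new Row(3, 10), new Row(2, 20), new Row(5, 20), new Row(4, 30));
        query(rows).forEach(System.out::println);
    }
}
```

Unlike a plain `STATS`, the first step keeps all five input rows; only the later `WHERE` narrows them to the row that holds each group's maximum.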
2 changes: 2 additions & 0 deletions docs/reference/esql/esql-commands.asciidoc
@@ -37,6 +37,7 @@ image::images/esql/processing-command.svg[A processing command changing an input
* <<esql-enrich>>
* <<esql-eval>>
* <<esql-grok>>
* experimental:[] <<esql-inlinestats-by>>
* <<esql-keep>>
* <<esql-limit>>
ifeval::["{release-state}"=="unreleased"]
@@ -59,6 +60,7 @@ include::processing-commands/drop.asciidoc[]
include::processing-commands/enrich.asciidoc[]
include::processing-commands/eval.asciidoc[]
include::processing-commands/grok.asciidoc[]
include::processing-commands/inlinestats.asciidoc[]
include::processing-commands/keep.asciidoc[]
include::processing-commands/limit.asciidoc[]
ifeval::["{release-state}"=="unreleased"]
100 changes: 100 additions & 0 deletions docs/reference/esql/processing-commands/inlinestats.asciidoc
@@ -0,0 +1,100 @@
[discrete]
[[esql-inlinestats-by]]
=== `INLINESTATS ... BY`

The `INLINESTATS` command calculates an aggregate result and adds new columns
with the result to the stream of input data.

*Syntax*

[source,esql]
----
INLINESTATS [column1 =] expression1[, ..., [columnN =] expressionN]
[BY grouping_expression1[, ..., grouping_expressionN]]
----

*Parameters*

`columnX`::
The name by which the aggregated value is returned. If omitted, the name is
equal to the corresponding expression (`expressionX`). If multiple columns
have the same name, all but the rightmost column with this name will be ignored.

`expressionX`::
An expression that computes an aggregated value. If its name coincides with one
of the computed columns, that column will be ignored.

`grouping_expressionX`::
An expression that outputs the values to group by.

NOTE: Individual `null` values are skipped when computing aggregations.

*Description*

The `INLINESTATS` command calculates an aggregate result and merges that result
back into the stream of input data. Without the optional `BY` clause, it
produces a single result that is appended to each row. With a `BY` clause, it
produces one result per group and merges each result into the rows whose
group keys match.

All of the <<esql-agg-functions,aggregation functions>> are supported.

*Examples*

Find the employees that speak the most languages (it's a tie!):

[source.merge.styled,esql]
----
include::{esql-specs}/inlinestats.csv-spec[tag=max-languages]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/inlinestats.csv-spec[tag=max-languages-result]
|===

Find the longest-tenured employee whose last name starts with each letter of the alphabet:

[source.merge.styled,esql]
----
include::{esql-specs}/inlinestats.csv-spec[tag=longest-tenured-by-first]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/inlinestats.csv-spec[tag=longest-tenured-by-first-result]
|===

Find the northernmost and southernmost airports:

[source.merge.styled,esql]
----
include::{esql-specs}/inlinestats.csv-spec[tag=extreme-airports]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/inlinestats.csv-spec[tag=extreme-airports-result]
|===

NOTE: Our test data doesn't have many "small" airports.

If a `BY` field is multivalued, `INLINESTATS` puts the row in *each* bucket,
just like <<esql-stats-by>>:

[source.merge.styled,esql]
----
include::{esql-specs}/inlinestats.csv-spec[tag=mv-group]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/inlinestats.csv-spec[tag=mv-group-result]
|===

To treat each group key as its own row, use <<esql-mv_expand>> before `INLINESTATS`:

[source.merge.styled,esql]
----
include::{esql-specs}/inlinestats.csv-spec[tag=mv-expand]
----
[%header.monospaced.styled,format=dsv,separator=|]
|===
include::{esql-specs}/inlinestats.csv-spec[tag=mv-expand-result]
|===
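The two multivalue behaviors documented above can be sketched in plain Java. This is a rough illustration, not the ESQL implementation: the `Row` record, the method names, and the sample data are hypothetical, and the sketch assumes every row has at least one key value. It covers only the bucketing step and the effect of expanding keys into separate rows, not how the un-expanded row displays its merged results.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class MultivalueGroupSketch {
    /** Hypothetical row: a name, a possibly multivalued grouping key, and a numeric value. */
    record Row(String name, List<String> keys, int value) {}

    /** Grouping step: a row with several key values contributes to the bucket for each of them. */
    static Map<String, Integer> minByKey(List<Row> rows) {
        Map<String, Integer> min = new TreeMap<>();
        for (Row r : rows) {
            for (String key : r.keys()) {
                min.merge(key, r.value(), Integer::min);
            }
        }
        return min;
    }

    /** Emulates expanding the key column first: one output row per key value. */
    static List<Row> mvExpand(List<Row> rows) {
        List<Row> out = new ArrayList<>();
        for (Row r : rows) {
            for (String key : r.keys()) {
                out.add(new Row(r.name(), List.of(key), r.value()));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample data, not the airports test data from the PR.
        List<Row> rows = List.of(
            new Row("x", List.of("mid", "military"), 9),
            new Row("y", List.of("mid"), 2),
            new Row("z", List.of("military"), 4)
        );
        System.out.println(minByKey(rows));
        System.out.println(mvExpand(rows).size());
    }
}
```

In this sketch the per-bucket minimums are the same whether or not the keys are expanded first; what changes is the number of rows the results are merged into (three rows without expansion, four with it).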
Original file line number Diff line number Diff line change
@@ -37,6 +37,9 @@ public abstract sealed class RowInTableLookup implements Releasable permits Empt
    public abstract String toString();

    public static RowInTableLookup build(BlockFactory blockFactory, Block[] keys) {
        if (keys.length < 1) {
            throw new IllegalArgumentException("expected [keys] to be non-empty");
        }
        int positions = keys[0].getPositionCount();
        for (int k = 0; k < keys.length; k++) {
            if (positions != keys[k].getPositionCount()) {
Original file line number Diff line number Diff line change
@@ -40,6 +40,12 @@ public String toString() {
     * are never closed, so we need to build them from a non-tracking factory.
     */
    public record Factory(Key[] keys, int[] blockMapping) implements Operator.OperatorFactory {
        public Factory {
            if (keys.length < 1) {
                throw new IllegalArgumentException("expected [keys] to be non-empty");
            }
        }

        @Override
        public Operator get(DriverContext driverContext) {
            return new RowInTableLookupOperator(driverContext.blockFactory(), keys, blockMapping);
@@ -56,6 +62,9 @@ public String describe() {
    private final int[] blockMapping;

    public RowInTableLookupOperator(BlockFactory blockFactory, Key[] keys, int[] blockMapping) {
        if (keys.length < 1) {
            throw new IllegalArgumentException("expected [keys] to be non-empty");
        }
        this.blockMapping = blockMapping;
        this.keys = new ArrayList<>(keys.length);
        Block[] blocks = new Block[keys.length];
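The guard added in these hunks rejects an empty `keys` array up front; in `build`, the next statement reads `keys[0]`, which would otherwise fail with an index error. The same pattern can be tried in isolation with a record's compact constructor. This is a minimal sketch: `Key`, `LookupFactory`, and `tryBuild` are hypothetical stand-ins, not the real Elasticsearch types.

```java
class NonEmptyKeysSketch {
    /** Hypothetical stand-in for a lookup key; the real Key type is not shown in this diff. */
    record Key(String name) {}

    /** Hypothetical factory record that validates its components in a compact constructor. */
    record LookupFactory(Key[] keys, int[] blockMapping) {
        LookupFactory {
            // Runs before the fields are assigned, so an invalid instance is never created.
            if (keys.length < 1) {
                throw new IllegalArgumentException("expected [keys] to be non-empty");
            }
        }
    }

    /** Returns the validation message for the given keys, or null when construction succeeds. */
    static String tryBuild(Key[] keys) {
        try {
            new LookupFactory(keys, new int[keys.length]);
            return null;
        } catch (IllegalArgumentException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(tryBuild(new Key[0]));
        System.out.println(tryBuild(new Key[] { new Key("k") }));
    }
}
```

Checking in the constructor as well as in the factory means the error surfaces with a descriptive message no matter which entry point a caller uses.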