ESQL: extend BUCKET with spans. Turn it into a grouping function (elastic#107272)

This extends the `BUCKET` function to accept a two-parameter invocation: the first parameter remains as is, while the second is a span. The span can be numeric (floating point) if the first argument is numeric, or a date period or time duration if the first argument is a date. The function can also now be invoked through the alias `BIN`. Additionally, it has been turned into a grouping-only function and can therefore only be used within a `STATS` command.
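
A minimal sketch of the two-parameter (span) form described above. It assumes the `employees` sample dataset with `hire_date` and `salary` fields that the ES|QL documentation examples use. The output column names `week`, `hires_per_week`, `bs`, and `c` are illustrative and not taken from this commit.

```esql
// Date input with a time-span second argument: one bucket per week.
// BUCKET is used as a grouping function inside STATS ... BY.
FROM employees
| WHERE hire_date >= "1985-01-01T00:00:00Z" AND hire_date < "1986-01-01T00:00:00Z"
| STATS hires_per_week = COUNT(*) BY week = BUCKET(hire_date, 1 week)
| SORT week

// Numeric input with a floating-point span: salary buckets of width 5000.
FROM employees
| STATS c = COUNT(*) BY bs = BUCKET(salary, 5000.)
| SORT bs
```

Both queries place `BUCKET` in the `BY` clause of `STATS`, which matches the grouping-only restriction this commit introduces. According to the commit message, `BIN` should be accepted in the same positions as an alias.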
carlosdelest · Apr 16, 2024 · a2c2e8f
1 parent 9626615
commit a2c2e8f
Showing 22 changed files with 995 additions and 508 deletions.
diff --git a/docs/changelog/107272.yaml b/docs/changelog/107272.yaml
@@ -0,0 +1,5 @@
+pr: 107272
+summary: "ESQL: extend BUCKET with spans"
+area: ES|QL
+type: enhancement
+issues: []
diff --git a/docs/reference/esql/esql-get-started.asciidoc b/docs/reference/esql/esql-get-started.asciidoc
@@ -248,22 +248,22 @@ For example, to create hourly buckets for the data on October 23rd:
 
 [source,esql]
 ----
-include::{esql-specs}/date.csv-spec[tag=gs-bucket]
+include::{esql-specs}/bucket.csv-spec[tag=gs-bucket]
 ----
 
 Combine `BUCKET` with <<esql-stats-by>> to create a histogram. For example,
 to count the number of events per hour:
 
 [source,esql]
 ----
-include::{esql-specs}/date.csv-spec[tag=gs-bucket-stats-by]
+include::{esql-specs}/bucket.csv-spec[tag=gs-bucket-stats-by]
 ----
 
 Or the median duration per hour:
 
 [source,esql]
 ----
-include::{esql-specs}/date.csv-spec[tag=gs-bucket-stats-by-median]
+include::{esql-specs}/bucket.csv-spec[tag=gs-bucket-stats-by-median]
 ----
 
 [discrete]

diff --git a/docs/reference/esql/functions/bucket.asciidoc b/docs/reference/esql/functions/bucket.asciidoc
@@ -35,11 +35,11 @@ in monthly buckets:
 
 [source.merge.styled,esql]
 ----
-include::{esql-specs}/date.csv-spec[tag=docsBucketMonth]
+include::{esql-specs}/bucket.csv-spec[tag=docsBucketMonth]
 ----
 [%header.monospaced.styled,format=dsv,separator=|]
 |===
-include::{esql-specs}/date.csv-spec[tag=docsBucketMonth-result]
+include::{esql-specs}/bucket.csv-spec[tag=docsBucketMonth-result]
 |===
 
 The goal isn't to provide *exactly* the target number of buckets, it's to pick a
@@ -51,11 +51,11 @@ Combine `BUCKET` with
 
 [source.merge.styled,esql]
 ----
-include::{esql-specs}/date.csv-spec[tag=docsBucketMonthlyHistogram]
+include::{esql-specs}/bucket.csv-spec[tag=docsBucketMonthlyHistogram]
 ----
 [%header.monospaced.styled,format=dsv,separator=|]
 |===
-include::{esql-specs}/date.csv-spec[tag=docsBucketMonthlyHistogram-result]
+include::{esql-specs}/bucket.csv-spec[tag=docsBucketMonthlyHistogram-result]
 |===
 
 NOTE: `BUCKET` does not create buckets that don't match any documents.
@@ -66,11 +66,11 @@ at most 100 buckets in a year results in weekly buckets:
 
 [source.merge.styled,esql]
 ----
-include::{esql-specs}/date.csv-spec[tag=docsBucketWeeklyHistogram]
+include::{esql-specs}/bucket.csv-spec[tag=docsBucketWeeklyHistogram]
 ----
 [%header.monospaced.styled,format=dsv,separator=|]
 |===
-include::{esql-specs}/date.csv-spec[tag=docsBucketWeeklyHistogram-result]
+include::{esql-specs}/bucket.csv-spec[tag=docsBucketWeeklyHistogram-result]
 |===
 
 NOTE: `BUCKET` does not filter any rows. It only uses the provided range to
@@ -83,11 +83,11 @@ salary histogram:
 
 [source.merge.styled,esql]
 ----
-include::{esql-specs}/ints.csv-spec[tag=docsBucketNumeric]
+include::{esql-specs}/bucket.csv-spec[tag=docsBucketNumeric]
 ----
 [%header.monospaced.styled,format=dsv,separator=|]
 |===
-include::{esql-specs}/ints.csv-spec[tag=docsBucketNumeric-result]
+include::{esql-specs}/bucket.csv-spec[tag=docsBucketNumeric-result]
 |===
 
 Unlike the earlier example that intentionally filters on a date range, you
@@ -102,17 +102,17 @@ per hour:
 
 [source.styled,esql]
 ----
-include::{esql-specs}/date.csv-spec[tag=docsBucketLast24hr]
+include::{esql-specs}/bucket.csv-spec[tag=docsBucketLast24hr]
 ----
 
 Create monthly buckets for the year 1985, and calculate the average salary by
 hiring month:
 
 [source.merge.styled,esql]
 ----
-include::{esql-specs}/date.csv-spec[tag=bucket_in_agg]
+include::{esql-specs}/bucket.csv-spec[tag=bucket_in_agg]
 ----
 [%header.monospaced.styled,format=dsv,separator=|]
 |===
-include::{esql-specs}/date.csv-spec[tag=bucket_in_agg-result]
+include::{esql-specs}/bucket.csv-spec[tag=bucket_in_agg-result]
 |===
diff --git a/docs/reference/esql/functions/description/bucket.asciidoc b/docs/reference/esql/functions/description/bucket.asciidoc
@@ -2,4 +2,4 @@
 
 *Description*
 
-Creates human-friendly buckets and returns a datetime value for each row that corresponds to the resulting bucket the row falls into.
+Creates groups of values - buckets - out of a datetime or numeric input. The size of the buckets can either be provided directly, or chosen based on a recommended count and values range.