[docs] Batch19 SQL aggregation functions #17658

Status: Open. Wants to merge 7 commits into base: master
115 changes: 97 additions & 18 deletions docs/querying/sql-functions.md
@@ -1282,21 +1282,74 @@ Returns the following:

## BLOOM_FILTER

Computes a Bloom filter from values produced by the specified expression.
Computes a [bloom filter](../development/extensions-core/bloom-filter.md) from values provided in an expression.

* **Syntax**: `BLOOM_FILTER(expr, <NUMERIC>)`
`numEntries` specifies the maximum number of distinct values the filter can hold before the false positive rate increases.

* **Syntax:** `BLOOM_FILTER(expr, numEntries)`
* **Function type:** Aggregation

<details><summary>Example</summary>

The following example returns a Base64-encoded bloom filter string for entries in `agent_category`:

```sql
SELECT
agent_category,
BLOOM_FILTER(agent_category, 10) as bloom
FROM "kttm"
GROUP BY agent_category
```

Returns the following:

| `agent_category` | `bloom` |
| -- | -- |
| `empty` | `"BAAAAAgAAAAAABAAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAEABAAAAAA"` |
| `Game console` | `"BAAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAQAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgBAAAAAAAAAAAAAAAA"` |
| `Personal computer` | `"BAAAAAgAAAAAAEAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAAAAAAAAAQAAAAAAAAAAAAAA"` |
| `Smart TV` | `"BAAAAAgAAAAAAAAAAAAAgAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgAA"` |
| `Smartphone` | `"BAAAAAgAAACAAAAAAAAAAAAAAEAAAAAAAAAAAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAAAA"` |
| `Tablet` | `"BAAAAAgAAAAAAAAAAAAAAAIAAAAAAAAAAAAAAAAAAAAAAgAAAAAAAAAAAAAAAAAAAAACAAAAAAAAAAAAAAAAAAAAAAIA"` |

</details>
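
In the grouped example above, each filter is built from a single `agent_category` value. The following is a minimal sketch, assuming the same `kttm` datasource and a loaded `druid-bloom-filter` extension, that drops the `GROUP BY` to build one filter covering every `agent_category` value. The output is not shown because the encoded string depends on the data:

```sql
-- Sketch only: builds a single bloom filter over all agent_category values.
-- numEntries is set to 10, as in the grouped example.
SELECT
BLOOM_FILTER(agent_category, 10) AS bloom
FROM "kttm"
```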

[Learn more](sql-aggregations.md)

## BLOOM_FILTER_TEST

Returns true if the expression is contained in a Base64-serialized Bloom filter.
Returns true if an expression is contained in a Base64-encoded [bloom filter](../development/extensions-core/bloom-filter.md) string.

* **Syntax**: `BLOOM_FILTER_TEST(expr, <STRING>)`
* **Syntax:** `BLOOM_FILTER_TEST(expr, <STRING>)`
* **Function type:** Scalar, other

[Learn more](sql-scalar.md#other-scalar-functions)
<details><summary>Example</summary>

The following example tests each `agent_category` value against the bloom filter string generated for `Game console` and returns `true` only for that entry:

```sql
SELECT
agent_category,
BLOOM_FILTER_TEST(agent_category, 'BAAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAQAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgBAAAAAAAAAAAAAAAA') as bloom
FROM "kttm"
GROUP BY agent_category
```

Returns the following:

| `agent_category` | `bloom` |
| -- | -- |
| `empty` | `false` |
| `Game console` | `true` |
| `Personal computer` | `false` |
| `Smart TV` | `false` |
| `Smartphone` | `false` |
| `Tablet` | `false` |

</details>
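
Because `BLOOM_FILTER_TEST` returns a boolean, it can also serve as a filter condition. The following is a minimal sketch, assuming the same `kttm` datasource and the `Game console` filter string from the example above. The alias `matching_rows` is illustrative. Bloom filters can report false positives, so treat the count as an upper bound on the true matches:

```sql
-- Sketch only: counts rows whose agent_category tests positive against the filter.
SELECT
COUNT(*) AS matching_rows
FROM "kttm"
WHERE BLOOM_FILTER_TEST(agent_category, 'BAAAAAgAAAAAAAAAAAAAAAAAAAAAAAAAQAgAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAgBAAAAAAAAAAAAAAAA')
```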

[Learn more](sql-scalar.md#other-scalar-functions)


## BTRIM

@@ -1756,39 +1809,66 @@ Returns the following:

## DECODE_BASE64_COMPLEX

Decodes a Base64-encoded string into a complex data type, where `dataType` is the complex data type and `expr` is the Base64-encoded string to decode.
Decodes a Base64-encoded expression into a complex data type.

* **Syntax**: `DECODE_BASE64_COMPLEX(dataType, expr)`
* **Function type:** Scalar, other
You can use the function to ingest data when a column contains an encoded data sketch such as Theta or HLL.

[Learn more](sql-scalar.md#other-scalar-functions)
The function supports `hyperUnique` and `serializablePairLongString` data types by default.
You can enable support for the following complex data types by [loading their extensions](../configuration/extensions.md):

- `druid-bloom-filter`: `bloom`
- `druid-datasketches`: `arrayOfDoublesSketch`, `HLLSketch`, `KllDoublesSketch`, `KllFloatsSketch`, `quantilesDoublesSketch`, `thetaSketch`
- `druid-histogram`: `approximateHistogram`, `fixedBucketsHistogram`
- `druid-stats`: `variance`
- `druid-compressed-bigdecimal`: `compressedBigDecimal`
- `druid-momentsketch`: `momentSketch`
- `druid-tdigestsketch`: `tDigestSketch`

* **Syntax:** `DECODE_BASE64_COMPLEX(dataType, expr)`
* **Function type:** Scalar

<details><summary>Example</summary>

The following example decodes a Theta sketch from a Base64-encoded sketch contained in `theta_input`:

```sql
DECODE_BASE64_COMPLEX('thetaSketch', "theta_input")
```
The following example counts the distinct values in an encoded Theta sketch column using [`APPROX_COUNT_DISTINCT_DS_THETA`](#approx_count_distinct_ds_theta):

```sql
APPROX_COUNT_DISTINCT_DS_THETA(DECODE_BASE64_COMPLEX('thetaSketch', "theta_input"))
```

</details>
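
The expressions above are fragments rather than complete queries. The following sketch shows how the second fragment might appear in a full statement. It assumes the `druid-datasketches` extension is loaded. The datasource name `encoded_sketches` and the alias `distinct_count` are hypothetical and used only for illustration; `theta_input` is the column from the example above:

```sql
-- Sketch only: "encoded_sketches" is a hypothetical datasource whose
-- "theta_input" column holds Base64-encoded Theta sketches.
SELECT
APPROX_COUNT_DISTINCT_DS_THETA(DECODE_BASE64_COMPLEX('thetaSketch', "theta_input")) AS distinct_count
FROM "encoded_sketches"
```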

[Learn more](./sql-scalar.md#other-scalar-functions)

## DECODE_BASE64_UTF8

Decodes a Base64-encoded string into a UTF-8 encoded string.
Decodes a Base64-encoded expression into a UTF-8 encoded string.

* **Syntax:** `DECODE_BASE64_UTF8(expr)`
* **Function type:** Scalar, string

<details><summary>Example</summary>

The following example converts the base64 encoded string `SGVsbG8gV29ybGQhCg==` into an UTF-8 encoded string.
The following example decodes the Base64-encoded representation of "Hello, World!":

```sql
SELECT
'SGVsbG8gV29ybGQhCg==' AS "base64_encoding",
DECODE_BASE64_UTF8('SGVsbG8gV29ybGQhCg==') AS "convert_to_UTF8_encoding"
DECODE_BASE64_UTF8('SGVsbG8sIFdvcmxkIQ==') as decoded
```

Returns the following:

| `base64_encoding` | `convert_to_UTF8_encoding` |
| -- | -- |
| `SGVsbG8gV29ybGQhCg==` | `Hello World!` |
| `decoded` |
| -- |
| `Hello, World!` |

</details>

[Learn more](sql-scalar.md#string-functions)
[Learn more](./sql-scalar.md#string-functions)

## DEGREES

@@ -5260,4 +5340,3 @@ Requires the [`druid-stats` extension](../development/extensions-core/stats.md).
* **Function type:** Aggregation

[Learn more](sql-aggregations.md)
