explicit outputType for ExpressionPostAggregator, better documentation for the differences between arrays and mvds #15245

Merged: clintropolis merged 13 commits into apache:master from arrays-are-not-mvds on Nov 2, 2023

clintropolis
Member

@clintropolis clintropolis commented Oct 25, 2023

Description

This PR attempts to:

  • lay out the differences between ARRAY types and multi-value dimensions more clearly in the docs
  • add a new arrays page with examples and concepts
  • update the multi-value dimensions page to call out that they are not arrays, and include SQL examples
  • document the new arrayIngestMode parameter added in "MSQ arrayIngestMode to control if arrays are ingested as ARRAY, MVD, or an exception" (#15093)
  • update MSQ examples to show how to ingest ARRAY types and MVDs going forward, to prepare for the eventual default of "arrayIngestMode":"array"

It also adds a new outputType to ExpressionPostAggregator, which the SQL planner uses to decorate the postagg with the expected output type. This is particularly useful for functions like ARRAY_TO_MV, which has a native expression type of ARRAY<STRING> but is expected to be treated as STRING outside of expressions. Multi-value STRING does not exist inside the expression engine, so native expression type inference alone cannot handle this properly. This was necessary to make the new docs actually true.

@clintropolis clintropolis added the Area - Documentation, Area - Querying, Area - SQL, and Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262) labels Oct 25, 2023
@clintropolis clintropolis added this to the 28.0 milestone Oct 25, 2023
@github-actions github-actions bot removed the Area - Querying and Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262) labels Oct 25, 2023
@317brian 317brian self-requested a review October 25, 2023 22:33
@@ -88,6 +88,9 @@ When deciding whether to use `REPLACE` or `INSERT`, keep in mind that segments g
with dimension-based pruning but those generated with `INSERT` cannot. For more information about the requirements
for dimension-based pruning, see [Clustering](#clustering).

To insert [ARRAY types](../querying/arrays.md), be sure to set context flag `"arrayIngestMode":"array"` which allows
Contributor

Hmm this seems like the wrong place to put this. It's generic docs about INSERT, we don't want to gunk it up with stuff about specific data that might be inserted. (Otherwise this would be, like, 10 times longer.)

I suggest cutting it, and relying on the examples and the array docs to guide people.

false`](reference.md#context-parameters) in your context. This ensures that multi-value strings are left alone and
remain lists, instead of being [automatically unnested](../querying/sql-data-types.md#multi-value-strings) by the
`GROUP BY` operator.
3. To ingest [Druid multi-value dimensions](../querying/multi-value-dimensions.md), wrap all multi-value strings
Contributor

@gianm gianm Oct 25, 2023

This direction has become too complicated for people to understand, so I think we'll need an example. Or to link to one.

@@ -0,0 +1,228 @@
---
id: arrays
title: "Array columns"
Contributor

"Arrays" is a better title and scope — as people can use arrays even if they don't have array columns. For example they can use MV_TO_ARRAY, ARRAY_AGG, etc.

-->


Apache Druid supports SQL standard `ARRAY` typed columns for `STRING`, `LONG`, and `DOUBLE` types. Other more complicated ARRAY types must be stored in [nested columns](nested-columns.md). Druid ARRAY types are distinct from [multi-value dimension](multi-value-dimensions.md), which have significantly different behavior than standard arrays.
Contributor

Odd to be mixing SQL terms with native type names here. Possibly switch to VARCHAR, BIGINT, and DOUBLE for making more sense with SQL. Or mention both the SQL name and the native name?

],
```

Arrays can also be inserted with [multi-stage ingestion](../multi-stage-query/index.md), but must include a query context parameter `"arrayIngestMode":"array"`.
Contributor

Sort of unclear what the verb "include" refers to. The sentence construction makes it sound like the arrays themselves must include the context parameter. But that isn't right. Also, "multi-stage ingestion" isn't a thing 🙂— it's "SQL-based ingestion" or "multi-stage query".

So, suggestion:

Arrays can also be inserted with SQL-based ingestion when you use the context parameter "arrayIngestMode": "array".

Also link the text context parameter to docs/multi-stage-query/reference.md#context.

Also include some text about what will happen if you don't do arrayIngestMode: array. Something like: string arrays will be converted to multi-value dimensions, and numeric arrays will cause the query to fail with an error (what error?)
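
To make the suggested wording concrete, here is a minimal sketch of how a SQL-based ingestion request might carry the context parameter. The table names `array_example_copy` and `array_example` are hypothetical, and the payload shape assumes the `query` plus `context` JSON body accepted by Druid's SQL task API (`/druid/v2/sql/task`). Nothing is sent over the network.

```python
import json

# Hypothetical INSERT that copies an ARRAY column between two made-up tables.
INGEST_SQL = """
INSERT INTO "array_example_copy"
SELECT "__time", "label", "arrayString"
FROM "array_example"
PARTITIONED BY DAY
""".strip()


def build_ingest_payload(sql: str, array_ingest_mode: str = "array") -> dict:
    """Build the JSON body for a SQL-based ingestion request.

    The context parameter travels alongside the query text; it is not part
    of the arrays themselves.
    """
    return {"query": sql, "context": {"arrayIngestMode": array_ingest_mode}}


payload = build_ingest_payload(INGEST_SQL)
print(json.dumps(payload, indent=2))
```

The sketch only shows where the parameter lives. What happens when it is omitted is the open question raised in the comment above.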

#### Example: SQL grouping query with a filter
```sql
SELECT label, arrayString
FROM "array_example" CROSS JOIN UNNEST(arrayString) as u(strings)
Contributor

is this CROSS JOIN meant to be here? it doesn't seem to be doing much if anything


- Value filters, like "equality", "range" match on entire array values
- The "null" filter will match rows where the entire array value is null
- Array specific functions like ARRAY_CONTAINS and ARRAY_OVERLAP follow the behavior specified by those functions
Contributor

backticks around function names, & link them to the SQL function docs

{"timestamp": "2023-01-01T00:00:00", "label": "row3", "arrayString": [], "arrayLong":[1, 2, 3], "arrayDouble":[null, 2.2, 1.1]}
{"timestamp": "2023-01-01T00:00:00", "label": "row4", "arrayString": ["a", "b"], "arrayLong":[1, 2, 3], "arrayDouble":[]}
{"timestamp": "2023-01-01T00:00:00", "label": "row5", "arrayString": null, "arrayLong":[], "arrayDouble":null}
```
Contributor

@gianm gianm Oct 26, 2023

Somewhere around here we should have a section "String arrays vs. multi-value dimensions" that sets people straight about the differences. Suggested text:

Avoid confusing string arrays with multi-value dimensions (link to MVD docs). Arrays and multi-value dimensions are stored in different column types, and query behavior is different. You can use the functions MV_TO_ARRAY and ARRAY_TO_MV to convert between the two if needed. In general, we recommend using arrays whenever possible, since they are a newer and more powerful feature.

Use care during ingestion to ensure you get the type you want.

To get arrays when performing an ingestion using JSON ingestion specs, such as native batch (link) or streaming ingestion (link), use dimension type auto or enable useSchemaDiscovery. When performing a SQL-based ingestion, write a query that generates arrays and set the context parameter arrayIngestMode: array. Arrays may contain strings or numbers.

To get multi-value dimensions when performing an ingestion using JSON ingestion specs, use dimension type string and do not enable useSchemaDiscovery. When performing a SQL-based ingestion, wrap arrays in ARRAY_TO_MV (link to examples), which ensures you get multi-value dimensions in any arrayIngestMode. Multi-value dimensions can only contain strings.

You can tell which type you have by checking the INFORMATION_SCHEMA.COLUMNS table, using a query like SELECT COLUMN_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS WHERE TABLE_NAME = 'mytable'. Arrays are type ARRAY, multi-value strings are type VARCHAR.

I suggest including the same exact text in multi-value-dimensions.md, or at least linking to this section prominently.
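
The type check described in the suggested text can be sketched as follows. This is an illustration only: `column_type_query` and `describe` are hypothetical helper names, the quoting is naive, and the mapping reflects just the two cases named in the comment (arrays report as `ARRAY`, multi-value strings as `VARCHAR`). Ordinary single-value strings are also `VARCHAR`, so that result alone does not prove a column is multi-valued.

```python
def column_type_query(table_name: str) -> str:
    """Return the INFORMATION_SCHEMA query quoted in the comment above."""
    # Naive quoting for illustration only; do not pass untrusted input.
    return (
        "SELECT COLUMN_NAME, DATA_TYPE FROM INFORMATION_SCHEMA.COLUMNS "
        f"WHERE TABLE_NAME = '{table_name}'"
    )


def describe(data_type: str) -> str:
    """Map a reported DATA_TYPE to the kind of column it indicates."""
    normalized = data_type.upper()
    if normalized == "ARRAY":
        return "array"
    if normalized == "VARCHAR":
        # Covers both single-value and multi-value string columns.
        return "string (single-value or multi-value)"
    return "other"


print(column_type_query("mytable"))
print(describe("ARRAY"), "|", describe("VARCHAR"))
```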

You can convert multi-value dimensions to standard SQL arrays explicitly with `MV_TO_ARRAY` or implicitly using [array functions](./sql-array-functions.md). You can also use the array functions to construct arrays from multiple columns.

Druid serializes `ARRAY` results as a JSON string of the array by default, which can be controlled by the context parameter
`sqlStringifyArrays`. When set to `false`, the arrays will instead be returned as regular JSON arrays instead of in stringified form.
Contributor

Surely this is only true for certain result formats? I mean, in csv, everything must be stringified somehow.
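
For readers unfamiliar with the distinction, the sketch below shows what "stringified" versus "regular JSON array" means for a single JSON result row. It models the doc sentence under review with Python's `json` module; it is not Druid's serializer, the row contents are made up, and it does not answer the question about CSV or other result formats.

```python
import json


def serialize_row(row: dict, stringify_arrays: bool) -> str:
    """Serialize one result row as JSON, optionally stringifying list values."""
    out = {}
    for key, value in row.items():
        if stringify_arrays and isinstance(value, list):
            # The array becomes a JSON string nested inside the JSON row.
            out[key] = json.dumps(value, separators=(",", ":"))
        else:
            out[key] = value
    return json.dumps(out)


row = {"label": "row1", "arrayString": ["a", "b"]}
stringified = serialize_row(row, stringify_arrays=True)
plain = serialize_row(row, stringify_arrays=False)
print(stringified)
print(plain)
```

A client reading the stringified form has to parse the value a second time to get the elements back, which is the practical reason to set `sqlStringifyArrays` to `false` where the result format allows it.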


Druid supports [ARRAY types](arrays.md), which behave as standard SQL arrays, where results are grouped by matching entire arrays. The [`UNNEST` operator](./sql-array-functions.md#unn) can be used to perform operations on individual array elements, translating each element into a separate row.

ARRAY typed columns can be stored in segments with class JSON based ingestion using the 'auto' typed dimension schema shared with [schema auto-discovery](../ingestion/schema-design.md#schema-auto-discovery-for-dimensions) to detect and ingest arrays as ARRAY typed columns. For [SQL based ingestion](../multi-stage-query/index.md), the query context parameter `arrayIngestMode` must be specified as `"array"` to ingest ARRAY types. In Druid 28, the default mode for this parameter is `'mvd'` for backwards compatibility, which instead can only handle `ARRAY<STRING>` which it stores in [multi-value string columns](#multi-value-strings).
Contributor

No real reason to have the extra single quotes in 'mvd'. Doing mvd is preferred.

Contributor

@317brian 317brian left a comment

Just some minor copyediting nits/suggestions. Thanks for putting this together! Will re-review once Gian's suggestions make it in.

Refer to the [Druid SQL data type documentation](sql-data-types.md#arrays) and [SQL array function reference](sql-array-functions.md) for additional details
about the functions available to use with ARRAY columns and types in SQL.

The following sections describe inserting, filtering, and grouping behavior based on the following example data, which includes 3 array typed columns.
Contributor

Suggested change
The following sections describe inserting, filtering, and grouping behavior based on the following example data, which includes 3 array typed columns.
The following sections describe inserting, filtering, and grouping behavior based on the following example data, which includes 3 array typed columns:

@@ -30,21 +30,36 @@ array of values instead of a single value, such as the `tags` values in the foll
{"timestamp": "2011-01-12T00:00:00.000Z", "tags": ["t1","t2","t3"]}
```

This document describes filtering and grouping behavior for multi-value dimensions. For information about the internal representation of multi-value dimensions, see
It is important to be aware that multi-value dimensions are distinct from [array types](arrays.md), which behave like standard SQL arrays. This document describes the behavior of multi-value dimensions, and some additional details can be found in the [SQL data type documentation](sql-data-types.md#multi-value-strings-behavior).
Contributor

Slight change to emphasize that they're different:

Suggested change
It is important to be aware that multi-value dimensions are distinct from [array types](arrays.md), which behave like standard SQL arrays. This document describes the behavior of multi-value dimensions, and some additional details can be found in the [SQL data type documentation](sql-data-types.md#multi-value-strings-behavior).
It is important to be aware that multi-value dimensions are distinct from [array types](arrays.md). While array types behave like standard SQL arrays, multi-value dimensions do not. This document describes the behavior of multi-value dimensions, and some additional details can be found in the [SQL data type documentation](sql-data-types.md#multi-value-strings-behavior).

@@ -61,20 +76,79 @@ By default, Druid sorts values in multi-value dimensions. This behavior is contr

See [Dimension Objects](../ingestion/ingestion-spec.md#dimension-objects) for information on configuring multi-value handling.

Multi-value dimensions can also be inserted with [multi-stage ingestion](../multi-stage-query/index.md). The multi-stage query engine does not have direct handling of class Druid multi-value dimensions. A special pair of functions, `MV_TO_ARRAY` which converts multi-value dimensions into `VARCHAR ARRAY` and `ARRAY_TO_MV` to coerce them back into `VARCHAR` exist to enable handling these types. Multi-value handling is not available when using the multi-stage query engine to insert data.
Contributor

Suggested change
Multi-value dimensions can also be inserted with [multi-stage ingestion](../multi-stage-query/index.md). The multi-stage query engine does not have direct handling of class Druid multi-value dimensions. A special pair of functions, `MV_TO_ARRAY` which converts multi-value dimensions into `VARCHAR ARRAY` and `ARRAY_TO_MV` to coerce them back into `VARCHAR` exist to enable handling these types. Multi-value handling is not available when using the multi-stage query engine to insert data.
Multi-value dimensions can also be inserted with [SQL-based ingestion using the multi-stage query (MSQ) task engine](../multi-stage-query/index.md). The MSQ task engine does not have direct handling of class Druid multi-value dimensions. A special pair of functions, `MV_TO_ARRAY` which converts multi-value dimensions into `VARCHAR ARRAY` and `ARRAY_TO_MV` to coerce them back into `VARCHAR` exist to enable handling these types. Multi-value handling is not available when using the multi-stage query task engine to insert data.

{"timestamp": "2011-01-14T00:00:00.000Z", "tags": ["t5","t6","t7"]} #row3
{"timestamp": "2011-01-14T00:00:00.000Z", "tags": []} #row4

Notice that `ARRAY_TO_MV` is not present in the `GROUP BY` clause, since we only wish to coerce the type _after_ grouping.
Contributor

Suggested change
Notice that `ARRAY_TO_MV` is not present in the `GROUP BY` clause, since we only wish to coerce the type _after_ grouping.
Notice that `ARRAY_TO_MV` is not present in the `GROUP BY` clause since we only wish to coerce the type _after_ grouping.


The `EXTERN` is also able to refer to the `tags` input type as `VARCHAR`, which is also how a query on a Druid table containing a multi-value dimension would specify the type of the `tags` column. If this is the case, `MV_TO_ARRAY` must be used since the multi-stage engine only supports grouping on multi-value dimensions as arrays, and so they must be coerced first. These arrays then must be coerced back into `VARCHAR` in the `SELECT` part of the statement with `ARRAY_TO_MV`.
Contributor

Suggested change
The `EXTERN` is also able to refer to the `tags` input type as `VARCHAR`, which is also how a query on a Druid table containing a multi-value dimension would specify the type of the `tags` column. If this is the case, `MV_TO_ARRAY` must be used since the multi-stage engine only supports grouping on multi-value dimensions as arrays, and so they must be coerced first. These arrays then must be coerced back into `VARCHAR` in the `SELECT` part of the statement with `ARRAY_TO_MV`.
The `EXTERN` is also able to refer to the `tags` input type as `VARCHAR`, which is also how a query on a Druid table containing a multi-value dimension would specify the type of the `tags` column. If this is the case, you must use `MV_TO_ARRAY` since the MSQ task engine only supports grouping on multi-value dimensions as arrays. So, they must be coerced first. These arrays must then be coerced back into `VARCHAR` in the `SELECT` part of the statement with `ARRAY_TO_MV`.
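
The pattern this paragraph describes (group on `MV_TO_ARRAY`, then coerce back with `ARRAY_TO_MV` only in the `SELECT` list) can be sketched as below. The table names `mvd_example_rollup` and `mvd_example` are hypothetical, and the statement reads from an existing Druid table rather than `EXTERN` to keep the example short. It has not been run against a cluster.

```python
# Hypothetical rollup of a table whose "tags" column is a multi-value dimension.
ROLLUP_SQL = """
REPLACE INTO "mvd_example_rollup" OVERWRITE ALL
SELECT
  "__time",
  ARRAY_TO_MV(MV_TO_ARRAY("tags")) AS "tags",
  COUNT(*) AS "count"
FROM "mvd_example"
GROUP BY 1, MV_TO_ARRAY("tags")
PARTITIONED BY DAY
""".strip()

# Split the statement so the two clauses can be inspected separately.
select_part, group_part = ROLLUP_SQL.split("GROUP BY")

print(ROLLUP_SQL)
```

Grouping happens on the array form, so each distinct list of tags is one group; the conversion back to a multi-value `VARCHAR` is applied to the grouped result.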

Contributor

@gianm gianm left a comment

TY for the updates; a smaller round of suggestions follows.

| `maxNumTasks` | SELECT, INSERT, REPLACE<br /><br />The maximum total number of tasks to launch, including the controller task. The lowest possible value for this setting is 2: one controller and one worker. All tasks must be able to launch simultaneously. If they cannot, the query returns a `TaskStartTimeout` error code after approximately 10 minutes.<br /><br />May also be provided as `numTasks`. If both are present, `maxNumTasks` takes priority. | 2 |
| `taskAssignment` | SELECT, INSERT, REPLACE<br /><br />Determines how many tasks to use. Possible values include: <ul><li>`max`: Uses as many tasks as possible, up to `maxNumTasks`.</li><li>`auto`: When file sizes can be determined through directory listing (for example: local files, S3, GCS, HDFS) uses as few tasks as possible without exceeding 512 MiB or 10,000 files per task, unless exceeding these limits is necessary to stay within `maxNumTasks`. When calculating the size of files, the weighted size is used, which considers the file format and compression format used if any. When file sizes cannot be determined through directory listing (for example: http), behaves the same as `max`.</li></ul> | `max` |
| `finalizeAggregations` | SELECT, INSERT, REPLACE<br /><br />Determines the type of aggregation to return. If true, Druid finalizes the results of complex aggregations that directly appear in query results. If false, Druid returns the aggregation's intermediate type rather than finalized type. This parameter is useful during ingestion, where it enables storing sketches directly in Druid tables. For more information about aggregations, see [SQL aggregation functions](../querying/sql-aggregations.md). | true |
| `arrayIngestMode` | INSERT, REPLACE<br /><br /> Controls how ARRAY type values are stored in Druid segments. When set to `'array'` (recommended for SQL compliance), Druid will store all ARRAY typed values in [ARRAY typed columns](../querying/arrays.md), and supports storing both VARCHAR and numeric typed arrays. When set to `'mvd'` (the default, for backwards compatibility), Druid only supports VARCHAR typed arrays, and will store them as [multi-value string columns](../querying/multi-value-dimensions.md). When set to `none`, Druid will throw an exception when trying to store any type of arrays, used to help migrate operators from `'mvd'` mode to `'array'` mode and force query writers to make an explicit choice between ARRAY and multi-value VARCHAR typed columns. | `'mvd'` (for backwards compatibility, recommended to use `array` for SQL compliance)|
Contributor

array is preferred over 'array'. In the JSON it's "array" anyway. (But forget that, use array.)

For none, is there a way for operators to set a default value? Otherwise it doesn't seem like it'd be useful for operators. (The useful flow would be for operators to set a default of none, and users to override it to either mvd or array as their preference dictates.)

Member Author

I think operators would need to set the default query context with `druid.query.default.context.arrayIngestMode`, which could then be overridden on a per-query basis.
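
The flow discussed here (an operator default that a query can override) together with the three modes from the table row under review can be modelled in a few lines. This is a toy model, not Druid code: `resolve_mode` and `storage_for` are hypothetical names, the returned strings are descriptive labels, and `ValueError` is a placeholder because the thread does not say which error Druid raises.

```python
VALID_MODES = {"array", "mvd", "none"}


def resolve_mode(default_context: dict, query_context: dict) -> str:
    """Per-query context wins over the operator default; fall back to 'mvd'."""
    merged = {**default_context, **query_context}
    mode = merged.get("arrayIngestMode", "mvd")
    if mode not in VALID_MODES:
        raise ValueError(f"unknown arrayIngestMode: {mode}")
    return mode


def storage_for(mode: str, element_type: str) -> str:
    """Describe how an ARRAY value would be stored under each mode."""
    if mode == "array":
        return f"ARRAY<{element_type}>"
    if mode == "mvd":
        if element_type == "STRING":
            return "multi-value STRING"
        raise ValueError("mvd mode only supports string arrays")
    raise ValueError("arrayIngestMode is none: arrays are rejected")


operator_default = {"arrayIngestMode": "none"}
print(resolve_mode(operator_default, {"arrayIngestMode": "array"}))
print(storage_for("array", "LONG"))
```

With a default of `none`, a query that does not choose a mode fails fast, which matches the migration use described for that setting.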

@@ -61,20 +77,81 @@ By default, Druid sorts values in multi-value dimensions. This behavior is contr

See [Dimension Objects](../ingestion/ingestion-spec.md#dimension-objects) for information on configuring multi-value handling.

### SQL-based ingestion
Multi-value dimensions can also be inserted with [SQL-based ingestion](../multi-stage-query/index.md). The multi-stage query engine does not have direct handling of class Druid multi-value dimensions. A special pair of functions, `MV_TO_ARRAY` which converts multi-value dimensions into `VARCHAR ARRAY` and `ARRAY_TO_MV` to coerce them back into `VARCHAR` exist to enable handling these types. Multi-value handling is not available when using the multi-stage query engine to insert data.
Contributor

"classic" (spelling)

Although… what does it mean to say that MSQ doesn't have "direct handling of classic Druid multi-value dimensions"? I would think it does directly handle them, if you use ARRAY_TO_MV? I guess I'm not sure what you're trying to say here.

Grammar for the sentence starting with "A special pair of functions" is kind of wonky. Please rewrite it to be clearer.

Member Author

I guess I was thinking of `groupByEnableMultiValueUnnesting`, which, looking at the code, is actually allowed by default; it is the web console that sets it to false for MSQ queries by default. I'll try to clarify this.


Druid supports [`ARRAY` types](arrays.md), which behave as standard SQL arrays, where results are grouped by matching entire arrays. The [`UNNEST` operator](./sql-array-functions.md#unn) can be used to perform operations on individual array elements, translating each element into a separate row.

`ARRAY` typed columns can be stored in segments with class JSON based ingestion using the 'auto' typed dimension schema shared with [schema auto-discovery](../ingestion/schema-design.md#schema-auto-discovery-for-dimensions) to detect and ingest arrays as ARRAY typed columns. For [SQL based ingestion](../multi-stage-query/index.md), the query context parameter `arrayIngestMode` must be specified as `"array"` to ingest ARRAY types. In Druid 28, the default mode for this parameter is `"mvd"` for backwards compatibility, which instead can only handle `ARRAY<STRING>` which it stores in [multi-value string columns](#multi-value-strings).
Contributor

I don't think the word "class" is useful here. I suppose you meant "classic", but JSON based ingestion isn't entirely "classic" / "legacy"; for example it's the only way to do realtime still.

@github-actions github-actions bot added the Area - Batch Ingestion, Area - Querying, and Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262) labels Nov 1, 2023
Contributor

@gianm gianm left a comment

Some minor comments. Could you also add a case to MSQInsertTest that tests that ARRAY_TO_MV makes an MVD even if arrayIngestMode: array? Some other tests in there would also benefit from having a version for array and a version for mvd.

@@ -78,7 +78,7 @@ By default, Druid sorts values in multi-value dimensions. This behavior is contr
See [Dimension Objects](../ingestion/ingestion-spec.md#dimension-objects) for information on configuring multi-value handling.

### SQL-based ingestion
Multi-value dimensions can also be inserted with [SQL-based ingestion](../multi-stage-query/index.md). The multi-stage query engine does not have direct handling of class Druid multi-value dimensions. A special pair of functions, `MV_TO_ARRAY` which converts multi-value dimensions into `VARCHAR ARRAY` and `ARRAY_TO_MV` to coerce them back into `VARCHAR` exist to enable handling these types. Multi-value handling is not available when using the multi-stage query engine to insert data.
Multi-value dimensions can also be inserted with [SQL-based ingestion](../multi-stage-query/index.md). The functions `MV_TO_ARRAY` and `ARRAY_TO_MV` can assist in converting `VARCHAR` to `VARCHAR ARRAY` and `VARCHAR ARRAY` into `VARCHAR` respectively. Multi-value handling is not available when using the multi-stage query engine to insert data.
Contributor

"Multi-value handling" in English like that I think will be confusing. It sounds like we're saying that multi-value dimensions cannot be handled by MSQ. Probably clearer to use multiValueHandling to make it clear we're talking about a parameter.

}
```

Output type is optional, and can be any native Druid type: `LONG`, `FLOAT`, `DOUBLE`, `STRING`, `ARRAY` types (e.g. `ARRAY<LONG>`), or `COMPLEX` types (e.g. `COMPLEX<json>`).
Contributor

This raises questions that the docs should answer:

  • What benefit is there to providing outputType?
  • What happens if outputType is different from the type of the expression? Error, cast, something else?
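
For context, an expression post-aggregator carrying the new field might look like the sketch below. The aggregator names `sumA` and `sumB` and the output name `sums` are hypothetical, and the snippet only builds and prints the JSON. It does not settle the two questions above, although the commit list further down this thread mentions "output coercion if outputType is defined".

```python
import json

# Hypothetical native post-aggregator spec with an explicit output type.
post_aggregator = {
    "type": "expression",
    "name": "sums",                    # hypothetical output column name
    "expression": "array(sumA, sumB)",  # sumA and sumB are made-up aggregators
    "outputType": "ARRAY<LONG>",        # optional; a native Druid type name
}

spec_json = json.dumps(post_aggregator, indent=2)
print(spec_json)
```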

…dated post-aggregations.md to be consistent with aggregations.md and filters.md and use tables
@clintropolis clintropolis changed the title from "better documentation for the differences between arrays and mvds" to "explicit outputType for ExpressionPostAggregator, better documentation for the differences between arrays and mvds" Nov 1, 2023
@gianm
Contributor

gianm commented Nov 2, 2023

Thanks, the latest changes look good to me!

@clintropolis clintropolis merged commit d261587 into apache:master Nov 2, 2023
82 checks passed
@clintropolis clintropolis deleted the arrays-are-not-mvds branch November 2, 2023 07:31
clintropolis added a commit to clintropolis/druid that referenced this pull request Nov 2, 2023
…n for the differences between arrays and mvds (apache#15245)

* better documentation for the differences between arrays and mvds
* add outputType to ExpressionPostAggregator to make docs true
* add output coercion if outputType is defined on ExpressionPostAgg
* updated post-aggregations.md to be consistent with aggregations.md and filters.md and use tables
abhishekagarwal87 pushed a commit that referenced this pull request Nov 3, 2023
…n for the differences between arrays and mvds (#15245) (#15307)

* better documentation for the differences between arrays and mvds
* add outputType to ExpressionPostAggregator to make docs true
* add output coercion if outputType is defined on ExpressionPostAgg
* updated post-aggregations.md to be consistent with aggregations.md and filters.md and use tables
CaseyPan pushed a commit to CaseyPan/druid that referenced this pull request Nov 17, 2023
…n for the differences between arrays and mvds (apache#15245)

* better documentation for the differences between arrays and mvds
* add outputType to ExpressionPostAggregator to make docs true
* add output coercion if outputType is defined on ExpressionPostAgg
* updated post-aggregations.md to be consistent with aggregations.md and filters.md and use tables
Labels
Area - Batch Ingestion, Area - Documentation, Area - MSQ (For multi stage queries - https://github.com/apache/druid/issues/12262), Area - Querying, Area - SQL, Bug