[BUG] MAX/MIN functions does not work properly with array values #3138
Comments
This appears to be due to how the OpenSearch engine calculates the min and max for array values.
For reference, this is how PostgreSQL behaves:
Thanks for raising this issue. OpenSearch/Lucene does not support an ARRAY data type. Instead, Lucene allows adding multiple values for the same field name. For example, when calculating the min/max of field y, OpenSearch reads all the values of y and calculates the metrics over them; the result is min(y)=1, max(y)=5.
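To make the flattening behavior described above concrete, here is a minimal Python sketch. It is not OpenSearch or Lucene code; the documents and the helper `flattened_values` are hypothetical, with values chosen only so that the result matches min(y)=1, max(y)=5.

```python
# Hypothetical documents (assumption): the values are chosen only to
# reproduce min(y)=1 and max(y)=5 from the comment above.
docs = [
    {"y": [1, 2]},
    {"y": [3, 5]},
]


def flattened_values(documents, field):
    """Yield every individual value of a field, ignoring array boundaries.

    This mimics how a multi-valued field is seen by a metric aggregation:
    each element is just another value of the same field.
    """
    for doc in documents:
        value = doc[field]
        if isinstance(value, list):
            yield from value
        else:
            yield value


flat_min = min(flattened_values(docs, "y"))
flat_max = max(flattened_values(docs, "y"))
print(flat_min, flat_max)  # 1 5
```

Because the array boundaries are lost before the metric is computed, neither result is an array, which is the behavior reported in this issue.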
Related issue in OpenSearch core: opensearch-project/OpenSearch#16420
One solution: if array_field is used in an aggregation, we should do post-processing instead of rewriting the query as a DSL aggregation query.
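A rough sketch of what such post-processing could look like, written in Python for brevity rather than in the plugin's own code. The names `as_array` and `post_process_extreme` and the sample rows are hypothetical. Treating a scalar as a one-element array is an assumption, and Python's built-in list ordering stands in for an array comparison.

```python
def as_array(value):
    """Wrap a scalar in a one-element list (assumption: scalars act as arrays)."""
    return value if isinstance(value, list) else [value]


def post_process_extreme(rows, field, pick):
    """Compute MAX or MIN over whole array values after fetching the rows.

    `pick` is the built-in `max` or `min`. Python compares lists element by
    element, so each array value is kept intact.
    """
    arrays = [as_array(row[field]) for row in rows if field in row]
    if not arrays:
        return None  # no rows carry the field
    return pick(arrays)


# Hypothetical rows, as they might come back from a plain (non-aggregating) search.
rows = [
    {"array_field": [1, 2]},
    {"array_field": [3, 4]},
    {"array_field": 2},
]

max_result = post_process_extreme(rows, "array_field", max)
min_result = post_process_extreme(rows, "array_field", min)
print(max_result, min_result)  # [3, 4] [1, 2]
```

The trade-off is that the rows have to be fetched and aggregated outside the engine, so this gives up the performance of a pushed-down DSL aggregation.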
@penghuo I'm not sure the SQL plugin can easily know that the field is an array field. By default, the mapping will only contain information about the element types.
If the user adds something to the mapping, we could know that it is an array. Is there anything similar currently in OpenSearch?
@normanj-bitquill Agreed, opensearch-project/OpenSearch#16420 discusses a similar approach.
@normanj-bitquill @acarbonetto I've seen the approach suggested here. IMO we can utilize indexing: ingesting a mismatched value will not cause an error (if this field only contains a single value), but the query engine (PPL/SQL) should treat this value as a single-element list. IMO in the first iteration we can have the plugin be aware of the
What is the bug?
If MAX or MIN is applied to a field with array values, the result is the maximum or minimum over every individual element of every array value, rather than the maximum or minimum array.
Consider an index with the following data:
and this query:
or this query:
For MAX, the expected result is [3, 4] and for MIN, the expected result is [1, 2]. The comparison should be performed element by element and preserve the entire array value.
How can one reproduce the bug?
Steps to reproduce the behavior:
What is the expected behavior?
Use array comparisons to compare the array values and return the MAX or MIN array.
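As an illustration of the expected behavior, the sketch below compares arrays element by element and returns a whole array as the MAX or MIN. It is written in Python and is not plugin code. The comparator `compare_arrays`, the rule that a shorter array sorts first when one array is a prefix of the other, and the sample values are assumptions chosen to reproduce the expected [3, 4] and [1, 2].

```python
from functools import cmp_to_key


def compare_arrays(a, b):
    """Return -1, 0 or 1 by comparing two arrays element by element."""
    for x, y in zip(a, b):
        if x != y:
            return -1 if x < y else 1
    # All shared positions are equal: the shorter array sorts first (assumption).
    return (len(a) > len(b)) - (len(a) < len(b))


# Hypothetical array values of one field across three documents.
values = [[1, 2], [3, 4], [1, 5]]

array_key = cmp_to_key(compare_arrays)
max_array = max(values, key=array_key)
min_array = min(values, key=array_key)
print(max_array, min_array)  # [3, 4] [1, 2]
```

Note that [1, 5] is not the MAX even though it contains the largest single element, because the first position already decides the comparison against [3, 4].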
What is your host/environment?
Do you have any screenshots?
N/A
Do you have any additional context?
Issue #1300 had a change recently merged in that allows array values to be used in query evaluation and in the result set.