Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[BUG] aggregation on nested field returns incorrect result #14347

Closed
LantaoJin opened this issue Jun 14, 2024 · 5 comments
Closed

[BUG] aggregation on nested field returns incorrect result #14347

LantaoJin opened this issue Jun 14, 2024 · 5 comments
Assignees
Labels

Comments

@LantaoJin
Copy link
Member

LantaoJin commented Jun 14, 2024

Describe the bug

Aggregation value_count on field of nested type returns incorrect result. This issue only happens on nested type, object type without nested looks fine.

Related component

Search:Aggregations

To Reproduce

Repro:

# create an test index with nested field
PUT test_nested_field_index
{
  "mappings" : {
    "properties": {
      "patients": { 
        "type" : "nested"
      }
    }
  }
}
# add some testing data
PUT test_nested_field_index/_doc/100
{ 
  "patients": [ 
    {"name" : "John Doe", "age" : 56, "smoker" : true},
    {"name" : "Mary Major", "age" : 85, "smoker" : false}
  ] 
}
# value_count should return 2 instead of 0
GET test_nested_field_index/_search
{
  "size": 0,
  "aggs": {
    "count_patients_name": {
      "value_count": {
        "field": "patients.name.keyword"
      }
    }
  }
}

Expected behavior

value_count should return 2 instead of 0

Additional Details

Plugins
Please list all plugins currently enabled.

Screenshots
If applicable, add screenshots to help explain your problem.

Host/Environment (please complete the following information):

  • OS: Mac
  • Version 2.14

Additional context
Add any other context about the problem here.

@mch2
Copy link
Member

mch2 commented Jun 26, 2024

possibly related - #13324 / #14208 / #13288

@jed326
Copy link
Collaborator

jed326 commented Jun 26, 2024

@LantaoJin

Please correct me if my understanding here is wrong, but it looks like your agg is on the field patients.name.keyword but from the document it should just be patients.name.

Furthermore, when I try to do your value_count agg on just patients.name, I get the following:

"root_cause": [
            {
                "type": "illegal_argument_exception",
                "reason": "Text fields are not optimised for operations that require per-document field data like aggregations and sorting, so these operations are disabled by default. Please use a keyword field instead. Alternatively, set fielddata=true on [patients.name] in order to load field data by uninverting the inverted index. Note that this can use significant memory."
            }
        ]

Given that I don't think there are any problems here then, could you please double check?

@jed326
Copy link
Collaborator

jed326 commented Jun 26, 2024

Also, I think in order to interact with nested fields in an aggregation, a nested aggregation must be used. See: https://opensearch.org/docs/latest/aggregations/bucket/nested/

@kkewwei
Copy link
Contributor

kkewwei commented Jun 28, 2024

@jed326, @mch2, I test this, it'ok, it seems not a problem.

GET test_nested_field_index/_search
{
   "aggs": {
      "patients1": {
         "nested": {
            "path": "patients"
         },
         "aggs": {
            "aggs1": {
               "value_count": {
                  "field": "patients.name.keyword"
               }
            }
         }
      }
   }
}
image

@jed326
Copy link
Collaborator

jed326 commented Jun 28, 2024

Thanks @kkewwei! I also took at look at this, I think the problem is the nested aggregation was not used in the original query. I'm going to close this issue for now, @LantaoJin please re-open if there's any other issues here, thanks!

@jed326 jed326 closed this as completed Jun 28, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
Status: Done
Development

No branches or pull requests

4 participants