Bucket path name resolution fails with siblings and child aggregations with the same name #30608

Closed
ruizmarc opened this issue May 15, 2018 · 10 comments · Fixed by #30632

@ruizmarc

ruizmarc commented May 15, 2018

Elasticsearch version (bin/elasticsearch --version): 6.2.4

Plugins installed: No plugins installed

JVM version (java -version): java version "1.8.0_25"

OS version (uname -a if on a Unix-like system): Darwin Kernel Version 17.5.0

Description of the problem including expected versus actual behavior:

When using a pipeline aggregation on a date histogram, buckets_path cannot be resolved correctly if it points to a child aggregation of the date histogram and a sibling of the pipeline aggregation has the same name as that child aggregation (see the example below).

I would expect Elasticsearch to resolve the path correctly, since there doesn't seem to be any ambiguity. Maybe I'm wrong about this and the current behavior is intended...

Steps to reproduce:

Here is a simple query that reproduces the problem:

{
  "query": { "match_all": {} },
  "size": 0,
  "aggs": {
    "sessionsCount": {
      "filter": {
        "bool": {
          "must": [
            {
              "terms": {
                "status": [
                  "FINISHED"
                ]
              }
            }
          ]
        }
      }
    },
    "monthlyAverageSessions": {
      "avg_bucket": {
        "buckets_path": "monthBuckets>sessionsCount>_count",
        "gap_policy": "insert_zeros"
      }
    },
    "monthBuckets": {
      "date_histogram": {
        "field": "startTimestamp",
        "interval": "month"
      },
      "aggs": {
        "sessionsCount": {
          "filter": {
            "bool": {
              "must": [
                {
                  "terms": {
                    "status": [
                      "FINISHED"
                    ]
                  }
                }
              ]
            }
          }
        }
      }
    }
  }
}

And this is the error I receive:

{
  "error": {
    "root_cause": [],
    "type": "search_phase_execution_exception",
    "reason": "",
    "phase": "fetch",
    "grouped": true,
    "failed_shards": [],
    "caused_by": {
      "type": "class_cast_exception",
      "reason": "org.elasticsearch.search.aggregations.bucket.filter.InternalFilter cannot be cast to org.elasticsearch.search.aggregations.InternalMultiBucketAggregation"
    }
  },
  "status": 503
}

Either aggregation (sessionsCount or monthlyAverageSessions) works fine on its own. The query also works if I rename the first aggregation (sessionsCount) to something different (sessions, for example). So it looks like a naming problem.
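The renaming workaround described above can be sketched as a request body built in Python. This is only an illustration of the idea: the new name `sessions` is arbitrary, and the helper `finished_filter` is hypothetical, not part of any Elasticsearch client.

```python
import json


def finished_filter():
    """Filter aggregation from the report: sessions whose status is FINISHED."""
    return {"filter": {"bool": {"must": [{"terms": {"status": ["FINISHED"]}}]}}}


body = {
    "query": {"match_all": {}},
    "size": 0,
    "aggs": {
        # Renamed from "sessionsCount" so it no longer shares a name with
        # the child aggregation of monthBuckets.
        "sessions": finished_filter(),
        "monthlyAverageSessions": {
            "avg_bucket": {
                "buckets_path": "monthBuckets>sessionsCount>_count",
                "gap_policy": "insert_zeros",
            }
        },
        "monthBuckets": {
            "date_histogram": {"field": "startTimestamp", "interval": "month"},
            "aggs": {"sessionsCount": finished_filter()},
        },
    },
}

print(json.dumps(body, indent=2))
```

Only the top-level name changes; the buckets_path still points at the sessionsCount child inside monthBuckets.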

@elasticmachine
Collaborator

Pinging @elastic/es-search-aggs

@cbuescher cbuescher added the :Analytics/Aggregations Aggregations label May 15, 2018
@cbuescher
Member

@colings86 do you have an idea whether this is something that should be supported or a usage problem?

@colings86
Contributor

@ruizmarc This does indeed look like a bug at first glance. Do you have the full stack trace for the ClassCastException from your server logs?

@ruizmarc
Author

ruizmarc commented May 15, 2018

Thanks for your quick response :) Sure, here is the stack trace:

 org.elasticsearch.action.search.SearchPhaseExecutionException:
    at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:274) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.action.search.FetchSearchPhase$1.onFailure(FetchSearchPhase.java:92) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.onFailure(ThreadContext.java:657) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:39) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.common.util.concurrent.TimedRunnable.doRun(TimedRunnable.java:41) [elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) [elasticsearch-6.2.4.jar:6.2.4]
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_161]
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_161]
    at java.lang.Thread.run(Thread.java:748) [?:1.8.0_161]
 Caused by: java.lang.ClassCastException: org.elasticsearch.search.aggregations.bucket.filter.InternalFilter cannot be cast to org.elasticsearch.search.aggregations.InternalMultiBucketAggregation
    at org.elasticsearch.search.aggregations.pipeline.bucketmetrics.BucketMetricsPipelineAggregator.doReduce(BucketMetricsPipelineAggregator.java:83) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.action.search.SearchPhaseController.reduceAggs(SearchPhaseController.java:533) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.action.search.SearchPhaseController.reducedQueryPhase(SearchPhaseController.java:504) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.action.search.SearchPhaseController.reducedQueryPhase(SearchPhaseController.java:421) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.action.search.SearchPhaseController$1.reduce(SearchPhaseController.java:740) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.action.search.FetchSearchPhase.innerRun(FetchSearchPhase.java:102) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.action.search.FetchSearchPhase.access$000(FetchSearchPhase.java:45) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.action.search.FetchSearchPhase$1.doRun(FetchSearchPhase.java:87) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:672) ~[elasticsearch-6.2.4.jar:6.2.4]
    at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-6.2.4.jar:6.2.4]
    ... 5 more

@cbuescher cbuescher added the >bug label May 15, 2018
@colings86
Contributor

@ruizmarc I hope you don't mind, but I reformatted your stack trace to make it a bit easier to read.

@polyfractal
Contributor

I see the bug. It doesn't happen merely because the aggs have the same name; it happens when they also line up in exactly this way:

  • A. Multi-bucket agg in the first entry of our internal list
  • B. Regular agg as the immediate child of the multi-bucket in A
  • C. Regular agg with the same name as B at the top level, listed as the second entry in our internal list
  • D. Finally, a pipeline agg with the path down to B

It blows up because we overwrite the bucket path with the sublist. So when we start iterating, we match the agg in A, sublist the path and recurse down. But when the loop comes back around to check agg C, the sublisted path now matches because it went from A>B to just B, and then it throws the cast exception.

The fix should be pretty straightforward. I'll work something up. Thanks for the bug report @ruizmarc!
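The failure mode described above can be sketched in a few lines of Python. This is a simplified model and not the Elasticsearch source: `Agg`, `visit`, `reduce_buggy` and `reduce_fixed` are hypothetical names, and `visit` merely stands in for the cast to a multi-bucket agg and the descent into its buckets.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Agg:
    name: str
    multi_bucket: bool  # True for e.g. date_histogram, False for e.g. filter


def visit(agg: Agg, remaining: List[str]) -> Tuple[str, List[str]]:
    # Stand-in for the cast to a multi-bucket agg and the recursion down the path.
    if not agg.multi_bucket:
        raise TypeError(f"{agg.name} cannot be treated as a multi-bucket aggregation")
    return agg.name, list(remaining)


def reduce_buggy(aggs: List[Agg], path: List[str]):
    visited = []
    for agg in aggs:
        if agg.name == path[0]:
            path = path[1:]  # overwrites the path that later iterations compare against
            visited.append(visit(agg, path))
    return visited


def reduce_fixed(aggs: List[Agg], path: List[str]):
    visited = []
    for agg in aggs:
        if agg.name == path[0]:
            sub_path = path[1:]  # separate object; `path` stays valid for the loop
            visited.append(visit(agg, sub_path))
    return visited


# A (multi-bucket) first, C (same name as child B) second, as in the report.
aggs = [Agg("monthBuckets", True), Agg("sessionsCount", False)]
path = ["monthBuckets", "sessionsCount", "_count"]

try:
    reduce_buggy(aggs, path)
except TypeError as err:
    print("buggy:", err)

print("fixed:", reduce_fixed(aggs, path))
```

In the buggy variant the second loop iteration compares the top-level sessionsCount against the already shortened path and tries to treat a single-bucket agg as a multi-bucket one; keeping the sublist in its own variable avoids that.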

@polyfractal polyfractal self-assigned this May 15, 2018
polyfractal added a commit to polyfractal/elasticsearch that referenced this issue May 15, 2018
When processing a top-level sibling pipeline, we destructively sublist
the path by assigning back onto the same variable.  But if aggs are
specified such:

A. Multi-bucket agg in the first entry of our internal list
B. Regular agg as the immediate child of the multi-bucket in A
C. Regular agg with the same name as B at the top level, listed as the
   second entry in our internal list
D. Finally, a pipeline agg with the path down to B

We'll get class cast exception.  The first agg will sublist the path
from [A,B] to [B], and then when we loop around to check agg C,
the sublisted path [B] matches the name of C and it fails.

The fix is simple: we just need to store the sublist in a new object
so that the old path remains valid for the rest of the aggs in the loop

Closes elastic#30608
@ruizmarc
Author

I'm glad it helped! Thanks for your quick fix, it will be very helpful! 💯 ☺️

polyfractal added a commit that referenced this issue May 16, 2018
@jmuscireum

It seems like I stumbled across the same bug, so I backported the fix to 5.6.8 to check. Unfortunately, the fix isn't working for me:

org.elasticsearch.action.search.SearchPhaseExecutionException: 
	at org.elasticsearch.action.search.AbstractSearchAsyncAction.onPhaseFailure(AbstractSearchAsyncAction.java:272) [elasticsearch-5.6.8-SNAPSHOT.jar:5.6.8-SNAPSHOT]
	at org.elasticsearch.action.search.FetchSearchPhase$1.onFailure(FetchSearchPhase.java:92) [elasticsearch-5.6.8-SNAPSHOT.jar:5.6.8-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.onFailure(ThreadContext.java:659) [elasticsearch-5.6.8-SNAPSHOT.jar:5.6.8-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:39) [elasticsearch-5.6.8-SNAPSHOT.jar:5.6.8-SNAPSHOT]
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_144]
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_144]
	at java.lang.Thread.run(Thread.java:748) [?:1.8.0_144]
Caused by: java.lang.ClassCastException: org.elasticsearch.search.aggregations.bucket.filter.InternalFilter cannot be cast to org.elasticsearch.search.aggregations.InternalMultiBucketAggregation
	at org.elasticsearch.search.aggregations.pipeline.bucketmetrics.BucketMetricsPipelineAggregator.doReduce(BucketMetricsPipelineAggregator.java:83) ~[elasticsearch-5.6.8-SNAPSHOT.jar:5.6.8-SNAPSHOT]
	at org.elasticsearch.action.search.SearchPhaseController.reduceAggs(SearchPhaseController.java:519) ~[elasticsearch-5.6.8-SNAPSHOT.jar:5.6.8-SNAPSHOT]
	at org.elasticsearch.action.search.SearchPhaseController.reducedQueryPhase(SearchPhaseController.java:490) ~[elasticsearch-5.6.8-SNAPSHOT.jar:5.6.8-SNAPSHOT]
	at org.elasticsearch.action.search.SearchPhaseController.reducedQueryPhase(SearchPhaseController.java:408) ~[elasticsearch-5.6.8-SNAPSHOT.jar:5.6.8-SNAPSHOT]
	at org.elasticsearch.action.search.SearchPhaseController$1.reduce(SearchPhaseController.java:725) ~[elasticsearch-5.6.8-SNAPSHOT.jar:5.6.8-SNAPSHOT]
	at org.elasticsearch.action.search.FetchSearchPhase.innerRun(FetchSearchPhase.java:102) ~[elasticsearch-5.6.8-SNAPSHOT.jar:5.6.8-SNAPSHOT]
	at org.elasticsearch.action.search.FetchSearchPhase.access$000(FetchSearchPhase.java:45) ~[elasticsearch-5.6.8-SNAPSHOT.jar:5.6.8-SNAPSHOT]
	at org.elasticsearch.action.search.FetchSearchPhase$1.doRun(FetchSearchPhase.java:87) ~[elasticsearch-5.6.8-SNAPSHOT.jar:5.6.8-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:674) ~[elasticsearch-5.6.8-SNAPSHOT.jar:5.6.8-SNAPSHOT]
	at org.elasticsearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:37) ~[elasticsearch-5.6.8-SNAPSHOT.jar:5.6.8-SNAPSHOT]
	... 3 more

The query is:

{
  "size" : 13,
  "query" : {
    "match_all": {}
  },
  "aggregations" : {  
    "filteredPrices" : {
      "filter" : {
        "match_all": {}
      },
      "aggregations" : {
        "groupById" : {
          "terms" : {
            "field" : "itemId"
          },
          "aggregations" : {
            "nestedPrices" : {
              "nested" : {
                "path" : "prices"
              },
              "aggregations" : {
                "catalogFiltered" : {
                  "filter" : {
                    "match_all": {}
                  },
                  "aggregations" : {
                    "minPrice" : {
                      "min" : {
                        "field" : "prices.price"
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    },
    "minPriceOfAllItems" : {
      "min_bucket" : {
        "buckets_path" : [
          "filteredPrices>groupById>nestedPrices>catalogFiltered>minPrice"
        ],
        "gap_policy" : "skip"
      }
    },
    "maxPriceOfAllItems" : {
      "max_bucket" : {
        "buckets_path" : [
          "filteredPrices>groupById>nestedPrices>catalogFiltered>minPrice"
        ],
        "gap_policy" : "skip"
      }
    }
  }
}

@polyfractal
Contributor

Hey @jmuscireum, I believe you're running into a triplet of unrelated issues that constrain pipeline aggs right now.

First, filter aggs don't play nicely with pipelines because they only emit a single bucket, whereas the various sibling pipeline aggs (like min_bucket) can only use multi-bucket aggs. Issue #14600 deals with this (it talks about bucket_script, but generally applicable to other sibling aggs).

You can work around it by using a filters agg, but then you'll run into #29287, which is actually the same problem: nested aggs emit a single bucket instead of multiple buckets.

Finally, and I'm not sure there's an issue for this, but pipeline aggs can't aggregate across multiple levels of terms aggregations easily. You can sometimes work around it by "proxy'ing" the value out of the terms agg with an intermediate pipeline agg (e.g. a min_bucket on the same level as the terms, to "roll up" the value at that level, then another min_bucket at a higher level to roll up all the previous min_buckets). But you can't normally use one pipeline agg to aggregate across multiple levels of terms aggs.
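The "proxy" pattern described above might look roughly like the request body below, written as a Python dict. It is a sketch that has not been run against a cluster: the field names (`category`, `itemId`, `price`) and the aggregation names are made up for illustration, and whether the pattern applies depends on the shape of the real aggregation tree.

```python
import json

body = {
    "size": 0,
    "aggs": {
        "byCategory": {
            "terms": {"field": "category"},
            "aggs": {
                "byItem": {
                    "terms": {"field": "itemId"},
                    "aggs": {"minPrice": {"min": {"field": "price"}}},
                },
                # First roll-up: sits on the same level as the inner terms agg
                # and reduces its buckets to one value per byCategory bucket.
                "minPerCategory": {
                    "min_bucket": {"buckets_path": "byItem>minPrice"}
                },
            },
        },
        # Second roll-up: reduces the per-category values to a single value.
        "minOverall": {
            "min_bucket": {"buckets_path": "byCategory>minPerCategory"}
        },
    },
}

print(json.dumps(body, indent=2))
```

Each min_bucket only crosses one level of terms buckets, which is the point of the intermediate pipeline agg.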

Sorry for all the bad news... pipeline aggs have some fundamental limitations based on how the framework works. :(

@jmuscireum

Hey @polyfractal, thank you for the detailed answer! As a workaround, we can manually fetch the min and max price from the aggregated buckets. But I have the feeling that there is an approach better suited to our use case that I can't think of. Maybe you have an idea how we can achieve this without so many nested aggregations.
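Fetching the min and max manually, as mentioned above, could be done client-side along these lines. This is a minimal sketch assuming the response follows the usual nesting of the aggregation names from the query (filteredPrices > groupById > nestedPrices > catalogFiltered > minPrice); the function `price_range` and the sample response are hypothetical.

```python
def price_range(response):
    """Return (min, max) of the per-item minPrice values, or None if there are none."""
    buckets = response["aggregations"]["filteredPrices"]["groupById"]["buckets"]
    values = [
        bucket["nestedPrices"]["catalogFiltered"]["minPrice"]["value"]
        for bucket in buckets
    ]
    # A min aggregation reports null (None) when no documents matched.
    values = [v for v in values if v is not None]
    if not values:
        return None
    return min(values), max(values)


# Hand-written sample in the shape of a search response, for illustration only.
sample = {
    "aggregations": {
        "filteredPrices": {
            "doc_count": 3,
            "groupById": {
                "buckets": [
                    {"key": "a", "nestedPrices": {"catalogFiltered": {"minPrice": {"value": 12.5}}}},
                    {"key": "b", "nestedPrices": {"catalogFiltered": {"minPrice": {"value": None}}}},
                    {"key": "c", "nestedPrices": {"catalogFiltered": {"minPrice": {"value": 3.0}}}},
                ]
            },
        }
    }
}

print(price_range(sample))
```

Note that a terms aggregation only returns its top buckets (controlled by its size parameter), so a range computed this way covers only the items that were actually returned.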

Our products can have different prices in multiple catalogs. Depending on the user, different catalogs are accessible. So the prices differ from user to user. On our search page every product filter is showing the active filter values and the values that are additionally possible. For this we are using the post filter to filter the products after the filter aggregations were done. That's why filteredPrices exists. It filters the products that should be taken into consideration for the min and max values of our price filter.

Thank you for doing awesome work and have a nice weekend!

ywelsch pushed a commit to ywelsch/elasticsearch that referenced this issue May 23, 2018