bucket_script & bucket_selector fail to get reference from bucket_path with meaningful cumulative_sum, serial_diff and moving_avg aggregation #27602

Closed
voidlps opened this issue Nov 30, 2017 · 1 comment

Comments

@voidlps

voidlps commented Nov 30, 2017

Elasticsearch version: 5.6.3

Plugins installed: []

JVM version: 1.8.0_151

OS version: official docker image

Description of the problem including expected versus actual behavior:
A bucket_script / bucket_selector aggregation whose buckets_path points to a cumulative_sum or serial_diff result may fail or return zero.
The main problem arises in date_histogram: some buckets may be empty, yet their cumulative_sum and serial_diff values are still meaningful and should be usable in bucket_script, for example to let Elasticsearch do a unit conversion and return the converted result alongside the original value.
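To illustrate why the value in an empty bucket is still meaningful, here is a minimal Python sketch of a running total over the per-bucket avg values from the reproduction in this report. It is only a model: the function name is made up for illustration, and treating a missing value as "adds nothing" is an assumption chosen to match the cul_sum values in the response shown in this report, not a copy of the Elasticsearch implementation.

```python
from typing import List, Optional


def cumulative_sum(values: List[Optional[float]]) -> List[float]:
    """Running total in which a missing value (None) contributes nothing.

    Hypothetical helper for illustration only, not Elasticsearch code.
    """
    total = 0.0
    totals = []
    for value in values:
        if value is not None:
            total += value
        # The running total is defined for every bucket, including empty ones.
        totals.append(total)
    return totals


# avg per 10-minute bucket from 09:00 to 09:40; 09:20 and 09:30 hold no documents
avg_per_bucket = [1, 3, None, None, 5]
print(cumulative_sum(avg_per_bucket))
```

The two empty buckets simply carry the previous total forward, which is the value a downstream bucket_script would be expected to see.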

It seems related to the third OR condition in this line:

if (Double.isInfinite(value) || Double.isNaN(value) || (bucket.getDocCount() == 0 && !isDocCountProperty)) {

When a bucket has doc_count == 0, the gapPolicy is always applied, even if that buckets_path holds a meaningful value. This assumption is incorrect, since the cumulative_sum / serial_diff / moving_avg aggregations still provide meaningful values for an empty bucket.
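The effect of that condition can be modelled in a few lines of Python. This is a simplified sketch based only on the quoted line: the function name, its parameters, and the mapping of the gap policies to results (skip yields NaN, insert_zeros yields 0.0) are assumptions made for illustration, not the actual Elasticsearch source.

```python
import math


def resolve_bucket_value(value, doc_count, is_doc_count_property=False,
                         gap_policy="skip"):
    """Simplified model of the quoted condition (hypothetical helper)."""
    is_gap = (
        math.isinf(value)
        or math.isnan(value)
        # Third OR: an empty bucket is treated as a gap regardless of the value.
        or (doc_count == 0 and not is_doc_count_property)
    )
    if is_gap:
        # Assumed policy behaviour: insert_zeros -> 0.0, skip -> NaN.
        return 0.0 if gap_policy == "insert_zeros" else math.nan
    return value


# cul_sum is 4 in the empty 09:20 bucket, but the value never reaches the script
print(resolve_bucket_value(4.0, doc_count=1))
print(resolve_bucket_value(4.0, doc_count=0))
print(resolve_bucket_value(4.0, doc_count=0, gap_policy="insert_zeros"))
```

Under this model, a perfectly valid value of 4 is discarded purely because doc_count is 0, which would explain both the missing result and the "become zero" case.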

Steps to reproduce:

1. Create example data:
curl http://localhost:9200/test_bucket_path/testdata/1 -XPUT -d'{"name":"a","num":1, "t":"2017-11-30T09:03"}'
curl http://localhost:9200/test_bucket_path/testdata/2 -XPUT -d'{"name":"a","num":3, "t":"2017-11-30T09:15"}'
curl http://localhost:9200/test_bucket_path/testdata/3 -XPUT -d'{"name":"a","num":5, "t":"2017-11-30T09:45"}'

2. Run the aggregation over a range that includes empty buckets:

GET test_bucket_path/_search
{
  "size": 0,
  "aggs": {
    "test": {
      "date_histogram": {
        "field": "t",
        "interval": "10m"
      },
      "aggs": {
        "avg": {
          "avg": {
            "field": "num"
          }
        },
        "cul_sum": {
          "cumulative_sum": {
            "buckets_path": "avg"
          }
        },
        "broken_script": {
          "bucket_script": {
            "buckets_path": {
              "cul_sum": "cul_sum"
            },
            "script": "params.cul_sum*5"
          }
        }
      }
    }
  }
}
3. For the buckets "2017-11-30T09:20:00.000Z" and "2017-11-30T09:30:00.000Z", the bucket_script result is not provided:
{
  "took": 167,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 3,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "test": {
      "buckets": [
        {
          "key_as_string": "2017-11-30T09:00:00.000Z",
          "key": 1512032400000,
          "doc_count": 1,
          "avg": {
            "value": 1
          },
          "cul_sum": {
            "value": 1
          },
          "broken_script": {
            "value": 5
          }
        },
        {
          "key_as_string": "2017-11-30T09:10:00.000Z",
          "key": 1512033000000,
          "doc_count": 1,
          "avg": {
            "value": 3
          },
          "cul_sum": {
            "value": 4
          },
          "broken_script": {
            "value": 20
          }
        },
        {
          "key_as_string": "2017-11-30T09:20:00.000Z",
          "key": 1512033600000,
          "doc_count": 0,
          "avg": {
            "value": null
          },
          "cul_sum": {
            "value": 4
          }
        },
        {
          "key_as_string": "2017-11-30T09:30:00.000Z",
          "key": 1512034200000,
          "doc_count": 0,
          "avg": {
            "value": null
          },
          "cul_sum": {
            "value": 4
          }
        },
        {
          "key_as_string": "2017-11-30T09:40:00.000Z",
          "key": 1512034800000,
          "doc_count": 1,
          "avg": {
            "value": 5
          },
          "cul_sum": {
            "value": 9
          },
          "broken_script": {
            "value": 45
          }
        }
      ]
    }
  }
}
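For comparison, this is what the result would look like if the script were applied to every bucket. The sketch below does the multiplication client-side on a trimmed copy of the response above; the function name and its parameters are hypothetical, and this is only a possible stopgap under the assumption that the script is a simple multiplication, not a fix for the aggregation itself.

```python
def apply_script_client_side(buckets, source="cul_sum", factor=5):
    """Compute source * factor for every bucket that carries a source value.

    Hypothetical helper; mirrors the script "params.cul_sum*5" outside Elasticsearch.
    """
    results = []
    for bucket in buckets:
        value = bucket.get(source, {}).get("value")
        scripted = None if value is None else value * factor
        results.append((bucket["key_as_string"], scripted))
    return results


# Trimmed copy of the "test" buckets from the response above
buckets = [
    {"key_as_string": "2017-11-30T09:00:00.000Z", "doc_count": 1, "cul_sum": {"value": 1}},
    {"key_as_string": "2017-11-30T09:10:00.000Z", "doc_count": 1, "cul_sum": {"value": 4}},
    {"key_as_string": "2017-11-30T09:20:00.000Z", "doc_count": 0, "cul_sum": {"value": 4}},
    {"key_as_string": "2017-11-30T09:30:00.000Z", "doc_count": 0, "cul_sum": {"value": 4}},
    {"key_as_string": "2017-11-30T09:40:00.000Z", "doc_count": 1, "cul_sum": {"value": 9}},
]

for key, value in apply_script_client_side(buckets):
    print(key, value)
```

The empty 09:20 and 09:30 buckets would each get 20 here, which is the value expected from broken_script in the response.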

Provide logs (if relevant):

@voidlps voidlps changed the title bucket_script & bucket_selector fail to reference bucket_path with meaningful cumulative_su,serial_diff and moving_avg aggregtaion bucket_script & bucket_selector fail to get reference from bucket_path with meaningful cumulative_sum,serial_diff and moving_avg aggregtaion Nov 30, 2017
@colings86
Contributor

This is a duplicate of #27377. Closing in favour of that issue.
