[Lens] Add cumulative sum aggregation #61776

Closed
2 tasks done
timroes opened this issue Mar 30, 2020 · 12 comments
Labels
enhancement New value added to drive a business result Feature:Lens Project:LensDefault Team:Visualizations Visualization editors, elastic-charts and infrastructure

Comments

@timroes
Contributor

timroes commented Mar 30, 2020

Add a cumulative sum aggregation to Lens. See #56696 for more discussion.

Tasks:

@timroes timroes added enhancement New value added to drive a business result Team:Visualizations Visualization editors, elastic-charts and infrastructure Feature:Lens labels Mar 30, 2020
@elasticmachine
Contributor

Pinging @elastic/kibana-app (Team:KibanaApp)

@wylieconlon
Contributor

wylieconlon commented Sep 16, 2020

This calculation is the simplest of the timeseries functions, so I don't think it needs the level of additional clarification that we need for the other time series functions.

The only restriction I would like to add is that cumulative sum does not make sense for all metrics. It only makes sense when the underlying data represents something that can be summed. As a best practice, we should prevent users from taking a cumulative sum of:

  • Averages
  • Medians
  • Cardinality (this won't be correct, issue here)
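
To illustrate why cardinality is on this list, here is a minimal sketch with made-up data. The `cumulativeSum` and `distinct` helpers and the bucket contents are hypothetical and are not Lens or Elasticsearch code. The sketch only shows that adding up per-bucket unique counts overcounts values that appear in more than one bucket.

```typescript
// Hypothetical helper (not Lens code): running total over per-bucket values.
function cumulativeSum(values: number[]): number[] {
  let total = 0;
  return values.map((v) => (total += v));
}

// Hypothetical helper: unique entries of a list, in first-seen order.
function distinct(ids: string[]): string[] {
  return ids.filter((id, i) => ids.indexOf(id) === i);
}

// Made-up user IDs seen in three consecutive time buckets.
const buckets: string[][] = [
  ["a", "b"],
  ["b", "c"],
  ["a", "c", "d"],
];

// Per-bucket cardinality is [2, 2, 3]; summing it counts repeat users again.
const summedCardinality = cumulativeSum(buckets.map((b) => distinct(b).length));

// The true running cardinality has to look at the underlying IDs.
let seen: string[] = [];
const trueRunningCardinality = buckets.map((b) => {
  seen = distinct(seen.concat(b));
  return seen.length;
});

console.log(summedCardinality); // [2, 4, 7]
console.log(trueRunningCardinality); // [2, 3, 4]
```

The two results diverge as soon as a user shows up in a second bucket, which is why a cumulative sum over a unique count is not a correct running unique count.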

[Screenshot: Screen Shot 2020-09-16 at 6 54 28 PM]

Should we support cumulative distribution?

Unlike the other time series functions, it might make sense to allow a cumulative sum on any number, not just numbers that are part of a date histogram. This can produce a cumulative distribution chart, which is a long-standing Kibana request.

@flash1293
Contributor

@wylieconlon It seems like adding cumulative distribution isn't more work in our architecture than just allowing this for timeseries, right? In that case I would suggest implementing this from the start.

@wylieconlon
Contributor

@flash1293 I took a second look at cumulative distributions, and have come to the conclusion that we should implement it as a completely separate function. I attempted to create some examples using Vega, and the main methods I found were:

  • Apply a series of functions to calculate a result that I don't have confidence in. For reference, I was implementing it with the following functions:
    1. Calculate the overall sum
    2. Divide each value by the overall sum
    3. Take the cumulative sum of 2, and display as a percentage
  • Generate an entirely separate ES request with the percentiles agg instead of histogram.
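
The three numbered steps in the first bullet can be sketched as follows. This is only an illustration of those steps, under the assumption that the input is a list of per-bucket values already in bucket order. The function name `cumulativeDistribution` is hypothetical, and the sketch does not address the doubt expressed above about whether the result is statistically sound.

```typescript
// Hypothetical sketch of the three steps; not the Vega spec or Lens code.
function cumulativeDistribution(values: number[]): number[] {
  // Step 1: calculate the overall sum.
  const total = values.reduce((sum, v) => sum + v, 0);
  // Guard against division by zero when every bucket is empty.
  if (total === 0) {
    return values.map(() => 0);
  }
  let running = 0;
  return values.map((v) => {
    // Step 2: divide each value by the overall sum.
    // Step 3: accumulate the shares and express them as a percentage.
    running += v / total;
    return running * 100;
  });
}

console.log(cumulativeDistribution([1, 3, 4])); // [12.5, 50, 100]
```

Because step 1 needs the overall sum before any bucket can be normalized, this takes two passes over the data, unlike a plain cumulative sum.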

Based on this, I prefer splitting this into a separate function in both the UI and the expression functions.

@flash1293
Contributor

@wylieconlon I'm fine with splitting it out. I think I misunderstood what you were planning to do initially - I thought it would just be about summing up a metric over histogram buckets without any normalizing. This seems like a feature that is much easier to implement and would also provide value, wdyt?

@wylieconlon
Contributor

@flash1293 Sorry for the confusion, I agree that we should only do cumulative sum for now. I wasn't clear: when I said "I prefer splitting it", I also meant that I would want to implement them at different times. Cumulative sum is the easier one to implement.

@flash1293
Contributor

@wylieconlon Thanks for the clarification. What I meant is to also allow the cumulative sum operation when the user has a histogram operation in the stack instead of a date histogram (and maybe even on all bucket operations) - it can still be meaningful in some cases and shouldn't be harder to add. What do you think?

@wylieconlon
Contributor

@flash1293 Yes, that makes sense. There is a Vega example of the same thing.

If we can come up with a good use case for cumulative sum inside ordinal functions (terms or filters), we could also support those, but I don't have any good examples right now.

@monfera
Contributor

monfera commented Oct 1, 2020

Would cumulative sums sometimes be computed in Kibana/Lens? Granted that the calculation typically happens before, or early in, the aggregation pipeline, is there a risk that it needs such a fine granularity that Kibana can't handle it, even if other aggregations subsequently make the data much coarser and only a dozen or two bars are shown in a bar chart? It feels like one of those operations that are ideally pushed down.

@flash1293
Contributor

Cumulative sums would be calculated in Kibana - we currently don't have use cases where cumulative sum is calculated on very fine granularity and subsequently aggregated down. It's just a postprocessing step on top of existing buckets and subject to the same limitations (e.g. total number of buckets). I don't see a reason to treat it differently from other operations.
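
As a rough illustration of what "a postprocessing step on top of existing buckets" could look like, here is a minimal sketch. The `Row` shape, the `series` split, and the `addCumulativeSum` function are assumptions made for this example and are not the actual Lens expression function. It assumes the rows arrive already sorted by bucket, as they would from a date histogram or a numeric histogram.

```typescript
// Hypothetical row shape: one row per (bucket, series) pair, sorted by bucket.
interface Row {
  bucket: number; // bucket key, e.g. a timestamp or a histogram interval start
  series: string; // value of an optional split, e.g. a terms bucket
  value: number; // the summable metric for this bucket
}

interface RowWithTotal extends Row {
  cumulative: number;
}

// Adds a running total per series in a single pass over the existing rows.
function addCumulativeSum(rows: Row[]): RowWithTotal[] {
  const totals: Record<string, number> = {};
  return rows.map((row) => {
    const next = (totals[row.series] || 0) + row.value;
    totals[row.series] = next;
    return {
      bucket: row.bucket,
      series: row.series,
      value: row.value,
      cumulative: next,
    };
  });
}

const result = addCumulativeSum([
  { bucket: 0, series: "a", value: 2 },
  { bucket: 0, series: "b", value: 1 },
  { bucket: 1, series: "a", value: 3 },
  { bucket: 1, series: "b", value: 4 },
]);
console.log(result.map((r) => r.cumulative)); // [2, 1, 5, 5]
```

The work is linear in the number of rows already returned, so in this sketch the cost is bounded by the same bucket limits as any other client-side operation.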

@flash1293 flash1293 added the loe:needs-research This issue requires some research before it can be worked on or estimated label Oct 2, 2020
@flash1293 flash1293 removed the loe:needs-research This issue requires some research before it can be worked on or estimated label Oct 12, 2020
@flash1293 flash1293 self-assigned this Oct 12, 2020
@flash1293 flash1293 removed their assignment Oct 20, 2020
@flash1293
Contributor

Removing assignment as the rest of this issue is blocked by #76828 for now.

@flash1293 flash1293 self-assigned this Nov 17, 2020
@flash1293
Contributor

Closed by #84384

6 participants