Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Lens] Support summary row for table #84558

Closed
wylieconlon opened this issue Nov 30, 2020 · 14 comments · Fixed by #101075
Closed

[Lens] Support summary row for table #84558

wylieconlon opened this issue Nov 30, 2020 · 14 comments · Fixed by #101075
Assignees
Labels
enhancement New value added to drive a business result Feature:Lens Team:Visualizations Visualization editors, elastic-charts and infrastructure

Comments

@wylieconlon
Copy link
Contributor

wylieconlon commented Nov 30, 2020

User-configurable options

Users should be able to summarize their metrics using the basic summary functions:

  • Sum (default for numeric)
  • Count
  • Avg
  • Min
  • Max
  • is this list complete?

Each metric should be displayed with a string label, for example "Sum: 1,234Kb/s", per the EUI examples.

There are several open product questions, input needed: cc @flash1293 @timductive @timroes

How to enable the summary row:

a. More flexible than visualize: Separate summary row and function for each metric (this is my preference)
b. Same functionality as visualize: Enable the summary row for all metrics together, only one function allowed for the summary

Number formatting:

a. Use the column number formatter unless we are taking the Count. Count uses the default number formatter. (simplest option, and my preference)
b. Allow a separate number formatter for the summary value

Edge cases in configuration

I believe there should be a separate summary for each transposed column, which will increase the complexity of the implementation. It requires client-side calculation instead of server-side.

The "Last value" metric has several edge cases. I recommend that we don't enable summary on the "last value" because it will cause more problems than benefits. The edge cases are:

  • Array values: This is an existing bug in the Visualize data table. We can't summarize array values.
  • Non-numeric values: The only metric we can support is Count or Unique count, which I am not sure benefits users

Question: Should we allow summary on last value, even with the known limitation?

@wylieconlon wylieconlon added enhancement New value added to drive a business result Team:Visualizations Visualization editors, elastic-charts and infrastructure Feature:Lens labels Nov 30, 2020
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-app (Team:KibanaApp)

@wylieconlon wylieconlon mentioned this issue Nov 30, 2020
29 tasks
@timductive
Copy link
Member

I agree with the list of summary functions you have listed.

I agree with having a more flexible summary row than visualize and using the same column number formatter.

From a product perspective, having a separate summary for each transposed column makes sense but probably not a hard requirement depending on how complex that makes the implementation.

I'm generally fine with not enabling a summary for "last value" if it won't produce anything valuable to the user. My concern would be addressing how we explain that to the user, I defer to whatever @MichaelMarcialis thinks is best there, maybe it's not worth explaining.

@wylieconlon
Copy link
Contributor Author

wylieconlon commented Dec 1, 2020

@timductive Thanks for updating! Just to make sure, your preference is the same as mine: "Separate summary row and function for each metric" instead of having a single toggle?

My concern would be addressing how we explain that to the user

We had a quick discussion about this earlier today. Instead of preventing it, we could do the same approach we do for XY charts. For those charts, we will take the values if they are plain numbers, but show a warning on non-numeric values. This sounds like a better approach than preventing it entirely.

@timductive
Copy link
Member

@wylieconlon the word I'm a little confused by is having a separate "row" for each metric, I think we are saying the same thing but a quick sketch or example of how this would look would be easier for me to understand.

Your described solution sounds right for how to handle "last value".

@wylieconlon
Copy link
Contributor Author

I accidentally added the word "row". I would expect that each column has its own summary. To take one of the table examples from #69450:

Date A - Count A - Sum B - Count B - Sum
3/24/20 20 102 2 23
3/25/20 3 50 6 81
3/26/20 7 97 8 12
. Sum: 30 Sum: 249 Sum: 16 Sum: 116

@MichaelMarcialis
Copy link
Contributor

I'm generally fine with not enabling a summary for "last value" if it won't produce anything valuable to the user. My concern would be addressing how we explain that to the user, I defer to whatever @MichaelMarcialis thinks is best there, maybe it's not worth explaining.

If it's a solid assumption that users won't find value in summarizing a "last value" metric and wouldn't lament its absence, I'm inclined to just omit it and not call any additional attention to the omission (unless we start getting user feedback that indicates otherwise).

We had a quick discussion about this earlier today. Instead of preventing it, we could do the same approach we do for XY charts. For those charts, we will take the values if they are plain numbers, but show a warning on non-numeric values. This sounds like a better approach than preventing it entirely.

Alternatively, if we're unsure about that assumption and we do wish to allow a summary for a "last value" metric, I like @wylieconlon's suggestion to utilize the new warning popover UI.

@flash1293
Copy link
Contributor

flash1293 commented Dec 2, 2020

My takes:

I think we should add the following summary functions to the list and also allow summary rows for string columns (including buckets): unique count, last, first

It should be configured per dimension, not gobally

Reusing the formatter for numeric fields for sum, avg, min and max sounds right to me, I think an additional formatter would be overkill, falling back to default number formatter is enough.

Having a separate summary for each transposed column sounds right to me as well, in a later step we can add overall totals for all columns of a metric as well.

About last value:

There are definitely cases where a summary row for last value metrics makes sense. It’s only really an issue for array fields but it’s pretty common to have fields never containing arrays. I think using th warning ui is exactly the right approach.

About the string/number fields edge case - As mentioned above I would allow summary row for all string columns and just treat last value strings columns the same way.

@MichaelMarcialis
Copy link
Contributor

While working on wireframe sketches for the table enhancements, a question occurred to me regarding these summary rows. Do we wish to support just a single "grand" summary row for the entire table? Or is the desire to support both a "grand" summary row for the entire table and sub-summary rows (assuming there is more than one row dimension item defined)?

image

If we do wish to support both types of summary row, how do we reconcile that with the fact that the user will be defining the presence of the summary at the metric dimension level? Specifically, if the user enables a summary row on a metric dimension item for a table that has multiple row dimension items, do we assume the user wants both the "grand" summary row and any available sub-summary rows? Or do we think it should be an additional option for the user to choose which row summary types to show?

Also, as color-by-value is also available at the metric dimension, should these summary rows conform to any color settings the user may have set for that particular metric? Off the top of my head, I'm thinking no, as it may not make sense to adhere to the same color rules depending on the summary function being chosen. Thoughts?

@wylieconlon
Copy link
Contributor Author

@MichaelMarcialis That's a good question. I think there are no technical blockers to add that feature- we can insert extra rows with different styling, based on the example above. It still adds complexity to the feature set. Is it important to have in the first version, or are we okay with grand totals only in V1?

In terms of configuration: Grouped summaries only make sense when there are 2 or more bucketed columns, and they never make sense on the innermost bucket. We are already planning to add some metrics like unique count for these bucketed columns, so we could add a separate set of options for bucket configuration. One of these approaches would work:

  • There could be a checkbox that turns on "subtotals" or "grouped summaries" on all possible groups
  • Each group could have a checkbox to turn on grouped summaries for this particular dimension

@flash1293
Copy link
Contributor

I agree this is a cool feature, but I would like to not tackle it in the first phase.

Once we do, I would expect this to be a setting of the bucket dimension - in the metric dimension you are configuring the aggregation function (e.g. sum), and it will show the grand summaries, then you have a checkbox in the "rows" dimensions to turn on grouped summaries (if there is at least one metric dimension with summaries enabled)

@MichaelMarcialis
Copy link
Contributor

@flash1293, just to clarify for the purposes of my designs, are we saying that the first phase will only offer the "grand" summary row (and that additional sub-summary options will be handled in later phases)?

@flash1293
Copy link
Contributor

@MichaelMarcialis Exactly, I think that's a good phasing for this feature.

@fbuecklers
Copy link

fbuecklers commented Jun 6, 2021

Hi together,

I love to see that this feature is finally landing in the kibana data tables. One small addition,
We are displaying a lot of performance timers where typically median and percentiles are used. Therefore it would be helpful if those functions are also supported in the summary row.

To add a small example we are applying a term aggregation on the device to get the different performance timers per device. The summary row will then contains the overall performance timers.

Hope that make sense for You and you will take that use case in account as well!

Thanks

@flash1293
Copy link
Contributor

@fbuecklers Thanks for reaching out. I'm not sure whether our current understanding of this feature would fit your use case well - the summary row is calculated based on the values of the table column, not the underlying documents. This means in this scenario a "median" summary row would give you the median of the data row values, not the median of all underlying documents - this can diverge quite a bit depending on your data.

If you had a median/percentile calculation on the document level in mind to fill the summary row, could you open a separate feature request to track this?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New value added to drive a business result Feature:Lens Team:Visualizations Visualization editors, elastic-charts and infrastructure
Projects
None yet
Development

Successfully merging a pull request may close this issue.

7 participants