Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds event loop delay percentiles to mean delay #121052

Conversation

TinaHeiligers
Copy link
Contributor

@TinaHeiligers TinaHeiligers commented Dec 12, 2021

Resolves #120667

Kibana's status page shows a summary of Kibana's process metrics and plugin statuses. At the moment, the page displays if there's a problem (or someone navigates directly to /status) meaning that users rely on the information available to help them sort out issues.

This PR adds the event loop delay metrics, showing a subset of the percentiles as detail to the tile that displays the mean.
This PR also enhances the Load and Response time metric tiles, thereby guiding users to what the metrics indicate.

Internal only: Click here to see the PR on cloud

Before:

Screen Shot 2021-12-12 at 12 17 54

After:

Screen Shot 2021-12-10 at 16 02 21

Checklist

Delete any items that are not applicable to this PR.

For maintainers

Release Note:

Displays event loop delay metrics in the Status page to assist with performance monitoring.

@TinaHeiligers TinaHeiligers added v8.0.0 v8.1.0 Feature:StatusPage Issues related to the Kibana Status Page and APIs v7.17.0 labels Dec 12, 2021
layout="horizontal"
title={formatMetric(metric)}
description={name}
<EuiStat
Copy link
Contributor Author

@TinaHeiligers TinaHeiligers Dec 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ended up using a combination of the suggestions from design here, using EuiStat as a footer to the existing EuiCard in MetricTile. EuiCard with horizontal layout doesn't support a footer so I had to go with aligning the text instead.

const metrics = Array.isArray(value) ? value : [value];
return metrics.map((metric) => formatNumber(metric, type)).join(', ');
};

const formatMetricId = ({ name }: Metric) => {
return name.toLowerCase().replace(/[ ]+/g, '-');
};

const formatDelayFooterTitle = (meta: MetricMeta) => {
Copy link
Contributor Author

@TinaHeiligers TinaHeiligers Dec 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I extracted the formatting to make it easier to read. We could do something similar with the others and maybe even move them into a single formatDetails.ts file if we need to.

}
/>
);
} else if (name === 'Response time avg') {
Copy link
Contributor Author

@TinaHeiligers TinaHeiligers Dec 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I grouped the Response time metrics together for two reasons:

  1. The average makes more sense when one knows what it's relative to (the max). Having the min in here would be nice down the line too.
  2. Allows us to use 2 lines of 3 cards rather than 3 lines where the last one only has 2 cards. We could change the flex on these components but this is more toward Design's recommendations or grouping relevant metrics together.

name: i18n.translate('core.statusPage.metricsTiles.columns.resTimeAvgHeader', {
defaultMessage: 'Response time avg',
name: i18n.translate('core.statusPage.metricsTiles.columns.processDelayHeader', {
defaultMessage: 'Delay',
Copy link
Contributor Author

@TinaHeiligers TinaHeiligers Dec 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any reason we weren't showing event loop delay metrics?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just forgot to add them, likely.

description={name}
<EuiStat
data-test-subj={testSubjectName}
title={title}
Copy link
Contributor Author

@TinaHeiligers TinaHeiligers Dec 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can only change the text size of the title prop, so I'm using the title to render the important info (values) in the footer.

metric.meta && (
<MetricCardFooter
testSubjectName="serverMetricMeta"
title={formatDelayFooterTitle(metric.meta.value as number[], metric.meta.type)}
Copy link
Contributor Author

@TinaHeiligers TinaHeiligers Dec 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What we want to show in the footer depends on the metric. For event loop delay, it's the percentiles, for the Load it's the load average intervals and in the case of the response times, we're showing the max to give more context to what the mean is. I'm using inline formatting to extract text from the info we want to show. We could, theoretically, do all the formatting in load_status but I chose to do the formatting in the component because it feels more relevant here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to cast metric.meta.value as number[] if you do something like metric.meta?.value && (<MetricCardFooter... ?

We could, theoretically, do all the formatting in load_status but I chose to do the formatting in the component because it feels more relevant here.

Yeah we were mixing concerns here already anyway -- some formatting happens in load_status, some in the component. It would be nice to have metric preformatted so that you don't need any conditional logic in the component, but I don't have particularly strong feelings on it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Value is value: number | number[];, so it has to be force-casted.

I think we should really get rid of this formatMetrics function that converts a properly typed object from the server to this untyped Metric array. It maybe made sense previously as we wanted to have a generic component to rule them all handle all metrics in the same way, but now that we have specific display and component depending on the metric, I think formatMetrics is just making things worse.

That's quite a lot of additional changes though, so we probably don't want to do this in current PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's quite a lot of additional changes though, so we probably don't want to do this in current PR.

We want this to go into 7.17 as a bug fix (forgetting to add the event_loop_delay metrics), so keeping the changes in this PR small is the way to go. It's "only" adding the metrics we didn't have, hence, fixing a bug.

I have the feeling the status page is going to evolve a lot and probably (hopefully) soon to add more detail to the status info at least. We'll need to get design in here to help us figure out the best way we can make this page more useful for diagnosing issues. Once that redesign is done, we can go ahead and refactor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lukeelmers @pgayvallet I've made the changes and errored on the side of caution, keeping the changes very small. Would you mind having another look please?

@@ -46,3 +121,10 @@ const formatMetric = ({ value, type }: Metric) => {
const formatMetricId = ({ name }: Metric) => {
return name.toLowerCase().replace(/[ ]+/g, '-');
};

const formatDelayFooterTitle = (values: number[], type?: DataType) => {
Copy link
Contributor Author

@TinaHeiligers TinaHeiligers Dec 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having this formatting in-line in the component makes the component code messy to read. We could eventually extract all the formatting helpers into their own file if we end up adding a lot of them.

defaultMessage: 'Percentiles',
}
),
title: '',
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're using the value as the title for the Delay card.

defaultMessage: 'Response time max',
}),
title: '',
value: [metrics.response_times.max_in_millis],
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to Delay, I'm using the value as the title in the card footer.

@TinaHeiligers TinaHeiligers added the Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc label Dec 13, 2021
@TinaHeiligers TinaHeiligers marked this pull request as ready for review December 13, 2021 15:37
@TinaHeiligers TinaHeiligers requested a review from a team as a code owner December 13, 2021 15:37
@elasticmachine
Copy link
Contributor

Pinging @elastic/kibana-core (Team:Core)

@TinaHeiligers
Copy link
Contributor Author

@elasticmachine merge upstream

metric.meta && (
<MetricCardFooter
testSubjectName="serverMetricMeta"
title={formatDelayFooterTitle(metric.meta.value as number[], metric.meta.type)}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you need to cast metric.meta.value as number[] if you do something like metric.meta?.value && (<MetricCardFooter... ?

We could, theoretically, do all the formatting in load_status but I chose to do the formatting in the component because it feels more relevant here.

Yeah we were mixing concerns here already anyway -- some formatting happens in load_status, some in the component. It would be nice to have metric preformatted so that you don't need any conditional logic in the component, but I don't have particularly strong feelings on it.

export const MetricTile: FunctionComponent<{ metric: Metric }> = ({ metric }) => {
const { name } = metric;
export const MetricCardFooter: FunctionComponent<{
testSubjectName: string;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: you can probably call this data-test-subj directly (as EUI does)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it wasn't possible to spread such keys, but in case someone else though so, TIL:

const foo = { 'data-test-subj': 'bar' };
const { 'data-test-subj': dataTestSub } = foo;

footer={
metric.meta && (
<MetricCardFooter
testSubjectName="serverMetricMeta"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we usually try to make data-test-subj unique on the page, so maybe we should do something like serverMetricMeta-${formatMetricId(metric)} as was done in EuiCard above?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had to go with a single data-test-subj for the MetricCardFooter component, in the same way as we use the static "serverMetric" for the flex items.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we usually try to make data-test-subj unique on the page

It depends, we usually do when we need to be able to select each node individually from FTR tests. Some elements have shared testSubj id depending on the needs (e.g find all rows from a table).

So it's probably fine. But just to understand:

I had to go with a single data-test-subj for the MetricCardFooter component

Why were you forced to do it? Can't we go with serverMetric-${formatMetricId(metric)}-footer or something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't so much forced as wanting to KISS the functional test and use testSubjects.findAll.

*/
export const MetricTile: FunctionComponent<{ metric: Metric }> = ({ metric }) => {
const { name } = metric;
if (name === 'Delay') {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'd probably go with a switch here for readability, falling back to a default

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would even create sub-components for each of the cases and just have this one behaves as a dispatcher.

export const MetricTile: FunctionComponent<{ metric: Metric }> = ({ metric }) => {
  switch(metric.name) {
     case 'Delay': return <DelayMetricTile ... />
     case 'Load': return <LoadMetricTile ... />
     // ....
   }

}

That's not mandatory though, but probably more readable and isolated.

Comment on lines +71 to +74
it('correctly displays a metric with metadata', () => {
const component = shallow(<MetricTile metric={metricWithMeta} />);
expect(component).toMatchSnapshot();
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(thinking out loud, unrelated to the PR, don't mind me) snapshot testing was without any doubt the worse thing React did bring to the javascript ecosystem. This thing is a plague.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This thing is a plague.

Ditto to that.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💯

export const MetricTile: FunctionComponent<{ metric: Metric }> = ({ metric }) => {
const { name } = metric;
export const MetricCardFooter: FunctionComponent<{
testSubjectName: string;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought it wasn't possible to spread such keys, but in case someone else though so, TIL:

const foo = { 'data-test-subj': 'bar' };
const { 'data-test-subj': dataTestSub } = foo;

*/
export const MetricTile: FunctionComponent<{ metric: Metric }> = ({ metric }) => {
const { name } = metric;
if (name === 'Delay') {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I would even create sub-components for each of the cases and just have this one behaves as a dispatcher.

export const MetricTile: FunctionComponent<{ metric: Metric }> = ({ metric }) => {
  switch(metric.name) {
     case 'Delay': return <DelayMetricTile ... />
     case 'Load': return <LoadMetricTile ... />
     // ....
   }

}

That's not mandatory though, but probably more readable and isolated.

footer={
metric.meta && (
<MetricCardFooter
testSubjectName="serverMetricMeta"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we usually try to make data-test-subj unique on the page

It depends, we usually do when we need to be able to select each node individually from FTR tests. Some elements have shared testSubj id depending on the needs (e.g find all rows from a table).

So it's probably fine. But just to understand:

I had to go with a single data-test-subj for the MetricCardFooter component

Why were you forced to do it? Can't we go with serverMetric-${formatMetricId(metric)}-footer or something?

metric.meta && (
<MetricCardFooter
testSubjectName="serverMetricMeta"
title={formatDelayFooterTitle(metric.meta.value as number[], metric.meta.type)}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Value is value: number | number[];, so it has to be force-casted.

I think we should really get rid of this formatMetrics function that converts a properly typed object from the server to this untyped Metric array. It maybe made sense previously as we wanted to have a generic component to rule them all handle all metrics in the same way, but now that we have specific display and component depending on the metric, I think formatMetrics is just making things worse.

That's quite a lot of additional changes though, so we probably don't want to do this in current PR.

name: i18n.translate('core.statusPage.metricsTiles.columns.resTimeAvgHeader', {
defaultMessage: 'Response time avg',
name: i18n.translate('core.statusPage.metricsTiles.columns.processDelayHeader', {
defaultMessage: 'Delay',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just forgot to add them, likely.

@TinaHeiligers
Copy link
Contributor Author

Why were you forced to do it? Can't we go with serverMetric-${formatMetricId(metric)}-footer or something?

Yes, we can go with the unique test-subj but I just wanted to keep it simple and use findAll for that.

@kibana-ci
Copy link
Collaborator

💚 Build Succeeded

Metrics [docs]

Page load bundle

Size of the bundles that are downloaded on every page load. Target size is below 100kb

id before after diff
core 275.7KB 277.6KB +1.9KB

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

Copy link
Member

@lukeelmers lukeelmers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updates LGTM!

@TinaHeiligers TinaHeiligers merged commit ebafec2 into elastic:main Dec 15, 2021
@TinaHeiligers TinaHeiligers deleted the status-page/event_loop_delay_percentiles branch December 15, 2021 22:08
@TinaHeiligers TinaHeiligers added the auto-backport Deprecated - use backport:version if exact versions are needed label Dec 15, 2021
kibanamachine added a commit to kibanamachine/kibana that referenced this pull request Dec 15, 2021
@kibanamachine
Copy link
Contributor

💔 Backport failed

Status Branch Result
8.0
7.17 The branch "7.17" is invalid or doesn't exist

Successful backport PRs will be merged automatically after passing CI.

To backport manually run:
node scripts/backport --pr 121052

kibanamachine added a commit that referenced this pull request Dec 15, 2021
Co-authored-by: Kibana Machine <[email protected]>

Co-authored-by: Christiane (Tina) Heiligers <[email protected]>
TinaHeiligers added a commit to TinaHeiligers/kibana that referenced this pull request Dec 16, 2021
TinaHeiligers added a commit that referenced this pull request Dec 16, 2021
TinLe pushed a commit to TinLe/kibana that referenced this pull request Dec 22, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport Deprecated - use backport:version if exact versions are needed Feature:StatusPage Issues related to the Kibana Status Page and APIs release_note:fix Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v7.17.0 v8.0.0 v8.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Display event loop delay histogram data in the Status page
6 participants