Add new telemetry data from event-log index. #140943

Merged
merged 17 commits into elastic:main from 138996-telemetry-event-log on Sep 20, 2022

Conversation

@ersin-erdal ersin-erdal commented Sep 19, 2022

Resolves #138996

This PR is a follow-up to #139901 and covers the rest of the requested telemetry data (numbers 7-9 in the issue) by aggregating data from the event-log index.

To Verify:

  • Create a couple of rules with always-triggering conditions (e.g. an index-threshold rule that checks whether the doc count is above 0).
  • Add at least one action to each rule.

Change the alerting telemetry task run interval to something very short (e.g. 1 minute) by editing this line:

return moment().add(1, 'd').startOf('d').toDate();

Do the same for the actions telemetry task:

return moment().add(1, 'd').startOf('d').toDate();
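
As a minimal sketch of that temporary change (it mirrors the diff discussed in the review below and must be reverted before merging), getNextMidnight() can be switched from days to minutes:

function getNextMidnight() {
  // temporary, for local verification only: schedule the telemetry task one minute out
  return moment().add(1, 'm').startOf('m').toDate();
}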

Check the following fields at https://localhost:5601/api/stats?extended=true&legacy=true:
count_rules_by_execution_status_per_day,
count_connector_types_by_action_run_outcome_per_day,
avg_actions_run_duration_by_connector_type_per_day,
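
For example, against a locally running Kibana (the credentials are placeholders for whatever your dev setup uses):

curl -sk -u <username>:<password> "https://localhost:5601/api/stats?extended=true&legacy=true"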

Edit: Realised that avg_actions_run_duration_by_connector_type_per_day is already in telemetry under the name avg_execution_time_by_type_per_day, so I removed that field and its tests.

count_rules_by_execution_status_per_day,
count_connector_types_by_action_run_outcome_per_day,
avg_actions_run_duration_by_connector_type
@ersin-erdal ersin-erdal added the Feature:Telemetry, release_note:skip, Team:ResponseOps, and v8.5.0 labels Sep 19, 2022
@ersin-erdal ersin-erdal marked this pull request as ready for review September 19, 2022 14:18
@ersin-erdal ersin-erdal requested review from a team as code owners September 19, 2022 14:18
@elasticmachine (Contributor)

Pinging @elastic/response-ops (Team:ResponseOps)

@ersin-erdal ersin-erdal self-assigned this Sep 19, 2022
@mikecote mikecote self-requested a review September 19, 2022 16:03
@afharo (Member) left a comment

Telemetry schema changes LGTM

@@ -140,5 +144,5 @@ export function telemetryTaskRunner(
 }

 function getNextMidnight() {
-  return moment().add(1, 'd').startOf('d').toDate();
+  return moment().add(1, 'm').startOf('m').toDate();
Contributor

Should revert this :)

Contributor Author

haha, fixed. Thanks for catching it.

Contributor Author

> For the actions telemetry, I'm seeing both the connector type id and the rule type id in the results. For example, I currently have a rule with a server log action. The telemetry looks like:
>
> avg_actions_run_duration_by_connector_type_per_day: {
>   __server-log: 814815,
>   example.always-firing: 814815
> },
> count_connector_types_by_action_run_outcome_per_day: {
>   __server-log: {
>     success: 54
>   },
>   example.always-firing: {
>     success: 54
>   }
> },
>
> I think example.always-firing shouldn't be there. Is that right?

Is this from the integration tests or from an instance you ran locally?
I also saw this in the integration tests and thought that somehow the rule id was being used for a test connector, because it is not happening when I run and test Kibana locally.

Contributor

I saw this when running locally with an example.always-firing rule that has a server log connector.

Contributor Author

Solved with b9c8252. I forgot to filter saved objects by type...

@ymao1 (Contributor) left a comment

For the actions telemetry, I'm seeing both the connector type id and the rule type id in the results. For example, I currently have a rule with a server log action. The telemetry looks like:

avg_actions_run_duration_by_connector_type_per_day: {
  __server-log: 814815,
  example.always-firing: 814815
},
count_connector_types_by_action_run_outcome_per_day: {
  __server-log: {
    success: 54
  },
  example.always-firing: {
    success: 54
  }
},

I think example.always-firing shouldn't be there. Is that right?

@mikecote (Contributor) left a comment

I noticed the same issue that @ymao1 reported (#140943 (review)). Once that is fixed, PR LGTM 👍 Tested locally for other issues and didn't uncover any.

@@ -754,6 +784,16 @@ Object {
     __slack: 7,
   },
   countTotal: 120,
+  countRunOutcomeByConnectorType: {
Contributor

I believe the paradigm we follow in other telemetry objects is to have the rule type id or connector type id last. That way, we could search by countRunOutcomeByConnectorType.failed.* and get all the different connector types that have failed, instead of having to look inside each connector type object. WDYT of switching this around to be similar?
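
For illustration only (the counts and the failure entries are made up; the field and connector type names are reused from the examples above), the two shapes being compared look like this:

// current shape: connector type id first, outcome last
countRunOutcomeByConnectorType: {
  __server-log: { success: 54, failure: 2 },
  __slack: { success: 7, failure: 1 },
}

// suggested alternative: outcome first, connector type id last, so a query on
// countRunOutcomeByConnectorType.failure.* lists every connector type that failed
countRunOutcomeByConnectorType: {
  success: { __server-log: 54, __slack: 7 },
  failure: { __server-log: 2, __slack: 1 },
}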

Contributor Author

But the request is "We'd like to know which connector types are failing the most relative to their successful runs."
Wouldn't it be more difficult to get a connector's success/failure ratio?

Contributor

Hmm...I'm not sure either of these options will make calculating the ratio per rule type easier :). We can leave it as is.

Contributor Author

My idea was getting it like:

count_connector_types_by_action_run_outcome_per_day.__slack.failure / count_connector_types_by_action_run_outcome_per_day.__slack.success

but

count_connector_types_by_action_run_outcome_per_day.failure.__slack / count_connector_types_by_action_run_outcome_per_day.success.__slack

would also do the same thing... IDK, I can change it :)
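
Either way, the per-connector ratio is a one-line lookup once the object is in hand; a rough sketch assuming the two shapes sketched above (illustrative only, not code from this PR):

// shape A: { [connectorTypeId]: { success, failure } }
const ratioA = (stats, id) => stats[id].failure / stats[id].success;

// shape B: { success: { [connectorTypeId]: count }, failure: { [connectorTypeId]: count } }
const ratioB = (stats, id) => stats.failure[id] / stats.success[id];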

@ymao1 (Contributor) left a comment

LGTM

@kibana-ci (Collaborator)

💚 Build Succeeded

Metrics [docs]: ✅ unchanged

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @ersin-erdal

@ersin-erdal ersin-erdal merged commit 17a25b8 into elastic:main Sep 20, 2022
@kibanamachine kibanamachine added the backport:skip label Sep 20, 2022
@ersin-erdal ersin-erdal deleted the 138996-telemetry-event-log branch September 20, 2022 21:43