Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Collect new alerting telemetry data from Kibana index #139901

Merged
merged 14 commits into from
Sep 19, 2022

Conversation

ersin-erdal
Copy link
Contributor

@ersin-erdal ersin-erdal commented Sep 1, 2022

Resolves #138996

This PR covers only the telemetry data that we can aggregate from the Kibana index (numbers 1-6 in the issue). The telemetry data from event-log index (numbers 7-9) will be covered by another follow up PR.

To Verify:

  • Create some rules with tags, some without
  • Use different notify when options (Every time alert is active, On status change etc)
  • Snooze one of the rules
  • Mute (AKA infinitive snooze) one of the rules
  • Mute one of the alerts

Change telemetry task run interval to something very short time (e.g. 1 min)

return moment().add(1, 'd').startOf('d').toDate();

Expect to see below data on https://localhost:5601/api/stats?extended=true&legacy=true
count_rules_by_execution_status,
count_rules_with_tags,
count_rules_by_notify_when,
count_rules_snoozed,
count_rules_muted,
count_rules_with_muted_alerts

…execution_status, count_rules_with_tags, count_rules_by_notify_when, count_rules_snoozed, count_rules_muted, count_rules_with_muted_alerts
@ersin-erdal ersin-erdal added Feature:Telemetry release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.5.0 labels Sep 1, 2022
count_rules_snoozed: { type: 'long' },
count_rules_muted: { type: 'long' },
count_rules_with_muted_alerts: { type: 'long' },
count_connector_types_by_consumers: { DYNAMIC_KEY: { DYNAMIC_KEY: { type: 'long' } } },
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure about this DYNAMIC_KEY usage...

@ersin-erdal ersin-erdal marked this pull request as ready for review September 5, 2022 13:01
@ersin-erdal ersin-erdal requested review from a team as code owners September 5, 2022 13:01
@elasticmachine
Copy link
Contributor

Pinging @elastic/response-ops (Team:ResponseOps)

@ersin-erdal ersin-erdal self-assigned this Sep 6, 2022
Copy link
Contributor

@ymao1 ymao1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified new telemetry is being generated as expected. Nice job! Left a question about whether we should be explicitly mapping all the possible connector types.

},
"count_connector_types_by_consumers": {
"properties": {
"DYNAMIC_KEY": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is going to look for a field DYNAMIC_KEY and apply the mapping to that, so the mapping will not be applied to data like siem.__email or alerts.__email. With other items that we break down by rule type id, we have to explicitly map each possible value.

However, since we plan to copy these to the custom response ops index via the indexer, maybe having this mapping doesn't matter as much anymore? WDYT @mikecote

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From what I recall, the introduction of DYNAMIC_KEY was to allow telemetry to be indexed without failures (using strict mappings?). I wonder if this has changed now where they just store the data within the _source unmapped.

@elastic/kibana-core can you weigh in on whether or not we still need to use DYNAMIC_KEY to make such data accessible in our custom telemetry indexer?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should ping @elastic/kibana-core one more time :)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the ping! I think there is some confusion between the schema and the mappings:

Kibana describes the schema as a way to declare a contract of the format of the document we ship. This may or may not result in a direct mapping approach.


Re DYNAMIC_KEY, it was introduced to allow Kibana developers to specify tree-shaped structures where the keys are dynamic/arbitrarily built. Whether we use flattened type, use dynamic templates or add the known keys to the mappings, it's up to the analysis end to decide. The important thing is that, from the Kibana POV, it's sending an object with an arbitrary set of keys.


Re the keys including a dot inside, that's a good point, @ymao1. We allow shipping objects like:

{
  "siem.__prop1": "some value",
  "siem.__prop2": "some other value"
}

Technically speaking, it adheres to the DYNAMIC_KEY contract, although I agree that indexing that into ES might end up as:

{
  "siem": {
    "__prop1": "some value",
    "__prop2": "some other value"
  }
}

FWIW, I don't think it's such an issue:

  • If we declare count_connector_types_by_consumers as type: "flattened", it won't break.
  • If we declare dynamic_templates with match_path: "count_connector_types_by_consumers.*, it won't break.
  • If we explicitly declare their mappings, it won't break either.

The only potential error would be if we also report the "siem" key on its own. Do we?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only potential error would be if we also report the "siem" key on its own. Do we?

No, this is not possible.

And thank you for clarifying DYNAMIC_KEY usage :)

Copy link
Contributor

@ymao1 ymao1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@afharo afharo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@mikecote mikecote left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes LGTM! Tested locally and saw the telemetry reflect the state of my rules.

@kibana-ci
Copy link
Collaborator

💛 Build succeeded, but was flaky

Failed CI Steps

Test Failures

  • [job] [logs] FTR Configs #50 / apis UnifiedFieldList field stats apis field distribution should return examples for non-aggregatable fields

Metrics [docs]

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

cc @ersin-erdal

@ersin-erdal ersin-erdal merged commit e90ab44 into elastic:main Sep 19, 2022
@kibanamachine kibanamachine added the backport:skip This commit does not require backporting label Sep 19, 2022
@ersin-erdal ersin-erdal deleted the 138996-telemetry branch September 19, 2022 09:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport:skip This commit does not require backporting Feature:Telemetry release_note:skip Skip the PR/issue when compiling release notes Team:ResponseOps Label for the ResponseOps team (formerly the Cases and Alerting teams) v8.5.0
Projects
No open projects
Development

Successfully merging this pull request may close these issues.

Alerting telemetry to add for 8.5
8 participants