[Unified Observability] update network fields #134471

Merged Jul 6, 2022 (12 commits)

@neptunian neptunian commented Jun 15, 2022

Resolves #131152

Swaps the non-ECS fields system.network.in.bytes and system.network.out.bytes (counters) for the ECS fields host.network.ingress.bytes and host.network.egress.bytes (gauges). The ECS fields were added in Metricbeat 7.14, so users on Kibana 8.4+ with a Metricbeat version older than 7.14 will see no data until they upgrade Metricbeat to at least 7.14.

The new ECS fields according to the docs:
The number of bytes received/sent out (gauge) on all network interfaces by the host since the last metric collection.

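The main calculation we now need is bytes per second. The gauge reports bytes since the last collection, so the rate depends on the user's metricset.period (in milliseconds, per the docs). We therefore use (avg) host.network.ingress.bytes / (metricset.period / 1000), where the division by 1000 converts the period to seconds.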
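
As a rough illustration of that formula, here is a minimal TypeScript sketch. The helper name `bytesPerSecond` is hypothetical and not part of this PR; it assumes the period is given in milliseconds, as the division by 1000 implies.

```typescript
// Hypothetical helper for illustration only; not code from this PR.
// avgBytes: average of host.network.ingress.bytes (bytes per collection period)
// periodMs: metricset.period, assumed to be in milliseconds
function bytesPerSecond(avgBytes: number, periodMs: number): number | null {
  if (!Number.isFinite(avgBytes) || !Number.isFinite(periodMs) || periodMs <= 0) {
    return null; // no usable period, so no rate can be computed
  }
  return avgBytes / (periodMs / 1000);
}

// 50000 bytes collected over a 10 s (10000 ms) period is 5000 bytes per second.
console.log(bytesPerSecond(50000, 10000)); // 5000
```
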
To confirm that the new calculation matches the results from the previous fields, I compared the two in a TSVB chart:

Screen Shot 2022-06-23 at 5 26 10 PM

The query aggs look as follows now:

  rx_avg: {
    avg: {
      field: 'host.network.ingress.bytes',
    },
  },
  rx_period: {
    filter: {
      exists: {
        field: 'host.network.ingress.bytes',
      },
    },
    aggs: {
      period: {
        max: {
          field: 'metricset.period',
        },
      },
    },
  },
  rx: {
    bucket_script: {
      buckets_path: {
        value: 'rx_avg',
        period: 'rx_period>period',
      },
      script: {
        source: 'params.value / (params.period / 1000)',
        lang: 'painless',
      },
      gap_policy: 'skip',
    },
  },
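
The same three-aggregation shape applies to outbound traffic with host.network.egress.bytes. The sketch below shows one way to generate both from a single function; `networkRateAggs` and the `tx` id are hypothetical names used for illustration, not necessarily how the PR structures its code.

```typescript
type Direction = 'ingress' | 'egress';

// Hypothetical builder for illustration only; not code from this PR.
// Produces the avg, period filter, and bucket_script aggregations for one direction.
function networkRateAggs(id: string, direction: Direction): Record<string, unknown> {
  const field = `host.network.${direction}.bytes`;
  return {
    [`${id}_avg`]: { avg: { field } },
    [`${id}_period`]: {
      // Only consider documents that actually carry the network field.
      filter: { exists: { field } },
      aggs: { period: { max: { field: 'metricset.period' } } },
    },
    [id]: {
      bucket_script: {
        buckets_path: { value: `${id}_avg`, period: `${id}_period>period` },
        script: { source: 'params.value / (params.period / 1000)', lang: 'painless' },
        gap_policy: 'skip',
      },
    },
  };
}

const aggs = { ...networkRateAggs('rx', 'ingress'), ...networkRateAggs('tx', 'egress') };
console.log(Object.keys(aggs));
```

Calling the builder with `'rx'` and `'ingress'` reproduces the object shown above.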

I also had to fix an existing bug where we only got the default of 10 interfaces. I found this while using my Mac, which has many more interfaces, so I wasn't getting correct data in any of the places where we report rx/tx. Note that in the TSVB chart I changed "Top" to 100 for the system.network.in.bytes vis so that the charts match. We won't need to worry about this with the new ECS fields because they return the value for "all network interfaces".
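
To see why a cap of 10 groups under-reports traffic on a host with more interfaces, consider this simplified TypeScript sketch. The interface names and byte values are made up, and `sumTopInterfaces` is a hypothetical stand-in for a grouped aggregation that keeps at most `size` groups (here simply the largest ones, which is a simplification).

```typescript
// Made-up data: 15 interfaces, each reporting 100 bytes.
const perInterface: Record<string, number> = {};
for (let i = 0; i < 15; i++) {
  perInterface[`if${i}`] = 100;
}

// Hypothetical stand-in for a grouped aggregation limited to `size` groups.
function sumTopInterfaces(values: Record<string, number>, size: number): number {
  return Object.values(values)
    .sort((a, b) => b - a)
    .slice(0, size)
    .reduce((sum, v) => sum + v, 0);
}

console.log(sumTopInterfaces(perInterface, 10));  // 1000: five interfaces are dropped
console.log(sumTopInterfaces(perInterface, 100)); // 1500: all interfaces are counted
```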

Docs will be updated in another PR.

The following places are affected:
Metrics: "Outbound traffic" and "Inbound traffic" in the overlay, and the Network vis in the popover:
Screen Shot 2022-06-15 at 11 07 33 AM

Metrics: Inventory panels and vis:
Screen Shot 2022-06-15 at 11 10 40 AM
Screen Shot 2022-06-15 at 11 10 45 AM

Observability Overview: RX/TX
Screen Shot 2022-06-15 at 11 11 45 AM

Inventory Threshold Rule for Inbound/Outbound traffic
Screen Shot 2022-06-15 at 2 28 32 PM

@neptunian neptunian added the release_note:fix, Team:Infra Monitoring UI - DEPRECATED, and Team:Unified observability labels Jun 15, 2022
@neptunian neptunian marked this pull request as ready for review June 15, 2022 15:26
@neptunian neptunian requested a review from a team as a code owner June 15, 2022 15:26
@elasticmachine

Pinging @elastic/infra-monitoring-ui (Team:Infra Monitoring UI)

@elasticmachine

Pinging @elastic/unified-observability (Team:Unified observability)

@miltonhultgren miltonhultgren left a comment

LGTM 👍🏼

    },
  ],
- split_mode: 'terms',
- terms_field: 'system.network.name',
+ split_mode: 'everything',

What does this mean?

@neptunian neptunian Jun 23, 2022

It basically means groupBy. When using the system.network fields, we grouped by system.network.name, which is the network interface name. We no longer do that because the new field (host.network) returns the sum across all interfaces, and it has no per-interface equivalent. Generated by this UI:
Screen Shot 2022-06-23 at 5 12 07 PM
Screen Shot 2022-06-23 at 5 12 12 PM

@crespocarlos crespocarlos left a comment

Cool stuff. Just added a couple of comments

I think we need to update the files in x-pack/test/functional/es_archives/infra/8.0.0/hosts_only so that the tests can reflect these changes

@neptunian neptunian requested a review from simianhacker June 24, 2022 19:16
@matschaffer

@elasticmachine merge upstream

@simianhacker simianhacker left a comment

There are two spots where I think we should revert the test changes. The first is the Metrics Explorer test that confirms the "rate" aggregation uses the correct backing aggregations for Elasticsearch. Both tests used the field name system.network.in.bytes, BUT they are not necessarily tied to that metric. We should change the field in the tests from system.network.in.bytes to counter so it isn't tied to a specific metric.

@neptunian neptunian force-pushed the 131152-new-network-fields branch from 7091fce to 6c755ed Compare July 5, 2022 17:40
@neptunian

@elasticmachine merge upstream

@neptunian

@elasticmachine merge upstream

@kibana-ci

💛 Build succeeded, but was flaky

Test Failures

  • [job] [logs] FTR Configs #38 / endpoint When on the Endpoint Policy Details Page "after all" hook in "When on the Endpoint Policy Details Page"
  • [job] [logs] FTR Configs #38 / endpoint When on the Endpoint Policy Details Page "before all" hook in "When on the Endpoint Policy Details Page"

Metrics [docs]

Async chunks

Total size of all lazy-loaded chunks that will be downloaded as the user navigates the app

id     before    after     diff
infra  1019.6KB  1020.5KB  +947.0B

@simianhacker simianhacker self-requested a review July 5, 2022 20:43

@simianhacker simianhacker left a comment

LGTM! Thanks for reverting the tests.

Labels
backport:skip, release_note:fix, Team:Infra Monitoring UI - DEPRECATED, Team:Unified observability, v8.4.0
Development

Successfully merging this pull request may close these issues.

[Unified Observability] Metrics overview table is not showing RX or TX metrics although they're available