Add detailed docker network summary stats #25354

fearful-symmetry · 2021-04-27T21:47:20Z

What does this PR do?

This PR creates a new metricset, docker/network_summary. It uses network namespaces to fetch per-namespace network stats for a given container, similar to the counters in system/network_summary.

I also tried to achieve some level of parity with system/network_summary, as it contains many of the same fields. As a result of this, we're using the same fields and same mapping methods, which have been moved to libbeat.

Why is it important?

A few users have been requesting data at this level, but associated with a given cgroup/container.

Checklist

My code follows the style guidelines of this project
I have commented my code, particularly in hard-to-understand areas
I have made corresponding changes to the documentation
I have made corresponding change to the default configuration files
I have added tests that prove my fix is effective or that my feature works
I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

Pull down the PR and build on a Linux system with Docker.
Enable docker/network_summary in the module options.
With some Docker containers running, run this metricset and check that docker.network_summary is reporting and that its fields are populated.

mergify · 2021-04-27T21:47:41Z

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b docker-network-summary upstream/docker-network-summary
git merge upstream/master
git push upstream docker-network-summary


elasticmachine · 2021-04-27T22:18:36Z

💚 Build Succeeded


Build stats

Build Cause: jsoriano commented: /test
Start Time: 2021-05-06T14:34:51.490+0000
Duration: 93 min 28 sec
Commit: b989478

Trends 🧪

❕ Flaky test report

No test was executed to be analysed.


metricbeat/module/docker/network/data.go

metricbeat/module/docker/_meta/config.yml

metricbeat/module/docker/network/helper.go

jsoriano

Great to see this added, this is going to give more consistency when monitoring at the host and the container level.

I think we are going to need to do this as a new metricset as we did in the system module. As in there, this network metricset reports events per interface, and these new counters are per container. This new metricset could be disabled by default, and then we wouldn't need the new configuration flag.

I also wonder if we really need to sum all the counters of all the processes, see my comment about this.

> This PR adds detailed network stats to the docker/memory metricset.

By the way, this first sentence of the description confused me a little bit 😅

jsoriano · 2021-04-29T10:08:37Z

metricbeat/module/docker/network/helper.go

+			return &sysinfotypes.NetworkCountersInfo{}, errors.Wrapf(err, "error fetching network counters for PID %d", intPID)
+		}
+
+		summedMetrics = sumCounter(summedMetrics, counters)


If I understand this correctly, I wonder whether we are summing the same data several times here. I think all processes in the same network namespace report the same network counters.

For example if I start two listening netcats in a container, I see the two sockets under both pids:

$ for pid in $(pidof nc);do echo $pid; cat /proc/$pid/net/tcp; done
3119046
sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode
0: 00000000:1F9B 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 36489370 1 0000000000000000 100 0 0 10 0
1: 00000000:1F9C 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 36489816 1 0000000000000000 100 0 0 10 0
3118539
sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode
0: 00000000:1F9B 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 36489370 1 0000000000000000 100 0 0 10 0
1: 00000000:1F9C 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 36489816 1 0000000000000000 100 0 0 10 0

I also see them if I start a cat inside the container:

$ for pid in $(pidof cat);do echo $pid; cat /proc/$pid/net/tcp; done
3115294
sl local_address rem_address st tx_queue rx_queue tr tm->when retrnsmt uid timeout inode
0: 00000000:1F9B 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 36489370 1 0000000000000000 100 0 0 10 0
1: 00000000:1F9C 00000000:0000 0A 00000000:00000000 00:00000000 00000000 0 0 36489816 1 0000000000000000 100 0 0 10 0

This also happens with other files such as /proc/$pid/net/netstat or /proc/$pid/net/snmp: the content is the same for all processes in the same namespace.

I'm not sure about the API call, but docker container top lists all the processes in the container, so if we loop over all of them we may be summing the same information several times.

I think we only need to report the counters of one PID in the container. Or, to be more precise, given that a container can hold processes in multiple network namespaces (the dind case, for example), we would need to sum the counters once per namespace found, that is, once per distinct /proc/<pid>/ns/net.

metricbeat/module/docker/network/_meta/data.json

fearful-symmetry · 2021-04-29T16:07:33Z

@jsoriano No idea how I overlooked the namespacing, you're definitely right, the counters share a namespace.

I was sort of going back and forth on whether I should create a new metricset for this. This didn't seem quite "useful" enough on its own, but you might be right. In general, most containers only have one interface, so I wasn't too worried about the awkwardness of reporting this alongside the interface stats, but it also seems sub-optimal.

fearful-symmetry · 2021-04-29T19:24:53Z

Alright, I completely refactored this. It's now in its own metricset, and I also removed a lot of the summing logic--we shouldn't have multiple network namespaces per container I think, so we use Inspect to grab a PID and report the network counters.

jsoriano

Thanks for the refactor, it looks great. Added some small suggestions, after they are addressed I think this can be merged.

> we shouldn't have multiple network namespaces per container

This is actually possible in dind (docker in docker) and some other fancier cases, so we might actually sum counters of different namespaces found inside the same container. But I don't think we need to cover these cases, and the system module probably doesn't cover multiple namespaces in the host either. So the current approach looks good to me.

This makes me think that maybe we could have a linux/network_namespace metricset that collects all these metrics per namespace from the host.

metricbeat/module/docker/network_summary/network_summary.go

metricbeat/module/docker/network_summary/network_summary_test.go

jsoriano

👍


elasticmachine · 2021-05-03T22:02:17Z

Pinging @elastic/integrations (Team:Integrations)

fearful-symmetry · 2021-05-03T22:02:29Z

/test


jsoriano · 2021-05-06T08:42:14Z

/package

jsoriano · 2021-05-06T08:48:44Z

/packaging

jsoriano · 2021-05-06T14:34:31Z

/test

jsoriano · 2021-05-06T16:28:33Z

@fearful-symmetry current failure is not related, I think you can merge this.

* add detailed network summary stats to docker/memory
* add changelog
* update deps, notice
* update ref yml
* fix test
* move to new metricset, refactor
* revert changes to docker/network
* remove config file
* update xpack
* small fixes

(cherry picked from commit f7e80da)

…25601)
* Add detailed docker network summary stats (#25354)
* add detailed network summary stats to docker/memory
* add changelog
* update deps, notice
* update ref yml
* fix test
* move to new metricset, refactor
* revert changes to docker/network
* remove config file
* update xpack
* small fixes

(cherry picked from commit f7e80da)
* update docs

add detailed network summary stats to docker/memory

67d2e66

fearful-symmetry added enhancement v7.13.0 labels Apr 27, 2021

fearful-symmetry requested a review from a team April 27, 2021 21:47

fearful-symmetry self-assigned this Apr 27, 2021

botelastic bot added the needs_team label Apr 27, 2021

fearful-symmetry added 2 commits April 27, 2021 14:53

Merge remote-tracking branch 'upstream/master' into docker-network-su…

1087627

…mmary

add changelog

11d3048

fearful-symmetry added 4 commits April 28, 2021 08:28

update deps, notice

9ab1223

update ref yml

15111b1

fix test

fbe6b79

Merge remote-tracking branch 'upstream/master' into docker-network-su…

dec7fa8

…mmary

kaiyan-sheng reviewed Apr 29, 2021

metricbeat/module/docker/network/data.go (outdated)

kaiyan-sheng reviewed Apr 29, 2021

metricbeat/module/docker/_meta/config.yml (outdated)

kaiyan-sheng reviewed Apr 29, 2021

metricbeat/module/docker/network/helper.go (outdated)

kaiyan-sheng reviewed Apr 29, 2021

metricbeat/module/docker/network/helper.go (outdated)

jsoriano requested changes Apr 29, 2021

move to new metricset, refactor

022b2d4

fearful-symmetry changed the title ~~Add detailed network summary stats to docker/memory~~ Add detailed docker network summary stats Apr 29, 2021

fearful-symmetry added 2 commits April 29, 2021 12:20

revert changes to docker/network

71cf0a7

remove config file

69450b1

fearful-symmetry requested review from jsoriano and kaiyan-sheng April 29, 2021 19:24

update xpack

3f9eb28

jsoriano reviewed Apr 30, 2021


small fixes

c4f31e5

fearful-symmetry requested a review from jsoriano April 30, 2021 17:53

jsoriano approved these changes Apr 30, 2021


fearful-symmetry added 2 commits May 3, 2021 07:56

Merge remote-tracking branch 'upstream/master' into docker-network-su…

408e1db

…mmary

Merge remote-tracking branch 'upstream/master' into docker-network-su…

9fa65e6

…mmary

fearful-symmetry added the Team:Integrations label May 3, 2021

botelastic bot removed the needs_team label May 3, 2021

fearful-symmetry added the Metricbeat label May 3, 2021

fearful-symmetry added 2 commits May 3, 2021 15:46

Merge remote-tracking branch 'upstream/master' into docker-network-su…

6af9f82

…mmary

Merge remote-tracking branch 'upstream/master' into docker-network-su…

b989478

…mmary

jsoriano mentioned this pull request May 6, 2021

[Don't merge] Trying to reproduce failure in CI #25577

Closed

fearful-symmetry merged commit f7e80da into elastic:master May 6, 2021

fearful-symmetry mentioned this pull request May 6, 2021

Cherry-pick #25354 to 7.x: Add detailed docker network summary stats #25601

Merged


fearful-symmetry added the v7.14.0 label May 6, 2021

fearful-symmetry mentioned this pull request May 10, 2021

[System] Make network_summary namespace/container aware elastic/integrations#605

Closed
