Improve collection of Docker memory metrics in Windows #11971

Closed
chrisvanderpennen opened this issue Apr 29, 2019 · 9 comments
Labels
containers, enhancement, Metricbeat, :Windows

Comments

@chrisvanderpennen

Describe the enhancement:

Describe a specific use case for the enhancement or feature:
Monitoring Windows container memory usage

@exekias added the :Windows, containers, enhancement, and Metricbeat labels on Apr 29, 2019
@exekias
Contributor

exekias commented Apr 29, 2019

Pinging @narph, I think you wanted to give this module a pass on Windows.

@chrisvanderpennen
Author

Looks like this has been affected by #11676 - Windows memory stats don't include limit, so that change will prevent memory stat collection altogether on Windows.

@exekias
Contributor

exekias commented May 2, 2019

Good catch! Thank you for sharing. @fearful-symmetry, we may need to rethink your fix.

@fearful-symmetry
Contributor

@exekias Yeah. I tried to find a more elegant way of fixing the NaN issues on Linux, but settled on checking the limit, since the moby source states that it is never 0 for a running container. The core problem is that it's possible to get a status object back before a container has started, hence a bunch of junk values and the NaN. Ideally we would find a better way of detecting the invalid status objects, so that we don't interfere with valid Windows events.

@fearful-symmetry
Contributor

Still chewing over how to do this elegantly. After poking around the Docker API for a bit, I figure there are two other checks we can get from the stats object that should be platform independent:

OnlineCPUs : count of online CPUs. Is zero during an invalid event.
CPUusage.TotalUsage : Used to calculate deltas for CPU usage percentage. Also zero for an invalid event.

Does any of this sound reasonable, @chrisvanderpennen?

@chrisvanderpennen
Author

OnlineCPUs is Linux only, sadly. A zero CPUusage.TotalUsage seems a reasonable guard for an invalid stats sample, though. I don't see a way for a container in normal operation to have used <1ns of CPU time, and my testing on Windows shows it resets to 0 after a running container is stopped.

In case it's useful, here's a gist with JSON from the stats API on Windows for a started and a stopped container.
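The guard discussed here could look roughly like the Go sketch below. The types `cpuUsage` and `statsSample` and the function `isValidSample` are hypothetical, simplified stand-ins, not the real Docker API types or the code that ended up in Metricbeat. The only idea taken from the thread is that a zero `TotalUsage` marks a sample as invalid.

```go
package main

import "fmt"

// cpuUsage is a hypothetical stand-in holding only the field named in
// this thread.
type cpuUsage struct {
	TotalUsage uint64 // cumulative CPU time consumed by the container
}

// statsSample is a hypothetical, trimmed-down stats sample.
type statsSample struct {
	CPUUsage    cpuUsage
	MemoryLimit uint64 // zero when the platform reports no limit
}

// isValidSample treats a sample as valid only if the container has
// consumed some CPU time. It deliberately ignores MemoryLimit, so a
// sample without a limit is still accepted.
func isValidSample(s statsSample) bool {
	return s.CPUUsage.TotalUsage != 0
}

func main() {
	stopped := statsSample{}                                       // everything zeroed
	noLimit := statsSample{CPUUsage: cpuUsage{TotalUsage: 123456}} // running, no limit

	fmt.Println(isValidSample(stopped)) // false
	fmt.Println(isValidSample(noLimit)) // true
}
```

Keying the check on CPU time rather than on the memory limit is what lets samples without a limit through while still dropping the zeroed ones.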

@fearful-symmetry
Contributor

@chrisvanderpennen Thanks a ton! I'm gonna put in a PR for this, so we can at least get this part of the issue squared away.

@fearful-symmetry
Contributor

That PR has been merged, so we should be good to move on to adding the actual Windows memory stats.

@fearful-symmetry
Contributor

That's been merged! I'm gonna close this for now, since it seems like we've squared away the problems in the original issue.
