missing memory metrics #10697

Closed
MorphBonehunter opened this issue Jun 3, 2021 · 8 comments

Comments

@MorphBonehunter

Nomad version

Nomad v1.1.0 (2678c36)

Operating system and Environment details

Arch Linux

Issue

A few days ago I upgraded my home cluster from 1.0.4 to 1.1.0.
Today I realized that some metrics have been missing since then.
In particular, nomad.client.allocs.memory.rss isn't present anymore.
It doesn't matter whether I request the "nomad" format or "prometheus":

~ $ curl -s http://charon.underverse.net:4646/v1/metrics?format=prometheus  |grep rss
daniel@charon ~ $ curl -s http://charon.underverse.net:4646/v1/metrics |grep rss
~ $
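
A plain `grep rss` will also miss the other memory gauges. As a rough sketch, the filter below keeps every per-allocation memory sample from the Prometheus output and skips the `# HELP` / `# TYPE` comment lines. It assumes the Prometheus names use underscores in place of the dots (`nomad_client_allocs_memory_...`); the function name `nomad_mem_metrics` is made up for illustration, and the address in the usage comment is a placeholder default.

```shell
# Hypothetical helper: read Prometheus-format metrics on stdin and print
# only the per-allocation memory samples (assumed name prefix below).
nomad_mem_metrics() {
  # "|| true" keeps the exit status clean when nothing matches,
  # which is exactly the situation described in this issue.
  grep -E '^nomad_client_allocs_memory_' || true
}

# Usage (address is a placeholder; point it at your own client):
#   curl -s "${NOMAD_ADDR:-http://127.0.0.1:4646}/v1/metrics?format=prometheus" | nomad_mem_metrics
```

If this prints other memory samples but nothing ending in `_rss`, only the RSS gauge is missing rather than the whole memory group.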

This can also be seen in the UI, where the memory graph only shows 0 Bytes:

[image: UI memory graph showing 0 Bytes]

But the task definitely uses memory 😄:
1af5d0839d2e minidlna-task-76395c6b-64f0-8abf-efec-dacbf98f070f 0.01% 21.84MiB / 128MiB 17.07% 0B / 0B 19.8MB / 8.19kB 1
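
To cross-check what Docker itself reports without the extra columns, `docker stats` accepts a Go-template `--format`. The sketch below pulls the usage part out of a `MemUsage` value such as `21.84MiB / 128MiB`; the helper name `mem_usage_field` is made up for illustration.

```shell
# Hypothetical helper: given a "USAGE / LIMIT" string as printed in the
# MEM USAGE / LIMIT column, print only the usage part.
mem_usage_field() {
  # Strip everything from the first " / " onwards.
  printf '%s\n' "${1%% / *}"
}

# Usage (one line per running container):
#   docker stats --no-stream --format '{{.Name}} {{.MemUsage}}'
#   mem_usage_field '21.84MiB / 128MiB'
```

A non-zero usage here while Nomad shows 0 Bytes suggests the value gets lost between Docker and Nomad rather than in the container.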

Reproduction steps

Upgrade to nomad 1.1.0 and query metrics.

Expected Result

The memory stats are displayed.

Actual Result

At least the memory RSS stats aren't displayed.

I've restarted the whole cluster in a rolling manner, but the behavior didn't change.
Also, there is nothing in the logs.

@MorphBonehunter
Author

I've updated to 1.1.2 today.
Unfortunately the problem still exists for me.
Is this something which only impacts me or are there other reports for such behavior?
Is any additional information needed?

@pznamensky

Is this something which only impacts me or are there other reports for such behavior?

In our environment everything seems to be working fine: Centos7 + docker 20.10 + nomad 1.1.2.

@MorphBonehunter
Author

@pznamensky thanks for your feedback.
Can you please check what exact version of docker you run on your Centos7?
And also if you use cgroup v1 or v2?
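
For anyone wanting to answer the cgroup question on their own host, a minimal sketch: the filesystem type mounted at `/sys/fs/cgroup` is `cgroup2fs` on a pure cgroup v2 (unified) system and usually `tmpfs` on cgroup v1 or hybrid setups. The function name `cgroup_version_from_fstype` is made up for illustration.

```shell
# Hypothetical helper: map the filesystem type of /sys/fs/cgroup to a
# cgroup version label.
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo v2 ;;       # unified hierarchy
    tmpfs)     echo v1 ;;       # legacy or hybrid hierarchy
    *)         echo unknown ;;
  esac
}

# Usage:
#   cgroup_version_from_fstype "$(stat -fc %T /sys/fs/cgroup/)"
# Docker 20.10 and later also report what the daemon sees:
#   docker info --format '{{.CgroupVersion}}'
```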

I've checked my logs again and I think it has something to do with Docker and cgroup v2.
My metrics dropped to zero on 2021-04-14, the same day I upgraded Docker from 20.10.5 to 20.10.6
(the Nomad upgrade from 1.0.4 to 1.1.0 came later, on 2021-05-19; I didn't realize this 🤷)

I've checked the Docker release notes, and there they mention:

So maybe it's still a problem with Nomad capturing the memory data, but it isn't related to the version bump to 1.1.0.

@pznamensky

Can you please check what exact version of docker you run on your Centos7?

Docker 20.10.5.
Cgroup v1.

@MorphBonehunter
Author

Ok, thanks.
So, as Arch Linux uses cgroup v2, I think it's indeed something which doesn't play well together...
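
One way to see what the kernel itself reports for a container, independent of Docker and Nomad, is the cgroup's `memory.stat` file. As far as I know, cgroup v1 exposes an `rss` key there, while cgroup v2 reports anonymous memory under `anon` instead. That difference may be why an RSS value goes missing, but nothing in this thread confirms it, so treat it as an assumption. The helper name `rss_from_memory_stat` is made up for illustration, and the path in the usage comment is a placeholder.

```shell
# Hypothetical helper: read memory.stat content on stdin and print the
# RSS-like counter in bytes ("rss" on cgroup v1, "anon" on cgroup v2).
rss_from_memory_stat() {
  # Match the key exactly so total_rss, inactive_anon, etc. are ignored.
  awk '$1 == "rss" || $1 == "anon" { print $2; exit }'
}

# Usage (placeholder path; the real location depends on the cgroup
# driver and the container ID):
#   rss_from_memory_stat < /sys/fs/cgroup/<container-cgroup>/memory.stat
```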

@m1keil

m1keil commented Jul 8, 2021

I believe this is a duplicate of: #10251

@MorphBonehunter
Author

Yes, that's right. Thanks for pointing this out.
I will close this one in favor of #10251.

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 17, 2022