Increase user watches on nodes #772
Comments
Have the same issue. (k8s 1.11.5)
We saw this too, but concluded that inotify objects were being leaked by the OS/Docker and that raising the limit only delayed the pain. We did a periodic rolling reboot instead (on AKS this can be an az aks upgrade to the same version).
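For reference, a rough sketch of that reboot-via-upgrade approach; the resource group and cluster names are placeholders, and whether re-running the upgrade at the current version reimages the nodes may depend on your CLI/AKS version:

```sh
# Check the cluster's current Kubernetes version
az aks show --resource-group myRG --name myCluster --query kubernetesVersion -o tsv

# Re-run the upgrade at that same version; AKS rolls through the nodes, reimaging/rebooting them
az aks upgrade --resource-group myRG --name myCluster --kubernetes-version <current-version>
```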
@paulgmiller Kured effectively provides periodic reboots. The systems I noticed this on had an 18-day uptime.
Same problem here, temporarily solved by a DaemonSet which sets fs.inotify.max_user_watches to a bigger value using sysctl, and which also mounts the /etc/sysctl.d directory and creates a file with fs.inotify.max_user_watches=OUR_VALUE as its content.
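For illustration, a minimal sketch of such a DaemonSet, assuming a privileged init container is acceptable on the nodes; the names, image, and target value below are placeholders, not the exact manifest described above:

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: set-inotify-watches
  namespace: kube-system
spec:
  selector:
    matchLabels:
      app: set-inotify-watches
  template:
    metadata:
      labels:
        app: set-inotify-watches
    spec:
      initContainers:
      - name: sysctl
        image: busybox:1.36            # placeholder image
        securityContext:
          privileged: true             # required to write the host sysctl
        command:
        - sh
        - -c
        - |
          # Raise the limit now, and persist it across reboots via /etc/sysctl.d
          sysctl -w fs.inotify.max_user_watches=1048576
          echo "fs.inotify.max_user_watches=1048576" > /etc/sysctl.d/99-inotify.conf
        volumeMounts:
        - name: sysctl-d
          mountPath: /etc/sysctl.d
      containers:
      - name: pause
        image: k8s.gcr.io/pause:3.2    # keeps the pod running on every node after the init step
      volumes:
      - name: sysctl-d
        hostPath:
          path: /etc/sysctl.d
```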
@rubroboletus We ended up with the same "hack" on our clusters. It would be nice not to have to resort to this.
We sometimes run into this limit when more of the containers are started with nodemon (these containers are used by engineers to test/debug code).
more-watches.yaml.txt |
We are experiencing this issue on our clusters. Also, when we run the above DaemonSet we get outages in our workloads. How exactly does it do a rolling update? Do you have to ensure there are ReplicaSets spreading pods across nodes?
If anyone is interested, this is how we are solving it for now. We would love to see a resolution on this issue.
Increased to fs.inotify.max_user_watches = 1048576
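One way to verify the value actually took effect on a node (a quick check, assuming you can get a shell on the node, e.g. via SSH or a privileged debug pod):

```sh
# Read the live value
sysctl fs.inotify.max_user_watches

# Equivalent, straight from procfs
cat /proc/sys/fs/inotify/max_user_watches
```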
@palma21 Could you please expand on what changed? Has this been rolled out to Azure AKS? If so, which version? How do we get the fix? |
Sure |
Any upgrade done after the AKS release date (2019-11-18) will have this change (since it's a node setting change it needs an upgrade) |
Thanks, will trigger upgrades as soon as possible |
@xinyanmsft Why do you mount the /sys directory into the container? I tried configuring it without this and everything seems OK. But maybe I'm missing something...
What happened:
On a well-sized cluster, I started getting "no space left on device" issues when trying to run kubectl logs on pods.
What you expected to happen:
Logs print successfully, and tail successfully.
How to reproduce it (as minimally and precisely as possible):
Run kubectl logs
Anything else we need to know?:
Environment:
Kubernetes version (use kubectl version): 1.9.9
It looks like the fix here for me was to increase the fs.inotify.max_user_watches value. The AKS nodes are deployed with the default of 8192. Can this value be increased to avoid this issue in the future?
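For anyone hitting the same "no space left on device" message, a rough way to confirm it is inotify watch exhaustion rather than actual disk pressure, assuming shell access to the node (the paths are standard procfs locations, not anything AKS-specific):

```sh
# Current per-user watch limit (8192 was the default reported above)
cat /proc/sys/fs/inotify/max_user_watches

# Count open inotify instances across all processes on the node
find /proc/*/fd -lname 'anon_inode:inotify' 2>/dev/null | wc -l

# Disk is usually fine; the error comes from the watch limit, not the filesystem
df -h /
```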