Skip to content
This repository has been archived by the owner on Oct 24, 2023. It is now read-only.

fix: Increasing inotify limit to 1048576 from the default 8192 #1801

Merged
merged 3 commits into from
Oct 1, 2019
Merged

fix: Increasing inotify limit to 1048576 from the default 8192 #1801

merged 3 commits into from
Oct 1, 2019

Conversation

junaid-ali
Copy link
Contributor

@junaid-ali junaid-ali commented Aug 19, 2019

Links:

Signed-off-by: Junaid Ali [email protected]

Reason for Change:

Issue Fixed:

Requirements:

Notes:

@welcome
Copy link

welcome bot commented Aug 19, 2019

💖 Thanks for opening your first pull request! 💖 We use semantic commit messages to streamline the release process. Before your pull request can be merged, you should make sure your first commit and PR title start with a semantic prefix. Examples of commit messages with semantic prefixes: - fix: change azure disk cachingMode to ReadOnly - feat: make maximumLoadBalancerRuleCount configurable - docs: add note on AKS Engine and AKS relationship
Make sure to check out the developer guide for guidance on testing your change.

@junaid-ali junaid-ali changed the title Increasing inotify limit to 1048576 from the default 8192 fix: Increasing inotify limit to 1048576 from the default 8192 Aug 19, 2019
@junaid-ali
Copy link
Contributor Author

/assign patricklang

@junaid-ali
Copy link
Contributor Author

Backgroup for this PR: kubectl logs <pod-name> -f errors No space left on device. It appears the issue is due to inotify limit on worker nodes. Whenever more than ~15 pods are created on a node (Standard D16s v3 - 16 vcpus, 64 GiB memory), we start seeing this issue. Increasing the inotify limit from the default 8192 solves the issue.

I checked the limit on Ubuntu 18.04 and 16.04, where the limit is set to 1048576.

@mboersma
Copy link
Member

/azp run pr-e2e

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@msftclas
Copy link

msftclas commented Aug 21, 2019

CLA assistant check
All CLA requirements met.

@junaid-ali
Copy link
Contributor Author

/azp run pr-e2e

@azure-pipelines
Copy link

Commenter does not have sufficient privileges for PR 1801 in repo Azure/aks-engine

@junaid-ali
Copy link
Contributor Author

@mboersma I have ran make build and make ensure-generated, pr-e2e should pass now. Also added a test for the change.

@mboersma
Copy link
Member

/azp run pr-e2e

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@codecov
Copy link

codecov bot commented Aug 21, 2019

Codecov Report

Merging #1801 into master will increase coverage by 0.24%.
The diff coverage is n/a.

@@            Coverage Diff             @@
##           master    #1801      +/-   ##
==========================================
+ Coverage   76.43%   76.67%   +0.24%     
==========================================
  Files         130      135       +5     
  Lines       19398    20542    +1144     
==========================================
+ Hits        14827    15751     +924     
- Misses       3774     3873      +99     
- Partials      797      918     +121

@junaid-ali
Copy link
Contributor Author

@mboersma It looks like the net-config-validate.sh script is failing (please correct me if I'm wrong). I manually tested the test steps on my workstation and it seemed fine. Do you think I'm missing anything here for e2e tests?

@junaid-ali
Copy link
Contributor Author

It looks like the updated /etc/sysctl.d/60-CIS.conf should be baked into VHD else it will use the old 60-CIS.conf which doesn't have my changes, so net-config-validate.sh will always fail. Please correct me if I'm wrong.
CC: @mboersma @jackfrancis

Ref: #999 (comment)

@CecileRobertMichon
Copy link
Contributor

@junaid-ali you're correct. The fact that the k8s_non_vhd_deployment job passed confirms that E2E passes with your changes. In order to merge this change, we would need to comment out the net-config test for this change with a //TODO to re-enable when the next VHD is published (and open an issue on GitHub so we can track it on the backlog). cc @jackfrancis

@junaid-ali
Copy link
Contributor Author

junaid-ali commented Sep 15, 2019

@CecileRobertMichon thanks for the update. I have commented the test case in this PR, and created an issue to re-enable it later.

@CecileRobertMichon
Copy link
Contributor

/azp run pr-e2e

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jackfrancis
Copy link
Member

@palma21 @yangl900 @wenwu449 FYI, we will merge this pending AKS approval

@jackfrancis
Copy link
Member

lgtm

@jackfrancis
Copy link
Member

@yangl900 @wenwu449 ping for your thoughts on this change

@yangl900
Copy link
Contributor

LGTM

Copy link
Member

@jackfrancis jackfrancis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@acs-bot acs-bot added the lgtm label Oct 1, 2019
@jackfrancis jackfrancis merged commit 7deeede into Azure:master Oct 1, 2019
@welcome
Copy link

welcome bot commented Oct 1, 2019

Congrats on merging your first pull request! 🎉🎉🎉

@acs-bot
Copy link

acs-bot commented Oct 1, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: jackfrancis, junaid-ali

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants