Scale issue: kube-apiserver memory increases a lot when restarting Antrea Agent in scale setup #1044

Closed
wenyingd opened this issue Aug 6, 2020 · 1 comment · Fixed by #1045
Labels
kind/bug Categorizes issue or PR as related to a bug.

wenyingd commented Aug 6, 2020

Describe the bug
After multiple Antrea Agents are restarted (by deleting the original Antrea Agent Pods), the RSS value of kube-apiserver increases a lot.

When it starts, Antrea Agent lists all Pods on the current Node from kube-apiserver in order to reconcile, and kube-apiserver pulls all Pods from the backend etcd server to serve the request. This causes a lot of temporary data to be pulled into the memory of kube-apiserver.

To Reproduce

  1. Deploy a cluster with 100 Nodes and 10K Pods.
  2. Delete 50 Antrea Agent Pods and observe the change in kube-apiserver memory usage.

Expected
The RSS value of kube-apiserver should increase only slightly.

Actual behavior
The RSS value of kube-apiserver may increase by more than 1 GB.

Versions:
Please provide the following information:

  • Antrea version : v0.8.2
  • Kubernetes version (use kubectl version): 1.18.6

Additional context
When kubelet is restarted on the same Nodes, the memory of kube-apiserver increases only slightly.

@wenyingd wenyingd added the kind/bug Categorizes issue or PR as related to a bug. label Aug 6, 2020
@wenyingd wenyingd self-assigned this Aug 6, 2020
@wenyingd wenyingd changed the title from "s issue: kube-apiserver" to "Scale issue: kube-apiserver memory increases a lot when restarting Antrea Agent in scale setup" Aug 6, 2020

wenyingd commented Aug 6, 2020

The issue occurs because, although kube-apiserver keeps a cache in memory, it still goes to the remote storage (etcd) to serve every "List" request in which resourceVersion is not set. When it boots up, Antrea Agent lists all Pods located on the current Node without setting resourceVersion, so the request triggers kube-apiserver to pull data from the etcd server. The same pattern is also used in the CRD monitor in Antrea Controller, antctl commands, and the Octant plugin.
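
The difference between the two kinds of "List" request can be sketched at the REST level. The example below is only an illustration: `podListURL` is a hypothetical helper, and it is an assumption that setting `resourceVersion=0` (which the Kubernetes API documents as allowing a response from the kube-apiserver cache) is the approach taken in #1045. The sketch builds the query for listing the Pods on one Node, with and without resourceVersion.

```go
package main

import (
	"fmt"
	"net/url"
)

// podListURL (hypothetical helper) builds the path and query of a Pod "List"
// request restricted to one Node through a field selector.
// When fromCache is true it adds resourceVersion=0, which allows
// kube-apiserver to answer from its in-memory cache instead of reading etcd.
func podListURL(nodeName string, fromCache bool) string {
	q := url.Values{}
	q.Set("fieldSelector", "spec.nodeName="+nodeName)
	if fromCache {
		q.Set("resourceVersion", "0")
	}
	// Encode sorts the parameters by key and escapes "=" inside the selector.
	return "/api/v1/pods?" + q.Encode()
}

func main() {
	// resourceVersion unset: the request pattern described in this issue.
	fmt.Println(podListURL("node-1", false))
	// resourceVersion=0: may be served from the kube-apiserver cache.
	fmt.Println(podListURL("node-1", true))
}
```

With client-go, the equivalent would be setting the `ResourceVersion` field of `metav1.ListOptions` to `"0"`. The trade-off is that a response served from the cache may be slightly stale.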

With fix #1045 applied on a setup with 100 Nodes and 10K Pods, the RSS value of kube-apiserver increases only slightly (<1 MB) after restarting all 100 Antrea Agents. Before the fix, it increased by 1.5 GB when restarting only 50 Agents.
