Address scalability issue when Node Watcher is enabled #76
Comments
@NickrenREN I wonder if you've seen a similar issue in production.
Node Watcher is a single-instance controller, so what is the scalability issue?
@NickrenREN It affects the e2e tests. Details are in kubernetes/kubernetes#102452. Disabling the external-health-monitor made the failure go away.
IIUC, the root cause of the scalability issue you mention is that Node Watcher watches PVCs, Nodes, and Pods?
A watch is a persistent connection, and Node Watcher is a single-instance controller. Is this really the root cause?
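As background for the watch question above, here is a minimal sketch of how a single-instance controller typically sets up PVC, Node, and Pod watches with client-go shared informers. The in-cluster config, resync period, and overall wiring are assumptions for illustration, not this repo's actual code; the point is that each resource type uses one persistent LIST+WATCH connection, so the watches themselves add little load compared to the other requests a controller makes.

```go
package main

import (
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
	"k8s.io/client-go/tools/cache"
)

func main() {
	// Assumes an in-cluster deployment; out-of-cluster setups would load a kubeconfig instead.
	config, err := rest.InClusterConfig()
	if err != nil {
		panic(err)
	}
	clientset := kubernetes.NewForConfigOrDie(config)

	// One shared informer factory: each watched resource type gets a single
	// persistent LIST+WATCH connection, no matter how many handlers use it.
	factory := informers.NewSharedInformerFactory(clientset, 10*time.Minute)
	podInformer := factory.Core().V1().Pods().Informer()
	nodeInformer := factory.Core().V1().Nodes().Informer()
	pvcInformer := factory.Core().V1().PersistentVolumeClaims().Informer()

	stopCh := make(chan struct{})
	defer close(stopCh)
	factory.Start(stopCh)

	// Block until the initial LIST for each resource has been cached locally.
	cache.WaitForCacheSync(stopCh, podInformer.HasSynced, nodeInformer.HasSynced, pvcInformer.HasSynced)
}
```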
I saw a lot of API throttling, so maybe we can decrease the API call frequency?
This needs more investigation. The observation is that the failure went away when the external-health-monitor was disabled, came back when it was enabled, and went away again when it was disabled.
We could try that.
This indicates the controller causes the failure (API throttling?), but I still don't think the watch is the root cause.
The external-health-monitor controller added more load to the API server, which might have triggered those failures.
I agree, so we can try to decrease the API call frequency first.
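If lowering the call rate is the first thing to try, one readily available knob is client-go's client-side rate limiter. A minimal sketch, assuming the clientset is built from an in-cluster config; the function name and values are illustrative, not this repo's actual wiring:

```go
package healthmonitor // hypothetical package name, for illustration only

import (
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// newThrottledClient builds a clientset with a lower client-side rate limit.
// client-go defaults to 5 QPS with a burst of 10; the values below are purely
// illustrative and would need tuning against the throttling seen in the e2e runs.
func newThrottledClient() (*kubernetes.Clientset, error) {
	config, err := rest.InClusterConfig()
	if err != nil {
		return nil, err
	}
	config.QPS = 2   // sustained requests per second allowed by the client-side limiter
	config.Burst = 5 // short bursts above QPS are capped at this many requests
	return kubernetes.NewForConfig(config)
}
```

Note that lowering QPS only spreads the same requests over a longer window; reducing the load on the API server also means issuing fewer requests in the first place, for example by reconciling or resyncing less often.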
I would like to work on this issue. I'll start looking into it to understand it.
/assign
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to its lifecycle rules. Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
/remove-lifecycle stale
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to its lifecycle rules. Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to its lifecycle rules. Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to its lifecycle rules. Please send feedback to sig-contributor-experience at kubernetes/community. /close
@k8s-triage-robot: Closing this issue in response to the /close above.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/reopen
@pohly: Reopened this issue in response to the /reopen above.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
/lifecycle frozen
/assign
We have issue #75 to change the code so that Pods and Nodes are only watched when the Node Watcher component is enabled. We still need to address the scalability issue that shows up when Node Watcher is enabled:
kubernetes/kubernetes#102452 (comment)
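For reference, a rough sketch of the direction issue #75 describes: only register the Pod and Node informers when the Node Watcher is enabled. The flag name, package name, and function below are assumptions for illustration, not the repo's actual code.

```go
package healthmonitor // hypothetical package and flag names, for illustration only

import (
	"flag"
	"time"

	"k8s.io/client-go/informers"
	"k8s.io/client-go/kubernetes"
)

var enableNodeWatcher = flag.Bool("enable-node-watcher", false,
	"enable the Node Watcher component and its Pod/Node watches")

// buildInformers registers the Pod and Node informers only when the Node
// Watcher is enabled, so the extra LIST+WATCH load on the API server is
// only paid by deployments that opt in.
func buildInformers(clientset kubernetes.Interface) informers.SharedInformerFactory {
	factory := informers.NewSharedInformerFactory(clientset, 10*time.Minute)

	// The PVC informer is needed by the health monitor controller itself.
	factory.Core().V1().PersistentVolumeClaims().Informer()

	if *enableNodeWatcher {
		factory.Core().V1().Pods().Informer()
		factory.Core().V1().Nodes().Informer()
	}
	return factory
}
```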