Default to spread worker nodes across failure domains #3203
Comments
@richardcase: This issue is currently awaiting triage. If CAPA/CAPI contributors determine this is a relevant issue, they will accept it by applying the appropriate triage label. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
There were some discussions around this before. |
Good idea. I will add an agenda item for this. |
The Kubernetes project currently lacks enough contributors to adequately respond to all issues and PRs. This bot triages issues and PRs according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
/remove-lifecycle stale |
/lifecycle stale |
/remove-lifecycle stale |
From triage 12/2022: Let's add this to the agenda for the next office hours. Core CAPI MachineDeployment does not support multiple failure domains; please see kubernetes-sigs/cluster-api#3358. We'll hold off on applying the /triage label until then. For reference: the Oracle and MicroVM infrastructure providers do distribute machines in one MachineDeployment across multiple failure domains. (Links to be added here) |
Discussed in the 6th Jan 2023 office hours. |
As discussed in the CAPA office hours, Indeed had several CAPA workload clusters (self-managed, non-EKS) spanning all AZs in us-east-2 during the outage on July 28, 2022. Our clusters are configured to use machine deployments in each AZ, and the cluster autoscaler is configured to autoscale those machine deployments with the clusterapi provider. We also configure the cluster autoscaler and all of the CAPI/CAPA controllers to use leader election and run 3 replicas of each.
What we observed was that when power to AZ1 was lost, 10 minutes later pods were recreated by Kubernetes without any outside interaction and were in the Pending state. (I believe the 10 minutes is the 5-minute delay before the nodes are marked unready due to missing kubelet heartbeats plus the 5-minute pod-eviction-timeout of the kube-controller-manager, but I'm not 100% certain.) The cluster autoscaler scaled up the machine deployments, and as soon as the machines joined the cluster, the workloads were scheduled and continued to perform normally, despite the control plane being in a degraded state. No human intervention was required for cluster recovery during the outage or after AZ1 was restored.
Below are two sets of graphs from one of those clusters, which show the control plane becoming degraded (2/3 available) and then the pods being scheduled and created. The pods are scheduled in 3 "waves" as machines join the cluster and allow more pods to schedule.
I can provide more specific details on how the MDs were configured if that's useful. So I wonder if, instead of implementing this feature, documentation on how to correctly configure CAPA clusters to sustain an AZ outage would be more desirable? |
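As a rough illustration of the one-machine-deployment-per-AZ layout described in the comment above, the following sketch splits a desired worker count evenly across AZs. The `deploymentPlan` type, the `planPerAZ` function, and the naming scheme are hypothetical and are not part of CAPI or CAPA; the real configuration would be expressed as MachineDeployment manifests with a failure domain set on each.

```go
package main

import "fmt"

// deploymentPlan is a hypothetical summary of one per-AZ machine deployment.
// It is not a CAPI type.
type deploymentPlan struct {
	Name          string
	FailureDomain string
	Replicas      int
}

// planPerAZ spreads total replicas across the given AZs as evenly as
// possible. Earlier AZs in the slice receive the remainder, one each.
func planPerAZ(prefix string, azs []string, total int) []deploymentPlan {
	plans := make([]deploymentPlan, 0, len(azs))
	if len(azs) == 0 {
		return plans
	}
	if total < 0 {
		total = 0 // treat a negative request as zero replicas
	}
	base, extra := total/len(azs), total%len(azs)
	for i, az := range azs {
		replicas := base
		if i < extra {
			replicas++
		}
		plans = append(plans, deploymentPlan{
			Name:          prefix + "-" + az,
			FailureDomain: az,
			Replicas:      replicas,
		})
	}
	return plans
}

func main() {
	azs := []string{"us-east-2a", "us-east-2b", "us-east-2c"}
	for _, p := range planPerAZ("workers", azs, 7) {
		fmt.Printf("%s failureDomain=%s replicas=%d\n", p.Name, p.FailureDomain, p.Replicas)
	}
}
```

With this layout, losing one AZ removes only that deployment's machines, and the autoscaler can grow the remaining deployments independently.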
The Kubernetes project currently lacks enough contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle stale |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues. This bot triages un-triaged issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /lifecycle rotten |
The Kubernetes project currently lacks enough active contributors to adequately respond to all issues and PRs. This bot triages issues according to the following rules:
You can:
Please send feedback to sig-contributor-experience at kubernetes/community. /close not-planned |
@k8s-triage-robot: Closing this issue, marking it as "Not Planned". |
/reopen |
@richardcase: Reopened this issue. |
/lifecycle stale |
/lifecycle rotten |
/remove-lifecycle rotten |
/lifecycle stale |
/lifecycle rotten |
/close not-planned |
@k8s-triage-robot: Closing this issue, marking it as "Not Planned". In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
/reopen |
@richardcase: Reopened this issue. |
/lifecycle stale |
/lifecycle rotten |
/close not-planned |
@k8s-triage-robot: Closing this issue, marking it as "Not Planned". |
/kind feature
Describe the solution you'd like
Currently, CAPI spreads control plane machines across the reported failure domains (i.e. availability zones). It doesn't do this for worker nodes, whether they are machines in a machine deployment or machines on their own.
Current advice is to create separate machine deployments and manually assign an AZ (via FailureDomain) to each of them to ensure that you have worker machines in different AZs.
It would be better if, when creating machines (and no failure domain is specified on the Machine), we used the failure domains on the Cluster and created the machine in the failure domain that currently has the fewest machines. CAPI has some functions we could potentially use. Something like this:
Anything else you would like to add:
We need to investigate if this is feasible, or if it is something that should be upstream in machine deployments.
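The "fewest machines" selection proposed above can be sketched roughly as follows. This is a minimal, self-contained illustration and not the snippet from the original issue: the `machine` type and `pickFewest` function are hypothetical stand-ins, and the sketch does not call CAPI's own helpers or types.

```go
package main

import (
	"fmt"
	"sort"
)

// machine is a hypothetical, simplified stand-in for a CAPI Machine.
type machine struct {
	Name          string
	FailureDomain string // empty if no failure domain has been assigned
}

// pickFewest returns the failure domain that currently hosts the fewest
// machines. Ties are broken alphabetically so the result is deterministic.
// It returns "" when no failure domains are reported.
func pickFewest(failureDomains []string, machines []machine) string {
	if len(failureDomains) == 0 {
		return ""
	}
	counts := make(map[string]int, len(failureDomains))
	for _, fd := range failureDomains {
		counts[fd] = 0
	}
	for _, m := range machines {
		// Skip machines with no domain or in a domain that is not reported.
		if _, ok := counts[m.FailureDomain]; ok {
			counts[m.FailureDomain]++
		}
	}
	// Work on a sorted copy so the caller's slice is left untouched.
	candidates := append([]string(nil), failureDomains...)
	sort.Strings(candidates)
	best := candidates[0]
	for _, fd := range candidates[1:] {
		if counts[fd] < counts[best] {
			best = fd
		}
	}
	return best
}

func main() {
	domains := []string{"us-east-2a", "us-east-2b", "us-east-2c"}
	existing := []machine{
		{Name: "worker-0", FailureDomain: "us-east-2a"},
		{Name: "worker-1", FailureDomain: "us-east-2a"},
		{Name: "worker-2", FailureDomain: "us-east-2c"},
	}
	// us-east-2b has no machines yet, so it is selected.
	fmt.Println(pickFewest(domains, existing))
}
```

A real implementation would also need to decide whether the selection belongs in CAPA or upstream in the machine deployment controller, which is the open question in this issue.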
Environment:
Kubernetes version (use kubectl version):
OS (e.g. from /etc/os-release):