🌱 Add failureDomains in MachineDeploymentTopology #5850

fabriziopandini
Member

What this PR does / why we need it:
This PR adds support for defining the failureDomain in which a MachineDeploymentTopology should deploy its machines.

Which issue(s) this PR fixes:
Fixes #5636

/cc @yastij

@k8s-ci-robot k8s-ci-robot requested a review from yastij December 13, 2021 09:24
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Dec 13, 2021
@@ -143,6 +143,11 @@ type MachineDeploymentTopology struct {
// the values are hashed together.
Name string `json:"name"`

// FailureDomain is the failure domain the machines will be created in.
// Must match a key in the FailureDomains map stored on the cluster object.
Member

When we go to create or update a MachineDeployment, could we validate the input? We can't do it beforehand (e.g. on cluster creation) because the FailureDomains wouldn't be available yet, but we could at least inform users if the requested failure domain isn't available.

We can also open an issue and do it later.
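
A minimal sketch of the kind of check suggested in this comment, assuming the cluster's failure domains are available as a map keyed by name. The function name `validateFailureDomain`, the simplified map type, and the zone names are hypothetical and are not taken from this PR; the sketch also assumes an empty value means "not set".

```go
package main

import (
	"fmt"
	"sort"
)

// validateFailureDomain checks a requested failure domain against the names
// reported by the cluster. An empty request is treated as "not set" and
// always passes (an assumption of this sketch).
func validateFailureDomain(requested string, available map[string]bool) error {
	if requested == "" {
		return nil
	}
	if _, ok := available[requested]; ok {
		return nil
	}
	// Sort the known names so the error message is stable across runs.
	known := make([]string, 0, len(available))
	for name := range available {
		known = append(known, name)
	}
	sort.Strings(known)
	return fmt.Errorf("failure domain %q is not available; known failure domains: %v", requested, known)
}

func main() {
	// Hypothetical failure domains, standing in for the cluster's FailureDomains map.
	available := map[string]bool{"zone-a": true, "zone-b": true}

	for _, fd := range []string{"", "zone-a", "zone-c"} {
		if err := validateFailureDomain(fd, available); err != nil {
			fmt.Println("invalid:", err)
			continue
		}
		fmt.Printf("ok: %q\n", fd)
	}
}
```

Returning an error with the list of known names is one way to "inform users"; surfacing it through a condition instead would be an equally plausible design.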

Member Author

I'll open an issue.
This is another use case where we are putting changes on hold (like for upgrades), so we should make sure that the TopologyReconciled condition is updated accordingly.
@ykakarap PTAL

Contributor

Noted. Looks like that work will come in a follow-up PR.
To help with my understanding: are we looking to update the FailureDomain on the MD only after the cluster object is reconciled and cluster.status.failureDomains is available?
OR
Are we planning to hold off creating the MD until cluster.status.failureDomains is available, so that we can validate the provided failureDomain and use it to create the MD?

Member

@sbueringer sbueringer Jan 3, 2022

@ykakarap The second option sounds better to me. I think it would be surprising to the user if the Machines first showed up in the wrong failure domain and then got re-created in the right one. (I'm not entirely sure that can happen, since the cluster reconcile may block everything else long enough anyway.)
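
The second option discussed above (holding creation of the MD until the failure domains are known) could be sketched roughly as follows. This is an illustration under stated assumptions, not the behavior implemented by this PR: the `decision` type, the `decideCreate` function, and the reason strings are hypothetical, and the sketch treats an empty set of failure domains as "not reported yet".

```go
package main

import "fmt"

// decision is a hypothetical result type: whether to create the
// MachineDeployment now, and why not if creation is being held.
type decision struct {
	Create bool
	Reason string
}

// decideCreate holds creation until the requested failure domain can be
// validated against the failure domains reported by the cluster.
func decideCreate(requested string, available map[string]struct{}) decision {
	// No failure domain requested: nothing to validate.
	if requested == "" {
		return decision{Create: true}
	}
	// Assumption: an empty map means the cluster has not reported any yet.
	if len(available) == 0 {
		return decision{Reason: "waiting for the cluster to report failure domains"}
	}
	if _, ok := available[requested]; !ok {
		return decision{Reason: fmt.Sprintf("failure domain %q is not available", requested)}
	}
	return decision{Create: true}
}

func main() {
	// Hypothetical zone names used only for illustration.
	reported := map[string]struct{}{"zone-a": {}}

	fmt.Printf("%+v\n", decideCreate("zone-a", nil))      // held: nothing reported yet
	fmt.Printf("%+v\n", decideCreate("zone-a", reported)) // created
	fmt.Printf("%+v\n", decideCreate("zone-b", reported)) // held: unknown domain
}
```

The `Reason` string is where a message for something like the TopologyReconciled condition mentioned earlier in the thread could come from, so that users can see why the change is on hold.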

@vincepri
Member

LGTM pending squash

@fabriziopandini fabriziopandini force-pushed the failureDomains-in-MachineDeploymentTopology branch from fbab740 to 1483dcc Compare December 13, 2021 21:09
Member

@vincepri vincepri left a comment

/approve
/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Dec 13, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vincepri

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Dec 13, 2021
@k8s-ci-robot k8s-ci-robot merged commit 2618fea into kubernetes-sigs:main Dec 13, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.1 milestone Dec 13, 2021
@fabriziopandini fabriziopandini deleted the failureDomains-in-MachineDeploymentTopology branch December 13, 2021 22:01
Successfully merging this pull request may close these issues.

enable setting failureDomain as part of a cluster's topology