Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

KEP-625: Update feature gate for CSI Migration kep #2966

Merged
merged 1 commit into from
Sep 15, 2021

Conversation

Jiawei0227
Copy link
Contributor

  • One-line PR description:
  • Other comments:
    /sig storage
    /cc @msau42

@k8s-ci-robot k8s-ci-robot added sig/storage Categorizes an issue or PR as relevant to SIG Storage. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels Sep 9, 2021
@@ -49,6 +49,11 @@ feature-gates:
- kube-controller-manager
- kubelet
- kube-scheduler
- name: InTreePlugin{cloud-provider}Unregister
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is where some confusion is coming because it's related to but not strictly for CSI migration. @ehashman also had the suggestion that this may be better suited as a config rather than a feature gate, so I think we should revisit outside of the csi migration context.

cc @jsafrane for thoughts

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think things will get way more complicated if we change to other formats. To move this forward, I suggest we keep the flag until CSI migration is fully done and intree code is removed. Change it to a config file does not simplify it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or maybe I can remove this section and just remove the intreexxx feature gate from the vendor specific feature flags since technically they do not belong to this KEP

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I discussed with @jsafrane and we also think keeping it as a feature gate is simpler. Eventually once the feature gate is GA and intree code is removed, then the feature gates will also be removed.

With a config option, we would have to deal with deprecating the config API and following the deprecation timeline.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me, feature gates have a clear path towards GA and then they're removed. With a config (new field in component configs like KubeletConfiguration?), this API will be more permanent and harder to remove when everything is migrated to CSI.

Copy link
Member

@msau42 msau42 Sep 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another challenge with the config is we would need to create new config APIs for many kubernetes components: kubelet, kube-controller-manager, kube-scheduler and a cluster operator would have to update the config in each of these components instead of being able to reuse existing tooling to apply feature gates.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I did some digging and it seems that these InTreePlugin{Provider}Unregister flags were added in the 1.21 cycle (https://github.com/kubernetes/website/blob/ed9728ca8cbf5f6933bfd3ad66eb22a3582937fb/content/en/docs/reference/command-line-tools-reference/feature-gates.md) but weren't documented in the KEP. So it probably doesn't make sense to revisit the design at this point, but I do think they need to be properly documented and discussed here.

The flag should be added to all applicable PRR questions, not just the first one.

Copy link
Contributor Author

@Jiawei0227 Jiawei0227 Sep 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The flag should be added to all applicable PRR questions, not just the first one.

This flag is only relevant with the first question. It is kind of standalone feature flag but happens to play a role in the overall CSI Migration feature. But technically it is not CSI migration specific feature gate and can be managed separately. So here I create a table to describe what happens with all the different scenario and hope it makes more sense.

keps/sig-storage/625-csi-migration/README.md Outdated Show resolved Hide resolved
keps/sig-storage/625-csi-migration/README.md Outdated Show resolved Hide resolved
keps/sig-storage/625-csi-migration/README.md Outdated Show resolved Hide resolved
keps/sig-storage/625-csi-migration/README.md Outdated Show resolved Hide resolved
keps/sig-storage/625-csi-migration/README.md Outdated Show resolved Hide resolved
@k8s-ci-robot k8s-ci-robot added size/M Denotes a PR that changes 30-99 lines, ignoring generated files. and removed size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Sep 9, 2021
@Jiawei0227
Copy link
Contributor Author

Gentle ping. @ehashman Can you help to take another look? Thanks a lot!

Comment on lines +211 to +213
- `InTreePlugin{vendor}Unregister` is a standalone feature gate that can be enabled and disabled
even out of CSI Migration scope. The name speaks for itself, when enabled, the component will not
register the specific in-tree storage plugin to the supported list. If the cluster operator only enables this flag,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if a component is started with InTreePlugin{vendor}Unregister enabled and later disabled while running? This is not currently covered in the questionnaire.

If you expect this to work, that should be documented. If not, and the component needs to be fully restarted for the behaviour to change, that should be documented.

This should be covered under the reenable feature after rollback/feature enablement tests questions.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the advice! I added some more description in the section you mentioned. Please let me know if that is sufficient.

@@ -229,7 +254,7 @@ Specifically, for each in-tree plugin corresponding CSI drivers, it will have
### Rollout, Upgrade and Rollback Planning

* **How can a rollout fail? Can it impact already running workloads?**
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This question should cover failures relating to InTreePlugin{vendor}Unregister. What if it's enabled when it shouldn't be? What about enabling or disabling at runtime?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks for the advice! I added some more description in the section you mentioned. Please let me know if that is sufficient.

@xing-yang
Copy link
Contributor

Hi @ehashman, your comments are addressed. Can you please take another look? As discussed on the Slack, we want to request for exception for these CSI Migration KEPs to be included in v1.23. Thanks!

Copy link
Member

@ehashman ehashman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve
for PRR

@xing-yang
Copy link
Contributor

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Sep 15, 2021
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ehashman, Jiawei0227, xing-yang

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Sep 15, 2021
@k8s-ci-robot k8s-ci-robot merged commit 14d67b1 into kubernetes:master Sep 15, 2021
@k8s-ci-robot k8s-ci-robot added this to the v1.23 milestone Sep 15, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory lgtm "Looks good to me", indicates that a PR is ready to be merged. sig/storage Categorizes an issue or PR as relevant to SIG Storage. size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants