-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
KEP-625: Update feature gate for CSI Migration kep #2966
Conversation
@@ -49,6 +49,11 @@ feature-gates: | |||
- kube-controller-manager | |||
- kubelet | |||
- kube-scheduler | |||
- name: InTreePlugin{cloud-provider}Unregister |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is where some confusion is coming because it's related to but not strictly for CSI migration. @ehashman also had the suggestion that this may be better suited as a config rather than a feature gate, so I think we should revisit outside of the csi migration context.
cc @jsafrane for thoughts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think things will get way more complicated if we change to other formats. To move this forward, I suggest we keep the flag until CSI migration is fully done and intree code is removed. Change it to a config file does not simplify it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or maybe I can remove this section and just remove the intreexxx feature gate from the vendor specific feature flags since technically they do not belong to this KEP
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I discussed with @jsafrane and we also think keeping it as a feature gate is simpler. Eventually once the feature gate is GA and intree code is removed, then the feature gates will also be removed.
With a config option, we would have to deal with deprecating the config API and following the deprecation timeline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To me, feature gates have a clear path towards GA and then they're removed. With a config (new field in component configs like KubeletConfiguration?), this API will be more permanent and harder to remove when everything is migrated to CSI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another challenge with the config is we would need to create new config APIs for many kubernetes components: kubelet, kube-controller-manager, kube-scheduler and a cluster operator would have to update the config in each of these components instead of being able to reuse existing tooling to apply feature gates.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I did some digging and it seems that these InTreePlugin{Provider}Unregister
flags were added in the 1.21 cycle (https://github.com/kubernetes/website/blob/ed9728ca8cbf5f6933bfd3ad66eb22a3582937fb/content/en/docs/reference/command-line-tools-reference/feature-gates.md) but weren't documented in the KEP. So it probably doesn't make sense to revisit the design at this point, but I do think they need to be properly documented and discussed here.
The flag should be added to all applicable PRR questions, not just the first one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The flag should be added to all applicable PRR questions, not just the first one.
This flag is only relevant with the first question. It is kind of standalone feature flag but happens to play a role in the overall CSI Migration feature. But technically it is not CSI migration specific feature gate and can be managed separately. So here I create a table to describe what happens with all the different scenario and hope it makes more sense.
302deb5
to
a16ef4a
Compare
Gentle ping. @ehashman Can you help to take another look? Thanks a lot! |
- `InTreePlugin{vendor}Unregister` is a standalone feature gate that can be enabled and disabled | ||
even out of CSI Migration scope. The name speaks for itself, when enabled, the component will not | ||
register the specific in-tree storage plugin to the supported list. If the cluster operator only enables this flag, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens if a component is started with InTreePlugin{vendor}Unregister
enabled and later disabled while running? This is not currently covered in the questionnaire.
If you expect this to work, that should be documented. If not, and the component needs to be fully restarted for the behaviour to change, that should be documented.
This should be covered under the reenable feature after rollback/feature enablement tests questions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the advice! I added some more description in the section you mentioned. Please let me know if that is sufficient.
@@ -229,7 +254,7 @@ Specifically, for each in-tree plugin corresponding CSI drivers, it will have | |||
### Rollout, Upgrade and Rollback Planning | |||
|
|||
* **How can a rollout fail? Can it impact already running workloads?** |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This question should cover failures relating to InTreePlugin{vendor}Unregister
. What if it's enabled when it shouldn't be? What about enabling or disabling at runtime?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks for the advice! I added some more description in the section you mentioned. Please let me know if that is sufficient.
a16ef4a
to
db5afc6
Compare
db5afc6
to
3ba198d
Compare
Hi @ehashman, your comments are addressed. Can you please take another look? As discussed on the Slack, we want to request for exception for these CSI Migration KEPs to be included in v1.23. Thanks! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
for PRR
/lgtm |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ehashman, Jiawei0227, xing-yang The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/sig storage
/cc @msau42