-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Guidance for skipping Alpha for feature gated changes #7828
Guidance for skipping Alpha for feature gated changes #7828
Conversation
- changes that are explicitly considered as bug fixes are recommended to go directly to `Beta` | ||
and should be enabled by-default from the very beginning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many bug fixes today are merged without a feature gate at all.
Is the intent here that "bug fixes that have a sufficient level of risk that being able to turn the fix off via a feature gate is justified" should be handled this way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we are using Deprecated state to roll out some bug fixes that may have some risk of breaking , see kubernetes/kubernetes#119789
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Deprecated" is really for changing functionality which could be perceived as "removal". It's, not a very clear principle.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it actually works in all cases. Let me try to clarify that [yes, I think that bug fixes with sufficient level of risk should actually go with FGs]
@@ -9,6 +9,17 @@ Feature gates are intended to cover the development life cycle of a feature - th | |||
Features generally progress through `Alpha` -> `Beta` -> `GA`. Sometimes we end up deciding that a feature is not going to be supported and we end up marking them as `Deprecated`. | |||
|
|||
The majority of features will go through all three stages, but occasionally there are features which may skip stages. | |||
While some exceptions may happen, approvers should use the following guidance: | |||
- features that involve API changes must progress through all `Alpha`, `Beta`, `GA` stages | |||
- larger or more complex changes should progress through all `Alpha`, `Beta`, `GA` stages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- larger or more complex changes should progress through all `Alpha`, `Beta`, `GA` stages | |
- features that are unproven at achieving their goals, have significant complexity, risk of defects/problematic edge cases, or performance/scalability implications should progress through all `Alpha`, `Beta`, `GA` stages |
(trying to expand the word "larger" into criteria somehow.. further improvements welcome)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to the clarification.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this too - thanks!
While some exceptions may happen, approvers should use the following guidance: | ||
- features that involve API changes must progress through all `Alpha`, `Beta`, `GA` stages | ||
- larger or more complex changes should progress through all `Alpha`, `Beta`, `GA` stages | ||
- smaller changes that carry non-trivial risk (e.g. due to changing user-facing behavior) might skip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- smaller changes that carry non-trivial risk (e.g. due to changing user-facing behavior) might skip | |
- features achieve their goals with minimal complexity and performance/scalability implications, but that still carry non-trivial risk (e.g. due to changing user-facing behavior, or risk of defects/problematic edge cases) might skip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to the clarification.
We should also indicate that changes that carry a risk of making previously working functionality no longer work in certain edges should always start in an off-by-default state.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"features which..."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
- larger or more complex changes should progress through all `Alpha`, `Beta`, `GA` stages | ||
- smaller changes that carry non-trivial risk (e.g. due to changing user-facing behavior) might skip | ||
`Alpha` and start directly in `Beta` (provided the appropriate `Beta` quality is achieved) | ||
but should be off by-default until proven in some production environment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
but should be off by-default until proven in some production environment | |
but should be off by-default until proven in representative production environment that utilized the feature with the scale or variety of use to prove it's working |
- smaller changes that carry non-trivial risk (e.g. due to changing user-facing behavior) might skip | ||
`Alpha` and start directly in `Beta` (provided the appropriate `Beta` quality is achieved) | ||
but should be off by-default until proven in some production environment | ||
- smaller changes with low enough risk might skip `Alpha` and start directly in `Beta` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
those are basically the bug fixes? Is the KEP only needed to introduce the Feature Gate instead of providing config options?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If so, we can be open about it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, these are "bugfixes which might need a backout"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe just say it like this:
- smaller changes with low enough risk might skip `Alpha` and start directly in `Beta` | |
- smaller changes with low enough risk that still may need to be disabled using the feature gate without introducing a new long term configuration option, might skip `Alpha` and start directly in `Beta` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just want it to be very explicit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This overlaps with my PR on this file, but it's OK. I'll rebase. :)
I am a visual person:
Trivial Low High API changes
bugfixes complexity complexity |
| changes with changes with |
| no API no API |
| | | |
Risk ---O-----------O---------------O------------------O-------->
| | | |
No gate Start beta Start alpha Start alpha,
required and enabled or beta, but disabled
disabled
While some exceptions may happen, approvers should use the following guidance: | ||
- features that involve API changes must progress through all `Alpha`, `Beta`, `GA` stages | ||
- larger or more complex changes should progress through all `Alpha`, `Beta`, `GA` stages | ||
- smaller changes that carry non-trivial risk (e.g. due to changing user-facing behavior) might skip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"features which..."
- smaller changes that carry non-trivial risk (e.g. due to changing user-facing behavior) might skip | ||
`Alpha` and start directly in `Beta` (provided the appropriate `Beta` quality is achieved) | ||
but should be off by-default until proven in some production environment | ||
- smaller changes with low enough risk might skip `Alpha` and start directly in `Beta` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, these are "bugfixes which might need a backout"
- changes that are explicitly considered as bug fixes are recommended to go directly to `Beta` | ||
and should be enabled by-default from the very beginning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"Deprecated" is really for changing functionality which could be perceived as "removal". It's, not a very clear principle.
Do we need to say explicitly that the trivial bugfix is different from low complexity changed by the fact that we provide the temporary opt out config option. Maybe better name "trivial" -> "no opt-out". |
d1b7e5e
to
bef3087
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PTAL
@@ -9,6 +9,17 @@ Feature gates are intended to cover the development life cycle of a feature - th | |||
Features generally progress through `Alpha` -> `Beta` -> `GA`. Sometimes we end up deciding that a feature is not going to be supported and we end up marking them as `Deprecated`. | |||
|
|||
The majority of features will go through all three stages, but occasionally there are features which may skip stages. | |||
While some exceptions may happen, approvers should use the following guidance: | |||
- features that involve API changes must progress through all `Alpha`, `Beta`, `GA` stages | |||
- larger or more complex changes should progress through all `Alpha`, `Beta`, `GA` stages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like this too - thanks!
While some exceptions may happen, approvers should use the following guidance: | ||
- features that involve API changes must progress through all `Alpha`, `Beta`, `GA` stages | ||
- larger or more complex changes should progress through all `Alpha`, `Beta`, `GA` stages | ||
- smaller changes that carry non-trivial risk (e.g. due to changing user-facing behavior) might skip |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
- changes that are explicitly considered as bug fixes are recommended to go directly to `Beta` | ||
and should be enabled by-default from the very beginning |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it actually works in all cases. Let me try to clarify that [yes, I think that bug fixes with sufficient level of risk should actually go with FGs]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/lgtm
@@ -9,6 +9,28 @@ Feature gates are intended to cover the development life cycle of a feature - th | |||
Features generally progress through `Alpha` -> `Beta` -> `GA`. Sometimes we end up deciding that a feature is not going to be supported and we end up marking them as `Deprecated`. | |||
|
|||
The majority of features will go through all three stages, but occasionally there are features which may skip stages. | |||
While some exceptions may happen, approvers should use the following guidance: | |||
- features that involve API changes must progress through all `Alpha`, `Beta`, `GA` stages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I posted on slack about this but can we define (or point me to a place) what is meant by an API change?
We have staging api changes which are obvious to me. But do we consider cli flags and configs for kubelet to be a API change?
So any new flag/api change to KubeletConfig would require alpha -> beta -> GA?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general when we use this term I think it refers to PRs that would trigger API review. So, this is about the K8s APIs, not the broader user surface. So, basically, anything that makes you read https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api_changes.md
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we consider cli flags and configs for kubelet to be a API change
Yes. https://github.com/kubernetes/community/blob/master/sig-architecture/api-review-process.md#what-parts-of-a-pr-are-api-changes is a crisp list of "things that are APIs"
APIs are the way things outside the project integrate with our binaries / servers, and changes to that surface area can disrupt or break integrators. It's important that we be very confident in things that increase that surface area, so having at least a single release where we are still able to adjust that surface area is important.
So any new flag/api change to KubeletConfig would require alpha -> beta -> GA?
Kubelet config overall is still beta (:-/) so a field inside it can't really be more mature than beta, but generally, yes, new things start as alpha for a release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the other important point is the removal of flags, removing a flag in kubelet per example will make that the new version will fail to start with "unknown flag" (I can not remember the exactly) so removal of flags need to consider this problem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, this is about the K8s APIs, not the broader user surface. So, basically, anything that makes you read https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api_changes.md
we should starting thinking about moving these or some of this conventions to the CRI API too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added a suggestion with Jordan's link
/lgtm |
/hold LGTM but I am holding for @deads2k. David go ahead and remove the hold if you are good with this. /approve |
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: johnbelamaric, wojtek-t The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
@@ -9,6 +9,28 @@ Feature gates are intended to cover the development life cycle of a feature - th | |||
Features generally progress through `Alpha` -> `Beta` -> `GA`. Sometimes we end up deciding that a feature is not going to be supported and we end up marking them as `Deprecated`. | |||
|
|||
The majority of features will go through all three stages, but occasionally there are features which may skip stages. | |||
While some exceptions may happen, approvers should use the following guidance: | |||
- features that involve API changes must progress through all `Alpha`, `Beta`, `GA` stages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- features that involve API changes must progress through all `Alpha`, `Beta`, `GA` stages | |
- features that involve [API changes](https://github.com/kubernetes/community/blob/master/sig-architecture/api-review-process.md#what-parts-of-a-pr-are-api-changes) must progress through all `Alpha`, `Beta`, `GA` stages |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
bef3087
to
d060ec0
Compare
/lgtm |
Thanks for the chance for a re-look. /lgtm |
Follow up from SIG Architecture meeting on Apr 18th 2024
/assign @deads2k @johnbelamaric @thockin @liggitt @jpbetz