KEP-267: kubelet server certificate bootstrap and rotation #4848
This feature has been beta and enabled by default since 1.12. There is not much concern about its stability; however, it requires an external component to be used. Kubernetes must provide "batteries included" for all its built-in features. Graduate the existing feature to GA by providing a controller in the cloud-controller-manager that users can opt in to, which auto-approves the CSRs from the nodes.
Change-Id: I7500f4cb6582fdff423e430518d370bcd08f144a
Signed-off-by: Antonio Ojea <[email protected]>
#### GA

- Real world usage
- Opt-in built-in Node CSR approver on cloud-controller-manager so users do not have to depend on external components to use this feature
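To make the approver idea concrete, here is a minimal sketch of the shape checks such a controller might run on a kubelet serving CSR before approving it. It uses only the Go standard library. The helper names `looksLikeKubeletServingCSR` and `newCSR` are hypothetical and do not come from this PR. The checks mirror the documented restrictions of the `kubernetes.io/kubelet-serving` signer (subject and SAN rules). A real approver would also have to confirm that the requested names and addresses actually belong to the node, which this sketch does not attempt.

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/x509"
	"crypto/x509/pkix"
	"errors"
	"fmt"
	"net"
	"strings"
)

// looksLikeKubeletServingCSR is a hypothetical helper. It validates only the
// shape of the request (subject and SAN types), not node identity.
func looksLikeKubeletServingCSR(csr *x509.CertificateRequest) error {
	if err := csr.CheckSignature(); err != nil {
		return fmt.Errorf("bad signature: %w", err)
	}
	if !strings.HasPrefix(csr.Subject.CommonName, "system:node:") {
		return errors.New("common name must start with system:node:")
	}
	if len(csr.Subject.Organization) != 1 || csr.Subject.Organization[0] != "system:nodes" {
		return errors.New("organization must be exactly system:nodes")
	}
	if len(csr.DNSNames)+len(csr.IPAddresses) == 0 {
		return errors.New("at least one DNS or IP SAN is required")
	}
	if len(csr.EmailAddresses) > 0 || len(csr.URIs) > 0 {
		return errors.New("email and URI SANs are not allowed")
	}
	return nil
}

// newCSR builds and parses a throwaway CSR so the example is self-contained.
func newCSR(cn, org string, dns []string, ips []net.IP) (*x509.CertificateRequest, error) {
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		return nil, err
	}
	tmpl := &x509.CertificateRequest{
		Subject:     pkix.Name{CommonName: cn, Organization: []string{org}},
		DNSNames:    dns,
		IPAddresses: ips,
	}
	der, err := x509.CreateCertificateRequest(rand.Reader, tmpl, key)
	if err != nil {
		return nil, err
	}
	return x509.ParseCertificateRequest(der)
}

func main() {
	// The node name and addresses below are made up for illustration.
	csr, err := newCSR("system:node:node-1", "system:nodes",
		[]string{"node-1.example.internal"}, []net.IP{net.ParseIP("10.0.0.5")})
	if err != nil {
		panic(err)
	}
	if err := looksLikeKubeletServingCSR(csr); err != nil {
		fmt.Println("would not approve:", err)
		return
	}
	fmt.Println("shape checks passed")
}
```

Keeping the shape checks separate from the identity check is a deliberate choice in this sketch: the former are generic, while the latter depends on what the cloud provider knows about the node.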
I think the past discussion was saying that it is not needed. See #267 (comment)
I read it as the part that is missing: #267 (comment)
##### Prerequisite testing updates

##### Unit tests
The hardest part of testing, when I looked at GA-ing this, was covering the edge cases: how we retry, how we start a kubelet that failed to get certs, etc.
This feature has been running in production for more than 20 releases, and this KEP adds no functionality. It only adds the requirement that was missing per sig-auth: a built-in component that can provide the same functionality.
> How we retry, how we start kubelet that failed to get certs, etc.

What do you mean by "start" in "how we start kubelet that failed to get certs"?
Any cluster running e2e tests with the feature enabled will be exercising the feature.

A job using the built-in CSR approver will be added exercising all the Conformance e2e tests.
Do we need to run all the tests? Once the kubelet has bootstrapped, running conformance exercises nothing new. Would it be better to concentrate on covering edge cases, such as handling of the various failures and certificate rotation?
After 20 releases, what regressions or errors do we plan to uncover?
I mean, the test is to run the existing jobs with the built-in cloud provider; everything should pass.
We can set the certificate rotation interval very low, so that the job guarantees the certificates are rotated during the execution of the tests.
- Real world usage

#### GA
The corresponding metrics need to be reviewed and GA-ed.
Is this a new requirement?
Most of our metrics are still in alpha:
kubernetes$ grep -r StabilityLevel pkg/ | cut -d\: -f3 | sort | uniq -c
11 compbasemetrics.ALPHA,
2 metrics.ALPHA,
179 metrics.ALPHA,
1 metrics.BETA,
1 metrics.INTERNAL,
1 metrics.STABLE,
14 metrics.STABLE,
I asked Han about this the other day because I also wanted to know and he said that metrics can stay in alpha even when features graduate. There's no process that ensures that metrics mature either.
per triage:
/reopen
It was not my intention to close it.
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: aojea The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/close
I will not have time for this soon, so it is better to avoid noise. Sorry about it.
One-line PR description: Graduate to GA KEP-267
Issue link: Kubelet Server TLS Certificate Rotation #267
Other comments:
This feature has been beta and enabled by default since 1.12. There is not much concern about its stability; however, it requires an external component to be used. Kubernetes must provide "batteries included" for all its built-in features. Graduate the existing feature to GA by providing a controller in the cloud-controller-manager that users can opt in to, which auto-approves the CSRs from the nodes.
Note for reviewers: Please take into consideration that this is a personal effort to remove technical debt by addressing the technical gaps identified by the sig-auth leaders. Please be constructive to get this to the finish line, and avoid derailing from the main topic, which is moving this feature to GA (it has been beta since 1.12).
/sig auth
/assign @liggitt @deads2k