Use TokenRequest API instead of calico-nodes service account token for CNI kubeconfig. #5910
Conversation
/sem-approve
Do you know when the TokenRequest API was added to k8s? I'm wondering if this will break compatibility with older releases.
I do not know the exact release, but it looks like even v1.10 had the relevant logic. How far back would be interesting for you?
Please note that I have not adapted the clusterrole yet.
A few comments - let me know what you think about them. Feel free to riff on my suggestions.
The main thing is that we need to be paying attention to the returned expiry time, not just the requested expiry time, and it would be nice to package this up to share some code.
cni-plugin/pkg/install/install.go (Outdated)

@@ -496,3 +504,46 @@ func setSuidBit(file string) error {
	return nil
}

func createCNIToken(kubecfg *rest.Config) string {
Looks like we're duplicating very similar logic here and within the node directory.
Probably would be better to do this in one place - we can use a single constant for the validity duration, as well as a single function for refreshing the token.
I'd recommend creating a new structure to handle this logic, as there is a little bit of complexity that we need to deal with. Specifically, the returned expiry time may not be the time that we requested, so we need to cache that and update our ticker based on the value that's returned (not just the value we requested).
ExpirationSeconds is the requested duration of validity of the request. The token issuer may return a token with a different validity duration so a client needs to check the 'expiration' field in a response.
From here: https://kubernetes.io/docs/reference/kubernetes-api/authentication-resources/token-request-v1/
e.g., something like this:
type TokenRefresher struct {
	cfg       *rest.Config
	clientset *kubernetes.Clientset
	...
}

type TokenUpdate struct {
	Token string
	Error error
}

func (t *TokenRefresher) TokenChan() chan TokenUpdate {}

func (t *TokenRefresher) UpdateToken() TokenUpdate {}
The code in this file (install.go) can just call UpdateToken() to get a new token.
The code in node that updates the token can do something like this:
tokenChan := t.TokenChan()
for {
	select {
	case <-tokenChan:
		// Update CNI config with new token.
	}
}
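To make the suggestion concrete, here is a minimal, self-contained sketch of what UpdateToken could look like against the TokenRequest API; the namespace, the service account name, and the expiry field on TokenRefresher are illustrative assumptions, not the actual Calico implementation:

```go
package cni

import (
	"context"
	"time"

	authenticationv1 "k8s.io/api/authentication/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
)

type TokenUpdate struct {
	Token string
	Error error
}

type TokenRefresher struct {
	clientset kubernetes.Interface
	namespace string // e.g. "kube-system"; assumption for this sketch
	name      string // service account name, e.g. "calico-node"; assumption
	expiry    time.Time
}

// UpdateToken requests a fresh token and records the expiry the API server
// actually granted, which may differ from the one requested.
func (t *TokenRefresher) UpdateToken() TokenUpdate {
	requested := int64(24 * 60 * 60) // requested validity in seconds; illustrative
	tr, err := t.clientset.CoreV1().ServiceAccounts(t.namespace).CreateToken(
		context.TODO(),
		t.name,
		&authenticationv1.TokenRequest{
			Spec: authenticationv1.TokenRequestSpec{ExpirationSeconds: &requested},
		},
		metav1.CreateOptions{},
	)
	if err != nil {
		return TokenUpdate{Error: err}
	}
	// Base the refresh ticker on the returned expiry, not the requested one.
	t.expiry = tr.Status.ExpirationTimestamp.Time
	return TokenUpdate{Token: tr.Status.Token}
}
```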
Another thing to think about would be whether or not the install.go code should bother generating a token at all - we need to have the node code running and updating tokens anyway, and it needs to do it ASAP, because otherwise it won't know what the returned expiry time is on the token generated by the CNI plugin installer.
I'll need to think about this one a bit more - LMK what you think.
Looking at the actual validity period is a good point. I will adapt the code accordingly. When should we try to replace the token? In the initial change I just statically assumed 1/3 of the (expected) validity period so that we still have enough time for another attempt if the token request fails.
Concerning the code duplication, I completely agree, but I did not want to touch/change too many things at once. Writing the kubeconfig is anyway completely duplicated in slightly different form. It would for sure be easier to have the code only in calico-node, but I understand it originated in the cni installation and only moved to calico-node when the token validity got restricted.
In case you want to continue to have the kubeconfig writing logic in two places, where should I put the token update code?
When should we try to replace the token? In the initial change I just statically assumed 1/3 of the (expected) validity period so that we still have enough time for another attempt if the token request fails.
I think 1/3 of the period is a fine enough way to do it - I might even do a token refresh every hour for the base case, and then switch to doing 1/3 of the period if the returned expiry is less than an hour long.
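A tiny sketch of that interval heuristic (the function name and thresholds are illustrative, not the logic that was ultimately merged):

```go
package cni

import "time"

// nextRefresh returns how long to wait before requesting a new token: hourly in
// the base case, or one third of the granted validity when the token is valid
// for less than an hour, leaving room for retries if a refresh attempt fails.
func nextRefresh(issued, expiry time.Time) time.Duration {
	if validity := expiry.Sub(issued); validity < time.Hour {
		return validity / 3
	}
	return time.Hour
}
```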
That actually makes me realize that these tokens are valid cluster-wide, not within the context of a particular node, so putting the logic to refresh the token within calico/node might not be right - it means we might get 5000 nodes all trying to refresh their token at the same time.
Think we might need a controller in kube-controllers which refreshes the token, and then watchers within calico/node that spot when the token has changed and update the config for that node.
Writing the kubeconfig is anyway completely duplicated in slightly different form
Yep, longer term I'd like to consolidate this as well. It's a remnant of cni-plugin and node being separate repositories in the past. I'd like to see us consolidate all of this into one place or the other eventually.
In case you want to continue to have the kubeconfig writing logic in two places, where should I put the token update code?
If the above is true, I think we probably want the token refreshing code to live in kube-controllers, and then all calico/node and the CNI plugin code needs to do is query the generated token via a Get()
I think the main argument against having the token code in kube-controllers is that kube-controllers is running in the pod network as far as I know. Therefore, it needs a running CNI to set up its network sandbox. Hence, it cannot be the precondition for CNI. Not sure how to follow up here.
@caseydavenport I changed the code as suggested to have just one location (in calico-node) and reuse it from within the cni installer. Let me know your feedback.
(Also included the check of the validity of the returned token.)
I think the main argument against having the token code in kube-controllers is that kube-controllers is running in the pod network as far as I know.
Yeah, good call. This would be a problem. I'll think on this some more.
One option would be to start running kube-controllers as host networked, but I'd really prefer not to do that.
The clusterrole change is now included. I am not sure if I got all occurrences as the clusterrole seems to be duplicated several times.
Looking great, thanks again for working on this.
Had a few more comments / thoughts / questions, so let me know what you think.
One minor comment but this is looking pretty good! Let's see how the tests do.
We should probably have some unit tests for this logic as well. We run some tests with a real API server, so we could probably just have the test code in this package create a token refresher with a short expiry time and make sure it gives us a few tokens as a quick and dirty functional verification, as part of this suite (Lines 278 to 284 in faa1d3c):
## Run the ginkgo FVs
fv: run-k8s-apiserver
	$(DOCKER_RUN) \
		-v $(CERTS_PATH):/home/user/certs \
		-e KUBECONFIG=/go/src/github.com/projectcalico/calico/hack/test/certs/kubeconfig \
		-e ETCD_ENDPOINTS=http://$(LOCAL_IP_ENV):2379 \
		$(CALICO_BUILD) ginkgo -cover -r -skipPackage vendor pkg/lifecycle/startup pkg/allocateip $(GINKGO_ARGS)
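A rough sketch of such an FV, assuming a hypothetical NewTokenRefresher constructor that takes the target service account and the requested validity; the real constructor and package layout may well differ:

```go
package cni

import (
	"os"
	"time"

	. "github.com/onsi/ginkgo"
	. "github.com/onsi/gomega"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

// Quick and dirty functional check against the real API server started by the
// FV target above: a refresher with a short validity should hand out several
// distinct tokens in a row.
var _ = Describe("TokenRefresher FV (sketch)", func() {
	It("returns a series of fresh tokens", func() {
		cfg, err := clientcmd.BuildConfigFromFlags("", os.Getenv("KUBECONFIG"))
		Expect(err).NotTo(HaveOccurred())
		clientset, err := kubernetes.NewForConfig(cfg)
		Expect(err).NotTo(HaveOccurred())

		// NewTokenRefresher and its signature are assumptions for this sketch.
		tr := NewTokenRefresher(clientset, "kube-system", "calico-node", 10*time.Minute)

		seen := map[string]bool{}
		for i := 0; i < 3; i++ {
			u := tr.UpdateToken()
			Expect(u.Error).NotTo(HaveOccurred())
			Expect(u.Token).NotTo(BeEmpty())
			seen[u.Token] = true
		}
		Expect(seen).To(HaveLen(3))
	})
})
```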
node/pkg/cni/token_watch.go (Outdated)

	clientset, _ := InClusterClientSet()
We should handle this error rather than ignore it (static checks will fail for this anyway)
I am not really sure if the handling I added is what you had in mind. The TokenRefresher cannot work without the clientset. Hence, I used a fatal error in that case.
Please let me know what you think.
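For illustration, the kind of handling being described, assuming Calico's usual logrus logging (the actual code may look different):

```go
package cni

import (
	"github.com/sirupsen/logrus"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

// InClusterClientSet builds a clientset from the in-cluster config; sketched
// here so the snippet is self-contained.
func InClusterClientSet() (*kubernetes.Clientset, error) {
	cfg, err := rest.InClusterConfig()
	if err != nil {
		return nil, err
	}
	return kubernetes.NewForConfig(cfg)
}

// The TokenRefresher cannot do anything useful without a clientset, so a
// failure to build one is treated as fatal rather than silently ignored.
func mustClientSet() *kubernetes.Clientset {
	clientset, err := InClusterClientSet()
	if err != nil {
		logrus.WithError(err).Fatal("Failed to create in-cluster Kubernetes client set")
	}
	return clientset
}
```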
/sem-approve
@caseydavenport I took a look at the failed cni-plugin tests in semaphoreci before looking into further tests. It looks like the tests fail because of the missing namespace file (
I added simple tests for the code using the approach suggested.
Now there is also a small test checking the creation of multiple tokens in a series.
/sem-approve
@caseydavenport Thanks for the hint. I excluded the new test from the unit tests as suggested. At least locally, that part now succeeds.
/sem-approve
@ScheererJ 🎉 it's merged! Thanks for the great work on this.
@caseydavenport Is there a plan to backport the fix to v3.23?
@skmatti mulling on that at the moment. We definitely want to let it soak a bit more before release, but it's also a substantial change with implications for permissions, etc., which normally would be a no-no for a patch release, so it's a hard sell as a backport.
[release-v3.23] Auto pick #5910: Use TokenRequest API instead of calico-nodes service account token for CNI kubeconfig
Hi @caseydavenport, sorry for commenting on a merged PR. I just wanted to add my 2 cts for backporting the fix to 3.23. The expiring tokens were introduced by default in Kubernetes v1.21. You can test this by deploying Calico on a 1.21 cluster, enabling audit logs in the API server, waiting >1 hour and creating a pod. You should see a message like this in the audit log:
{
"kind": "Event",
"apiVersion": "audit.k8s.io/v1",
"level": "Metadata",
"auditID": "4b3aa577-8407-4383-adff-c10f5954fd01",
"stage": "ResponseComplete",
"requestURI": "/api/v1/namespaces/default/pods/bad-pod/status",
"verb": "patch",
"user": {
"username": "system:serviceaccount:kube-system:calico-node",
"uid": "049fd9b3-f537-4fd3-b5ba-59350407e999",
"groups": [
"system:serviceaccounts",
"system:serviceaccounts:kube-system",
"system:authenticated"
],
"extra": {
"authentication.kubernetes.io/pod-name": [
"calico-node-c7jgk"
],
"authentication.kubernetes.io/pod-uid": [
"adef62fc-d593-4687-bf78-47ccf44f8785"
]
}
},
"sourceIPs": [
"10.7.255.126"
],
"userAgent": "calico/v0.0.0 (linux/amd64) kubernetes/$Format",
"objectRef": {
"resource": "pods",
"namespace": "default",
"name": "bad-pod",
"apiVersion": "v1",
"subresource": "status"
},
"responseStatus": {
"metadata": {},
"code": 200
},
"requestReceivedTimestamp": "2022-06-21T14:09:07.779004Z",
"stageTimestamp": "2022-06-21T14:09:07.786404Z",
"annotations": {
"authentication.k8s.io/stale-token": "subject: system:serviceaccount:kube-system:calico-node, seconds after warning threshold: 12006",
"authorization.k8s.io/decision": "allow",
"authorization.k8s.io/reason": "RBAC: allowed by ClusterRoleBinding \"calico-node\" of ClusterRole \"calico-node\" to ServiceAccount \"calico-node/kube-system\""
}
}

From the milestone it seems to me that 3.24 is not going to be released soon, but a patch release of 3.23 including the proper management of expiring tokens could be a quick win. Of course, I'm assuming a lot, and not being familiar with your internal processes maybe this doesn't make sense. Just sharing an end-user POV. Thanks a lot!
Hi @ralgozino, this pull request was backported to v3.23 with #6218, but there is no release yet. Best regards,
As you can see from #6218 - this fix is already backported to the v3.23 branch. As soon as it's soaked in our testing for a bit, we will release it in a new v3.23 patch.
This is a good point, thank you for raising it.
Thank you both for the quick answer and sorry that I missed #6218 in my research 🙌
Thanks for the pokes everyone - we are aiming to cut a v3.23 with this included soon, so stay tuned!
I am trying to upgrade kOps to use v3.23.2 and getting the following error for Canal:
Looking at the code, it seems that the service account name is hardcoded to calico-node. Any thoughts @caseydavenport @lwr20 @ScheererJ?
@hakman You are correct that the service account name is hardcoded; I was not aware of this use case. It would be fairly straightforward to adapt the code to use an environment variable, if set, instead of the default (calico-node).
If that would be fine, I could open a corresponding pull request for it. @caseydavenport / @lwr20 What do you think?
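For illustration, the kind of override being proposed; the environment variable name here is an assumption, not the one that was eventually chosen:

```go
package cni

import "os"

// serviceAccountName returns the service account to request CNI tokens for:
// an environment-variable override when set, otherwise the current default.
func serviceAccountName() string {
	if v := os.Getenv("CALICO_NODE_SERVICE_ACCOUNT_NAME"); v != "" { // variable name is illustrative
		return v
	}
	return "calico-node"
}
```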
@caseydavenport this is the bug we spotted in our end-to-end tests yesterday. I'll leave you to assist @ScheererJ to fix.
Cool. Thanks for the quick response!
Yep, this was my bad for not spotting this in the code review! I think an adaptation to allow specification via an env var would be appropriate. It can be populated via the k8s downward API quite easily:
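The snippet that followed here wasn't captured, but as an illustration (expressed with the Go API types rather than manifest YAML, and with an assumed variable name), wiring such an env var from the pod spec via the downward API could look like this:

```go
package main

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

func main() {
	// Expose the pod's own service account name to the container via the
	// downward API; the env var name is an assumption for this sketch.
	env := corev1.EnvVar{
		Name: "CALICO_NODE_SERVICE_ACCOUNT_NAME",
		ValueFrom: &corev1.EnvVarSource{
			FieldRef: &corev1.ObjectFieldSelector{
				FieldPath: "spec.serviceAccountName",
			},
		},
	}
	fmt.Printf("%+v\n", env)
}
```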
I'm working on a fix for this ^ EDIT: PR here: #6302 |
Description
Use TokenRequest API instead of calico-nodes service account token for CNI kubeconfig.
With projected service account tokens in kubernetes, the service account tokens of pods
are bound to the lifetime of the corresponding pod. Therefore, it may lead to problems
if an external process re-uses the token of a pod.
The CNI binaries used the token of calico-node. However, in case calico-node got stopped, the corresponding token lost its validity and hence could no longer be used for CNI
operations. Usually, this automatically resolves over time, but there are some edge cases
where this is not possible, e.g. if calico-node is autoscaled in terms of resources and
the new resource requests would require preemption/eviction of an existing pod, the CNI
operation to delete the network sandbox will fail due to the no longer valid token
(as calico-node was stopped beforehand).
This change switches over to using the TokenRequest API instead, i.e. creating new
tokens with limited validity. It would have been good to bind the token to an object,
e.g. to the corresponding node, but as of now only secret and pod are supported types
for binding tokens. Hence, the tokens are only limited in time and not bound to any
other kubernetes object.
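As a small sketch of the request described above (values are illustrative, not copied from the change), showing why the token ends up time-limited only:

```go
package main

import (
	"fmt"

	authenticationv1 "k8s.io/api/authentication/v1"
)

func main() {
	expirationSeconds := int64(24 * 60 * 60) // requested validity; illustrative
	spec := authenticationv1.TokenRequestSpec{
		ExpirationSeconds: &expirationSeconds,
		// BoundObjectRef only supports Pod and Secret kinds, so the token cannot
		// be bound to the Node object and is left unbound (time-limited only).
		BoundObjectRef: nil,
	}
	fmt.Printf("%+v\n", spec)
}
```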
Related issues/PRs
fixes #4857
fixes #5712
Todos
Release Note
Reminder for the reviewer
Make sure that this PR has the correct labels and milestone set.
Every PR needs one docs-* label.
- docs-pr-required: This change requires a change to the documentation that has not been completed yet.
- docs-completed: This change has all necessary documentation completed.
- docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.
- release-note-required: This PR has user-facing changes. Most PRs should have this label.
- release-note-not-required: This PR has no user-facing changes.

Other optional labels:
- cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
- needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.