Bootstrap worker nodes using kops-controller #9653
Conversation
/approve cancel
Force-pushed from c278b40 to d37e4ae
cmd/kops-controller/main.go
Outdated
var err error
switch opt.Server.Provider {
case kops.CloudProviderAWS:
    verifier, err = awsup.NewAWSVerifier()
We do have a node identifier in kops-controller already. We also have the node verifier in node-authorizer.
Not a blocker, but we can probably look at that code.
The verifier needs to authenticate the request data as having come from an instance that was created using the spec of one of the cluster's instancegroups. Bonus points for authenticating the particular instance group or even the particular instance.
node-authorizer makes two checks, neither of which is sufficient, even in combination. First, it verifies an instance identity document provided by the instance. This document is long-lived and replayable: it is not tied to any data in the rest of the request. The only thing tying the data in the request to the instance is a check that the source IP address of the incoming request matches that of the instance, which is not a particularly strong mechanism.
srv, err := server.NewServer(&opt, verifier)
if err != nil {
    setupLog.Error(err, "unable to create server")
    os.Exit(1)
I've recently found myself rewriting the kubebuilder main to call into a func run() error function immediately, just to avoid having to log & exit every time!
The structured logging, such as in the handling of addNodeController, makes it tricky to use that pattern.
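For reference, a minimal sketch of that pattern (newServer and start are hypothetical stand-ins for the real kops-controller setup, not code from this PR):

package main

import (
    "fmt"
    "os"
)

func main() {
    // Delegate to run() so errors are returned up the stack and
    // logged exactly once, instead of log-and-exit at every call site.
    if err := run(); err != nil {
        fmt.Fprintf(os.Stderr, "error: %v\n", err)
        os.Exit(1)
    }
}

func run() error {
    srv, err := newServer()
    if err != nil {
        return fmt.Errorf("unable to create server: %w", err)
    }
    return srv.start()
}

// Hypothetical stand-ins so the sketch is self-contained.
type server struct{}

func newServer() (*server, error) { return &server{}, nil }
func (s *server) start() error    { return nil }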
}
token = strings.TrimPrefix(token, AWSAuthenticationTokenPrefix)

// We rely on the client and server using the same version of the same STS library.
Should we instead send the body of the STS request?
Vault does that, but they have to support arbitrary clients. I took advantage of the ability to require the client to use the same version of the AWS library, to reduce the risk of attackers using kops-controller as a pass-through proxy to STS.
Ah - interesting point. Does this introduce a challenge for upgrading the AWS SDK though? If there's master vs node skew then I am worried that either old-version nodes or new nodes won't be able to join?
I didn't think supporting version skew was a requirement. Old nodes would only be an issue if they were created before the ASG's spec was updated yet didn't call kops-controller until after the first master was updated. The old node would be cleaned out by rolling update (though the cluster validation failure would impede rolling update from starting). New nodes would fail only until the first master is updated.
I do have an idea of making rolling update ignore non-ready nodes that are in nodes-role instance groups other than the one being updated.
So I think the scenario to be most mindful of is the master being ahead of the nodes, i.e. we update the AWS SDK and somehow the body changes, we roll that out for the control plane, but the nodes aren't yet updated. As you point out, old node configurations would fail to join in this scenario. Kubernetes itself tries very hard to ensure that the masters can be a minor version ahead of the nodes (indefinitely); we don't have to support that, but we should be aware of it.
Do we have an example of what's in the body? How likely is it to change?
At the time we run kops update cluster --yes (or terraform apply) the node ASGs will be updated. All new instances created after that point will be on the new version of the library. Similarly, all nodes that have joined by the time the first master is updated will continue to be in the cluster. The only issue would be a worker instance that is created before the kops update cluster --yes but takes longer to join the cluster than the time it takes for the first control plane node to get updated and apply the new kops-controller manifest.
The body usually has QWN0aW9uPUdldENhbGxlcklkZW50aXR5JlZlcnNpb249MjAxMS0wNi0xNQ==, which is the base64 encoding of Action=GetCallerIdentity&Version=2011-06-15. It doesn't seem likely to change often.
The other thing we're relying on being the same from client to server is the request URL.
OK, that sounds reasonable. We could create a unit test to check that the body is this value; if the unit test ever failed we could deal with it (e.g. constructing all allowed forms and seeing if any of them match the sha), but as you point out it seems unlikely.
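A sketch of such a unit test, assuming aws-sdk-go v1 and dummy static credentials; the package and test names are illustrative, and the read of req.HTTPRequest.Body assumes the query-protocol build behavior of that SDK version:

package awsup

import (
    "encoding/base64"
    "io/ioutil"
    "testing"

    "github.com/aws/aws-sdk-go/aws"
    "github.com/aws/aws-sdk-go/aws/credentials"
    "github.com/aws/aws-sdk-go/aws/session"
    "github.com/aws/aws-sdk-go/service/sts"
)

// TestGetCallerIdentityBody fails if an AWS SDK upgrade ever changes the
// GetCallerIdentity request body that client and server both hash.
func TestGetCallerIdentityBody(t *testing.T) {
    sess := session.Must(session.NewSession(aws.NewConfig().
        WithRegion("us-east-1").
        WithCredentials(credentials.NewStaticCredentials("AKID", "SECRET", ""))))
    req, _ := sts.New(sess).GetCallerIdentityRequest(nil)
    if err := req.Sign(); err != nil {
        t.Fatal(err)
    }
    body, err := ioutil.ReadAll(req.HTTPRequest.Body)
    if err != nil {
        t.Fatal(err)
    }
    want, _ := base64.StdEncoding.DecodeString("QWN0aW9uPUdldENhbGxlcklkZW50aXR5JlZlcnNpb249MjAxMS0wNi0xNQ==")
    if string(body) != string(want) {
        t.Errorf("STS request body changed: got %q, want %q", body, want)
    }
}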
return "", fmt.Errorf("incorrect content-length") | ||
} | ||
|
||
// TODO - implement retry? |
I suspect our client needs to retry anyway, so there might not be much reason for us to retry here.
Nodeup will retry, it's just a question of how long transient errors delay node startup.
if resp.StatusCode != http.StatusOK {
    detail := ""
    if resp.Body != nil {
        scanner := bufio.NewScanner(resp.Body)
Any reason not to just ioutil.ReadAll?
We only want the first line.
OK; not a problem. Unless there's something that we actively don't want to log, I'd personally just log the whole thing - little cost, potential benefit for debugging. But your call.
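For reference, the first-line pattern under discussion as a standalone helper (firstLine is a hypothetical name, not code from this PR):

package client

import (
    "bufio"
    "io"
)

// firstLine returns only the first line of an error response body,
// keeping the logged detail to a single line.
func firstLine(r io.Reader) string {
    scanner := bufio.NewScanner(r)
    if scanner.Scan() {
        return scanner.Text()
    }
    return ""
}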
Really great stuff. Mostly minor feedback. Topics that may be more interesting:
Force-pushed from d37e4ae to 5bd91e4
I wonder if there's now a circular dependency for most CNIs, with CNIs depending on a worker node joining and worker node joining depending on the CNI.
I don't think so. If you install a CNI manually, you wait for the masters to be available, but NOT READY, and then apply the CNI spec. The worker nodes join once the masters are ready. Looking quickly at this, it looks like dns-controller is not updating the DNS, and dns-controller uses the host network.
/retest
1 similar comment
/retest
// APIVersion defines the versioned schema of this representation of a request.
APIVersion string `json:"apiVersion"`
// Certs are the requested certificates and their respective public keys.
Certs map[string]string `json:"certs"`
You might want to make this Certs map[string]*CertRequest or Certs []*CertRequest, just to give you more extensibility.
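For illustration, the suggested shape might look like this (CertRequest and its fields are hypothetical):

// CertRequest can carry the public key plus future per-cert options
// without changing the wire format.
type CertRequest struct {
    // PublicKey is the PEM-encoded public key to be signed.
    PublicKey string `json:"publicKey"`
}

type BootstrapRequest struct {
    // APIVersion defines the versioned schema of this representation of a request.
    APIVersion string `json:"apiVersion"`
    // Certs are the requested certificates, keyed by name.
    Certs map[string]*CertRequest `json:"certs"`
}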
It depends on whether we need to support bootstrapping nodes created with a different version, which I discuss below.
Did you post this discussion? I'm not seeing it.
It's in the block of comments on aws_verifier.go line 129.
// BootstrapResponse is a response to a BootstrapRequest.
type BootstrapResponse struct {
    // Certs are the issued certificates.
    Certs map[string]string
Ditto here - map[string]*Cert doesn't cost much now, might avoid version contortions later.
}
httpReq.Header.Set("Content-Type", "application/json")

token, err := b.Authenticator.CreateToken(reqBytes)
Thanks for explaining the signature challenge. I do understand, but ... can we implement this on GCE / OpenStack / Bare-Metal? I know it's not as simple, but could we do something like this:
type SignedBootstrapRequest struct {
    Request *BootstrapRequest `json:"request"`
    AWSSecureTokenService *AWSSecureTokenService `json:"awsSecureTokenService"`
    // GCESignature *GCESignature
}

type AWSSecureTokenService struct {
    Headers map[string]string `json:"headers"`
}
It feels more extensible (for other clouds, possible for AWS also)
I believe this interface is implementable by other clouds, assuming those clouds have some way to authenticate an instance. The interface I defined seems to me to have been reduced to the essentials: the provider-specific client implementation gets the serialized body and produces an authenticator. The common code transfers the authenticator in the Authorization: header. The provider-specific server implementation gets the serialized body and the authenticator and gets to say what the node name is or reject the request.
I'm presuming that multi-provider clusters are well out of scope.
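A sketch of the interface shape being described (only CreateToken appears in the diff above; the Verifier method signature here is an assumption):

// Authenticator is implemented per cloud provider on the client (nodeup) side.
type Authenticator interface {
    // CreateToken signs the serialized request body and returns an opaque
    // token that the common code transfers in the Authorization: header.
    CreateToken(body []byte) (string, error)
}

// Verifier is implemented per cloud provider on the kops-controller side.
type Verifier interface {
    // VerifyToken checks the token against the serialized body and returns
    // the node name, or an error to reject the request.
    VerifyToken(token string, body []byte) (string, error)
}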
Agree that multi-provider clusters should be out of scope. Even if we have a central management cluster, I think we would want node bring-up to be done relatively locally. I could imagine forwarding the request to another service, but I imagine that would essentially be wrapping the current request anyway.
I can see the authorization header as isomorphic to what I described - we're encoding the signature data (with the body) into json, base64 encoding it, and then putting that in the authorization header. I'm really not sure if that is simpler, but it's likely tied up in the skew problem ... I'll ponder!
Putting the token in the header simplifies the task of getting the signature to include the request, since you can then just take a secure hash of the entire body. Putting it into the Authorization: header means anything in the middle is likely to treat it as confidential.
I'm OK to proceed on this basis; if we do find that on another cloud it's much trickier to put it into the header (e.g. GCE's TPM flow) then we might have to put it into the body for that cloud - is that reasonable?
I think that's reasonable. If that were to happen, we'd probably do something like change the interfaces to return a modified body.
So I like this PR a lot! My only reservation is around making the requests more structured, to allow for evolution and support for other clouds. Re: deprecating node-authorizer, let's get this going first, then check with users!
Force-pushed from aef5c86 to 744edaf
Force-pushed from 744edaf to 1a253dc
/retest
One prow networking failure before we even touch our code; the other test looks like it actually passed.
This LGTM ... I think we agreed to merge and aim for 1.19 (instead of 1.20), so...
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: justinsb
/test pull-kops-e2e-cni-cilium
/retest
1 similar comment
/retest
The idea is for nodeup to generate private keys, send the public keys to kops-controller, and have kops-controller issue the necessary certificates.
This is far enough along to prove the concept. Posting to get early review (especially from @justinsb) while I plumb in the remaining pieces.

On AWS for Kubernetes 1.19 and later, worker nodes bootstrap by generating a private key and authenticating to kops-controller with a cloud-provider-specific mechanism. Kops-controller then verifies the authentication and issues a node-specific kubelet certificate.
This new bootstrap mechanism replaces both of the previous mechanisms: reading a cluster-wide kubelet cert from the state store, and node-authorizer. There is no option to use either of these on Kubernetes 1.19+ in AWS.
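A rough sketch of that client-side flow, using the request/response types shown earlier (the endpoint path, API version string, and createToken wiring are illustrative assumptions, not the PR's exact code):

package nodebootstrap

import (
    "bytes"
    "crypto/rand"
    "crypto/rsa"
    "crypto/x509"
    "encoding/json"
    "encoding/pem"
    "fmt"
    "net/http"
)

// BootstrapRequest and BootstrapResponse mirror the types shown earlier.
type BootstrapRequest struct {
    APIVersion string            `json:"apiVersion"`
    Certs      map[string]string `json:"certs"`
}

type BootstrapResponse struct {
    Certs map[string]string
}

// bootstrapNode generates a private key locally (only the public key leaves
// the node), signs the serialized body with the cloud-specific authenticator,
// and asks kops-controller to issue the kubelet certificate.
func bootstrapNode(serverURL string, createToken func(body []byte) (string, error)) (*rsa.PrivateKey, *BootstrapResponse, error) {
    key, err := rsa.GenerateKey(rand.Reader, 2048)
    if err != nil {
        return nil, nil, err
    }
    der, err := x509.MarshalPKIXPublicKey(&key.PublicKey)
    if err != nil {
        return nil, nil, err
    }
    pubPEM := pem.EncodeToMemory(&pem.Block{Type: "PUBLIC KEY", Bytes: der})

    body, err := json.Marshal(&BootstrapRequest{
        APIVersion: "bootstrap.kops.k8s.io/v1alpha1", // hypothetical version string
        Certs:      map[string]string{"kubelet": string(pubPEM)},
    })
    if err != nil {
        return nil, nil, err
    }
    token, err := createToken(body) // e.g. the AWS STS-based authenticator
    if err != nil {
        return nil, nil, err
    }

    req, err := http.NewRequest("POST", serverURL+"/bootstrap", bytes.NewReader(body))
    if err != nil {
        return nil, nil, err
    }
    req.Header.Set("Content-Type", "application/json")
    req.Header.Set("Authorization", token)

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        return nil, nil, err
    }
    defer resp.Body.Close()
    if resp.StatusCode != http.StatusOK {
        return nil, nil, fmt.Errorf("bootstrap failed: %s", resp.Status)
    }
    var out BootstrapResponse
    if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
        return nil, nil, err
    }
    return key, &out, nil
}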
Additionally, this enables Kubernetes Node authorization for Kubernetes 1.19 and later, on all cloud providers.

Cribs from #8580. The authentication mechanism for AWS is based on that used by Vault. The code is structured to allow other cloud providers to provide their own authentication mechanisms.
Followup PRs will switch other worker node certs to use this mechanism.
Closes #1231