Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rpc,security: simplify and enhance the client cert validation #96207

Merged
merged 3 commits into from
Feb 1, 2023

Conversation

knz
Copy link
Contributor

@knz knz commented Jan 30, 2023

Fixes #96174.
Prerequisite for #96153.
Epic: CRDB-14537

TLDR: this change enhances various aspects of TLS client cert
validation. Between other things, it makes the error clearer in case
of authentication failure.

Example, before:

ERROR: requested user demo is not authorized for tenant {2}

Example, after:

ERROR: certificate authentication failed for user "demo"
DETAIL: This is tenant system; cert is valid for "root" on all tenants.

@knz knz requested review from a team as code owners January 30, 2023 18:03
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Collaborator

@stevendanna stevendanna left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Left some nitpics and questions but nothing blocking.

@@ -187,10 +187,28 @@ func UserAuthCertHook(
if ValidateUserScope(certUserScope, systemIdentity.Normalized(), tenantID) {
return nil
}
return errors.Errorf("requested user %s is not authorized for tenant %d", systemIdentity, tenantID)
return errors.WithDetailf(errors.Errorf("certificate authentication failed for user %q", systemIdentity),
"You are connecting to tenant %v. The client cert is valid for %s.", tenantID, FormatUserScopes(certUserScope))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think cert is a fine abbreviation in code and perhaps for options. But for the error message we may want to s/cert/certificate/ here and in the other error message this PR updates.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll leave it to security reviewers to decide whether we want to leak the tenantID of the service that the client connected to here. My sense is that it the benefit from the improved debuggability is probably worth it.

Comment on lines +201 to +207
fmt.Fprintf(&buf, "%s%q on ", comma, scope.Username)
if scope.Global {
buf.WriteString("all tenants")
} else {
fmt.Fprintf(&buf, "tenant %v", scope.TenantID)
}
comma = ", "
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[style] If we implemented String() on security.CertificateUserScope, then this could be

		fmt.Fprintf(&buf, "%s%s", comma, scope.String())
		comma = ", "

which I suppose allocate a bit more, so I leave it to your judgement.

knz added 2 commits January 31, 2023 23:31
TLDR: this commit enhances various aspects of TLS client cert
validation. Between other things, it makes the error clearer in case
of authentication failure.

Example, before:
```
ERROR: requested user demo is not authorized for tenant {2}
```

Example, after:
```
ERROR: certificate authentication failed for user "demo"
DETAIL: The client certificate is valid for "root" on all tenants.
```

----

Changes to client cert authentication:

The error message was improved.

Changes to tenant authentication in the RPC subsystem:

Prior to this change, there was a function `getActiveNodeOrUserScope`
whose purpose as not obvious at first sight -- it was merely checking
that the client cert contained the `root` or `node` identity.

This commit reduces the complexity of this function to make it more
maintainable.

Additionally, upon an authn error the text of the error was a bit
complex to read, and did not include details about the client cert.
This commit enhances the error and makes it show the user scopes
included in the client cert.

Finally, the function contained an exception to prevent the `node`
principal from running RPCs on a non-system tenant. This was too
restrictive; it prevents us (in a later commit) from reusing the
node cert for tenant-tenant connections when using shared-process
multitenancy. So this commit removes that exception.

Release note: None
@knz knz force-pushed the 20230130-cert-improvement branch from 111432e to fb78a5d Compare January 31, 2023 22:47
Copy link
Contributor Author

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @catj-cockroach, @jeffswenson, @rafiss, and @stevendanna)


pkg/security/auth.go line 191 at r2 (raw file):

Previously, stevendanna (Steven Danna) wrote…

I'll leave it to security reviewers to decide whether we want to leak the tenantID of the service that the client connected to here. My sense is that it the benefit from the improved debuggability is probably worth it.

Andrei made the case separately that indeed we should not leak that information. I'm removing it.


pkg/security/auth.go line 207 at r2 (raw file):

Previously, stevendanna (Steven Danna) wrote…

[style] If we implemented String() on security.CertificateUserScope, then this could be

		fmt.Fprintf(&buf, "%s%s", comma, scope.String())
		comma = ", "

which I suppose allocate a bit more, so I leave it to your judgement.

Let's first see (later) how far we want to involve into this cert user scope story. Maybe we can simplify it.

@knz
Copy link
Contributor Author

knz commented Jan 31, 2023

bors r=stevendanna

@knz
Copy link
Contributor Author

knz commented Jan 31, 2023

i left a buglet in here

bors r-

@craig
Copy link
Contributor

craig bot commented Jan 31, 2023

Canceled.

@knz knz force-pushed the 20230130-cert-improvement branch from fb78a5d to 2609923 Compare January 31, 2023 23:41
@knz
Copy link
Contributor Author

knz commented Feb 1, 2023

bors r=stevendanna

@craig
Copy link
Contributor

craig bot commented Feb 1, 2023

Build failed (retrying...):

@craig
Copy link
Contributor

craig bot commented Feb 1, 2023

Build failed (retrying...):

@knz
Copy link
Contributor Author

knz commented Feb 1, 2023

hm maybe another bug here. will investigate first.

bors r-

@craig
Copy link
Contributor

craig bot commented Feb 1, 2023

Canceled.

This patch more clearly separate authentication from authorization,
and makes it clear when the flow involves a tenant ID.

Release note: None
@knz knz force-pushed the 20230130-cert-improvement branch from 2609923 to cca7a9b Compare February 1, 2023 09:21
@knz
Copy link
Contributor Author

knz commented Feb 1, 2023

bors r=stevendanna single on

@craig
Copy link
Contributor

craig bot commented Feb 1, 2023

Build succeeded:

@craig craig bot merged commit 8b35f22 into cockroachdb:master Feb 1, 2023
@knz knz deleted the 20230130-cert-improvement branch February 1, 2023 11:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Potentially misleading error message: "requested user blah_blah is not authorized for tenant {1}"
3 participants