server,rpc,security: use a separate sql-node.crt for SQL pods
#71190
base: master
Is this code that we'd expect to remove once we move to multiple CAs? If so, do we need to add to the CLI, since CC generates its own certificates rather than using the CLI? Also, if so, are there any changes that would be difficult to back-out later on?
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @catj-cockroach and @dhartunian)
The CLI change exists for the benefit of manual and automated testing. This is also true of the other CLI commands that generate MT certs. The code as implemented here uses the tenant CA to sign (and verify) the new sql-node cert. This was an arbitrary choice on my part, and I can revert that to use the "main" CA found by a SQL server in the file ca.crt (and we'd assume that CA is specific to SQL pods, not shared with the host cluster). I can also implement a fallback behavior, where we try one CA cert first, and if that doesn't exist, try another one. @catj-cockroach please instruct further.
@chrisseto @catj-cockroach instead of the approach I've taken here (new sql-node cert), I could also update this PR to use the tenant cert directly (the one used to connect to KV pods) to connect SQL pods to each other. What do you think?
I think I like the separation of concerns by using a sql-node cert rather than reusing the tenant -> kv cert. It feels a bit more intuitive to me. I'll defer to @catj-cockroach as the TLS expert, however.
I think the separate sql-node certificate is better from a separation of concerns point of view ... but if we plan to throw this out anyway, I'd lean towards whatever results in the minimum code change that's easiest to later remove.
I'm sorry, I should have updated here as well! Personally I'm fine with using the same certificate for both. It's fewer keys to rotate in the event of a server compromise, and all the certificate is doing is attesting that it is a SQL server to the KV layer and the rest of the SQL layer. So whatever is easier for you @knz! :)
Release note (security update): Multitenant SQL servers now use a separate certificate with `CN=sql-node` and filename `sql-node.crt`. Using the `node.crt` generated for the system tenant is not possible any more.
RFAL
Reviewed 33 of 37 files at r1, 29 of 29 files at r2, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dhartunian and @knz)
pkg/security/certificate_loader.go, line 470 at r2 (raw file):
    // values of certain fields.
    // This should only be called on the NodePem CertInfo when there is no specific
    // client certificate for the 'node' user.
Could we update this comment to:
// This should only be called on the NodePem CertInfo when there is no specific
// client certificate for the 'node' or 'sql-node' user.
pkg/security/certificate_manager.go, line 130 at r2 (raw file):
    uiCACert *CertInfo // optional: certificate to verify UI certficates
    nodeCert *CertInfo // certificate for nodes (always server cert, sometimes client cert)
    tenantNodeCert *CertInfo // certificate for SQL tenant servers (always server cert, sometimes client cert)
I'm not certain I understand this comment. Does this mean "sometimes the CertInfo value is referring to a client certificate in addition to when it's referring to a server certificate"?
pkg/security/certs.go, line 297 at r2 (raw file):
    overwrite bool,
    hosts []string,
    forTenant bool,
Why not expose `baseNodeUser` here instead of using a boolean?
Edit: oh, I see, we use this to determine the label and the filesystem paths! I'd argue we should have a `createNodePair` and a `createTenantNodePair`, and move the duplicated code that it makes sense to share into a new function. From the looks of things, we could easily wrap everything from the start of the function to where we're setting the labels and determining filenames.
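The split suggested above could look roughly like the following. This is only a sketch of the shape, not the code in `pkg/security/certs.go`: `pairPlan` and `planPair` are hypothetical names, the helper merely records what would be generated instead of signing anything, and the `sql-node.key` filename is an assumption derived from `sql-node.crt`.

```go
package main

import "errors"

// pairPlan is a hypothetical stand-in for the result of generating a
// certificate/key pair. It only records the parameters that would be used.
type pairPlan struct {
	commonName string
	certFile   string
	keyFile    string
	hosts      []string
	overwrite  bool
}

// planPair holds the logic shared by both wrappers: argument validation
// plus deriving the filenames from a base name (hypothetical helper).
func planPair(commonName, baseName string, hosts []string, overwrite bool) (pairPlan, error) {
	if len(hosts) == 0 {
		return pairPlan{}, errors.New("no hosts specified for " + commonName + " certificate")
	}
	return pairPlan{
		commonName: commonName,
		certFile:   baseName + ".crt",
		keyFile:    baseName + ".key", // assumed naming for the key file
		hosts:      hosts,
		overwrite:  overwrite,
	}, nil
}

// createNodePair covers the host-cluster node case (CN=node, node.crt).
func createNodePair(hosts []string, overwrite bool) (pairPlan, error) {
	return planPair("node", "node", hosts, overwrite)
}

// createTenantNodePair covers the SQL pod case (CN=sql-node, sql-node.crt).
func createTenantNodePair(hosts []string, overwrite bool) (pairPlan, error) {
	return planPair("sql-node", "sql-node", hosts, overwrite)
}
```

With this shape the `forTenant bool` parameter disappears from the call sites, and each caller picks the wrapper that names its intent.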
pkg/security/username.go, line 75 at r2 (raw file):
    // IsNodeUser is true iff the username designates the node user.
    func (s SQLUsername) IsNodeUser() bool {
        return s.u == NodeUser
    }
Should we have an equivalent version of this function for SQLNodeUser?
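For illustration, the equivalent method might look like the sketch below. The `SQLUsername` type is reduced to a minimal stand-in here, `IsSQLNodeUser` is a hypothetical name, and the value `"sql-node"` for `SQLNodeUser` is an assumption based on the `CN=sql-node` in the release note.

```go
package main

// SQLUsername is a minimal stand-in for the type in pkg/security/username.go.
type SQLUsername struct {
	u string
}

const (
	// NodeUser is the username of the host-cluster node user.
	NodeUser = "node"
	// SQLNodeUser is assumed to match the CN of the new sql-node certificate.
	SQLNodeUser = "sql-node"
)

// IsNodeUser is true iff the username designates the node user.
func (s SQLUsername) IsNodeUser() bool {
	return s.u == NodeUser
}

// IsSQLNodeUser is true iff the username designates the sql-node user
// (hypothetical counterpart to IsNodeUser).
func (s SQLUsername) IsSQLNodeUser() bool {
	return s.u == SQLNodeUser
}
```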
We're going to deprioritize this approach in favor of #71248.
71248: rpc,security: use the tenant client cert for pod-pod communication r=catj-cockroach a=knz

Fixes #71106
Alternative design to #71190
Epic: SEC-665

As of this patch, we have the following file usage:

- KV nodes on host cluster:
  - ui.crt (optional):
    - used as server cert for HTTP
  - ui-ca.crt (optional):
    - used in unit tests to verify the server's identity for HTTP conns
  - node.crt:
    - used as client cert for node-to-node comms
    - used as server cert for node-to-node comms
    - used as server cert for SQL clients
    - used as server cert for incoming conns from SQL tenant servers
    - used as server cert for HTTP, if ui.crt doesn't exist
  - tenant-client-ca.crt (optional):
    - used to verify client certs for SQL tenant servers
  - client-ca.crt (optional):
    - used to verify client certs for SQL clients
    - used to verify client certs for SQL tenant servers, if tenant-client-ca.crt doesn't exist
  - ca.crt:
    - used to verify other node client certs for node-to-node comms
    - used in unit tests to verify the server's identity for SQL and RPC conns
    - used to verify client certs for SQL clients, if client-ca.crt doesn't exist
    - used to verify client certs for SQL tenant servers, if neither tenant-client-ca.crt nor client-ca.crt exist
- SQL servers:
  - ui.crt (optional):
    - used as server cert for HTTP
  - ui-ca.crt (optional):
    - used in unit tests to verify the server's identity for HTTP conns
  - client-tenant.NN.crt:
    - used as client cert for node-to-node comms (SQL server to SQL server)
    - used as server cert for node-to-node comms (SQL server to SQL server)
    - used as client cert for conns to KV nodes
    - used as server cert for SQL clients
    - used as server cert for HTTP, if ui.crt doesn't exist
  - tenant-client-ca.crt (optional):
    - used to verify client certs for SQL tenant servers
  - client-ca.crt (optional):
    - used to verify client certs for SQL clients
    - used to verify client certs for SQL tenant servers, if tenant-client-ca.crt doesn't exist
  - ca.crt:
    - used to verify other SQL server certs for node-to-node comms, if tenant-client-ca.crt doesn't exist
    - used to verify client certs for SQL clients, if client-ca.crt doesn't exist
    - used to verify client certs for SQL tenant servers, if neither tenant-client-ca.crt nor client-ca.crt exist
    - used in unit tests to verify the server's identity for SQL and RPC conns

Release note (security update): Multitenant SQL servers now reuse the tenant client certificate (`client-tenant.NN.crt`) for SQL-to-SQL communication. Existing deployments must regenerate the certificates with dual purpose (client and server authentication).

71330: Use include_cached to speed up build time, adding comment tags. r=knz a=ianjevans

This PR has minor changes to the Markdown output of the release notes script. The docs team now uses the `include_cached` plugin for Jekyll for common included files to speed up build times. And I wrapped the comment for the docs team in Liquid comment tags.

release notes: none

Co-authored-by: Raphael 'kena' Poss <[email protected]>
Co-authored-by: ianjevans <[email protected]>
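The CA fallback order in the #71248 file-usage list (tenant-client-ca.crt, then client-ca.crt, then ca.crt when verifying client certs for SQL tenant servers) can be sketched as a first-match lookup. This is an illustration under assumptions, not the certificate manager's actual code: `firstExisting`, `fileExists`, `tenantClientCAPath`, and `ErrNoCA` are hypothetical names, and only the three filenames come from the list.

```go
package main

import (
	"errors"
	"os"
	"path/filepath"
)

// ErrNoCA is returned when none of the candidate CA files is present.
var ErrNoCA = errors.New("no CA certificate found")

// firstExisting returns the first candidate for which exists reports true.
// The existence check is injected so the lookup order can be tested
// without touching the filesystem.
func firstExisting(candidates []string, exists func(string) bool) (string, error) {
	for _, c := range candidates {
		if exists(c) {
			return c, nil
		}
	}
	return "", ErrNoCA
}

// fileExists reports whether path names a regular file on disk.
func fileExists(path string) bool {
	info, err := os.Stat(path)
	return err == nil && info.Mode().IsRegular()
}

// tenantClientCAPath picks the CA used to verify client certs for SQL
// tenant servers, in the fallback order given in the file-usage list.
func tenantClientCAPath(certsDir string) (string, error) {
	return firstExisting([]string{
		filepath.Join(certsDir, "tenant-client-ca.crt"),
		filepath.Join(certsDir, "client-ca.crt"),
		filepath.Join(certsDir, "ca.crt"),
	}, fileExists)
}
```

Calling `tenantClientCAPath` on a certs directory returns the most specific CA file present there, or `ErrNoCA` if even `ca.crt` is missing.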
Fixes #71106.
Release note (security update): Multitenant SQL servers now use a separate certificate with `CN=sql-node` and filename `sql-node.crt`. Using the `node.crt` generated for the system tenant is not possible any more.