Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

security: add CA and client certs for tenant usage #49864

Merged
merged 1 commit into from
Jun 10, 2020

Conversation

tbg
Copy link
Member

@tbg tbg commented Jun 4, 2020

This commit introduces methods that create a CA certificate for use with
the auth broker envisioned in #49105 and, from that CA, client
certificates for use in #47898. They are not exposed through the cli
(yet), though the certificate loader already supports them.

Touches #49105.
Touches #47898.

Release note: None

@tbg tbg added the A-multitenancy Related to multi-tenancy label Jun 4, 2020
@tbg tbg requested review from bdarnell, knz and aaron-crl June 4, 2020 09:33
@cockroach-teamcity
Copy link
Member

This change is Reviewable

Copy link
Contributor

@knz knz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl and @bdarnell)

Copy link
Contributor

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl and @tbg)


pkg/security/certs.go, line 505 at r1 (raw file):

}

// WriteTenantClientPair writes a TenantClientPair into certsDir.

Where (on which kind of server) do tenant certs get written to disk? In my understanding of the design these certs are all ephemeral and can be kept in memory.

@tbg tbg requested a review from bdarnell June 5, 2020 17:47
Copy link
Member Author

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl and @bdarnell)


pkg/security/certs.go, line 505 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Where (on which kind of server) do tenant certs get written to disk? In my understanding of the design these certs are all ephemeral and can be kept in memory.

Your understanding is correct. In production a client pair either comes from the auth broker (either directly or injected at pod startup, likely through a file then?). I could move this into _test.go to keep the public API clean for now, though I anticipated using this for a first plumbing pass of the security work that switches to these certificates (without using the auth broker yet) as well.

Copy link
Contributor

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 6 files at r1.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl and @tbg)


pkg/security/certificate_loader.go, line 90 at r1 (raw file):

	ClientPem
	// TenantClientPem describes a SQL tenant client certificate.
	TenantClientPem

How does TenantClientPem relate to NodePem (the "node client" cert)? Wouldn't a SQL tenant pod always use a tenant cert instead of a node cert? I'm tempted to say we shouldn't have separate variables for TenantClientPem and should just load the tenant cert as the node client cert.

At first I wanted to make this argument for TenantCAPem as well, but this gets trickier - KV servers need to be able to accept connections from both the system tenant and other tenants, and they must be able to distinguish between them (since the tenant CA is online in the auth broker, it's important that it not be able to mint omnipotent node certs). So maybe if we keep that we might as well keep TenantClientPem too.


pkg/security/certificate_manager.go, line 198 at r1 (raw file):

func CACertFilename() string { return "ca" + certExtension }

// TenantCACertPath returns the expected file path for the Tenant CA

FYI I have a branch I'm working on that unexports all the "Path" methods from CertificateManager.


pkg/security/certificate_manager.go, line 269 at r1 (raw file):

// TenantClientKeyFilename returns the expected file name for the user's key.
func TenantClientKeyFilename(tenantIdentifier string) string {
	return "tenant." + tenantIdentifier + keyExtension

This makes me nervous because it implies we might have keys for multiple clients written to disk in the same place. As I said in certs.go, I'd just avoid doing anything with naming conventions for these certs.


pkg/security/certs.go, line 443 at r1 (raw file):

}

// TenantClientPair are client certs for use with multi-tenancy.

Why a specific type instead of KeyPair? Why hard-code RSA for the private key ?(instead of crypto.PrivateKey. For that matter, we should get @aaron-crl 's opinion on whether we should be using RSA, ECDSA, or something else here. This is a purely internal part of the system so we don't have to worry as much about rest-of-world compatibility) What format are those bytes in? Why isn't this just a tls.Certificate (which despite the name can carry a private key)?

(for a lot of my comments the answer could be "because that's what this file already does elsewhere". It's accumulated a lot of tech debt but we don't have to fix it all right now)


pkg/security/certs.go, line 505 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

Your understanding is correct. In production a client pair either comes from the auth broker (either directly or injected at pod startup, likely through a file then?). I could move this into _test.go to keep the public API clean for now, though I anticipated using this for a first plumbing pass of the security work that switches to these certificates (without using the auth broker yet) as well.

OK. For some additional context, I'm working on some cleanups here to make the certificate manager less file-centric (anticipating future usage of HSM/KMS), and to split it up so we don't have all the different use cases sharing one cert manager object with a bunch of possibly-nil fields.

So I don't think following the current cert patterns is a requirement or even necessarily a good idea. If you'll just be loading certs from disk in testing scenarios, don't worry about enforcing conventions and require the path to the files to be passed on the command line.


pkg/security/x509.go, line 198 at r1 (raw file):

	lifetime time.Duration,
	user string,
	hosts ...string,

Why would client certs have hosts? Are you trying to make them non-transferable and restricted to a single source pod? Does that work? It kind of makes sense but I can't find any reference to this being done. (and in particular verifying clients with DNS lookups is problematic; it might only make sense for it to use IP addresses)

Copy link
Member Author

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @bdarnell, and @tbg)


pkg/security/certificate_loader.go, line 90 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

How does TenantClientPem relate to NodePem (the "node client" cert)? Wouldn't a SQL tenant pod always use a tenant cert instead of a node cert? I'm tempted to say we shouldn't have separate variables for TenantClientPem and should just load the tenant cert as the node client cert.

At first I wanted to make this argument for TenantCAPem as well, but this gets trickier - KV servers need to be able to accept connections from both the system tenant and other tenants, and they must be able to distinguish between them (since the tenant CA is online in the auth broker, it's important that it not be able to mint omnipotent node certs). So maybe if we keep that we might as well keep TenantClientPem too.

Yes, tenants have no use for node certs. They use their tenant client cert to talk to KV and they also have some additional cert they use for their Postgres-TLS which doesn't factor in here (yet, though we'll need one). Effectively, we'll set up an rpc.Context for them that uses the tenant client cert instead of the node client cert but otherwise looks mostly the same.

For practical reasons, I would prefer to just "do how this package does" at the time of writing and (i.e. this PR mostly as is) - if you're amenable to it - rely on you to revisit when you're moving away from the file-centric view (which implies that you'll be figuring out the certs-dir complications across the codebase that would otherwise be front-loaded on me).
I'm mostly trying to move through quickly to get these certs into the right places in the SQL tenant cli using the standard --certs-dir flag to build out the tenant server endpoint in the KV layer, and as of this PR I'm there (mod the auth broker, i.e. for testing with filesystem certs, and possibly multiple different tenant's processes at once). If I do anything differently, I have to rework the flags, how the RPC context loads the certs, etc, I think there's just going to be a lot of fallout that I'm not the right person for handling; I think this is better addressed in the work that you've set to do, if I understand it correctly.

I also think there might be a good case for loading tenant cli certs from disk in prod: we will need to inject an initial tenant cert at boot time; naively the filesystem is the right way to do it (or is an env var better?).

Let me know which changes you want me to make before merging.


pkg/security/certificate_manager.go, line 269 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

This makes me nervous because it implies we might have keys for multiple clients written to disk in the same place. As I said in certs.go, I'd just avoid doing anything with naming conventions for these certs.

👍 but see my other comment.


pkg/security/certs.go, line 505 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

OK. For some additional context, I'm working on some cleanups here to make the certificate manager less file-centric (anticipating future usage of HSM/KMS), and to split it up so we don't have all the different use cases sharing one cert manager object with a bunch of possibly-nil fields.

So I don't think following the current cert patterns is a requirement or even necessarily a good idea. If you'll just be loading certs from disk in testing scenarios, don't worry about enforcing conventions and require the path to the files to be passed on the command line.

I agree in principal, but see my other comment.


pkg/security/x509.go, line 198 at r1 (raw file):

Previously, bdarnell (Ben Darnell) wrote…

Why would client certs have hosts? Are you trying to make them non-transferable and restricted to a single source pod? Does that work? It kind of makes sense but I can't find any reference to this being done. (and in particular verifying clients with DNS lookups is problematic; it might only make sense for it to use IP addresses)

I just saw that it seemed possible, and so it seemed desirable to try to do it. If you consider this exotic/not worth it I can take it out (otherwise I'd just see whether it works once I have the auth broker). It does seem nice to not have the "token" be transferable to another host. OTOH maybe this is over the top and just causes headaches with pod migrations etc if that is a thing.

Copy link
Contributor

@bdarnell bdarnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No changes required at this time except removing the host option when generating client certs.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @bdarnell, and @tbg)


pkg/security/x509.go, line 198 at r1 (raw file):

Previously, tbg (Tobias Grieger) wrote…

I just saw that it seemed possible, and so it seemed desirable to try to do it. If you consider this exotic/not worth it I can take it out (otherwise I'd just see whether it works once I have the auth broker). It does seem nice to not have the "token" be transferable to another host. OTOH maybe this is over the top and just causes headaches with pod migrations etc if that is a thing.

I'd leave it out - in our standard setup, we already have client certs with IPs and hostnames set in SANs but they're not used for client cert verification (this is because our node certs are hybrid client/server). We wouldn't want to add enforcement that would suddenly change the behavior of existing certs.

This commit introduces methods that create a CA certificate for use with
the auth broker envisioned in cockroachdb#49105 and, from that CA, client
certificates for use in cockroachdb#47898. They are not exposed through the cli
(yet), though the certificate loader already supports them.

Touches cockroachdb#49105.
Touches cockroachdb#47898.

Release note: None
@tbg tbg requested review from bdarnell and knz June 10, 2020 07:40
Copy link
Member Author

@tbg tbg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR! I removed the host option.

bors r=bdarnell

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @aaron-crl, @bdarnell, @knz, and @tbg)

@craig
Copy link
Contributor

craig bot commented Jun 10, 2020

Build succeeded

@craig craig bot merged commit 7cf408b into cockroachdb:master Jun 10, 2020
@tbg tbg deleted the mt-authtoken branch June 10, 2020 15:58
tbg added a commit to tbg/cockroach that referenced this pull request Jun 11, 2020
This is an addendum to cockroachdb#49864. I had added a tenant CA and tenant client
certs but did not add tenant server CA and certs. However, this is
necessary:a KV server run in a multi-tenancy environment will use

- tenant server certs (issued by tenant server CA) to host TLS-enabled
  endpoint (for use by SQL tenant processes)
- tenant client CA (to validate incoming connections)
- node server certs (for intra-KV traffic).

The expectation is that the tenant server certs/CA will be using
external PKI (i.e. publicly trusted certs), but I learned quickly
while prototyping cockroachdb#47898 that it will well be worth it to generate
these certs internally as well.

Release note: None
tbg added a commit to tbg/cockroach that referenced this pull request Jun 12, 2020
This is an addendum to cockroachdb#49864. I had added a tenant CA and tenant client
certs but did not add tenant server CA and certs. However, this is
necessary:a KV server run in a multi-tenancy environment will use

- tenant server certs (issued by tenant server CA) to host TLS-enabled
  endpoint (for use by SQL tenant processes)
- tenant client CA (to validate incoming connections)
- node server certs (for intra-KV traffic).

The expectation is that the tenant server certs/CA will be using
external PKI (i.e. publicly trusted certs), but I learned quickly
while prototyping cockroachdb#47898 that it will well be worth it to generate
these certs internally as well.

Release note: None
tbg added a commit to tbg/cockroach that referenced this pull request Jun 19, 2020
This is an addendum to cockroachdb#49864. I had added a tenant CA and tenant client
certs but did not add tenant server CA and certs. However, this is
necessary:a KV server run in a multi-tenancy environment will use

- tenant server certs (issued by tenant server CA) to host TLS-enabled
  endpoint (for use by SQL tenant processes)
- tenant client CA (to validate incoming connections)
- node server certs (for intra-KV traffic).

The expectation is that the tenant server certs/CA will be using
external PKI (i.e. publicly trusted certs), but I learned quickly
while prototyping cockroachdb#47898 that it will well be worth it to generate
these certs internally as well.

Release note: None
tbg added a commit to tbg/cockroach that referenced this pull request Jun 24, 2020
This is an addendum to cockroachdb#49864. I had added a tenant CA and tenant client
certs but did not add tenant server CA and certs. However, this is
necessary:a KV server run in a multi-tenancy environment will use

- tenant server certs (issued by tenant server CA) to host TLS-enabled
  endpoint (for use by SQL tenant processes)
- tenant client CA (to validate incoming connections)
- node server certs (for intra-KV traffic).

The expectation is that the tenant server certs/CA will be using
external PKI (i.e. publicly trusted certs), but I learned quickly
while prototyping cockroachdb#47898 that it will well be worth it to generate
these certs internally as well.

Release note: None
tbg added a commit to tbg/cockroach that referenced this pull request Jun 24, 2020
This is an addendum to cockroachdb#49864. I had added a tenant CA and tenant client
certs but did not add tenant server CA and certs. However, this is
necessary:a KV server run in a multi-tenancy environment will use

- tenant server certs (issued by tenant server CA) to host TLS-enabled
  endpoint (for use by SQL tenant processes)
- tenant client CA (to validate incoming connections)
- node server certs (for intra-KV traffic).

The expectation is that the tenant server certs/CA will be using
external PKI (i.e. publicly trusted certs), but I learned quickly
while prototyping cockroachdb#47898 that it will well be worth it to generate
these certs internally as well.

Release note: None
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-multitenancy Related to multi-tenancy
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants