-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add configurable leaf cert TTL to Connect CA #4400
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great!
Couple of minor nits in the PR.
I also have a nagging feeling that there was at least one or two other bits of code that we had TODOs to fix once this was configurable.
Off the top of my head, the one I can remember is the purging timeout for rotated certs - we picked 7 days on the basis that it's longer than 3 day cert lifetime but now we should also be setting that dynamically I presume (in case people set certificate lifetime to a month say...)
Also we should probably enforce a sane range on the time, less than say 1 hour is likely to cause a lot more churn in the control plane and a lot more crypto activity burning CPU time on clients and servers. It will also break with the vault code you have here I think (noted inline) so maybe we should limit to that for now and consider allowing even shorter durations only if people actually ask for it?
Similarly setting certificate validity of 10 years kinda breaks all our assumptions about being able to rotate out old roots etc, So I'd suggest maybe 1 year as an extremely conservative maximum? We could raise that if there are legit use-cases but I think a lot of the security properties we target are compromised at this point anyway.
agent/connect/ca/provider_vault.go
Outdated
@@ -227,6 +227,7 @@ func (v *VaultProvider) Sign(csr *x509.CertificateRequest) (string, error) { | |||
// Use the leaf cert role to sign a new cert for this CSR. | |||
response, err := v.client.Logical().Write(v.config.IntermediatePKIPath+"sign/"+VaultCALeafCertRole, map[string]interface{}{ | |||
"csr": pemBuf.String(), | |||
"ttl": fmt.Sprintf("%.0fh", v.config.LeafCertTTL.Hours()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will round by truncation. If the user configured less than 1 hour won't this make TTL be 0h
which is not going to work well :)
Moot point if we enforce config can't be less than 1 somewhere but even so a comment would be good.
<p>There are also a number of common configuration options supported by all providers:</p> | ||
|
||
* <a name="ca_leaf_cert_ttl"></a><a href="#ca_leaf_cert_ttl">`leaf_cert_ttl`</a> The lease duration of | ||
a leaf certificate issued for a service, after which a new certificate will be requested by the proxy. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest we make this more precise. This is the hard limit on certificate validity, we actually request a new one with time to spare. We should probably also point out the trade off here that this is an upper bound on how long a control plane outage the cluster can handle before network connections will start being rejected which is why it's 3 days as a default to survive a weekend outage without intervention.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Super cool.
As a side note - we have a couple of other uses of 72h
baked into Connect code but I don't think they are logically related - one is Cache TTL for certs and intentions. I think though that is fine - even if the user set a cert expiry of 1 year there is no reason to keep a cert around on an agent for more than a few days if the service no longer needs it - a new one can always be fetched.
You may want to grep 72 and double check nothing else that is logically related to the cert life time remains hard coded though if you didn't already?
agent/connect/ca/provider_vault.go
Outdated
@@ -172,7 +172,7 @@ func (v *VaultProvider) GenerateIntermediate() (string, error) { | |||
"allow_any_name": true, | |||
"allowed_uri_sans": "spiffe://*", | |||
"key_type": "any", | |||
"max_ttl": "72h", | |||
"max_ttl": fmt.Sprintf("%.0fm", v.config.LeafCertTTL.Minutes()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
won't v.config.LeafCertTTL.String()
give you a valid duration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good catch! I was overcomplicating this for some reason
This adds a configurable leaf cert TTL via a new block of "common" CA config options.