-
Notifications
You must be signed in to change notification settings - Fork 4.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add automatic tidy of expired issuers #17823
Conversation
42725f6
to
a56bc78
Compare
a56bc78
to
e329b46
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Getting somewhere!
fields["tidy_expired_issuers"] = &framework.FieldSchema{ | ||
Type: framework.TypeBool, | ||
Description: `Set to true to automatically remove expired issuers | ||
past the issuer_safety_buffer. No keys will be removed as part of this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume that we need to add something to remove unused keys eventually?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, I'm punting on that particular problem. Issuers make the mount slow (chain building, CRL building, ...), keys are always fetched by reference and don't make the mount slow. Obviously they'll be some nominal storage costs, but operators should be able to sort this out themselves eventually, if they care about the storage costs and want to remove old keys.
But even then, the number of stored leaves should exceed the stored issuer keys by several orders of magnitude.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the logging provide enough information for an operator to delete the key after an auto tidy if they want to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see the log message below:
It contains the key ID from the issuer, allowing them to make a call from the log messages as to whether to remove that key or not (and the system will prevent that if its in use by other still-present issuers).
In the general case of tidying keys, we don't necessarily presently store when/how a key was created, so we don't know if there's an outstanding, say, ICA issuance going on that we don't know about, or if it was from a deleted issuer.
We could later remove now-unused keys from issuers if we wanted to, but it makes recovery much harder -- you can't presently re-import deleted keys if you didn't have an external backup, and we wouldn't want to log them (unlike issuers) for obvious reasons...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I don't want to automatically remove keys, but I don't want operators to get in a case where they'd like to remove an old key but are confused which issuer it once applied to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would you recommend something else? Another migration to associate issuers on keys, and/or a tidy step to associate a since-deleted issuer's identifier(s) back to the key?
Two way association is only beneficial in this case; in general nothing in the PKI code (presently) goes from key->issuer, except the key deletion code.
The problem is a number of issuers won't necessarily have meaningful names, so logging the full certificate is perhaps the best approach here -- unless we do want to retain the issuer's certificate in the system and associate removed issuers' certificates with the key entry. Then they could read the key and see the list of since-deleted issuers which used that key... It wouldn't impact perf anymore but still wouldn't be ideal to interact with (as they'd likely have to parse the certificate to understand which one it was).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I'd just be happy if the key_id was in the output of the issuer tidy, so they could grep it out and delete it by hand.
e329b46
to
8da9ad4
Compare
To aid PKI users like Consul, which periodically rotate intermediates, and provided a little more consistency with older versions of Vault which would silently (and dangerously!) replace the configured CA on root/intermediate generation, we introduce an automatic tidy of expired issuers. This includes a longer safety buffer (1 year) and logging of the relevant issuer information prior to deletion (certificate contents, key ID, and issuer ID/name) to allow admins to recover this value if desired, or perform further cleanup of keys. From my PoV, removal of the issuer is thus a relatively safe operation compared to keys (which I do not feel comfortable removing) as they can always be re-imported if desired. Additionally, this is an opt-in tidy operation, not enabled by default. Lastly, most major performance penalties comes with lots of issuers within the mount, not as much large numbers of keys (as only new issuer creation/import operations are affected, unlike LIST /issuers which is a public, unauthenticated endpoint). Signed-off-by: Alexander Scheel <[email protected]>
Signed-off-by: Alexander Scheel <[email protected]>
Signed-off-by: Alexander Scheel <[email protected]>
Signed-off-by: Alexander Scheel <[email protected]>
8da9ad4
to
41943d9
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some small feedback
fields["tidy_expired_issuers"] = &framework.FieldSchema{ | ||
Type: framework.TypeBool, | ||
Description: `Set to true to automatically remove expired issuers | ||
past the issuer_safety_buffer. No keys will be removed as part of this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will the logging provide enough information for an operator to delete the key after an auto tidy if they want to?
@@ -463,6 +470,15 @@ Defaults to 72 hours.`, | |||
Default: int(defaultTidyConfig.SafetyBuffer / time.Second), // TypeDurationSecond currently requires defaults to be int | |||
} | |||
|
|||
fields["issuer_safety_buffer"] = &framework.FieldSchema{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about issuer_tidy_grace_period
? This doesn't as directly indicate safety from what, and that buffer is time related.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mirrors naming of the present field, safety_buffer
above -- do you recommend breaking that pattern?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, no, didn't notice the existing field.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like something named _duration
ideally, but I think it'd require a soft-migration for the other field (and use say expired_issuer_retention_duration
and expired_leaf_retention_duration+safety_buffer
) -- this'd let the CLI parse it correctly as a duration-typed field.
We have time to revisit doing that before 1.13 GAs as this won't be backported.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could just introduce the two new fields and switch the docs. Add a note about the previous field still existing and being deprecated. But that can wait for another PR.
Signed-off-by: Alexander Scheel <[email protected]>
Signed-off-by: Alexander Scheel <[email protected]>
To aid PKI users like Consul, which periodically rotate intermediates, and provided a little more consistency with older versions of Vault which would silently (and dangerously!) replace the configured CA on root/intermediate generation, we introduce an automatic tidy of expired issuers.
This includes a longer safety buffer (1 year) and logging of the relevant issuer information prior to deletion (certificate contents, key ID, and issuer ID/name) to allow admins to recover this value if desired, or perform further cleanup of keys.
From my PoV, removal of the issuer is thus a relatively safe operation compared to keys (which I do not feel comfortable removing) as they can always be re-imported if desired. Additionally, this is an opt-in tidy operation, not enabled by default. Lastly, most major performance penalties comes with lots of issuers within the mount, not as much large numbers of keys (as only new issuer creation/import operations are affected, unlike LIST /issuers which is a public, unauthenticated endpoint).
Signed-off-by: Alexander Scheel <[email protected]>
This obviously needs:
But I'm curious to get people's initial opinions on this change.