Add ability to cancel PKI tidy operations, pause between tidying certs #16958
When tidy operations take a long time to execute (and especially when executing them automatically), having the ability to cancel them becomes useful to reduce strain on Vault clusters (and let them be rescheduled at a later time). To this end, we add the /tidy-cancel write endpoint. Signed-off-by: Alexander Scheel <[email protected]>
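For illustration, the cancel behaviour described in this commit message can be sketched as a flag that the long-running loop polls before each certificate. This is a minimal sketch, not the code in this PR: the `tidier` type, `requestCancel`, `run`, and `errTidyCancelled` are hypothetical names, and only the atomic flag idea is taken from the discussion.

```go
package main

import (
	"errors"
	"sync/atomic"
)

// errTidyCancelled is a hypothetical sentinel error for this sketch.
var errTidyCancelled = errors.New("tidy operation cancelled")

// tidier stands in for the PKI backend; only the cancel flag is modelled.
type tidier struct {
	cancelFlag uint32 // 0 = keep running, 1 = cancel requested
}

// requestCancel is roughly what a write to a cancel endpoint might do.
func (td *tidier) requestCancel() {
	atomic.StoreUint32(&td.cancelFlag, 1)
}

// run processes serials one at a time and checks for a cancel request
// before each one. It returns how many serials were processed.
func (td *tidier) run(serials []string, process func(string)) (int, error) {
	for i, serial := range serials {
		if atomic.LoadUint32(&td.cancelFlag) == 1 {
			return i, errTidyCancelled
		}
		process(serial)
	}
	return len(serials), nil
}
```

A cancelled run stops at the next certificate boundary, so the work already done is kept and the rest can be picked up by a later tidy.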
By setting pause_duration, operators gain some control over the resource utilization of a tidy operation. While the list of certificates remains in memory throughout the entire operation, a pause is added between processing certificates, and the revocation lock is released during it. This allows other operations to occur during the gap and potentially lets the tidy operation consume fewer resources per unit of time (due to the sleep, though it obviously consumes the same resources over the whole operation). Signed-off-by: Alexander Scheel <[email protected]>
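As a rough sketch of the pause behaviour described above (not the PR's implementation), the loop below holds a lock only while handling a single certificate and sleeps with the lock released. The `revocationStore` type, its `mu` and `removed` fields, and the `tidy` method are hypothetical; the `pause` argument plays the role of pause_duration.

```go
package main

import (
	"sync"
	"time"
)

// revocationStore is a hypothetical stand-in for the backend state that the
// revocation lock protects.
type revocationStore struct {
	mu      sync.Mutex // stands in for the revocation lock
	removed []string   // serials tidied so far
}

// tidy walks the in-memory list of serials. The lock is held only while one
// serial is handled and is released before sleeping, so other operations can
// run during the pause. It returns the number of serials processed.
func (s *revocationStore) tidy(serials []string, pause time.Duration) int {
	processed := 0
	for _, serial := range serials {
		s.mu.Lock()
		s.removed = append(s.removed, serial) // placeholder for the real tidy work
		s.mu.Unlock()
		processed++

		if pause > 0 {
			time.Sleep(pause) // lock is not held here
		}
	}
	return processed
}
```

The total work is unchanged; the pause only spreads it over a longer wall-clock interval.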
f243557 to f9ba5d7
@@ -164,6 +194,11 @@ func (b *backend) startTidyOperation(req *logical.Request, config *tidyConfig) {
}
}

// Check for cancel before continuing.
Should we also schedule a reset in case a cancel comes in after the doTidy func but before we release tidyCASGuard? -> defer atomic.StoreUint32(b.tidyCancelCAS, 0)
If anything, I think we just need an atomic.CompareAndSwapUint32(b.tidyCancelCAS, 1, 0) at the top of this function before getting the tidyCASGuard. Otherwise, I don't think it really matters, IMO?
Sure, that would work as well.
(I think that's correct: if a cancel somehow came in after we finished, we'd want any new tidy operations to start fresh.)
👍
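To make the thread above concrete, here is a small sketch of clearing a stale cancel request when a new tidy starts. It is an assumption-laden illustration rather than the PR's code: the `backend` type here, its `tidyGuard` and `tidyCancel` fields, and the `requestCancel` and `startTidy` methods are hypothetical stand-ins for tidyCASGuard and tidyCancelCAS. One deliberate difference from the suggestion in the thread: this sketch clears the flag right after winning the guard rather than before it, so that a rejected second start cannot wipe a cancel aimed at the operation that is still running.

```go
package main

import "sync/atomic"

// backend is a hypothetical stand-in holding the two flags from the thread.
type backend struct {
	tidyGuard  uint32 // 1 while a tidy operation is running
	tidyCancel uint32 // 1 when a cancel has been requested
}

// requestCancel records a cancel request for the running tidy, if any.
func (b *backend) requestCancel() {
	atomic.StoreUint32(&b.tidyCancel, 1)
}

// startTidy runs doTidy unless another tidy already holds the guard.
// doTidy receives a function it can poll to learn about cancel requests.
func (b *backend) startTidy(doTidy func(cancelled func() bool)) bool {
	if !atomic.CompareAndSwapUint32(&b.tidyGuard, 0, 1) {
		return false // another tidy is running; leave its cancel flag alone
	}
	defer atomic.StoreUint32(&b.tidyGuard, 0)

	// Drop any cancel that arrived after the previous run finished, so this
	// run starts fresh.
	atomic.StoreUint32(&b.tidyCancel, 0)

	doTidy(func() bool {
		// Consume the request so it is observed exactly once.
		return atomic.CompareAndSwapUint32(&b.tidyCancel, 1, 0)
	})
	return true
}
```

With this shape, a cancel that lands between the end of one run and the start of the next has no effect on the next run.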
This adds two features to the PKI tidy operation that become useful with auto-tidy: