keyring: safely handle missing keys and restore GC #15092
Conversation
When replication of a single key fails, the replication loop breaks early, so keys that fall later in the sort order never get replicated. Log the error and continue instead. Refactor the replication loop so that each key is replicated in a function call that returns an error, to make the workflow clearer and reduce nesting.
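For illustration, a minimal sketch of the log-and-continue, one-function-per-key shape described above. The names here (`replicator`, `replicateAll`, `replicateKey`, the `fetchKey` callback) are placeholders, not Nomad's actual keyring replicator API:

```go
package keyring

import (
	"fmt"

	"github.com/hashicorp/go-hclog"
)

// RootKeyMeta is a stand-in for Nomad's root key metadata; only the field the
// sketch needs is shown.
type RootKeyMeta struct {
	KeyID string
}

// replicator is a hypothetical stand-in for the server-side keyring replicator.
type replicator struct {
	logger   hclog.Logger
	fetchKey func(keyID string) error // placeholder for the RPC fetch + keystore write
}

// replicateAll visits every key and replicates it in its own call, so a single
// failure is logged and skipped rather than breaking the loop and starving the
// keys that sort after it.
func (r *replicator) replicateAll(keys []*RootKeyMeta) {
	for _, meta := range keys {
		if err := r.replicateKey(meta); err != nil {
			r.logger.Error("failed to replicate key", "key_id", meta.KeyID, "error", err)
			continue
		}
	}
}

// replicateKey handles exactly one key and returns an error, keeping the
// caller flat instead of nesting retry/skip logic inside the loop body.
func (r *replicator) replicateKey(meta *RootKeyMeta) error {
	if err := r.fetchKey(meta.KeyID); err != nil {
		return fmt.Errorf("failed to fetch key %q: %w", meta.KeyID, err)
	}
	return nil
}
```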
We no longer block leadership on initializing the keyring, so there's a race condition in the keyring tests where we can test for the existence of the root key before the keyring has been initialized. Change this to an "eventually" test.
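And a sketch of the "eventually" pattern for that test race, assuming testify's `require.Eventually` is available in the test suite; the accessor `activeRootKeyID` is hypothetical and the real test may use a different helper:

```go
package keyring_test

import (
	"testing"
	"time"

	"github.com/stretchr/testify/require"
)

// activeRootKeyID is a hypothetical accessor standing in for however the real
// test reads the active root key from the server's state store.
func activeRootKeyID() string { return "" }

func TestKeyring_ActiveKeyEventuallyExists(t *testing.T) {
	// Leadership no longer blocks on keyring initialization, so the root key
	// may not exist yet when the test starts; poll rather than asserting once.
	require.Eventually(t, func() bool {
		return activeRootKeyID() != ""
	}, 10*time.Second, 100*time.Millisecond,
		"expected the keyring to be initialized with an active root key")
}
```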
nomad/state/schema.go
Indexer: &memdb.UUIDFieldIndex{
	Field: "SigningKeyID",
},
Note: An alternative to storing this field in the Allocation struct would be to extract it from the signed identity claim in an indexer. I got that to work, but it brought all the JWT parsing down here into the state store and we'd be parsing JWTs every time we restore a snapshot, which seemed terrible. The key ID isn't a lot of data (just a UUID string), and having it on the Allocation will be convenient for later use anyways, I'm sure.
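For context, a sketch of how the stored key ID can be indexed with a plain field index in a go-memdb table schema; the index name and the `AllowMissing` setting are illustrative, not necessarily what the PR ships:

```go
package state

import memdb "github.com/hashicorp/go-memdb"

// allocSigningKeyIndex sketches an allocation-table index over the persisted
// SigningKeyID field. Because the key ID lives on the Allocation itself, the
// indexer is a plain UUID field index and no JWT parsing happens in the state
// store or during snapshot restore.
var allocSigningKeyIndex = &memdb.IndexSchema{
	Name:         "signing_key", // illustrative name
	AllowMissing: true,          // older allocations carry no signed identity
	Unique:       false,
	Indexer: &memdb.UUIDFieldIndex{
		Field: "SigningKeyID",
	},
}
```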
nomad/core_sched.go
// don't GC a key that's deprecated, that's encrypted a variable or
// signed a live workload identity
inUse := false
if !keyMeta.Deprecated() {
This looks like it will reach the delete case if keyMeta.Deprecated() returns true, which doesn't seem to jibe with the comment.
Dropping from our sidebar discussion on Slack:
I think I've figured out why it exists, and we don't need it anymore:
In the original design, we didn't have a way of determining if the key was in use by a WI except by guessing (incorrectly, as it turned out) based on the indexes. So we marked it as deprecated b/c the GC that actually removed it could happen much later and having a status that let the user know we've done a rekey on it was maybe helpful for diagnostics.
In the new design, we can just not touch the key at all in the rekey pass. It'll already be marked inactive because the key was rotated. The next periodic GC will delete the key if it's not in use. Much simpler.
I've done that and I think this looks pretty good?
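A sketch of the simplified GC pass described above, with hypothetical types and callbacks (`rootKeyMeta`, `keyInUse`, `deleteKey`) standing in for the real core scheduler and state store:

```go
package core

import "time"

// rootKeyMeta is a stand-in for Nomad's key metadata; only what the sketch
// needs is shown.
type rootKeyMeta struct {
	KeyID      string
	Active     bool
	CreateTime time.Time
}

// rootKeyGC sketches the periodic GC pass: the rekey pass no longer marks keys
// "deprecated"; rotation already leaves old keys inactive, and GC deletes an
// inactive key only once nothing references it.
func rootKeyGC(keys []*rootKeyMeta, threshold time.Duration,
	keyInUse func(keyID string) (bool, error),
	deleteKey func(keyID string) error,
) error {
	for _, key := range keys {
		if key.Active {
			continue // never GC the active key
		}
		if time.Since(key.CreateTime) < threshold {
			continue // respect the GC threshold
		}
		inUse, err := keyInUse(key.KeyID)
		if err != nil {
			return err
		}
		if inUse {
			// the key still signed a live workload identity or encrypted a
			// variable; leave it for a later pass
			continue
		}
		if err := deleteKey(key.KeyID); err != nil {
			return err
		}
	}
	return nil
}
```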
Can you remove CoreScheduler.getOldestAllocationIndex? It doesn't appear to be used anymore. FWIW I have seen clusters with running allocs whose last modify time/index was over a year ago (which is absolutely fine and not necessarily a bug/problem), but it means basing logic purely off of that may mean the index very, very rarely changes.
// IsRootKeyMetaInUse determines whether a key has been used to sign a workload
// identity for a live allocation or encrypt any variables
func (s *StateStore) IsRootKeyMetaInUse(keyID string) (bool, error) {
This made me a little nervous at first glance because it could trend toward quadratic over time if there's at-least-one alloc for every key that's ever existed...
...however at the default key rotation rate of 30 days even in the maximally pathological case of 1-alloc-for-every-key-ever, that's a whopping 120 iterations over all allocations after a decade.
If we ever become concerned with the performance here we can always add ref counting to keys and update it when allocs are created or updated (or even just wait to decrement when the allocs are gc'd if we want to be really lazy). I'm guessing we'll never need to do that.
Making a compound index that only includes non-terminal allocs would be an easier optimization if not quite as effective.
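A sketch of what the in-use check can look like against a signing-key index; the table and index names, the alloc stub, and the single `ClientTerminal` bool are simplifications of the real state store:

```go
package state

import memdb "github.com/hashicorp/go-memdb"

// liveAlloc is a minimal stand-in for structs.Allocation.
type liveAlloc struct {
	ID             string
	SigningKeyID   string
	ClientTerminal bool
}

// isRootKeyInUse walks the allocations indexed by signing key ID and reports
// whether any of them is still live. If the index only admits non-terminal
// allocs, the loop reduces to "is the iterator non-empty".
func isRootKeyInUse(txn *memdb.Txn, keyID string) (bool, error) {
	iter, err := txn.Get("allocs", "signing_key", keyID) // names are illustrative
	if err != nil {
		return false, err
	}
	for raw := iter.Next(); raw != nil; raw = iter.Next() {
		if alloc := raw.(*liveAlloc); !alloc.ClientTerminal {
			return true, nil
		}
	}
	// a full implementation would also check whether the key encrypted any
	// variables before reporting the key as unused
	return false, nil
}
```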
Yeah, there are a lot of cases where I want a cheap Count() method in memdb. 😀 I can do that compound index though: it reduces the memory usage for the index a bit for clusters with lots of short-lived allocs, and reduces the number of iterations we have to do here at a tiny cost on txn.Insert (which is already doing expensive reflection stuff over the fields anyways). Seems worth it.
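A sketch of how such a compound index could be expressed with go-memdb's CompoundIndex plus a ConditionalIndex term. Note that terminal allocs are still indexed (under the conditional's false value), so the win is on lookup, which can target only the live entries; type and field names are illustrative:

```go
package state

import (
	"fmt"

	memdb "github.com/hashicorp/go-memdb"
)

// indexedAlloc stands in for structs.Allocation; Terminal stands in for
// whatever liveness predicate the real code uses.
type indexedAlloc struct {
	SigningKeyID string
	Terminal     bool
}

// signingKeyIndex pairs the signing key ID with a "still live?" term, so the
// in-use check can look up only the live entries for a given key instead of
// filtering terminal allocs in the loop.
var signingKeyIndex = &memdb.IndexSchema{
	Name:         "signing_key", // illustrative name
	AllowMissing: true,          // older allocations carry no signed identity
	Unique:       false,
	Indexer: &memdb.CompoundIndex{
		Indexes: []memdb.Indexer{
			&memdb.UUIDFieldIndex{Field: "SigningKeyID"},
			&memdb.ConditionalIndex{
				Conditional: func(obj interface{}) (bool, error) {
					alloc, ok := obj.(*indexedAlloc)
					if !ok {
						return false, fmt.Errorf("unexpected type %T", obj)
					}
					return !alloc.Terminal, nil
				},
			},
		},
	},
}
```

The in-use check would then query something like `txn.Get("allocs", "signing_key", keyID, true)` to iterate only the live allocations signed by that key.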
Done! (Struck out the note about memory b/c I'd forgotten how conditional indexes work)
Fixes #14981 #15088
When replication of a single key fails, the replication loop breaks early, so keys that fall later in the sort order never get replicated. This is particularly a problem for clusters that were impacted by the bug behind #14981 and later upgraded: the keys that were never replicated can now never be replicated, so we need to handle them safely.
Included in the replication fix: the loop now logs per-key failures and continues instead of breaking early, each key is replicated in its own function call that returns an error, and the keyring tests now poll ("eventually") for the root key instead of racing keyring initialization.
But these fixes aren't enough to fix #14981 because they'll end up seeing an error once a second complaining about the missing key, so we also need to fix keyring GC so the keys can be removed from the state store. Now we'll store the key ID used to sign a workload identity in the Allocation, and we'll index the Allocation table on that so we can track whether any live Allocation was signed with a particular key ID.
(Best reviewed commit-by-commit.)