
keyring: safely handle missing keys and restore GC #15092

Merged (11 commits) on Nov 1, 2022
Conversation

@tgross (Member) commented Oct 31, 2022

Fixes #14981 #15088

When replication of a single key fails, the replication loop breaks early and therefore keys that fall later in the sorting order will never get replicated. This is particularly a problem for clusters impacted by the bug that caused #14981 and that were later upgraded; the keys that were never replicated can now never be replicated, and so we need to handle them safely.

Included in the replication fix:

  • Refactor the replication loop so that each key is replicated in a function call that returns an error, which makes the workflow clearer and reduces nesting. Log the error and continue (see the sketch after this list).
  • Improve the stability of the keyring replication tests. We no longer block leadership on initializing the keyring, so there's a race condition in the keyring tests where we can test for the existence of the root key before the keyring has been initialized. Change this to an "eventually" test.
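
A minimal sketch of the refactored loop shape, assuming a hypothetical replicateKey helper and a replicator struct with a logger and a fetch hook (the names here are illustrative, not the PR's actual code):

```go
package keyring

import (
	"context"

	hclog "github.com/hashicorp/go-hclog"
)

// rootKeyMeta and keyringReplicator are trimmed stand-ins for the real structs
// in nomad/encrypter.go; only what this sketch needs is shown.
type rootKeyMeta struct{ KeyID string }

type keyringReplicator struct {
	logger hclog.Logger
	// fetch pulls a single key from another server; in the real code this is
	// an RPC plus a write to the local keystore.
	fetch func(ctx context.Context, keyID string) error
}

// replicateMissing walks every missing key and keeps going past failures, so a
// single bad key can no longer starve the keys that sort after it.
func (krr *keyringReplicator) replicateMissing(ctx context.Context, missing []*rootKeyMeta) {
	for _, meta := range missing {
		if err := krr.replicateKey(ctx, meta); err != nil {
			// Log the error and continue rather than breaking the loop.
			krr.logger.Error("failed to replicate key", "key_id", meta.KeyID, "error", err)
			continue
		}
	}
}

// replicateKey replicates exactly one key and reports failure as an error
// instead of controlling the loop, which keeps the workflow flat.
func (krr *keyringReplicator) replicateKey(ctx context.Context, meta *rootKeyMeta) error {
	return krr.fetch(ctx, meta.KeyID)
}
```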

But these fixes aren't enough to fix #14981 on their own, because operators will still see an error once a second complaining about the missing key, so we also need to fix keyring GC so the keys can be removed from the state store. Now we store the key ID used to sign a workload identity on the Allocation, and we index the Allocation table on that field so we can track whether any live Allocation was signed with a particular key ID.
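
To make that second half concrete, here is a minimal sketch of the idea under illustrative names (the field name matches the snippets quoted later in this thread; the helper is a stand-in, not necessarily the PR's exact code):

```go
package structs

// Allocation is trimmed to the one field that matters for this sketch: the ID
// of the root key that signed the allocation's workload identity is stored on
// the allocation itself, so the state store can index it and GC can ask
// "is any live allocation still using this key?".
type Allocation struct {
	ID string

	// SigningKeyID names the root key that signed this allocation's workload
	// identity claims. A key referenced by any live allocation must not be GC'd.
	SigningKeyID string
}

// recordSigningKey is an illustrative stand-in for the moment the server signs
// the allocation's identity claims: whatever key the encrypter used is
// recorded on the allocation before it is written to the state store.
func recordSigningKey(alloc *Allocation, keyID string) {
	alloc.SigningKeyID = keyID
}
```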

(Best reviewed commit-by-commit.)

This reverts commit b583f78.
But keeps the changelog entry from #15034.
When replication of a single key failed, the replication loop broke early, and
therefore keys that fell later in the sorting order would never get
replicated. Log the error and continue.

Refactor the replication loop so that each key is replicated in a function call
that returns an error, to make the workflow clearer and reduce nesting.
We no longer block leadership on initializing the keyring, so there's a race
condition in the keyring tests where we can test for the existence of the root
key before the keyring has been initialized. Change this to an "eventually" test.
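
To illustrate the "eventually" pattern, here is a minimal sketch using testify's require.Eventually; the real tests presumably use Nomad's own server harness and wait helpers, and the lookup closure here is a placeholder:

```go
package keyring_test

import (
	"testing"
	"time"

	"github.com/stretchr/testify/require"
)

// TestKeyring_Initializes shows the shape of the change: because leadership no
// longer blocks on keyring initialization, the test polls for the root key
// instead of asserting its existence immediately.
func TestKeyring_Initializes(t *testing.T) {
	// hasRootKey is a placeholder for "query the server's state store for the
	// active root key"; the real test would look it up on a running server.
	hasRootKey := func() bool {
		// ... fetch root key metadata here ...
		return true
	}

	require.Eventually(t, hasRootKey, 5*time.Second, 100*time.Millisecond,
		"expected keyring to be initialized")
}
```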
Comment on lines 696 to 698
Indexer: &memdb.UUIDFieldIndex{
	Field: "SigningKeyID",
},
@tgross (Member, Author) Oct 31, 2022


Note: An alternative to storing this field in the Allocation struct would be to extract it from the signed identity claim in an indexer. I got that to work but it brought all the JWT parsing down here into the state store and we'd be parsing JWTs every time we restore a snapshot, which seemed terrible. The key ID isn't a lot of data (just a UUID string), and having it on the Allocation will be convenient for later use anyways, I'm sure.
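
For context, the quoted Indexer sits inside an index definition on the allocations table; a full entry would look roughly like the sketch below. The index name and flags are assumptions for illustration, and the thread further down ends up switching this to a compound index.

```go
package state

import (
	memdb "github.com/hashicorp/go-memdb"
)

// allocSigningKeyIndex is a sketch of the index the quoted snippet belongs to:
// it lets the state store look up allocations by the root key that signed
// their workload identity.
var allocSigningKeyIndex = &memdb.IndexSchema{
	Name:         "signing_key",
	AllowMissing: true, // older allocations carry no signing key ID
	Unique:       false,
	Indexer: &memdb.UUIDFieldIndex{
		Field: "SigningKeyID",
	},
}
```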

nomad/core_sched.go (outdated review thread, resolved)
Comment on lines 938 to 941
// don't GC a key that's deprecated, that's encrypted a variable or
// signed a live workload identity
inUse := false
if !keyMeta.Deprecated() {
A Contributor commented:

This looks like it will reach the delete case if keyMeta.Deprecated() returns true, which doesn't seem to jibe with the comment.

@tgross (Member, Author)

Dropping from our sidebar discussion on Slack:

I think I've figured out why it exists, and we don't need it anymore:

In the original design, we didn't have a way of determining if the key was in use by a WI except by guessing (incorrectly, as it turned out) based on the indexes. So we marked it as deprecated b/c the GC that actually removed it could happen much later and having a status that let the user know we've done a rekey on it was maybe helpful for diagnostics.

In the new design, we can just not touch the key at all in the rekey pass. It'll already be marked inactive because the key was rotated. The next periodic GC will delete the key if it's not in use. Much simpler.

I've done that and I think this looks pretty good?
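
Pulling the discussion together, a rough sketch of the simplified GC decision might look like this; the types and helpers are trimmed stand-ins, and IsRootKeyMetaInUse is the state-store check quoted later in this thread:

```go
package nomad

// rootKeyMeta is a trimmed stand-in for the real key metadata struct.
type rootKeyMeta struct {
	KeyID  string
	Active bool // true only for the key currently used to encrypt and sign
}

// keyringState is the slice of state-store behavior this sketch needs.
type keyringState interface {
	IsRootKeyMetaInUse(keyID string) (bool, error)
}

// rootKeyGC deletes only keys that are both inactive and unused. A rotated key
// is simply left inactive; a later periodic GC pass deletes it once no live
// allocation or variable references it.
func rootKeyGC(store keyringState, keys []*rootKeyMeta, deleteKey func(keyID string) error) error {
	for _, keyMeta := range keys {
		if keyMeta.Active {
			continue // never GC the active key
		}
		inUse, err := store.IsRootKeyMetaInUse(keyMeta.KeyID)
		if err != nil {
			return err
		}
		if inUse {
			continue // a live workload identity or variable still needs this key
		}
		if err := deleteKey(keyMeta.KeyID); err != nil {
			return err
		}
	}
	return nil
}
```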

@schmichael (Member) left a comment

Can you remove CoreScheduler.getOldestAllocationIndex? It doesn't appear to be used anymore. FWIW I have seen clusters with running allocs whose last modify time/index was over a year ago (which is absolutely fine and not necessarily a bug/problem), but it means basing logic purely off of that may mean the index very very rarely changes.

nomad/encrypter.go (review thread, resolved)
nomad/encrypter.go (outdated review thread, resolved)

// IsRootKeyMetaInUse determines whether a key has been used to sign a workload
// identity for a live allocation or encrypt any variables
func (s *StateStore) IsRootKeyMetaInUse(keyID string) (bool, error) {
A Member commented:

This made me a little nervous at first glance because it could trend toward quadratic over time if there's at-least-one alloc for every key that's ever existed...

...however at the default key rotation rate of 30 days even in the maximally pathological case of 1-alloc-for-every-key-ever, that's a whopping 120 iterations over all allocations after a decade.

If we ever become concerned with the performance here we can always add ref counting to keys and update it when allocs are created or updated (or even just wait to decrement when the allocs are gc'd if we want to be really lazy). I'm guessing we'll never need to do that.

Making a compound index that only includes non-terminal allocs would be an easier optimization if not quite as effective.

@tgross (Member, Author) Nov 1, 2022

Yeah, there are a lot of cases where I want a cheap Count() method in memdb. 😀 I can do that compound index though: it reduces the memory usage for the index a bit for clusters with lots of short-lived allocs, and reduces the number of iterations we have to do here at a tiny cost on txn.Insert (which is already doing expensive reflection stuff over the fields anyways). Seems worth it.

@tgross (Member, Author) Nov 1, 2022

Done! (Struck out the note about memory b/c I'd forgotten how conditional indexes work)
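
For readers following the compound-index idea, one way to express it with go-memdb's building blocks is sketched below; this illustrates the concept from the thread, not necessarily the exact index the PR shipped:

```go
package state

import (
	memdb "github.com/hashicorp/go-memdb"
)

// Allocation is trimmed to the fields the sketch needs.
type Allocation struct {
	SigningKeyID string
	Terminal     bool // stand-in for the real terminal-status check
}

// allocIsAlive is an illustrative conditional: only non-terminal allocations
// should count toward "this key is still in use".
func allocIsAlive(obj interface{}) (bool, error) {
	alloc, ok := obj.(*Allocation)
	if !ok {
		return false, nil
	}
	return !alloc.Terminal, nil
}

// allocSigningKeyIndex combines the signing key ID with a liveness flag, so
// the GC check can iterate only the (keyID, alive=true) slice of the index
// instead of every allocation that ever referenced the key.
var allocSigningKeyIndex = &memdb.IndexSchema{
	Name:         "signing_key",
	AllowMissing: true,
	Unique:       false,
	Indexer: &memdb.CompoundIndex{
		Indexes: []memdb.Indexer{
			&memdb.UUIDFieldIndex{Field: "SigningKeyID"},
			&memdb.ConditionalIndex{Conditional: allocIsAlive},
		},
	},
}
```

With a layout along these lines, the in-use check becomes something like txn.Get("allocs", "signing_key", keyID, true), so it only has to look at live allocations signed with that key.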

@tgross tgross added the backport/1.4.x backport to 1.4.x release line label Nov 1, 2022
@tgross tgross requested a review from angrycub November 1, 2022 17:26
nomad/encrypter.go (outdated review thread, resolved)
Co-authored-by: Charlie Voiselle <[email protected]>
@github-actions bot commented Mar 2, 2023

I'm going to lock this pull request because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active contributions.
If you have found a problem that seems related to this change, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 2, 2023
Labels: backport/1.4.x (backport to 1.4.x release line), theme/keyring, type/bug
3 participants