keyring: update handle to state inside replication loop #15227
Conversation
nomad/encrypter.go
Outdated
@@ -457,6 +456,7 @@ START:
	goto ERR_WAIT // rate limit exceeded
}

store := krr.srv.fsm.State()
ws := store.NewWatchSet()
I'm going to admit I feel like I'm supposed to be able to have this store.NewWatchSet at the top of the loop and then select on it along with the rest of the context to replace the state handle... but I can't figure out which is the right thing to be polling on there, and this diff fixes the test case I've got.
(And we should probably not create a watchset here at all if we're not using it to poll for changes?)
So I think this can be restructured a bit, but I could be wrong as I'm new to this area of code: (*state.StateStore).AbandonCh() will get closed when a restore finishes, so I think just toss it in the select above, and then you only need to do store = krr.srv.fsm.State() in that case block.
The WatchSet can be dropped and nil passed instead. WatchSets are internal to iradix indexes, while Abandoning is a StateStore concept. I think this is an accurate mental model:
Raft
 |
 v
FSM        <readers>
 |            ^
 v            |
StateStore <- Abandon
 |
 v
MemDB
 |
 v
iradix indexes <- Watches
If replicators were waiting for their local root key meta to be updated via the FSM, then I think the watchset could be added to the select {} to be notified when it was updated. Since the root key meta is replicated by RPC outside of the FSM, I don't think a watchset is useful here.
But like I said: new code to me, so let's maybe zoom if this all sounds way off.
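To make that shape concrete, here is a minimal sketch in Go. Only two facts come from the comment above: krr.srv.fsm.State() returns the current store, and the store's AbandonCh() is closed when a restore replaces it. The interfaces, function name, and ticker pacing are invented for illustration and are not Nomad's actual code.

```go
// Hedged sketch only: the loop shape, interfaces, and pacing are assumptions.
package sketch

import (
	"context"
	"time"
)

// StateStore models the one fact we rely on from the discussion: the store's
// abandon channel is closed when a snapshot restore replaces it.
type StateStore interface {
	AbandonCh() <-chan struct{}
}

// FSM models krr.srv.fsm, whose State() always returns the current store.
type FSM interface {
	State() StateStore
}

func replicateLoop(ctx context.Context, fsm FSM) {
	store := fsm.State()
	ticker := time.NewTicker(time.Second) // illustrative pacing, not Nomad's rate limit
	defer ticker.Stop()

	for {
		select {
		case <-ctx.Done():
			return
		case <-store.AbandonCh():
			// A restore swapped out the underlying store; the old handle
			// would never see newly written keys (and keeps the old store
			// alive), so re-fetch it here.
			store = fsm.State()
		case <-ticker.C:
			// ... query store for root key metadata and replicate any
			// keys missing from the local keyring ...
		}
	}
}
```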
Force-pushed from a9134dd to 7a41d2f
That's what I was looking for! Thank you!
But I think we could do this similar to how … That would let us lean on the efficiency of the watchset in the common case, while still handling new nodes and cases where the cluster needs to self-repair.
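As a rough illustration of "watchset in the common case, abandon channel for restores", here is a hedged sketch using go-memdb's WatchSet. The Store interface and the getKeyMetas callback are hypothetical stand-ins, not code from this PR:

```go
// Hedged sketch: getKeyMetas stands in for the real state store query that
// adds its result channels to the watch set.
package sketch

import (
	"context"

	memdb "github.com/hashicorp/go-memdb"
)

type Store interface {
	AbandonCh() <-chan struct{}
}

// waitForChange blocks until the watched key metadata changes, the store is
// abandoned by a restore, or the context is cancelled.
func waitForChange(ctx context.Context, store Store, getKeyMetas func(memdb.WatchSet) error) error {
	ws := memdb.NewWatchSet()
	if err := getKeyMetas(ws); err != nil {
		return err
	}

	select {
	case <-ctx.Done():
		return ctx.Err()
	case <-store.AbandonCh():
		// Restore replaced the store; the caller should re-fetch the handle
		// before querying again.
		return nil
	case <-ws.WatchCh(ctx):
		// Key metadata changed locally; re-run the replication pass.
		return nil
	}
}
```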
Force-pushed from 7a41d2f to 9a1c450
When keyring replication starts, we take a handle to the state store. But whenever a snapshot is restored, this handle is invalidated and no longer points to a state store that is receiving new keys. This leaks a bunch of memory too!
In addition to operator-initiated restores, when fresh servers are added to existing clusters with large-enough state, the keyring replication can get started quickly enough that it's running before the snapshot from the existing cluster has been restored.
Fix this by updating the handle to the state store whenever the store's abandon channel is closed. Refactor the query for key metadata to use blocking queries for efficiency.
Force-pushed from 9a1c450 to e7703d1
@schmichael I've updated this to …
…ing ERR_WAIT block
Force-pushed from 01dc833 to 82fce0a
@schmichael I've backed out the blocking query change and just left refreshing the state store handle. This tight loop was destabilizing the whole test suite, as it was causing enough load that leadership was failing. In the interest of shipping a critical fix I'm using the rate-limited query, and I'll see about revisiting blocking queries in later work -- I have a sneaking suspicion there may be a bug in our blocking query implementation that's hard to hit in RPCs. You've got a request-changes on this so I'll need your 👍 here to ship.
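For reference, a hedged sketch of the rate-limited polling shape described here, using golang.org/x/time/rate; the limit values and the replicateOnce callback are illustrative, not Nomad's actual settings:

```go
// Hedged sketch: limits and callback are placeholders, not the real encrypter code.
package sketch

import (
	"context"

	"golang.org/x/time/rate"
)

func pollLoop(ctx context.Context, replicateOnce func(context.Context) error) {
	// Illustrative pacing: at most ~10 replication passes per second.
	limiter := rate.NewLimiter(rate.Limit(10), 1)

	for {
		if err := limiter.Wait(ctx); err != nil {
			return // context cancelled
		}
		if err := replicateOnce(ctx); err != nil {
			// The real loop jumps to an ERR_WAIT block to back off on errors;
			// here the limiter alone paces the retry.
			continue
		}
	}
}
```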
LGTM! goto -= 3
🎉
Fixes #14981
When keyring replication starts, we take a handle to the state store. But whenever a snapshot is restored, this handle is invalidated and no longer points to a state store that is receiving new keys. This leaks a bunch of memory too!
In addition to operator-initiated restores, when fresh servers are added to existing clusters with large-enough state, the keyring replication can get started quickly enough that it's running before the snapshot from the existing cluster has been restored.
Fix this by updating the handle to the state store when we query.
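A minimal sketch of the fix as summarized, assuming only that the FSM can hand back the current store on demand; stateFn and replicateOnce are hypothetical stand-ins for krr.srv.fsm.State and the per-pass replication work:

```go
// Hedged sketch: re-fetch the state store handle on every pass so a snapshot
// restore can never strand the replicator on a stale (memory-pinning) store.
package sketch

import "context"

func replicate[S any](ctx context.Context, stateFn func() S, replicateOnce func(context.Context, S) error) {
	for ctx.Err() == nil {
		store := stateFn() // fresh handle each iteration, never cached across restores
		_ = replicateOnce(ctx, store)
		// ... rate limiting and error backoff elided ...
	}
}
```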