testcluster: fix dial check on restart #111199
Previously, a server restart incorrectly dialed loopback instead of the other nodes to reset circuit breakers. This commit fixes it to dial from the existing nodes to the restarted one.
Epic: none
Fixes: cockroachdb#111163
Release note: None
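To illustrate the direction of the dial described above, here is a minimal toy sketch. It does not use the real `TestCluster` API: `toyNode`, `toyCluster`, and every method on them are hypothetical names invented for this example, and the circuit breakers are modeled as a simple per-peer boolean. The assumption is that a successful dial from node A to node B resets A's breaker for B, so the restarted node dialing itself resets nothing on its peers.

```go
package main

import "fmt"

// toyNode is a hypothetical stand-in for a test server; it is not the
// real TestCluster API.
type toyNode struct {
	id      int
	tripped map[int]bool // peer ID -> whether the breaker toward that peer is tripped
}

// dial simulates a successful connection from n to peer, which resets
// n's circuit breaker for that peer.
func (n *toyNode) dial(peer int) {
	n.tripped[peer] = false
}

// toyCluster is a hypothetical collection of toy nodes.
type toyCluster struct {
	nodes []*toyNode
}

// newToyCluster builds a cluster with node IDs 1..size and no tripped breakers.
func newToyCluster(size int) *toyCluster {
	c := &toyCluster{}
	for i := 1; i <= size; i++ {
		c.nodes = append(c.nodes, &toyNode{id: i, tripped: map[int]bool{}})
	}
	return c
}

// stop trips the breaker toward the stopped node on every other node.
func (c *toyCluster) stop(id int) {
	for _, n := range c.nodes {
		if n.id != id {
			n.tripped[id] = true
		}
	}
}

// resetBreakersAfterRestart dials the restarted node from every other node.
func (c *toyCluster) resetBreakersAfterRestart(restarted int) {
	for _, n := range c.nodes {
		if n.id == restarted {
			continue // a loopback dial would not reset any peer's breaker
		}
		n.dial(restarted)
	}
}

// trippedToward counts the nodes whose breaker toward id is still tripped.
func (c *toyCluster) trippedToward(id int) int {
	count := 0
	for _, n := range c.nodes {
		if n.tripped[id] {
			count++
		}
	}
	return count
}

func main() {
	c := newToyCluster(3)
	c.stop(2)
	fmt.Println("tripped before reset:", c.trippedToward(2))
	c.resetBreakersAfterRestart(2)
	fmt.Println("tripped after reset:", c.trippedToward(2))
}
```

In this toy model, the two peers of node 2 hold tripped breakers after `stop(2)`, and both are cleared only because the dial originates from the peers rather than from the restarted node.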
Is this relevant for backports?
Nice catch, thanks.
@@ -1787,6 +1787,7 @@ func (tc *TestCluster) RestartServerWithInspect(
// node. This is useful to avoid flakes: the newly restarted node is now on a
// different port, and a cycle of gossip is necessary to make all other nodes
// aware.
id := s.StorageLayer().NodeID()
Why not just s.NodeID()? Is there some kind of mandate to type the "layer" explicitly? CC @knz to help with the answer.
We want to be explicit about which layer we are trying to use: StorageLayer (i.e. the underlying raw KV) or the tenant layer. I think it makes little difference here, since we manipulate the servers themselves, but maybe expressing the intent explicitly is better. Then we could deprecate those methods on the level above, where a server could have no storage and no node ID?
There's no bug in 23.1; it was using the correct servers for source and destination.
bors r+
Build succeeded: