roachprod: only use cluster with positive lifetime #101434
Previously, it was possible to use a locally cached cluster that had already expired and been GC'd. That could lead to operations affecting VMs that belong to a different cluster that is now reusing the old cluster's IPs.

This PR adds an expiration check when reading cluster info from the local file cache.

Epic: none
Release note: None
After testing the existing fix that checks the cluster name on the target node, I got this:
Which just states an SSH error, I think because the IP is not used by another cluster. We should also improve the error messages that roachprod outputs.
Reopening as a general improvement request.
Checking the lifetime is a possibility, but IMO a more long-term solution would be to surface authentication errors more explicitly (and not retry them), instead of reporting the generic error. There are some thoughts in #100875.