-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
roachprod: guard calls to SetupSSH #96794
Conversation
560cb60
to
74b5270
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @herkolategan and @srosenberg)
pkg/roachprod/vm/gce/gcloud.go
line 654 at r1 (raw file):
// ConfigSSH is part of the vm.Provider interface func (p *Provider) ConfigSSH(zones []string) error { configSSHMu.Lock()
Would this make more sense at the roachprod level? Only having locking for GCP will still result in potential write contention to the config file from other providers.
Conceptually it makes sense to have only one process modifying ssh config.
Also if there is ever a case where we have multiple roachtests running from the same machine (I don't think we can), this would not work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @smg260 and @srosenberg)
pkg/roachprod/vm/gce/gcloud.go
line 654 at r1 (raw file):
Previously, smg260 (Miral Gadani) wrote…
Would this make more sense at the roachprod level? Only having locking for GCP will still result in potential write contention to the config file from other providers.
Conceptually it makes sense to have only one process modifying ssh config.
Also if there is ever a case where we have multiple roachtests running from the same machine (I don't think we can), this would not work.
I was dubious at doing it on aroachprod
level as each provider might have different ways of providing access. AWS seems to send a key rather than modifying the shared
local resource. But then again there might be little harm in a blanket lock one level up? It's true if you run multiple roachprod
instances the issue would persist, I'll look into a system-wide lock (maybe a file based lock).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @smg260 and @srosenberg)
pkg/roachprod/vm/gce/gcloud.go
line 654 at r1 (raw file):
Previously, herkolategan (Herko Lategan) wrote…
I was dubious at doing it on a
roachprod
level as each provider might have different ways of providing access. AWS seems to send a key rather than modifying theshared
local resource. But then again there might be little harm in a blanket lock one level up? It's true if you run multipleroachprod
instances the issue would persist, I'll look into a system-wide lock (maybe a file based lock).
Right so Sync
which ultimately calls SetupSSH
already does a file lock
for the sync part. I'll re-use that logic and add one for the call to SetupSSH
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @herkolategan and @srosenberg)
pkg/roachprod/vm/gce/gcloud.go
line 654 at r1 (raw file):
Previously, herkolategan (Herko Lategan) wrote…
Right so
Sync
which ultimately callsSetupSSH
already does afile lock
for the sync part. I'll re-use that logic and add one for the call toSetupSSH
.
I think the issue this PR addresses only arises because roachtest can attempt to create multiple clusters simultaneously (via multiple roachprod calls).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @smg260 and @srosenberg)
pkg/roachprod/vm/gce/gcloud.go
line 654 at r1 (raw file):
Previously, smg260 (Miral Gadani) wrote…
I think the issue this PR addresses only arises because roachtest can attempt to create multiple clusters simultaneously (via multiple roachprod calls).
True, but I agree with your statement around multiple processes still being at risk. In fact CleanSSH
was already covered by the system-wide or (user user-wide rather) lock-file. Same just had to be done for SetupSSH.
This change ensures that calls to `SetupSSH` do not run concurrently across processes or threads. Overlapping calls are not safe and can lead to invalid SSH configurations. The scenario takes place when multiple clusters are created simultaneously from the same or multiple processes. Resolves: cockroachdb#90092 Release note: None
74b5270
to
87510fd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @smg260 and @srosenberg)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 3 files at r2, all commit messages.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @srosenberg)
bors r=renatolabs,smg260 |
Build failed: |
bors retry |
Build succeeded: |
This change ensures that calls to
SetupSSH
do not run concurrently acrossprocesses or threads. Overlapping calls are not safe and can lead to invalid SSH
configurations. The scenario takes place when multiple clusters are created
simultaneously from the same or multiple processes.
Resolves: #90092
Release note: None