-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
roachprod: force wait=true for stop --signal=9
#78268
Conversation
Fixes cockroachdb#77334. Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for PR! Minor comments, otherwise it's good to go.
Reviewed all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @otan and @tbg)
pkg/roachprod/roachprod.go, line 680 at r1 (raw file):
// If Wait is set, roachprod waits until the PID disappears (i.e. the // process has terminated). Wait bool // forced to true when Sig == 9
Might be worthwhile to also change roachprod's DefaultStopOpts.Wait
default to true
.
pkg/roachprod/install/cluster_synced.go, line 228 at r1 (raw file):
func (c *SyncedCluster) Stop(ctx context.Context, l *logger.Logger, sig int, wait bool) error { if sig == 9 { // `kill -9` without wait is never what a caller wants. See #77334.
While I can't think of a specific use-case which would require wait=false
, it would improve debugging experience to log in the case that the wait
value is overridden below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TFTR! No changes, but would like you to take a look at my responses before I merge.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @otan and @srosenberg)
pkg/roachprod/roachprod.go, line 680 at r1 (raw file):
Previously, srosenberg (Stan Rosenberg) wrote…
Might be worthwhile to also change roachprod's
DefaultStopOpts.Wait
default totrue
.
I intentionally didn't do that, or things like this will rot:
stopOpts := option.DefaultStopOpts()
stopOpts.RoachprodOpts.Sig = 2
DefaultStopOpts()
has 50 usages. I would like to minimize potential disruption during the stability period (and would like to backport this without sweating). Generally I still like the plan to move from Wait
-> to NoWait
but I think that should be done separately from fixing the missing wait in kill -9
that exists pervasively across the test suite and is ever desirable.
pkg/roachprod/install/cluster_synced.go, line 228 at r1 (raw file):
Previously, srosenberg (Stan Rosenberg) wrote…
While I can't think of a specific use-case which would require
wait=false
, it would improve debugging experience to log in the case that thewait
value is overridden below.
It won't be useful to log this. Tests call c.Stop
all the time and since I am averse to changing the default for the Wait
flag at this moment (see other comment) it would just get spammed all over the place (into already spammy logs :-) )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r1.
Reviewable status: complete! 1 of 0 LGTMs obtained (waiting on @otan and @tbg)
pkg/roachprod/roachprod.go, line 680 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
I intentionally didn't do that, or things like this will rot:
stopOpts := option.DefaultStopOpts() stopOpts.RoachprodOpts.Sig = 2
DefaultStopOpts()
has 50 usages. I would like to minimize potential disruption during the stability period (and would like to backport this without sweating). Generally I still like the plan to move fromWait
-> toNoWait
but I think that should be done separately from fixing the missing wait inkill -9
that exists pervasively across the test suite and is ever desirable.
Sounds good; would you mind leaving a TODO for a follow-up refactoring.
pkg/roachprod/install/cluster_synced.go, line 228 at r1 (raw file):
Previously, tbg (Tobias Grieger) wrote…
It won't be useful to log this. Tests call
c.Stop
all the time and since I am averse to changing the default for theWait
flag at this moment (see other comment) it would just get spammed all over the place (into already spammy logs :-) )
Got it; logs are way too spammy as is :(
bors r=srosenberg Not going to put a TODO in the code but instead leaving #77334 open. |
Build failed (retrying...): |
Build succeeded: |
Touches #77334.
Release note: None