-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix hang for --discover-log-location flag #592
Conversation
This was missed in #384.
@@ -225,6 +225,8 @@ func run(ctx context.Context, wg *sync.WaitGroup, globalCollectionOpts state.Col | |||
|
|||
if globalCollectionOpts.DiscoverLogLocation { | |||
selfhosted.DiscoverLogLocation(ctx, servers, globalCollectionOpts, logger) | |||
testRunSuccess = make(chan bool, 1) | |||
testRunSuccess <- true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if this is a good fix (this isn't really a "testRun"), but it works, and I don't have any better ideas. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is happening because the main function selects from this channel, and selecting from a nil channel will block forever. It seems Go's recommended solution is to add a timeout to that select, but we wouldn't want to confuse users with a timeout error when the function is returning normally.
So since select statements apparently can't check if the channel is nil first, this is the only way for the main function to handle it:
DoneOrSignal:
for {
if testRunSuccess != nil {
select {
case success := <-testRunSuccess:
if reloadRun {
if success {
Reload(logger)
} else {
logger.PrintError("Error: Reload requested, but ignoring since configuration errors are present")
exitCode = 1
}
} else if !success {
exitCode = 1
}
break DoneOrSignal
case s := <-sigs:
if s == syscall.SIGINT || s == syscall.SIGTERM {
logger.PrintError("Interrupt")
break DoneOrSignal
}
}
} else {
select {
case s := <-sigs:
if s == syscall.SIGINT || s == syscall.SIGTERM {
logger.PrintError("Interrupt")
break DoneOrSignal
}
}
}
}
I don't think either option is easy to understand, so your approach is probably better since it results in fewer lines of code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, I think the underlying bug here is that we expect anything that exits quickly, instead of continuing to run, to be a test run and depend on reading something from the testRun
channel (or a signal). I think a proper fix involves rethinking that, but the run
method is kind of daunting, and I'd rather fix this first and refactor that some other time.
@@ -225,6 +225,8 @@ func run(ctx context.Context, wg *sync.WaitGroup, globalCollectionOpts state.Col | |||
|
|||
if globalCollectionOpts.DiscoverLogLocation { | |||
selfhosted.DiscoverLogLocation(ctx, servers, globalCollectionOpts, logger) | |||
testRunSuccess = make(chan bool, 1) | |||
testRunSuccess <- true |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It looks like this is happening because the main function selects from this channel, and selecting from a nil channel will block forever. It seems Go's recommended solution is to add a timeout to that select, but we wouldn't want to confuse users with a timeout error when the function is returning normally.
So since select statements apparently can't check if the channel is nil first, this is the only way for the main function to handle it:
DoneOrSignal:
for {
if testRunSuccess != nil {
select {
case success := <-testRunSuccess:
if reloadRun {
if success {
Reload(logger)
} else {
logger.PrintError("Error: Reload requested, but ignoring since configuration errors are present")
exitCode = 1
}
} else if !success {
exitCode = 1
}
break DoneOrSignal
case s := <-sigs:
if s == syscall.SIGINT || s == syscall.SIGTERM {
logger.PrintError("Interrupt")
break DoneOrSignal
}
}
} else {
select {
case s := <-sigs:
if s == syscall.SIGINT || s == syscall.SIGTERM {
logger.PrintError("Interrupt")
break DoneOrSignal
}
}
}
}
I don't think either option is easy to understand, so your approach is probably better since it results in fewer lines of code.
This was missed in #384.