-
-
Notifications
You must be signed in to change notification settings - Fork 272
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[JENKINS-39835] - Be extra defensive about Errors and Exceptions #133
Conversation
[JENKINS-39835] Be even more defensive then against leaving connections dangling.
This pull request originates from a CloudBees employee. At CloudBees, we require that all pull requests be reviewed by other CloudBees employees before we seek to have the change accepted. If you want to learn more about our process please see this explanation. |
🐝 |
} finally { | ||
// incase this was an OOMErr and logging caused another OOMErr | ||
recvKey.cancel(); | ||
onRecvClosed(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose this code should rethrow errors afterwards
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was not doing it before the change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was not catching Error
s before the change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if it does that you could kill the thread / leave the system in a bad state.
🐝 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do not feel suppressing Error
s is a good idea here. After closing the connection, we should try to rethrow it. 🐛
@jtnord Do you think we need a similar fix in stable-2.x? |
@oleg-nenashev if you throw the error will likely kill the thread and you may not end up in the state you think you want to end up in. I'm not comfortable re-throwing Errors, so if you would like that then I will leave it up to yourself, cloudbees oss-team or someone else to provide. |
This does not contain the JNLP4v protocol (and hence this code) does it? |
W.r.t. Killing the thread. Thinking about this more... You would only ever kill a worker thread, which should be ok as they are just assigned to this operation and the threadpool will replace if they die |
Yes this is 100% needed - and I feel this is a regression in remoting and the new LTS will be picking this up so we are likely to see reports of this more. Basically ops read is no longer registered and only adds it later on (when all data has been read), leaving hanging network. Has been observed in the wild on the pre OSS version of the code. |
I still do not like the handling of some fatal errors, but I agree to merge it taking the potential impact into account. |
BTW I also think that killing the worker thread is fine |
@oleg-nenashev if the worker thread dies handling a read then whatever is connected is dead as far as OP_READ. as OP_READ is removed when we start the thread and only removed when we have read all of the data available in the callback (ie the thread that dies). See line 147 in this block Did you see my proposal on the Jenkins-dev list for handling uncaught exceptions :-) |
Show me the link |
LogRecord record = new LogRecord(Level.SEVERE, "[{0}] Uncaught {1}"); | ||
record.setThrown(t); | ||
record.setParameters(new Object[]{stack().name(), t.getClass().getSimpleName()}); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The whole block could be written more simply as
LOGGER.log(Level.SEVERE, "Uncaught error in " + stack().name(), t);
without any real loss.
record.setParameters(new Object[]{stack().name(), t.getClass().getSimpleName()}); | ||
} | ||
} finally { | ||
// incase this was an OOMErr and logging caused another OOMErr |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Easier and probably safer to set a boolean
flag at the top of the method indicating whether it completed normally; wrap everything in a try
-finally
block that closes the channel if not.
OK, let's go forward with this PR as is. |
I am ok with the code as it is - logging the errors is fine. It seems reasonable safe to have it backported into .2. |
OK, going forward with the merge. Will be in this weekly if @kohsuke is available |
JENKINS-39835 Be even more defensive then against leaving connections dangling.
NB: no testing performed of this fix - it is a direct port from the internal code that was submitted to make the JNLP v4 protocol.
@reviewbybees esp @stephenc @oleg-nenashev