-
-
Notifications
You must be signed in to change notification settings - Fork 272
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[JENKINS-39543] Improve the caller/callee correlation diagnostics #119
Conversation
I was trying to diagnose a remoting call that bridges two VMs that is hanging. This can be made easier if thread dump allows us to correlate the caller/callee. This change adds the request ID to thread name so that thread dump will show this clearly Also fixes one incorrect use of toString() in diagnosis, which should be getName() --- see all the other uses of thread names.
LGTM, 🐝 |
@@ -233,6 +233,8 @@ public boolean isDone() { | |||
|
|||
public RSP get() throws InterruptedException, ExecutionException { | |||
synchronized(Request.this) { | |||
String oldThreadName = Thread.currentThread().getName(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe oldThreadName
is a bit confuse when you are asking for currentThread()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the same variable name used everywhere. I think it's old (thread name)
not (old thread) name
🐝 |
This pull request originates from a CloudBees employee. At CloudBees, we require that all pull requests be reviewed by other CloudBees employees before we seek to have the change accepted. If you want to learn more about our process please see this explanation. |
Post-:bee: |
I will add the post-factum JIRA issue since this stabilization branch is a subject for careful reviews |
The change introduces one serious bugfix (JENKINS-39596) and a bunch of various diagnostics improvements. Bugfixes: * [JENKINS-39596](https://issues.jenkins-ci.org/browse/JENKINS-39596) - Jenkins URL in `hudson.remoting.Engine` was always `null` since `3.0`. It was causing connection failures of Jenkins JNLP agents when using Java Web Start. ([PR jenkinsci#131](jenkinsci/remoting#131)) * [JENKINS-39617](https://issues.jenkins-ci.org/browse/JENKINS-39617) - `hudson.remoting.Engine` was failing to establish connection if one of the URLs parameter in parameters was malformed. ([PR jenkinsci#131](jenkinsci/remoting#131)) Improvements: * [JENKINS-39150](https://issues.jenkins-ci.org/browse/JENKINS-39150) - Add logic for dumping diagnostics across all the channels. ([PR jenkinsci#122](jenkinsci/remoting#122), [PR jenkinsci#125](jenkinsci/remoting#125)) * [JENKINS-39543](https://issues.jenkins-ci.org/browse/JENKINS-39543) - Improve the caller/callee correlation diagnostics in thread dumps. ([PR jenkinsci#119](jenkinsci/remoting#119)) * [JENKINS-39290](https://issues.jenkins-ci.org/browse/JENKINS-39290) - Add the `org.jenkinsci.remoting.nio.NioChannelHub.disabled` flag for disabling NIO (mostly for debugging purposes). ([PR jenkinsci#123](jenkinsci/remoting#123)) * [JENKINS-38692](https://issues.jenkins-ci.org/browse/JENKINS-38692) - Add extra logging to help diagnosing `IOHub` Thread spikes. ([PR #116](jenkinsci/remoting#116)) * [JENKINS-39289](https://issues.jenkins-ci.org/browse/JENKINS-39289) - When a proxy fails, report what caused the channel to go down. ([PR jenkinsci#128](jenkinsci/remoting#128))
The change introduces one serious bugfix (JENKINS-39596) and a bunch of various diagnostics improvements. Bugfixes: * [JENKINS-39596](https://issues.jenkins-ci.org/browse/JENKINS-39596) - Jenkins URL in `hudson.remoting.Engine` was always `null` since `3.0`. It was causing connection failures of Jenkins JNLP agents when using Java Web Start. ([PR #131](jenkinsci/remoting#131)) * [JENKINS-39617](https://issues.jenkins-ci.org/browse/JENKINS-39617) - `hudson.remoting.Engine` was failing to establish connection if one of the URLs parameter in parameters was malformed. ([PR #131](jenkinsci/remoting#131)) Improvements: * [JENKINS-39150](https://issues.jenkins-ci.org/browse/JENKINS-39150) - Add logic for dumping diagnostics across all the channels. ([PR #122](jenkinsci/remoting#122), [PR #125](jenkinsci/remoting#125)) * [JENKINS-39543](https://issues.jenkins-ci.org/browse/JENKINS-39543) - Improve the caller/callee correlation diagnostics in thread dumps. ([PR #119](jenkinsci/remoting#119)) * [JENKINS-39290](https://issues.jenkins-ci.org/browse/JENKINS-39290) - Add the `org.jenkinsci.remoting.nio.NioChannelHub.disabled` flag for disabling NIO (mostly for debugging purposes). ([PR #123](jenkinsci/remoting#123)) * [JENKINS-38692](https://issues.jenkins-ci.org/browse/JENKINS-38692) - Add extra logging to help diagnosing `IOHub` Thread spikes. ([PR #116](jenkinsci/remoting#116)) * [JENKINS-39289](https://issues.jenkins-ci.org/browse/JENKINS-39289) - When a proxy fails, report what caused the channel to go down. ([PR #128](jenkinsci/remoting#128))
I was trying to diagnose a remoting call that bridges two VMs that is hanging. This can be made easier if thread dump allows us to correlate the caller/callee.
This change adds the request ID to thread name so that thread dump will show this clearly
Also fixes one incorrect use of
toString()
in diagnosis, which should begetName()
--- see all the other uses of thread names.@reviewbybees