[JENKINS-39150] expose diagnostics across all the channels #120
To be used by support-core, we need to be able to enumerate all active channels. We do this via a WeakHashMap so that references get garbage collected automatically. An unclosed channel will remain in memory forever, which also helps us find those leaks.
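For readers unfamiliar with the pattern, here is a minimal sketch of how a WeakHashMap-backed registry can behave this way. It is not the code from this PR: `DemoChannel`, `DemoRef`, and `ChannelRegistry` are hypothetical names, and the sketch assumes the map value keeps a strong reference to its channel until `close()` clears it. WeakHashMap holds its values strongly, so an unclosed channel stays reachable and remains in the map.

```java
import java.util.Collections;
import java.util.Map;
import java.util.WeakHashMap;

// Hypothetical stand-in for a channel; not the real remoting Channel class.
class DemoChannel {
    final String name;
    // Registry value. It refers back to this channel until close() clears it.
    final DemoRef ref = new DemoRef(this);

    DemoChannel(String name) {
        this.name = name;
        ChannelRegistry.ACTIVE.put(this, ref);
    }

    void close() {
        // Drop the strong reference so the weak key can eventually be collected.
        ref.clear();
    }
}

// Hypothetical holder whose reference is cleared when the channel is closed.
class DemoRef {
    private DemoChannel channel;

    DemoRef(DemoChannel channel) { this.channel = channel; }

    synchronized DemoChannel get() { return channel; }

    synchronized void clear() { channel = null; }
}

class ChannelRegistry {
    // Keys are weakly held; values are strongly held by the map.
    static final Map<DemoChannel, DemoRef> ACTIVE =
            Collections.synchronizedMap(new WeakHashMap<DemoChannel, DemoRef>());

    // Counts entries whose channel has not been closed yet.
    static int countOpen() {
        int open = 0;
        for (DemoRef ref : ACTIVE.values().toArray(new DemoRef[0])) {
            if (ref.get() != null) {
                open++;
            }
        }
        return open;
    }
}
```

Under these assumptions a closed channel disappears from the map only after a garbage collection, so a diagnostic dump should check the reference rather than rely on the map size.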
🐛 for potential concurrency issues
 * When a transport is functioning correctly, {@link #commandsSent} of one side
 * and {@link #commandsReceived} of the other side should closely match.
 */
private int commandsReceived;
Should be AtomicLong 🐜
@@ -494,6 +519,8 @@ protected Channel(ChannelBuilder settings, CommandTransport transport) throws IO

transport.setup(this, new CommandReceiver() {
    public void handle(Command cmd) {
        commandsReceived++;
🐛 since handling may potentially happen in different threads
No, commands are received in serial order, so it is not possible for this to happen in different threads that do not have a memory barrier established between them.
Perhaps you meant a reader seeing an incorrect value, because a write to a normal non-volatile long variable can be split into two 32-bit writes (JLS). I've added volatile.
Yes, that's what I meant. Thanks for addressing it.
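To make the outcome of this thread concrete, here is a small sketch of a single-writer counter. `ReceivedCounter` and `countOnWriterThread` are hypothetical names, not code from this PR. The sketch assumes, as stated above, that exactly one thread ever increments the field. `volatile` makes each 64-bit read and write atomic and visible to readers, but `++` is still a separate read and write, so this would not be safe with several writers (that case would call for AtomicLong).

```java
class ReceivedCounter {
    // volatile: reads and writes of this long are atomic and visible across threads.
    private volatile long commandsReceived;

    // Must only be called from the single receiving thread,
    // because ++ is a read followed by a write, not one atomic step.
    void onCommand() {
        commandsReceived++;
    }

    // Safe to call from any thread, for example a diagnostics dump.
    long get() {
        return commandsReceived;
    }

    // Runs n increments on one dedicated writer thread and returns the final count.
    static long countOnWriterThread(final int n) {
        final ReceivedCounter counter = new ReceivedCounter();
        Thread writer = new Thread(new Runnable() {
            public void run() {
                for (int i = 0; i < n; i++) {
                    counter.onCommand();
                }
            }
        });
        writer.start();
        try {
            writer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return counter.get();
    }
}
```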
@@ -1149,6 +1178,20 @@ public void dumpPerformanceCounters(PrintWriter w) throws IOException {
}

/**
 * Print the diagnostic information.
 */
public void dumpDiagnostics(PrintWriter w) throws IOException {
🐛 Unsynchronized access to the pendingCalls container
Hashtable is synchronized, but it is a legacy class. Conflicts with #109
@@ -163,4 +165,14 @@ public T call() throws RuntimeException {
        return t;
    }
}

public void testDiagnostics() throws Exception {
Nice2have: reference the issue
This pull request originates from a CloudBees employee. At CloudBees, we require that all pull requests be reviewed by other CloudBees employees before we seek to have the change accepted. If you want to learn more about our process please see this explanation.
 * Timestamp of the last {@link Command} object sent/received, in
 * {@link System#currentTimeMillis()} format.
 */
private long lastCommandSent, lastCommandReceived;
NIT: not that important because it's private, I guess, but maybe suffix with At, like createdAt, to distinguish units here? For example, commandsSent is the number of commands sent, whereas lastCommandSent is actually a date/timestamp.
Also, isn't lastCommandReceived redundant with lastHeard?
private volatile long lastHeard;
You are right! I removed the duplication
... and see how they can be interpreted to aid diagnostics
JLS (http://docs.oracle.com/javase/specs/jls/se7/html/jls-17.html#jls-17.7) states that a write to a non-volatile long variable may be performed as two separate 32-bit writes, so without volatile a read could retrieve a completely bogus value. There's no risk of writer contention here because writes are serialized by the context in which they are invoked.
🐛
 * Used for diagnostics.
 */
public static void dumpDiagnosticsForAll(PrintWriter w) throws IOException {
    for (Ref ref : ACTIVE_CHANNELS.values()) {
I think it would be better to use ACTIVE_CHANNELS.values().toArray(new Ref[0]) so that we do not need the lock. You risk a CME when iterating, e.g. see the javadoc for Collections.synchronizedMap, which explicitly states:
"It is imperative that the user manually synchronize on the returned map when iterating over any of its collection views"
The toArray will give you the shortest lock time.
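A minimal sketch of the snapshot approach suggested here, for illustration only. `SnapshotDump` and `REGISTRY` are hypothetical names, and plain strings stand in for the `Ref` values. On a map returned by `Collections.synchronizedMap`, `values().toArray(...)` is executed while holding the map's lock and returns a private copy, so the loop that follows needs no lock and cannot throw ConcurrentModificationException.

```java
import java.io.PrintWriter;
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

class SnapshotDump {
    // Hypothetical registry; the PR's ACTIVE_CHANNELS holds Ref values instead.
    static final Map<Integer, String> REGISTRY =
            Collections.synchronizedMap(new HashMap<Integer, String>());

    static void dumpAll(PrintWriter w) {
        // The copy is taken under the synchronized map's lock, held only for the copy.
        String[] snapshot = REGISTRY.values().toArray(new String[0]);
        // Iterating the private array is safe even if REGISTRY changes meanwhile.
        for (String entry : snapshot) {
            w.println(entry);
        }
    }
}
```

The trade-off is that the dump reflects the registry at the moment of the copy, which seems acceptable for diagnostics.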
resolved
makes sense
All review comments addressed
🐝
Would be great to address the last comment from @stephenc and to add
@kohsuke By the way, do you need backporting to stable-2.x? Diagnosability improvements IMHO qualify if they are pretty small and get soaked enough.
Crap, I meant to target this to
Taken over by #122. See you there. |
@reviewbybees