-
Notifications
You must be signed in to change notification settings - Fork 70
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add read timeout for how long we query beans #168
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for having a look into this! Left a couple of comments
@@ -53,6 +54,10 @@ public RemoteConnection(LinkedHashMap<String, Object> connectionParams) | |||
} | |||
|
|||
} | |||
|
|||
//Set an RMI timeout so we don't get stuck waiting for a bean to report a value | |||
System.setProperty("sun.rmi.transport.tcp.responseTimeout", DEFAULT_RMI_RESPONSE_TIMEOUT); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what happens when we reach the timeout? Do we catch and handle the exception that's raised correctly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An error should be shown during the refresh bean list period where we scope out all available beans. If we hit the timeout there, an exception will be thrown (with a read timeout) and we should wait until the next collection period to try again.
@@ -20,6 +20,7 @@ | |||
private String jmx_url; | |||
private static final String TRUST_STORE_PATH_KEY = "trust_store_path"; | |||
private static final String TRUST_STORE_PASSWORD_KEY = "trust_store_password"; | |||
private static final String DEFAULT_RMI_RESPONSE_TIMEOUT = "5000"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
based on the docs this is expressed in milliseconds, are you sure 5 seconds is enough in usual cases? (I know some JVMs expose a lot of beans, and on these environments I could imagine that the call to get all the beans takes more than 5 seconds).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I bumped this up to 15 by default and am using the user defined "refresh_beans" parameter to address this and make it configurable. Though it may make more sense to actually have this set to the min_collection_interval. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I bumped this again to actually be the default of the refresh_bean timeout as I think that makes the most sense here.
rmi_timeout = (String) connectionParams.get("refresh_beans"); | ||
if (rmi_timeout == null) { | ||
rmi_timeout = DEFAULT_RMI_RESPONSE_TIMEOUT; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's some indentation weirdness here
17404c0
to
25b7061
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 👍
Adds in the
sun.rmi.transport.tcp.responseTimeout
system property before starting the RMI connection.Customer experienced a blocking bean that caused JMXFetch to hang indefinitely (order of magnitude hours) This should help reduce cases like that and allow JMXFetch to continue grabbing the remaining metrics.