-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Jenkins 31084 graceful shutdown #115
base: master
Are you sure you want to change the base?
Jenkins 31084 graceful shutdown #115
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks great so far! I like the use of a ShutdownHook
, and the backend APIs are pretty much exactly what I expected.
Have you given any thought as to how this functionality might be activated or deactivated? The Java API states:
Shutdown hooks should also finish their work quickly. When a program invokes exit the expectation is that the virtual machine will promptly shut down and exit. When the virtual machine is terminated due to user logoff or system shutdown the underlying operating system may only allow a fixed amount of time in which to shut down and exit. It is therefore inadvisable to attempt any user interaction or to perform a long-running computation in a shutdown hook.
For this reason as well as to preserve backwards compatibility of the existing command-line API, I think this new functionality should probably be opt-in rather than opt-out by default. What do you think?
|
||
|
||
public static class ShutdownHook extends Thread { | ||
private SwarmClient client; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
[nit] These can be private final
.
try { | ||
client.takeSlaveOfflineAndWait(target, "JVM is shutting down"); | ||
} catch (Exception e) { | ||
logger.log(Level.SEVERE, "Unable to perform graceful slave shutdown: ", e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would happen if you started a shutdown operation (e.g., by sending a SIGINT
with Ctrl-C or a SIGTERM
with kill(1)
) and then sent another signal (e.g., another SIGINT
by pressing Ctrl-C again) while we were in the middle of takeSlaveOfflineAndWait()
? I think the expected behavior is that we would stop immediately without continuing to wait in takeSlaveOfflineAndWait()
. I think that your current code probably already handles this in that an InterruptedException
would be thrown (and you are catching Exception
here). It might be worth validating this with a manual test. Even if the code works as expected, might it be worth printing an additional message in this case (e.g., "terminating with extreme prejudice")?
|
||
node.toComputer().setTemporarilyOffline(true, new OfflineCause.UserCause(User.current(), reason)); | ||
|
||
normalResponse(req, rsp, "<response/>"); //TODO: not sure what to send here |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure either offhand. Assuming that the default response status is javax.servlet.http.HttpServletResponse.SC_OK
, can we rely on the response status alone and simply avoid calling rsp.getCompressedWriter()
in the first place for such responses? I can look into this further if you're having trouble.
} | ||
|
||
public void doGetSlaveReadyForShutdown(StaplerRequest req, StaplerResponse rsp, @QueryParameter String name) throws IOException { | ||
Jenkins jenkins = Jenkins.getInstance(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding checking for secrets/permissions: I can't see any disadvantages to checking, and I can't see any advantages to not checking. In contrast, if we don't check, we increase the attack surface for potential security issues. Based on the above reasoning, I think I'd prefer to check secrets/permissions.
I have mixed feeling about the design. Then there is the opt-in. I see two main approaches:
Number 2 is not really opt-in since it will require instrumentation changes. It's also not trivial to do using only java's shutdown hooks (without explicitly registering signal hooks), but we might as well require Which leaves these options:
I'm in favor of choice 3 because I am not confident about the suitability of shutdown hooks for this use case. HTTP listener is probably too heavy and opens some security vulnerabilities, but a command file (a la nagios.cmd) would work, eg |
Another half-baked idea:
This might not work with unique client names. Alternatively:
Again unique client names makes finding the right one to work with difficult if multiple instances are present. |
Work in progress. Seems to be working so far.
Question: should
doGetSlaveReadyForShutdown
also check secrets/permissions?