-
Notifications
You must be signed in to change notification settings - Fork 196
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Flush overall Pipeline log stream when shutting down Jenkins #669
Conversation
@@ -1694,6 +1694,7 @@ public void pause(final boolean v) throws IOException { | |||
LOGGER.log(Level.WARNING, null, t); | |||
} | |||
}); | |||
cpsExec.getOwner().getListener().getLogger().close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We can also just flush
(I tested both options), but I'm not really sure that it matters either way.
cpsExec.getOwner().getListener().getLogger().close(); | |
cpsExec.getOwner().getListener().getLogger().flush(); |
If |
@@ -1694,6 +1694,7 @@ public void pause(final boolean v) throws IOException { | |||
LOGGER.log(Level.WARNING, null, t); | |||
} | |||
}); | |||
cpsExec.getOwner().getListener().getLogger().close(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is getOwner()
guaranteed to be non-null? getListener()
? In principle SpotBugs should know but sometimes it seems to miss these things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think CpsFlowExecution.owner
is guaranteed to be non-null. It's set here and here (called here).
getListener
may return WorkflowRun.NULL_LISTENER
, which is harmless to close, or throw an exception, but this code already handles exceptions, and I made this be the final operation in the try
block so that other more important cleanup still runs even if this throws.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If JenkinsSessionRule does not work, try RealJenkinsRule.
Good call, RealJenkinsRule
is indeed able to reproduce the issue. I guess JVM reuse across the restart is why this can't be reproduced using JenkinsSessionRule
or something? The static FileLogStorage.openStorages
field actually ends up returning the same object across the restart? EDIT: Yeah that is exactly the problem, with JenkinsSessionRule
we reuse the FileLogStorage
object and so we cannot lose any data from before the restart.
r.assertLogContains("Resuming build at ", b); | ||
r.assertLogContains("Pausing (Preparing for shutdown)", b); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first assertion here passes with or without the fix, since that message is printed after the restart. However, the second assertion is printed right before Jenkins shuts down, and it is missing from the log if you revert the fix.
Build failure is because |
…own failure on Windows
…ushLogsOnShutdown failure on Windows
I am merging this now because I would like to get a release out sooner than later, but I'll try to keep an eye on Dependabot PRs and remove the test assumption once we pick up jenkinsci/jenkins-test-harness#559. |
You can simply override |
Right now, when Jenkins shuts down gracefully, we attempt to pause Pipeline execution and save all related state information. However, we do not flush the listener for the overall log, which means that for
LogStorage
implementations likeFileLogStorage
that write the log and log metadata to separate files, and write metadata synchronously while buffering log writes, the metadata and log file may be out of sync after a Jenkins restart.In particular, when performing a safe restart, we write "Pausing (Preparing for shutdown)" to the build log and pause the Pipeline. For
FileLogStorage
, if the last log line was written by a step, then we will immediately write the current offset in the log to the metadata file and then flush the metadata file. We will also write the pause message to the log file, but that write will be buffered, and may be lost. On restart, subsequent writes will begin at the log offset that we expected to start with "Pausing (Preparing for shutdown)". ForFileLogStorage
, this is fine, because the first lines after the restart are guaranteed to come from the overall log, but for more complex implementations ofLogStorage
that store additional metadata (e.g. line lengths), this desynchronization can cause problems.TODO:
JenkinsSessionRule
+doQuietDown
). I can only reproduce the issue while restarting an actual Jenkins instance using/safeRestart
.RealJenkinsRule
, so I added a new test