Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-42556] Make PlaceholderTask.getFullDisplayName ignore thread authentication #34

Merged

Conversation

jglick
Copy link
Member

@jglick jglick commented Mar 9, 2017

JENKINS-42556

@reviewbybees

return Run.fromExternalizableId(runId);
SecurityContext orig = ACL.impersonate(ACL.SYSTEM);
try {
return Run.fromExternalizableId(runId);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not to worry about showing the task to other users in the executor widget; this display code already handled that, since run() will normally return something without an access check (while the node block is up and running)—this code path is only used during ExecutorPickle rehydration when resuming a build.

import org.jvnet.hudson.test.RestartableJenkinsRule;

public class ExecutorPickleTest {

@ClassRule public static BuildWatcher buildWatcher = new BuildWatcher();
@Rule public RestartableJenkinsRule r = new RestartableJenkinsRule();
//@Rule public LoggerRule logging = new LoggerRule().record(Queue.class, Level.FINE);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Useful with jenkinsci/jenkins#2791, and triggers further exceptions fixed in jenkinsci/jenkins#2792.

r.j.jenkins.setAuthorizationStrategy(new MockAuthorizationStrategy().
grant(Jenkins.ADMINISTER).everywhere().to("admin").
grant(Jenkins.READ, Item.DISCOVER).everywhere().toEveryone());
r.j.jenkins.save(); // TODO pending https://github.com/jenkinsci/jenkins/pull/2790
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Override public void evaluate() throws Throwable {
SemaphoreStep.success("wait/1", null);
WorkflowJob p = r.j.jenkins.getItemByFullName("p", WorkflowJob.class);
assertFalse(p.getACL().hasPermission(Jenkins.ANONYMOUS, Item.READ));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

verifying that save worked

WorkflowJob p = r.j.createProject(WorkflowJob.class, "p");
p.setDefinition(new CpsFlowDefinition("node('remote') {semaphore 'wait'}", true));
SemaphoreStep.waitForStart("wait/1", p.scheduleBuild2(0).waitForStart());
remote.toComputer().setTemporarilyOffline(true, new OfflineCause.UserCause(User.get("admin"), "hold"));
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Otherwise ExecutorPickle could resume too quickly without ever even trying to print a message about its progress.

WorkflowJob p = r.j.jenkins.getItemByFullName("p", WorkflowJob.class);
assertFalse(p.getACL().hasPermission(Jenkins.ANONYMOUS, Item.READ));
WorkflowRun b = p.getBuildByNumber(1);
r.j.waitForMessage(Messages.ExecutorPickle_waiting_to_resume(Messages.ExecutorStepExecution_PlaceholderTask_displayName(b.getFullDisplayName())), b);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ghost
Copy link

ghost commented Mar 9, 2017

This pull request originates from a CloudBees employee. At CloudBees, we require that all pull requests be reviewed by other CloudBees employees before we seek to have the change accepted. If you want to learn more about our process please see this explanation.

Copy link
Member

@svanoort svanoort left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐝 Looks okay to me -- don't 100% understand the test but the code itself is straightforward.

@jglick
Copy link
Member Author

jglick commented Mar 10, 2017

don't 100% understand the test

FTR, this is the build output before the patch:

Started
[Pipeline] node
Running on remote in …/workspace/p
[Pipeline] {
[Pipeline] semaphore
Resuming build at Thu Mar 09 19:30:34 EST 2017 after Jenkins restart
[Pipeline] End of Pipeline
java.io.IOException: Failed to load build state
	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$3.onSuccess(CpsFlowExecution.java:586)
	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$3.onSuccess(CpsFlowExecution.java:584)
	at org.jenkinsci.plugins.workflow.cps.CpsFlowExecution$4$1.run(CpsFlowExecution.java:627)
	at org.jenkinsci.plugins.workflow.cps.CpsVmExecutorService$1.run(CpsVmExecutorService.java:35)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at hudson.remoting.SingleLaneExecutorService$1.run(SingleLaneExecutorService.java:112)
	at jenkins.util.ContextResettingExecutorService$1.run(ContextResettingExecutorService.java:28)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
	at java.lang.Thread.run(Thread.java:745)
Caused by: org.acegisecurity.AccessDeniedException: Please login to access job p
	at jenkins.model.Jenkins.getItem(Jenkins.java:2398)
	at jenkins.model.Jenkins.getItem(Jenkins.java:306)
	at jenkins.model.Jenkins.getItemByFullName(Jenkins.java:2504)
	at hudson.model.Run.fromExternalizableId(Run.java:2282)
	at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask.runForDisplay(ExecutorStepExecution.java:384)
	at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask.getDisplayName(ExecutorStepExecution.java:397)
	at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepExecution$PlaceholderTask.getFullDisplayName(ExecutorStepExecution.java:406)
	at org.jenkinsci.plugins.workflow.support.pickles.ExecutorPickle$1.printWaitingMessage(ExecutorPickle.java:116)
	at org.jenkinsci.plugins.workflow.support.pickles.TryRepeatedly$1.run(TryRepeatedly.java:95)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
	at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
	... 3 more
Finished: FAILURE

and after:

Started
[Pipeline] node
Running on remote in …/workspace/p
[Pipeline] {
[Pipeline] semaphore
Resuming build at Thu Mar 09 19:31:30 EST 2017 after Jenkins restart
Waiting to resume part of p #1: ???
Ready to run at Thu Mar 09 19:31:32 EST 2017
[Pipeline] }
[Pipeline] // node
[Pipeline] End of Pipeline
Finished: SUCCESS

@jglick jglick merged commit 30c1394 into jenkinsci:master Mar 10, 2017
@jglick jglick deleted the TryRepeatedly-anonDiscover-JENKINS-42556 branch March 10, 2017 00:34
jglick added a commit to jglick/workflow-durable-task-step-plugin that referenced this pull request Nov 16, 2021
jglick added a commit that referenced this pull request Oct 17, 2022
…tion of `node` step, not kill whole build (#180)

* Sketch of non-`Pickle`-based resumption of `ExecutorStepExecution`

* Amending test from #34

* Fixing `Callback`

* Exploring where to detect a failure eligible for retry

* Making `normalNodeDisappearance` pass under somewhat altered circumstances

* First successful `node` block retry

* `FilePathDynamicContext` must take precedence over `ExecutorStepDynamicContext.FilePathTranslator`

* `ExecutorStepTest.unloadableExecutorPickle` as currently conceived no longer makes sense

* Fixing `ExecutorStepTest.nodeDisconnectMissingContextVariableException` by copying some diagnostic code from `FilePathDynamicContext`

* Fixing `ExecutorStepTest.contextualizeFreshFilePathAfterAgentReconnection`

* Marking `ExecutorStepRetryEligibility` beta

* Javadoc imports

* SpotBugs: does not make sense to allow `ExecutorStepDynamicContext.node` to be null

* Reasonable behavior of `ExecutorPickleTest.canceledQueueItem`

* User-visible logging about 5m timeout now that `ExecutorPickle.printWaitingMessage` is unused

* `ExecutorPickleTest` → `ExecutorStepDynamicContextTest`

* Move `TIMEOUT_WAITING_FOR_NODE_MILLIS` from `ExecutorPickle` to `ExecutorStepExecution`

* Some user-visible logging when `ExecutorStepRetryEligibility` is _not_ used

* Do not block on `FutureImpl`.
After jenkinsci/workflow-step-api-plugin#73 this caused `SecretsMasker` to block on a `KubernetesComputer`
which in turn blocked provisioning of a `TaskListener`, causing `RestartPipelineTest.terminatedPodAfterRestart` to fail to find a message:

```
FINE	o.j.p.w.s.d.DurableTaskStep$Execution#getWorkspace: rediscovering that terminated-pod-after-restart-1-8fb2m-j206k-8x125 has been removed and timeout has expired
FINE	o.j.p.w.s.d.DurableTaskStep$Execution#_listener: JENKINS-34021: could not get TaskListener in CpsStepContext[9:sh]:Owner[terminated Pod After Restart/1:terminated Pod After Restart #1]
java.util.concurrent.TimeoutException
	at hudson.remoting.AsyncFutureImpl.get(AsyncFutureImpl.java:102)
	at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepDynamicContext$Translator.get(ExecutorStepDynamicContext.java:115)
Caused: java.io.IOException
	at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepDynamicContext$Translator.get(ExecutorStepDynamicContext.java:118)
	at org.jenkinsci.plugins.workflow.steps.DynamicContext$Typed.get(DynamicContext.java:97)
	at org.jenkinsci.plugins.workflow.cps.ContextVariableSet.get(ContextVariableSet.java:139)
	at org.jenkinsci.plugins.workflow.cps.ContextVariableSet$1Delegate.doGet(ContextVariableSet.java:98)
	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:75)
	at org.csanchez.jenkins.plugins.kubernetes.pipeline.SecretsMasker$Factory.get(SecretsMasker.java:85)
	at org.csanchez.jenkins.plugins.kubernetes.pipeline.SecretsMasker$Factory.get(SecretsMasker.java:73)
	at org.jenkinsci.plugins.workflow.steps.DynamicContext$Typed.get(DynamicContext.java:95)
	at org.jenkinsci.plugins.workflow.cps.ContextVariableSet.get(ContextVariableSet.java:139)
	at org.jenkinsci.plugins.workflow.cps.CpsThread.getContextVariable(CpsThread.java:135)
	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.doGet(CpsStepContext.java:297)
	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:75)
	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.getListener(DefaultStepContext.java:127)
	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:87)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution._listener(DurableTaskStep.java:421)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.listener(DurableTaskStep.java:412)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.getWorkspace(DurableTaskStep.java:363)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.check(DurableTaskStep.java:570)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.run(DurableTaskStep.java:553)
	at …
FINE	o.j.p.w.s.d.DurableTaskStep$Execution$NewlineSafeTaskListener#getLogger: creating filtered stream
FINE	o.j.p.w.s.d.DurableTaskStep$Execution#_listener: terminated-pod-after-restart-1-8fb2m-j206k-8x125 has been removed for 15 sec, assuming it is not coming back
```

* A `sh` step could fail to find a workspace at the moment it starts.
java.lang.NullPointerException
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:329)
	at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:319)
	at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:193)
	at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
	at …

* Also handling `ClosedChannelException` as commonly thrown by `SimpleBuildStep`s and the like: jenkinsci/kubernetes-plugin#1083 (comment)

* Also retry after `MissingContextVariableException` on `FilePath`

* Remove `super.onResume` left over in #46; see jenkinsci/workflow-step-api-plugin#66

* Documenting need for JENKINS-30383

* jenkinsci/workflow-step-api-plugin#77 allows `retryNodeBlockSynchAcrossRestarts` to be fixed

* jenkinsci/workflow-step-api-plugin#77 released

* Expanding `MissingContextVariableException` type list after noticing that `isUnix` requests `Launcher` not `FilePath`

* Never log a message with only `PlaceholderTask.cookie`; also need `.runId` for context

* Suppressing uninteresting log message

* Maybe need to call `StepContext.saveState`?

* Deleting comment; for now it seems clearest to keep `ExecutorStepDynamicContext` as a distinct type

* Resolve longstanding tech debt from #104 & #117 about `BodyExecution`.
When we do not use pickles, there is no need for `transient`.
Moving the reference to `ExecutorStepExecution` from `PlaceholderTask` avoids a memory leak.
This is only possible now that `Callback` holds the `ExecutorStepExecution`.
And we can now properly handle `AsynchronousExecution.interrupt`.

* Sketch of updating `WorkspaceStepExecution`: #180 (comment)

* Allowing `FilePathDynamicContext` to block waiting for an agent to reconnect.
Not great since the CPS VM thread will only grant 5s.
Better would be to avoid unnecessary use of `StepContext.get(FilePath.class)`
(which implicitly requires a live `Channel`);
perhaps `FilePathRepresentation` should become an API.
Amends b65ff1c.
#180 (comment)

* Adding a comment about 8b08398

* 8b08398 seems to allow a968adf to be finished.

* Refining 8b08398 to avoid tail call

* `org.jenkinsci.plugins.workflow.support.concurrent.Futures` deprecated

* SpotBugs reminds me that `WorkspaceStepExecution.Callback` needs an SVUID; taking one from trunk

* Adding `NodeTranslator`

* Beginning rewrite to `AgentErrorCondition`

* Extracted `RetryExecutorStepTest` from `ExecutorStepTest`

* Rely on blocking `ExecutorStepExecution.onResume`

* Removing bogus `waitForMessage` (and unnecessary `interrupt`) from `unrestorableAgent`. Sometimes a failing `waitForMessage` seems to trigger https://maven.apache.org/surefire/maven-surefire-plugin/faq.html#corruptedstream though I cannot track down why.

* `PlaceholderTask` may not hold a reference to `ExecutorStepExecution`

* Equivalent of #184 for `ExecutorStepDynamicContext`

* Assert jenkinsci/workflow-api-plugin@b1778a9

* jenkinsci/workflow-cps-plugin#534 & jenkinsci/workflow-job-plugin#260 released

* Windows tests jenkinsci/workflow-basic-steps-plugin#203 (comment)

* Avoiding `powershell` jenkinsci/workflow-basic-steps-plugin#203 (comment)

* Better usage of Hamcrest; got a flake https://github.com/jenkinsci/workflow-durable-task-step-plugin/pull/180/checks?check_run_id=6779178701 with `QueueTaskCancelled` instead of `RemovedNodeCause`

* In fact the flake was in a different test (`canceledQueueItem`)

* Forgot to make `canceledQueueItem` not use `sh`, though it should not really matter since this step is expected to fail anyway

* Test flake: https://github.com/jenkinsci/workflow-durable-task-step-plugin/pull/180/checks?check_run_id=7240329028

* Use `AgentOfflineException` from `ExecutorStepDynamicContext` as well

* Can now run all of `AgentErrorConditionTest`

* Responding to `CancellationException` in `ExecutorStepDynamicContext` with `QueueTaskCancelled`, analogous to handling in `ExecutorPickle`

* Permit `canceledQueueItem` to pass on `RemovedNodeCause`

* Normalizing indentation

* jenkinsci/workflow-step-api-plugin#124 deprecates an overload which 2e95e3a only handles at one call site; TBD if others should be switched to `actualInterruption`

* Refined `ExecutorStepDynamicContext` behavior in case of `quietingDown` #180 (comment) or `terminating` #180 (comment)

* Add an explicit `serialVersionUID` to `ExecutorStepExecution.PlaceholderTask.Callback` #180 (comment)

* Optimizing `withExecution` to only consider steps in the current build #180 (comment)

* Pick up jenkinsci/workflow-api-plugin#256 #180 (comment)

* jenkinsci/workflow-api-plugin#256 released
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants