Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[JENKINS-26156] Do not call BodyInvoker.withDisplayName(null) #1

Merged

Conversation

jglick
Copy link
Member

@jglick jglick commented Apr 8, 2016

See jenkinsci/workflow-cps-plugin#3 for discussion.

@reviewbybees

…a no-op, and will soon be a FindBugs warning.
@ghost
Copy link

ghost commented Apr 8, 2016

This pull request originates from a CloudBees employee. At CloudBees, we require that all pull requests be reviewed by other CloudBees employees before we seek to have the change accepted. If you want to learn more about our process please see this explanation.

@jenkinsadmin
Copy link
Member

Thank you for this pull request! Please check this document for how the Jenkins project handles pull requests.

@amuniz
Copy link
Member

amuniz commented Apr 12, 2016

🐝

@jglick jglick merged commit c919d02 into jenkinsci:master Apr 12, 2016
@jglick jglick deleted the JENKINS-26156-BodyInvoker.withDisplayName branch April 12, 2016 19:06
svanoort pushed a commit that referenced this pull request Feb 16, 2018
dwnusbaum pushed a commit that referenced this pull request Dec 14, 2018
Update dependencies and make minor adjustments for tests to pass
jglick added a commit to jglick/workflow-durable-task-step-plugin that referenced this pull request Dec 8, 2021
After jenkinsci/workflow-step-api-plugin#73 this caused `SecretsMasker` to block on a `KubernetesComputer`
which in turn blocked provisioning of a `TaskListener`, causing `RestartPipelineTest.terminatedPodAfterRestart` to fail to find a message:

```
FINE	o.j.p.w.s.d.DurableTaskStep$Execution#getWorkspace: rediscovering that terminated-pod-after-restart-1-8fb2m-j206k-8x125 has been removed and timeout has expired
FINE	o.j.p.w.s.d.DurableTaskStep$Execution#_listener: JENKINS-34021: could not get TaskListener in CpsStepContext[9:sh]:Owner[terminated Pod After Restart/1:terminated Pod After Restart jenkinsci#1]
java.util.concurrent.TimeoutException
	at hudson.remoting.AsyncFutureImpl.get(AsyncFutureImpl.java:102)
	at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepDynamicContext$Translator.get(ExecutorStepDynamicContext.java:115)
Caused: java.io.IOException
	at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepDynamicContext$Translator.get(ExecutorStepDynamicContext.java:118)
	at org.jenkinsci.plugins.workflow.steps.DynamicContext$Typed.get(DynamicContext.java:97)
	at org.jenkinsci.plugins.workflow.cps.ContextVariableSet.get(ContextVariableSet.java:139)
	at org.jenkinsci.plugins.workflow.cps.ContextVariableSet$1Delegate.doGet(ContextVariableSet.java:98)
	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:75)
	at org.csanchez.jenkins.plugins.kubernetes.pipeline.SecretsMasker$Factory.get(SecretsMasker.java:85)
	at org.csanchez.jenkins.plugins.kubernetes.pipeline.SecretsMasker$Factory.get(SecretsMasker.java:73)
	at org.jenkinsci.plugins.workflow.steps.DynamicContext$Typed.get(DynamicContext.java:95)
	at org.jenkinsci.plugins.workflow.cps.ContextVariableSet.get(ContextVariableSet.java:139)
	at org.jenkinsci.plugins.workflow.cps.CpsThread.getContextVariable(CpsThread.java:135)
	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.doGet(CpsStepContext.java:297)
	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:75)
	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.getListener(DefaultStepContext.java:127)
	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:87)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution._listener(DurableTaskStep.java:421)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.listener(DurableTaskStep.java:412)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.getWorkspace(DurableTaskStep.java:363)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.check(DurableTaskStep.java:570)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.run(DurableTaskStep.java:553)
	at …
FINE	o.j.p.w.s.d.DurableTaskStep$Execution$NewlineSafeTaskListener#getLogger: creating filtered stream
FINE	o.j.p.w.s.d.DurableTaskStep$Execution#_listener: terminated-pod-after-restart-1-8fb2m-j206k-8x125 has been removed for 15 sec, assuming it is not coming back
```
jtnord added a commit to jtnord/workflow-durable-task-step-plugin that referenced this pull request May 11, 2022
calling setLabels on an agent will not persist the node.

in older versions of Jenkins the tests would be less flaky as adding any
Node would cause all labels to be re-evaluated, so when creating a few
agents and adding labels in a loop the last one created would at least
deterministically ensure that all previous agents labels where correct.

However since 2.332 (jenkinsci/jenkins#5882)
only labels part of a node added or removed would be updated, and when
creating the agents they where created without labels, which where added
later.

This caused tests to be flaky depending on when the periodic
`trimLabels` was called (or at least on other timing related things)

THis was discovered by enabling a loggerrule for hudson.model.queue and
observing that the builds would timeout as not all the agents would have
the expected nodes.

e.g.

```
  12.023 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[ jenkinsci#1] rejected part of demo jenkinsci#1: ?Jenkins? doesn?t have label ?foo?
  12.023 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[slave0 #0] is a potential candidate for task part of demo jenkinsci#1
  12.024 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[slave2 #0] rejected part of demo jenkinsci#1: ?slave2? doesn?t have label ?foo?
  12.024 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[slave1 #0] rejected part of demo jenkinsci#1: ?slave1? doesn?t have label ?foo?
  12.024 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[ #0] rejected part of demo jenkinsci#1: ?Jenkins? doesn?t have label ?foo?
  12.024 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[slave3 #0] rejected part of demo jenkinsci#1: ?slave3? doesn?t have label ?foo?
```

from `reuseNodesWithSameLabelsInParallelStages`

Additionally creating agents and waiting for them to come oneline is
slow.  A pipeline will start and then wait for the node to be available,
so we can do other things whilst the agent is connecting.

For the case where we need a number of agents connected before we start
to run the pipeline, we now create iall the agents before waiting for them to connect.
jtnord added a commit to jtnord/workflow-durable-task-step-plugin that referenced this pull request May 11, 2022
calling setLabels on an agent will not persist the node.

in older versions of Jenkins the tests would be less flaky as adding any
Node would cause all labels to be re-evaluated, so when creating a few
agents and adding labels in a loop the last one created would at least
deterministically ensure that all previous agents labels where correct.

However since 2.332 (jenkinsci/jenkins#5882)
only labels part of a node added or removed would be updated, and when
creating the agents they where created without labels, which where added
later.

This caused tests to be flaky depending on when the periodic
`trimLabels` was called (or at least on other timing related things)

THis was discovered by enabling a loggerrule for hudson.model.queue and
observing that the builds would timeout as not all the agents would have
the expected nodes.

e.g.

```
  12.023 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[ jenkinsci#1] rejected part of demo jenkinsci#1: ?Jenkins? doesn?t have label ?foo?
  12.023 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[slave0 #0] is a potential candidate for task part of demo jenkinsci#1
  12.024 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[slave2 #0] rejected part of demo jenkinsci#1: ?slave2? doesn?t have label ?foo?
  12.024 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[slave1 #0] rejected part of demo jenkinsci#1: ?slave1? doesn?t have label ?foo?
  12.024 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[ #0] rejected part of demo jenkinsci#1: ?Jenkins? doesn?t have label ?foo?
  12.024 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[slave3 #0] rejected part of demo jenkinsci#1: ?slave3? doesn?t have label ?foo?
```

from `reuseNodesWithSameLabelsInParallelStages`

Additionally creating agents and waiting for them to come oneline is
slow.  A pipeline will start and then wait for the node to be available,
so we can do other things whilst the agent is connecting.

For the case where we need a number of agents connected before we start
to run the pipeline, we now create iall the agents before waiting for them to connect.
jglick added a commit that referenced this pull request May 11, 2022
* deflake and speedup ExecutorStepTest tests

calling setLabels on an agent will not persist the node.

in older versions of Jenkins the tests would be less flaky as adding any
Node would cause all labels to be re-evaluated, so when creating a few
agents and adding labels in a loop the last one created would at least
deterministically ensure that all previous agents labels where correct.

However since 2.332 (jenkinsci/jenkins#5882)
only labels part of a node added or removed would be updated, and when
creating the agents they where created without labels, which where added
later.

This caused tests to be flaky depending on when the periodic
`trimLabels` was called (or at least on other timing related things)

THis was discovered by enabling a loggerrule for hudson.model.queue and
observing that the builds would timeout as not all the agents would have
the expected nodes.

e.g.

```
  12.023 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[ #1] rejected part of demo #1: ?Jenkins? doesn?t have label ?foo?
  12.023 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[slave0 #0] is a potential candidate for task part of demo #1
  12.024 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[slave2 #0] rejected part of demo #1: ?slave2? doesn?t have label ?foo?
  12.024 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[slave1 #0] rejected part of demo #1: ?slave1? doesn?t have label ?foo?
  12.024 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[ #0] rejected part of demo #1: ?Jenkins? doesn?t have label ?foo?
  12.024 [id=141]       FINEST  hudson.model.Queue#maintain: JobOffer[slave3 #0] rejected part of demo #1: ?slave3? doesn?t have label ?foo?
```

from `reuseNodesWithSameLabelsInParallelStages`

Additionally creating agents and waiting for them to come oneline is
slow.  A pipeline will start and then wait for the node to be available,
so we can do other things whilst the agent is connecting.

For the case where we need a number of agents connected before we start
to run the pipeline, we now create iall the agents before waiting for them to connect.

* Update src/test/java/org/jenkinsci/plugins/workflow/support/steps/ExecutorStepTest.java

Co-authored-by: Jesse Glick <[email protected]>

Co-authored-by: Jesse Glick <[email protected]>
jglick added a commit that referenced this pull request Oct 17, 2022
…tion of `node` step, not kill whole build (#180)

* Sketch of non-`Pickle`-based resumption of `ExecutorStepExecution`

* Amending test from #34

* Fixing `Callback`

* Exploring where to detect a failure eligible for retry

* Making `normalNodeDisappearance` pass under somewhat altered circumstances

* First successful `node` block retry

* `FilePathDynamicContext` must take precedence over `ExecutorStepDynamicContext.FilePathTranslator`

* `ExecutorStepTest.unloadableExecutorPickle` as currently conceived no longer makes sense

* Fixing `ExecutorStepTest.nodeDisconnectMissingContextVariableException` by copying some diagnostic code from `FilePathDynamicContext`

* Fixing `ExecutorStepTest.contextualizeFreshFilePathAfterAgentReconnection`

* Marking `ExecutorStepRetryEligibility` beta

* Javadoc imports

* SpotBugs: does not make sense to allow `ExecutorStepDynamicContext.node` to be null

* Reasonable behavior of `ExecutorPickleTest.canceledQueueItem`

* User-visible logging about 5m timeout now that `ExecutorPickle.printWaitingMessage` is unused

* `ExecutorPickleTest` → `ExecutorStepDynamicContextTest`

* Move `TIMEOUT_WAITING_FOR_NODE_MILLIS` from `ExecutorPickle` to `ExecutorStepExecution`

* Some user-visible logging when `ExecutorStepRetryEligibility` is _not_ used

* Do not block on `FutureImpl`.
After jenkinsci/workflow-step-api-plugin#73 this caused `SecretsMasker` to block on a `KubernetesComputer`
which in turn blocked provisioning of a `TaskListener`, causing `RestartPipelineTest.terminatedPodAfterRestart` to fail to find a message:

```
FINE	o.j.p.w.s.d.DurableTaskStep$Execution#getWorkspace: rediscovering that terminated-pod-after-restart-1-8fb2m-j206k-8x125 has been removed and timeout has expired
FINE	o.j.p.w.s.d.DurableTaskStep$Execution#_listener: JENKINS-34021: could not get TaskListener in CpsStepContext[9:sh]:Owner[terminated Pod After Restart/1:terminated Pod After Restart #1]
java.util.concurrent.TimeoutException
	at hudson.remoting.AsyncFutureImpl.get(AsyncFutureImpl.java:102)
	at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepDynamicContext$Translator.get(ExecutorStepDynamicContext.java:115)
Caused: java.io.IOException
	at org.jenkinsci.plugins.workflow.support.steps.ExecutorStepDynamicContext$Translator.get(ExecutorStepDynamicContext.java:118)
	at org.jenkinsci.plugins.workflow.steps.DynamicContext$Typed.get(DynamicContext.java:97)
	at org.jenkinsci.plugins.workflow.cps.ContextVariableSet.get(ContextVariableSet.java:139)
	at org.jenkinsci.plugins.workflow.cps.ContextVariableSet$1Delegate.doGet(ContextVariableSet.java:98)
	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:75)
	at org.csanchez.jenkins.plugins.kubernetes.pipeline.SecretsMasker$Factory.get(SecretsMasker.java:85)
	at org.csanchez.jenkins.plugins.kubernetes.pipeline.SecretsMasker$Factory.get(SecretsMasker.java:73)
	at org.jenkinsci.plugins.workflow.steps.DynamicContext$Typed.get(DynamicContext.java:95)
	at org.jenkinsci.plugins.workflow.cps.ContextVariableSet.get(ContextVariableSet.java:139)
	at org.jenkinsci.plugins.workflow.cps.CpsThread.getContextVariable(CpsThread.java:135)
	at org.jenkinsci.plugins.workflow.cps.CpsStepContext.doGet(CpsStepContext.java:297)
	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:75)
	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.getListener(DefaultStepContext.java:127)
	at org.jenkinsci.plugins.workflow.support.DefaultStepContext.get(DefaultStepContext.java:87)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution._listener(DurableTaskStep.java:421)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.listener(DurableTaskStep.java:412)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.getWorkspace(DurableTaskStep.java:363)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.check(DurableTaskStep.java:570)
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.run(DurableTaskStep.java:553)
	at …
FINE	o.j.p.w.s.d.DurableTaskStep$Execution$NewlineSafeTaskListener#getLogger: creating filtered stream
FINE	o.j.p.w.s.d.DurableTaskStep$Execution#_listener: terminated-pod-after-restart-1-8fb2m-j206k-8x125 has been removed for 15 sec, assuming it is not coming back
```

* A `sh` step could fail to find a workspace at the moment it starts.
java.lang.NullPointerException
	at org.jenkinsci.plugins.workflow.steps.durable_task.DurableTaskStep$Execution.start(DurableTaskStep.java:329)
	at org.jenkinsci.plugins.workflow.cps.DSL.invokeStep(DSL.java:319)
	at org.jenkinsci.plugins.workflow.cps.DSL.invokeMethod(DSL.java:193)
	at org.jenkinsci.plugins.workflow.cps.CpsScript.invokeMethod(CpsScript.java:122)
	at …

* Also handling `ClosedChannelException` as commonly thrown by `SimpleBuildStep`s and the like: jenkinsci/kubernetes-plugin#1083 (comment)

* Also retry after `MissingContextVariableException` on `FilePath`

* Remove `super.onResume` left over in #46; see jenkinsci/workflow-step-api-plugin#66

* Documenting need for JENKINS-30383

* jenkinsci/workflow-step-api-plugin#77 allows `retryNodeBlockSynchAcrossRestarts` to be fixed

* jenkinsci/workflow-step-api-plugin#77 released

* Expanding `MissingContextVariableException` type list after noticing that `isUnix` requests `Launcher` not `FilePath`

* Never log a message with only `PlaceholderTask.cookie`; also need `.runId` for context

* Suppressing uninteresting log message

* Maybe need to call `StepContext.saveState`?

* Deleting comment; for now it seems clearest to keep `ExecutorStepDynamicContext` as a distinct type

* Resolve longstanding tech debt from #104 & #117 about `BodyExecution`.
When we do not use pickles, there is no need for `transient`.
Moving the reference to `ExecutorStepExecution` from `PlaceholderTask` avoids a memory leak.
This is only possible now that `Callback` holds the `ExecutorStepExecution`.
And we can now properly handle `AsynchronousExecution.interrupt`.

* Sketch of updating `WorkspaceStepExecution`: #180 (comment)

* Allowing `FilePathDynamicContext` to block waiting for an agent to reconnect.
Not great since the CPS VM thread will only grant 5s.
Better would be to avoid unnecessary use of `StepContext.get(FilePath.class)`
(which implicitly requires a live `Channel`);
perhaps `FilePathRepresentation` should become an API.
Amends b65ff1c.
#180 (comment)

* Adding a comment about 8b08398

* 8b08398 seems to allow a968adf to be finished.

* Refining 8b08398 to avoid tail call

* `org.jenkinsci.plugins.workflow.support.concurrent.Futures` deprecated

* SpotBugs reminds me that `WorkspaceStepExecution.Callback` needs an SVUID; taking one from trunk

* Adding `NodeTranslator`

* Beginning rewrite to `AgentErrorCondition`

* Extracted `RetryExecutorStepTest` from `ExecutorStepTest`

* Rely on blocking `ExecutorStepExecution.onResume`

* Removing bogus `waitForMessage` (and unnecessary `interrupt`) from `unrestorableAgent`. Sometimes a failing `waitForMessage` seems to trigger https://maven.apache.org/surefire/maven-surefire-plugin/faq.html#corruptedstream though I cannot track down why.

* `PlaceholderTask` may not hold a reference to `ExecutorStepExecution`

* Equivalent of #184 for `ExecutorStepDynamicContext`

* Assert jenkinsci/workflow-api-plugin@b1778a9

* jenkinsci/workflow-cps-plugin#534 & jenkinsci/workflow-job-plugin#260 released

* Windows tests jenkinsci/workflow-basic-steps-plugin#203 (comment)

* Avoiding `powershell` jenkinsci/workflow-basic-steps-plugin#203 (comment)

* Better usage of Hamcrest; got a flake https://github.com/jenkinsci/workflow-durable-task-step-plugin/pull/180/checks?check_run_id=6779178701 with `QueueTaskCancelled` instead of `RemovedNodeCause`

* In fact the flake was in a different test (`canceledQueueItem`)

* Forgot to make `canceledQueueItem` not use `sh`, though it should not really matter since this step is expected to fail anyway

* Test flake: https://github.com/jenkinsci/workflow-durable-task-step-plugin/pull/180/checks?check_run_id=7240329028

* Use `AgentOfflineException` from `ExecutorStepDynamicContext` as well

* Can now run all of `AgentErrorConditionTest`

* Responding to `CancellationException` in `ExecutorStepDynamicContext` with `QueueTaskCancelled`, analogous to handling in `ExecutorPickle`

* Permit `canceledQueueItem` to pass on `RemovedNodeCause`

* Normalizing indentation

* jenkinsci/workflow-step-api-plugin#124 deprecates an overload which 2e95e3a only handles at one call site; TBD if others should be switched to `actualInterruption`

* Refined `ExecutorStepDynamicContext` behavior in case of `quietingDown` #180 (comment) or `terminating` #180 (comment)

* Add an explicit `serialVersionUID` to `ExecutorStepExecution.PlaceholderTask.Callback` #180 (comment)

* Optimizing `withExecution` to only consider steps in the current build #180 (comment)

* Pick up jenkinsci/workflow-api-plugin#256 #180 (comment)

* jenkinsci/workflow-api-plugin#256 released
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants