Add a Timeouts optional field to pipelinerun #3843

Merged: 3 commits, Jun 7, 2021

Conversation

@souleb souleb commented Mar 18, 2021

This PR is an implementation of TEP #326.

Fixes issue #2989.

Changes

Timeouts is a dict of timeout fields:

timeouts:
  pipeline: "0h0m60s"
  tasks: "0h0m30s"
  finally: "0h0m20s"

All three subfields are optional

Timeouts are considered valid if:
- timeouts.pipeline >= timeouts.tasks + timeouts.finally
- all fields are strictly positive

It is still possible to use timeout, but it is invalid to have both timeout and timeouts. The timeout behavior is unchanged.
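
The validation rules above can be sketched in Go as follows. This is a minimal illustration, not the code from this PR: the `Timeouts` struct, `validateTimeouts`, and `seconds` are hypothetical names, and the sketch assumes that the sum check applies only when `pipeline` is set explicitly and that unset sub-fields count as zero.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Timeouts is a hypothetical stand-in for the new PipelineRun spec field.
// A nil pointer means the sub-field was not set.
type Timeouts struct {
	Pipeline *time.Duration
	Tasks    *time.Duration
	Finally  *time.Duration
}

// validateTimeouts sketches the rules from the PR description. legacy is the
// pre-existing spec.timeout value, nil when unset.
func validateTimeouts(legacy *time.Duration, t *Timeouts) error {
	if t == nil {
		return nil // nothing new to validate
	}
	if legacy != nil {
		return errors.New("timeout and timeouts cannot both be set")
	}
	fields := []struct {
		name  string
		value *time.Duration
	}{
		{"pipeline", t.Pipeline},
		{"tasks", t.Tasks},
		{"finally", t.Finally},
	}
	var sum time.Duration // tasks + finally; unset fields count as zero
	for _, f := range fields {
		if f.value == nil {
			continue
		}
		if *f.value <= 0 {
			return fmt.Errorf("timeouts.%s must be strictly positive", f.name)
		}
		if f.name != "pipeline" {
			sum += *f.value
		}
	}
	// Assumption: the sum check only applies when pipeline is set explicitly.
	if t.Pipeline != nil && *t.Pipeline < sum {
		return errors.New("timeouts.pipeline must be >= timeouts.tasks + timeouts.finally")
	}
	return nil
}

// seconds is a small helper that returns a pointer to an n-second duration.
func seconds(n int) *time.Duration {
	d := time.Duration(n) * time.Second
	return &d
}

func main() {
	ok := &Timeouts{Pipeline: seconds(60), Tasks: seconds(30), Finally: seconds(20)}
	fmt.Println(validateTimeouts(nil, ok))

	tooTight := &Timeouts{Pipeline: seconds(40), Tasks: seconds(30), Finally: seconds(20)}
	fmt.Println(validateTimeouts(nil, tooTight))
}
```

The first call uses the values from the example at the top of this description; the second shrinks `pipeline` below `tasks + finally` and should be rejected.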

⚠️ tkn does not process this new field

Tested with the following manifests:

apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: hello-world-pipeline-run-with-timeout
spec:
  timeout: "0h0m60s"
  pipelineSpec:
    tasks:
    - name: dagtask
      timeout: "0h0m30s"
      taskSpec:
        steps:
          - name: hello
            image: ubuntu
            script: |
              echo "Hello World!"
              sleep 10
    finally:
    - name: finallytask
      params:
        - name: echoStatus
          value: "$(tasks.dagtask.status)"
      taskSpec:
        params:
          - name: echoStatus
        steps: 
          - name: verify-status
            image: ubuntu
            script: |
              if [ $(params.echoStatus) == "Succeeded" ]
              then
                echo " Hello World echoed successfully"
              fi

---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: hello-world-pipeline-run-with-taskstimeout
spec:
  timeouts: 
    tasks: "0h0m30s"
  pipelineSpec:
    tasks:
    - name: dagtask
      taskSpec:
        steps:
          - name: hello
            image: ubuntu
            script: |
              echo "Hello World!"
              sleep 30
    finally:
    - name: finallytask
      params:
        - name: echoStatus
          value: "$(tasks.dagtask.status)"
      taskSpec:
        params:
          - name: echoStatus
        steps:
          - name: verify-status
            image: ubuntu
            script: |
              if [ $(params.echoStatus) == "Succeeded" ]
              then
                echo " Hello World echoed successfully"
              fi


---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: hello-world-pipeline-run-with-timeout-and-taskstimeout
spec:
  timeouts: 
    pipeline: "0h0m40s"
    tasks: "0h0m30s"
  pipelineSpec:
    tasks:
    - name: dagtask
      taskSpec:
        steps:
          - name: hello
            image: ubuntu
            script: |
              echo "Hello World!"
              sleep 30
    finally:
    - name: finallytask
      params:
        - name: echoStatus
          value: "$(tasks.dagtask.status)"
      taskSpec:
        params:
          - name: echoStatus
        steps:
          - name: verify-status
            image: ubuntu
            script: |
              if [ $(params.echoStatus) == "Succeeded" ]
              then
                echo " Hello World echoed successfully"
              fi

---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: pipeline-run-with-timeout-and-taskstimeout-finally-timeout
spec:
  timeouts: 
    pipeline: "0h0m60s"
    tasks: "0h0m30s"
    finally: "0h0m4s"
  pipelineSpec:
    tasks:
    - name: dagtask
      taskSpec:
        steps:
          - name: hello
            image: ubuntu
            script: |
              echo "Hello World!"
              sleep 30
    finally:
    - name: finallytask
      params:
        - name: echoStatus
          value: "$(tasks.dagtask.status)"
      taskSpec:
        params:
          - name: echoStatus
        steps:
          - name: verify-status
            image: ubuntu
            script: |
              if [ $(params.echoStatus) == "Succeeded" ]
              then
                echo " Hello World echoed successfully"
              fi


---
apiVersion: tekton.dev/v1beta1
kind: PipelineRun
metadata:
  name: hello-world-pipeline-run-with-timeout-and-finallytimeout
spec:
  timeouts: 
    pipeline: "0h0m40s"
    finally: "0h0m30s"
  pipelineSpec:
    tasks:
    - name: dagtask
      taskSpec:
        steps:
          - name: hello
            image: ubuntu
            script: |
              echo "Hello World!"
              sleep 30
    finally:
    - name: finallytask
      params:
        - name: echoStatus
          value: "$(tasks.dagtask.status)"
      taskSpec:
        params:
          - name: echoStatus
        steps:
          - name: verify-status
            image: ubuntu
            script: |
              if [ $(params.echoStatus) == "Succeeded" ]
              then
                echo " Hello World echoed successfully"
              fi

/kind feature

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

  • Includes tests (if functionality changed/added)
  • Includes docs (if user facing)
  • Commit messages follow commit message best practices
  • Release notes block has been filled in or deleted (only if no user facing changes)

See the contribution guide for more details.

Double check this list of stuff that's easy to miss:

Reviewer Notes

If API changes are included, additive changes must be approved by at least two OWNERS and backwards incompatible changes must be approved by more than 50% of the OWNERS, and they must first be added in a backwards compatible way.

Release Notes

  • API changes
    • Added field Timeouts to PipelineRun spec. It is a dict with the following sub-fields:
      • pipeline, to control the pipeline failure timeout
      • tasks, to control the pipeline tasks failure timeout
      • finally, to control the pipeline finally tasks failure timeout
  • Changes in behavior
    • When supplied, the timeouts sub-fields determine how much of the pipeline's total runtime is allocated to tasks and how much to finally tasks.

@tekton-robot tekton-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Mar 18, 2021
@tekton-robot tekton-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 18, 2021
@tekton-robot
Collaborator

Hi @souleb. Thanks for your PR.

I'm waiting for a tektoncd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@souleb
Contributor Author

souleb commented Mar 18, 2021

cc @jerop @pritidesai

@tekton-robot tekton-robot added the kind/feature Categorizes issue or PR as related to a new feature. label Mar 18, 2021
@pritidesai
Member

/ok-to-test

@tekton-robot tekton-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 18, 2021
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
internal/builder/v1beta1/pipeline.go 85.6% 85.8% 0.2
pkg/apis/pipeline/v1beta1/pipelinerun_validation.go 100.0% 97.6% -2.4
pkg/reconciler/pipelinerun/pipelinerun.go 83.8% 84.3% 0.5
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 92.1% 92.2% 0.0

@ghost ghost left a comment

Thanks for this @souleb ! Looking good so far - I did a partial review and will grab some more time to look again tomorrow.

pkg/apis/pipeline/v1beta1/pipelinerun_types.go (outdated, resolved)
pkg/apis/pipeline/v1beta1/pipelinerun_validation.go (outdated, resolved)
pkg/apis/pipeline/v1beta1/pipelinerun_validation.go (outdated, resolved)
@@ -219,6 +219,11 @@ func (t *ResolvedPipelineRunTask) parentTasksSkip(facts *PipelineRunFacts) bool
return false
}

// IsFinalTask returns true if a task is a finally task
func (t *ResolvedPipelineRunTask) IsFinalTask(facts *PipelineRunFacts) bool {

We could make facts.isFinalTask public instead of wrapping it in a new function on the RPRT type. wdyt?

(I'm assuming this isn't prevented for some other reason I'm missing?)

Contributor Author

It just seemed more natural to me to put it that way because I was expecting to find the function close to IsFinallySkipped. I like the idea of collocating those receiver functions as they seem related. But I'm not against changing the scope of facts.isFinalTask.

Member

Nope, there is no specific reason preventing it from going public, i.e., facts.isFinalTask can be made public.

@ghost ghost Mar 22, 2021

I like the idea of collocating those receiver function as they seem related.

I see this a bit differently - IsFinallySkipped is answering a question about the whole pipeline graph which requires the graph be fully "resolved", so makes sense on the ResolvedPipelineRunTasks type. IsFinalTask answers a question about a "fact" of an individual task of the pipeline, so makes sense to me to live on the PipelineRunFacts type. (Edit to add: now that I read this again I'm actually not sure there's much distinction here after all :D)

Given that these two types (RPRT and PipelineRunFacts) are used so closely together in our code I don't feel super strongly either way. I just got a bit confused when I first read it because the two funcs are named so similarly and use the same objects 😅
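
The two placements discussed in this thread can be sketched side by side. The types below are simplified, hypothetical stand-ins for `PipelineRunFacts` and `ResolvedPipelineRunTask` (the real types carry far more state, and the `finalTaskNames` field is an assumption made only for this illustration); the point is that the wrapper on the task type simply delegates to the lookup on the facts type.

```go
package main

import "fmt"

// PipelineRunFacts is a simplified stand-in; finalTaskNames is a
// hypothetical field holding the names of the pipeline's finally tasks.
type PipelineRunFacts struct {
	finalTaskNames map[string]struct{}
}

// IsFinalTask is the "make the facts method public" option: the question
// is answered directly from facts about the pipeline.
func (f *PipelineRunFacts) IsFinalTask(name string) bool {
	_, ok := f.finalTaskNames[name]
	return ok
}

// ResolvedPipelineRunTask is a simplified stand-in with only a name.
type ResolvedPipelineRunTask struct {
	Name string
}

// IsFinalTask is the "wrapper on the task type" option: it delegates to
// the facts lookup using the task's own name.
func (t *ResolvedPipelineRunTask) IsFinalTask(facts *PipelineRunFacts) bool {
	return facts.IsFinalTask(t.Name)
}

func main() {
	facts := &PipelineRunFacts{
		finalTaskNames: map[string]struct{}{"finallytask": {}},
	}
	dag := &ResolvedPipelineRunTask{Name: "dagtask"}
	fin := &ResolvedPipelineRunTask{Name: "finallytask"}

	fmt.Println(dag.IsFinalTask(facts), fin.IsFinalTask(facts))
}
```

Both options give the same answer; the choice is mainly about where a reader would expect to find the function.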

pkg/reconciler/pipelinerun/pipelinerun_test.go (outdated, resolved)
@souleb souleb force-pushed the finally-exec-post-timeout branch from b7f3bf9 to c4a61fe Compare March 18, 2021 21:25
@jerop jerop self-assigned this Mar 21, 2021
@ghost
ghost commented Mar 22, 2021

I'm going to add a hold on this just to ensure we get the TEP merged before this goes in.

/hold

Edit to add: also to wait until the feature gate work is in place.

@tekton-robot tekton-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Mar 22, 2021
@ghost
ghost commented Mar 22, 2021

@souleb it would also be great to add an example YAML as part of this PR which shows the new field's usage too. examples/v1beta1/pipelineruns/no-ci/pipeline-timeout.yaml is another similar example which serves that purpose.

@ghost
ghost commented Mar 22, 2021

One more thing that would be useful (sorry for the message spam!) would be to include some integration tests as well. These are quite a bit of work but it's good to have a fully-constructed pipeline run that tests timeout behaviour as part of our e2e test suite. See test/timeout_test.go for an example that you could base your own test on.

@souleb souleb force-pushed the finally-exec-post-timeout branch from c4a61fe to 9d5768a Compare March 22, 2021 13:29
@souleb
Contributor Author

souleb commented Mar 22, 2021

One more thing that would be useful (sorry for the message spam!) would be to include some integration tests as well. These are quite a bit of work but it's good to have a fully-constructed pipeline run that tests timeout behaviour as part of our e2e test suite. See test/timeout_test.go for an example that you could base your own test on.

Sure, will do that. I just wanted to wait for the TEP to be approved beforehand.

@souleb souleb force-pushed the finally-exec-post-timeout branch from 9d5768a to f903552 Compare April 7, 2021 21:40
@souleb souleb changed the title Add a TasksTimeout optional field to pipelinerun Add a Timeouts optional field to pipelinerun Apr 22, 2021
@tekton-robot tekton-robot added size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. and removed size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Apr 22, 2021
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
pkg/apis/pipeline/v1beta1/pipelinerun_defaults.go 100.0% 92.9% -7.1
pkg/apis/pipeline/v1beta1/pipelinerun_types.go 75.8% 73.5% -2.2
pkg/apis/pipeline/v1beta1/pipelinerun_validation.go 100.0% 98.2% -1.8
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 92.1% 92.2% 0.0

@souleb souleb force-pushed the finally-exec-post-timeout branch from a334074 to 11be900 Compare April 23, 2021 00:25
@souleb souleb force-pushed the finally-exec-post-timeout branch from 5e894ee to 4fc27b2 Compare May 27, 2021 08:10
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
internal/builder/v1beta1/pipeline.go 85.6% 86.5% 0.9
pkg/apis/pipeline/v1beta1/pipelinerun_defaults.go 100.0% 92.9% -7.1
pkg/apis/pipeline/v1beta1/pipelinerun_types.go 75.8% 73.5% -2.2
pkg/apis/pipeline/v1beta1/pipelinerun_validation.go 100.0% 98.2% -1.8
pkg/reconciler/pipelinerun/pipelinerun.go 83.8% 82.9% -0.9
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 92.2% 92.3% 0.0

@souleb
Contributor Author

souleb commented May 27, 2021

@pritidesai will you have some time to review this PR?

@pritidesai
Member

@pritidesai will you have some time to review this PR?

Thank you @souleb for all the changes and your patience 🙇‍♀️ Yes I will spend some time today to review this PR and continue early next week after long weekend. Thanks a bunch again for all your efforts 🙏

@vdemeester
Member

/retest

Member

@vdemeester vdemeester left a comment

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 1, 2021
@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbwsg, vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@jerop
Member

jerop commented Jun 1, 2021

/test pull-tekton-pipeline-integration-tests

@tekton-robot tekton-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jun 2, 2021
@pritidesai
Member

@souleb please rebase this PR, it's ready to merge. Thanks again for all the hard work. 🙏

souleb added 3 commits June 4, 2021 23:03
TaskTimeout is used to timeout all dag tasks, finally tasks excluded

Validates a TasksTimeout if:
	- TasksTimeout > 0
	- Timeout is specified and TasksTimeout <= Timeout
 	- Timeout not specified and TasksTimeout <= default Timeout

Add a builder function for taskTimeout

Defines 2 functions to get taskrun timeout
One for dag tasks and one specifically for finally tasks
i.e.

timeouts:
  pipeline: "0h0m60s"
  tasks: "0h0m30s"
  finally: "0h0m20s"

All three subfields are optional
@souleb souleb force-pushed the finally-exec-post-timeout branch from 4fc27b2 to a965afd Compare June 4, 2021 21:16
@tekton-robot tekton-robot removed lgtm Indicates that a PR is ready to be merged. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Jun 4, 2021
@tekton-robot
Collaborator

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

File Old Coverage New Coverage Delta
internal/builder/v1beta1/pipeline.go 85.6% 86.5% 0.9
pkg/apis/pipeline/v1beta1/pipelinerun_defaults.go 100.0% 92.9% -7.1
pkg/apis/pipeline/v1beta1/pipelinerun_types.go 75.8% 73.5% -2.2
pkg/apis/pipeline/v1beta1/pipelinerun_validation.go 100.0% 98.3% -1.7
pkg/reconciler/pipelinerun/pipelinerun.go 83.8% 82.9% -0.9
pkg/reconciler/pipelinerun/resources/pipelinerunresolution.go 91.9% 91.9% 0.0

@pritidesai
Member

/retest

@souleb
Contributor Author

souleb commented Jun 7, 2021

@vdemeester can you reset the lgtm? The bot unset it following my rebase.

@vdemeester
Member

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Jun 7, 2021
@vdemeester
Member

/retest

@vdemeester
Member

/retest

🤔

@tekton-robot tekton-robot merged commit b6cb0a0 into tektoncd:main Jun 7, 2021
abayer added a commit to abayer/community that referenced this pull request Dec 14, 2021
tektoncd/pipeline#3843 was merged back in June.

Signed-off-by: Andrew Bayer <[email protected]>
tekton-robot pushed a commit to tektoncd/community that referenced this pull request Dec 14, 2021
tektoncd/pipeline#3843 was merged back in June.

Signed-off-by: Andrew Bayer <[email protected]>
lbernick pushed a commit to lbernick/community that referenced this pull request Dec 15, 2021
tektoncd/pipeline#3843 was merged back in June.

Signed-off-by: Andrew Bayer <[email protected]>
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. kind/feature Categorizes issue or PR as related to a new feature. lgtm Indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
5 participants