Add constant backoff for TaskRun retry #4881

yannickhilber · 2022-05-17T08:48:15Z

A TaskRun that cannot start because compute resources are unavailable is retried with an exponential backoff.

When a TaskRun cannot start because compute resources are unavailable for its pods, it is retried after a few seconds. If the failure occurs again, the backoff increases. After several failed retries, it takes minutes before the TaskRun is started again.

Between two retries, compute resources may become available for the TaskRun pods, and we want the pods to start at that time, not when the backoff has elapsed.

Changes

This PR implements the change proposed by @imjasonh in a comment on issue #4847.

My tests of the code change are described in a comment on issue #4847.

/kind bug

Submitter Checklist

As the author of this PR, please check off the items in this checklist:

Docs included if any changes are user facing
Tests included if any functionality added or changed
Follows the commit message standard
Meets the Tekton contributor standards (including functionality, content, code)
Release notes block below has been filled in (if there are no user facing changes, use release note "NONE")

Release Notes

NONE

linux-foundation-easycla · 2022-05-17T08:48:17Z

The committers listed above are authorized under a signed CLA.

✅ login: yhil / name: yhil (886408d)

tekton-robot · 2022-05-17T08:48:30Z

Hi @yhil. Thanks for your PR.

I'm waiting for a tektoncd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Set constant retry backoff when TaskRun Pods not started due to compute resources unavailability Issue tektoncd#4847

yannickhilber · 2022-05-17T09:11:04Z

/cc @vdemeester @pritidesai

imjasonh · 2022-05-17T12:54:22Z

/ok-to-test

imjasonh

/lgtm

tekton-robot · 2022-05-17T12:57:49Z

The following is the coverage report on the affected files.
Say /test pull-tekton-pipeline-go-coverage to re-run this coverage report

| File | Old Coverage | New Coverage | Delta |
| --- | --- | --- | --- |
| pkg/reconciler/taskrun/taskrun.go | 80.1% | 80.2% | 0.1 |

tekton-robot · 2022-05-18T07:37:42Z

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: vdemeester

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

~~OWNERS~~ [vdemeester]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

tekton-robot added release-note-none Denotes a PR that doesnt merit a release note. kind/bug Categorizes issue or PR as related to a bug. labels May 17, 2022

tekton-robot added the size/XS Denotes a PR that changes 0-9 lines, ignoring generated files. label May 17, 2022

tekton-robot requested review from afrittoli and dlorenc May 17, 2022 08:48

tekton-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label May 17, 2022

Add constant backoff for TaskRun retry

886408d

Set constant retry backoff when TaskRun Pods not started due to compute resources unavailability Issue tektoncd#4847

yannickhilber force-pushed the pod_scheduling_retry_backoff branch from 6caa6c0 to 886408d Compare May 17, 2022 08:53

tekton-robot requested review from pritidesai and vdemeester May 17, 2022 09:11

tekton-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels May 17, 2022

imjasonh reviewed May 17, 2022

tekton-robot assigned imjasonh May 17, 2022

tekton-robot added the lgtm Indicates that a PR is ready to be merged. label May 17, 2022

vdemeester approved these changes May 18, 2022

tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 18, 2022

tekton-robot merged commit a146455 into tektoncd:main May 18, 2022

yannickhilber mentioned this pull request May 18, 2022

Tekton controller queue processing time #4847

Closed
