TaskRun retries: Improve separation of concerns between PipelineRun and TaskRun reconciler #5248
Comments
/assign XinruZhang
@XinruZhang I was just talking about this with @jerop: An equally valid way to support TaskRun retries and have separation of concerns is to add retries to the TaskRun spec and let the TaskRun reconciler handle them.
Thanks @lbernick for bringing up this issue! Indeed the behavior here is a little bit weird 😅. I'm more than happy to discuss it more :) I totally agree this is really about which controller should be responsible for taking care of retries. I agree either option would make sense here because it decouples the two reconcilers on this functionality. I'm a little bit leaning towards the latter one -- adding retries to the TaskRun spec.
sg! @afrittoli @abayer @pritidesai just want to make sure you don't have any concerns about adding retries to the TaskRun spec.
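For illustration, here is a minimal sketch of what such a field could look like on the TaskRun spec, written against a simplified Go API type; the field name, placement, and comments are assumptions for discussion, not the agreed design:

```go
package v1beta1

// Sketch only: a hypothetical Retries field on the TaskRun spec, which would
// let the TaskRun reconciler own retry behavior instead of the PipelineRun
// reconciler rewriting the TaskRun's status.
type TaskRunSpec struct {
	// ... existing fields (TaskRef, Params, Timeout, ...) elided ...

	// Retries is the number of times a failed attempt should be re-executed
	// before the TaskRun is marked as permanently failed.
	// +optional
	Retries int `json:"retries,omitempty"`
}
```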
I'm in favor of this approach. It does kinda feel like it deserves a TEP, but it's borderline to me.
Agreeing with @abayer, I think it deserves a TEP 👼🏼
Thanks for everyone's input! I'll write a TEP for the new retries field XD.
Thanks @lbernick!
It feels awkward because there is no concept of retries in a standalone TaskRun today.
+1
+1
I find it very useful to have the pod name based on the number of retries. The retry count signifies how many attempts were executed for a particular TaskRun.
How?
Pod naming for taskruns isn't part of our API -- we don't make any guarantees about pod naming not changing, and I don't think we should. Are other projects getting the pod associated with a taskrun by making an API call for a pod with an expected name?
I was thinking the pipelinerun controller would keep track of the number of retries of a taskrun, and keep track of each taskrun created for an attempt. The taskrun controller wouldn't need to use retries status for anything. However, your comment is making me realize that what I had in mind probably doesn't work, because I'm not sure the pipelinerun controller can create a taskrun with a status already set.
I don't think the taskrun reconciler should handle retries or know anything about retries, and I don't think we should move this logic to the taskrun reconciler.
I would imagine with the idea laid out here, the multiple taskruns would be referenced in the pipelinerun status, so you wouldn't have to query all the taskruns.
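For concreteness, a small sketch of what that could look like, using hypothetical status types (the names below are illustrative, not Tekton's actual API):

```go
package v1beta1

// Sketch only: a hypothetical way for the PipelineRun status to reference
// every TaskRun created for a pipeline task, including retry attempts, so
// callers can read the PipelineRun status instead of listing TaskRuns.
type ChildTaskRunReference struct {
	PipelineTaskName string `json:"pipelineTaskName"`
	TaskRunName      string `json:"taskRunName"`
	// Attempt is 0 for the first execution, 1 for the first retry, and so on.
	Attempt int `json:"attempt"`
}

type PipelineRunStatusFields struct {
	// ... other status fields elided ...

	// ChildTaskRuns records each TaskRun created on behalf of this PipelineRun.
	ChildTaskRuns []ChildTaskRunReference `json:"childTaskRuns,omitempty"`
}
```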
Fixed by #5844
Today, retries in Pipeline Tasks are implemented via the following process:

- When a TaskRun owned by a PipelineRun fails and its pipeline task has retries remaining, the PipelineRun reconciler adds the failed status to taskRun.status.retriesStatus, and marks it as running. (happens here)
- The TaskRun reconciler then sees a TaskRun that is no longer done and executes it again.

This pattern is a bit awkward because it means both reconcilers are partially responsible for implementing retries, and only the reconciler for a given CRD should be the one updating that CRD's status.
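To make the awkwardness concrete, here is a rough, runnable sketch of that pattern; the types and the maybeRetry function are simplified stand-ins for the real API and reconciler code, not the actual implementation:

```go
package main

import "fmt"

// Simplified stand-ins for the real Tekton types; names are illustrative only.
type TaskRunStatus struct {
	Succeeded     *bool           // nil = still running, false = failed, true = succeeded
	RetriesStatus []TaskRunStatus // archive of previous failed attempts
}

type TaskRun struct {
	Name   string
	Status TaskRunStatus
}

// maybeRetry sketches the current pattern: the *PipelineRun* reconciler
// rewrites the TaskRun's own status, archiving the failure in RetriesStatus
// and marking the TaskRun as running again so the TaskRun reconciler
// re-executes it.
func maybeRetry(tr *TaskRun, retriesRemaining int) {
	failed := tr.Status.Succeeded != nil && !*tr.Status.Succeeded
	if !failed || retriesRemaining <= 0 {
		return
	}
	tr.Status.RetriesStatus = append(tr.Status.RetriesStatus, tr.Status)
	tr.Status.Succeeded = nil // reset to "running"; a different reconciler acts on this
}

func main() {
	failed := false
	tr := &TaskRun{Name: "build", Status: TaskRunStatus{Succeeded: &failed}}
	maybeRetry(tr, 2)
	// Prints: archived attempts: 1, running again: true
	fmt.Printf("archived attempts: %d, running again: %v\n",
		len(tr.Status.RetriesStatus), tr.Status.Succeeded == nil)
}
```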
One way around this would be to create a new TaskRun for each retry instead. This was discussed as a potential implementation strategy when retries were created (in issue 221, initial PR 510, and final PR 658), but it's not clear why the final decision went the way it did.
If we'd like to go this route, here's what I think we should do:
- Have the PipelineRun reconciler create a new TaskRun for each retry attempt, and reference each of these TaskRuns in the PipelineRun's status.
- Set the new TaskRun's RetriesStatus based on the status of the old TaskRun (a rough sketch of this flow is below).
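A comparable sketch of the proposed flow, under the same kind of simplified, hypothetical types: the PipelineRun reconciler creates a brand-new TaskRun for each attempt, records it in its own status, and carries the old attempt's status into the new TaskRun's RetriesStatus instead of mutating the failed TaskRun in place:

```go
package main

import "fmt"

// Simplified, illustrative types; not Tekton's actual API.
type TaskRunStatus struct {
	Failed        bool
	RetriesStatus []TaskRunStatus // prior attempts, kept for compatibility
}

type TaskRun struct {
	Name   string
	Status TaskRunStatus
}

type PipelineRun struct {
	Name          string
	ChildTaskRuns []string // simplified stand-in for a richer child-reference structure
}

// createRetryTaskRun sketches the proposed approach: create a fresh TaskRun
// per attempt, record it in the PipelineRun's own status, and populate the
// new TaskRun's RetriesStatus from the old TaskRun's status.
func createRetryTaskRun(pr *PipelineRun, failed *TaskRun, attempt int) *TaskRun {
	retry := &TaskRun{
		// Hypothetical naming scheme, for illustration only.
		Name: fmt.Sprintf("%s-retry-%d", failed.Name, attempt),
	}
	retry.Status.RetriesStatus = append(retry.Status.RetriesStatus, failed.Status.RetriesStatus...)
	retry.Status.RetriesStatus = append(retry.Status.RetriesStatus, failed.Status)

	pr.ChildTaskRuns = append(pr.ChildTaskRuns, retry.Name)
	return retry
}

func main() {
	pr := &PipelineRun{Name: "demo-run"}
	first := &TaskRun{Name: "demo-run-build", Status: TaskRunStatus{Failed: true}}
	retry := createRetryTaskRun(pr, first, 1)
	// Prints: demo-run-build-retry-1 1 [demo-run-build-retry-1]
	fmt.Println(retry.Name, len(retry.Status.RetriesStatus), pr.ChildTaskRuns)
}
```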