Rework RUCSS retry strategy #6222
Comments
Scope a solution
To solve this issue, we are planning to overhaul the retry system, and it could be interesting to consider adopting a Strategy design pattern approach to achieve this. Here's the proposed plan:
While Mathieu says:
We can see here different strategies that will have to be implemented.
I think it would be nice to go with the exponential approach. For this, we can create a new column in the database called
Development steps:
Effort estimation: M
Is a refactor needed in that part of the codebase?
Yes, as explained in the solution.
@Miraeld could you provide an estimation of the effort for the exponential waiting time?
I would like to complete the grooming: to switch between different strategies we are going to use a Otherwise that seems good to me.
@CrochetFeve0251, I would say an S for me, but an XS for you and probably others :)
Error code refinement discussed here: https://wp-media.slack.com/archives/CUT7FLHF1/p1698749275154029
I updated the ticket based on the Slack discussion, and added Acceptance Criteria as well to better frame the behavior.
Another quick update to simplify the 404 and 422 behavior (fail immediately).
Co-authored-by: Vasilis Manthos <[email protected]> Co-authored-by: COQUARD Cyrille <[email protected]>
Context
The current retry strategy in check_job_status, which is supposed to be triggered every minute, is as follows:
This has several issues:
Expected behavior
Rework the behavior in case the SaaS does not return a job result so that:
In case of a timeout during the request from the plugin to the server, we follow the same strategy as with a 400. A dedicated error code and error message should be handled on the plugin side (maybe 504 Gateway Timeout?).
Suggested solutions
For point 1, we should leverage the details provided here.
For point 2, the ideal approach would be an exponential backoff: retry after 1 minute, then 2 minutes, then 5, then 10, then 10 again, for instance. Another quick-and-dirty approach could be to increase the number of retries. Both should be groomed to decide.
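To make the backoff idea concrete, here is a minimal sketch in Python (the plugin itself is not written in Python; this only illustrates the schedule). The delays are the ones given as an example above; the names RETRY_DELAYS_MINUTES, next_retry_delay and next_retry_time, and the choice to stop retrying once the list is exhausted, are assumptions for illustration only.

```python
from datetime import datetime, timedelta
from typing import Optional

# Delays (in minutes) taken from the example in this issue: 1, 2, 5, 10, 10.
# The constant name and the "give up after the last entry" rule are hypothetical.
RETRY_DELAYS_MINUTES = [1, 2, 5, 10, 10]


def next_retry_delay(retries_done: int) -> Optional[timedelta]:
    """Return the wait before the next attempt, or None once retries are exhausted."""
    if retries_done < 0:
        raise ValueError("retries_done must be non-negative")
    if retries_done >= len(RETRY_DELAYS_MINUTES):
        return None
    return timedelta(minutes=RETRY_DELAYS_MINUTES[retries_done])


def next_retry_time(now: datetime, retries_done: int) -> Optional[datetime]:
    """Return the earliest time the job may be checked again, or None to stop."""
    delay = next_retry_delay(retries_done)
    return None if delay is None else now + delay
```

With a schedule like this, a job checker that runs every minute would skip any job whose next retry time is still in the future, instead of retrying it on every run.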
EDIT
After discussions, we will go with the exponential strategy implementation and distinct error codes.
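As a rough illustration of how distinct error codes could select a retry strategy, here is a small Strategy-pattern sketch in Python (not the plugin's language). Only two points come from this thread: 404 and 422 fail immediately, and a timeout follows the same strategy as a 400. Everything else is an assumption: the class names, the strategy_for helper, using 504 as the timeout code, and treating 400 as a backoff case.

```python
from abc import ABC, abstractmethod
from typing import Optional, Sequence


class RetryStrategy(ABC):
    """Hypothetical interface: decides how long to wait before the next attempt."""

    @abstractmethod
    def delay_minutes(self, retries_done: int) -> Optional[int]:
        """Return minutes to wait, or None to mark the job as failed."""


class FailImmediately(RetryStrategy):
    def delay_minutes(self, retries_done: int) -> Optional[int]:
        return None  # never retry


class Backoff(RetryStrategy):
    def __init__(self, delays: Sequence[int] = (1, 2, 5, 10, 10)) -> None:
        self.delays = tuple(delays)

    def delay_minutes(self, retries_done: int) -> Optional[int]:
        if 0 <= retries_done < len(self.delays):
            return self.delays[retries_done]
        return None  # schedule exhausted


_backoff = Backoff()
_fail = FailImmediately()

# 404 and 422 fail immediately (from the thread). 400 -> backoff is an assumption;
# the timeout code (504 here, also an assumption) shares the 400 strategy.
STRATEGIES = {400: _backoff, 504: _backoff, 404: _fail, 422: _fail}


def strategy_for(status_code: int) -> RetryStrategy:
    """Pick a strategy for a status code; unknown codes default to backoff (assumption)."""
    return STRATEGIES.get(status_code, _backoff)
```

Keeping the code-to-strategy mapping in one table means a later change to how a given error code behaves touches a single entry rather than the job-checking logic.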
Acceptance Criteria
NOTE: There is a known issue on the SaaS side for the Unauthorized (401) case. It will be treated like a 400 by the plugin. This use case is to be discarded for now when testing the plugin.