Set max consecutive futile launches #102
Codecov Report

Coverage diff, main vs. #102:

| | main | #102 | +/- |
|---|---|---|---|
| Coverage | 100.00% | 100.00% | |
| Files | 22 | 22 | |
| Lines | 1261 | 1285 | +24 |
| Hits | 1261 | 1285 | +24 |
Re. your comment on #79: whether a worker is 'backlogged' shouldn't affect the actual completed count. If a worker does a task, the count will increment regardless. I think you'd want to set it higher than 1 to allow for random errors, though.
By the way: @shikokuchuo, it's starting to look like shikokuchuo/mirai@3f15ead really fixed #76 and ropensci/targets#1101! I almost always get lots of host segfaults and dangling dispatchers when I develop and test
So then I can trust that backlogged workers will have at least 1 completed task? I thought if completed < assigned then it might be the case that completed = 0 when it should be 1 (or at least 1 task should have actually completed).
Yes, it is hard to remember how things were, but now it's all cumulative stats. So if a worker is 'backlogged', as you've termed it, assigned will be larger than complete by 1, and by at most 1. All that means is there is a task waiting at the socket to be completed. If a daemon dials in and completes that task, complete will increment. Conversely, if it wasn't backlogged, assigned = complete to start with; then it gets sent a task, it completes it, and both increase. So whether it is 'backlogged' shouldn't affect this feature. I just wanted to be sure there wasn't something getting mixed up here.
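To make the counter behaviour described above concrete, here is a minimal Python sketch of the two cases. It is only an illustration of the reasoning in this comment, not mirai or crew code: the `WorkerStats` class and its `assign` / `finish` methods are hypothetical names, and the real statistics are reported by the dispatcher.

```python
from dataclasses import dataclass


@dataclass
class WorkerStats:
    """Hypothetical cumulative counters for one worker socket."""

    assigned: int = 0
    complete: int = 0

    def assign(self) -> None:
        # A task is sent to the socket and waits there until a daemon takes it.
        self.assigned += 1

    def finish(self) -> None:
        # A connected daemon completed the waiting task.
        self.complete += 1

    @property
    def backlogged(self) -> bool:
        # "Backlogged" in the sense used in this thread: a task is waiting.
        return self.assigned > self.complete


# Case 1: the worker starts backlogged (assigned is 1 larger than complete).
a = WorkerStats(assigned=1, complete=0)
a_before = a.complete
a.finish()  # a daemon dials in and completes the waiting task
a_delta = a.complete - a_before

# Case 2: the worker starts with assigned == complete, then gets a task.
b = WorkerStats(assigned=3, complete=3)
b_before = b.complete
b.assign()
b.finish()
b_delta = b.complete - b_before
```

In both cases the completed count grows by exactly one per finished task, which is the property the futile-launch check relies on.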
Summary
This PR implements a new `launch_max` argument to the controllers and launchers, which sets the maximum number of consecutive launch attempts that are allowed to complete no tasks. @shikokuchuo suggested this as a robust uniform solution to wlandau/crew.cluster#19 and shikokuchuo/mirai#20. Indeed, it prevents the nightmare of indefinitely racking up costs on a malfunctioning platform, but given #79, `launch_max` can almost never be zero. So at best, this PR is only an approximate solution. I hope there is a more precise / less wasteful way to solve this long-term. Maybe it will have to involve understanding why some workers become backlogged in the first place and preventing that outcome entirely.

FYI @multimeric
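For readers who want the idea in code, the following is a rough Python sketch of a consecutive-futile-launch counter. It is not the implementation in this PR: the `FutileLaunchTracker` class, its `record_launch` method, and the exact point at which the error is raised (once the futile count exceeds `launch_max`) are all assumptions made for illustration.

```python
from typing import Optional


class FutileLaunchTracker:
    """Hypothetical per-worker tracker of launches that complete no tasks."""

    def __init__(self, launch_max: int) -> None:
        self.launch_max = launch_max
        self.futile = 0  # consecutive launches that completed nothing
        self._complete_at_last_launch: Optional[int] = None

    def record_launch(self, complete_now: int) -> None:
        """Call just before each (re)launch with the cumulative completed count."""
        if self._complete_at_last_launch is not None:
            if complete_now > self._complete_at_last_launch:
                self.futile = 0  # the previous launch did some work
            else:
                self.futile += 1  # the previous launch completed nothing
        if self.futile > self.launch_max:
            raise RuntimeError(
                f"{self.futile} consecutive launches completed no tasks "
                f"(launch_max = {self.launch_max})"
            )
        self._complete_at_last_launch = complete_now


# Usage: the completed count never moves, so the tracker eventually refuses.
tracker = FutileLaunchTracker(launch_max=2)
outcomes = []
for _ in range(4):
    try:
        tracker.record_launch(complete_now=0)
        outcomes.append("launched")
    except RuntimeError:
        outcomes.append("refused")
```

Under these assumptions, `launch_max = 0` would refuse a relaunch after a single launch that completed nothing, which illustrates why a value of zero is rarely practical if some launches legitimately complete no tasks.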