When DaskVine handles a task graph, it first submits every task in the topmost level of the graph (these tasks do not depend on output files produced by any other task), and only then begins calling `wait`, which is where worker connection and task dispatching happen.
However, if the graph is wide enough, thousands of tasks may be ready for submission at once, so during initialization the manager is busy submitting tasks rather than dispatching them. If we delayed some task submissions and used that time to connect workers and dispatch tasks, it might improve concurrency at the beginning of the run.
For example, in the following run, no workers were connected and no tasks were dispatched during the first ~10 minutes, which potentially hurts the overall execution time.
JinZhou5042 changed the title from "vine: long initialization time of task submission" to "vine: long initialization time for large task graphs in DaskVine" on Oct 14, 2024
`vine_hungry` is the intended solution to this problem! It signals to the caller when "enough" tasks have been submitted and the manager should get to work, hence this pattern: submit tasks only while the manager is hungry, then call `wait`, and repeat.
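A minimal sketch of that submit-while-hungry loop, written against a toy stand-in rather than the real TaskVine API. `ToyManager`, its `hungry_limit` parameter, and `run_graph` are hypothetical names invented for illustration; the real manager decides "hungry" from its own internal state, and its `wait` connects workers and dispatches tasks instead of just popping a queue.

```python
from collections import deque


class ToyManager:
    """Hypothetical stand-in for a TaskVine manager; NOT the real API."""

    def __init__(self, hungry_limit):
        # Assumed cap on queued tasks, used here to fake the "hungry" signal.
        self.hungry_limit = hungry_limit
        self.queue = deque()

    def hungry(self):
        # True while the manager could usefully accept more tasks.
        return len(self.queue) < self.hungry_limit

    def empty(self):
        return not self.queue

    def submit(self, task):
        self.queue.append(task)

    def wait(self):
        # Stand-in for worker connection and dispatch: completes one task.
        return self.queue.popleft() if self.queue else None


def run_graph(manager, ready_tasks):
    """Interleave submission with wait instead of submitting everything first."""
    pending = deque(ready_tasks)
    completed = []
    max_queued = 0
    while pending or not manager.empty():
        # Submit only while the manager reports that it is hungry.
        while pending and manager.hungry():
            manager.submit(pending.popleft())
        max_queued = max(max_queued, len(manager.queue))
        # Hand control back so the manager can do its real work.
        task = manager.wait()
        if task is not None:
            completed.append(task)
    return completed, max_queued


# 1000 ready tasks, but the toy manager never holds more than 10 at once.
completed, max_queued = run_graph(ToyManager(hungry_limit=10), range(1000))
```

With this structure a wide top level no longer blocks the first `wait` call: the caller holds the remaining ready tasks and feeds them in as the manager asks for more.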