In job-intensive systems where many background jobs need to be processed in parallel, there can be tens or hundreds of worker nodes in a cluster.
Currently, each worker starts a `JobProcessor` for ALL of the `InProgress` jobs. This means every node starts processing every job, which is not efficient. Typically, each job only needs to run on a few of the cluster nodes, so the jobs can be distributed across different nodes.
Nebula needs a way to specify the maximum number of nodes that process each individual job.
For example, you can deploy a 100-node worker cluster with 200 running jobs. The current version of Nebula will start 20,000 runners across the cluster (100 nodes × 200 jobs). Instead, we need to be able to specify, for example, a maximum of 5 runners per job (set separately on each individual job), so that a total of 1,000 runners are started and distributed across the cluster nodes (each node running an average of 10 runners).
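To make the arithmetic above easy to check, here is a small Python sketch. The function name `total_runners` and its parameters are made up for illustration and are not part of Nebula; the sketch only assumes that, without a cap, every node starts one runner per job.

```python
from typing import Optional


def total_runners(nodes: int, jobs: int, max_runners_per_job: Optional[int] = None) -> int:
    """Return the number of runners started across the whole cluster.

    Hypothetical helper, not a Nebula API.
    Without a cap, every node starts one runner for every job.
    With a cap, each job gets at most `max_runners_per_job` runners,
    and never more than the number of nodes.
    """
    if max_runners_per_job is None:
        per_job = nodes
    else:
        per_job = min(nodes, max_runners_per_job)
    return jobs * per_job


# Current behavior: 100 nodes, 200 jobs, no cap.
uncapped = total_runners(nodes=100, jobs=200)

# Requested behavior: at most 5 runners per job.
capped = total_runners(nodes=100, jobs=200, max_runners_per_job=5)

# Average number of runners each node ends up hosting.
average_per_node = capped / 100

print(uncapped, capped, average_per_node)  # 20000 1000 10.0
```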
A nice additional feature would be intelligent distribution of the runners, so that when new jobs are started, they are assigned to the nodes with the least traffic or the fewest runners.
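One possible way to do this kind of distribution is a simple least-loaded selection, sketched below in Python. This is only an illustration under assumptions: `assign_runners`, the `node_loads` mapping, and the node names are hypothetical, and the sketch ignores how Nebula would actually track or share runner counts between nodes.

```python
import heapq
from typing import Dict, List


def assign_runners(node_loads: Dict[str, int], max_runners: int) -> List[str]:
    """Pick up to `max_runners` distinct nodes with the fewest runners.

    Hypothetical helper, not a Nebula API. `node_loads` maps a node name
    to its current runner count and is updated in place. Ties are broken
    by node name so the result is deterministic.
    """
    chosen = heapq.nsmallest(
        max_runners, node_loads, key=lambda node: (node_loads[node], node)
    )
    for node in chosen:
        node_loads[node] += 1
    return chosen


# Example cluster state (made-up numbers).
loads = {"node-a": 3, "node-b": 0, "node-c": 1, "node-d": 0}

# A new job capped at 2 runners goes to the two idle nodes.
first_job_nodes = assign_runners(loads, max_runners=2)

# The next job sees the updated loads and avoids the busiest node.
second_job_nodes = assign_runners(loads, max_runners=2)

print(first_job_nodes, second_job_nodes, loads)
```

Because each job's cap is passed in separately, this fits the requirement that the maximum is set per job rather than globally.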