Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inefficient scheduling for par_iter #1204

Open
morgante opened this issue Oct 21, 2024 · 3 comments
Open

Inefficient scheduling for par_iter #1204

morgante opened this issue Oct 21, 2024 · 3 comments

Comments

@morgante
Copy link

morgante commented Oct 21, 2024

I have an iterator where the work to be done for each task varies widely—most items finish very quickly, but a few take orders of magnitude longer.

It appears from my quick experiment that Rayon is being somewhat inefficient here since spawn performs much better than the ParallelIterator. It looks like Rayon is still queuing up fast tasks behind the slow task on the same thread pool, or otherwise running fast tasks after the slow task.

The optimal strategy here, which spawn seems to support, is to continue chewing through tasks on other threads while 1 thread stays stuck on the slow task.

Is there a way to do this with par_iter which has much nicer iteration semantics, or do I need to manage spawning myself?

@cuviper
Copy link
Member

cuviper commented Oct 21, 2024

It's a tough balancing act, because usually I hear that the adaptive scheduler was too aggressive. :)

If your input par_iter() is an IndexedParallelIterator, then you can force fine granularity by adding .with_max_len(1). That will get you roughly the same processing strategy as a bunch of individual spawns, but still based on join recursion. That has its own work-stealing gotchas with latency though -- see #1054.

@morgante
Copy link
Author

Thanks for the pointer, I could probably get indexed iteration working but I'm actually not sure I understand what wins I'm getting there over just spawning each thread separately.

My ideal would actually be something like this:

  • I use the iterator/into_some_iter() with for_each.
  • The iterator is consumed for tasks into different threads.
  • When no thread is available for a task, we don't continue pulling from the iterator until there's actually a thread available for the next task.

Is there a way to do this with rayon or will I need to build my own scheduler / switch to tokio?

@cuviper
Copy link
Member

cuviper commented Oct 22, 2024

but I'm actually not sure I understand what wins I'm getting there over just spawning each thread separately.

The benefit is that you would still be using the shared thread pool, even if your number of items is much larger than the number of threads. They'll just be "scheduled" individually if you add .with_max_len(1).

What is your source that you're calling par_iter() on? Many types do support the indexed API already.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants