Does rayon have guarantee that .par_bridge().map().collect() will not store too many "Item"s in mem? #1068
Comments
"Too many" is not specific enough to make guarantees.
I'm not sure that we want to commit to guarantees about that implementation though. We've changed it in a major way already, and might want to change it again if we figure out a new heuristic for performance.
Thanks for the answer! It would still be great to get at least some guarantee.
We can (and should) at least guarantee that we don't consume the entire iterator at once -- this means that it's safe to use with unbounded iterators like The other extreme is to say that we don't buffer items at all, which is actually the current state of things. That's a useful property because it enables patterns like channel send/recv that might otherwise deadlock, if existing items don't get processed before trying to recv more. Once promised though, I hope we would not regret it... I'm less certain about the value of trying to put any particular numbers in-between.
Thinking about the
👍 |
There are a lot of ways to implement buffering on the pre-rayon iterator side. Buffering on the rayon side would be trying to avoid some of the So, okay, let's commit to not buffering. Anyone want to write that up in a PR?
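One possible way to buffer on the iterator side, before the items ever reach rayon, is a bounded channel fed by a background thread. The sketch below uses only the standard library; the helper name `prefetch` and its `capacity` parameter are hypothetical and are not part of rayon.

```rust
use std::sync::mpsc::{sync_channel, IntoIter};
use std::thread;

/// Hypothetical helper (not a rayon API): runs `iter` on a background thread
/// and hands its items over through a bounded channel. At most `capacity`
/// items sit in the channel, plus one the producer may hold while blocked.
pub fn prefetch<I>(iter: I, capacity: usize) -> IntoIter<I::Item>
where
    I: Iterator + Send + 'static,
    I::Item: Send + 'static,
{
    let (tx, rx) = sync_channel(capacity);
    thread::spawn(move || {
        for item in iter {
            // `send` blocks while the channel is full. It returns an error
            // once the receiving side has been dropped, so we stop producing.
            if tx.send(item).is_err() {
                break;
            }
        }
    });
    // The owning receiver iterator ends when the producer thread finishes.
    rx.into_iter()
}
```

The returned iterator is an ordinary sequential iterator, so it could be followed by `.par_bridge()` like any other; the amount of read-ahead is then chosen by the caller rather than by rayon.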
I will adopt my pull request #1071 in my code base, so I personally don't need any guarantees from par_bridge anymore. Still, the guarantees could be useful for others.
Hi. I wrote an iterator that reads data from a file, splits the data into chunks, and returns them one by one. Then I apply
.par_bridge().map(...).collect::<Vec<_>>()
to that iterator. But my file does not necessarily fit into memory. So my question is: does this use case come with a guarantee that rayon will never store too many chunks in memory? (I.e. too many items of the original sequential iterator.) (Values produced by map are small [in fact they are hashes of chunks in my application], so it is okay for my application to store them all in memory.)
If such a guarantee exists, then please document it.
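For concreteness, the kind of chunking iterator described in the question might look roughly like the sketch below. The type name `Chunks` and its `chunk_size` field are hypothetical, and the real iterator in the question may differ. Only the standard library is used, so the rayon call itself appears as a comment.

```rust
use std::io::{self, Read};

/// Hypothetical iterator that yields fixed-size chunks from any reader.
/// Only one chunk is allocated per call to `next`.
pub struct Chunks<R: Read> {
    reader: R,
    chunk_size: usize,
    done: bool,
}

impl<R: Read> Chunks<R> {
    pub fn new(reader: R, chunk_size: usize) -> Self {
        assert!(chunk_size > 0, "chunk_size must be non-zero");
        Chunks { reader, chunk_size, done: false }
    }
}

impl<R: Read> Iterator for Chunks<R> {
    type Item = io::Result<Vec<u8>>;

    fn next(&mut self) -> Option<Self::Item> {
        if self.done {
            return None;
        }
        let mut buf = Vec::with_capacity(self.chunk_size);
        // `take` caps the read at one chunk; `read_to_end` keeps reading
        // until that cap or end of input is reached.
        let limit = self.chunk_size as u64;
        match self.reader.by_ref().take(limit).read_to_end(&mut buf) {
            Ok(0) => {
                self.done = true;
                None
            }
            Ok(_) => Some(Ok(buf)),
            Err(e) => {
                self.done = true;
                Some(Err(e))
            }
        }
    }
}

// With rayon in scope, the pipeline from the question would then be roughly:
//
//     let hashes: Vec<_> = Chunks::new(file, 1 << 20)
//         .par_bridge()
//         .map(|chunk| hash(&chunk.unwrap())) // `hash` is a placeholder
//         .collect();
```

With an iterator like this, how many chunks are alive at once depends entirely on how many items par_bridge pulls before they are processed, which is what the question asks about.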