yield_local can cause stack overflow #1064
Comments
I don't agree with your fix that yielding should never recurse. It's in the nature of work-stealing that rayon jobs may be nested on each other, and I think

    /// Wait for all evaluations to finish and return smallest reduction
    /// Or `None` if the queue is empty.
    #[cfg(feature = "parallel")]
    pub fn get_best_candidate(self) -> Option<Candidate> {
        let (eval_send, eval_recv) = self.eval_channel;
        // Disconnect the sender, breaking the loop in the thread
        drop(eval_send);
        // Yield to ensure evaluations are finished - this can prevent deadlocks when run within an existing thread pool
        while let Some(rayon::Yield::Executed) = rayon::yield_local() {}
        eval_recv.into_iter().min_by_key(Candidate::cmp_key)
    }

That's draining all local work before returning, when they only need to wait for their own evaluations. It might be cleaner if they looped on the channel
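To make the "loop on the channel" idea concrete, here is a minimal sketch, not oxipng's actual code. It assumes a `std::sync::mpsc` channel (the real channel type may differ), and `min_from_channel` and `yield_once` are hypothetical names. The yield step is passed in as a closure so the sketch compiles with the standard library alone; inside a Rayon pool one could pass something like `|| matches!(rayon::yield_local(), Some(rayon::Yield::Executed))`.

```rust
use std::sync::mpsc::{Receiver, TryRecvError};
use std::thread;

/// Hypothetical helper: poll the channel for results and only yield while it
/// is empty, instead of first draining every local job.
/// `yield_once` should return true if it ran a job (an assumption of this sketch).
pub fn min_from_channel<T, K: Ord>(
    rx: Receiver<T>,
    key: impl Fn(&T) -> K,
    mut yield_once: impl FnMut() -> bool,
) -> Option<T> {
    let mut best: Option<T> = None;
    loop {
        match rx.try_recv() {
            // A result arrived: keep it if it beats the current best.
            Ok(item) => {
                let better = match &best {
                    Some(current) => key(&item) < key(current),
                    None => true,
                };
                if better {
                    best = Some(item);
                }
            }
            // Nothing ready yet: run at most one other job, then poll again.
            Err(TryRecvError::Empty) => {
                if !yield_once() {
                    // No local work either; let the OS schedule other threads.
                    thread::yield_now();
                }
            }
            // Every sender is gone and the buffer is empty: we are done.
            Err(TryRecvError::Disconnected) => return best,
        }
    }
}
```

With this shape the caller stops yielding as soon as its own senders have disconnected, rather than when the local queue happens to be empty. Each yield can still nest a job on the stack, so this bounds how often the caller yields, not how deep the nesting can get.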
It looks like they're already fixing something like that in shssoichiro/oxipng#527
Maybe
When a job calls yield_local, Rayon loads another job onto the stack on top of it. If lots of jobs are calling it, this can cause a stack overflow.
To fix this, Rayon should give each thread a flag that tracks whether yield_local is already on that thread's stack. If it is, a nested call to yield_local should return immediately. (There may need to be a third value in the Yield enum for this case.) Alternatively, it could check the available stack space.
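For illustration only, the per-thread flag could look roughly like the sketch below. This is not Rayon's implementation or API: `YieldOutcome`, its `Reentrant` variant, `guarded_yield`, and the `run_one_job` closure are all hypothetical stand-ins, and the job execution itself is abstracted as a closure that returns whether a job ran.

```rust
use std::cell::Cell;

thread_local! {
    // True while a yield is already on this thread's stack.
    static YIELDING: Cell<bool> = Cell::new(false);
}

/// Hypothetical stand-in for a `Yield` enum with a third value.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum YieldOutcome {
    Executed,
    Idle,
    Reentrant,
}

/// Clears the flag when the outermost yield returns, even on panic.
struct ResetOnDrop;

impl Drop for ResetOnDrop {
    fn drop(&mut self) {
        YIELDING.with(|flag| flag.set(false));
    }
}

/// Hypothetical wrapper: runs one job unless a yield is already in progress
/// on this thread, in which case it returns immediately.
pub fn guarded_yield(run_one_job: impl FnOnce() -> bool) -> YieldOutcome {
    // `replace` returns the previous value; true means we are nested.
    if YIELDING.with(|flag| flag.replace(true)) {
        return YieldOutcome::Reentrant;
    }
    let _reset = ResetOnDrop;
    if run_one_job() {
        YieldOutcome::Executed
    } else {
        YieldOutcome::Idle
    }
}
```

In this sketch a job that yields from inside another yield gets `Reentrant` back instead of pushing a further job onto the stack, which is the behaviour the proposal describes.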
Example stack trace:
stack_trace.txt