Parallel mode hangs when invoked from Rayon global thread pool #517
My app needs to generate and optimize multiple images in parallel, so it uses the Rayon global thread pool. However, the time these images take varies unpredictably, so utilization at the end would be higher if I could also use Oxipng's parallel mode. Unfortunately, Oxipng deadlocks when I invoke it from within a Rayon task in the global thread pool. Here's the app where I'm experiencing this: https://app.circleci.com/pipelines/github/Pr0methean/OcHd-RustBuild?branch=oxipng-parallel

Comments
I've tried wrapping the …
Is this on master or v8? Could you try some older versions perhaps?
Master.
Maybe this is relevant: rayon-rs/rayon#592. Out of interest, are you using the new raw API?
That bug is unlikely to be relevant, since the PNG-writing task that invokes Oxipng calls into_inner() on its Mutex before running. I'm not using the raw API but will look into it.
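A minimal sketch of the Mutex pattern described above (illustrative stand-ins only, not the app's actual code): `into_inner()` consumes the Mutex and takes ownership of the data, so no lock is held while the long-running Oxipng call executes.

```rust
use std::sync::Mutex;

fn main() {
    // Illustrative stand-in for the PNG-writing task's buffered data.
    let buffered: Mutex<Vec<u8>> = Mutex::new(vec![0u8; 1024]);

    // into_inner() consumes the Mutex and returns the data without locking,
    // so nothing can block on this Mutex while the slow work runs.
    let data = buffered.into_inner().unwrap();

    // ... the long-running oxipng call would receive `data` here ...
    drop(data);
}
```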
As I'm not exactly an expert in Rayon, I'm afraid there's not much else I can do to help, sorry. Perhaps @kornelski may have some ideas?
This is very simple to reproduce, using the code below as an example. My CPU has 12 threads. This works perfectly well with 11 items, but as soon as I uncomment the 12th, none of them finish, there is no CPU activity, and the program just hangs. On an 8-thread CPU it works with 7 but hangs with 8, etc.

```rust
use rayon::prelude::*;
use oxipng::{optimize_from_memory, Options, Deflaters};

fn main() {
    let buffer = include_bytes!("sample.png").as_slice();
    let options = Options {
        deflate: Deflaters::Libdeflater { compression: 12 },
        ..Default::default()
    };
    vec![
        1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11,
        // 12,
    ].par_iter().for_each(|x| {
        println!("Optimizing {x}");
        optimize_from_memory(&buffer, &options).unwrap();
        println!("Done {x}");
    });
}
```
@wait-what Thanks for the minimal test case, that's quite helpful!
Using a while loop like that does seem like it might not be the most optimal solution, but it does seem plenty fast.
I think the yield loop should be safe, because it's not busy-waiting; it's stealing work to run on the current thread. I'm not sure if it solves the problem 100%, because I don't know if it's guaranteed that rayon will find a job to execute at the moment of entering the yield loop. I see a couple of possible solutions to this.
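For readers following along, here is a minimal sketch of the kind of yield loop under discussion, assuming the waiting thread receives results over a channel (the helper name and channel are illustrative, not Oxipng's actual code). Instead of blocking on the receive, the thread keeps executing queued rayon jobs via `rayon::yield_now()`:

```rust
use std::sync::mpsc::{Receiver, TryRecvError};

/// Hypothetical helper: wait for a worker's result without blocking the
/// rayon pool thread we are running on.
fn recv_with_yield<T>(rx: &Receiver<T>) -> T {
    loop {
        match rx.try_recv() {
            Ok(value) => return value,
            Err(TryRecvError::Empty) => {
                // Steal one queued rayon job and run it on this thread;
                // returns None if we're not on a rayon pool thread at all.
                if rayon::yield_now().is_none() {
                    std::thread::yield_now(); // plain OS yield as a fallback
                }
            }
            Err(TryRecvError::Disconnected) => panic!("workers dropped their senders"),
        }
    }
}
```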
Thanks for the analysis, @kornelski!
Are you sure this is possible though? I would have thought the job would be guaranteed to be queued before …
I believe I have a simple solution. Can you try this out? andrews05@1e46122
It may take me a day or two to test this, because I'm also testing zopfli-rs/zopfli#22 and (in preparation for making it a PR) Pr0methean/zopfli@b2d0142.
@Pr0methean Have you had a chance to try out the branch above?
@andrews05 It works; thanks!
Great, I've opened a PR for it.
@Pr0methean Was just thinking about the stack overflow you mentioned, are you able to get a stack trace for that?
Yes.
Opened rayon-rs/rayon#1064.
Oh I see, so `yield_local` is picking up a task from your own program (the next image to operate on), rather than an evaluation task, is that right?
That's right. I've tried having the image task itself call `yield_local()` before starting, but this is slower and gives lower CPU utilization than without the Oxipng parallelization. I've also tried wrapping the Oxipng call in `install()` with a second thread pool, but then I get deadlock again.
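Roughly what that second-thread-pool attempt might have looked like, as a sketch (the pool size and image list are made up, and as noted above this variant still deadlocked):

```rust
use rayon::prelude::*;

fn main() {
    // Run each oxipng call inside a dedicated pool so its internal tasks
    // don't compete with the outer application's image tasks.
    let oxipng_pool = rayon::ThreadPoolBuilder::new()
        .num_threads(4) // arbitrary size, for illustration only
        .build()
        .unwrap();

    let images: Vec<Vec<u8>> = Vec::new(); // stand-in for real image data
    images.par_iter().for_each(|png| {
        oxipng_pool.install(|| {
            // As reported above, this arrangement still deadlocked in practice.
            let _ = oxipng::optimize_from_memory(png, &oxipng::Options::default());
        });
    });
}
```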
I guess kornelski's solution would have worked better then. Whoops 😅
Update: it looks like this only affects jobs running in a FIFO scope, but it's unfortunate that I have to choose between my two methods of increasing utilization (parallelizing Oxipng vs. starting the longest jobs first).
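For reference, a sketch of the "longest jobs first" pattern with a FIFO scope (the job list and durations are invented for illustration). `spawn_fifo` prioritizes jobs in submission order, which is what makes the up-front sort meaningful:

```rust
fn main() {
    // Invented job list: (estimated seconds, name). Sort longest first.
    let mut jobs: Vec<(u64, &str)> = vec![(3, "small"), (30, "large"), (12, "medium")];
    jobs.sort_by(|a, b| b.0.cmp(&a.0));

    // FIFO scopes prioritize spawned jobs in submission order, so sorting
    // the queue controls which jobs tend to start first.
    rayon::scope_fifo(|s| {
        for (secs, name) in jobs {
            s.spawn_fifo(move |_| {
                println!("running {name} (~{secs}s)");
            });
        }
    });
}
```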