Big enhancement: Parallel Tasks #6
Cloning copied every single image file, which used up a lot of memory very quickly
More performance enhancements
This lets us pass the queue between different threads, which is useful for parallel image decoding and, soon, parallel processing
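A minimal sketch of passing a queue between threads, using only the standard library. The `Task` type, the `drain_shared_queue` function, and the choice of `Arc<Mutex<VecDeque<_>>>` are assumptions for illustration; the actual queue type is not shown in this conversation.

```rust
use std::collections::VecDeque;
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for an image task; the real type is not shown here.
#[derive(Debug)]
struct Task {
    id: usize,
}

/// Drains a shared queue from `workers` threads and returns the handled ids, sorted.
fn drain_shared_queue(tasks: Vec<Task>, workers: usize) -> Vec<usize> {
    // Arc gives every thread a handle to the same queue; Mutex guards access to it.
    let queue = Arc::new(Mutex::new(VecDeque::from(tasks)));
    let mut handles = Vec::new();

    for _ in 0..workers {
        let queue = Arc::clone(&queue);
        handles.push(thread::spawn(move || {
            let mut seen = Vec::new();
            loop {
                // The guard is a temporary, so the lock is released at the end
                // of this statement, before the task is handled.
                let next = queue.lock().unwrap().pop_front();
                match next {
                    Some(task) => seen.push(task.id),
                    None => break,
                }
            }
            seen
        }));
    }

    let mut all: Vec<usize> = handles
        .into_iter()
        .flat_map(|handle| handle.join().unwrap())
        .collect();
    all.sort_unstable();
    all
}
```

Every task is handled exactly once regardless of which worker picks it up, because each `pop_front` happens under the lock.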
It seems that using |
This lets me reference it multiple times for different decoding images, fixing the display issue
This would cause the program to process every image sequentially, as the next thread would wait for the lock while the image was processed or saved. We fix that by dropping the locks before processing begins
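The lock-dropping fix can be sketched roughly like this. `process` and `process_all` are hypothetical names, and the byte-summing work is a placeholder for real image processing; the point is only that the guard is dropped before the slow step starts.

```rust
use std::sync::{Arc, Mutex};
use std::thread;

/// Hypothetical stand-in for expensive image work.
fn process(pixels: &[u8]) -> u64 {
    pixels.iter().map(|&p| u64::from(p)).sum()
}

/// Processes all images on `workers` threads and returns the combined result.
fn process_all(images: Vec<Vec<u8>>, workers: usize) -> u64 {
    let pending = Arc::new(Mutex::new(images));
    let total = Arc::new(Mutex::new(0u64));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            let pending = Arc::clone(&pending);
            let total = Arc::clone(&total);
            thread::spawn(move || loop {
                let mut guard = pending.lock().unwrap();
                let image = guard.pop();
                // Release the lock now; holding it across `process` would make
                // every other worker wait, which serializes the whole batch.
                drop(guard);

                let image = match image {
                    Some(image) => image,
                    None => break,
                };
                let value = process(&image);
                // Short critical section: only the final addition is locked.
                *total.lock().unwrap() += value;
            })
        })
        .collect();

    for handle in handles {
        handle.join().unwrap();
    }
    let result = *total.lock().unwrap();
    result
}
```

Without the explicit `drop(guard)`, the guard would live until the end of the loop body and other workers would block for the full duration of `process`.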
This makes the conversions seem quite a bit faster, since we convert and write to memory, then write to disk all at once. I need to test a bit more, though.
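A rough sketch of the "encode into memory, then write to disk once" idea. The `encode_into` function and its toy `IMG0` format are made up for illustration and are not the encoder used in this PR; any encoder that accepts a `Write` target could fill the same role.

```rust
use std::fs;
use std::io::{self, Write};
use std::path::Path;

/// Hypothetical encoder: a 4-byte tag, a little-endian length, then the raw bytes.
fn encode_into<W: Write>(out: &mut W, pixels: &[u8]) -> io::Result<()> {
    out.write_all(b"IMG0")?;
    out.write_all(&(pixels.len() as u32).to_le_bytes())?;
    out.write_all(pixels)
}

/// Encodes into an in-memory buffer first, then hands the finished bytes to
/// a single `fs::write` call. Returns the number of bytes written.
fn save_buffered(path: &Path, pixels: &[u8]) -> io::Result<usize> {
    // Vec<u8> implements Write, so the encoder never touches the disk.
    let mut buffer = Vec::with_capacity(pixels.len() + 8);
    encode_into(&mut buffer, pixels)?;
    fs::write(path, &buffer)?;
    Ok(buffer.len())
}
```

The trade-off is that the whole encoded image sits in memory until the write finishes, so peak memory grows with the number of images being saved at once.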
Parallel decoding seems to work really well and brings a lot of improvement.
Seems to be a bit faster and more reliable than my previous method of thread spawning
…found small but nice optimization
This should let me dynamically free memory when I no longer need a task, thus freeing the allocated image data
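One way to picture freeing a task's image data as soon as it is no longer needed is to take ownership of each task and drop it right after use. This is only a sketch under that assumption: `ImageTask`, the `live` counter, and `run_and_release` are hypothetical and exist purely to make the drops observable.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Hypothetical task that owns its image data and counts how many tasks are alive.
struct ImageTask {
    pixels: Vec<u8>,
    live: Arc<AtomicUsize>,
}

impl ImageTask {
    fn new(pixels: Vec<u8>, live: &Arc<AtomicUsize>) -> Self {
        live.fetch_add(1, Ordering::SeqCst);
        ImageTask { pixels, live: Arc::clone(live) }
    }
}

impl Drop for ImageTask {
    fn drop(&mut self) {
        // Runs when the task is dropped; `pixels` is freed right after.
        self.live.fetch_sub(1, Ordering::SeqCst);
    }
}

/// Consumes the tasks one by one. For each task it records the image size and
/// how many tasks were still alive after that task had been dropped.
fn run_and_release(tasks: Vec<ImageTask>, live: &Arc<AtomicUsize>) -> Vec<(usize, usize)> {
    let mut log = Vec::new();
    // Iterating by value moves each task out of the vector.
    for task in tasks {
        let size = task.pixels.len();
        drop(task); // frees this task's image data immediately
        log.push((size, live.load(Ordering::SeqCst)));
    }
    log
}
```

Iterating by reference instead would keep every image buffer allocated until the whole vector goes out of scope.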
Using parallel iterators seems to bring pretty sizable improvements to performance, quite a bit more than I expected.
[Benchmark comparison: single-threaded vs. parallel]
From my testing, par_iter() seems to be a much better way to do multi-threading than a thread pool. Processing large batches of images with the thread pool gets the program killed by earlyoom on my system, which does not happen with parallel iterators. Thread pools can be a bit faster since we can specify the number of threads, but we do not need them
From a bit more testing, decoding and processing are significantly faster thanks to the parallelization. Sadly, writing images to disk is the slowest stage and takes the most time in my tests. This makes sense, since that stage is limited by disk speed.
However it seems that the thread pool |
The problem with the thread pool is that it causes massive CPU utilization. When processing large numbers of images, it can easily hang the system; on mine, earlyoom forcibly killed the process because of its CPU thread usage and massive memory usage.
Through my debugging I have noticed that rayon's parallel iterators let threads sleep for very short periods of time, although I need to research that a bit more.
Increases speed a little bit since threads don't always have to wait for mutex guards
This is something I have wanted to work on for quite some time, and now I aim to make it a reality.