Avoid blocking on CloseHandle #1850
Conversation
This is a draft at this point - it depends on tar accepting a change I have in preparation.
This permits optimisation of unpacking on Windows - which is perhaps too special-cased to include in tar itself? I'm doing this for rustup, which already doesn't use the Archive wrapper :/. With this patch, I reduced rustup's rust-docs install from 21/22s to 11s on my bench machine - details at rust-lang/rustup#1850. I haven't filled in all the possibilities for the Unpacked enum, but I'm happy to do so if you'd like that done - it seemed like it wouldn't be useful at this stage, as there are no live objects to return.
This may be enough to permit closing #1540.
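For illustration, a plausible shape for the Unpacked enum mentioned above - the variants below are an assumption, with only the file case carrying a live object:

```rust
// Hypothetical sketch of the Unpacked enum discussed above; only the
// file variant carries a live object back to the caller.
pub enum Unpacked {
    /// A file was unpacked; the still-open handle is returned so the
    /// caller can decide when (and on which thread) to close it.
    File(std::fs::File),
    /// Other entry kinds (directories, symlinks, ...) have no live
    /// object to return.
    Other,
}
```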
For the purpose of seeing the CI complete, could you update your Cargo.toml to use a git repo instead of a path for the patch to tar-rs?
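A hedged sketch of what that Cargo.toml change might look like - the repository URL and branch below are placeholders, not the actual values used in this PR:

```toml
[patch.crates-io]
# Placeholder git source so CI can resolve the patched tar-rs; the real
# fork URL and branch were specific to this PR.
tar = { git = "https://github.com/example/tar-rs", branch = "unpack-handle" }
```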
(writing here to spam fewer people than in #1540) @rbtcollins your PRs are awesome, thanks! Do you think it's worth implementing the streaming unpack I mentioned in #1540 (comment)?
Thank you :).
I think it's worth doing the experiment. It may be a bit awkward, as the download side is in futures and tar-rs is sync code, but perhaps just slamming them together as a learning test would answer the question about impact, and we can make it nice if it has a big benefit. From what you've said, you have a slow package server, or a slow link to the package server, and you want to do the unpacking while the download progresses, because the download is so slow that you can mask all the local IO overheads during the download period. I think that will definitely work for you; I don't think it will be a negative for anyone, though it will have an impact on folk that depend on the partial download recovery/retry mechanism today unless care is taken to preserve that. We don't have any stats on how many folk have download times roughly the same as or greater than the unpacking time (and who would thus benefit, as you do, from making the unpacking concurrent).
Even if they're the same speed, downloading and extracting files in parallel could almost halve the installation time.
That impedance mismatch can be solved by using https://docs.rs/futures/0.1.27/futures/stream/struct.Wait.html, which adapts a Stream into a blocking Iterator.
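A minimal sketch of that adapter, assuming futures 0.1 and a stream of byte chunks - the StreamReader type is illustrative, not rustup code:

```rust
use std::io::{self, Read};

use futures::stream::{Stream, Wait};

/// Wraps the blocking iterator produced by `Stream::wait()` so that
/// sync code such as tar-rs can `Read` from an async download stream.
/// Construct with: StreamReader { chunks: body.wait(), buffer: Vec::new(), pos: 0 }
struct StreamReader<S: Stream<Item = Vec<u8>, Error = io::Error>> {
    chunks: Wait<S>,
    buffer: Vec<u8>,
    pos: usize,
}

impl<S: Stream<Item = Vec<u8>, Error = io::Error>> Read for StreamReader<S> {
    fn read(&mut self, out: &mut [u8]) -> io::Result<usize> {
        while self.pos == self.buffer.len() {
            // Block until the next chunk arrives from the async side.
            match self.chunks.next() {
                Some(Ok(chunk)) => {
                    self.buffer = chunk;
                    self.pos = 0;
                }
                Some(Err(e)) => return Err(e),
                None => return Ok(0), // stream finished: EOF
            }
        }
        let n = out.len().min(self.buffer.len() - self.pos);
        out[..n].copy_from_slice(&self.buffer[self.pos..self.pos + n]);
        self.pos += n;
        Ok(n)
    }
}
```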
Maybe! Let's find out.
I don't think the in-process retry patch has merged yet, so you can ignore that. However, partial downloads are saved to a .partial file, and that can be resumed by running rustup again if the download fails. I think that is a useful capability, so a mergeable version of what you want to achieve would still want to stream the archive to disk. In terms of joining the two bits together, a few possibilities:
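One such possibility, sketched purely as an assumption (the helper below is illustrative, not a rustup internal): tee each downloaded chunk into the .partial file, preserving resume support, while forwarding a copy to the unpacker over a channel:

```rust
use std::fs::File;
use std::io::{self, Write};
use std::sync::mpsc;

/// Write a downloaded chunk to the on-disk .partial file (so resume
/// keeps working) and forward a copy to the unpacking thread.
fn tee_chunk(
    partial: &mut File,
    to_unpacker: &mpsc::Sender<Vec<u8>>,
    chunk: Vec<u8>,
) -> io::Result<()> {
    partial.write_all(&chunk)?;
    // The unpacker may have bailed out; a send error is not fatal to
    // the download itself.
    let _ = to_unpacker.send(chunk);
    Ok(())
}
```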
Back on the design though, one more consideration is validation - right now we make some assumptions about the safety of the archive we unpack based on validating its hash - and in future its GPG signature; mirrors of the packages may make trusting partly downloaded content more complex.
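A hedged sketch of the hash side of that, assuming the sha2 crate: feed every chunk through the digest as it streams, and only trust the unpacked tree once the final digest matches the expected value:

```rust
use sha2::{Digest, Sha256};

/// Incrementally hash the archive while it streams; the digest is
/// ready the moment the download completes.
fn hash_chunks<'a, I: Iterator<Item = &'a [u8]>>(chunks: I) -> [u8; 32] {
    let mut hasher = Sha256::new();
    for chunk in chunks {
        hasher.update(chunk);
    }
    hasher.finalize().into()
}
```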
Force-pushed eecc128 to 8accbf1.
I'm working on sane reporting of the time after the tar is fully read.
Nearly there:
(that's with 4 threads, defender on). 4 threads, defender off gets this output:
64 threads, defender off:
64 threads, defender on:
-> so running with threads == core-count is much better.
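That observation suggests sizing the close pool from the logical core count; a one-line sketch, assuming the num_cpus crate:

```rust
/// Pick the handle-closing pool size from the core count rather than a
/// fixed large number, per the benchmark numbers above.
fn pool_size() -> usize {
    num_cpus::get().max(1)
}
```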
Files are in bytes, but other things we may track are not.
The final version looks like this on a constrained machine (e.g. a Surface):
There are a couple of typos, but honestly I don't want to hold things up for those and we can fix them later if we re-touch the code. This looks excellent. Thank you so much for working through all of this with Alex.
Thank you; please let me know on Discord any follow-up typo fixes etc. you'd like me to do.
On Windows closing a file involves CloseHandle, which can be quite
slow (6+ms) for various reasons, including circumstances outside our
control such as virus scanners, content indexing, device driver cache
write-back synchronisation and so forth. Rather than having a super
long list of all the things users need to do to optimise the performance
of CloseHandle, use a small threadpool to avoid blocking package
extraction when closing file handles.
This does run a risk of resource exhaustion, but as we only have 20k
files in the largest package today that should not be a problem in
practice.
https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-January/078404.html
provided inspiration for this.
My benchmark system went from 21/22s to 11s with this change with either
4 or 8 threads (for rust-docs installation).
My Surface running the rustup release takes 44s for rust-docs, and 22s with this branch.
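A minimal, self-contained sketch of the technique the commit message describes - not rustup's actual implementation, and the names are illustrative:

```rust
// Hand each File to a small pool of worker threads whose only job is
// to drop it, so the extracting thread never blocks in CloseHandle.
use std::fs::File;
use std::sync::{mpsc, Arc, Mutex};
use std::thread;

fn spawn_closers(workers: usize) -> mpsc::Sender<File> {
    let (tx, rx) = mpsc::channel::<File>();
    // Receiver is neither Sync nor Clone, so share it behind a Mutex.
    let rx = Arc::new(Mutex::new(rx));
    for _ in 0..workers {
        let rx = Arc::clone(&rx);
        thread::spawn(move || loop {
            // The lock guard is released before the drop, so slow
            // closes can overlap across workers.
            let file = match rx.lock().unwrap().recv() {
                Ok(f) => f,
                Err(_) => break, // all senders gone: shut down
            };
            drop(file); // CloseHandle runs here, off the hot path
        });
    }
    tx
}
```

Extraction code would then send each finished File down the returned channel instead of dropping it inline.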