gh-118761: Speedup pathlib import by deferring shutil #123520
On main in a non-debug build (but not a PGO one), I get no improvements at all.
The command to run is `./python -m pyperf timeit 'import pathlib' -l 1 -w 0 -n 1 -p 250`. This will ensure that you don't have warmups and don't have cached modules.
- main:
Mean +- std dev: 1.37 ms +- 0.05 ms
- PR:
Mean +- std dev: 1.36 ms +- 0.04 ms
I'm not sure this is worth the change, but maybe I'm benchmarking the time incorrectly.
One idea: `shutil` imports some additional libs such as `fnmatch` that are then needed by `glob`. So when you do `from glob import _StringGlobber` after `import shutil`, you'll end up importing `fnmatch` either way: importing `shutil` first only saves the `fnmatch` import inside the `glob` module. Now, I don't know if this specific import is the reason why I don't see an improvement.
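The caching effect described above can be checked directly. The following is a minimal sketch, not something from the PR: the helper name `modules_loaded_by` is hypothetical, and it only reports which modules end up in `sys.modules` after a statement runs in a fresh interpreter. Modules already loaded during interpreter startup will show up as loaded regardless of the statement.

```python
import subprocess
import sys


def modules_loaded_by(statement, names):
    """Run `statement` in a fresh interpreter and report which of `names`
    are present in sys.modules afterwards (hypothetical helper)."""
    code = (
        f"{statement}\n"
        "import sys\n"
        f"print(','.join(n for n in {list(names)!r} if n in sys.modules))"
    )
    proc = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    out = proc.stdout.strip()
    loaded = set(out.split(",")) if out else set()
    return {name: name in loaded for name in names}
```

For example, comparing `modules_loaded_by("import shutil", ["fnmatch"])` with `modules_loaded_by("import glob", ["fnmatch"])` shows whether both modules pull in `fnmatch` on a given build.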
Misc/NEWS.d/next/Library/2024-08-31-11-12-49.gh-issue-118761.Ai_Ma1.rst
I think this probably shouldn't have a NEWS entry, because the
1.6 ms seems suspiciously fast. Are you sure the module is not being cached? I don't think `fnmatch` is the issue here: in the icicle graph I posted above, I think it is already counted as part of `glob`. Most of the import time of `shutil` comes from importing the compression modules (`lzma` etc.). Measured with
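One way to see where import time goes is CPython's `-X importtime` option, which writes one line per imported module to stderr. This is a sketch under that assumption and is not necessarily the tool used for the measurement above; the helper names `parse_importtime` and `import_breakdown` are made up for illustration.

```python
import subprocess
import sys

PREFIX = "import time:"


def parse_importtime(stderr_text):
    """Return {module: cumulative_microseconds} from -X importtime output."""
    timings = {}
    for line in stderr_text.splitlines():
        if not line.startswith(PREFIX):
            continue
        fields = line[len(PREFIX):].split("|")
        if len(fields) != 3:
            continue
        try:
            cumulative = int(fields[1])
        except ValueError:
            continue  # header line: "self [us] | cumulative | imported package"
        timings[fields[2].strip()] = cumulative
    return timings


def import_breakdown(statement):
    """Run `statement` in a fresh interpreter with -X importtime enabled."""
    proc = subprocess.run(
        [sys.executable, "-X", "importtime", "-c", statement],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_importtime(proc.stderr)
```

Calling `import_breakdown("import pathlib")` and looking up entries such as `shutil`, `glob`, or `lzma` (when present) gives the cumulative cost attributed to each on the build being tested.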
Ideally use PGO+LTO. On macOS with PGO+LTO, running You can see the big Before: After: And with hyperfine, this branch is about 1.5 ms faster than
❯ hyperfine --warmup 32 \
--prepare "git checkout speedup-pathlib" './python.exe -c "import pathlib"' \
--prepare "git checkout main" './python.exe -c "import pathlib"'
Benchmark 1: ./python.exe -c "import pathlib"
Time (mean ± σ): 14.8 ms ± 0.6 ms [User: 11.9 ms, System: 2.4 ms]
Range (min … max): 14.1 ms … 18.2 ms 96 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs.
Benchmark 2: ./python.exe -c "import pathlib"
Time (mean ± σ): 16.4 ms ± 0.8 ms [User: 13.1 ms, System: 2.7 ms]
Range (min … max): 15.6 ms … 23.2 ms 91 runs
Warning: Statistical outliers were detected. Consider re-running this benchmark on a quiet system without any interferences from other programs.
Summary
./python.exe -c "import pathlib" ran
1.11 ± 0.07 times faster than ./python.exe -c "import pathlib"
Co-authored-by: Hugo van Kemenade <[email protected]>
Thank you all for the reviews, I believe I addressed all the comments.
With caching it's in the realm of nanoseconds on my laptop. But my specs are a bit... biased sometimes:
EDIT: It's a bit weird that pyperf reports timings in the realm of 1.6 ms but hyperfine does not:
I'm approving the PR, though it does not change much on my system. But since it may affect other users, I think it's worth the change.
Thanks for this!
hyperfine also measures the overhead of Python startup. Another possible explanation for why you don't see a difference: do you have the compression modules available? You can try
I do have them available :( It's an interesting observation though, but I think it's my problem.
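Since a whole-process timer such as hyperfine includes interpreter startup, one rough way to separate the two is to time a fresh interpreter running `pass` and another running `import pathlib`, then compare the means. This is only a sketch: the function name `time_fresh_interpreter` and the run count are assumptions, and wall-clock timings of subprocesses are noisy.

```python
import statistics
import subprocess
import sys
import time


def time_fresh_interpreter(code, runs=20):
    """Return (mean, stdev) wall-clock seconds for running `code`
    in a new interpreter process each time, so nothing is cached
    in sys.modules between runs."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run([sys.executable, "-c", code], check=True)
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples), statistics.stdev(samples)
```

The difference between the mean for `"import pathlib"` and the mean for `"pass"` approximates the import cost alone, which is the quantity the pyperf command is trying to measure.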
~15% of pathlib import time (1.6 ms on my machine) is spent on importing `shutil`. Since this module is only used in one place, it makes sense to defer its import. Incidentally, this also makes the code more explicit imho; it took me a while to understand how the current code works.
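The general pattern of deferring an import to its single call site looks roughly like the sketch below. It is an illustration only, not pathlib's actual code: `remove_tree` is a hypothetical helper standing in for the one place that needs `shutil`.

```python
import os
import tempfile


def remove_tree(path):
    """Hypothetical helper: the only function in this module needing shutil."""
    # Deferred import: shutil (and whatever it imports in turn) is loaded
    # on the first call, not when the enclosing module is imported.
    # Later calls find it in sys.modules, so the repeated import is cheap.
    import shutil

    shutil.rmtree(path)


# Usage: create a scratch directory, then delete it through the helper.
scratch = tempfile.mkdtemp()
remove_tree(scratch)
removed = not os.path.exists(scratch)
```

The trade-off is that the first call pays the import cost instead of module import time, which suits modules that most callers never use.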