Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-109653: Speedup import of threading module #114509

Merged
merged 3 commits into from
Jan 31, 2024

Conversation

danielhollas
Copy link
Contributor

@danielhollas danielhollas commented Jan 23, 2024

Delayed import of functools speeds up the import threading by ~50% (2ms -> 1ms) in my testing.

Since the functools module is only used in the internal _register_atexit function that is called by concurrent.futures, this seems like a worthwhile win for users of threading module who do not use asyncio.

Part of #109653

CC @AlexWaygood

Delayed import of functools leads to 50% speedup
of import time.
@bedevere-app
Copy link

bedevere-app bot commented Jan 23, 2024

Most changes to Python require a NEWS entry. Add one using the blurb_it web app or the blurb command-line tool.

If this change has little impact on Python users, wait for a maintainer to apply the skip news label instead.

@danielhollas
Copy link
Contributor Author

danielhollas commented Jan 24, 2024

To be precise, compiling python with ./configure --enable-optimizations and measuring with python -Ximporttime -c "import threading", I am getting 2.44ms on main and 1.17ms on this PR.

@Eclips4 Eclips4 added the performance Performance or resource usage label Jan 24, 2024
@ajoino
Copy link

ajoino commented Jan 24, 2024

Just a thought, is it even necessary to use functools.partial here? Could this not be replaced with lambda: f(*args, **kwargs), that would avoid importing functools at all. Is there something I'm missing here?

@AlexWaygood
Copy link
Member

Just a thought, is it even necessary to use functools.partial here? Could this not be replaced with lambda: f(*args, **kwargs), that would avoid importing functools at all. Is there something I'm missing here?

This was my thought as well on first seeing the patch. functools.partial can be faster than a lambda function, but here I doubt it makes a significant difference. (If we wanted to check whether using a lambda here slowed things down, we'd need to do a benchmark using concurrent.futures, since the concurrent.futures module is the only public API that makes use of this private API. It might be possible to write such a benchmark, but it also might be difficult -- not sure.)

@danielhollas
Copy link
Contributor Author

functools.partial can be faster than a lambda function, but here I doubt it makes a significant difference. (If we wanted to check whether using a lambda here slowed things down, we'd need to do a benchmark using concurrent.futures, since the concurrent.futures module is the only public API that makes use of this private API. It might be possible to write such a benchmark, but it also might be difficult -- not sure.)

Looking at the code, the threading.register_atexit() is only ever called during concurrent.futures import, so I would assume any performance difference here would be marginal?

@AlexWaygood
Copy link
Member

Looking at the code, the threading.register_atexit() is only ever called during concurrent.futures import, so I would assume any performance difference here would be marginal?

Oh, great point 😄

In that case, let's just go with a lambda here -- it seems simpler :)

Lib/threading.py Outdated Show resolved Hide resolved
Copy link
Member

@AlexWaygood AlexWaygood left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks! I'd love to check with a core dev more familiar with subinterpreters before merging, though (since this feature was specifically added to help with subinterpreter support).

@ericsnowcurrently, there's no reason why switching to a lambda rather than functools.partial could be problematic for subinterpreter support, is there?

@danielhollas
Copy link
Contributor Author

@AlexWaygood thanks!

@ericsnowcurrently, there's no reason why switching to a lambda rather than functools.partial could be problematic for subinterpreter support, is there?

Just a note, if this was a problem, we could still get away with it by simply not doing either: the function is (at least currently) being called without any extra *args or **args arguments so we could make _register_atexit less general and simply pass the callback function directly to _threading_atexits list.

@danielhollas danielhollas changed the title gh-109653: Speedup import of threading module gh-109653: Speedup import of threading module Jan 24, 2024
@AlexWaygood
Copy link
Member

I can't see a way in which this would cause problems — I'll go ahead and merge, since it's been a few days :)

Thanks @danielhollas!

@AlexWaygood AlexWaygood merged commit 5e390a0 into python:main Jan 31, 2024
31 checks passed
@danielhollas danielhollas deleted the import-threading-speedup branch January 31, 2024 10:59
@ericsnowcurrently
Copy link
Member

@ericsnowcurrently, there's no reason why switching to a lambda rather than functools.partial could be problematic for subinterpreter support, is there?

I'm not aware of any such reason.

aisk pushed a commit to aisk/cpython that referenced this pull request Feb 11, 2024
Avoiding an import of functools leads to 50% speedup of import time.

Co-authored-by: Alex Waygood <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Performance or resource usage
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants