Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[MRG] FEA Make it possible to pass pre-inspected modules to threadpool_limit #95

Merged
merged 20 commits into from
Sep 24, 2021

Conversation

jeremiedbb
Copy link
Collaborator

@jeremiedbb jeremiedbb commented Sep 16, 2021

Fixes #94

In addition, makes some stuff public and get rid of the "module" naming, as suggested by @ogrisel

  • _ThreadpoolInfo -> ThreadpoolController
  • _Module -> LibController
  • _<Lib>Module -> <Lib>Controller

The possibility to pass an existing controller (which has already loaded the libraries) to threadpool_limits makes it much faster (like 100x faster):

In [1]: from threadpoolctl import ThreadpoolController, threadpool_limits

In [2]: controller = ThreadpoolController()

In [3]: %%timeit
   ...: with threadpool_limits(1):
   ...:     pass
   ...: 
190 µs ± 744 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)

In [4]: %%timeit
   ...: with threadpool_limits(1, controller=controller):
   ...:     pass
   ...: 
1.43 µs ± 9.68 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)

Although it's already quite short, it's not negligible in some situations (I already experienced that in scikit-learn).

TODO

  • add tests
  • finish doc
  • change log

@jeremiedbb
Copy link
Collaborator Author

@ogrisel, finally I chose to allow threadpool_limits to accept a controller at init. It felt weird to have a method of ThreadpoolController to just return threadpool_limits(controller=self). The context manager holds a controller, not the opposite.
I'm still open to discussion and it's easy to add.

@ogrisel
Copy link
Contributor

ogrisel commented Sep 16, 2021

From a user point of view, I would find more natural to have:

from threadpoolctl import ThreadpoolController

controller = ThreadpoolController()
with controller.limited_context(1):
    # do stuff here

We can keep the top-level threapoolctl.threadpool_limits public context manager factory in that would spawn its own internal controller each time for backward compat and convenience.

Copy link
Contributor

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me ctl is a shortcut for "control" rather than "controller". And here the "old" modules are controllers, so I would explicitly name them as such.

While we are at it, it the longer _controller local variable names make it a hell to reformat to PEP8 manually, feel free to blackify everything while we are at it :)

tests/test_threadpoolctl.py Outdated Show resolved Hide resolved
threadpoolctl.py Outdated Show resolved Hide resolved
@jeremiedbb
Copy link
Collaborator Author

jeremiedbb commented Sep 17, 2021

I extracted the blackify part in another PR to distinguish between the 2 big source of diff. Review will be easier starting by #96

@ogrisel
Copy link
Contributor

ogrisel commented Sep 23, 2021

We can ignore the lint issues because they will be fixed in the black PR.

@jeremiedbb this PR is still WIP, do you still have changes to implement in mind, or is it ready for review?

Copy link
Contributor

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you very much for this PR. I like the design and naming.

In addition to the changelog, I think the new extended object oriented public API should be presented in a code snippet in the README.md of the project.

tests/test_threadpoolctl.py Outdated Show resolved Hide resolved
tests/test_threadpoolctl.py Outdated Show resolved Hide resolved
threadpoolctl.py Outdated Show resolved Hide resolved
threadpoolctl.py Outdated Show resolved Hide resolved
tests/test_threadpoolctl.py Outdated Show resolved Hide resolved
@ogrisel
Copy link
Contributor

ogrisel commented Sep 23, 2021

For the record, I am fine with renaming .todicts() to .infos() if you wish @jeremiedbb.

@jeremiedbb jeremiedbb changed the title [WIP] Make it possible to pass pre-inspected modules to threadpool_limit [MRG] FEA Make it possible to pass pre-inspected modules to threadpool_limit Sep 23, 2021
Copy link
Contributor

@ogrisel ogrisel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the conflicts have been solved and the suggestions below have been handled.

README.md Show resolved Hide resolved
README.md Show resolved Hide resolved
@ogrisel ogrisel merged commit 0ef9667 into joblib:master Sep 24, 2021
@ogrisel
Copy link
Contributor

ogrisel commented Sep 24, 2021

Merged! Thank you very much @jeremiedbb . Could you please try to leverage this in a PR in scikit-learn using a dummy backward compat wrapper in sklearn.utils.fixes to check that it works as expected? If so we can release threadpoolctl 3.0.0.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Make it possible to pass pre-inspected modules to threadpool_limit
2 participants