Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Delay scikit-learn import until first use #1061

Merged
merged 7 commits into from
Mar 3, 2023

Conversation

wshanks
Copy link
Collaborator

@wshanks wshanks commented Feb 28, 2023

Summary

This commit pushes the import of scikit-learn lower into the code, so that it does not occur until first usage.

Details and comments

Previously, scikit-learn was in a try block which allowed for code that did not use scikit-learn to work fine when it was not installed. However, sometimes scikit-learn can be installed but have errors (like in #1050) which were not caught by the try block. Further delaying the import can help in this case. Additionally, scikit-learn is a little bit of a slow import, so not importing it when it is not needed gives a little bit of efficiency (maybe; mostly it imports scipy modules but those might get imported any way by other analysis code).

@itoko
Copy link
Contributor

itoko commented Mar 1, 2023

Maybe we could use lazy import functions defined in Terra (qiskit.utils.optionals) https://qiskit.org/documentation/apidoc/utils.html#module-qiskit.utils.optionals
It appears HAS_SKLEARN is already defined.

@wshanks
Copy link
Collaborator Author

wshanks commented Mar 1, 2023

I like that idea. I will try that tomorrow.

@wshanks
Copy link
Collaborator Author

wshanks commented Mar 1, 2023

This should be okay to review now. A couple of the files have big diffs in GitHub because the line endings got changed from Windows to Unix. I can revert that if that helps.

I used terra's LazyImporter but not its HAS_SKLEARN. LazyImporter tries to check for specific elements in the imported module and we use different ones from terra. It might be overkill to check for specific objects any way. I used the LazyImoprter in the tests as well, though they test for sklearn differently -- to decide whether to skip the tests or not. I made them consistent about skipping sklearn tests.

@wshanks
Copy link
Collaborator Author

wshanks commented Mar 2, 2023

I added a test that the scikit-learn check gets used when scikit-learn can not be imported. I used a subprocess to do this which means that the test takes 1.4 seconds which is not my favorite, but we have a lot of tests that are slower than that at the moment any way. If we wanted to test other optional imports we could add them into that one subprocess.

@wshanks wshanks added Changelog: None Do not include in changelog Changelog: Bugfix Include in the "Fixed" section of the changelog and removed Changelog: None Do not include in changelog labels Mar 2, 2023
@wshanks
Copy link
Collaborator Author

wshanks commented Mar 2, 2023

I decided to add a release note since this helps mitigate an issue that users reported (#1050).

I moved this text out of the original description above since it doesn't need to go in the commit message:

I would not say this PR fully addresses the problem in #1050. For that, we need to ensure that the either the IBM Quantum Lab environment is updated to have a working verison of scikit-learn (we want that any way to support the experiments that require scikit-learn) or to have the newer version of qiskit-experiments with this PR so that all the non-scikit-learn experiments work. One would hope both packages can be updated.



class SkLDA(BaseDiscriminator):
"""A wrapper for the SKlearn linear discriminant analysis."""
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These classes should probably have a note in the docstring that they require sklearn to run.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added notes to the classes that say they require scikit-learn. If you want to get very pedantic, the classes here only require scikit-learn for the from_config methods. The __init__ methods could work with a custom class if it satisfied the Analysis interface of scikit-learn.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's true. In the future it would be helpful to have a how-to that explains how to work with discriminators and implement a custom class.

test/data_processing/test_discriminator.py Outdated Show resolved Hide resolved
@@ -86,3 +89,35 @@ def __init__(self, physical_qubits):
with self.assertWarns(DeprecationWarning):
instance = OldExperiment(qubit=0)
self.assertEqual(instance._physical_qubits, (0,))

def test_warn_sklearn(self):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, so I'm not sure this test does what we want because I ran it with the current main branch and it only threw the "scikit-learn import guard did not run on scikit-learn usage" error. I would have expected qiskit_experiments to not import at all, which is the error that was reported. When I made a fake dummy sklearn that did nothing except throw an exception, that successfully caused the import of qiskit_experiments to fail, so maybe we have to mock the module instead of setting sys.modules["sklearn"] to None.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, with the imports moving down into first use and the guard raising an exception on first use, I didn't think it was as important to test against importing scikit-learn raising a surprising error, but I think it is true that raising something different than ImportError would be better just to make sure no code gets added that does try: import sklearn; except ImportError since we prefer to keep that at first use rather than triggered by import qiskit_experiments (an alternative test would be to do what qiskit-terra does and just check the import tree after import qiskit_experiments).

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a new commit where I override builtins.__import__ to raise a different exception from ImportError. This gives a test that fails on main but passes in this branch.

Copy link
Collaborator

@coruscating coruscating left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

@coruscating coruscating added this pull request to the merge queue Mar 3, 2023
Merged via the queue into qiskit-community:main with commit 0a9da6a Mar 3, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Changelog: Bugfix Include in the "Fixed" section of the changelog
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants