Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TST/TYP: make pandas/tests private #46651

Closed
twoertwein opened this issue Apr 6, 2022 · 14 comments
Closed

TST/TYP: make pandas/tests private #46651

twoertwein opened this issue Apr 6, 2022 · 14 comments
Labels
Deprecate Functionality to remove in pandas Typing type annotations, mypy/pyright type checking

Comments

@twoertwein
Copy link
Member

One long-term goal of pandas is to be a py.typed library #28142 and then also to test whether the entire public API is typed #39813.

When running pyright --ignoreexternal --verifytypes pandas pandas 1.4.1 achieves a type completeness score of around 10%. If pandas/tests were marked as private, it would jump to around 40%! I assume the tests shouldn't be part of the public API.

It seems that the only way to make pandas/tests private is to rename it to pandas/_tests. This is a simple change, but I want to open this issue first, as it might be a slightly controversial change.

@jreback
Copy link
Contributor

jreback commented Apr 6, 2022

well in theory i agree it should be private but it has been public so long it's de-facto public

so would need to deprecate it first (we did this a long time ago for many of the modules themselves ) but for example pandas.core itself should be private but not sure we could change this (sure we could deprecate and eventually change it)

so while a good idea we couldn't actually change these till the next major version

@twoertwein
Copy link
Member Author

Is there a best way to start this deprecation process?

The following works but it will also spam the CI:

pandas/tests/__init__.py:

from warnings import warn

warn(
    "pandas.tests is considered to be private and will be renamed to pandas._tests in the future.",
    FutureWarning,  # or DeprecationWarning (but DeprecationWarnings will not be printed by default)
)

@mroeschke mroeschke added Deprecate Functionality to remove in pandas Typing type annotations, mypy/pyright type checking labels Jul 6, 2022
@jbrockmendel
Copy link
Member

i think the foo/tests/ pattern is really common, not wild about changing it

@simonjayhawkins
Copy link
Member

#30741 somewhat related, if they are private and we deprecate pd.test() as done in #47738, instead of making them private we can perhaps remove them from the distribution.

however agree with @jbrockmendel as I think we sometimes get issues raised when we have a new release that pd.test() is failing as they are used by distributions to test the new release.

@twoertwein
Copy link
Member Author

#30741 somewhat related, if they are private and we deprecate pd.test() as done in #47738, instead of making them private we can perhaps remove them from the distribution.

It wasn't my intention to deprecate pd.test() in #47738. When running python -W default -c "import pandas as pd; pd.test()" the pytest output contains the deprecation warning but it is not printed outside of pytest.

@simonjayhawkins
Copy link
Member

It wasn't my intention to deprecate pd.test() in #47738. When running python -W default -c "import pandas as pd; pd.test()" the pytest output contains the deprecation warning but it is not printed outside of pytest.

not ideal. could be confusing.

does changing pandas/tests/__init__.py not fix the typing issue?

from pandas.tests import extension

__all__ = ["extension"]

@twoertwein
Copy link
Member Author

I found a workaround: trigger the warning and catch it before calling pytest in pd.test().

Defining __all__ would not emit a warning.

@simonjayhawkins
Copy link
Member

simonjayhawkins commented Jul 16, 2022

Defining __all__ would not emit a warning.

yeah. I'm not sure we want to change the directory name. pandas.tests.extension.base is effectively public and we should probably keep it that way.

Also, if we did change the directory name, https://pandas.pydata.org/docs/dev/development/contributing_codebase.html#running-the-test-suite would need updating.

Do other projects have this issue?

@twoertwein
Copy link
Member Author

It is only an "issue" for py.typed libraries that want to have the entire API (which would also include pandas/tests) to be fully typed.

Alternatives could be to 1) slowly type pandas/tests, 2) mark other modules inside of pandas.tests as private (except pandas.tests.extension.base), 3) forget about it for now (far away from py.typed).

@simonjayhawkins
Copy link
Member

  1. mark other modules inside of pandas.tests as private

we can do this (for typing) by doing the change suggested in #46651 (comment) and also adding similar for pandas/tests/extension/__init__.py?

from https://github.com/microsoft/pyright/blob/main/docs/typed-libraries.md#library-interface

A module can expose an all symbol at the module level that provides a list of names that are considered part of the interface. The all symbol indicates which symbols are included in a wildcard import. All symbols included in the all list are considered public even if the other rules above would otherwise indicate that they were private. For example, this allows symbols whose names begin with an underscore to be included in the interface.

@twoertwein
Copy link
Member Author

That doesn't seem to work, pyright doesn't report an error when doing this change.

# pyright: strict

from pandas.tests import test_aggregation
import pandas.tests.test_aggregation

I think __all__ only affects the __init__.py file itself.

A quite invasive change would be to add __all__ = [] to most of the test files.

@jorisvandenbossche
Copy link
Member

Having a package/tests/ directory is a well established pattern, and as @jbrockmendel I am not very keen on changing that (I find it a bit strange that pyright doesn't provide a way to deal with this specific case, since it is so widespread usage)

@twoertwein
Copy link
Member Author

twoertwein commented Jul 18, 2022

I think the best is to keep pandas/tests as-is. In the future when --verifytypes becomes feasible, we can simply (on the CI) delete the installed tests folder and then run pyright against it.

edit: I don't think pyright will have an option to ignore tests (unless there is a PEP for it/agreement amount type checkers) microsoft/pyright#3770

@simonjayhawkins
Copy link
Member

we can simply (on the CI) delete the installed tests folder and then run pyright against it.

another alternative, if we package the tests separately #30741, then they would not need to be installed to run the static type checks?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Deprecate Functionality to remove in pandas Typing type annotations, mypy/pyright type checking
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants