TYP/CI: Demonstrate possible integration of VirtusLab's type tests #45561

twoertwein · 2022-01-23T00:13:36Z

Demonstrate the integration of positive type tests from https://github.com/VirtusLab/pandas-stubs for the public API (tests import
some private modules?).

This PR should NOT be merged and this PR probably violates licenses/copyrights. It is probably best if a member of the VirtusLab re-creates this PR. This PR is intended only to demonstrate how the snippets from VirtusLab could be integrated.

The typing errors would need to be ignored (and if pyright needs further ignores, # pyright: <rulename> = false at the top of a file). Pyright checks could also be skipped for now. There might be better ways to exclude/include files with pyright.

xref #45252

pep8speaks · 2022-01-23T00:13:40Z

Hello @twoertwein! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-01-23 18:06:54 UTC

simonjayhawkins · 2022-01-23T13:35:33Z

Thanks @twoertwein it may be worth getting the typing check to green for the POC. (pep less important)

what's the proposal for dealing with "reportUnusedVariable". would we just disable that check on the test files?

twoertwein · 2022-01-23T14:16:51Z

pandas/tests/typing/valid/test_frame.py

+    df[["col1", "col2"]] = [[1, 2], [3, 4]]
+    df[s] = [5, 6]
+    df[a] = [[1, 2], [3, 4]]
+    df[select_df] = [1, 2, 3]


This does not work: ValueError: cannot assign mismatch length to masked array

Yup. Investigating this as a possible regression from pandas version 1.3.2.

Type tests finding non-typing bugs :) If this worked in previous versions, feel free to open an issue!

That's not first time that this happened. See #40440 originally spotted by @joannasendorek.

I've opened up an issue: #45593. I'll circumvent this for now in my PR which is pending.

twoertwein · 2022-01-23T15:26:29Z

what's the proposal for dealing with "reportUnusedVariable". would we just disable that check on the test files?

I now limit pyright to the same config as we use everywhere else, but I enable its general type error checking (the most important option).

edit:
There are still a few cases where mypy and pyright are not in sync: if a function has no return annotation, pyright tries to infer one. In case of read_csv: pyright infers TextFileReader | Unknown (Unkown since, it doesn't know what parser.read() returns), correct would be TextFileReader | DataFrame. If mypy had TextFileReader | DataFrame as a return annotation, it would also create a few more errors.

twoertwein · 2022-01-23T19:01:29Z

Thanks @twoertwein it may be worth getting the typing check to green for the POC. (pep less important)

@simonjayhawkins Type checks pass now (except one issue existing on main) and the CI is otherwise mostly green (except the one test mentioned above).

twoertwein · 2022-01-23T19:03:56Z

pandas/tests/typing/valid/test_pandas.py


    rdf4: pd.DataFrame = pd.concat(list(map(lambda x: s2, ["some_value", 3])), axis=1)


 def test_types_json_normalize() -> None:
    data1: List[Dict[str, Any]] = [
        {"id": 1, "name": {"first": "Coleen", "last": "Volk"}},
-        {"name": {"given": "Mose", "family": "Regner"}},
+        {"name": {"given": "More", "family": "Regner"}},


to make codespell happy

zkrolikowski-vl · 2022-01-24T12:23:17Z

@twoertwein So as more stubs are added we'll keep removing the type: ignore comments?

Let me look at the the invalid test case, and then I'll open a PR of my own based on this one - based on your preference. I agree that it's probably realistically more straight-forward in terms of the copyright/license.

twoertwein · 2022-01-24T13:36:20Z

@twoertwein So as more stubs are added we'll keep removing the type: ignore comments?

I think that is the idea. Ideally, we want to achieve that with inline annotations (and pyi for cython/c). Knowing where we need ignores also helps us to prioritize annotating code.

Let me look at the the invalid test case, and then I'll open a PR of my own based on this one - based on your preference. I agree that it's probably realistically more straight-forward in terms of the copyright/license.

Great! You could also create a PR where multiple people from the VirtusLab are involved, in case attribution is important. You are welcome to either copy&past my follow-up commit or apply it as a patch, which ever you prefer.

Dr-Irv · 2022-01-24T15:03:12Z

@twoertwein Thanks for doing this.

It seems that this would use the code from Virtus Labs to check our existing type stubs within the pandas code. Then when we brought over the pyi files from the Microsoft stubs, it would do the same. Am I correct about that?

twoertwein · 2022-01-24T15:26:28Z

It seems that this would use the code from Virtus Labs to check our existing type stubs within the pandas code.

Yes, this simply copies the test snippets from Virtus Labs (but not their pyi files). This PR tries to address only #45252 and not #45253.

Then when we brought over the pyi files from the Microsoft stubs, it would do the same. Am I correct about that?

I'm sorry, I don't understand what you mean. Merging some of the Microsoft pyi files for cython files (e.g., offesets.pyi #45331 and interval.pyi #44922) might help reduce the number of ignores.

Dr-Irv · 2022-01-24T15:43:37Z

Then when we brought over the pyi files from the Microsoft stubs, it would do the same. Am I correct about that?

I'm sorry, I don't understand what you mean. Merging some of the Microsoft pyi files for cython files (e.g., offesets.pyi #45331 and interval.pyi #44922) might help reduce the number of ignores.

I was referring to #45253 .

Following up on the discussion we had at the January developer meeting, @simonjayhawkins brought up that our current typing tests are meant to test the types of our internals, while if we create .pyi files throughout the code, which benefit external users, it could hurt some of that internal testing. So I had the following idea, which I don't know whether it is possible. Assuming we have a bunch of .pyi files throughout the pandas code, which is how we distribute the code, what if we ran mypy and pyright without them (maybe just keeping the ones corresponding to cython code) to check the internals, and then we run the type checking again, using the tests here (typing/valid and typing/invalid) on the full code with all pyi files present. Then we have a testing scheme for the internals and for the external pyi files.

twoertwein · 2022-01-24T16:04:00Z

If we have two sets of type annotations (more complete public pyi files and the in-progress inline annotations), it would be great to test both!

If pyi files take precedence over inline (I'm not sure about that), your idea should be easily possible (run mypy&pyright twice, but for the second time delete some of the pyi files).

Dr-Irv · 2022-01-24T16:22:58Z

If we have two sets of type annotations (more complete public pyi files and the in-progress inline annotations), it would be great to test both!

Yes, I agree with that.

If pyi files take precedence over inline (I'm not sure about that), your idea should be easily possible (run mypy&pyright twice, but for the second time delete some of the pyi files).

From what I understand the pyi files do take precedence, and I was thinking about the deletion idea as a way to satisfy both goals of type checking - internal for what we put in the code and external for what is in the pyi files.

twoertwein · 2022-01-24T17:22:07Z

One option could be to add only as many external pyi files to satisfy the public tests. I think type checkers don't mind a mix of pyi and inline (a.pyi but b.py has inline). Over time, we could then try to completely remove the external pyi files.

twoertwein · 2022-01-24T22:51:47Z

Closing this PR in favor of #45594.

twoertwein mentioned this pull request Jan 23, 2022

ENH: Add set of tests to type check the public APIs #45252

Open

copy pandas-stubs/tests/snippets + black&isort

d7217ad

twoertwein commented Jan 23, 2022

View reviewed changes

configure mypy/pyright/pytest, run pyupgrade, mypy ignores, skip tests

6ee8857

twoertwein commented Jan 23, 2022

View reviewed changes

twoertwein closed this Jan 24, 2022

zkrolikowski-vl mentioned this pull request Feb 18, 2022

ENH: Add typing tests according to the issue 45252 #45594

Closed

4 tasks

twoertwein deleted the type_test branch March 9, 2022 02:56

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

TYP/CI: Demonstrate possible integration of VirtusLab's type tests #45561

TYP/CI: Demonstrate possible integration of VirtusLab's type tests #45561

twoertwein commented Jan 23, 2022

pep8speaks commented Jan 23, 2022 •

edited

Loading

simonjayhawkins commented Jan 23, 2022

twoertwein Jan 23, 2022

zkrolikowski-vl Jan 24, 2022 •

edited

Loading

twoertwein Jan 24, 2022

zkrolikowski-vl Jan 24, 2022

twoertwein commented Jan 23, 2022 •

edited

Loading

twoertwein commented Jan 23, 2022

twoertwein Jan 23, 2022

zkrolikowski-vl commented Jan 24, 2022

twoertwein commented Jan 24, 2022

Dr-Irv commented Jan 24, 2022

twoertwein commented Jan 24, 2022

Dr-Irv commented Jan 24, 2022

twoertwein commented Jan 24, 2022

Dr-Irv commented Jan 24, 2022

twoertwein commented Jan 24, 2022

twoertwein commented Jan 24, 2022

TYP/CI: Demonstrate possible integration of VirtusLab's type tests #45561

TYP/CI: Demonstrate possible integration of VirtusLab's type tests #45561

Conversation

twoertwein commented Jan 23, 2022

pep8speaks commented Jan 23, 2022 • edited Loading

Comment last updated at 2022-01-23 18:06:54 UTC

simonjayhawkins commented Jan 23, 2022

twoertwein Jan 23, 2022

Choose a reason for hiding this comment

zkrolikowski-vl Jan 24, 2022 • edited Loading

Choose a reason for hiding this comment

twoertwein Jan 24, 2022

Choose a reason for hiding this comment

zkrolikowski-vl Jan 24, 2022

Choose a reason for hiding this comment

twoertwein commented Jan 23, 2022 • edited Loading

twoertwein commented Jan 23, 2022

twoertwein Jan 23, 2022

Choose a reason for hiding this comment

zkrolikowski-vl commented Jan 24, 2022

twoertwein commented Jan 24, 2022

Dr-Irv commented Jan 24, 2022

twoertwein commented Jan 24, 2022

Dr-Irv commented Jan 24, 2022

twoertwein commented Jan 24, 2022

Dr-Irv commented Jan 24, 2022

twoertwein commented Jan 24, 2022

twoertwein commented Jan 24, 2022

pep8speaks commented Jan 23, 2022 •

edited

Loading

zkrolikowski-vl Jan 24, 2022 •

edited

Loading

twoertwein commented Jan 23, 2022 •

edited

Loading