ENH: Add set of tests to type check the public APIs #45252
see #40202 for a less verbose but more primitive approach. Using pytest is perhaps more robust. How easy is it to parametrize testing of types? I assume that parametrization cannot be used for static checks. The method in #40202 has several issues. If anyone knows of other approaches to testing types, I would much appreciate suggestions posted here. |
from #40202 (comment)
repeating here so that this option is not overlooked if the solution gets too complex. |
One potentially easier to implement option is to have something like mypy_primer but for pandas. Typeshed uses mypy_primer to automatically comment on each PR whether it would change type annotations as inferred by mypy. "Pandas_primer" would help to 1) prevent involuntary changes to public type annotations and to 2) verify that intended changes actually have an effect. |
In pandas-stubs, we've decided to go with the simplest method, which doesn't require any extra setup or testing constructs. Here are some of the other tools we've looked at (not including dynamic type-checkers):
Coming back to our solution, there's a way to test negative scenarios in our approach but it's far from elegant:
This will cause no error, whereas using it incorrectly
will result in an error
The issue that remains is that our whole approach is static and I don't see it being easily integrated with a parametric list of use-cases. mypy_primer seems like another interesting approach; potentially difficult to integrate into the pandas CI but might be worth the extra effort. Overall it seems like either pytest-mypy-testing / @simonjayhawkins 's (with some work) or our (pandas-stubs) solution might be a semi-immediate fit for the pandas use-case. @simonjayhawkins if I may ask: Why was your work discontinued? Was it just inconvenient at the time? |
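For context, a negative-scenario snippet in this style might look roughly like the sketch below. This is a hypothetical example (it assumes the stubs annotate `ascending` as `bool`): the incorrect call is silenced with `# type: ignore`, and running mypy with `warn_unused_ignores` means the ignore comment is itself reported once the stubs stop rejecting the call.

```python
# Hypothetical snippet, intended for the type checker; the incorrect call
# would also raise ValueError if executed.
import pandas as pd

s = pd.Series([3, 1, 2])

# Correct usage: must type-check cleanly against the stubs.
s.sort_values(ascending=False)

# Incorrect usage: the stubs should reject a string here.  The ignore comment
# silences the expected error; with warn_unused_ignores enabled, mypy flags
# the comment itself as soon as the stubs stop rejecting this call.
s.sort_values(ascending="yes")  # type: ignore
```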
Here is why I like the approach taken by the
From my experience in submitting a bunch of PR's to the Microsoft |
The approach we took in pandas-stubs was the most pragmatic for our usage and for the capacity we have to maintain the project. We can assume that any stubs will not be perfect, and when people first start using them, they will get some errors (I applied our stubs to one thousand-line project and made 10 fixes to pandas-stubs based on that alone). So whenever this happens, we would first re-create the false-negative case provided, modify the stubs and make sure the exact usage is corrected. From that moment, this particular usage is guarded, so people using the project shouldn't notice regressions, and we can add small improvements incrementally. |
abandoned due to #40202 (comment). In pandas we tend not to merge stuff if there are any objections. I assumed from that position and the notes from the meeting...
that the pandas-stubs would be integrated into our pytest tests, but I guess I got that wrong from not attending the meeting. @Dr-Irv I assume from #45252 (comment) that your objections to the previous attempt no longer apply? |
What we decided is that the testing approach from
I'm not sure which objections you are referring to.... What I said there is that if we could annotate all of our existing tests (which means adding Since then, I've seen the |
specifically from #40202 (comment)
IIUC the pandas-stubs approach, if not using pytest, is not so different from #40202, i.e. we need to enumerate all the possibilities and have a separate set of tests. also from #40202 (comment)
The pandas-stubs testing approach, if not integrated into the current testing, also does not satisfy this. |
I was more referring to using the
I don't think we have to enumerate all the possibilities. We can start with a set of tests that represent "common" usage, and if the stubs are not covering some particular use case (by being too wide or too narrow), we can just add to those stubs.
That's correct. But it's a first step forward. I would like it if we could put typing into our testing code, and just run |
+1 At the end of the day it's not so different from the previous attempt. We would at least have some testing and the tests in place (esp. for the overloads). Changing those tests at a later date from, say, one style of check to a stricter one (see the sketch below) would not be difficult and would improve the robustness of the checks. |
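The exact before/after snippets in the comment above were not preserved, but the contrast under discussion is presumably between the compatibility-only "assignment" style and a more explicit numpy-style check. A minimal sketch, assuming `reveal_type` is imported from `typing_extensions` so the file also runs:

```python
import pandas as pd
from typing_extensions import reveal_type

s = pd.Series([1, 2, 3])

# "Assignment" style: passes as long as the inferred return type is merely
# *compatible* with the annotation, so a too-wide return type can slip through.
s2: pd.Series = s.abs()

# numpy-style: the type revealed by the checker is compared against an
# expected string by a helper script (or, later, made exact with assert_type).
reveal_type(s.abs())
```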
If we were to go that route, since the notation used with |
sure.
potential future possibility, not suggesting that we would do it, or that the discussion blocks the addition of static tests. If the (I'll call it) assignment method works sufficiently well, then we may not need the additional complexity. In #45252 (comment), I said..
We are using reveal_type so that we can check the return types explicitly. However, the problem is that, in the script we are using
so IIRC there are cases which pass that shouldn't, since we are not checking equality to ensure an explicit check is made. The "assignment method" checks type compatibility only, so I imagine there could potentially be more cases that are not checked explicitly. For instance, if we have ... If we find that this less explicit checking is not an issue, then there would be no reason to make the testing more complex. |
You could catch it this way:

```python
var: typeA | typeB = func()
varA: typeA = func()  # type: ignore
varB: typeB = func()  # type: ignore
```

If we use the
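To make the pattern concrete, here is a hypothetical, self-contained version (the function name and its return annotation are illustrative, not taken from pandas-stubs):

```python
from __future__ import annotations

import pandas as pd


def func() -> pd.Series | pd.DataFrame:  # hypothetical annotation under test
    return pd.Series([1, 2, 3])


# Passes because the declared return type is the full union.
var: pd.Series | pd.DataFrame = func()

# Each narrower assignment is expected to be an error, hence the ignores.
# If the return annotation were narrowed to just pd.Series, the first ignore
# would become unused and be flagged when mypy runs with warn_unused_ignores.
varA: pd.Series = func()  # type: ignore
varB: pd.DataFrame = func()  # type: ignore
```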
I think this is just an issue of how complete we want the tests to be. If we try to make them fully complete, we'll never get done. So I say we start with a set of tests that include common usage of pandas, and expand those tests as issues get identified. |
exactly. The very reason #40202 was opened as an alternative to typing the tests in the first place! |
from #40202 (comment)
so could maybe skip the
step mentioned in the meeting discussion and get these in sooner by copying across what they already have and iterate on that? |
I'm pretty sure this was a suggestion from @jreback . Currently, there are 6 files there, and the question is whether it's worth the effort to split them up. |
I think a file (x3) for each pandas public class (Series, DataFrame, Timestamp, Interval, etc.), then a file for the top-level functions (maybe split that one), and then somewhere for some internals (unit) testing, maybe a file or subdirectory. |
Can you explain the "(x3)" in the above? Are you saying there would be files each for |
The x3 comes from pass/fail/reveal. Forget that for now to simplify the discussion. So let's say one file per class, with 1 test function per method (or maybe 2 for pass/fail?). There are some classes with so many methods that I think splitting the methods may be noisy? |
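A rough sketch of what "one file per class, a couple of test functions per method" could look like (the file path, function names, and the pass/reveal/fail split are all hypothetical, loosely following the pandas-stubs snippet convention):

```python
# Hypothetical tests/snippets/test_series.py
import pandas as pd
import pytest
from typing_extensions import reveal_type


def test_series_abs_pass() -> None:
    s = pd.Series([-1, 2, -3])
    result: pd.Series = s.abs()  # correct usage: compatible return type
    assert result[0] == 1


def test_series_abs_reveal() -> None:
    s = pd.Series([-1, 2, -3])
    reveal_type(s.abs())  # inferred type compared against an expected value


def test_series_abs_fail() -> None:
    s = pd.Series([-1, 2, -3])
    with pytest.raises(TypeError):
        # The stubs should reject the extra argument; the ignore silences
        # that, and pytest checks the runtime failure when the snippet runs.
        s.abs("unexpected")  # type: ignore
```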
basically, what pandas-stubs already has lgtm. It's close enough to the suggestion above and we can evolve the structure/style once copied across. |
@zkrolikowski-vl I prefer the structure for the tests that you already have so IMO VirtusLab/pandas-stubs#127 is not necessary. I think that keeping the structure more aligned to the public API is preferable (maybe easier for casual contributors) to mirroring either the source code or the pytest testing structure, but I defer to others on this since this was discussed in the meeting. |
@simonjayhawkins I can see that both you and @Dr-Irv are agreeing to leave the structure as is. It's definitely easier to read right now. My only concern is that as more and more tests are added, the files will either become very long or will have to be split with semi-subjective names. I'll close VirtusLab/pandas-stubs#127 for now. Do we have an idea of how to integrate the tests with the CI? Looks like |
My suggestion is as follows. Put your tests in There is also a section at the bottom for |
will pytest execute these? The reason I ask is I don't think we want that? The static tests in https://github.com/VirtusLab/pandas-stubs/tree/master/tests/snippets have functions named |
@Dr-Irv Will do. @simonjayhawkins Yes. AFAIK pytest will pick them up as well. That's the way we've been testing if the snippets execute correctly. |
We have very few options of pyright enabled at the moment (most notably |
So the only issue here is that we will probably add tests for typing to test if a type is too wide, and put a `# type: ignore` on the deliberately incorrect call. For example:

```python
>>> s = pd.Series([1,2,3])
>>> s.corr(pd.Series([4,5,6]), method="foobar")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "C:\Anaconda3\envs\b38np122try2\lib\site-packages\pandas\core\series.py", line 2512, in corr
    raise ValueError(
ValueError: method must be either 'pearson', 'spearman', 'kendall', or a callable, 'foobar' was supplied
```

So when we write

```python
s.corr(pd.Series([4,5,6]), method="foobar")  # type: ignore
```

so that the type checker is happy, the call still fails at runtime, as shown above. Maybe that means we should have a |
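One way to cover that combination might be to pair the `# type: ignore` with a runtime assertion, so the snippet both exercises the stubs and verifies that pandas still rejects the value. A sketch, assuming the stubs restrict `method` to the valid literals or a callable:

```python
import pandas as pd
import pytest


def test_series_corr_method_too_wide() -> None:
    s = pd.Series([1, 2, 3])
    other = pd.Series([4, 5, 6])
    # The ignore keeps the type checker happy for the deliberately wrong call,
    # while pytest.raises confirms that pandas rejects it at runtime.
    with pytest.raises(ValueError):
        s.corr(other, method="foobar")  # type: ignore
```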
I've run into a challenge related to how pyright is configured in pandas: https://github.com/pandas-dev/pandas/blob/main/pyproject.toml#L150-L185 mainly because of those lines:
I've checked, and the exclusion list is stronger than the inclusion list, so it's not possible to make an exception to the exception. Other than that, I would have to enable most of the ... Due to this I'm thinking of having a separate configuration just for the type tests themselves. I'm not sure if that's even possible but I'll continue on this topic. |
Instead of exclude
I think it should be sufficient to use Testing with mypy is already enough for a first version. You don't need to worry about pyright for now - unless you want to :) edit: |
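If running mypy is enough for a first version, one possible way to hook the snippets into the existing test run is to invoke mypy programmatically from a pytest test. This is only a sketch; the directory layout is an assumption:

```python
import pathlib

from mypy import api


def test_type_snippets_pass_mypy() -> None:
    # Hypothetical location of the copied-over snippets.
    snippets = pathlib.Path(__file__).parent / "snippets"
    stdout, stderr, exit_status = api.run([str(snippets)])
    assert exit_status == 0, stdout
```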
@zkrolikowski-vl I created the PR #45561 to demonstrate how the test snippets from VirtusLab could be integrated. |
@twoertwein Opened my own PR based on yours, with the |
@zkrolikowski-vl When you say "with the |
@Dr-Irv In a sense yes. I've purposefully avoided the situation in which the regression happens. The test case itself still works. |
@Dr-Irv I think you mentioned that the MS stub started using a not-yet supported argument of |
@twoertwein Thanks. It appears that |
cool. this will overcome the deficiencies of the "assignment" method that I commented on in #45252 (comment) and now align the testing with the numpy method (that was done in #40202)? |
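For reference, a sketch of how an exact return-type check could look once `assert_type` is available (it ships in `typing_extensions`, and in `typing` from Python 3.11). Unlike the assignment method, the check fails if the inferred type is anything other than the one named:

```python
import pandas as pd
from typing_extensions import assert_type

s = pd.Series([1, 2, 3])

# assert_type is a no-op at runtime (it just returns its argument), but a type
# checker reports an error unless the inferred type is exactly pd.Series, so a
# too-wide return annotation no longer slips through.
assert_type(s.abs(), pd.Series)
```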
@Dr-Irv can/should we move this issue (and maybe others) to https://github.com/pandas-dev/pandas-stubs, or is it best to keep that repo to stub-specific issues, since I think we could also use the test framework to test the pandas code too? See my comment in the mailing list thread. |
I guess there are two ways to do this.
Or both. If there is testing against the pandas codebase in pandas-stubs, option 1 is also more likely to succeed if automated. |
I believe that to be the case. |
I could see it going either way. From the pandas development standpoint, we would want to make sure that changes to pandas in a PR don't cause the tests in There are other issues here that are purely about public type stubs that could possibly be moved over (and maybe closed). |
I really like the idea of testing I don't think this is a particularly good example, but it might demonstrate the potential issue: Instead of checking |
I think we are not talking about using the
I don't think that you can have some stubs in a |
If the stubs explicitly import them (and re-export them as public functions in |
Just to be clear, my comment #45252 (comment) regards the test framework only. In the discussion about the challenges of making pandas stubs public (mailing list, dev meeting and elsewhere), my support for having a separate stubs package was on the basis that we would have a test framework that could help keep the standalone stubs package and the pandas codebase more consistent. Other than the test framework, we should probably keep the standalone pandas-stubs and inline annotations in pandas decoupled. This will allow us to use all the latest typing features in pandas-stubs as they become available. It will also allow us to continue to add annotations to the pandas code without further constraints (it's quite a challenge already relative to the ease of creating stubs) I suspect that we may need to comment out failing tests (on the pandas side) with a TODO in some cases or just allow the test job to fail if we copy over the tests, but I hope that we wouldn't need to change code on the pandas side or stubs on the pandas-stubs side to share the tests. |
I think we are all in agreement on keeping the public stubs separate.
I'm not sure that the tests in |
I think that is exactly the reason to use the same tests. IMO we should make the annotations as consistent as possible; that was part of the original argument for why we did not have separate stubs before now.
I still disagree with this approach, but am happy to await user feedback once the stubs are public on whether this is a significant issue.
I do expect that we would need to have some allowed failures at first, but fixing these would make the pandas-stubs and in-line type annotations more consistent. |
I'm thinking we can close this, as we are now testing the type checking of the published |
Could also leave this issue open if we hope to have consistency checks between pandas and pandas-stubs. |
From discussion on pandas typing meeting on January 7, 2022 (notes at https://docs.google.com/document/d/1tGbTiYORHiSPgVMXawiweGJlBw5dOkVJLY-licoBmBU/)
Follow up to discussion here: #28142 (comment)
The goal is to have a set of tests that make sure that correct code written by users of the pandas API works with whatever type stubs we publish.
An example of a suggested approach was developed within the pandas-stubs project, based on various code snippets which are then tested with mypy: https://github.com/VirtusLab/pandas-stubs/tree/master/tests/snippets
@zkrolikowski-vl and @joannasendorek from that project have offered to help (nudge, nudge)
Ideally we would add this testing framework to the CI before starting to bring over the stubs.