-
Notifications
You must be signed in to change notification settings - Fork 85
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix CMR-related type hints #510
Conversation
d8d99e1
to
fae6b73
Compare
pre-commit.ci autofix |
pre-commit.ci run |
There seems to be some sort of discrepancy between pre-commit ci's ruff hook and locally running ruff because the pre-commit failure was due to import formatting that doesn't cause a problem locally. Specifically, pre-commit ci wanted to remove a blank line on line 5 of No idea why that's the case, but I'm concerned that this might cause a persistent build issue because now that I've pulled the autofix that I triggered above, and run
That's failing locally on the autofix that pre-commit ci applied because the autofix removed a blank line. If I were to "fix" that locally and re-push, then pre-commit ci would again fail again because my local "fix" would insert the blank line. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, the changes in this file do not materially change the tests. This is simply a refactoring of the code to follow the convention used for integrating vcrpy with Unittest: https://vcrpy.readthedocs.io/en/latest/usage.html#unittest-integration
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, this ends up renaming the cassettes, but the contents of the cassettes have not changed, although 1 cassette was added that didn't previously exist.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this PR cleans up and improves the typing used in earthaccess. The only observation/question I have is if from now on we'll have to write Python interfaces for new classes? (.pyi) Would we need some extra instructions on the "how to contribute" section? Python and typing doesn't feel intuitive for newcomers.
earthaccess/typing_.py
Outdated
from typing import Dict, List, Mapping, Sequence, Tuple | ||
else: | ||
from builtins import dict as Dict, list as List, tuple as Tuple | ||
from collections.abc import Mapping, Sequence |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kind of remember an issue with collections.abc
in the past, I guess this abstraction fixes it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I pulled all imports from typing
, typing_extensions
, and collections.abc
into this module so that throughout the earthaccess code, we simply import from .typing_
(or earthaccess.typing_
) to avoid having to deal with all of the changes that have been made across Python versions.
There are various "collection" types prior to 3.9 that were exposed from the typing
module, but as of 3.9, the recommendation is to import them from collections.abc
. There's also the transition from using List
, Dict
, and Tuple
to list
, dict
, and tuple
, respectively, along with various other changes that are annoying to keep up with.
Therefore, this typing_
module centralizes all such logic in one place, so throughout the rest of the code base, we can ignore it all and import all types from typing_
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the rest of this module, but I'm not sure how I feel about smoothing over the transition to built-in types like this:
from builtins import dict as Dict, list as List, tuple as Tuple
I think I prefer having the codebase use the style of the oldest version of Python we support, i.e. from typing import List
at the moment, then letting Ruff handle autofixing that (using the UP
ruleset IIRC) when we drop support for older versions. With this module helping us transition, Ruff won't be able to apply those fixes, and we'll need to remember to come back and clean up parts of this module when we drop support for 3.8 and no longer need the transitional code.
Anyway, I don't care about this deeply enough to request a change here, this is totally fine :) Just a feeling.
earthaccess/search.py
Outdated
|
||
def point(self, lon: str, lat: str) -> Type[GranuleQuery]: | ||
@override | ||
def point(self, lon: FloatLike, lat: FloatLike) -> Union[Self, Never]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
with FloatLike are "1.2" and 1.2 valid values? I forgot if python_cmr
will parse the string.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, both are valid values, as python_cmr
simply passes the values to the float
function/constructor.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
More specifically, the FloatLike
type alias (Union[str, SupportsFloat]
) indicates that we can pass either an str
value (which could cause an exception if it's a string that cannot be parsed as a float), or any instance of a type that implements the __float__
method, which the float
function/constructor defers to.
Generally speaking, this means that aside from str
(which does not implement __float__
, thus the need for the Union
above), we'll take an int
or a float
as a value.
stubs/cmr/__init__.pyi
Outdated
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we going to have to include interfaces from now on?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what you mean. Do you mean do we have to define interfaces within earthaccess
? Or do you mean do we have to include .pyi
files? Either way, the answer is no. However, that does remind me that I forgot to add an empty py.typed
file to this PR. By adding such a file, we're indicating that our code includes its own type hints, and there is no need for a separate type stub package.
earthaccess/search.py
Outdated
""" | ||
if not isinstance(doi, str): | ||
raise TypeError("doi must be of type str") | ||
|
||
self.params["doi"] = doi | ||
return self | ||
|
||
def instrument(self, instrument: str) -> Type[CollectionQuery]: | ||
def instrument(self, instrument: str) -> Union[Self, Never]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I haven't seen Never
is that the type for not used, None?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Never
(or NoReturn
, prior to Python 3.11) indicates that the function/method may throw an exception.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if I understand Something | Never
annotation. My understanding is the reason to annotate def foo() -> Never:
is to enable mypy to throw an error on f = foo()
, since foo
can't return. But if a function sometimes returns Something
and sometimes returns Never
, what additional checking does that enable?
My understanding is as a "bottom type", Something | Never
is identical to Something
!
I've read an argument that it serves to help document for humans that a possible outcome is an exception, but IMO that's a shortcoming of the type system that shouldn't be documented using the type system, since if you remove the possibility of an exception from the function and forget to also update the annotation, the annotation will be "wrong" (no longer serving its intended purpose) and the typechecker won't know.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to clarify, a return type of Something | Never
does not indicate that it is possible for a function to return Never
. It indicates that the function possibly never returns at all (typically indicative of the possibility of raising an exception instead of returning a value).
However, I agree that Python's type system has shortcomings, so using Never
really only serves to help the reader, but in thinking more about this, and seeing that an IDE may not even show Never
in the function's return type when it pops up info when hovering your mouse pointer over a function call (at least VS Code doesn't show it, perhaps because it is a bottom type), then it would seem that this perhaps is not even helpful to the reader (unless directly reading the source code where the hint is written).
Further, given that a return type of Something | Never
can be very confusing, since in all other cases, a union type indicates the possible types of return values, including Never
doesn't actually makes sense since you can never return Never
.
Therefore, I'm going to drop the use of Never
, but I'll keep the Raises
sections that I added to the docstrings, since that indeed does show up in the little mouse-over popups in IDEs, and is helpful to include in generated docs.
if you remove the possibility of an exception from the function and forget to also update the annotation, the annotation will be "wrong" (no longer serving its intended purpose) and the typechecker won't know.
As an aside, this isn't a particularly strong argument for removing Never
because this issue is not unique to Never
. There are currently plenty of places in the current codebase where mypy doesn't complain, but should, due to incorrect return type hints. In fact, this PR corrected many of them that mypy wasn't complaining about because of the very loose configuration of mypy, so this issue is also a matter of making sure we're also configuring the type checker effectively.
I added a bit of strictness with this PR, but not too much. Only enough to help weed out some of the incorrect hints scoped within the context of this PR. I plan to incrementally increase strictness so that we don't have to touch too many parts of the code at once.
If we don't improve our type hints, then we may be doing more harm than good since incorrect type hints are perhaps worse than no type hints, as with documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Therefore, I'm going to drop the use of
Never
, but I'll keep theRaises
sections that I added to the docstrings, since that indeed does show up in the little mouse-over popups in IDEs, and is helpful to include in generated docs.
That sounds great to me, thanks so much for looking in to IDE behavior around these annotations. These details of the developer experience aren't always paid attention to, and I love to see when they are! ❤️
I added a bit of strictness with this PR, but not too much. Only enough to help weed out some of the incorrect hints scoped within the context of this PR. I plan to incrementally increase strictness so that we don't have to touch too many parts of the code at once.
If we don't improve our type hints, then we may be doing more harm than good since incorrect type hints are perhaps worse than no type hints, as with documentation.
💯 * 💯 Couldn't agree more with the analysis and the approach. We need this and I really appreciate you're spearheading the work!
I want to say I'll review it this weekend, but I have some important paperwork due very soon that I've been postponing for far too long... Let's see how that goes :) |
I appreciate your critical eyes! 👀 |
There were a number of type hints in `search.py` and `api.py` related to CMR queries that were incorrect. These were fixed. In addition, there were a number of other static type errors that were masked because of ignored `cmr` imports. Added type stubs for `python_cmr` library to unmask and address these additional type errors. Limited static type changes as much as possible to only functions and methods dealing with CMR queries and results to keep this PR manageable. Fixes nsidc#508
for more information, see https://pre-commit.ci
- wrap docstrings at 88 characters per ruff configuration - remove `Never` from return types where method raises only for invalid input types or values - add missing `Never` return types where method raises for server-side errors
Use of the session during CMR paged queries was accidentally removed with the introduction of Search-After functionality.
b574b90
to
4a6586f
Compare
pre-commit.ci autofix |
for more information, see https://pre-commit.ci
👀 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this looks great! One thing I'd like to see is some documentation about the purpose of and process for maintaining the stubs/
directory. It's not immediately obvious to me that these are type stubs for python_cmr
, because the import for that library is simply cmr
, and my brain doesn't make that jump intuitively or immediately. It would also be good to document the long-term intention to upstream these instead of keeping them around!
Perhaps this could go in CONTRIBUTING.md
. As I'm thinking that, I also think we've outgrown this single-file contributing doc, but that's a different issue (#521)
earthaccess/typing_.py
Outdated
from typing import Dict, List, Mapping, Sequence, Tuple | ||
else: | ||
from builtins import dict as Dict, list as List, tuple as Tuple | ||
from collections.abc import Mapping, Sequence |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the rest of this module, but I'm not sure how I feel about smoothing over the transition to built-in types like this:
from builtins import dict as Dict, list as List, tuple as Tuple
I think I prefer having the codebase use the style of the oldest version of Python we support, i.e. from typing import List
at the moment, then letting Ruff handle autofixing that (using the UP
ruleset IIRC) when we drop support for older versions. With this module helping us transition, Ruff won't be able to apply those fixes, and we'll need to remember to come back and clean up parts of this module when we drop support for 3.8 and no longer need the transitional code.
Anyway, I don't care about this deeply enough to request a change here, this is totally fine :) Just a feeling.
Should we merge this? @mfisher87 @chuckwondo |
I'm a bit torn on how we should handle this. I started adding a section to It's starting to feel like we're including information that is documented elsewhere about dealing with type stubs in Python in general, and that we would just be repeating such general documentation (and perhaps not as well). It feels akin to including a section on "how to program in Python" because we use Python in this repo. A contributor should look elsewhere for such information. If a contributor has difficulty resolving type check failures, they should reach out by asking a question in an issue, discussion, or PR, and one of us more familiar with type checking can assist. This should wash out during PR reviews, just like any other aspect of Python programming in general. Regarding "promotion" of the included What say you? |
Not just yet. We're close, but there are a couple more questions open, which I posted in response to a couple of questions from @mfisher87 |
I don't think we should duplicate information (as much as possible), more like a sentence or two that's relevant to our project's more curious contributors. Even if it's simply "We added type stubs for the untype python-cmr library, which we intend to eventually upstream. You shouldn't need to worry about this, but check here and here to learn more." |
@mfisher87, I believe I've addressed everything now. The most recent tweaks:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks so much Chuck!! 😍
|
||
We have included type stubs for the untyped `python-cmr` library, which we | ||
intend to eventually upstream. Since `python-cmr` exposes the `cmr` package, | ||
the stubs appear under `stubs/cmr`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
💯 to both new paragraphs, thanks!
Thanks @mfisher87! I'm willing to take on the responsibility of having merge privileges, if you'd like to grant them, so I can squash-merge my changes. |
Invitation sent! |
There were a number of type hints in `search.py` and `api.py` related to CMR queries that were incorrect. These were fixed. In addition, there were a number of other static type errors that were masked because of ignored `cmr` imports. Added type stubs for `python_cmr` library to unmask and address these additional type errors. In addition: - Aligned vcrpy usage with VCRTestCase as per https://vcrpy.readthedocs.io/en/latest/usage.html#unittest-integration - Restored use of session for CMR paged queries, which was accidentally removed with the introduction of Search-After functionality. - Wrapped a number of docstrings at 88 characters per ruff configuration Fixes nsidc#508
There were a number of type hints in
search.py
andapi.py
related to CMR queries that were incorrect. These were fixed. In addition, there were a number of other static type errors that were masked because of ignoredcmr
imports. Added type stubs forpython_cmr
library to unmask and address these additional type errors.Limited static type changes as much as possible to only functions and methods dealing with CMR queries and results to keep this PR manageable.
Fixes #508
📚 Documentation preview 📚: https://earthaccess--510.org.readthedocs.build/en/510/