Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect type hints for CMR queries #508

Closed
chuckwondo opened this issue Apr 2, 2024 · 7 comments · Fixed by #510
Closed

Incorrect type hints for CMR queries #508

chuckwondo opened this issue Apr 2, 2024 · 7 comments · Fixed by #510

Comments

@chuckwondo
Copy link
Collaborator

The code related to CMR queries within earthaccess/search.py includes a number of incorrect type hints. Further, type-checking via mypy earthaccess does not catch these incorrect type hints.

For example, on this line, the return type is annotated as Type[CollectionQuery], but it should be CollectionQuery (or even better, typing.Self because the method returns self).

This error is caught by pylance/pyright, but not by mypy. The reason mypy doesn't catch it (and a number of other similar type errors) is because it uses classes from python-cmr, but python-cmr does not itself include type hints. Thus mypy assumes everything is type Any (although this seems a bit odd since it should at least type a class as itself).

We can easily get mypy earthaccess to generate some type errors by simply doing the following:

touch "$(python -c 'import site; print(site.getsitepackages()[0])')"/cmr/py.typed

After running the command above, running mypy earthaccess should generate numerous type errors that were previously masked. You should then remove the file by running the following:

rm "$(python -c 'import site; print(site.getsitepackages()[0])')"/cmr/py.typed`)

This indicates that we need to address 2 things:

  1. Add type hints to python-cmr
  2. Correct erroneous type hints in earthaccess

To address type hints for python-cmr, there are at least the following options:

  1. Open a PR that adds py.typed and also adds the missing type hints throughout the code base. This might encounter some hurdles for a reasonably speedy approval and release since the repo is owned by NASA.
  2. Submit type hints for python-cmr to typeshed. This would avoid any hurdles/slowdown we might encounter with PRs to python-cmr, so might be a faster route than the preceding option.
  3. Generate type stubs (stubgen is a command that comes with mypy for doing this) for python-cmr and add them to earthaccess. This is our fastest route because it does not rely on any external factors, nor does it require waiting for a release to be published (necessary for the preceding options).

The ideal solution is item 1, but we may want to reach the ideal state by first going with option 3 (fastest route), then option 2, and finally option 1.

Further, by taking option 3 first, not only is that the fastest route, but it also allows us to simultaneously correct the erroneous type hints within earthaccess, which we would need to do to get linting to pass. Further still, it would allow us to iron out any kinks in the new python-cmr type hints before later promoting them to options 2 and 1.

@chuckwondo
Copy link
Collaborator Author

@betolink, @MattF-NSIDC, @jhkennedy, any thoughts, suggestions, or concerns about this? I'd like to work on this because I discovered this issue while I was looking at the code in preparation for tackling #120, which I'd like to work on after addressing these type hint issues.

@jhkennedy
Copy link
Collaborator

@chuckwondo I don't personally use mypy much, but I do really like typehints, especially when working in PyCharm or VSCode. So I can't speak much to the value of testing with mypy, or mypy vs pylance/pyright.

I'd agree that (1) would be ideal, and depending on how nasa/python_cmr#30 goes, would be worth the time to contribute upstream. That said, it'll really depends on if the python-cmr maintainers value type hints and will want to maintain them going forwrad, which isn't always the case.

As for Earthaccess, I think we should do (3) now, and then cut over to (1) once released. I'm not sure there's much value in (2) if (1) and (3) happen.

@chuckwondo
Copy link
Collaborator Author

@jhkennedy, thanks for the input. I mentioned (2) in case (1) is not an option or is met with too much resistance/difficulty.

@jhkennedy
Copy link
Collaborator

@chuckwondo what's the benefit of doing (2) over (3)?

@mfisher87
Copy link
Collaborator

(3) first sounds great to me; I agree if we can't do (1) then it would be best for the community if we pursue (2) as a long-term solution so that folks other than us can benefit from the annotation work.

@chuckwondo
Copy link
Collaborator Author

@chuckwondo what's the benefit of doing (2) over (3)?

Doing (2) over (3) allows other users of python-cmr (not only earthaccess) to also benefit from the type hints (as @mfisher87 mentioned), but doing (3) first gets things accomplished for earthaccess purposes sooner, and also allows us to iron out possible kinks before providing the hints to a wider audience via (2) or (1).

chuckwondo added a commit to chuckwondo/earthaccess that referenced this issue Apr 8, 2024
There were a number of type hints in `search.py` and `api.py` related to
CMR queries that were incorrect. These were fixed. In addition, there
were a number of other static type errors that were masked because of
ignored `cmr` imports. Added type stubs for `python_cmr` library to
unmask and address these additional type errors.

Limited static type changes as much as possible to only functions and
methods dealing with CMR queries and results to keep this PR manageable.

Fixes nsidc#508
chuckwondo added a commit to chuckwondo/earthaccess that referenced this issue Apr 9, 2024
There were a number of type hints in `search.py` and `api.py` related to
CMR queries that were incorrect. These were fixed. In addition, there
were a number of other static type errors that were masked because of
ignored `cmr` imports. Added type stubs for `python_cmr` library to
unmask and address these additional type errors.

Limited static type changes as much as possible to only functions and
methods dealing with CMR queries and results to keep this PR manageable.

Fixes nsidc#508
chuckwondo added a commit to chuckwondo/earthaccess that referenced this issue Apr 15, 2024
There were a number of type hints in `search.py` and `api.py` related to
CMR queries that were incorrect. These were fixed. In addition, there
were a number of other static type errors that were masked because of
ignored `cmr` imports. Added type stubs for `python_cmr` library to
unmask and address these additional type errors.

Limited static type changes as much as possible to only functions and
methods dealing with CMR queries and results to keep this PR manageable.

Fixes nsidc#508
chuckwondo added a commit that referenced this issue Apr 16, 2024
There were a number of type hints in `search.py` and `api.py` related to
CMR queries that were incorrect. These were fixed. In addition, there
were a number of other static type errors that were masked because of
ignored `cmr` imports. Added type stubs for `python_cmr` library to
unmask and address these additional type errors.

In addition:

- Aligned vcrpy usage with VCRTestCase as per
  https://vcrpy.readthedocs.io/en/latest/usage.html#unittest-integration
- Restored use of session for CMR paged queries, which was accidentally
  removed with the introduction of Search-After functionality.
- Wrapped a number of docstrings at 88 characters per ruff configuration

Fixes #508
@github-project-automation github-project-automation bot moved this from 🆕 New to ✅ Done in earthaccess project Apr 16, 2024
@mfisher87
Copy link
Collaborator

Amazing work! 🤩

doug-newman-nasa pushed a commit to doug-newman-nasa/earthaccess that referenced this issue Apr 19, 2024
There were a number of type hints in `search.py` and `api.py` related to
CMR queries that were incorrect. These were fixed. In addition, there
were a number of other static type errors that were masked because of
ignored `cmr` imports. Added type stubs for `python_cmr` library to
unmask and address these additional type errors.

In addition:

- Aligned vcrpy usage with VCRTestCase as per
  https://vcrpy.readthedocs.io/en/latest/usage.html#unittest-integration
- Restored use of session for CMR paged queries, which was accidentally
  removed with the introduction of Search-After functionality.
- Wrapped a number of docstrings at 88 characters per ruff configuration

Fixes nsidc#508
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

3 participants