-
-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Type annotations, part 2 #8706
Type annotations, part 2 #8706
Conversation
Note: I'm aware that py.typed is not present in distributed. That's deliberate; I wanted to test dask/dask and dask/distributed together to see if anything pops up before I add it. |
@crusaderky Are you blocked on this and is there any way I could help? Should I start a third PR with the files not modified in this one? |
@phobson this PR changes CI, but leaves several files in the code with mypy errors. Please fork from this and iterate on such files until everything is green. |
[PEP 544](https://www.python.org/dev/peps/pep-0544/) introduces the `Protocol` class to the `typing` module in Python 3.8 (the soon be the minimum supported version, dask/community#213). Writing new Dask collections for [dask-awkward](https://github.com/ContinuumIO/dask-awkward/) has had me thinking about working on a `DaskCollection` protocol. I imagine the benefits to be: - usage with static type checkers - other activity in this area at - #8295 - #8706 - #8854 - Python supporting IDEs take advantage of typing - self-documenting; some improvements to [the custom collections page](https://docs.dask.org/en/latest/custom-collections.html) of the docs. The protocol docs can be autogenerated and added to that page. - purely opt-in feature The `typing.runtime_checkable` decorator allows use of `isinstance(x, DaskCollection)` in any code base that uses Dask collections; for example: ```python >>> from dask.typing import DaskCollection >>> import dask.array as da >>> x = da.zeros((10, 3)) >>> isinstance(x, DaskCollection) True ``` (though this is an order of magnitude slower than `dask.base.is_dask_collection` which only checks for `x.__dask_graph__() is not None`; static typing checking & built-in interface documentation are the core benefits IMO) Something else that came up in the brief discussion on a call last week was having `{Scheduler,Worker,Nanny}Plugin` protocols in `distributed`; and perhaps those are better places to start introducing protocols to Dask since on the user side typically more folks would write plugins than new collections.
[PEP 544](https://www.python.org/dev/peps/pep-0544/) introduces the `Protocol` class to the `typing` module in Python 3.8 (the soon be the minimum supported version, dask/community#213). Writing new Dask collections for [dask-awkward](https://github.com/ContinuumIO/dask-awkward/) has had me thinking about working on a `DaskCollection` protocol. I imagine the benefits to be: - usage with static type checkers - other activity in this area at - dask#8295 - dask#8706 - dask#8854 - Python supporting IDEs take advantage of typing - self-documenting; some improvements to [the custom collections page](https://docs.dask.org/en/latest/custom-collections.html) of the docs. The protocol docs can be autogenerated and added to that page. - purely opt-in feature The `typing.runtime_checkable` decorator allows use of `isinstance(x, DaskCollection)` in any code base that uses Dask collections; for example: ```python >>> from dask.typing import DaskCollection >>> import dask.array as da >>> x = da.zeros((10, 3)) >>> isinstance(x, DaskCollection) True ``` (though this is an order of magnitude slower than `dask.base.is_dask_collection` which only checks for `x.__dask_graph__() is not None`; static typing checking & built-in interface documentation are the core benefits IMO) Something else that came up in the brief discussion on a call last week was having `{Scheduler,Worker,Nanny}Plugin` protocols in `distributed`; and perhaps those are better places to start introducing protocols to Dask since on the user side typically more folks would write plugins than new collections.
Continues from #8295.
Enable mypy in CI and let mypy use dask annotations when it runs on third party packages that import dask.
TODO
several files still need tweaking