Type stubs for single-file top-level modules #1333

Open
not-my-profile opened this issue Jan 9, 2023 · 29 comments

@not-my-profile

not-my-profile commented Jan 9, 2023

PEP 561 currently states the following:

Package maintainers who wish to support type checking of their code MUST add a marker file named py.typed to their package supporting typing.

I consider this requirement to be problematic for Python libraries that are written in another programming language and distributed as compiled .so files. PEP 561 currently does not provide a way to mark .so files residing directly in site-packages/ as typed, so typed shared libraries need to introduce an intermediary __init__.py file such as the following:

from ._native import *

__doc__ = _native.__doc__
if hasattr(_native, "__all__"):
    __all__ = _native.__all__

While this works for static type checkers I think this is obviously suboptimal because it has several undesired side-effects. Let's take the following as an example:

site-packages
├── my_project
│   ├── __init__.py
│   ├── _native.cpython-36m-x86_64-linux-gnu.so
│   └── py.typed

The unintended side-effects are:

  1. You can import my_project._native.
  2. _native shows up in the documentation generated by pydoc, e.g. under PACKAGE CONTENTS in the documentation of my_project, and invoking help(my_project.foobar) will tell you that foobar resides in the module my_project._native.
  3. my_project.__file__ now is the __init__.py file instead of the .so file, potentially misleading developers into thinking the package is implemented in Python
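
Side effects 1 and 3, and the module name that help() would report in 2, can be reproduced without a compiler. This is only a minimal sketch: a pure-Python _native.py stands in for the compiled .so, and foobar is a hypothetical function taken from the example above.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def build_demo_package(root: Path) -> None:
    """Create my_project/ with a re-exporting __init__.py."""
    pkg = root / "my_project"
    pkg.mkdir()
    # Assumption: a plain .py file stands in for _native.cpython-*.so.
    (pkg / "_native.py").write_text("def foobar():\n    return 42\n")
    (pkg / "__init__.py").write_text("from ._native import *\n")
    (pkg / "py.typed").write_text("")


with tempfile.TemporaryDirectory() as tmp:
    build_demo_package(Path(tmp))
    sys.path.insert(0, tmp)
    try:
        importlib.invalidate_caches()
        my_project = importlib.import_module("my_project")
        # Side effect 1: the private submodule is importable.
        native = importlib.import_module("my_project._native")
        # Side effect 3: __file__ points at __init__.py, not the implementation.
        file_name = Path(my_project.__file__).name
        # Side effect 2: the function reports the private module as its home.
        foobar_module = my_project.foobar.__module__
    finally:
        sys.path.remove(tmp)
        sys.modules.pop("my_project", None)
        sys.modules.pop("my_project._native", None)

print(file_name)      # __init__.py
print(foobar_module)  # my_project._native
```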

So I really think PEP 561 should be amended to provide some way of marking single-file packages as "typed" without having to resort to hacks such as defining an intermediary __init__.py since that introduces a bunch of undesired side-effects that have the potential to confuse API users.

What do you think about this?

@JelleZijlstra
Member

I agree this is suboptimal, and I'd support lifting the restriction if we can come up with a good way to do it. Maybe @ethanhs has some insights into why we didn't provide a way to type single-module packages at the time.

The first obvious solution perhaps would be to put something in the package's dist-info directory, e.g. a new key in the METADATA file. But the problem with that would be that type checkers can't reliably go from the name of the installed module to the dist-info directory, because the names may not match.

@not-my-profile
Author

not-my-profile commented Jan 9, 2023

A very simple solution could be to create a .typed marker in the same directory by appending .typed to the filename of the shared library. For example, site-packages/myproject.cpython-36m-x86_64-linux-gnu.so could be marked as typed by creating the file site-packages/myproject.cpython-36m-x86_64-linux-gnu.so.typed.

I am unfamiliar with the process of amending a PEP, are there other places where I should announce this discussion? PEP 561 has been marked as "Final", does this mean that introducing such an update would require a new PEP?

@JelleZijlstra
Member

This would technically require a new PEP, yes. However, it can be short.

@hauntsaninja
Collaborator

I think just {module_name}.typed would work and be easier for type checkers to search for

@not-my-profile not-my-profile changed the title Marking a single-file package as typed as per PEP 561 Marking a single-file top-level module as typed Jan 9, 2023
@not-my-profile
Author

not-my-profile commented Jan 9, 2023

Good point, Shantanu! Oh ... I just realized something ... static type checkers could just as well check for {module_name}.pyi within site-packages/ ... no need for a separate marker file at all.

Ok thanks Jelle, I'm working on a PEP draft for this right now.

@hauntsaninja
Collaborator

I think .pyi as a marker file would only work for single module extension modules, not for single module pure Python files with inline types.

@not-my-profile
Author

not-my-profile commented Jan 9, 2023

Right, I don't really consider having to change foo.py to foo/__init__.py (to mark it as typed with inline types) to be problematic because:

  • documentation generators know how to deal with __init__.py files since they are so prevalent (e.g. pydoc doesn't list __init__ under PACKAGE CONTENTS and help displays in module foo instead of in module foo.__init__).
  • foo.__file__ directly points to the source code

So the drawbacks I described that exist for shared libraries pretty much don't apply to pure Python modules.

I don't think we should introduce yet another type of .typed marker file since that is bound to result in confusion. What are the differences between foo.typed and py.typed? You can put partial\n in py.typed but not in foo.typed (since the stubs of single-file modules cannot be partial) so we would have two different types of marker files with the same extension, which is just confusing. Even worse there could be a package named py (in fact there is one on PyPI), so site-packages/py.typed would have different semantics than site-packages/foo/py.typed ... which again is confusing. I guess we could deal with that by introducing a .typed-module extension but this again isn't as clean since py.typed already exists and is named py.typed instead of py.typed-package.

I think we should rather go with the intuitive solution of putting a .pyi file in site-packages.

@not-my-profile not-my-profile changed the title Marking a single-file top-level module as typed Type stubs for single-file top-level modules Jan 9, 2023
@erictraut
Collaborator

I think we should rather go with the intuitive solution of putting a .pyi file in site-packages.

I agree this is the most straightforward approach for single-file binary (compiled) packages. Pyright already supports a ".pyi" file in site-packages if it's present. It looks like mypy doesn't handle this currently, but I'm guessing it would be a simple change.
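
A rough sketch of what the side-by-side lookup could look like in a resolver. This is not how Pyright or mypy implement it; find_top_level_stub and the myproject name are hypothetical, and a temporary directory stands in for site-packages.

```python
import tempfile
from pathlib import Path
from typing import Iterable, Optional


def find_top_level_stub(module_name: str, search_dirs: Iterable[Path]) -> Optional[Path]:
    """Return the first `<module_name>.pyi` found directly inside a search directory."""
    for directory in search_dirs:
        candidate = directory / f"{module_name}.pyi"
        if candidate.is_file():
            return candidate
    return None


# Demo with a throwaway directory standing in for site-packages.
with tempfile.TemporaryDirectory() as tmp:
    site_packages = Path(tmp)
    # Empty placeholder for the compiled module; only its presence matters here.
    (site_packages / "myproject.cpython-36m-x86_64-linux-gnu.so").write_bytes(b"")
    (site_packages / "myproject.pyi").write_text("def foobar() -> int: ...\n")

    found = find_top_level_stub("myproject", [site_packages])
    missing = find_top_level_stub("other", [site_packages])
    found_name = found.name if found is not None else None

print(found_name)  # myproject.pyi
print(missing)     # None
```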

@not-my-profile
Author

I wrote a draft PEP "Type stubs for single-file top-level modules". Feedback is very much welcome :)

(This is my first attempt at writing a PEP.)

@JelleZijlstra
Member

@AlexWaygood
Member

I'm supportive of the idea and I'm happy to co-sponsor :)

However, I wouldn't want to be the sole sponsor — I haven't written or sponsored a PEP before, so I'm not 100% sure of the exact process. I also feel like I'm pretty close to full capacity on my open-source commitments at the moment.

@AlexWaygood
Member

  • I remember seeing some grumbling about needing py.typed even for single packages, but can't find it now.

I've seen this in a few places; I'll also see if I can dig up some references.

@hauntsaninja
Collaborator

hauntsaninja commented Jan 9, 2023

Thanks for putting the effort into writing this up (and starting this discussion in the first place)!

Given that PEP 561 95% solves this problem, I feel if we want to make changes to standards here, we shouldn't solve only half of the remaining 5% of the problem.

I'm not sympathetic to the claim that pure Python developers don't feel the drawbacks you mention, mainly because I think 2/3 of those drawbacks are very weak: "you can import my_project._native" (so what? consenting adults), "potentially misleading developers into thinking the package is implemented in Python" (such developers would also believe that numpy is in pure python)...

...I think the biggest reason to do this is just "it's annoying to complicate project layout because of a dumb shortcoming of type checkers" and "why have package when you can have module, simple is better than complex". This applies equally to single pure Python modules with inline types and single extension modules.

@AlexWaygood
Member

^I agree with everything @hauntsaninja just said; I also think it would be a real shame to not find a way to solve this for pure-Python file packages

@JelleZijlstra
Member

The suggested approach of putting a .pyi file in site-packages next to the implementation file would also work for pure-Python packages, right?

@hauntsaninja
Collaborator

Yes, but not for inline types, which I strongly encourage as the most maintainable way to add types to pure Python

@JelleZijlstra
Member

Ah right, good point.

Possible hacky solution: Use a .pyi file, but put some special marker code in the .pyi file that indicates "look inline".

@AlexWaygood
Member

AlexWaygood commented Jan 9, 2023

Use a .pyi file, but put some special marker code in the .pyi file that indicates "look inline".

__py_typed__ = True?

So, if type checkers see a .pyi file with just that, and nothing more, they know to look in the equivalent .py file for inline types?
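
For illustration only, a type checker could detect such a marker-only stub statically with the standard ast module. The __py_typed__ name is the proposal from this comment, not an existing convention, and is_look_inline_stub is a hypothetical helper.

```python
import ast


def is_look_inline_stub(stub_source: str) -> bool:
    """True if the stub consists of exactly one statement: `__py_typed__ = True`."""
    tree = ast.parse(stub_source)
    if len(tree.body) != 1:
        return False
    stmt = tree.body[0]
    if not isinstance(stmt, ast.Assign) or len(stmt.targets) != 1:
        return False
    target = stmt.targets[0]
    return (
        isinstance(target, ast.Name)
        and target.id == "__py_typed__"
        and isinstance(stmt.value, ast.Constant)
        and stmt.value.value is True
    )


marker_only = is_look_inline_stub("__py_typed__ = True\n")
real_stub = is_look_inline_stub("def foobar() -> int: ...\n")
marker_plus_more = is_look_inline_stub("__py_typed__ = True\nx: int\n")

print(marker_only)       # True
print(real_stub)         # False
print(marker_plus_more)  # False
```

Comments in the stub do not appear in the parsed tree, so a stub with a comment plus the marker would still count as marker-only in this sketch.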

@not-my-profile
Author

not-my-profile commented Jan 9, 2023

Thanks everybody :)

I remember seeing some grumbling about needing py.typed even for single packages, but can't find it now.

There is #1297.

The PEP should clarify where in the resolution order defined in PEP 561 (https://peps.python.org/pep-0561/#type-checker-module-resolution-order) the stub should go. (Presumably under # 4.)

Yes I agree with that.

What happens if there is both a foo.pyi and a foo/ directory with a py.typed in it?

If both foo.pyi and foo/__init__.pyi + foo/py.typed exist I think type checkers should do the following:

  • use foo/__init__.pyi if foo/__init__.py exists (because Python's import statement also prefers packages over single-file modules if both exist in the same directory)
  • otherwise use foo.pyi
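
The two rules above, sketched as code. choose_stub is a hypothetical helper and it assumes both foo.pyi and foo/__init__.pyi exist; what should happen in any other layout is not covered by this sketch.

```python
import tempfile
from pathlib import Path


def choose_stub(site_dir: Path, name: str) -> Path:
    """Pick between `name/__init__.pyi` and `name.pyi`, assuming both exist."""
    package_dir = site_dir / name
    # Mirror the import system: a package wins over a same-named module.
    if (package_dir / "__init__.py").is_file():
        return package_dir / "__init__.pyi"
    return site_dir / f"{name}.pyi"


with tempfile.TemporaryDirectory() as tmp:
    site_dir = Path(tmp)
    (site_dir / "foo").mkdir()
    (site_dir / "foo" / "__init__.pyi").write_text("")
    (site_dir / "foo" / "py.typed").write_text("")
    (site_dir / "foo.pyi").write_text("")

    # Without foo/__init__.py the top-level stub is used.
    without_init = choose_stub(site_dir, "foo").relative_to(site_dir).as_posix()

    # Once foo/__init__.py exists, the package stub takes precedence.
    (site_dir / "foo" / "__init__.py").write_text("")
    with_init = choose_stub(site_dir, "foo").relative_to(site_dir).as_posix()

print(without_init)  # foo.pyi
print(with_init)     # foo/__init__.pyi
```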

@hauntsaninja I agree with 2/3 drawbacks being weak. I think the main drawback is that tooling such as documentation generators generally don't have special support for recognizing such re-exporting __init__.py files.

I am alright with also solving the problem for pure-Python modules with inline types. My initial idea just now was to create a .pyi file next to the .py file as a symbolic link to the .py file, however seeing that symbolic links aren't supported in wheels that unfortunately does not appear to be an option.

I think I'd rather introduce a new file extension for new marker files (e.g. .typed-module) than overload the meaning of existing file extensions such as .typed or .pyi. An empty {module}.typed-module file would mean: use {module}.pyi if it exists; otherwise, if the module is implemented as {module}.py, look for inline types. This would allow for the following combinations:

  • {module}.*.so + {module}.pyi + {module}.typed-module (arguably the marker file is redundant in this case)
  • {module}.py + {module}.pyi + {module}.typed-module (arguably the marker file is redundant in this case)
  • {module}.py + {module}.typed-module (for inline types)

What do you think? Perhaps the .typed-module marker file should be optional if there exists a .pyi file? So you would actually only need to use it for inline types. And type checkers would only have to check for it if the .pyi doesn't exist.
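
A sketch of the variant where the marker is optional whenever a .pyi file exists. Everything here is hypothetical: the .typed-module extension is only a proposal from this comment, and TypeSource and resolve_type_source are made-up names.

```python
import tempfile
from enum import Enum
from pathlib import Path


class TypeSource(Enum):
    STUB = "stub"
    INLINE = "inline"
    UNTYPED = "untyped"


def resolve_type_source(site_dir: Path, module: str) -> TypeSource:
    """Decide where types for a top-level module would come from."""
    # A side-by-side stub is enough on its own (marker optional).
    if (site_dir / f"{module}.pyi").is_file():
        return TypeSource.STUB
    # Without a stub, the marker opts a .py module into inline types.
    has_marker = (site_dir / f"{module}.typed-module").is_file()
    has_source = (site_dir / f"{module}.py").is_file()
    if has_marker and has_source:
        return TypeSource.INLINE
    return TypeSource.UNTYPED


with tempfile.TemporaryDirectory() as tmp:
    site_dir = Path(tmp)
    # a: extension module (empty placeholder) with a stub
    (site_dir / "a.cpython-36m-x86_64-linux-gnu.so").write_bytes(b"")
    (site_dir / "a.pyi").write_text("")
    # b: pure module with inline types and the proposed marker
    (site_dir / "b.py").write_text("")
    (site_dir / "b.typed-module").write_text("")
    # c: pure module without any typing information
    (site_dir / "c.py").write_text("")

    results = {name: resolve_type_source(site_dir, name) for name in ("a", "b", "c")}

print(results["a"].value)  # stub
print(results["b"].value)  # inline
print(results["c"].value)  # untyped
```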

@erictraut
Collaborator

There is already a way to handle inlined types for a single-file module: convert it to a multi-file package and add a "py.typed" marker. This isn't too onerous. Let's not add hacky and inconsistent solutions (like symbolic links or files with specific name extensions). Resist the urge to overreact to one or two people grumbling about needing to do a few extra steps (one time) to make this work.

There is currently no way to package type information for a single-file compiled module. The side-by-side ".pyi" file is an elegant solution to this currently-unsolved problem. Let's focus on solving this problem, not creating more complexity and inconsistencies to solve a problem that already has a solution.

@not-my-profile
Author

not-my-profile commented Jan 10, 2023

I think a key difference between pure modules and extension modules that has not been mentioned here yet is that the __init__ module of a package always has to be a pure module. So while a pure top-level module can simply be moved to {module}/__init__.py, the same does not work for extension modules (because CPython does not pick up {module}/__init__.cpython-36m-x86_64-linux-gnu.so). So the easy solution of simply moving {module}.py to {module}/__init__.py and creating a marker file does not work for extension modules, requiring the creation of an unintuitive and disadvantageous intermediary/re-exporting __init__.py file.

I have strongly revised my PEP draft to better explain the reasoning (as well as answering the questions raised by @JelleZijlstra).

Sidenote: I have now also specified that these top-level .pyi files should be recognized in the 4th step of the module resolution order as per PEP 561. While looking at that order I think I have spotted an oversight in PEP 561, for which I have just opened #1334 in order to keep this discussion on topic.

@hauntsaninja There are three scenarios:

  1. a top-level extension module (which always needs a .pyi file)
  2. a top-level pure module with a type stub file (which always needs a .pyi file)
  3. a top-level pure module with inline types

Since the first two cases always need a .pyi file, I think just supporting recognizing a .pyi file in the same directory is very much what you would expect to work, so I think we really should support this, especially because the first case requires such an unintuitive/disadvantageous workaround in the form of a re-exporting __init__.py file.

Addressing the third case would require a solution that is not obvious (defining magic variables like __py_typed__ = True in .pyi files or introducing a brand-new file extension like .typed-module is all very much arbitrary), so I'd have to agree with @erictraut that the inconvenience caused by having to turn a .py module into a package is not great enough to warrant the introduction of such an arbitrary solution (and thus additional complexity). And by complexity I don't mean complexity in the implementation of type checkers (checking for a .typed-module file would be quite easy) but rather complexity in the conceptual model of "how to distribute types for Python packages" that Python developers have to deal with when distributing packages or debugging the type stub resolution order.

@cdce8p
Contributor

cdce8p commented Jan 10, 2023

There is already a way to handle inlined types for a single-file module: convert it to a multi-file package and add a "py.typed" marker. This isn't too onerous.

True, however I've seen project owners reject adding the type hint marker because it would have required them to move to a package structure. It would be a much easier sell if there were a solution that didn't require a structure change.

I do agree that adding a separate file with a new suffix would be too complicated. What about adding __py_typed__ = True to the python module itself? That would be simple enough and type checkers would only need to check for it for single file modules.

@erictraut
Collaborator

I've seen project owners rejecting adding the type hint marker because it would have required them to move to a package structure

Really? That surprises me. Can you provide examples?

I guess I'm not very sympathetic to this argument. This is a really low bar. It involves a one-time change that requires just a few minutes of work. If a library maintainer is unwilling to do this, then they're just looking for excuses not to support typing. I don't think that inventing redundant mechanisms is the right solution to this problem.

@cdce8p
Contributor

cdce8p commented Jan 10, 2023

I've seen project owners rejecting adding the type hint marker because it would have required them to move to a package structure

Really? That surprises me. Can you provide examples?

Unfortunately not. It was some time ago and IIRC the owner wasn't too convinced about the usefulness of typing.

This is a really low bar. It involves a one-time change that requires just a few minutes of work. If a library maintainer is unwilling to do this, then they're just looking for excuses not to support typing.

The refactoring might be simple but packaging is another story. Sometimes it's easier to leave it alone if it works.
Adding a simple line to the Python file wouldn't require any other changes.

--
Side note: From time to time I also come across projects that have a py.typed file in their repo but don't include it in their sdist / wheel. Packaging is hard sometimes, especially if you don't do it frequently 🤷🏻‍♂️

@ethanhs
Contributor

ethanhs commented Jan 12, 2023

Hi! Sorry for not responding sooner, been a bit busy recently.

Let me start by giving context into why PEP 561 makes the tradeoffs it does, and what I was discussing with type checker authors at the time.

  1. I think the largest reason was that, at the time, getting installed package metadata via the standard library was still very new (importlib.metadata was only added in 3.8). We didn't really want to make people depend on third-party packages to get the metadata about the typing status. I think there was also some concern that non-Python type checkers would have a harder time reading package metadata because they would have to build that infrastructure themselves.
  2. There was significant interest, from users and type checker authors at the time to have per-package (in the folder of Python code sense) metadata, and allowing py.typed to exist in that folder seemed the simplest way to accomplish that. This enables gradual adoption of typing.
  3. py.typed did not require integrations from all packaging tools. It was a significantly smaller amount of implementation work to reach people.

That being said, I do think PEP 561 is a bit lacking.

I've seen project owners rejecting adding the type hint marker because it would have required them to move to a package structure

I've actually also seen this, but I cannot recall where. I do still think the bar is pretty low, but I also think the UX for marking a package as typed could be better.

One of the original alternate designs for PEP 561 was to include typing support status in the distribution metadata. This would most likely exist as a list of files in the distribution that support typing or something like that. This solution is particularly appealing now that 3.7 is almost end of life (June of this year), and so soon all versions of Python will support importlib.metadata. There are however downsides, such as third party type checkers that don't want to call out into Python needing to implement this logic themselves. In addition, it is rather orthogonal to py.typed, so I worry it could be confusing (maybe if the UI is just typed=True to the user though, that isn't as much of a concern).
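
To give a feel for what reading distribution metadata involves today, here is a small sketch using importlib.metadata. It does not implement the metadata-based design described above (no such metadata key exists); it only inspects a distribution's recorded file list for a py.typed file. distribution_ships_py_typed is a hypothetical helper, and it takes a distribution name, which may differ from the import name, as noted earlier in the thread.

```python
from importlib import metadata


def distribution_ships_py_typed(dist_name: str) -> bool:
    """True if the installed distribution records a file named py.typed."""
    try:
        files = metadata.files(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution is not installed in this environment.
        return False
    if files is None:
        # The installer did not record a file list for this distribution.
        return False
    return any(path.name == "py.typed" for path in files)


# A name chosen so that it is very unlikely to be installed anywhere.
not_installed = distribution_ships_py_typed("definitely-not-an-installed-distribution-1333")
print(not_installed)  # False
```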

I agree though that if we don't want to shift to something like the above metadata-based system, keeping the status quo and suggesting maintainers change the layout of the package is an acceptable solution.

@Kentzo

Kentzo commented Jan 12, 2023

Possible hacky solution: Use a .pyi file, but put some special marker code in the .pyi file that indicates "look inline".

I'm very likely missing something well known here, but could someone tell me why a type checker needs a marker to consider looking inline rather than just looking inline unconditionally?

@hauntsaninja
Collaborator

hauntsaninja commented Jan 12, 2023

@Kentzo it's a good question, particularly in 2023. The main reason is that if a package doesn't have types or only has partial types, it's useful to warn a user about that so they don't falsely think they have typing coverage. This also gives the user the opportunity to install a stubs package themselves / the type checker to easily detect this situation and hint to do so.

Historical reasons are that annotations weren't always reserved for typing use and type checkers sometimes struggled with unusual code.

If I were writing a type checker in 2023, I'd probably always try to analyse the code because more information is usually better and you can still type check the shape of things, but I'd use the absence of py.typed to surface an error to the user.

9y2070m added a commit to 9y2070m/deprecation that referenced this issue Apr 11, 2023
@brettcannon
Member

brettcannon commented Aug 8, 2023

FYI I ran into this recently due to the lack of support for top-level .pyi files. In my use case the project is a single .py file by design for ease of copying/integration with non-Python code and is purposefully kept small so it can be passed entirely as a string on the command-line (e.g., Rust code can embed the entire file as a string constant and execute a subprocess with the string constant passed in via argv). I would be happy to provide a .pyi file for the single function the code exposes, but that currently doesn't work with mypy due to the current standards.

markjoshwel added a commit to markjoshwel/surplus that referenced this issue Sep 2, 2023
i dont like it, but that's how it'll have to be
python/typing#1333

s+: also, add __str__ support for query types (#18)
aebrahim added a commit to aebrahim/once that referenced this issue Sep 6, 2023
This required moving away from using a single module, a known limitation
being currently visited in python/typing#1333.
martijnthe added a commit to martijnthe/pybreaker that referenced this issue Dec 22, 2023
 ### Summary

Mypy only considers the type annotations of packages that contain the
`py.typed` marker file, even if they are fully type-annotated (like
`pybreaker`).

This PR adds said file.

I had to move `pybreaker` into a module instead of a single .py file, because
there is no support for `py.typed` files for single .py file packages yet (see
python/typing#1333).

 ### Test Plan

- Ran `python setup.py test` to ensure the tests still pass.
- Ran `python -m build` to build the package.
- Observe the output logs to see that `py.typed` was included.
- Install the tarball into a project where I have mypy set up. Run mypy and
  observe mypy now no longer fails with `... becomes "Any" due to an unfollowed
import [error/no-any-unimported]` for imports from `pybreaker`.
danielfm pushed a commit to danielfm/pybreaker that referenced this issue Dec 22, 2023
Alphix added a commit to Alphix/python-ldap that referenced this issue Jan 30, 2024
This is necessary in order for type checking due to:
python/typing#1333

Given that _ldap is an internal module, this change is hopefully ok.
chrysn added a commit to chrysn/cbor-diag-py that referenced this issue May 5, 2024
@yaxxie-piramidal

I've seen project owners rejecting adding the type hint marker because it would have required them to move to a package structure

Really? That surprises me. Can you provide examples?

I guess I'm not very sympathetic to this argument. This is a really low bar. It involves a one-time change that requires just a few minutes of work. If a library maintainer is unwilling to do this, then they're just looking for excuses not to support typing. I don't think that inventing redundant mechanisms is the right solution to this problem.

Unless I'm wrong you can have (single file) modules or (multi file) packages.

Again, unless I am wrong, modules are not deprecated.

As long as modules are a valid way to produce libraries, I think it is fair to demand a way to indicate they're typed without having to change them.
