Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

stubtest: disable --ignore-missing-stubs for stdlib? #6401

Closed
Akuli opened this issue Nov 27, 2021 · 17 comments · Fixed by #6491
Closed

stubtest: disable --ignore-missing-stubs for stdlib? #6401

Akuli opened this issue Nov 27, 2021 · 17 comments · Fixed by #6491
Labels
project: policy Organization of the typeshed project

Comments

@Akuli
Copy link
Collaborator

Akuli commented Nov 27, 2021

@AlexWaygood has recently fixed lots of things in the standard library by running stubtest_stdlib without the --ignore-missing-stubs option. Maybe our CI should do that too, so that we notice what's missing? This doesn't apply to third-party stubs very well, because then our CI would be broken every time a new feature is added to any library, but the standard library is relatively stable.

@Akuli Akuli added the project: policy Organization of the typeshed project label Nov 27, 2021
@JelleZijlstra
Copy link
Member

Yes, let's do it.

@AlexWaygood
Copy link
Member

AlexWaygood commented Nov 27, 2021

Here's my analysis of which modules are missing the most classes/functions/methods/constants, according to stubtest:

Counter({
    'distutils': 165, 'xml': 69, 'tkinter': 51, 'email': 45, 'multiprocessing': 30, '__main__': 27, 'asyncio': 17,
    'os': 17, 'importlib': 15, 'cgi': 11, 'profile': 11, '_thread': 10, 'pkgutil': 10, 'codecs': 8, 'sre_constants': 8,
    'logging': 7, 'ssl': 7, '_collections_abc': 5, 'wsgiref': 5, '_markupbase': 4, '_winapi': 4, 'ctypes': 4, 'types': 4,
    'asynchat': 3, 'pipes': 3, 'pstats': 3, 'pydoc': 3, 'socket': 3, 'threading': 3, 'turtle': 3, '_csv': 2, '_threading_local': 2,
    'ast': 2, 'asyncore': 2, 'builtins': 2, 'configparser': 2, '  setup': 2, '': 2, 'inspect': 2, 'lib2to3': 2, 'msvcrt': 2, 'pickle': 2,
    'pyexpat': 2, 'rlcompleter': 2, 'signal': 2, 'sunau': 2, 'symtable': 2, 'unittest': 2, 'wave': 2, 'zipimport': 2, '_ast': 1,
    '_codecs': 1, '_imp': 1, '_json': 1, '_msi': 1, '_weakref': 1, 'bdb': 1, 'bz2': 1, 'csv': 1, 'ftplib': 1, 'hmac': 1, 'mimetypes': 1, 
    'modulefinder': 1, 'platform': 1, 'random': 1, 're': 1, 'sched': 1, 'socketserver': 1, 'subprocess': 1, 'token': 1,
    'tokenize': 1, 'tracemalloc': 1, 'typing': 1, 'unicodedata': 1, 'weakref': 1, 'winreg': 1
})

As you can see, it should be pretty manageable to disable --ignore-missing-stubs for the stdlib, as long as we keep it enabled for a few select modules: distutils/xml/tkinter/email/multiprocessing etc. are just going to create too many hits for now. I think it would be good to chip away at the other modules a bit first, as well, so we don't have to add loads of stuff to the allowlist.

A summary list of all classes that stubtest reports is missing is attached
here. The full output that stubtest gave me is here.

@AlexWaygood
Copy link
Member

AlexWaygood commented Nov 27, 2021

It might also be good if we could suppress stubtest for any names starting with a single underscore, as those are often private/ uncommented/implementation details

@AlexWaygood
Copy link
Member

Here's my analysis of which modules are missing the most classes/functions/methods/constants, according to stubtest

Here's the equivalent analysis of all missing objects, excluding objects that have names starting with a single leading underscore:

Counter({
    'distutils': 165, 'xml': 69, 'tkinter': 51, 'email': 45, 'multiprocessing': 30, '__main__': 27, 'os': 17, 'importlib': 15,
    'asyncio': 13, 'cgi': 11, 'profile': 11, 'pkgutil': 10, 'codecs': 8, 'sre_constants': 8, '_thread': 7, 'logging': 7, 'ssl': 7,
    '_collections_abc': 5, 'wsgiref': 5, '_markupbase': 4, '_winapi': 4, 'ctypes': 4, 'types': 4, 'asynchat': 3, 'pipes': 3,
    'pstats': 3, 'pydoc': 3, 'socket': 3, 'threading': 3, 'turtle': 3, '_csv': 2, 'ast': 2, 'asyncore': 2, 'builtins': 2, 'configparser': 2,
    '  setup': 2, 'inspect': 2, 'lib2to3': 2, 'msvcrt': 2, 'pickle': 2, 'pyexpat': 2, 'rlcompleter': 2, 'signal': 2, 'sunau': 2,
    'symtable': 2, 'unittest': 2, 'wave': 2, 'zipimport': 2, '_ast': 1, '_codecs': 1, '_imp': 1, '_json': 1, '_msi': 1, '_weakref': 1,
    'bdb': 1, 'bz2': 1, 'csv': 1, 'ftplib': 1, 'hmac': 1, 'mimetypes': 1, 'modulefinder': 1, 'platform': 1, 'random': 1, 're': 1,
    'sched': 1, 'socketserver': 1, 'subprocess': 1, 'token': 1, 'tokenize': 1, 'tracemalloc': 1, 'unicodedata': 1, 'weakref': 1
    'winreg': 1
})

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Nov 27, 2021

We have auto-updating of stdlib allowlists, which we didn't have when I first added stubtest, so doing this is more convenient for contributors than it would have been.

I will say I'm still a little concerned about the "false negatives in new additions" problem. If stubtest has ever caught an issue for you when adding something to a stdlib, it likely no longer will.

Also maybe useful to know if we decide to go ahead here is the stubtest --generate-allowlist flag.

@AlexWaygood stubtest has some heuristics for deciding what to check, and leading single underscore is among them. I think I've fixed some issues in mypy's public name detection since 0.910, but I'd be curious where you think heuristics are failing.
Relevant stubtest code is:
https://github.com/python/mypy/blob/902df03fa12ca9b39ad3ee11235c6558792782f4/mypy/stubtest.py#L203
https://github.com/python/mypy/blob/902df03fa12ca9b39ad3ee11235c6558792782f4/mypy/stubtest.py#L246

@Akuli
Copy link
Collaborator Author

Akuli commented Nov 27, 2021

I will say I'm still a little concerned about the "false negatives in new additions" problem.

If I understood correctly, this isn't a problem if we add stubgen-generated stubs very soon after disabling --ignore-missing-stub, and delete all of the added allowlist entries manually.

@AlexWaygood
Copy link
Member

AlexWaygood commented Nov 27, 2021

I think I've fixed some issues in mypy's public name detection since 0.910, but I'd be curious where you think heuristics are failing.

I've just been using the version of stubtest in typeshed's repo, so it may be that the issue's already been fixed! I'll try and double-check. But here are some of the names starting with a leading underscore that stubtest was flagging for me:

  • asyncio._enter_task
  • asyncio._leave_task
  • asyncio._register_task
  • asyncio._unregister_task
  • asyncio.tasks._enter_task
  • asyncio.tasks._leave_task
  • asyncio.tasks._register_task
  • asyncio.tasks._unregister_task
  • _thread._ExceptHookArgs.n_fields
  • _thread._ExceptHookArgs.n_sequence_fields
  • _thread._ExceptHookArgs.n_unnamed_fields

@Akuli
Copy link
Collaborator Author

Akuli commented Nov 27, 2021

Here are all of them (missing-stubs is #6403):

akuli@akuli-desktop:~/typeshed$ git diff master missing-stubs | grep '\._[A-Za-z]' | sort | uniq
+asyncio._enter_task
+asyncio.futures._TracebackLogger.loop
+asyncio.futures._TracebackLogger.source_traceback
+asyncio._leave_task
+asyncio._register_task
+asyncio.tasks._enter_task
+asyncio.tasks._leave_task
+asyncio.tasks._register_task
+asyncio.tasks._unregister_task
+asyncio._unregister_task
+_thread._ExceptHookArgs.n_fields
+_thread._ExceptHookArgs.n_sequence_fields
+_thread._ExceptHookArgs.n_unnamed_fields
+_threading_local._localimpl.localargs
+_threading_local._localimpl.locallock
+typing._SpecialForm.__call__

Edit: changed command to remove duplicates from output

@hauntsaninja
Copy link
Collaborator

hauntsaninja commented Nov 28, 2021

For the asyncio ones, stubtest checks them because "_enter_task" in asyncio.__all__.
For _thread, stubtest checks it because we do already have _ExceptHookArgs in the stub (this is the reason for asyncio.futures._TracebackLogger as well).

@AlexWaygood
Copy link
Member

AlexWaygood commented Nov 28, 2021

There are some interesting false-positives in pydoc. stubtest flags Helper.symbol, Helper.symbols_ and Helper.topic as being "missing" from the stub. But they're not proper attributes of the Helper class at all: they're just throwaway variables used in a for loop, the purpose of which is to populate the Helper.symbols dict. Source code here.

I imagine there's not an easy fix here, since it's just a fact that Python's for loops don't have their own scope, and it's a bit difficult to get around that. But, it seems like an interesting case nonetheless :)

@hauntsaninja
Copy link
Collaborator

Yeah, there are a couple cases like that... the module I remember is this (although stubtest actually ignores this because it's annoying to import), but I think it happens in some more serious modules too. The fix would have to be in CPython: a) del the variables from the namespace, b) prefix with underscore, c) use __all__ (looks like all three of these options are things that already happen with modules that use top level loops in the stdlib)

@hauntsaninja
Copy link
Collaborator

Getting slightly off-topic, but I also noticed that stubtest has some false negatives when finding primitive objects missing in stubs that are missing __all__

After fixing this, we find several more module level constants. We also find a few more loops:
https://github.com/python/cpython/blob/f87ea0350286837e9e96de03f8bfa215176c2928/Lib/inspect.py#L57 :-)

Here's the diff, python/mypy#11634, I recommend using it to find missing stubs cc @AlexWaygood

@AlexWaygood
Copy link
Member

Do we care about the flags stubtest is raising for collections.abc classes? stubtest is complaining because in the stubs they don't have __class_getitem__ methods, whereas at runtime they do. The difficulty is that the aliases for these classes in the typing module don't have __class_getitem__ at runtime. At the moment, the classes in collections.abc and the aliases for these classes in typing are the same in typeshed. So, might be a pain to fix. (Related: #6257)

@Akuli
Copy link
Collaborator Author

Akuli commented Nov 29, 2021

In any case, we can add the classes to allowlists for now and revisit them later if needed.

@JelleZijlstra
Copy link
Member

I'd avoid changing too much related to __class_getitem__ unless it solves a concrete problem. These classes are core to the type system and type checkers likely need to handle them specially anyway.

@AlexWaygood
Copy link
Member

FWIW I've basically added all the objects from the list that I feel comfortable adding (until I get my hands on @hauntsaninja's longer list). The remaining missing objects are either (1) false positives; (2) platform-specific; (3) deprecated; or (4) in modules I don't know enough about to be confident writing a PR for.

@LEDlastchance

This comment has been minimized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
project: policy Organization of the typeshed project
Projects
None yet
5 participants