-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Runtime implementation of TypedDict extension #2552
Conversation
def TypedDict(typename, fields): | ||
"""TypedDict creates a dictionary type that expects all of its | ||
import sys | ||
from typing import _type_check # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am using a non-public API here, but probably it is better than just copy the whole function here.
|
||
@skipUnless(PY36, 'Python 3.6 required') | ||
def test_class_syntax_usage(self): | ||
self.assertEqual(LabelPoint2D.__annotations__, {'x': int, 'y': int, 'label': str}) # noqa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added # noqa
because flake8 complains about undefined name.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking pretty good. I left a few comments.
The metaclass magic is impressive.
""" | ||
def new_dict(*args, **kwargs): | ||
return dict(*args, **kwargs) | ||
def __new__(cls, _typename, fields=None, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest _fields
as the parameter name (rather than fields
) so that it doesn't clash with an attempt to use a dictionary item called "fields". This fulfills the desire that TypedDict should accept any item names that do not begin with an underscore.
Class = TypedDict('Class', name=str, fields=List[str], methods=List[str])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
|
||
def test_basics_iterable_syntax(self): | ||
# Check that two iterables allowed | ||
Emp = TypedDict('Emp', [('name', str), ('id', int)]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was under the impression that we weren't going to support the list of tuples syntax, based on python/typing#28 (comment) and python/typing#28 (comment) .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is rather by accident. The implementation just takes an arbitrary iterable as an argument, e.g., a dictionary TypedDict('Employee', {'name': str, 'id': int})
. List of tuples etc. also work. Basically everything that could be given to a normal dict
works. Of course we could prohibit everything apart from a real dict
, but I think that would be redundant. Anyway, I just removed this test, since I agree that it is unnecessary.
Emp = TypedDict('Emp', {'name': str, 'id': int}) | ||
self.assertIsSubclass(Emp, dict) | ||
jim = Emp(name='Jim', id=1) | ||
self.assertIsInstance(jim, Emp) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't this check fail? I thought we were going to prevent isinstance() checks with TypedDicts since they would true for any dict type. Indeed in test_typeddict_errors() below that seems to be the case.
Actually in rereading your implementation it looks like the runtime type of Emp(...) is actually an Emp rather than a plain dict, as the TypedDict doc comment says. I would advocate that even if Emp itself has __attributes__
and other nice introspection properties that instances still be plain dict objects. This is similar to how NewType works.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I remember I tried this, but it didn't give any visible speed-up. Current implementation is already slower than plain dict
only by 10%, so that it is difficult to improve this. Anyway, probably you are right that it is better to make instances just dict
s for consistency with other types. I did this in a new commit (this unfortunately required a bit more "magic" to be compatible with both Python 2 and 3, note some other magic already present is only because of different class machinery in Python 2 and 3).
|
||
|
||
def _check_fails(cls, other): | ||
if sys._getframe(1).f_globals['__name__'] not in ['abc', 'functools']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(1) Is there a reason that these particular modules are special-cased so that they do allow isinstance checks?
(2) Perhaps the call to sys._getframe() should be guarded when using a non-CPython implementation that doesn't support sys._getframe()? I see this is done in _TypedDictMeta.__new__
below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(1) Yes, this is because there are several functions in these two modules that do "reckless" scan with isinstance
and issubclass
over .__subclasses__
.
(2) Good catch. Fixed.
@davidfstr Thank you for a thorough review! I implemented your comments in a new commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few more small comments and this PR looks good to me.
@@ -0,0 +1 @@ | |||
# This page intentioanlly left blank. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: intentioanlly -> intentionally
Emp = TypedDict('Emp', {'name': str, 'id': int}) | ||
self.assertIsSubclass(Emp, dict) | ||
jim = Emp(name='Jim', id=1) | ||
self.assertIsInstance(jim, dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Suggest strengthen this check to:
self.assertTrue(type(jim) is dict)
Ditto for other occurrences below.
@davidfstr Thank you for another review! I implemented your comments (and added few more tests) in a new commit. |
Looks ready for squash-and-merge to me. Any comments @gvanrossum or others with merge power? |
I'll try to give it a quick look over and merge. I'm running behind with
stuff, as usual. :-(
|
@@ -0,0 +1 @@ | |||
# This page intentionally left blank. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The addition of this file seems wrong. "extensions" is not supposed to be a Python package, it's just a folder containing the mypy_extensions module. That's why there's a setup.py here. All other code uses "from mypy_extensions import TypedDict" which implies that "mypy_extensions" is the top-level name, not "extensions". (The latter would be a terrible top-level package name since it is entirely non-specific.)
I realize that you're probably doing this so that your test code can import it, but really, the test code should somehow have this directory on its sys.path
to make "from mypy_extensions import TypedDict" work instead.
And before you add "what harm can it do", there are circumstances where mypy searches up from a file until it reaches a directory without an __init__.py
(or .pyi
) file, and that algorithm would be affected by this addition. So please find a way to do without it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Situation was actually opposite here. I added this file because otherwise mypy was complaining. But now I see that I misunderstood the idea. I have removed this file.
@@ -7,16 +7,76 @@ | |||
|
|||
# NOTE: This module must support Python 2.7 in addition to Python 3.x | |||
|
|||
import sys | |||
import types | |||
from typing import _type_check # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you sure you want to depend on this internal API? It means that if in the future we change some version of typing.py it will break with this version of mypy. This may be acceptable given our plans here, but it's still a red flag...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand this is a bad idea to use the private API, this is why I added this comment #2552 (comment) But the function is really helpful, it takes care about forward references, callables, invalid types, etc. And it is not easy to reimplement this function here, since the implementation depends on other private typing
API like _ForwardRef
and _TypingBase
. I think it is OK to keep this. What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, it's fine to keep, but please add a comment to the code for posterity. (I also worry that before we know it this will become a de-facto public API -- I've seen others ask for the same functionality, e.g. in the issue asking to restore get_type_hint() for Python 2.)
y: int | ||
label: str | ||
|
||
The latter syntax is only supported in Python 3.6+ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe add explicitly (as a reminder) that first two syntax forms works for Python 2(.7) and 3?
_fields = kwargs | ||
elif kwargs: | ||
raise TypeError("Either list of fields or keywords" | ||
" can be provided to TypedDict, not both") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a list but a dict, right? Anyway, I'd rewrite this error in the active tense, and without mentioning "fields" (which is questionable terminology here): "TypedDict takes either a dict or keyword arguments, but not both".
new_dict.__name__ = typename | ||
new_dict.__supertype__ = dict | ||
return new_dict | ||
TypedDict = _TypedDictMeta('TypedDict', _TypedDict.__bases__, dict(_TypedDict.__dict__)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm guessing this awkward way of defining the class is to get around the different metaclass syntaxes for Python 2 and 3. But it makes my head explode trying to understand what's happening. I think that one of the problems is actually that TypedDict
is itself a metaclass. (We see this when we realize that instances of TypedDict are really classes whose instances are dicts.) So maybe _TypedDict
needs to be combined into TypedDictMeta
? What tipped me off was that _TypedDict.__new__
ends up not calling object()
or super().__new__()
but cls.__class__()
, which is really a fancy way of spelling TypedDictMeta()
.
except ImportError: | ||
import collections as collections_abc # type: ignore # PY32 and earlier | ||
from unittest import TestCase, main, skipUnless | ||
from extensions.mypy_extensions import TypedDict |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This line should somehow be from mypy_extensions import TypedDict
, see my comment on __init__.py
.
TypedDict('Hi', [('x', int)], y=int) | ||
|
||
@skipUnless(PY36, 'Python 3.6 required') | ||
def test_class_syntax_usage(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add "py36" or something similar to the test case name?
@gvanrossum Thank you for a review! I implemented your comments in new commits. Also I have fixed an unguarded node look-up in |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, getting closer!
@@ -7,16 +7,76 @@ | |||
|
|||
# NOTE: This module must support Python 2.7 in addition to Python 3.x | |||
|
|||
import sys | |||
import types | |||
from typing import _type_check # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, it's fine to keep, but please add a comment to the code for posterity. (I also worry that before we know it this will become a de-facto public API -- I've seen others ask for the same functionality, e.g. in the issue asking to restore get_type_hint() for Python 2.)
_fields = kwargs | ||
elif kwargs: | ||
raise TypeError("TypedDict takes either a dict or " | ||
"keyword arguments, but not both") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you break after the comma? (I worry about people grepping for the error message.)
anns.update(base.__dict__.get('__annotations__', {})) | ||
tp_dict.__annotations__ = anns | ||
if name == 'TypedDict': | ||
tp_dict.__new__ = types.MethodType(_typeddict_new, tp_dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAIK the signature of MethodType
is undocumented. Why not just remove the unused inst
arg from _typeddict_new
and _dict_new
and plunk the bare function into tp_dict.__new__
? It seems to work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems to work.
This approach does not work in Python 2. If __new__
returns an object of "wrong" class, then it looks like isinstance()
is called on it and the latter is prohibited. Anyway, I have found a much simpler way: just insert __new__
into class namespace before super().__new__
i.e. type.__new__
is called. The latter will do whatever is necessary for both Python 2 and 3.
|
||
def _check_fails(cls, other): | ||
try: | ||
if sys._getframe(1).f_globals['__name__'] not in ['abc', 'functools']: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm... If I comment out this line things still work, suggesting that it's not important that it returns False when invoked from those modules. I suppose this is from when this code was part of typing.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
False
will be returned also if sys._getframe()
fails in non-CPython implementations. In general I think it is better if __subclasscheck__()
and __instancecheck__()
will return False
rather than None
.
if name == 'TypedDict': | ||
tp_dict.__new__ = types.MethodType(_typeddict_new, tp_dict) | ||
else: | ||
tp_dict.__new__ = types.MethodType(_dict_new, tp_dict) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I finally remember why this is so complicated -- we must support both calling TypedDict and subclassing it. I like the new refactoring better, but maybe it's worth adding a comment explaining the situation? (The docstring is more for external use.)
@gvanrossum Thank you for another review! I implemented requested changes and added comments in code and one small test. |
Whee! Hope @davidfstr can now start implementing some of the features implemented here. (We should probably kill some |
I think we could split the work here. I could take care about class based syntax in Python 3.6, while @davidfstr could take care about keyword based syntax #2492 |
Yep sounds good to me. I was actually planning to work on #2492 next. |
Here is the runtime implementation of
TypedDict
initially proposed in python/typing#322, @davidfstr please take a look. You told you are interested only in keyword syntax, but here I add all three forms. I think it will be easy to add support for other forms in mypy (as it happened forNamedTuple
), I could do this later in a separate PR. The implementation is tested on both Python 2 and 3 but I didn't find how to add Python2 runtime tests in mypy.I copy the description from original
typing
PR:Here is a simple (but quite flexible) implementation that supports three forms (iterable, keywords, class for 3.6+). In all forms type info for runtime introspection is accessible via
__annotations__
, multiple inheritance is supported. Some examples:Everything is implemented to be pickleable and fast. I have measured and
Point2D(x=1, y=2)
is slower thatdict(x=1, y=2)
by only 3% to 11% depending on Python version.