Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Runtime implementation of TypedDict extension #2552

Merged
merged 7 commits into from
Dec 14, 2016

Conversation

ilevkivskyi
Copy link
Member

@ilevkivskyi ilevkivskyi commented Dec 10, 2016

Here is the runtime implementation of TypedDict initially proposed in python/typing#322, @davidfstr please take a look. You told you are interested only in keyword syntax, but here I add all three forms. I think it will be easy to add support for other forms in mypy (as it happened for NamedTuple), I could do this later in a separate PR. The implementation is tested on both Python 2 and 3 but I didn't find how to add Python2 runtime tests in mypy.

I copy the description from original typing PR:


Here is a simple (but quite flexible) implementation that supports three forms (iterable, keywords, class for 3.6+). In all forms type info for runtime introspection is accessible via __annotations__, multiple inheritance is supported. Some examples:

class Point1D(TypedDict):
    x: float
class Point2D(Point1D):
    y: float

Grid2D = TypedDict('Grid2D', [('x', int), ('y', int)])
Grid3D = TypedDict('Grid3D', {'x': int, 'y': int, 'z': int})
Info = TypedDict('Info', color=str, size=int)

class CoolPoint2D(Point2D, Info):
    pass

assert Point2D(x=1, y=2) == dict(x=1, y=2)
assert CoolPoint2D.__bases__ == (dict,)
assert CoolPoint2D.__annotations__ == {'x': float, 'y': float, 'size': int, 'color': str}

Everything is implemented to be pickleable and fast. I have measured and Point2D(x=1, y=2) is slower that dict(x=1, y=2) by only 3% to 11% depending on Python version.

def TypedDict(typename, fields):
"""TypedDict creates a dictionary type that expects all of its
import sys
from typing import _type_check # type: ignore
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am using a non-public API here, but probably it is better than just copy the whole function here.


@skipUnless(PY36, 'Python 3.6 required')
def test_class_syntax_usage(self):
self.assertEqual(LabelPoint2D.__annotations__, {'x': int, 'y': int, 'label': str}) # noqa
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added # noqa because flake8 complains about undefined name.

Copy link
Contributor

@davidfstr davidfstr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking pretty good. I left a few comments.

The metaclass magic is impressive.

"""
def new_dict(*args, **kwargs):
return dict(*args, **kwargs)
def __new__(cls, _typename, fields=None, **kwargs):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd suggest _fields as the parameter name (rather than fields) so that it doesn't clash with an attempt to use a dictionary item called "fields". This fulfills the desire that TypedDict should accept any item names that do not begin with an underscore.

Class = TypedDict('Class', name=str, fields=List[str], methods=List[str])

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.


def test_basics_iterable_syntax(self):
# Check that two iterables allowed
Emp = TypedDict('Emp', [('name', str), ('id', int)])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was under the impression that we weren't going to support the list of tuples syntax, based on python/typing#28 (comment) and python/typing#28 (comment) .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is rather by accident. The implementation just takes an arbitrary iterable as an argument, e.g., a dictionary TypedDict('Employee', {'name': str, 'id': int}). List of tuples etc. also work. Basically everything that could be given to a normal dict works. Of course we could prohibit everything apart from a real dict, but I think that would be redundant. Anyway, I just removed this test, since I agree that it is unnecessary.

Emp = TypedDict('Emp', {'name': str, 'id': int})
self.assertIsSubclass(Emp, dict)
jim = Emp(name='Jim', id=1)
self.assertIsInstance(jim, Emp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this check fail? I thought we were going to prevent isinstance() checks with TypedDicts since they would true for any dict type. Indeed in test_typeddict_errors() below that seems to be the case.

Actually in rereading your implementation it looks like the runtime type of Emp(...) is actually an Emp rather than a plain dict, as the TypedDict doc comment says. I would advocate that even if Emp itself has __attributes__ and other nice introspection properties that instances still be plain dict objects. This is similar to how NewType works.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I remember I tried this, but it didn't give any visible speed-up. Current implementation is already slower than plain dict only by 10%, so that it is difficult to improve this. Anyway, probably you are right that it is better to make instances just dicts for consistency with other types. I did this in a new commit (this unfortunately required a bit more "magic" to be compatible with both Python 2 and 3, note some other magic already present is only because of different class machinery in Python 2 and 3).



def _check_fails(cls, other):
if sys._getframe(1).f_globals['__name__'] not in ['abc', 'functools']:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1) Is there a reason that these particular modules are special-cased so that they do allow isinstance checks?

(2) Perhaps the call to sys._getframe() should be guarded when using a non-CPython implementation that doesn't support sys._getframe()? I see this is done in _TypedDictMeta.__new__ below.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(1) Yes, this is because there are several functions in these two modules that do "reckless" scan with isinstance and issubclass over .__subclasses__.
(2) Good catch. Fixed.

@ilevkivskyi
Copy link
Member Author

@davidfstr Thank you for a thorough review! I implemented your comments in a new commit.

Copy link
Contributor

@davidfstr davidfstr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few more small comments and this PR looks good to me.

@@ -0,0 +1 @@
# This page intentioanlly left blank.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: intentioanlly -> intentionally

Emp = TypedDict('Emp', {'name': str, 'id': int})
self.assertIsSubclass(Emp, dict)
jim = Emp(name='Jim', id=1)
self.assertIsInstance(jim, dict)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggest strengthen this check to:

self.assertTrue(type(jim) is dict)

Ditto for other occurrences below.

@ilevkivskyi
Copy link
Member Author

@davidfstr Thank you for another review! I implemented your comments (and added few more tests) in a new commit.

@davidfstr
Copy link
Contributor

Looks ready for squash-and-merge to me.

Any comments @gvanrossum or others with merge power?

@gvanrossum
Copy link
Member

gvanrossum commented Dec 12, 2016 via email

@@ -0,0 +1 @@
# This page intentionally left blank.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The addition of this file seems wrong. "extensions" is not supposed to be a Python package, it's just a folder containing the mypy_extensions module. That's why there's a setup.py here. All other code uses "from mypy_extensions import TypedDict" which implies that "mypy_extensions" is the top-level name, not "extensions". (The latter would be a terrible top-level package name since it is entirely non-specific.)

I realize that you're probably doing this so that your test code can import it, but really, the test code should somehow have this directory on its sys.path to make "from mypy_extensions import TypedDict" work instead.

And before you add "what harm can it do", there are circumstances where mypy searches up from a file until it reaches a directory without an __init__.py (or .pyi) file, and that algorithm would be affected by this addition. So please find a way to do without it.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Situation was actually opposite here. I added this file because otherwise mypy was complaining. But now I see that I misunderstood the idea. I have removed this file.

@@ -7,16 +7,76 @@

# NOTE: This module must support Python 2.7 in addition to Python 3.x

import sys
import types
from typing import _type_check # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure you want to depend on this internal API? It means that if in the future we change some version of typing.py it will break with this version of mypy. This may be acceptable given our plans here, but it's still a red flag...

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand this is a bad idea to use the private API, this is why I added this comment #2552 (comment) But the function is really helpful, it takes care about forward references, callables, invalid types, etc. And it is not easy to reimplement this function here, since the implementation depends on other private typing API like _ForwardRef and _TypingBase. I think it is OK to keep this. What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, it's fine to keep, but please add a comment to the code for posterity. (I also worry that before we know it this will become a de-facto public API -- I've seen others ask for the same functionality, e.g. in the issue asking to restore get_type_hint() for Python 2.)

y: int
label: str

The latter syntax is only supported in Python 3.6+
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add explicitly (as a reminder) that first two syntax forms works for Python 2(.7) and 3?

_fields = kwargs
elif kwargs:
raise TypeError("Either list of fields or keywords"
" can be provided to TypedDict, not both")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a list but a dict, right? Anyway, I'd rewrite this error in the active tense, and without mentioning "fields" (which is questionable terminology here): "TypedDict takes either a dict or keyword arguments, but not both".

new_dict.__name__ = typename
new_dict.__supertype__ = dict
return new_dict
TypedDict = _TypedDictMeta('TypedDict', _TypedDict.__bases__, dict(_TypedDict.__dict__))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm guessing this awkward way of defining the class is to get around the different metaclass syntaxes for Python 2 and 3. But it makes my head explode trying to understand what's happening. I think that one of the problems is actually that TypedDict is itself a metaclass. (We see this when we realize that instances of TypedDict are really classes whose instances are dicts.) So maybe _TypedDict needs to be combined into TypedDictMeta? What tipped me off was that _TypedDict.__new__ ends up not calling object() or super().__new__() but cls.__class__(), which is really a fancy way of spelling TypedDictMeta().

except ImportError:
import collections as collections_abc # type: ignore # PY32 and earlier
from unittest import TestCase, main, skipUnless
from extensions.mypy_extensions import TypedDict
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line should somehow be from mypy_extensions import TypedDict, see my comment on __init__.py.

TypedDict('Hi', [('x', int)], y=int)

@skipUnless(PY36, 'Python 3.6 required')
def test_class_syntax_usage(self):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you add "py36" or something similar to the test case name?

@ilevkivskyi
Copy link
Member Author

@gvanrossum Thank you for a review! I implemented your comments in new commits. Also I have fixed an unguarded node look-up in process_typeddict_definition that was causing a crash. It looks like this crash is not related to #2535

Copy link
Member

@gvanrossum gvanrossum left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, getting closer!

@@ -7,16 +7,76 @@

# NOTE: This module must support Python 2.7 in addition to Python 3.x

import sys
import types
from typing import _type_check # type: ignore
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, it's fine to keep, but please add a comment to the code for posterity. (I also worry that before we know it this will become a de-facto public API -- I've seen others ask for the same functionality, e.g. in the issue asking to restore get_type_hint() for Python 2.)

_fields = kwargs
elif kwargs:
raise TypeError("TypedDict takes either a dict or "
"keyword arguments, but not both")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you break after the comma? (I worry about people grepping for the error message.)

anns.update(base.__dict__.get('__annotations__', {}))
tp_dict.__annotations__ = anns
if name == 'TypedDict':
tp_dict.__new__ = types.MethodType(_typeddict_new, tp_dict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK the signature of MethodType is undocumented. Why not just remove the unused inst arg from _typeddict_new and _dict_new and plunk the bare function into tp_dict.__new__? It seems to work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to work.

This approach does not work in Python 2. If __new__ returns an object of "wrong" class, then it looks like isinstance() is called on it and the latter is prohibited. Anyway, I have found a much simpler way: just insert __new__ into class namespace before super().__new__ i.e. type.__new__ is called. The latter will do whatever is necessary for both Python 2 and 3.


def _check_fails(cls, other):
try:
if sys._getframe(1).f_globals['__name__'] not in ['abc', 'functools']:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm... If I comment out this line things still work, suggesting that it's not important that it returns False when invoked from those modules. I suppose this is from when this code was part of typing.py?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

False will be returned also if sys._getframe() fails in non-CPython implementations. In general I think it is better if __subclasscheck__() and __instancecheck__() will return False rather than None.

if name == 'TypedDict':
tp_dict.__new__ = types.MethodType(_typeddict_new, tp_dict)
else:
tp_dict.__new__ = types.MethodType(_dict_new, tp_dict)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, I finally remember why this is so complicated -- we must support both calling TypedDict and subclassing it. I like the new refactoring better, but maybe it's worth adding a comment explaining the situation? (The docstring is more for external use.)

@ilevkivskyi
Copy link
Member Author

@gvanrossum Thank you for another review! I implemented requested changes and added comments in code and one small test.

@gvanrossum gvanrossum merged commit a9d72e4 into python:master Dec 14, 2016
@gvanrossum
Copy link
Member

Whee! Hope @davidfstr can now start implementing some of the features implemented here. (We should probably kill some # type: ignore comments once that's done.)

@ilevkivskyi
Copy link
Member Author

I think we could split the work here. I could take care about class based syntax in Python 3.6, while @davidfstr could take care about keyword based syntax #2492

@davidfstr
Copy link
Contributor

Yep sounds good to me. I was actually planning to work on #2492 next.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants