Recognizing dictionary-like objects #2120

Closed
davidfstr opened this issue Sep 11, 2016 · 6 comments

@davidfstr
Contributor

The project I am currently working on, which I am starting to use mypy to typecheck, makes occasional use of what I call "dictionary-like objects". Namely JSON-compatible structures with specific types. For example:

def _create_action(target_type, target_pk, action_type, action_params):
    # Action - A command that can be performed on the planner calendar.
    return dict(
        target_type=target_type,  # one of ['slot', 'day', 'date', 'global']
        target_pk=target_pk,  # int
        action_type=action_type,  # str
        action_params=action_params  # list[Any]
    )

My understanding is that there is no way in mypy to spell the type for a dictionary-like object such as Action above. Instead the best I could do is Mapping[str, Any] or object, both of which are too broad to be useful. All reads of an Action must use one of the standard keys (target_type, target_pk, action_type, action_params) and the value type for each particular key is known in advance.

It would be nice to annotate the type of Action using something like:

from typing import NamedDict

Action = NamedDict([
    ('target_type', str),  # one of ['slot', 'day', 'date', 'global']
    ('target_pk', int),
    ('action_type', str),
    ('action_params', list[Any]),
])
Action.__doc__ = 'A command that can be performed on the planner calendar.'

def _create_action(target_type, target_pk, action_type, action_params):
    return Action(
        target_type=target_type,
        target_pk=target_pk,
        action_type=action_type,
        action_params=action_params
    )

Thoughts? Is this kind of typed dictionary actually understood by mypy and I've just missed it? Is this a missing capability that has been brought up before?

@elazarg
Contributor

elazarg commented Sep 11, 2016

NamedTuple is pretty close:

from typing import NamedTuple

Action = NamedTuple("Action", [
    ('target_type', str),  # one of ['slot', 'day', 'date', 'global']
    ('target_pk', int),
    ('action_type', str),
    ('action_params', list),
])
Action.__doc__ = 'A command that can be performed on the planner calendar.'

_create_action is simply Action.

If you want to get the actual dictionary (for serialization) you have _asdict() which admittedly loses type information.
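A minimal sketch of the NamedTuple route described above, reusing the field names from the example. The sample values and the variable names `action`, `as_dict`, and `serialized` are illustrative only.

```python
import json
from typing import Any, List, NamedTuple

Action = NamedTuple("Action", [
    ("target_type", str),  # one of ['slot', 'day', 'date', 'global']
    ("target_pk", int),
    ("action_type", str),
    ("action_params", List[Any]),
])

action = Action(
    target_type="slot",
    target_pk=1,
    action_type="move_down",
    action_params=[3],
)

# Attribute access is something a type checker can verify per field.
pk = action.target_pk

# _asdict() returns a plain mapping for serialization; the per-key
# type information is no longer visible to the checker at this point.
as_dict = action._asdict()
serialized = json.dumps(as_dict)
```

The extra `_asdict()` call before `json.dumps` is the conversion step that a dict-based type would avoid.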

@elazarg
Contributor

elazarg commented Sep 11, 2016

If you really want NamedDict with arbitrary string access, you will need dependent types. See #2115

If you only want access with string literals, you still have something similar to requiring mypy to understand y in x = (1, 'int'); y = x * 2 as Tuple[int, str, int, str]. It does not.
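For concreteness, a small sketch of the tuple example at runtime. The annotation on `x` is added here for illustration; the point in the comment above is about what the checker infers for `y`, not about runtime behavior.

```python
from typing import Tuple

x: Tuple[int, str] = (1, 'int')

# At runtime, repetition by the literal 2 produces a 4-element tuple whose
# element types follow the pattern int, str, int, str. Inferring that
# precise type statically requires special casing on the literal operand.
y = x * 2
```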

@davidfstr
Contributor Author

davidfstr commented Sep 11, 2016

In my case I'd only be advocating for a NamedDict that allows access by string literals only.

To elaborate more on my use case, I have some data structures that spend most of their life either (1) in a JSON-field in a database or (2) being serialized to and from a JavaScript frontend, which is why they are defined directly as dict objects with specific keys rather than as something nicer like namedtuple (which would actually enforce correct key usage at runtime).

So I'm looking for a way to preserve the value as a dict at runtime but get type checking for key accesses by a type checker.

Usage of NamedTuple and _asdict(), as you point out, is very close, but I'd like to avoid rewriting all of my prior code that assumes dictionaries with d['key'] accesses rather than namedtuples with d.key accesses. It would also be nice if the abstraction supports zero-cost conversion to/from plain dict objects: I receive dicts directly from json.loads rather than namedtuples, and I like being able to serialize directly with json.dumps rather than needing to call something like _asdict() on everything.


For further illustration, here are some things I'd love to be able to do:

def create_action() -> Action:
    return Action(
        target_type='slot',
        target_pk=1,
        action_type='move_down',
        action_params=[3],
    )  # erases to dict at runtime; type checker can prove all keys are safe

def save_action(model: SomeModel, action: Action):
    model.last_action = json.dumps(action)  # at runtime Action is actually a dict
    model.save()

def load_action(model: SomeModel) -> Action:
    return cast(Action, json.loads(model.last_action))  # at runtime Action is actually a dict

def manipulate_action_example(action: Action):
    action['target_type'] = 'slot'  # type checker accepts 'target_type' as key access

    key = 'target_type'
    action[key] = 'slot'  # type checker likely cannot determine whether valid; maybe warn?

    action['target_typ'] = 'slot'  # type checker rejects provably misspelled key

def inspect_action_example(action: Action):
    assert isinstance(action, dict)
    assert isinstance(action, Action)  # runtime accepts; type checker might warn or error

# A command that can be performed on the planner calendar.
Action = NamedDict('Action', [  # erases to be just a plain dict at runtime
    ('target_type', str),  # one of ['slot', 'day', 'date', 'global']
    ('target_pk', int),
    ('action_type', str),
    ('action_params', list),
], dict)  # optional last param specifies the underlying mapping type; default is dict

I know that I've been using the spelling NamedDict above but something like DictObject or TypedDict might be more appropriate to distance it from NamedTuple since NamedDict does not enforce correct key usage at runtime.
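For reference, a feature along these lines was later standardized as `typing.TypedDict` (PEP 589, available in the standard library from Python 3.8). The following is a minimal sketch of the wish list above under that assumption; the `raw` parameter of `load_action` and the sample values are illustrative and stand in for the model field used earlier.

```python
import json
from typing import Any, List, TypedDict, cast


class Action(TypedDict):
    """A command that can be performed on the planner calendar."""
    target_type: str  # one of ['slot', 'day', 'date', 'global']
    target_pk: int
    action_type: str
    action_params: List[Any]


def create_action() -> Action:
    # Calling the TypedDict class builds a plain dict at runtime.
    return Action(
        target_type='slot',
        target_pk=1,
        action_type='move_down',
        action_params=[3],
    )


def load_action(raw: str) -> Action:
    # cast() does nothing at runtime; it only informs the type checker
    # and does not validate the decoded JSON.
    return cast(Action, json.loads(raw))


action = create_action()
roundtripped = load_action(json.dumps(action))  # no conversion step needed
action['target_type'] = 'day'  # literal key, so a checker can verify it
```

Because the value is an ordinary dict, existing `d['key']` code and direct `json.dumps` calls keep working unchanged.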


If you only want access with string literals, you still have something similar to requiring mypy to understand y in x = (1, 'int'); y = x * 2 as Tuple[int, str, int, str]. It does not.

I'm afraid I don't follow this comment at all. I've only been talking about dictionaries so far, not tuples or tuple manipulations.

@elazarg
Contributor

elazarg commented Sep 11, 2016

Regarding the type definition, you basically want a type that is parameterized, in addition to types, by string literals. Currently the only parameterization that's not only types is that of NamedTuple (and class, if you wish). I think that such an addition to the language should be discussed in typeshed, the python-ideas mailing list or similar.

My last comment was about type checking process; I am sorry about the telegraphic phrasing. Let me elaborate. In order to support checking action['target_typ'] = 'slot' there are 3 possibilities:

  1. Support indexing by arbitrary strings, where the resulting type should depend on the value - this is called dependent-typing, and you've made it clear that it is not proposed.
  2. Support indexing by compile-time constant strings. In such a case you'd need constant propagation (which mypy does not do yet) in addition to (3).
  3. Support indexing by string literals, as you suggested. This is reasonable (given the existence of the NamedDict type-constructor) and is similar to supporting setattr('target_typ', 'slot'). For now it has to be special cased in the checker, in a similar way to the special casing of tuple access. My example referred to a related (and simpler to implement) special casing - multiplying an instance of Tuple[int, str] by the integer literal 2, which can be inferred as Tuple[int, str, int, str] but is not.
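The difference between the options can be illustrated with a toy model. This is not how mypy is implemented; `ACTION_FIELDS` and `check_literal_index` are hypothetical names invented for this sketch, which only models option 3: the checker knows the key when it is a string literal and gives up otherwise.

```python
from typing import Dict, Optional

# Hypothetical schema: key name -> declared value type.
ACTION_FIELDS: Dict[str, type] = {
    'target_type': str,
    'target_pk': int,
    'action_type': str,
    'action_params': list,
}


def check_literal_index(fields: Dict[str, type],
                        key_literal: Optional[str]) -> str:
    """Toy verdict for `d[<expr>]` when only string literals are understood.

    key_literal is the literal's value, or None when the index expression
    is not a string literal (for example, a variable).
    """
    if key_literal is None:
        # Options 1 and 2 would be needed to say more about this case.
        return 'unknown: key is not a string literal'
    if key_literal not in fields:
        return 'error: no key {!r}'.format(key_literal)
    return 'ok: {}'.format(fields[key_literal].__name__)


literal_ok = check_literal_index(ACTION_FIELDS, 'target_type')
literal_typo = check_literal_index(ACTION_FIELDS, 'target_typ')
via_variable = check_literal_index(ACTION_FIELDS, None)
```

Handling the `via_variable` case precisely is what would require constant propagation or dependent typing.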

@JukkaL
Collaborator

JukkaL commented Sep 11, 2016

There's an existing issue in the typing repo for this (python/typing#28) and also a mypy issue (#985).

Special casing this in the type checker would be possible, similar to special casing tuples. As @elazarg pointed out, this would be harder to do for non-literal keys.

I've thought about this in the past, and the design and implementation wouldn't be simple. Some challenges are mentioned in the issue above. The feature could be pretty useful, though.

@JukkaL
Collaborator

JukkaL commented Sep 11, 2016

Closing as dupe. You can continue the discussion in the issues mentioned above -- the typing issue could be better for more general things such as type syntax, while the mypy issue is good for technical details.

@JukkaL JukkaL closed this as completed Sep 11, 2016