Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

TypedDict with keys that aren't identifiers, inheritance and WSGI #7654

Closed
shabbyrobe opened this issue Oct 8, 2019 · 5 comments
Closed

Comments

@shabbyrobe
Copy link

shabbyrobe commented Oct 8, 2019

Hello! I'm not sure if this is a bug report (probably not), support request (definitely), a request for a missing feature (probably, though what, exactly, I'm not sure), or just a bit of an "experience report", but I'm struggling a bit with TypedDict while trying to lock down our WSGI environ vars.

It seems that this use case isn't covered particularly well by the current implementation, but I'm not sure if that's just because I'm just not "holding it right". Everything was going swimmingly until I bumped into the WSGI environ keys, then things started to unravel. It doesn't seem to be possible to declare wsgi.input as a key using the class-based syntax:

class Environ(TypedDict):
    wsgi.input: str

That makes sense, wsgi.input isn't syntactically legal... So my first instinct was to quote it, just in case, but nah, "illegal target for annotation":

class Environ(TypedDict):
    "wsgi.input": str

Oh well, it was worth a try, but I knew up front it was a long shot! So then I tried out the alternative syntax, which seems to do the trick nicely:

Environ = TypedDict("Environ", {"wsgi.input": Reader})

Except now I have to find a way to represent my application-specific headers, and I'd like to keep them separate from the core definitions because I have a few applications that need to make use of them. The PEP says that there's inheritance, which would solve the problem perfectly. My first thought was that TypedDict would mirror the syntax for type() and allow me to specify bases, but another quick glance at the PEP says it's only available if I'm using the class syntax. I thought maybe I could just re-use the dict, but I ran into #4128:

_environ_keys = {"wsgi.input": Reader}
Environ = TypedDict("Environ", _environ_keys)

# BORK! TypedDict() expects a dictionary literal as the second argument
MyEnviron = TypedDict("MyEnviron", {**_environ_keys, "HTTP_X_FLEEB": Optional[str]})

That was the end of that thread for now, I went back to just rolling it all into one big TypedDict declared using the alternative syntax; I'll worry about re-use later.

The last thing I ran into was the problem with accepting "extra keys", outlined in #4617, which prevents me from being able to have a fallback "catch-all" for the HTTP_ variables documented here

Variables corresponding to the client-supplied HTTP request headers (i.e., variables whose names begin with HTTP_). The presence or absence of these variables should correspond with the presence or absence of the appropriate HTTP header in the request.

There's a lot of these, of course! It's HTTP headers we're talking about, after all! I'm not really sure if "extra keys" would solve that problem, but I thought I should add it just to make the story complete.

I'm sort of a bit stuck - I've got a lot of WSGI stuff here that I can't really do much with type-wise unless I go through and unwrap environ into a dataclass, but some of this stuff is performance sensitive (we use Django for things that aren't and plain ol' WSGI for things that are, which works great!) and a whole dataclass conversion step is not going to keep us inside our performance budgets.

I'll keep pushing it around the page for a little while longer and see what other approaches I can find but it'd be great if there was a way to support WSGI a bit more thoroughly. environ can be a bit amorphous and TypedDict could be a huge help at making things a bit more resilient and safe to work with.

Thank you!

EDIT: I did keep pushing things around the page. Here's what I currently have (still needs a lot more work, of course): https://github.com/shabbyrobe/wsgitypes. This lets me type the bulk of our WSGI applications; I'll keep adding to it as I find more bits.

I found a way around the inheritance issue: I had been presuming (incorrectly, as it turns out) that trying to inherit a TypedDict declared using the alternative syntax wouldn't work, but it works just fine! For example:

Foo = TypedDict("Foo", {"one": str})
Bar = TypedDict("Bar", {"two": str})
class Baz(Foo, Bar):
    pass
yep = {"one": "yep", "two": "yep"}

This seemed to allow the field-combining behavior I was looking for, so I'll run with it for now.

@gvanrossum
Copy link
Member

Yeah, the way to do this is to use inheritance as you wrote in your EDIT section. (That's also how we deal with things like making some fields mandatory but others not.)

I think there's nothing else to say here, so I'm closing this -- if you disagree, please say so and I'll reopen!

@shabbyrobe
Copy link
Author

shabbyrobe commented Oct 9, 2019

Thanks for the reply @gvanrossum!

I have been pushing this around a bit further today and I'm finding that the contravariance of function parameters is making my inheritance approach unworkable.

I think the problem stems from the fact that I'm trying to use typing to improve the experience for people implementing things within the WSGI framework, rather than using typing to improve the experience for people implementing a WSGI framework.

In my case, I wanted to use the generic parameter to allow using a more specific type within the callable implementation, in order to help protect coding within the bounds of the body of __call__ with some additional safety:

# wsgitypes.py
class Environ(typing_extensions.TypedDict):
    PATH_INFO: str # etc etc

TEnviron = typing.TypeVar("TEnviron", bound=Environ)

# my_app.py
class MoreSpecificEnviron(wsgitypes.Environ):
    HTTP_X_EXTRA: str

class MyWSGIApplication(wsgitypes.Application[MoreSpecificEnviron]):
    def __call__(self, environ: MoreSpecificEnviron, start_response: wsgitypes.StartResponse) -> wsgitypes.Response:
        environ.get("HTTP_X_EXTRA")
        return []

I understand the reasons why contravariance invalidates this construct, but I'm still searching for a way to support this narrowing within the confines of a WSGI application.

For the time being, I've just abandoned all genericity in wsgitypes.Application, pinning it to the base wsgitypes.Environ, and in each of my __call__ functions, I'm just doing this now:

class MyWSGIApplication(wsgitypes.Application):
    def __call__(self, environ: wsgitypes.Environ, start_response: wsgitypes.StartResponse) -> wsgitypes.Response:
        environ = t.cast(MoreSpecificEnviron, environ)
        environ.get("HTTP_X_EXTRA") # works
        return []

I suppose in hindsight, that's really not much different in terms of burden from what I was doing with the generic, so I guess I'm probably close to an optimal solution here? If so, yep I think it makes sense to close, if you feel there is nothing of value that can be taken away to improve the experience.

@gvanrossum
Copy link
Member

I'm sorry, I haven't used WSGI in ages and you left out the definitions for Application and Response, and TEnviron isn't used, so I can't quite follow your code, and I can't copy-paste it into a little test file to see what mypy makes of it.

Perhaps you can boil the problem down to a simpler example (preferably a single self-contained file, with all the imports) that still shows the error you're trying to get rid of?

Using a cast is a measure of last resort, especially if (IIUC) each app will have its own "more specific environment" and need a cast.

If the issue is that the method definition in the subclass reports the error Argument 1 of "__call__" is incompatible with supertype "Application"; supertype defines the argument type as "Environ" (or something similar), maybe you could silence that error by placing # type: ignore[override] on the line where you get that error. That's the proper way to get rid of the contravariance error, see https://mypy.readthedocs.io/en/latest/error_code_list.html#check-validity-of-overrides-override.

@shabbyrobe
Copy link
Author

shabbyrobe commented Oct 9, 2019

I'm sorry, I haven't used WSGI in ages and you left out the definitions for Application and Response, and TEnviron isn't used, so I can't quite follow your code, and I can't copy-paste it into a little test file to see what mypy makes of it.

Ah yes, I'm sorry, I dashed it out a little quickly and couldn't see that I'd left too much out. Here's a (hopefully) self-contained example of what I've been wrestling with:

from typing import cast, Any, Callable, Generic, Iterable, List, Tuple, TypeVar
from typing_extensions import Protocol, TypedDict

Headers = List[Tuple[str, str]]
StartResponse = Callable[[str, Headers], None]
Response = Iterable[bytes]

class Environ(TypedDict):
    PATH_INFO: str # etc etc

TEnviron = TypeVar("TEnviron", bound=Environ, covariant=True)

class Application(Protocol, Generic[TEnviron]):
    def __call__(self, environ: TEnviron, start_response: StartResponse) -> Response:
        ...

class MyEnviron(Environ):
    HTTP_X_EXTRA: str

class MyApplication(Application[MyEnviron]):
    def __call__(self, environ: MyEnviron, start_response: StartResponse) -> Response:
        environ.get("HTTP_X_EXTRA")
        return []

app: Application[Environ] = MyApplication()

In the above example, with contravariant=True, I get this error (v0.730.0, --strict):

v1.py:25: error: Incompatible types in assignment (expression has type "MyApplication", variable has type "Application[Environ]")

If I cheat and tell mypy it's covariant=True instead, the error where I'm using it goes away, but is replaced by this (v0.730.0, --strict):

v1.py:13: error: Covariant type variable 'TEnviron' used in protocol where contravariant one is expected
v1.py:14: error: Cannot use a covariant type variable as a parameter

All these things make sense, I am trying to step off the rails a bit here after all. The issue is just that I'm trying to find a minimally obtrusive way to allow people implementing WSGI applications according to the spec to bring some type-safety to the Environ when it is inherited. I think getting rid of the generic and using the cast might be the least-worst way to do that:

from typing import cast, Any, Callable, Generic, Iterable, List, Tuple, TypeVar
from typing_extensions import Protocol, TypedDict

Headers = List[Tuple[str, str]]
StartResponse = Callable[[str, Headers], None]
Response = Iterable[bytes]

class Environ(TypedDict):
    PATH_INFO: str # etc etc

class Application(Protocol):
    def __call__(self, environ: Environ, start_response: StartResponse) -> Response:
        ...

class MyEnviron(Environ):
    HTTP_X_EXTRA: str

class MyApplication(Application):
    def __call__(self, environ: Environ, start_response: StartResponse) -> Response:
        environ = cast(MyEnviron, environ)
        environ.get("HTTP_X_EXTRA")
        return []

app: Application = MyApplication()

From a user's point of view, the difference seems quite minimal, and it's opt-in. You don't pay for what you don't use (i.e. if you aren't inheriting Environ, it's fine as is):

# with generic:
class MyApplication(Application[MyEnviron]):
    def __call__(self, environ: MyEnviron, start_response: StartResponse) -> Response:
        ...

# with cast
class MyApplication(Application):
    def __call__(self, environ: Environ, start_response: StartResponse) -> Response:
        environ = cast(MyEnviron, environ)

Like you said earlier though, there's probably not really much else to go over with it (unless there's some obviously-more-correct, unobtrusive approach I'm missing that totally makes the problem disappear!). Thanks for taking a look at it for me though, it's much appreciated!

@gvanrossum
Copy link
Member

OK, in that case can you try my suggestion with the second variant? Declare the arg to __call__ in MyApplication as MyEnviron and suppress the error with # type: ignore[override]. (This requires mypy 0.730 or newer.) To me that seems more elegant than the cast, since it expresses the intent more clearly.

Note that presumably MyEnviron should be using total=False (see mypy docs)?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants