Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

type Serializer as generic #374

Merged
merged 1 commit into from
Apr 13, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
59 changes: 50 additions & 9 deletions src/itsdangerous/serializer.py
Original file line number Diff line number Diff line change
Expand Up @@ -10,13 +10,20 @@
from .signer import _make_keys_list
from .signer import Signer

_TAnyStr = t.TypeVar("_TAnyStr", str, bytes, covariant=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_TAnyStr = t.TypeVar("_TAnyStr", str, bytes, covariant=True)

I would get rid of this, just use t.AnyStr, this actually needs to be invariant. Alternatively you could replace it with a typing_extensions.TypeVar and set default=str instead of covariant=True, that may lead to more robust results compared to relying on just the overloads on __init__.

Copy link
Member Author

@davidism davidism Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got errors from mypy that it had to be covariant for class Serializer(t.Generic[t.AnyStr])

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was because you weren't using it for the payload argument.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After testing again I'm not getting a covariance error from mypy anymore when using AnyStr, not sure what was going on before.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For Protocol mypy does a variance calculation and gives an error if the variance doesn't match. Since you previously only used _TAnyStr in return type annotations it was calculated as covariant, as soon as you added an argument that uses the same TypeVar the variance calculation changed to invariant.

That's why it's sometimes dangerous to re-use the same TypeVar between a Protocol and a Generic, since the variance may be wrong for the Generic, but mypy does not do a variance calculation for Generic. PEP-695 should make this less of a problem in the future, since type vars will generally be scoped to the generic itself, rather than in global scope and all they use auto-variance by default, so you don't have to think about it.


def is_text_serializer(serializer: t.Any) -> bool:

class _PDataSerializer(t.Protocol[_TAnyStr]):
def loads(self, payload: str | bytes) -> t.Any: ...
def dumps(self, obj: t.Any, **kwargs: t.Any) -> _TAnyStr: ...
Comment on lines +16 to +18
Copy link
Contributor

@Daverball Daverball Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class _PDataSerializer(t.Protocol[_TAnyStr]):
def loads(self, payload: str | bytes) -> t.Any: ...
def dumps(self, obj: t.Any, **kwargs: t.Any) -> _TAnyStr: ...
class _PDataSerializer(t.Protocol[t.AnyStr]):
def loads(self, payload: t.AnyStr, /) -> t.Any: ...
def dumps(self, obj: object, /) -> t.AnyStr: ...

You are providing an upper bound here, so you don't want to make the argument list too strict.

For loads you can't use str | bytes because parameters are contravariant, so that would require each serializer to accept both str and bytes. Exactly because parameters are contravariant you can satisfy a function that expects str with a supertype of str, such as a union with bytes.

As for dumps, IIRC unless you specifically use *args: Any, **kwargs: Any neither mypy nor pyright will treat this as a gradual signature, that allows serializers with arbitrary extra arguments, it will instead require each serializer function to take arbitrary arguments. This means you will have to ignore a type error when you pass in serializer_kwargs, but that's a hit you will have to take until we have something like a TypeVarMapping to pass just kwargs along, ParamSpec doesn't really work here.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, after making these changes mypy does accept json and pickle as valid.

Copy link
Member Author

@davidism davidism Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pyright is still not happy though. It still insists that modules do not satisfy the protocol.



def is_text_serializer(serializer: _PDataSerializer[t.Any]) -> bool:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could make this a TypeGuard, although it's probably better to wait for TypeIs support to arrive, since that will yield better results. Although you could make it TypeGuard[_PDataSerializer[str]] for now and add a comment to change it to TypeIs later.

"""Checks whether a serializer generates text or binary."""
return isinstance(serializer.dumps({}), str)


class Serializer:
class Serializer(t.Generic[_TAnyStr]):
"""A serializer wraps a :class:`~itsdangerous.signer.Signer` to
enable serializing and securely signing data other than bytes. It
can unsign to verify that the data hasn't been changed.
Expand Down Expand Up @@ -71,7 +78,7 @@ class Serializer:
#: The default serialization module to use to serialize data to a
#: string internally. The default is :mod:`json`, but can be changed
#: to any object that provides ``dumps`` and ``loads`` methods.
default_serializer: t.Any = json
default_serializer: _PDataSerializer[_TAnyStr] = json # type: ignore[assignment]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
default_serializer: _PDataSerializer[_TAnyStr] = json # type: ignore[assignment]
default_serializer: _PDataSerializer[str] = json

This seems more sane to me, if you're worried about subclasses providing a bytes serializer as default, you could change it to _PDataSerializer[t.Any] or a very long-winded t.Union[_PDataSerializer[str], _PDataSerializer[bytes]]. As a class attribute it shouldn't really be generic, that's prone to weird errors.


#: The default ``Signer`` class to instantiate when signing data.
#: The default is :class:`itsdangerous.signer.Signer`.
Expand All @@ -82,11 +89,43 @@ class Serializer:
dict[str, t.Any] | tuple[type[Signer], dict[str, t.Any]] | type[Signer]
] = []

# Tell type checkers that the default type is Serializer[str] if no
# data serializer is provided.
@t.overload
def __init__(
self: Serializer[str],
secret_key: str | bytes | cabc.Iterable[str] | cabc.Iterable[bytes],
salt: str | bytes | None = b"itsdangerous",
serializer: None = None,
serializer_kwargs: dict[str, t.Any] | None = None,
signer: type[Signer] | None = None,
signer_kwargs: dict[str, t.Any] | None = None,
fallback_signers: list[
dict[str, t.Any] | tuple[type[Signer], dict[str, t.Any]] | type[Signer]
]
| None = None,
): ...

@t.overload
def __init__(
self: Serializer[_TAnyStr],
secret_key: str | bytes | cabc.Iterable[str] | cabc.Iterable[bytes],
salt: str | bytes | None = b"itsdangerous",
serializer: _PDataSerializer[_TAnyStr] = ...,
serializer_kwargs: dict[str, t.Any] | None = None,
signer: type[Signer] | None = None,
signer_kwargs: dict[str, t.Any] | None = None,
fallback_signers: list[
dict[str, t.Any] | tuple[type[Signer], dict[str, t.Any]] | type[Signer]
]
| None = None,
): ...

def __init__(
self,
secret_key: str | bytes | cabc.Iterable[str] | cabc.Iterable[bytes],
salt: str | bytes | None = b"itsdangerous",
serializer: t.Any = None,
serializer: _PDataSerializer[_TAnyStr] | None = None,
serializer_kwargs: dict[str, t.Any] | None = None,
signer: type[Signer] | None = None,
signer_kwargs: dict[str, t.Any] | None = None,
Expand All @@ -111,7 +150,7 @@ def __init__(
if serializer is None:
serializer = self.default_serializer

self.serializer: t.Any = serializer
self.serializer: _PDataSerializer[_TAnyStr] = serializer
self.is_text_serializer: bool = is_text_serializer(serializer)

if signer is None:
Expand All @@ -135,7 +174,9 @@ def secret_key(self) -> bytes:
"""
return self.secret_keys[-1]

def load_payload(self, payload: bytes, serializer: t.Any | None = None) -> t.Any:
def load_payload(
self, payload: bytes, serializer: _PDataSerializer[_TAnyStr] | None = None
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self, payload: bytes, serializer: _PDataSerializer[_TAnyStr] | None = None
self, payload: bytes, serializer: _PDataSerializer[t.Any] | None = None

The type of the passed in serializer doesn't matter, since it is re-checked.

) -> t.Any:
"""Loads the encoded object. This function raises
:class:`.BadPayload` if the payload is not valid. The
``serializer`` parameter can be used to override the serializer
Expand Down Expand Up @@ -199,7 +240,7 @@ def iter_unsigners(self, salt: str | bytes | None = None) -> cabc.Iterator[Signe
for secret_key in self.secret_keys:
yield fallback(secret_key, salt=salt, **kwargs)

def dumps(self, obj: t.Any, salt: str | bytes | None = None) -> str | bytes:
def dumps(self, obj: t.Any, salt: str | bytes | None = None) -> _TAnyStr:
"""Returns a signed string serialized with the internal
serializer. The return value can be either a byte or unicode
string depending on the format of the internal serializer.
Expand All @@ -208,9 +249,9 @@ def dumps(self, obj: t.Any, salt: str | bytes | None = None) -> str | bytes:
rv = self.make_signer(salt).sign(payload)

if self.is_text_serializer:
return rv.decode("utf-8")
return rv.decode("utf-8") # type: ignore[return-value]

return rv
return rv # type: ignore[return-value]

def dump(self, obj: t.Any, f: t.IO[t.Any], salt: str | bytes | None = None) -> None:
"""Like :meth:`dumps` but dumps into a file. The file handle has
Expand Down
4 changes: 3 additions & 1 deletion src/itsdangerous/timed.py
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,8 @@
from .serializer import Serializer
from .signer import Signer

_TAnyStr = t.TypeVar("_TAnyStr", str, bytes, covariant=True)

Comment on lines +20 to +21
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
_TAnyStr = t.TypeVar("_TAnyStr", str, bytes, covariant=True)

Same thing here


class TimestampSigner(Signer):
"""Works like the regular :class:`.Signer` but also records the time
Expand Down Expand Up @@ -166,7 +168,7 @@ def validate(self, signed_value: str | bytes, max_age: int | None = None) -> boo
return False


class TimedSerializer(Serializer):
class TimedSerializer(Serializer[_TAnyStr]):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class TimedSerializer(Serializer[_TAnyStr]):
class TimedSerializer(Serializer[t.AnyStr]):

"""Uses :class:`TimestampSigner` instead of the default
:class:`.Signer`.
"""
Expand Down
9 changes: 5 additions & 4 deletions src/itsdangerous/url_safe.py
Original file line number Diff line number Diff line change
Expand Up @@ -7,17 +7,18 @@
from .encoding import base64_decode
from .encoding import base64_encode
from .exc import BadPayload
from .serializer import _PDataSerializer
from .serializer import Serializer
from .timed import TimedSerializer


class URLSafeSerializerMixin(Serializer):
class URLSafeSerializerMixin(Serializer[str]):
"""Mixed in with a regular serializer it will attempt to zlib
compress the string to make it shorter if necessary. It will also
base64 encode the string so that it can safely be placed in a URL.
"""

default_serializer = _CompactJSON
default_serializer: _PDataSerializer[str] = _CompactJSON

def load_payload(
self,
Expand Down Expand Up @@ -68,14 +69,14 @@ def dump_payload(self, obj: t.Any) -> bytes:
return base64d


class URLSafeSerializer(URLSafeSerializerMixin, Serializer):
class URLSafeSerializer(URLSafeSerializerMixin, Serializer[str]):
"""Works like :class:`.Serializer` but dumps and loads into a URL
safe string consisting of the upper and lowercase character of the
alphabet as well as ``'_'``, ``'-'`` and ``'.'``.
"""


class URLSafeTimedSerializer(URLSafeSerializerMixin, TimedSerializer):
class URLSafeTimedSerializer(URLSafeSerializerMixin, TimedSerializer[str]):
"""Works like :class:`.TimedSerializer` but dumps and loads into a
URL safe string consisting of the upper and lowercase character of
the alphabet as well as ``'_'``, ``'-'`` and ``'.'``.
Expand Down