Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NamedTuple to dictionary serialization #425

Closed
karlicoss opened this issue Sep 11, 2023 · 10 comments · Fixed by #549
Closed

NamedTuple to dictionary serialization #425

karlicoss opened this issue Sep 11, 2023 · 10 comments · Fixed by #549
Milestone

Comments

@karlicoss
Copy link

  • cattrs version: 23.1.2
  • Python version: 3.10
  • Operating System: Ubuntu 22.04

Description

For my project, cachew, I've been trying to switch from using my custom serialization cold to some existing library.
After some evaluation, cattrs seems like the most promising since it works so well with python typing annotations.
One thing that stops me is that seems like it doesn't support NamedTuple? I feel like it should be treated just like a @dataclass, but seems like cattrs just treats them as 'unknown' types and leaves intact.

What I Did

This is kinda the minimal snippet that shows the issue:

from dataclasses import dataclass
from typing import NamedTuple

import cattrs

@dataclass
class D:
    value: int


class N(NamedTuple):
    value: int


und = cattrs.unstructure(D(value=123), D)
unn = cattrs.unstructure(N(value=123), N)

print(und, type(und))
print(unn, type(unn))

The result is

{'value': 123} <class 'dict'>
N(value=123) <class '__main__.N'>

So the dataclass is serialized just as expected, but the namedtuple is just left intact.

I searched in cattrs source code, and it seems like NamedTuple isn't mentioned at all -- so perhaps it was just forgotten rather than deliberately left out?
I know I can implement a custom adapter, but feels like it makes sense to have them treated as dataclasses by default?

@Tinche
Copy link
Member

Tinche commented Sep 11, 2023

Howdy,

I think it's a combination of nobody having asked for it and supporting libraries probably being able to handle named tuples natively.

Most of the time, output from cattrs is directly piped to another library, like orjson or msgpack, and those probably support named tuples already (since they are tuples)?

Although now that I think about it, the current approach is inadequate because it will not unstructure deeply. So yeah, agreed on it needing to be fixed. A little bit unsure on how to unstructure named tuples though, my first instinct would be to unstructure to lists (like ordinary tuples), not dicts.

@karlicoss
Copy link
Author

Ah thanks, good to know! IMO they are more like dataclass rather than plain tuples since they are typed properly and have field names. But perhaps this could be configurable if people prefer 'legacy' behaviour for some reason.

@Tinche Tinche added this to the 24.1 milestone Nov 19, 2023
@gorkaerana
Copy link

gorkaerana commented Nov 29, 2023

Hi,

I'd like to contribute to the implementation of this.

A "standard library" way to serialize a NamedTuple would be something like the below, which I adapted from [1] and [2]. (typing.NamedTuple is implemented as a function and so isinstance(a_named_tuple, typing.NamedTuple) throws a TypeError.)

def is_namedtuple(x):
    type_ = type(x)
    bases = type_.__bases__
    if len(bases) != 1 or bases[0] != tuple:
        return False
    fields = getattr(type_, '_fields', None)
    if not isinstance(fields, tuple):
        return False
    return all(isinstance(f, str) for f in fields)

def unstructure(obj):
    if isinstance(obj, dict):
        return {key: unstructure(value) for key, value in obj.items()}
    elif isinstance(obj, list):
        return list(map(unstructure, obj))
    elif is_namedtuple(obj):
        return {key: unstructure(value) for key, value in obj._asdict().items()}
    elif isinstance(obj, tuple):
        return tuple(map(unstructure, obj))
    else:
        return obj

I would suggest implementing the following way:

  1. Add is_namedtuple to src/_compat.py.
  2. Add gen_unstructure_namedtuple to cattrs.Converter.
  3. Add self.register_unstructure_hook_factory(is_namedtuple, self.gen_unstructure_namedtuple) to cattrs.Converter._init_.
  4. Add tests and documentation.

I would very happily put the work detailed in the list in a PR. Do let me know if you'd approach this any other way :)

References:

  1. https://stackoverflow.com/questions/2166818/how-to-check-if-an-object-is-an-instance-of-a-namedtuple
  2. https://stackoverflow.com/questions/33181170/how-to-convert-a-nested-namedtuple-to-a-dict?noredirect=1&lq=1

@Tinche
Copy link
Member

Tinche commented Nov 30, 2023

Interesting!

A couple of points.

I think we will want to unstructure namedtuples into tuples for backwards compatibility. I see your code unstructures them into dictionaries. This is also a useful feature, but it should be a strategy instead. Then anyone can apply this strategy to any converter and get the behavior.

My first instinct would be to try to reuse the gen.make_dict_unstructure_fn functionality. It already supports attrs classes and dataclasses, and I think it can be adapted for NamedTuples easily. This would require a slight refactor, probably splitting part of that function into a private one that takes a list[Attribute]. Using this shares code and will be fast.

Then, write an adapter that can transform a NamedTuple class (not instance) into a list of attributes, like this.

Then, in the strategy, override the hooks to generate the proper unstructuring functions.

That's how I would go about it.

Happy to provide guidance if you decide to try!

@gorkaerana
Copy link

Sounds good! How about I give it a try in a PR and we take it from there?

@Tinche
Copy link
Member

Tinche commented Jan 23, 2024

Go for it!

@Tinche
Copy link
Member

Tinche commented Jan 28, 2024

Howdy, I've just merged #491. This adds the namedtuples-as-tuples functionality to cattrs. I'll leave this issue open while other folks work on a strategy for namedtuples-as-dataclasses.

@Tinche Tinche changed the title NamedTuple aren't serialised by cattrs? NamedTuple to dictionary serialization Apr 18, 2024
@Tinche Tinche linked a pull request Jun 18, 2024 that will close this issue
@gnzsnz
Copy link

gnzsnz commented Jun 23, 2024

this is fantastic!

do you plan a new release to include namedtuples?

@Tinche
Copy link
Member

Tinche commented Jun 23, 2024

Yep, finishing it up now. The only thing left is to rework the readme

@gnzsnz
Copy link

gnzsnz commented Jul 23, 2024

Yep, finishing it up now. The only thing left is to rework the readme

I know that I'm a pain in the ... but I'm looking forward to have namedtuples in cattrs.

I'm happy to contribute with any changes needed, just let me know.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants