Convert AssetSelection subclasses from NamedTuples to Pydantic #19197
hallelujah!
a relatively lightweight back-compat test would be to get a serialized string for one/some of these in master and check it in directly to a test ensuring it loads successfully. This can be done with `serialize_value` and `deserialize_value` from `dagster._serdes`
edit: There are general tests for `NamedTuple` -> `dataclass` in the serdes library, so adding a specific test here isn't necessary, just some extra validation and maybe a good learning exercise.
@@ -1,8 +1,9 @@
import collections.abc
import operator
from abc import ABC, abstractmethod
from dataclasses import dataclass, replace
could consider using pydantic `dataclass` if we want these to be run time type checked https://docs.pydantic.dev/latest/concepts/dataclasses/
but since it appears the original `NamedTuple` variants did not have a custom `__new__` with checks, that would be new additional behavior
I thought it was a good idea and wanted to do it, but it seems pydantic doesn't support `replace`, so I think for this use case we can keep `dataclasses`, which does.
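For reference, a minimal sketch of the standard-library behaviour this comment relies on: `dataclasses.replace` returns a modified copy of a frozen dataclass. The `KeysSelection` class and its fields are hypothetical stand-ins for illustration, not the actual `AssetSelection` subclasses.

```python
from dataclasses import FrozenInstanceError, dataclass, replace
from typing import Tuple


# Hypothetical stand-in for an AssetSelection subclass; not dagster code.
@dataclass(frozen=True)
class KeysSelection:
    keys: Tuple[str, ...]
    include_sources: bool = False


original = KeysSelection(keys=("a", "b"))

# replace() builds a new instance; the original is left untouched.
updated = replace(original, include_sources=True)

# Frozen dataclasses reject in-place mutation, much like a NamedTuple.
try:
    original.include_sources = True  # type: ignore[misc]
    mutated = True
except FrozenInstanceError:
    mutated = False
```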
Would `model_copy` work? pydantic/pydantic#3352
We support both pydantic 1 & 2 and those comments make it seem like `model_copy` is 2 only, something to double check
Oh, interesting. That would most likely work. I will implement it before merging.
looks like `copy` -> `model_copy` was a name change in 1 -> 2 https://docs.pydantic.dev/latest/migration/#changes-to-pydanticbasemodel
so I think we would need to handle that
we could write a wrapper
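One way such a wrapper could look, as a rough sketch rather than what this PR ended up shipping: dispatch on whether the object exposes `model_copy` (pydantic v2) and fall back to `copy` (pydantic v1), both of which accept an `update` mapping. The helper name `replace_fields` and the two `Fake*` classes are hypothetical; the fakes only mimic the relevant method names so the snippet runs without pydantic installed.

```python
from typing import Any, TypeVar

T = TypeVar("T")


def replace_fields(model: T, **changes: Any) -> T:
    """Return a copy of `model` with `changes` applied.

    Prefers pydantic v2's `model_copy(update=...)` and falls back to
    pydantic v1's `copy(update=...)`.
    """
    if hasattr(model, "model_copy"):
        return model.model_copy(update=changes)
    return model.copy(update=changes)


# Hypothetical stand-ins that imitate the two pydantic interfaces.
class FakeV1Model:
    def __init__(self, **fields: Any) -> None:
        self.fields = dict(fields)

    def copy(self, *, update: dict = None) -> "FakeV1Model":
        return type(self)(**{**self.fields, **(update or {})})


class FakeV2Model:
    def __init__(self, **fields: Any) -> None:
        self.fields = dict(fields)

    def model_copy(self, *, update: dict = None) -> "FakeV2Model":
        return type(self)(**{**self.fields, **(update or {})})


v1_result = replace_fields(FakeV1Model(x=1, y=2), y=3)
v2_result = replace_fields(FakeV2Model(x=1, y=2), y=3)
```

Checking for `model_copy` first matters because pydantic v2 still exposes `copy` as a deprecated alias, so testing for `copy` first would always take the v1 path.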
After a few attempts to properly handle inheritance and frozen classes, the code and PR description are updated. @alangenfeld @sryza, lmk what you think!
…rom-namedtuples-to-dataclasses
maybe worth adding a test that demonstrates the new runtime type checking that this adds
@@ -396,9 +402,16 @@ def from_coercible(cls, selection: CoercibleToAssetSelection) -> "AssetSelection
    def to_serializable_asset_selection(self, asset_graph: AssetGraph) -> "AssetSelection":
        return AssetSelection.keys(*self.resolve(asset_graph))

    def replace(self, update: dict):
could consider having this be `**kwargs` if we wanted to not have to change the `replace` callsite args
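To illustrate the call-site difference this suggestion is about, here is a small sketch comparing a dict-style signature with a `**kwargs`-style one. `OldSelection`, `NewSelection`, and `replace_with_dict` are hypothetical names, and a frozen dataclass stands in for the real model class so the example runs with the standard library only.

```python
from dataclasses import dataclass, replace as dc_replace
from typing import Any, NamedTuple


# Hypothetical "before" shape: a NamedTuple with the built-in _replace.
class OldSelection(NamedTuple):
    depth: int
    include_sources: bool = False


# Hypothetical "after" shape; a frozen dataclass stands in for the model.
@dataclass(frozen=True)
class NewSelection:
    depth: int
    include_sources: bool = False

    # dict-style signature: every caller has to build a dict.
    def replace_with_dict(self, update: dict) -> "NewSelection":
        return dc_replace(self, **update)

    # **kwargs-style signature: call sites keep the _replace(...) shape.
    def replace(self, **kwargs: Any) -> "NewSelection":
        return dc_replace(self, **kwargs)


old = OldSelection(depth=1)._replace(include_sources=True)
via_dict = NewSelection(depth=1).replace_with_dict({"include_sources": True})
via_kwargs = NewSelection(depth=1).replace(include_sources=True)
```

With the `**kwargs` form, migrating a call site is a rename from `_replace` to `replace` with the arguments left alone.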
Done in 37aea77
@alangenfeld Good call, done in b9db7f0
…rom-namedtuples-to-dataclasses
Summary & Motivation
This PR updates `AssetSelection` and its subclasses to inherit from `pydantic.BaseModel` instead of `NamedTuple`. `NamedTuple._replace()` was replaced by `AssetSelection.replace()`, a wrapper for `copy` (pydantic v1) and `model_copy` (pydantic v2).

After several attempts, it turned out that to make this work `AssetSelection` must inherit from `pydantic.BaseModel` with `frozen=True`, which allows all of them to be immutable, like `NamedTuple`.

Because of that, this PR updates:
- `asset_selection.py` to reflect the changes.
- `DbtManifestAssetSelection` in `dagster-dbt`, which inherits from `AssetSelection`. It is now immutable.

With `pydantic.BaseModel`, instantiating new objects requires keyword-only arguments unless `__init__()` is redefined for the class, so the code was updated to use the keywords.

How I Tested These Changes
BK