Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use dataclass instead of NamedTuple for serializing AssetCondition #21147

Merged
merged 2 commits into from
Apr 11, 2024

Conversation

OwenKephart
Copy link
Contributor

Summary & Motivation

We're about to start serializing these things, so making a diving save to allow us to use the nicer dataclass API instead of NamedTuples.

In the process of doing this, I realized that we had a weird setup with a children property on the base AssetCondition class that didn't work when using dataclasses. Previously, using NamedTuples meant that the NamedTuple property management stuff would handle the case where you added an explicit children field on the child class, but dataclasses don't work in the same way.

Basically we want a few things:

  • To have a property on all AssetConditions that lists all sub-conditions (if any)
  • To not have to explicitly store a set of sub-conditions in serialized form if there aren't any (mostly a dev ergonomics thing, it's a pain to add a guaranteed-to-be-empty "children" property to all child classes)
  • To make it easy to set invariants that certain AssetConditions will only have a single child, etc.

This new system makes it so that AssetConditions that do have sub-conditions just store them in a field with a different name (currently we're just doing boolean expressions, so I've used "operand" / "operands"), and then implement an explicit "children" property on top of that field. AssetConditions without sub-conditions just use the default "children" implementation, which returns an empty list.

Overall, this setup works a lot nicer and simplifies the class definitions for all these subclasses.

How I Tested These Changes

@maximearmstrong
Copy link
Contributor

That makes sense. If we move away from NamedTuple, should we consider adopting DagsterModel instead of dataclass?

Some additional context here and here, when we did the change for AssetSelection.

Copy link
Member

@alangenfeld alangenfeld left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

children -> operands looks good

as said above, you could switch to DagsterModel - that shouldn't effect the serialization result though so can be done in this PR or after.

@OwenKephart OwenKephart merged commit 454f16f into master Apr 11, 2024
1 check passed
@OwenKephart OwenKephart deleted the owen/use-dataclass-for-asset-condition branch April 11, 2024 14:21
@OwenKephart
Copy link
Contributor Author

Thanks for the info re: DagsterModel -- I'll do that in a followup PR today, just to decouple it from the release timeline, but seems like a good idea

yuhan pushed a commit that referenced this pull request Apr 11, 2024
…21147)

## Summary & Motivation

We're about to start serializing these things, so making a diving save
to allow us to use the nicer dataclass API instead of NamedTuples.

In the process of doing this, I realized that we had a weird setup with
a `children` property on the base AssetCondition class that didn't work
when using dataclasses. Previously, using NamedTuples meant that the
NamedTuple property management stuff would handle the case where you
added an explicit `children` field on the child class, but dataclasses
don't work in the same way.

Basically we want a few things:
- To have a property on all AssetConditions that lists all
sub-conditions (if any)
- To not have to explicitly store a set of sub-conditions in serialized
form if there aren't any (mostly a dev ergonomics thing, it's a pain to
add a guaranteed-to-be-empty "children" property to all child classes)
- To make it easy to set invariants that certain AssetConditions will
only have a single child, etc.

This new system makes it so that AssetConditions that *do* have
sub-conditions just store them in a field with a different name
(currently we're just doing boolean expressions, so I've used "operand"
/ "operands"), and then implement an explicit "children" property on top
of that field. AssetConditions without sub-conditions just use the
default "children" implementation, which returns an empty list.

Overall, this setup works a lot nicer and simplifies the class
definitions for all these subclasses.

## How I Tested These Changes
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants