What happened?
I am passing values between tasks using XCom. These values are instances of an attrs class that has another attrs class as a member. The serialisation currently fails to mark the inner class with the class tags needed to deserialize it back to the attrs class, so it remains a dict after loading into the downstream task.
What you think should happen instead?
I think that the nested attrs objects should be returned unchanged after deserialization.
How to reproduce
The value of foo passed to task2 has inner_value as a dict rather than an InnerClass instance.
import attrs
from airflow.decorators import dag, task


# InnerClass must be defined before OuterClass refers to it.
@attrs.define(kw_only=True, frozen=True, slots=False)
class InnerClass:
    x: str


@attrs.define(kw_only=True, frozen=True, slots=False)
class OuterClass:
    inner_value: InnerClass


@task
def task1():
    # kw_only=True means x has to be passed by keyword.
    return OuterClass(inner_value=InnerClass(x="test!"))


@task
def task2(foo: OuterClass):
    print(foo)  # foo here is OuterClass(inner_value={"x": "test!"})


@dag
def my_dag():
    task2(task1())
Operating System
Linux (standard Airflow slim images extended with Airflow providers, running on Kubernetes)
Versions of Apache Airflow Providers
defaults for 2.8.4
Deployment
Official Apache Airflow Helm Chart
Deployment details
Airflow deployment on Azure Kubernetes using a Postgres backend database.
Anything else?
I think a simple solution would be to update the following code in airflow/serialization/serde.py (lines 182 to 187): change the value of recurse in the call to attr.asdict from True to False, leaving the serialization of inner classes to the subsequent call to serialize.
Before:
# attr annotated
if attr.has(cls):
    # Only include attributes which we can pass back to the classes constructor
    data = attr.asdict(cast(attr.AttrsInstance, o), recurse=True, filter=lambda a, v: a.init)
    dct[DATA] = serialize(data, depth + 1)
    return dct
After:
# attr annotated
if attr.has(cls):
    # Only include attributes which we can pass back to the classes constructor
    data = attr.asdict(cast(attr.AttrsInstance, o), recurse=False, filter=lambda a, v: a.init)
    dct[DATA] = serialize(data, depth + 1)
    return dct
Apache Airflow version
2.8.4
If "Other Airflow 2 version" selected, which one?
No response