Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix faulty executor config serialization logic #26191

Conversation

dstandish
Copy link
Contributor

@dstandish dstandish commented Sep 7, 2022

The bind processor logic had the effect of applying serialization logic multiple times, which produced an incorrect serialization output.

Resolves #26101.
Related: #24117

@dstandish dstandish requested a review from uranusjr September 7, 2022 05:36
Comment on lines 170 to 171
if isinstance(val_copy, dict) and 'pod_override' in val_copy:
val_copy['pod_override'] = BaseSerialization._serialize(val_copy['pod_override'])
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be more reliable to check for __type and __var instead? (We could have that logic in serialized_objects instead as a is_serialized function)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are you maybe commenting on the wrong line? here we're dealing with non-serialized object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m thinking line 170 should be

if not is_serialized(val_copy):

with the function implemented as

def is_serialized(v):
    if not isinstance(v, dict):
        return False
    return Encoding.TYPE in v and Encoding.VAR in v

This can be used to unify the logic on line 187/188.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so the thing is, re 170, i see what you are suggesting. but the reason we are checking specifically for pod_override here, is because we're not trying to serialize any and all executor configs. in general, they are simply pickled. but because of idiosyncrasies of the design of the k8s objects, it's safer to essentially convert them to json before pickling , which in effect is what the serializer does. basically, we just want to do this for k8s executor configs. if we wanted to do for all configs, your suggestion makes complete sense.

@dstandish dstandish force-pushed the fix-mutability-in-exec-config-bind-processor branch 2 times, most recently from b5feb44 to dc307c3 Compare September 7, 2022 19:24
The bind processor logic had the effect of applying serialization logic multiple times, which produced an incorrect serialization output.

Resolves apache#26101.
@dstandish dstandish force-pushed the fix-mutability-in-exec-config-bind-processor branch from dc307c3 to 62c7999 Compare September 7, 2022 20:27
@jedcunningham jedcunningham added the type:bug-fix Changelog: Bug Fixes label Sep 7, 2022
@jedcunningham jedcunningham added this to the Airflow 2.4.0 milestone Sep 7, 2022
@jedcunningham jedcunningham merged commit 87108d7 into apache:main Sep 7, 2022
@jedcunningham jedcunningham deleted the fix-mutability-in-exec-config-bind-processor branch September 7, 2022 22:44
jedcunningham pushed a commit to astronomer/airflow that referenced this pull request Oct 4, 2022
The bind processor logic had the effect of applying serialization logic multiple times, which produced an incorrect serialization output.

Resolves apache#26101.

(cherry picked from commit 87108d7)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Kubernetes Invalid executor_config, pod_override filled with Encoding.VAR
3 participants