Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Filter out default configs when overrides exist. #21539

Merged
merged 22 commits into from
Mar 15, 2022
Merged

Conversation

xyu
Copy link
Contributor

@xyu xyu commented Feb 12, 2022

When sending configs to Airflow workers we materialize a temp config file. In #18772 a feature was added so that _cmd generated secrets are not written to the files in some cases instead favoring maintaining the raw _cmd settings. Unfortunately during materializing of the configs via as_dict() Airflow defaults are generated and materialized as well including defaults for the non _cmd versions of some settings. And due to Airflow setting precedence stating bare versions of settings winning over _cmd versions it results in _cmd settings being discarded:
https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html

This change checks _cmd, env, and secrets when materializing configs via as_dict() so that if the bare versions of the values is exactly the same as Airflow defaults and we have "hidden" / special versions of these configs that are trying to be set we remove the bare versions so that the correct version can be used.

closes: #20092
related: #18772 #4050


^ Add meaningful description above

Read the Pull Request Guidelines for more information.
In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.

xyu added 9 commits February 12, 2022 18:29
When sending configs to Airflow workers we materialize a temp config file. In apache#18772 a feature was added so that `_cmd` generated secrets are not written to the files in some cases instead favoring maintaining the raw `_cmd` settings. Unfortunately during materializing of the configs via `as_dict()` Airflow defaults are generated and materialized as well including defaults for the non `_cmd` versions of some settings. And due to Airflow setting precedence stating bare versions of settings winning over `_cmd` versions it results in `_cmd` settings being discarded:
https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html

This change checks `_cmd`, env, and secrets when materializing configs via `as_dict()` so that if the bare versions of the values is exactly the same as Airflow defaults and we have "hidden" / special versions of these configs that are trying to be set we remove the bare versions so that the correct version can be used.

Fixes: apache#20092
Related to: apache#18772 apache#4050
@xyu
Copy link
Contributor Author

xyu commented Feb 19, 2022

This should be a more general solution for #21604 as well

@xyu
Copy link
Contributor Author

xyu commented Feb 21, 2022

Merged trunk in because #21664 from @potiuk was needed to fix broken tests.

@kaxil kaxil added this to the Airflow 2.2.5 milestone Feb 21, 2022
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - but I think description about this behaviour should be added to "as_dict" docstring/

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Feb 26, 2022
@github-actions
Copy link

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@potiuk
Copy link
Member

potiuk commented Feb 28, 2022

And we need another committer chiming in.

@uranusjr
Copy link
Member

It’d be nice if a docstring could be added to explain what filter_by_source does and why it is needed.

@xyu xyu requested a review from potiuk March 3, 2022 18:33
@xyu
Copy link
Contributor Author

xyu commented Mar 4, 2022

@potiuk / @uranusjr can you take a look at the docstrings I've added to see if they make sense? Also if one of you can trigger the full test run after the latest merge that would be awesome and let me know if there's anything else I should patch up with this PR to get it ready.

@potiuk
Copy link
Member

potiuk commented Mar 14, 2022

After looking agian - I think what is also missing here is a bit "user" description in the https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html

I think we might have a lot of questions from users about the behaviour of config values as this is a "user" thing. I doubt our users will look in detail in "docstring" - they will mostly look at the documentation to understand how configuration precedence works, and especially - there should be a documentation explaining that it has changed in 2.3.0.

Would it be possible tha tyou add such "user" facing documentation @xyu ?

@xyu
Copy link
Contributor Author

xyu commented Mar 14, 2022

there should be a documentation explaining that it has changed in 2.3.0.

I'm not quite sure what you mean @potiuk, as you pointed out in the original issue (#20092 (comment)) the precedence for configuration options is bare setting over _cmd or secret versions. This PR filters out the bare configs if they are set to Airflow’s built in defaults which then allows the user set _cmd overrides to work again. So this is really fixing a bug not adding or changing config behavior. (I mean I guess it is changing the config behavior but it's changing the behavior to an unbroken as documented state.)

EDIT: Just to clarify with Airflow >= 2.2.1 when as_dict() was called if a "sensitive_config_values" had the bare value (e.g. sql_alchemy_conn) unset but a cmd or secret version of that set (e.g. sql_alchemy_conn_cmd) the dict produced by the as_dict() function would contain the Airflow default for the bare setting set. This in turn causes anything downstream, e.g. Airflow workers to ignore the cmd or secret version of that config and always use Airflow defaults.

@potiuk
Copy link
Member

potiuk commented Mar 14, 2022

I'm not quite sure what you mean @potiuk

I mean that we should mention it was NOT working as expected before.

@xyu
Copy link
Contributor Author

xyu commented Mar 14, 2022

Oh I see as in for a changelog? Or do you mean add another "note" box to this page?

@potiuk
Copy link
Member

potiuk commented Mar 14, 2022

Oh I see as in for a changelog? Or do you mean add another "note" box to this page?

I think would be nice to mentiont it on this page that it was working differently before 2.3.0

@xyu xyu requested a review from kaxil as a code owner March 14, 2022 21:35
@potiuk
Copy link
Member

potiuk commented Mar 15, 2022

Random environmental problems. Merging.

@potiuk potiuk merged commit e07bc63 into apache:main Mar 15, 2022
@boring-cyborg
Copy link

boring-cyborg bot commented Mar 15, 2022

Awesome work, congrats on your first merged pull request!

@ephraimbuddy ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Mar 16, 2022
ephraimbuddy pushed a commit that referenced this pull request Mar 16, 2022
* Filter out default configs when overrides exist.

When sending configs to Airflow workers we materialize a temp config file. In #18772 a feature was added so that `_cmd` generated secrets are not written to the files in some cases instead favoring maintaining the raw `_cmd` settings. Unfortunately during materializing of the configs via `as_dict()` Airflow defaults are generated and materialized as well including defaults for the non `_cmd` versions of some settings. And due to Airflow setting precedence stating bare versions of settings winning over `_cmd` versions it results in `_cmd` settings being discarded:
https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html

This change checks `_cmd`, env, and secrets when materializing configs via `as_dict()` so that if the bare versions of the values is exactly the same as Airflow defaults and we have "hidden" / special versions of these configs that are trying to be set we remove the bare versions so that the correct version can be used.

Fixes: #20092
Related to: #18772 #4050

(cherry picked from commit e07bc63)
ephraimbuddy pushed a commit that referenced this pull request Mar 20, 2022
* Filter out default configs when overrides exist.

When sending configs to Airflow workers we materialize a temp config file. In #18772 a feature was added so that `_cmd` generated secrets are not written to the files in some cases instead favoring maintaining the raw `_cmd` settings. Unfortunately during materializing of the configs via `as_dict()` Airflow defaults are generated and materialized as well including defaults for the non `_cmd` versions of some settings. And due to Airflow setting precedence stating bare versions of settings winning over `_cmd` versions it results in `_cmd` settings being discarded:
https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html

This change checks `_cmd`, env, and secrets when materializing configs via `as_dict()` so that if the bare versions of the values is exactly the same as Airflow defaults and we have "hidden" / special versions of these configs that are trying to be set we remove the bare versions so that the correct version can be used.

Fixes: #20092
Related to: #18772 #4050

(cherry picked from commit e07bc63)
ephraimbuddy pushed a commit that referenced this pull request Mar 22, 2022
* Filter out default configs when overrides exist.

When sending configs to Airflow workers we materialize a temp config file. In #18772 a feature was added so that `_cmd` generated secrets are not written to the files in some cases instead favoring maintaining the raw `_cmd` settings. Unfortunately during materializing of the configs via `as_dict()` Airflow defaults are generated and materialized as well including defaults for the non `_cmd` versions of some settings. And due to Airflow setting precedence stating bare versions of settings winning over `_cmd` versions it results in `_cmd` settings being discarded:
https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html

This change checks `_cmd`, env, and secrets when materializing configs via `as_dict()` so that if the bare versions of the values is exactly the same as Airflow defaults and we have "hidden" / special versions of these configs that are trying to be set we remove the bare versions so that the correct version can be used.

Fixes: #20092
Related to: #18772 #4050

(cherry picked from commit e07bc63)
ephraimbuddy pushed a commit that referenced this pull request Mar 22, 2022
* Filter out default configs when overrides exist.

When sending configs to Airflow workers we materialize a temp config file. In #18772 a feature was added so that `_cmd` generated secrets are not written to the files in some cases instead favoring maintaining the raw `_cmd` settings. Unfortunately during materializing of the configs via `as_dict()` Airflow defaults are generated and materialized as well including defaults for the non `_cmd` versions of some settings. And due to Airflow setting precedence stating bare versions of settings winning over `_cmd` versions it results in `_cmd` settings being discarded:
https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html

This change checks `_cmd`, env, and secrets when materializing configs via `as_dict()` so that if the bare versions of the values is exactly the same as Airflow defaults and we have "hidden" / special versions of these configs that are trying to be set we remove the bare versions so that the correct version can be used.

Fixes: #20092
Related to: #18772 #4050

(cherry picked from commit e07bc63)
ephraimbuddy pushed a commit that referenced this pull request Mar 22, 2022
* Filter out default configs when overrides exist.

When sending configs to Airflow workers we materialize a temp config file. In #18772 a feature was added so that `_cmd` generated secrets are not written to the files in some cases instead favoring maintaining the raw `_cmd` settings. Unfortunately during materializing of the configs via `as_dict()` Airflow defaults are generated and materialized as well including defaults for the non `_cmd` versions of some settings. And due to Airflow setting precedence stating bare versions of settings winning over `_cmd` versions it results in `_cmd` settings being discarded:
https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html

This change checks `_cmd`, env, and secrets when materializing configs via `as_dict()` so that if the bare versions of the values is exactly the same as Airflow defaults and we have "hidden" / special versions of these configs that are trying to be set we remove the bare versions so that the correct version can be used.

Fixes: #20092
Related to: #18772 #4050

(cherry picked from commit e07bc63)
ephraimbuddy pushed a commit that referenced this pull request Mar 22, 2022
* Filter out default configs when overrides exist.

When sending configs to Airflow workers we materialize a temp config file. In #18772 a feature was added so that `_cmd` generated secrets are not written to the files in some cases instead favoring maintaining the raw `_cmd` settings. Unfortunately during materializing of the configs via `as_dict()` Airflow defaults are generated and materialized as well including defaults for the non `_cmd` versions of some settings. And due to Airflow setting precedence stating bare versions of settings winning over `_cmd` versions it results in `_cmd` settings being discarded:
https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html

This change checks `_cmd`, env, and secrets when materializing configs via `as_dict()` so that if the bare versions of the values is exactly the same as Airflow defaults and we have "hidden" / special versions of these configs that are trying to be set we remove the bare versions so that the correct version can be used.

Fixes: #20092
Related to: #18772 #4050

(cherry picked from commit e07bc63)
ephraimbuddy pushed a commit that referenced this pull request Mar 24, 2022
* Filter out default configs when overrides exist.

When sending configs to Airflow workers we materialize a temp config file. In #18772 a feature was added so that `_cmd` generated secrets are not written to the files in some cases instead favoring maintaining the raw `_cmd` settings. Unfortunately during materializing of the configs via `as_dict()` Airflow defaults are generated and materialized as well including defaults for the non `_cmd` versions of some settings. And due to Airflow setting precedence stating bare versions of settings winning over `_cmd` versions it results in `_cmd` settings being discarded:
https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html

This change checks `_cmd`, env, and secrets when materializing configs via `as_dict()` so that if the bare versions of the values is exactly the same as Airflow defaults and we have "hidden" / special versions of these configs that are trying to be set we remove the bare versions so that the correct version can be used.

Fixes: #20092
Related to: #18772 #4050

(cherry picked from commit e07bc63)
ephraimbuddy pushed a commit that referenced this pull request Mar 26, 2022
* Filter out default configs when overrides exist.

When sending configs to Airflow workers we materialize a temp config file. In #18772 a feature was added so that `_cmd` generated secrets are not written to the files in some cases instead favoring maintaining the raw `_cmd` settings. Unfortunately during materializing of the configs via `as_dict()` Airflow defaults are generated and materialized as well including defaults for the non `_cmd` versions of some settings. And due to Airflow setting precedence stating bare versions of settings winning over `_cmd` versions it results in `_cmd` settings being discarded:
https://airflow.apache.org/docs/apache-airflow/stable/howto/set-config.html

This change checks `_cmd`, env, and secrets when materializing configs via `as_dict()` so that if the bare versions of the values is exactly the same as Airflow defaults and we have "hidden" / special versions of these configs that are trying to be set we remove the bare versions so that the correct version can be used.

Fixes: #20092
Related to: #18772 #4050

(cherry picked from commit e07bc63)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
full tests needed We need to run full set of tests for this PR to merge type:bug-fix Changelog: Bug Fixes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

PR #18772 breaks sql_alchemy_conn_cmd config
5 participants