-
Notifications
You must be signed in to change notification settings - Fork 14.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Migrate Amazon Provider Package's SecretsManagerBackend
's full_url_mode=False
implementation.
#26571
Comments
That's a lot to digest. Question before anyone dives deeply (I guess this is asking for TL;DR;). I think this is very important to decide who should focus more on it and try to provide some answers/who the consensus should be with. Q: Does this propoosal make any changes in SecretsBackend interface (common for all the Secret Backends) or can it be done exclusively in the Amazon Provider without changing Secrets Backend API backwards compatibilty? Clarifying question 1: Does the change require changing anything in any other "community" Secrets Backends? Depending on those answers, then the discussion will go though completely different flow. If the 1/2 answer is - nope, no changes needed, they will continue to work and is fully backwards compatible - then this is purely amazon provider discussion, and @ferruzzi @o-nikolas @vincbeck are probably the best "discutants" here - to decide what is the best approch and whether they are fine with cleaning the code and releasing backwards-incompatible, new major provider version (and when). Otherwise - I think this calls for a devlist discussion - at least pointing to this GitHub Issue and far bigger group of people need to be involved. Also in this case I am afraid for Airlfow 2 such changes will be impossible because of our SemVer compatibilty - mostly this is a promise to our users that there will be no breaking changes in 2.* line. And we do not know yet (or even if) we are going to have breaking Airflow 3. There are multiple reasons why we would like to not do it and we even have not opened the discussion yet. |
100% of changes I am proposing are in the Amazon provider package's (You can think of a lot of these changes I am proposing as ways to make the Amazon
No.
There are backwards incompatibilities, strictly speaking. None of these backwards incompatibilities impact the Airflow base implementation whatsoever. Only the provider package. For example, with the first stage proposal--
Overall the biggest impact of this is that a user using
To clarify, I am discussing possible breaking changes in the provider package, not base Airflow. When I say "breaking," I think most of these are ultimately minor changes, I think. They are just strictly speaking breaking, hence why I am noting them as such. |
Cool. Then I think most of the comments on those are on the side of AWS people :) |
Hey @dwreeves, thanks for this fantastically detailed write-up! Although my head is still spinning a bit, it would be worse if this post wasn't as detailed as it is, so thanks again 😃 Personally, I don't think we should be so afraid of breaking changes, especially ones that affect such small percentages of users. To me it can be more of a risk to maintain complicated code for longer periods of time and also having ambiguous behaviours for longer allows folks to happily use the old features even when they should be moving off of them. And at the end of the day, sem ver versioning does it's job, folks can stick to older versions if they really require these strange old behaviours. If some super valuable bug fix arrives in a future version, we can always think about cherry-picking it backwards using the shared governance model @potiuk outlined earlier this year. So, I'm happy to combine steps 1 and 2, and after waiting appropriately long enough, to go ahead with step 3. But I'd love to hear what others think as well. Cheers, |
Brave ! |
Sorry it's so much! I think the Examples section is the clearest explanation of what the issues are with the current And yes, it is a small # of users affected. Not the worst. And up-to-date, affected users are all getting deprecation warnings right now. The end result though is I think a much more intuitive SecretsManagerBackend... at the end of it all, a user will be able to just copy+paste any arbitrary secret that works as an environment variable, into the AWS Secrets Manager, and that will be that without any surprises. (This was the challenge I originally had that frustrated me and motivated me to work on this.) |
@dwreeves very detail explanation 👍 💯 🥇 My thought about As I understand release after October 11, 2022 would For First. We can get rid of this argument (just do something It could be archived by different ways, it might be something like this sample import json
def validate(v: str) -> str:
if not v:
return v
v = v.strip()
try:
parsed_value = json.loads(v)
except json.JSONDecodeError as ex:
if v.startswith("{") or "://" not in v:
raise ValueError(f"Not valid JSON-Object or URL-encoded string. Original error:\n * {ex}")
else:
if not isinstance(parsed_value, dict):
raise TypeError(f"Expected JSON-object, got {type(v).__name__}.")
return v
values_from_sb = [
'{"conn_type": "aws"}', # JSON Object - Valid
'aws://', # String, URL encoded - Valid
'"aws://"', # JSON String, Valid JSON object, Invalid Secret Backend
"{'conn_type': 'aws'}", # repr of dict, not failed during parsing however it failed on url-unparse
'{"conn_type": "aws"', # Invalid JSON object, missing `}`
'\t\t\t\t\t{"conn_type": "aws"}\n', # Some whitespaces
]
for value in values_from_sb:
try:
print(f"{value!r} is valid value for SB: {validate(value)!r}")
except Exception as ex:
print(f"Unable to parse: {value!r}. Original error:\n - {ex}") Second. It also might be a good idea to add ability to store as JSON not only in |
Busy last few weeks for me, but I'll be getting to this soon. |
Objective
I am following up on all the changes I've made in PR #25432 and which were originally discussed in issue #25104.
The objective of the deprecations introduced in #25432 is to flag and remove "odd" behaviors in the
SecretsManagerBackend
.The objective of this issue being opened is to discuss them, and hopefully reach a consensus on how to move forward implementing the changes.
I realize that a lot of the changes I made and their philosophy were under-discussed, so I will place the discussion here.
What does it mean for a behavior to be "odd"?
You can think of the behaviors of
SecretsManagerBackend
, and which secret encodings it supports, as a Venn diagram.Ideally,
SecretsManagerBackend
supports at least everything the base API supports. This is a pretty straightforward "principle of least astonishment" requirement.For example, it would be "astonishing" if copy+pasting a secret that works with the base API did not work in the
SecretsManagerBackend
.To be clear, it would also be "astonishing" if the reverse were not true-- i.e. copy+pasting a valid secret from
SecretsManagerBackend
doesn't work with, say, environment variables. That said, adding new functionality is less astonishing than when a subclass doesn't inherit 100% of the supported behaviors of what it is subclassing. So although adding support for new secret encodings is arguably not desirable (all else equal), I think we can all agree it's not as bad as the reverse.Examples
I will cover two examples where we can see the "Venn diagram" nature of the secrets backend, and how some behaviors that are supported in one implementation are not supported in another:
Example 1
Imagine the following environment variable secret:
Prior to #25432, this was not a secret that worked with the
SecretsManagerBackend
, even though it did work with base Airflow'sEnvironmentVariablesBackend
(as of 2.3.0) due to the values not being URL-encoded.Additionally, the
EnvironmentVariablesBackend
is smart enough to choose whether a secret should be treated as a JSON or a URI without having to be explicitly told. In a sense, this is also an incompatibility-- why should theEnvironmentVariablesBackend
be "smarter" than theSecretsManagerBackend
when it comes to discerning JSONs from URIs, and supporting both at the same time rather than needing secrets to be always one type of serialization?Example 2
Imagine the following environment variable secret:
This is not a valid secret in Airflow's base
EnvironmentVariablesBackend
implementation, although it is a valid secret inSecretsManagerBackend
.There are two things that make it invalid in the
EnvironmentVariablesBackend
but valid inSecretsManagerBackend
:ast.litera_eval
inSecretsManagerBackend
means that a Python dict repr is allowed to be read in.user
andpass
are invalid field names in the base API; these should belogin
andpassword
, respectively. But the_standardize_secret_keys()
method in theSecretsManagerBackend
implementation makes it valid.Additionally, note that this secret also parses differently in the
SecretsManagerBackend
than theEnvironmentVariablesBackend
: the password"is%20url%20encoded"
renders as"is url encoded"
in theSecretsManagerBackend
, but is left untouched by the base API when using a JSON.List of odd behaviors
Prior to #25432, the following behaviors were a part of the
SecretsManagerBackend
specification that I would consider "odd" because they are not part of the base API implementation:full_url_mode=False
still required URL-encoded parameters for JSON values.ast.literal_eval
was used instead ofjson.dumps
, which means that theSecretsManagerBackend
onfull_url_mode=False
was supporting Python dict reprs and other non-JSONs.full_url_mode=False
to determine what is a JSON or URI.get_conn_value()
always must return a URI.SecretsManagerBackend
allowed for atypical / flexible field names (such asuser
instead oflogin
) via the_standardize_secret_keys()
method.We also introduced a new odd behavior in order to assist with the migration effort, which is:
are_secret_values_urlencoded
to support secrets whose encodings are "non-idempotent".In the below sections, I discuss each behavior in more detail, and I've added my own opinions about how odd these behaiors are and the estimated adverse impact of removing the behaviors.
Behavior 1: URL-encoding JSON values
This was the original behavior that frustrated me and motivated me to open issues + submit PRs.
With the "idempotency" check, I've done my best to smooth out the transition away from URL-encoded JSON values.
The requirement is now mostly removed, to the extent that the removal of this behavior can be backwards compatible and as smooth as possible:
backend_kwargs
to haveare_secret_values_urlencoded
set.I will admit that I was surprised at how smooth we could make the developer experience around migrating this behavior for the majority of use cases.
When you consider...
.. it makes me almost want to hold off on fully removing this behavior for a little while longer, just to make sure developers read their logs and see the deprecation warning.
Behavior 2:
ast.literal_eval
for deserializing JSON secretsIt is hard to feel bad for anyone who is adversely impacted by this removal:
Providing a
DeprecationWarning
for this behavior was already going above and beyond, and we can definitely remove this soon.Behavior 3:
full_url_mode=False
is required for JSON secretsThis behavior is odd because the base API does not require such a thing-- whether it is a JSON or a URI can be inferred by checking whether the first character of the string is
{
.Because it is possible to modify this behavior without introducing breaking changes, moving from lack of optionality for the
full_url_mode
kwarg can be considered a feature addition.Ultimately, we would want to switch from
full_url_mode: bool = True
tofull_url_mode: bool | None = None
.In the proposed implementation, when
full_url_mode is None
, we just use whether the value starts with{
to check if it is a JSON. Only whenfull_url_mode
is abool
would we explicitly raise errors e.g. if a JSON was given whenfull_url_mode=True
, or a URI was given whenfull_url_mode=False
.Behavior 4:
get_conn_value()
must return URIThe idea that the callback invoked by
get_connection()
(now calledget_conn_value()
, but previously calledget_conn_uri()
) can return a JSON is a new Airflow 2.3.0 behavior.This behavior cannot change until at least October 11th, because it is required for
2.2.0
backwards compatibility. Via Airflow'sREADME.md
:Changing this behavior after October 11th is just a matter of whether maintainers are OK with introduce a breaking change to the
2.2.x
folks who are relying on JSON secrets.Note that right now,
get_conn_value()
is avoided entirely whenfull_url_mode=False
andget_connection()
is called. So although there is a deprecation warning, it is almost never hit.Behavior 5: Flexible field names via
_standardize_secret_keys()
This is one of those things that is very hard to remove. Removing it can be quite disruptive!
It is also a low priority to remove because unlike some other behaviors, it does not detract from
SecretsManagerBackend
being a "strict superset" with the base API.Maybe it will just be the case that
SecretsManagerBackend
has this incompatibility with the base API going forward indefinitely?Even still, we should consider the two following proposals:
user
should probably be switched tologin
, both in thedict[str, list[str]]
used to implement the find+replace, and also in the documentation. I do not forsee any issues with doing this.I think # 1 should be uncontroversial, but # 2 may be more controversial. I do not want to detract from my other points by staking out too firm an opinion on this one, so the best solution may simply be to not touch this for now. In fact, not touching this is exactly what I did with the original PR.
Behavior 6:
are_secret_values_urlencoded
kwargIn the original discussion #25104, @potiuk told me to add something like this. I tried my best to avoid users needing to do this, hence the "idempotency" check. So only a few users actually need to specify this to assist in the migration of their secrets.
This was introduced as a "pending" deprecation because frankly, it is an odd behavior to have ever been URL-encoding these JSON values, and it only exists as a necessity to aid in migrating secrets. In our ideal end state, this doesn't exist.
Eventually when it comes time, removing this will not be all that disruptive:
full_url_mode=False
are_secret_values_urlencoded
should be set toFalse
. Users should never be manually setting toTrue
!So we're talking about a small percent of a small minority of users who will ever see or need to set this
are_secret_values_urlencoded
kwarg. And even then, they should have setare_secret_values_urlencoded
toFalse
to assist in migrating.Proposal for Next Steps
All three steps require breaking changes.
Proposed Step 1
ast.literal_eval
for deserializing JSON secretsfull_url_mode=False
is required for JSON secretsget_conn_value()
must return URIRight now the code is frankly a mess. I take some blame for that, as I did introduce the mess. But the mess is all inservice of backwards compatibility.
Removing Behavior 4 vastly simplifies the code, and means we don't need to continue overriding the
get_connection()
method.Removing Behavior 2 also simplifies the code, and is a fairly straightforward change.
Removing Behavior 3 is fully backwards compatible (so no deprecation warnings required) and provides a much nicer user experience overall.
The main thing blocking "Proposed Step 1" is the requirement that
2.2.x
be supported until at least October 11th.Alternative to Proposed Step 1
It is possible to remove Behavior 2 and Behavior 3 without removing Behavior 4, and do so in a way that keeps
2.2.x
backwards compatibility.It will still however leave a mess of the code.
I am not sure how eager the Amazon Provider Package maintainers are to keep backwards compatibility here. Between the 1 year window, plus the fact that this can only possibly impact people using both the
SecretsManagerBackend
and who havefull_url_mode=False
turned on, it seems like not an incredibly huge deal to just scrap support for2.2.x
here when the time comes. But it is not appropriate for me to decide this, so I must be clear and say that we can start trimming away some of the odd behaviors without breaking Airflow2.2.x
backwards compatibility, and that the main benefit of breaking backwards compatibility is the source code becoming way simpler.Proposed Step 2
are_secret_values_urlencoded
kwargPersonally, I don't think we should rush on this. The reason I think we should take our time here is because the current way this works is surprisingly non-disruptive (i.e. no config changes required to migrate for most users), but fully removing the behavior may be pretty disruptive, especially to people who don't read their logs carefully.
Alternative to Proposed Step 2
The alternative to this step is to combine this step with step 1, instead of holding off for a future date.
The main arguments in favor of combining with step 1 are:
full_url_mode
and it is arguable that we're being too delicate with what was arguably a semi-experimental and odd feature to begin with; it's only become less experimental by the stroke of luck that Airflow 2.3.0 supports JSON-encoded secrets in the base API.Proposed Step 3
are_secret_values_urlencoded
kwargOnce URL-encoding is no longer happening for JSON secrets, and all non-idempotent secrets have been cast or explicitly handled, and we've deprecated everything appropriately, we can finally remove
are_secret_values_urlencoded
.Conclusion
The original deprecations introduced were under-discussed, but hopefully now you both know where I was coming from, and also agree with the changes I made.
If you disagree with the deprecations that I introduced, I would also like to hear about that and why, and we can see about rolling any of them back.
Please let me know what you think about the proposed steps for changes to the code base.
Please also let me know what you think an appropriate schedule is for introducing the changes, and whether you think I should consider one of the alternatives (both considered and otherwise) to the steps I outlined in the penultimate section.
Other stuff
Use case/motivation
(See above)
Related issues
SecretsManagerBackend()
does not need to require URL-encoded values forfull_url_mode=False
connections. #25104Are you willing to submit a PR?
Code of Conduct
The text was updated successfully, but these errors were encountered: