
Support encryption for triggers parameters #36492

Merged Jan 9, 2024 (6 commits)

Conversation

hussein-awala
Member

This PR implements a way to encrypt some of the trigger parameters using Fernet before storing them in the database.
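The approach can be sketched roughly like this (a minimal illustration, not the PR's actual code: the helper names, the `__fernet__` marker, and the sensitive-field set are made up for the example; only the use of `cryptography.fernet` matches the PR):

```python
# Sketch: Fernet-encrypt selected trigger kwargs before DB storage.
# Assumes the `cryptography` package is installed.
import json

from cryptography.fernet import Fernet

key = Fernet.generate_key()  # in Airflow this would be the configured fernet key
fernet = Fernet(key)


def encrypt_sensitive(kwargs, sensitive_fields):
    """Replace sensitive values with Fernet ciphertext before storing."""
    stored = {}
    for name, value in kwargs.items():
        if name in sensitive_fields:
            token = fernet.encrypt(json.dumps(value).encode()).decode()
            stored[name] = {"__fernet__": token}  # marker so we know to decrypt
        else:
            stored[name] = value
    return stored


def decrypt_sensitive(stored):
    """Reverse of encrypt_sensitive: restore the original kwargs."""
    decoded = {}
    for name, value in stored.items():
        if isinstance(value, dict) and "__fernet__" in value:
            decoded[name] = json.loads(fernet.decrypt(value["__fernet__"].encode()))
        else:
            decoded[name] = value
    return decoded


stored = encrypt_sensitive({"token": "s3cr3t", "path": "/tmp"}, {"token"})
assert stored["token"] != "s3cr3t"                      # ciphertext lands in the DB
assert decrypt_sensitive(stored)["token"] == "s3cr3t"   # round-trip restores it
```

The marker-dict pattern lets plain and encrypted fields coexist in the same JSON column.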

potiuk
potiuk previously approved these changes Dec 30, 2023
Member

@potiuk potiuk left a comment


LGTM - but let's wait for others to review as well.

Collaborator

@dirrao dirrao left a comment


Nice work.
Can't we access the variables and connections from the trigger run?
What is the recommended pattern for accessing the sensitive information from the trigger run?

@potiuk
Member

potiuk commented Dec 30, 2023

> Nice work. Can't we access the variables and connections from the trigger run? What is the recommended pattern for accessing the sensitive information from the trigger run?

We can - when we instantiate the trigger, not when it runs (when it runs, it runs in an async-io event loop and can only access disk, networking and other resources via asynchronous calls - which means that accessing them has to follow async/await semantics). And when a Trigger is deserialized by the Triggerer, the deserialization does not run the regular `__init__` call on the Trigger class - it just deserializes it from the DB.

And yes - we could potentially implement some `_post_deserialize` call to retrieve sensitive information from secrets or variables - however, in a number of cases this means that the Triggerer would have to access some authentication information to retrieve it. For example, it would have to have the same "account" credentials the worker has, or it might have to have access to (say) the same kubernetes config or .aws or .gcs authentication in order to retrieve that data.

While it is possible, it adds an additional requirement on the triggerer process - it would have to have the same authentication information set in its environment as the worker. But this is not necessary in most cases - as long as "serialization/deserialization" has all the necessary information.
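The constraint above can be sketched with a toy round-trip (illustrative names only - real Airflow triggers implement `BaseTrigger.serialize()`, and the rebuild step below is a simplification of what the Triggerer does):

```python
# Toy model of the trigger round-trip through the DB: a trigger serializes
# to (classpath, kwargs), and the Triggerer rebuilds it from exactly those
# stored kwargs -- so any secret the trigger needs at run time must already
# be inside `kwargs` (hence the motivation to encrypt some of them at rest).

class DateTimeTrigger:
    def __init__(self, moment):
        self.moment = moment

    def serialize(self):
        # (classpath, kwargs) is what gets written to the trigger table.
        return ("example.DateTimeTrigger", {"moment": self.moment})


def rebuild(classpath, kwargs, registry):
    # The Triggerer only has the stored kwargs to work with; it cannot
    # re-run any secret-fetching logic the worker did before deferring.
    cls = registry[classpath]
    return cls(**kwargs)


registry = {"example.DateTimeTrigger": DateTimeTrigger}
path, kwargs = DateTimeTrigger("2024-01-09T00:00:00").serialize()
restored = rebuild(path, kwargs, registry)
assert restored.moment == "2024-01-09T00:00:00"
```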

@dirrao
Collaborator

dirrao commented Dec 30, 2023

> Nice work. Can't we access the variables and connections from the trigger run? What is the recommended pattern for accessing the sensitive information from the trigger run?
>
> We can - when we instantiate the trigger, not when it runs (when it runs, it runs in an async-io event loop and can only access disk, networking and other resources via asynchronous calls - which means that accessing them has to follow async/await semantics). And when a Trigger is deserialized by the Triggerer, the deserialization does not run the regular `__init__` call on the Trigger class - it just deserializes it from the DB.
>
> And yes - we could potentially implement some `_post_deserialize` call to retrieve sensitive information from secrets or variables - however, in a number of cases this means that the Triggerer would have to access some authentication information to retrieve it. For example, it would have to have the same "account" credentials the worker has, or it might have to have access to (say) the same kubernetes config or .aws or .gcs authentication in order to retrieve that data.
>
> While it is possible, it adds an additional requirement on the triggerer process - it would have to have the same authentication information set in its environment as the worker. But this is not necessary in most cases - as long as "serialization/deserialization" has all the necessary information.

Thanks for the detailed explanation. We usually access secrets from disk, connections, or encrypted variables in Airflow. We can access the same in the trigger while it is running in the event loop - see the S3 trigger below. We usually don't generate new secrets at runtime in the worker and share them with the trigger:

    async with self.hook.async_conn as client:

@potiuk
Member

potiuk commented Dec 30, 2023

> We usually access secrets from either disk or connections or encrypted variables in Airflow

Correct. This is possible, no doubt about it.

But it might be quite a bit more complex for a number of Triggers. I am not sure how the async hook works, but if you would like to use Airflow's Connection (either from the DB or from Secrets), I think it would not work (it would need async-aware access to the DB or Secrets). I would love to see how it can work in this case - could you please explain it? Maybe I am missing something. I believe it uses the built-in async aiobotocore credential/session retrieval, but that would not use Airflow's Connections/Secrets - or am I wrong?

Also, in some cases (for example when authentication information comes from externally mounted volumes) it is simply not available. A good example is a managed service where the authentication information is available on a mounted volume, but the volume is not mounted in the Triggerer. Yes, this is a limitation of that particular service, but it also shows a more general problem: the Triggerer would generally need the same auth environment as the worker (which might not be desired or possible).

@dirrao
Collaborator

dirrao commented Dec 30, 2023

> We usually access secrets from either disk or connections or encrypted variables in Airflow
>
> Correct. This is possible, no doubt about it.
>
> But it might be quite a bit more complex for a number of Triggers. I am not sure how the async hook works, but if you would like to use Airflow's Connection (either from the DB or from Secrets), I think it would not work (it would need async-aware access to the DB or Secrets). I would love to see how it can work in this case - could you please explain it? Maybe I am missing something. I believe it uses the built-in async aiobotocore credential/session retrieval, but that would not use Airflow's Connections/Secrets - or am I wrong?

Yes. You are right. I will verify and update you on the DB access.

> Also, in some cases (for example when authentication information comes from externally mounted volumes) it is simply not available. A good example is a managed service where the authentication information is available on a mounted volume, but the volume is not mounted in the Triggerer. Yes, this is a limitation of that particular service, but it also shows a more general problem: the Triggerer would generally need the same auth environment as the worker (which might not be desired or possible).

Yes, it makes sense.

@hussein-awala hussein-awala dismissed potiuk’s stale review December 30, 2023 12:59

Testing conversation resolution

@hussein-awala
Member Author

Merging is blocked
The base branch requires all conversations on code to be resolved.

🎉

@potiuk
Member

potiuk commented Dec 30, 2023

And resolving the conversation enables merging. All cool :)

@dirrao
Collaborator

dirrao commented Dec 30, 2023

> We usually access secrets from either disk or connections or encrypted variables in Airflow
>
> Correct. This is possible, no doubt about it.
>
> But it might be quite a bit more complex for a number of Triggers. I am not sure how the async hook works, but if you would like to use Airflow's Connection (either from the DB or from Secrets), I think it would not work (it would need async-aware access to the DB or Secrets). I would love to see how it can work in this case - could you please explain it? Maybe I am missing something. I believe it uses the built-in async aiobotocore credential/session retrieval, but that would not use Airflow's Connections/Secrets - or am I wrong?
>
> Yes. You are right. I will verify and update you on the DB access.

I have tried accessing the variables from the trigger run function multiple times, after multiple context switches in the event loop. I can access the variable values without any issues. I am sure we can access the connection details similarly, so we can avoid passing complete connection details through trigger kwargs.

> Also, in some cases (for example when authentication information comes from externally mounted volumes) it is simply not available. A good example is a managed service where the authentication information is available on a mounted volume, but the volume is not mounted in the Triggerer. Yes, this is a limitation of that particular service, but it also shows a more general problem: the Triggerer would generally need the same auth environment as the worker (which might not be desired or possible).

Yes, it makes sense.

@potiuk
Member

potiuk commented Dec 30, 2023

> I have tried accessing the variables from the trigger run function multiple times after multiple context switches in the event loop. I can access the variable values without any issues

Yes. Accessing will work. But the fact that you can do it - does not mean you should :).

The problem is that if you do it during the event loop, the access (if not using async/await) will block on I/O or networking, and while it is blocked, no other triggers are running, because the event loop runs synchronously. See what happens (and what warnings you get) when you just `sleep()` for a second or so. Or give your Triggerer fractional CPU (say 0.05) - then you will see that even a short, 50 ms blocking operation is disastrous for the event loop.
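The effect is easy to reproduce outside Airflow. In this toy loop (not Airflow code), a coroutine that calls `time.sleep()` freezes the whole loop, while one that uses `await asyncio.sleep()` yields control:

```python
# Demonstrates why a blocking call inside a trigger's async run()
# starves every other trigger sharing the event loop.
import asyncio
import time


async def well_behaved(results):
    # Yields to the loop, so other coroutines keep running meanwhile.
    await asyncio.sleep(0.05)
    results.append("async")


async def blocking(results):
    # time.sleep() never yields: the whole loop is frozen for 0.2 s.
    # In a Triggerer this would stall thousands of other triggers.
    time.sleep(0.2)
    results.append("blocking")


async def main():
    results = []
    await asyncio.gather(blocking(results), well_behaved(results))
    return results


order = asyncio.run(main())
# The blocking coroutine finishes first even though it "sleeps" longer,
# because it monopolized the loop before well_behaved could resume.
print(order)  # ['blocking', 'async']
```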

@hussein-awala
Member Author

> The problem is that if you do it during the event loop, the access (if not using async/await) will block on I/O or networking, and while it is blocked, no other triggers are running, because the event loop runs synchronously. See what happens (and what warnings you get) when you just `sleep()` for a second or so. Or give your Triggerer fractional CPU (say 0.05) - then you will see that even a short, 50 ms blocking operation is disastrous for the event loop.

I'm running a POC to make this easily possible without blocking the event loop and will create a new AIP soon 🤞

@potiuk
Member

potiuk commented Dec 30, 2023

> I'm running a POC to make this easily possible without blocking the event loop and will create a new AIP soon 🤞

👀

@dirrao
Collaborator

dirrao commented Dec 30, 2023

> I have tried accessing the variables from the trigger run function multiple times after multiple context switches in the event loop. I can access the variable values without any issues
>
> Yes. Accessing will work. But the fact that you can do it does not mean you should :).
>
> The problem is that if you do it during the event loop, the access (if not using async/await) will block on I/O or networking, and while it is blocked, no other triggers are running, because the event loop runs synchronously. See what happens (and what warnings you get) when you just `sleep()` for a second or so. Or give your Triggerer fractional CPU (say 0.05) - then you will see that even a short, 50 ms blocking operation is disastrous for the event loop.

Yes, you are right. It makes even more sense in the case of request/response APIs. However, in Airflow, a trigger might run for seconds, minutes, or hours (e.g. S3KeyTrigger, KubernetesPodTrigger). I believe accessing the credentials one time should be OK.

@potiuk
Member

potiuk commented Dec 30, 2023

> Yes, you are right. It makes even more sense in the case of request/response APIs. However, in Airflow, a trigger might run for seconds, minutes, or hours (e.g. S3KeyTrigger, KubernetesPodTrigger). I believe accessing the credentials one time should be OK.

Not necessarily. It is generally a very bad idea to make a blocking call in the event loop. All kinds of problems arise - even a DNS lookup hanging for a second will heavily impact the Triggerer's capacity. It is supposed to handle thousands of triggers running in a single event loop, so even if just one trigger occasionally hangs for a second, it can have an enormous impact on the Triggerer's performance.
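One conventional way to do a one-time blocking lookup without freezing the loop is to push it to a thread pool via `run_in_executor`. A sketch, assuming a hypothetical synchronous `fetch_credentials()` helper (not an Airflow API):

```python
# Sketch: offload a blocking, one-time credentials fetch to a worker
# thread so the Triggerer's event loop keeps making progress.
import asyncio
import time


def fetch_credentials():
    # Stand-in for a synchronous DB/secrets-backend call (hypothetical).
    time.sleep(0.1)
    return {"token": "s3cr3t"}


async def trigger_run():
    loop = asyncio.get_running_loop()
    # run_in_executor runs the blocking call in the default thread pool;
    # the await yields, so other triggers in the loop are not stalled.
    creds = await loop.run_in_executor(None, fetch_credentials)
    return creds["token"]


print(asyncio.run(trigger_run()))  # s3cr3t
```

This keeps the loop responsive, though it still requires the Triggerer to have the auth environment needed by the lookup, which is the concern raised above.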

Contributor

@jscheffl jscheffl left a comment


I do not insist too hard on the pytest improvements - take them as a recommendation. Approving in advance ;-D

Review comment on tests/models/test_trigger.py (resolved)
@hussein-awala
Member Author

hussein-awala commented Jan 9, 2024

> I do not insist too hard on the pytest improvements - take them as a recommendation. Approving in advance ;-D

I just updated the test.

Actually, since we test the loaded value:

    loaded_trigger: SensitiveKwargsTrigger = TriggerRunner().trigger_row_to_trigger_instance(
        trigger_row, SensitiveKwargsTrigger
    )
    assert loaded_trigger.param1 == "value1"
    assert loaded_trigger.param2 == "value2"

we don't care how the value is stored, we just need to check that it is not the original one, as you suggested. 👍

@hussein-awala
Member Author

The PR was open for two weeks. Since there are no objections to the proposed solution, I will merge it.

@hussein-awala hussein-awala merged commit 8fb55f2 into apache:main Jan 9, 2024
53 checks passed
@dstandish
Contributor

Hi, just saw this. Just curious, why not just always encrypt, instead of forcing each implementer to mess with the kwargs naming?

@potiuk
Member

potiuk commented Jan 10, 2024

> Hi, just saw this. Just curious, why not just always encrypt, instead of forcing each implementer to mess with the kwargs naming?

My take: encrypting all of it makes trigger data opaque and difficult to diagnose in case of problems.

Also, in future optimisation cases it makes it impossible (or very slow) to do some housekeeping. A good example of this is encrypted user sessions in the users DB: we could not delete old sessions without seeing the metadata that was encrypted in the blob - for example, we have not been able to do very efficient "historical session" deletion because the session creation time was part of the blob.

The current implementation uses ExtendedJson to store kwargs in the DB - and in modern DBs (even MySQL!) we could make efficient queries against the kwargs. I can easily imagine using this for particular types of triggers ("find all deferred KPOs that have been using this namespace", for example). I think - for example - when we go to multitenancy, and maybe in the future to multi-multi-tenancy, this kind of queryable metadata might become really useful even for internal Airflow stuff. But even now it can provide valuable information and diagnostics for some power users.

@jscheffl
Contributor

> Hi, just saw this. Just curious, why not just always encrypt, instead of forcing each implementer to mess with the kwargs naming?
>
> My take: encrypting all of it makes trigger data opaque and difficult to diagnose in case of problems.
>
> Also, in future optimisation cases it makes it impossible (or very slow) to do some housekeeping. A good example of this is encrypted user sessions in the users DB: we could not delete old sessions without seeing the metadata that was encrypted in the blob - for example, we have not been able to do very efficient "historical session" deletion because the session creation time was part of the blob.
>
> The current implementation uses ExtendedJson to store kwargs in the DB - and in modern DBs (even MySQL!) we could make efficient queries against the kwargs. I can easily imagine using this for particular types of triggers ("find all deferred KPOs that have been using this namespace", for example). I think - for example - when we go to multitenancy, and maybe in the future to multi-multi-tenancy, this kind of queryable metadata might become really useful even for internal Airflow stuff. But even now it can provide valuable information and diagnostics for some power users.

Mhm, I agree that encryption makes troubleshooting harder - but the pickled JSON/object data cannot be queried at the DB level without Python anyway.

Considering that the trigger information is very volatile (I am not sure, but I was expecting that after a task completes it is cleaned up and does not pile up?), it would be really reasonable to just encrypt everything. Then less code is needed.

@potiuk
Member

potiuk commented Jan 10, 2024

> Considering that the trigger information is very volatile (I am not sure, but I was expecting that after a task completes it is cleaned up and does not pile up?), it would be really reasonable to just encrypt everything. Then less code is needed.

Fair point (but JSON data can be queried in modern DBs)
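For illustration (not Airflow's actual schema - the table and column names are made up), this is the kind of query that per-field encryption keeps possible and whole-blob encryption rules out. SQLite's `json_extract` stands in for the equivalent MySQL/PostgreSQL JSON functions:

```python
# Querying inside a JSON column: find triggers by a plaintext kwarg.
# If the whole kwargs blob were encrypted, this query would be impossible.
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trigger (id INTEGER PRIMARY KEY, kwargs TEXT)")
rows = [
    (1, json.dumps({"namespace": "team-a", "pod_name": "job-1"})),
    (2, json.dumps({"namespace": "team-b", "pod_name": "job-2"})),
]
conn.executemany("INSERT INTO trigger VALUES (?, ?)", rows)

# "Find all deferred pod triggers using this namespace", done in SQL.
found = conn.execute(
    "SELECT id FROM trigger WHERE json_extract(kwargs, '$.namespace') = ?",
    ("team-a",),
).fetchall()
print(found)  # [(1,)]
```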

@potiuk
Member

potiuk commented Jan 10, 2024

@ephraimbuddy ephraimbuddy added the type:bug-fix Changelog: Bug Fixes label Jan 10, 2024
@jscheffl
Contributor

https://www.postgresqltutorial.com/postgresql-tutorial/postgresql-json/

...but I assume we don't want to implement N different dialects for all supported DBs - and anyway, once we retrieve the data, we need all of it, not a subset.

ephraimbuddy pushed a commit that referenced this pull request Jan 11, 2024
* Implement a way to encrypt some trigger params stored in the DB

* Update deferring doc

* Create a variable for encryption prefix

* Update the test

(cherry picked from commit 8fb55f2)
@potiuk
Member

potiuk commented Jan 14, 2024

Yeah, I am fine with encrypting it all. It does not seem too problematic, and it has the nice property of securing EVERYTHING.

@potiuk
Member

potiuk commented Jan 14, 2024

@hussein-awala - what do you think?

abhishekbhakat pushed a commit to abhishekbhakat/my_airflow that referenced this pull request Mar 5, 2024
* Implement a way to encrypt some trigger params stored in the DB

* Update deferring doc

* Create a variable for encryption prefix

* Update the test
hussein-awala added a commit to hussein-awala/airflow that referenced this pull request Mar 17, 2024
ephraimbuddy added a commit to astronomer/airflow that referenced this pull request Mar 18, 2024
ephraimbuddy added a commit that referenced this pull request Mar 18, 2024
@ephraimbuddy ephraimbuddy removed the type:bug-fix Changelog: Bug Fixes label Mar 18, 2024
pateash pushed a commit to pateash/airflow that referenced this pull request Mar 18, 2024
@ephraimbuddy ephraimbuddy added the changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) label Mar 25, 2024
utkarsharma2 pushed a commit to astronomer/airflow that referenced this pull request Apr 22, 2024
Labels
area:Scheduler including HA (high availability) scheduler area:Triggerer changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) kind:documentation
7 participants