Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: Refactored __new__ magic method of BaseOperatorMeta to avoid bad mixing classic and decorated operators #37937

Merged
merged 96 commits into from
Mar 24, 2024

Conversation

dabla
Copy link
Contributor

@dabla dabla commented Mar 6, 2024

After the discussion started by @potiuk on the dev list, I've started a new pull request in which I implemented a decorator which checks from where the execute method of an operator is being called. Maybe the check will be to restrictive, so I hope we will see the effect of this in the integration tests as the check is disabled when running unit tests, otherwise the execute method of an operator would never be able to run. I've added 2 unit tests, one case where it's being called directly and thus fails and another one where it's indirectly being called from a TaskInstance.

Any remarks are welcome.

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

…o include an execute safeguard which prohibits called the execute method manually from a Python callable or @task decorated python method.
@dabla dabla requested a review from uranusjr as a code owner March 6, 2024 12:44
@uranusjr
Copy link
Member

uranusjr commented Mar 6, 2024

Is it possible (desirable?) to raise an error when the operator is instantiated, instead of when execute is called?

@dabla
Copy link
Contributor Author

dabla commented Mar 6, 2024

Is it possible (desirable?) to raise an error when the operator is instantiated, instead of when execute is called?

I've tried that at first as that was also my first idea, but then the decorator will already be applied (or not) when the class get's loaded instead of when calling the executed method, which then makes it impossible to test because it's to late as I modify the unit_test_mode env var before each run of the test.

Copy link
Contributor

@Taragolis Taragolis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tests/models/test_baseoperatormeta.py Outdated Show resolved Hide resolved
tests/models/test_baseoperatormeta.py Outdated Show resolved Hide resolved
tests/models/test_baseoperatormeta.py Outdated Show resolved Hide resolved
tests/models/test_baseoperatormeta.py Outdated Show resolved Hide resolved
davidblain-infrabel and others added 22 commits March 6, 2024 15:53
…est_mode config parameter and use pytest instead of unittest to run the test
…run method of task instance in hope test won't fail on Airflow
… toggle the test_mode on it when used in unit tests
… toggle the test_mode on it when used in unit tests
@dabla
Copy link
Contributor Author

dabla commented Mar 21, 2024

I've tried refactoring it with sentinel but now integration tests start failing probably because it's considering the operator being called outside of TaskInstance while this was not the case with traceback.

@dabla
Copy link
Contributor Author

dabla commented Mar 21, 2024

I've tried refactoring it with sentinel but now integration tests start failing probably because it's considering the operator being called outside of TaskInstance while this was not the case with traceback.

Ok found out why this is failing, it because now the test_mode flag is used from TaskInstance but in those failing tests it's not and it's calling the execute methode directly on operator as a unit test (celery).

@potiuk
Copy link
Member

potiuk commented Mar 22, 2024

@ashb ?

Copy link
Member

@ashb ashb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Happy with the code now, I haven't caught up on the mailing list discussion though

Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. If @ashb is ok with it, we can merge it now for 2.9.0. Having a warning is "good enough" for now. If we decide to do something more (for example allow such use to behave more "as expected") we can always add it later.

@potiuk potiuk merged commit 694826d into apache:main Mar 24, 2024
46 checks passed
@ephraimbuddy ephraimbuddy added the type:misc/internal Changelog: Misc changes that should appear in change log label Mar 25, 2024
utkarsharma2 pushed a commit to astronomer/airflow that referenced this pull request Apr 22, 2024
…d bad mixing classic and decorated operators (apache#37937)

* refactor: Refactored __new__ magic method of BaseOperatorMeta class to include an execute safeguard which prohibits called the execute method manually from a Python callable or @task decorated python method.

* refactor: Get unit_test_mode from airflow config instead of directly retrieving it from env variable

* fix: Fixed import of AirflowException

* refactor: Use conf_vars instead of patch.dict to alter Airflow unit_test_mode config parameter and use pytest instead of unittest to run the test

* fix: Fixed match on expected exception message

* refactor: Added test case where HelloWorldOperator is called from within an PythonOperator

* refactor: Refactored TestBaseOperatorMeta by having a dedicated fixture for TaskInstance

* refactor: Reformatted some files that where failing due to static checks

* refactor: Try disabling unit_test_mode right just before calling the run method of task instance in hope test won't fail on Airflow

* refactor: Re-ordered imports as asked by static checks

* refactor: Added license at top of new test module

* refactor: Refactored ExecutorSafeguard decorator as a class so we can toggle the test_mode on it when used in unit tests

* refactor: Removed patch of sqlalchemy Session on test methods

* refactor: Refactored ExecutorSafeguard decorator as a class so we can toggle the test_mode on it when used in unit tests

* Revert "refactor: Refactored ExecutorSafeguard decorator as a class so we can toggle the test_mode on it when used in unit tests"

This reverts commit fabea74.

* refactor: Refactored ExecutorSafeguard decorator as a class so we can toggle the test_mode on it when used in unit tests

* refactor: Added docstring to ExecutorSafeguard decorator

* docs: Reformatted docstring of ExecutorSafeguard

* refactor: Added missing white line between docstring and test_mode class var of ExecutorSafeguard class

* refactor: Fixed unit tests for ExecutorSafeguard

* refactor: Reformatted baseoperator and test_baseoperatormeta file as expected by static check

* fix: Fixed import of partial function from functools

* refactor: Refactored multiple patches into one context manager so it's Python 3.8 compatible

* refactor: Reformatted patch statement in TestExecutorSafeguard

* refactor: Added allow_mixing attribute to BaseOperator and added test case when allow_mixing is enabled

* refactor: Fixed static checks on baseoperator

* fix: Forgot to also add allow_mixin parameter to partial method in BaseOperator

* refactor: Trying to fix example in docstring of allow_mixin param in BaseOperator

* refactor: Added bool type to allow_mixing attribute of BaseOperator

* refactor: Added allow_mixin parameter

* refactor: Added init file in resources package

* refactor: Changed docstring of BaseOperator to raw string due to backslash present in example

* fix: Fixed name of allow_mixin parameter

* fix: Fixed check on allow_mixin property

* fix: Fixed allow_mixin property in test

* refactor: Added allow_mixin to schema.json and assertion in test_dag_serialization

* refactor: Refactored tests using dag_maker fixture and running them as db_test + fixed issue of nested calls with ExecutorSafeguard

* refactor: Moved import of Context under type checking

* refactor: Reformatted TestExecutorSafeguard

* refactor: Simplified check on test_mode in wrapper

* refactor: Improved check as a classic operator can also be called through a python function called by a Python operator

* refactor: Fixed additional static checks

* refactor: Fixed additional static checks

* refactor: Use sentinel instead of traceback to detect if execute was called outside TaskInstance

* refactor: Use test_mode from TaskInstance

* refactor: Removed unused imports

* refactor: Put message in one line

* Revert "refactor: Use test_mode from TaskInstance"

This reverts commit 8a35ebb

* refactor: Put message on one line

* refactor: Fixed passing of sentinel arg when resume_execution is being called

* refactor: Reformatted test

* refactor: Check if next_kwargs is not None

* refactor: Fixed DAG example in docstring of chain_linear method

* refactor: Renamed allow_mixin parameter of BaseOperator to allow_nested_operators

---------

Co-authored-by: David Blain <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type:misc/internal Changelog: Misc changes that should appear in change log
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants