-
Notifications
You must be signed in to change notification settings - Fork 14.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue with unit tests in subprocesses on macOS #34037
Conversation
When unit tests are run on local virtual env on macOS, a lot of them fail because mocks do not work as expected when they are used in subprocesses. See https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods "Changed in version 3.8: On macOS, the spawn start method is now the default (...)". On Unix systems "fork" is the default. When you mock a method with patch() in unittest.mock, this mock will be accessible in subprocess only if the start command was "fork" and not "spawn". I don't know how many tests were affected by it, for me, it was test_scheduler_job.py.
# mocks from unittest.mock work correctly in subprocesses only if they are created by "fork" method | ||
mp_start_method = fork |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's know issue that MacOS doesn't work correctly in fork mode in Python (python/cpython#58037 and https://bugs.python.org/issue30385#msg293958), that basically why spawn default method in python.
Airflow itself use fork
in some places even in MacOS and that the main reason for Segmentation Faults, in some other places it is controlled by mp_start_method
.
So I would be very surprised if this change fix in one place and do not break in another. In conclusion I want to say that some tests just not work in MacOS and developer should use Breeze however I have no objection to adding this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this change only for internal airflow tests might be better add it into conftest.py
Lines 38 to 44 in 7700fb1
os.environ["AIRFLOW__CORE__DAGS_FOLDER"] = os.fspath(AIRFLOW_TESTS_DIR / "dags") | |
os.environ["AIRFLOW__CORE__UNIT_TEST_MODE"] = "True" | |
os.environ["AWS_DEFAULT_REGION"] = os.environ.get("AWS_DEFAULT_REGION") or "us-east-1" | |
os.environ["CREDENTIALS_DIR"] = os.environ.get("CREDENTIALS_DIR") or "/files/airflow-breeze-config/keys" | |
os.environ["AIRFLOW_ENABLE_AIP_44"] = os.environ.get("AIRFLOW_ENABLE_AIP_44") or "true" | |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand. I knew of the problem with fork in macOS but I thought it's just unstable but functionally correct for Airflow purposes, so I figured it's still useful to switch it for testing. I will review other tests to see if there is any regression on macOS after this change. I personally use breeze but it's a little cumbersome to use when debugging code with IDE.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between adding to unit_tests.cfg and conftest.py?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use or not unit_tests.cfg
controlled by AIRFLOW__CORE__UNIT_TEST_MODE
and will affect all Airflow users who use this variable in their tests
In opposite conftest.py
only for internal Airflow Tests, and only affect contributors to Airflow, who might run tests in Breeze. In additional some tests could override AIRFLOW__CORE__UNIT_TEST_MODE
to False.
That is my thoughts, lets wait what he other says. Maybe it could really be better place it in unit_tests.cfg
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the difference between adding to unit_tests.cfg and conftest.py?
To add what @Taragolis wrote - this has also changed recently - it used to be independently set but as of last month or so I changed it in the way that AIRFLOW__CORE__UNIT_TEST_MODE
is actually set in conftest.py
- so basically all Airflow unit tests always have the AIRFLOW__CORE__UNIT_TEST_MODE
set to True. This was all part of restructuring the configuration so that configuration can be contributed by providers.
Right now - no matter how you run your tests they will have it set to True.
So it could be set in either of the two places.
However, I think it's better to set this variable in conftest.py and only for MacOS because in conftest.py
you can set it conditionally only when MacOS is detected. That's the nice thing about conftest.py
that you can set variables and test configuraiton dynamically - by writing the imperative code rather than declare the configuration.
This might have the nice thing that eventually somoene could run the full test suite on MacOS as well and fix all the issues that introduces (also conditionally) - while CI would continue to run the default "fork" mode in CI.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK,I moved the change to conftest.py
anded condition for MacOS only.
I looked at other tests and didn't notice any negative impact of the tests on my machine.
fantastic :). |
When unit tests are run on local virtual env on macOS, a lot of them fail because mocks do not work as expected when they are used in subprocesses. See https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods "Changed in version 3.8: On macOS, the spawn start method is now the default (...)". On Unix systems "fork" is the default (so development with breeze is not affected).
When you mock a method with patch() in unittest.mock, this mock will be accessible in subprocess only if the start command was "fork" and not "spawn".
I don't know how many tests were affected by it, for me, it was a couple of tests in test_scheduler_job.py.
^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named
{pr_number}.significant.rst
or{issue_number}.significant.rst
, in newsfragments.