Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix issue with unit tests in subprocesses on macOS #34037

Merged
merged 3 commits into from
Sep 3, 2023

Conversation

Bisk1
Copy link
Contributor

@Bisk1 Bisk1 commented Sep 2, 2023

When unit tests are run on local virtual env on macOS, a lot of them fail because mocks do not work as expected when they are used in subprocesses. See https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods "Changed in version 3.8: On macOS, the spawn start method is now the default (...)". On Unix systems "fork" is the default (so development with breeze is not affected).

When you mock a method with patch() in unittest.mock, this mock will be accessible in subprocess only if the start command was "fork" and not "spawn".
I don't know how many tests were affected by it, for me, it was a couple of tests in test_scheduler_job.py.


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

When unit tests are run on local virtual env on macOS, a lot of them fail because mocks do not work as expected when they are used in subprocesses.
See https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods
"Changed in version 3.8: On macOS, the spawn start method is now the default (...)". On Unix systems "fork" is the default.

When you mock a method with patch() in unittest.mock, this mock will be accessible in subprocess only if the start command was "fork" and not "spawn".
I don't know how many tests were affected by it, for me, it was test_scheduler_job.py.
@Bisk1 Bisk1 changed the title Fix issue with unit tests in subprocesses on Mac Fix issue with unit tests in subprocesses on macOS Sep 2, 2023
@Bisk1
Copy link
Contributor Author

Bisk1 commented Sep 2, 2023

Test results for test_scheduler_job.py before the change:
image
After the change:
image

Comment on lines 62 to 63
# mocks from unittest.mock work correctly in subprocesses only if they are created by "fork" method
mp_start_method = fork
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's know issue that MacOS doesn't work correctly in fork mode in Python (python/cpython#58037 and https://bugs.python.org/issue30385#msg293958), that basically why spawn default method in python.

Airflow itself use fork in some places even in MacOS and that the main reason for Segmentation Faults, in some other places it is controlled by mp_start_method.

So I would be very surprised if this change fix in one place and do not break in another. In conclusion I want to say that some tests just not work in MacOS and developer should use Breeze however I have no objection to adding this.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this change only for internal airflow tests might be better add it into conftest.py

os.environ["AIRFLOW__CORE__DAGS_FOLDER"] = os.fspath(AIRFLOW_TESTS_DIR / "dags")
os.environ["AIRFLOW__CORE__UNIT_TEST_MODE"] = "True"
os.environ["AWS_DEFAULT_REGION"] = os.environ.get("AWS_DEFAULT_REGION") or "us-east-1"
os.environ["CREDENTIALS_DIR"] = os.environ.get("CREDENTIALS_DIR") or "/files/airflow-breeze-config/keys"
os.environ["AIRFLOW_ENABLE_AIP_44"] = os.environ.get("AIRFLOW_ENABLE_AIP_44") or "true"

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I understand. I knew of the problem with fork in macOS but I thought it's just unstable but functionally correct for Airflow purposes, so I figured it's still useful to switch it for testing. I will review other tests to see if there is any regression on macOS after this change. I personally use breeze but it's a little cumbersome to use when debugging code with IDE.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between adding to unit_tests.cfg and conftest.py?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use or not unit_tests.cfg controlled by AIRFLOW__CORE__UNIT_TEST_MODE and will affect all Airflow users who use this variable in their tests

In opposite conftest.py only for internal Airflow Tests, and only affect contributors to Airflow, who might run tests in Breeze. In additional some tests could override AIRFLOW__CORE__UNIT_TEST_MODE to False.

That is my thoughts, lets wait what he other says. Maybe it could really be better place it in unit_tests.cfg

Copy link
Member

@potiuk potiuk Sep 3, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the difference between adding to unit_tests.cfg and conftest.py?

To add what @Taragolis wrote - this has also changed recently - it used to be independently set but as of last month or so I changed it in the way that AIRFLOW__CORE__UNIT_TEST_MODE is actually set in conftest.py - so basically all Airflow unit tests always have the AIRFLOW__CORE__UNIT_TEST_MODE set to True. This was all part of restructuring the configuration so that configuration can be contributed by providers.

Right now - no matter how you run your tests they will have it set to True.

So it could be set in either of the two places.

However, I think it's better to set this variable in conftest.py and only for MacOS because in conftest.py you can set it conditionally only when MacOS is detected. That's the nice thing about conftest.py that you can set variables and test configuraiton dynamically - by writing the imperative code rather than declare the configuration.

This might have the nice thing that eventually somoene could run the full test suite on MacOS as well and fix all the issues that introduces (also conditionally) - while CI would continue to run the default "fork" mode in CI.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK,I moved the change to conftest.py anded condition for MacOS only.
I looked at other tests and didn't notice any negative impact of the tests on my machine.

@potiuk potiuk merged commit 5833320 into apache:main Sep 3, 2023
@potiuk
Copy link
Member

potiuk commented Sep 3, 2023

fantastic :).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants