Fix issue with unit tests in subprocesses on macOS #34037

Bisk1 · 2023-09-02T15:01:31Z

When unit tests are run on local virtual env on macOS, a lot of them fail because mocks do not work as expected when they are used in subprocesses. See https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods "Changed in version 3.8: On macOS, the spawn start method is now the default (...)". On Unix systems "fork" is the default (so development with breeze is not affected).

When you mock a method with patch() in unittest.mock, this mock will be accessible in subprocess only if the start command was "fork" and not "spawn".
I don't know how many tests were affected by it, for me, it was a couple of tests in test_scheduler_job.py.

^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

When unit tests are run on local virtual env on macOS, a lot of them fail because mocks do not work as expected when they are used in subprocesses. See https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods "Changed in version 3.8: On macOS, the spawn start method is now the default (...)". On Unix systems "fork" is the default. When you mock a method with patch() in unittest.mock, this mock will be accessible in subprocess only if the start command was "fork" and not "spawn". I don't know how many tests were affected by it, for me, it was test_scheduler_job.py.

Bisk1 · 2023-09-02T15:25:57Z

Test results for test_scheduler_job.py before the change:

After the change:

Taragolis · 2023-09-02T15:37:52Z

airflow/config_templates/unit_tests.cfg

+# mocks from unittest.mock work correctly in subprocesses only if they are created by "fork" method
+mp_start_method = fork


It's know issue that MacOS doesn't work correctly in fork mode in Python (python/cpython#58037 and https://bugs.python.org/issue30385#msg293958), that basically why spawn default method in python.

Airflow itself use fork in some places even in MacOS and that the main reason for Segmentation Faults, in some other places it is controlled by mp_start_method.

So I would be very surprised if this change fix in one place and do not break in another. In conclusion I want to say that some tests just not work in MacOS and developer should use Breeze however I have no objection to adding this.

If this change only for internal airflow tests might be better add it into conftest.py

airflow/tests/conftest.py

Lines 38 to 44 in 7700fb1

os.environ["AIRFLOW__CORE__DAGS_FOLDER"] = os.fspath(AIRFLOW_TESTS_DIR / "dags")

os.environ["AIRFLOW__CORE__UNIT_TEST_MODE"] = "True"

os.environ["AWS_DEFAULT_REGION"] = os.environ.get("AWS_DEFAULT_REGION") or "us-east-1"

os.environ["CREDENTIALS_DIR"] = os.environ.get("CREDENTIALS_DIR") or "/files/airflow-breeze-config/keys"

os.environ["AIRFLOW_ENABLE_AIP_44"] = os.environ.get("AIRFLOW_ENABLE_AIP_44") or "true"

I understand. I knew of the problem with fork in macOS but I thought it's just unstable but functionally correct for Airflow purposes, so I figured it's still useful to switch it for testing. I will review other tests to see if there is any regression on macOS after this change. I personally use breeze but it's a little cumbersome to use when debugging code with IDE.

What is the difference between adding to unit_tests.cfg and conftest.py?

Use or not unit_tests.cfg controlled by AIRFLOW__CORE__UNIT_TEST_MODE and will affect all Airflow users who use this variable in their tests

In opposite conftest.py only for internal Airflow Tests, and only affect contributors to Airflow, who might run tests in Breeze. In additional some tests could override AIRFLOW__CORE__UNIT_TEST_MODE to False.

That is my thoughts, lets wait what he other says. Maybe it could really be better place it in unit_tests.cfg

What is the difference between adding to unit_tests.cfg and conftest.py?

To add what @Taragolis wrote - this has also changed recently - it used to be independently set but as of last month or so I changed it in the way that AIRFLOW__CORE__UNIT_TEST_MODE is actually set in conftest.py - so basically all Airflow unit tests always have the AIRFLOW__CORE__UNIT_TEST_MODE set to True. This was all part of restructuring the configuration so that configuration can be contributed by providers.

Right now - no matter how you run your tests they will have it set to True.

So it could be set in either of the two places.

However, I think it's better to set this variable in conftest.py and only for MacOS because in conftest.py you can set it conditionally only when MacOS is detected. That's the nice thing about conftest.py that you can set variables and test configuraiton dynamically - by writing the imperative code rather than declare the configuration.

This might have the nice thing that eventually somoene could run the full test suite on MacOS as well and fix all the issues that introduces (also conditionally) - while CI would continue to run the default "fork" mode in CI.

OK，I moved the change to conftest.py anded condition for MacOS only.
I looked at other tests and didn't notice any negative impact of the tests on my machine.

potiuk · 2023-09-03T15:51:06Z

fantastic :).

Bisk1 changed the title ~~Fix issue with unit tests in subprocesses on Mac~~ Fix issue with unit tests in subprocesses on macOS Sep 2, 2023

Taragolis reviewed Sep 2, 2023

View reviewed changes

daniel.dylag added 2 commits September 3, 2023 14:03

Move config to conftest.py and enable only in macOS

11975db

static check

3c004cf

Taragolis approved these changes Sep 3, 2023

View reviewed changes

potiuk approved these changes Sep 3, 2023

View reviewed changes

potiuk merged commit 5833320 into apache:main Sep 3, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix issue with unit tests in subprocesses on macOS #34037

Fix issue with unit tests in subprocesses on macOS #34037

Bisk1 commented Sep 2, 2023 •

edited

Loading

Bisk1 commented Sep 2, 2023

Taragolis Sep 2, 2023

Taragolis Sep 2, 2023

Bisk1 Sep 2, 2023

Bisk1 Sep 2, 2023

Taragolis Sep 2, 2023

potiuk Sep 3, 2023 •

edited

Loading

Bisk1 Sep 3, 2023

potiuk commented Sep 3, 2023

		# mocks from unittest.mock work correctly in subprocesses only if they are created by "fork" method
		mp_start_method = fork


	os.environ["AIRFLOW__CORE__DAGS_FOLDER"] = os.fspath(AIRFLOW_TESTS_DIR / "dags")
	os.environ["AIRFLOW__CORE__UNIT_TEST_MODE"] = "True"
	os.environ["AWS_DEFAULT_REGION"] = os.environ.get("AWS_DEFAULT_REGION") or "us-east-1"
	os.environ["CREDENTIALS_DIR"] = os.environ.get("CREDENTIALS_DIR") or "/files/airflow-breeze-config/keys"
	os.environ["AIRFLOW_ENABLE_AIP_44"] = os.environ.get("AIRFLOW_ENABLE_AIP_44") or "true"

Fix issue with unit tests in subprocesses on macOS #34037

Fix issue with unit tests in subprocesses on macOS #34037

Conversation

Bisk1 commented Sep 2, 2023 • edited Loading

Bisk1 commented Sep 2, 2023

Taragolis Sep 2, 2023

Choose a reason for hiding this comment

Taragolis Sep 2, 2023

Choose a reason for hiding this comment

Bisk1 Sep 2, 2023

Choose a reason for hiding this comment

Bisk1 Sep 2, 2023

Choose a reason for hiding this comment

Taragolis Sep 2, 2023

Choose a reason for hiding this comment

potiuk Sep 3, 2023 • edited Loading

Choose a reason for hiding this comment

Bisk1 Sep 3, 2023

Choose a reason for hiding this comment

potiuk commented Sep 3, 2023

Bisk1 commented Sep 2, 2023 •

edited

Loading

potiuk Sep 3, 2023 •

edited

Loading