Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix pytest from working outside breeze #43082

Merged
merged 1 commit into from
Oct 16, 2024
Merged

Conversation

kaxil
Copy link
Member

@kaxil kaxil commented Oct 16, 2024

This was missed in #42985 . Without this airflow.providers.__path__ had 2 registered paths:

['/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers/src/airflow/providers',
 '/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers']

This prevents the tests from running outside of breeze and we get the following error:

ERROR tests/core/test_settings.py::test_usage_data_collection_disabled[true-True-True] - airflow.exceptions.AirflowConfigException: ("The provider apache-airflow-providers-src-airflow-providers-amazon is attempting to contribute configuration section aws that has already been added before. The source of it: apache-airflow-providers-amazon. This is forbidden. A provider can only add new sections. It cannot contribute options to existing sections or override other provider's configuration.", <class 'UserWarning'>)

We get this error because the Providers Manager uses airflow.providers.__path__ to register providers. Because we have 2 paths, it registers the same provider twice leading two the above error.

for path in airflow.providers.__path__: # type: ignore[attr-defined]

Example registration:

('apache-airflow-providers-src-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud ', ...

('apache-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud '

This wasn't a problem in breeze as it sets AIRFLOW_SOURCES env var in Dockerfile

. "${AIRFLOW_SOURCES:-/opt/airflow}"/scripts/in_container/_in_container_script_init.sh


^ Add meaningful description above
Read the Pull Request Guidelines for more information.
In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

This was missed in apache#42985 . Without this `airflow.providers.__path__` had 2 registered paths:

```
['/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers/src/airflow/providers',
 '/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers']
```

This prevents the tests from running outside of breeze and we get the following error:

```
ERROR tests/core/test_settings.py::test_usage_data_collection_disabled[true-True-True] - airflow.exceptions.AirflowConfigException: ("The provider apache-airflow-providers-src-airflow-providers-amazon is attempting to contribute configuration section aws that has already been added before. The source of it: apache-airflow-providers-amazon. This is forbidden. A provider can only add new sections. It cannot contribute options to existing sections or override other provider's configuration.", <class 'UserWarning'>)
```

We get this error because the *Providers Manager* uses `airflow.providers.__path__` to register providers. Because we have 2 paths, it registers the same provider twice leading two the above error.

https://github.com/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/airflow/providers_manager.py#L662

Example registration:
```
('apache-airflow-providers-src-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud ', ...

('apache-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud '
```

This wasn't a problem in breeze as it sets `AIRFLOW_SOURCES` env var in Dockerfile
https://github.com/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/scripts/docker/entrypoint_ci.sh#L24
@kaxil kaxil merged commit 12afc1d into apache:main Oct 16, 2024
17 checks passed
@kaxil kaxil deleted the fix-path branch October 16, 2024 15:01
@amoghrajesh
Copy link
Contributor

Ah thanks. Let me try it out by rebasing!

@amoghrajesh
Copy link
Contributor

Thank you @kaxil!
It works as expected!

R7L208 pushed a commit to R7L208/airflow that referenced this pull request Oct 17, 2024
This was missed in apache#42985 . Without this `airflow.providers.__path__` had 2 registered paths:

```
['/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers/src/airflow/providers',
 '/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers']
```

This prevents the tests from running outside of breeze and we get the following error:

```
ERROR tests/core/test_settings.py::test_usage_data_collection_disabled[true-True-True] - airflow.exceptions.AirflowConfigException: ("The provider apache-airflow-providers-src-airflow-providers-amazon is attempting to contribute configuration section aws that has already been added before. The source of it: apache-airflow-providers-amazon. This is forbidden. A provider can only add new sections. It cannot contribute options to existing sections or override other provider's configuration.", <class 'UserWarning'>)
```

We get this error because the *Providers Manager* uses `airflow.providers.__path__` to register providers. Because we have 2 paths, it registers the same provider twice leading two the above error.

https://github.com/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/airflow/providers_manager.py#L662

Example registration:
```
('apache-airflow-providers-src-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud ', ...

('apache-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud '
```

This wasn't a problem in breeze as it sets `AIRFLOW_SOURCES` env var in Dockerfile
https://github.com/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/scripts/docker/entrypoint_ci.sh#L24
harjeevanmaan pushed a commit to harjeevanmaan/airflow that referenced this pull request Oct 23, 2024
This was missed in apache#42985 . Without this `airflow.providers.__path__` had 2 registered paths:

```
['/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers/src/airflow/providers',
 '/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers']
```

This prevents the tests from running outside of breeze and we get the following error:

```
ERROR tests/core/test_settings.py::test_usage_data_collection_disabled[true-True-True] - airflow.exceptions.AirflowConfigException: ("The provider apache-airflow-providers-src-airflow-providers-amazon is attempting to contribute configuration section aws that has already been added before. The source of it: apache-airflow-providers-amazon. This is forbidden. A provider can only add new sections. It cannot contribute options to existing sections or override other provider's configuration.", <class 'UserWarning'>)
```

We get this error because the *Providers Manager* uses `airflow.providers.__path__` to register providers. Because we have 2 paths, it registers the same provider twice leading two the above error.

https://github.com/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/airflow/providers_manager.py#L662

Example registration:
```
('apache-airflow-providers-src-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud ', ...

('apache-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud '
```

This wasn't a problem in breeze as it sets `AIRFLOW_SOURCES` env var in Dockerfile
https://github.com/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/scripts/docker/entrypoint_ci.sh#L24
PaulKobow7536 pushed a commit to PaulKobow7536/airflow that referenced this pull request Oct 24, 2024
This was missed in apache#42985 . Without this `airflow.providers.__path__` had 2 registered paths:

```
['/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers/src/airflow/providers',
 '/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers']
```

This prevents the tests from running outside of breeze and we get the following error:

```
ERROR tests/core/test_settings.py::test_usage_data_collection_disabled[true-True-True] - airflow.exceptions.AirflowConfigException: ("The provider apache-airflow-providers-src-airflow-providers-amazon is attempting to contribute configuration section aws that has already been added before. The source of it: apache-airflow-providers-amazon. This is forbidden. A provider can only add new sections. It cannot contribute options to existing sections or override other provider's configuration.", <class 'UserWarning'>)
```

We get this error because the *Providers Manager* uses `airflow.providers.__path__` to register providers. Because we have 2 paths, it registers the same provider twice leading two the above error.

https://github.com/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/airflow/providers_manager.py#L662

Example registration:
```
('apache-airflow-providers-src-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud ', ...

('apache-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud '
```

This wasn't a problem in breeze as it sets `AIRFLOW_SOURCES` env var in Dockerfile
https://github.com/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/scripts/docker/entrypoint_ci.sh#L24
@pratik-m
Copy link

Hi @kaxil - I'm trying to run tests locally (without breeze) and I'm running into the error ModuleNotFoundError: No module named 'airflow.providers'. I couldn't find references to this issue..do I need to do any further setup to resolve this issue?

stacktrace

(airflow-312) pratik@pratik-Inspiron-16-7630-2-in-1:~/personal/projects/airflow$ pytest tests/utils/test_db_cleanup.py
================================================= test session starts ==================================================
platform linux -- Python 3.12.7, pytest-8.3.3, pluggy-1.5.0 -- /home/pratik/.local/share/hatch/env/virtual/apache-airflow/lGCtOum7/airflow-312/bin/python
cachedir: .pytest_cache
rootdir: /home/pratik/personal/projects/airflow
configfile: pyproject.toml
plugins: requests-mock-1.12.1, anyio-4.4.0, custom-exit-code-0.3.0, instafail-0.5.0, mock-3.14.0, asyncio-0.24.0, timeouts-1.2.1, cov-5.0.0, xdist-3.6.1, rerunfailures-14.0, time-machine-2.15.0, icdiff-0.9
asyncio: mode=Mode.STRICT, default_loop_scope=function
setup timeout: 0.0s, execution timeout: 0.0s, teardown timeout: 0.0s
collected 0 items / 1 error

======================================================== ERRORS ========================================================
___________________________________ ERROR collecting tests/utils/test_db_cleanup.py ____________________________________
ImportError while importing test module '/home/pratik/personal/projects/airflow/tests/utils/test_db_cleanup.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
../../../.local/share/hatch/env/virtual/apache-airflow/lGCtOum7/airflow-312/lib/python3.12/site-packages/_pytest/python.py:493: in importtestmodule
    mod = import_path(
../../../.local/share/hatch/env/virtual/apache-airflow/lGCtOum7/airflow-312/lib/python3.12/site-packages/_pytest/pathlib.py:582: in import_path
    importlib.import_module(module_name)
/usr/lib/python3.12/importlib/__init__.py:90: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
<frozen importlib._bootstrap>:1387: in _gcd_import
    ???
<frozen importlib._bootstrap>:1360: in _find_and_load
    ???
<frozen importlib._bootstrap>:1331: in _find_and_load_unlocked
    ???
<frozen importlib._bootstrap>:935: in _load_unlocked
    ???
../../../.local/share/hatch/env/virtual/apache-airflow/lGCtOum7/airflow-312/lib/python3.12/site-packages/_pytest/assertion/rewrite.py:184: in exec_module
    exec(co, module.__dict__)
tests/utils/test_db_cleanup.py:34: in <module>
    from airflow.models import DagModel, DagRun, TaskInstance
airflow/models/__init__.py:78: in __getattr__
    val = import_string(f"{path}.{name}")
airflow/utils/module_loading.py:39: in import_string
    module = import_module(module_path)
/usr/lib/python3.12/importlib/__init__.py:90: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
airflow/models/dag.py:95: in <module>
    from airflow.models.abstractoperator import AbstractOperator, TaskStateChangeCallback
airflow/models/abstractoperator.py:33: in <module>
    from airflow.template.templater import Templater
airflow/template/templater.py:23: in <module>
    from airflow.io.path import ObjectStoragePath
airflow/io/__init__.py:30: in <module>
    from airflow.providers_manager import ProvidersManager
airflow/providers_manager.py:39: in <module>
    from airflow.providers.standard.hooks.filesystem import FSHook
E   ModuleNotFoundError: No module named 'airflow.providers'
=============================================== short test summary info ================================================
ERROR tests/utils/test_db_cleanup.py
!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! Interrupted: 1 error during collection !!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!
============================================= 1 warning, 1 error in 0.83s ==============================================
(airflow-312) pratik@pratik-Inspiron-16-7630-2-in-1:~/personal/projects/airflow$

@amoghrajesh
Copy link
Contributor

@pratik-m if you are running it locally, you need to install the providers library, try running this from your root inside your virtual environment:

pip install -e ./providers

@pratik-m
Copy link

Awesome! this worked ! Thanks @amoghrajesh !

potiuk added a commit to potiuk/airflow that referenced this pull request Oct 29, 2024
After apache#42505, you need to get through extra hoops to develop
providers in Airflow's monorepo.

This is a simple (not yet uv-specific) documentation on how to
install providers in editable mode when you want to develop
providers, so that you can run unit test.

Copied mostly from apache#43082

It's not yet full set of docs explaining how to use workspaces
and UV. This shoudl be handled via  apache#43200
@potiuk
Copy link
Member

potiuk commented Oct 29, 2024

Followed it up with simple instructions / guidelines #43468 for contributors - not yet fully describing how to use uv workspaces but should be good for now.

kaxil added a commit that referenced this pull request Oct 29, 2024
…3468)

* Add simple instructions for installing providers in editable mode

After #42505, you need to get through extra hoops to develop
providers in Airflow's monorepo.

This is a simple (not yet uv-specific) documentation on how to
install providers in editable mode when you want to develop
providers, so that you can run unit test.

Copied mostly from #43082

It's not yet full set of docs explaining how to use workspaces
and UV. This shoudl be handled via  #43200

* Apply suggestions from code review

Co-authored-by: Amogh Desai <[email protected]>

---------

Co-authored-by: Kaxil Naik <[email protected]>
Co-authored-by: Amogh Desai <[email protected]>
ellisms pushed a commit to ellisms/airflow that referenced this pull request Nov 13, 2024
This was missed in apache#42985 . Without this `airflow.providers.__path__` had 2 registered paths:

```
['/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers/src/airflow/providers',
 '/Users/kaxilnaik/Documents/GitHub/astronomer/airflow/providers']
```

This prevents the tests from running outside of breeze and we get the following error:

```
ERROR tests/core/test_settings.py::test_usage_data_collection_disabled[true-True-True] - airflow.exceptions.AirflowConfigException: ("The provider apache-airflow-providers-src-airflow-providers-amazon is attempting to contribute configuration section aws that has already been added before. The source of it: apache-airflow-providers-amazon. This is forbidden. A provider can only add new sections. It cannot contribute options to existing sections or override other provider's configuration.", <class 'UserWarning'>)
```

We get this error because the *Providers Manager* uses `airflow.providers.__path__` to register providers. Because we have 2 paths, it registers the same provider twice leading two the above error.

https://github.com/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/airflow/providers_manager.py#L662

Example registration:
```
('apache-airflow-providers-src-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud ', ...

('apache-airflow-providers-yandex',
  {'yandex': {'description': 'This section contains settings for Yandex Cloud '
```

This wasn't a problem in breeze as it sets `AIRFLOW_SOURCES` env var in Dockerfile
https://github.com/apache/airflow/blob/75b22940ac4d36c31380669da2aa32fe46d70d32/scripts/docker/entrypoint_ci.sh#L24
ellisms pushed a commit to ellisms/airflow that referenced this pull request Nov 13, 2024
…ache#43468)

* Add simple instructions for installing providers in editable mode

After apache#42505, you need to get through extra hoops to develop
providers in Airflow's monorepo.

This is a simple (not yet uv-specific) documentation on how to
install providers in editable mode when you want to develop
providers, so that you can run unit test.

Copied mostly from apache#43082

It's not yet full set of docs explaining how to use workspaces
and UV. This shoudl be handled via  apache#43200

* Apply suggestions from code review

Co-authored-by: Amogh Desai <[email protected]>

---------

Co-authored-by: Kaxil Naik <[email protected]>
Co-authored-by: Amogh Desai <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants