Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-enable MyPy #19891

Closed
1 of 10 tasks
potiuk opened this issue Nov 30, 2021 · 157 comments · Fixed by #21020
Closed
1 of 10 tasks

Re-enable MyPy #19891

potiuk opened this issue Nov 30, 2021 · 157 comments · Fixed by #21020
Labels
area:dev-env CI, pre-commit, pylint and other changes that do not change the behavior of the final code kind:meta High-level information important to the community mypy Fixing MyPy problems after bumpin MyPy to 0.990

Comments

@potiuk
Copy link
Member

potiuk commented Nov 30, 2021

Why Mypy re-enable

For a few weeks MyPy checks have been disabled after the switch to Python 3.7 (per #19317).

We should, however, re-enable it back as it is very useful in catching a number of mistakes.

How does it work

We 've re-added the mypy pre-commit now - with mypy bumped to 0.910. This version detects far more errors and we should fix them all before we switch the CI check back.

  • mypy will be running for incremental changes in pre-commit, same as before. This will enable incremental fixes of the code changed by committers who use pre-commits locally

  • mypy on CI runs in non-failing mode. When the main pre-commit check is run, mypy is disabled, but then it is run as a separate step (which does not fail but will show the result of running mypy on all our code). This will enable us to track the progress of fixes

Can I help with the effort, you ask?

We started concerted effort now and incrementally fix all the mypy incompatibilities - ideally package/by/package to avoid huge code reviews. We'd really appreciate a number of people to contribute, so that we can re-enable mypy back fully and quickly :).

How can I help?

What you need is:

  • checkout main
  • ./breeeze build-image
  • pip install pre-commit
  • pre-commit install

This will enable automated checks for when you do a regular contribution. When you make your change, any MyPy issues will be reporteed and you need to fix them all to commit. You can also commit with --no-verify flag to skip that, bu, well, if you can improve airlfow a little - why not?

How can I help more ?

You can add PRs that are fixing whole packages, without contributing features or bugfixes. Please refer to this issue #19891 and ideally comment below in the issue that you want to take care of a package (to avoid duplicate work).

An easy way to run MyPy check for package can be done either from the host:

find DIRECTORY -name "*.py" | xargs pre-commit run mypy --files

or from ./breeze shell:

mypy --namespace-packages DIRECTORY

Current list of mypy PRs:

https://github.com/apache/airflow/pulls?q=is%3Aopen+is%3Apr+label%3Amypy

Remaining packages

Here is the list of remaining packages to be "mypy compliant" generated with:

pre-commit run mypy --all-files 2>&1 | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" | grep "error:" | sort | awk 'FS=":" { print $1 }' | xargs dirname | sort | uniq -c | xargs -n 2 printf "* [ ] (%4d) %s\n"
  • ( 1) airflow/api/common/experimental
  • ( 1) airflow/contrib/sensors
  • ( 1) airflow/example_dags
  • ( 1) airflow/jobs
  • ( 4) airflow/models
  • ( 1) airflow/providers/microsoft/winrm/hooks
  • ( 1) airflow/providers/ssh/hooks
  • ( 1) tests/providers/amazon/aws/hooks
  • ( 1) tests/providers/google/cloud/hooks

Committer

  • I acknowledge that I am a maintainer/committer of the Apache Airflow project.
@potiuk potiuk added kind:meta High-level information important to the community area:dev-env CI, pre-commit, pylint and other changes that do not change the behavior of the final code labels Nov 30, 2021
@josh-fell
Copy link
Contributor

josh-fell commented Nov 30, 2021

Happy to help here. Do you suspect a large number of fixes needed, and if so, should we coordinate over a Slack channel or just keep it to this issue?

@potiuk
Copy link
Member Author

potiuk commented Nov 30, 2021

Yeah we have about 800 issues :( . I think what I might do is to generate a list of packages and we can subscribe to them here in comments and check them when they are done.

@potiuk
Copy link
Member Author

potiuk commented Nov 30, 2021

There you are @josh-fell - we have 992 unique issues

@josh-fell
Copy link
Contributor

josh-fell commented Nov 30, 2021

Oof. Let the fun begin!

Naturally, I'll sign up for all of the */example_dags. And I'll take the Azure ones too if you don't mind as my first pass.

@potiuk
Copy link
Member Author

potiuk commented Nov 30, 2021

I will start with "airflow/ core subdirs ":)

 airflow/executors
 airflow/hooks
 airflow/jobs
 airflow/kubernetes
 airflow/macros
 airflow/models
 airflow/operators

@potiuk
Copy link
Member Author

potiuk commented Nov 30, 2021

But we need to merge #19890 first

@ashb
Copy link
Member

ashb commented Nov 30, 2021

Ouch. I guess some/many of those are from upgrading Mypy and it adding new better checks, rather than us having broken things, right?

@potiuk
Copy link
Member Author

potiuk commented Nov 30, 2021

Ouch. I guess some/many of those are from upgrading Mypy and it adding new better checks, rather than us having broken things, right?

I am not sure. Just from the number of those my guess is we simply missed a lot of errors via workarounding lack of namespace support (we had to convert files to packages and pass packages to mypy not files and I think it had bad side-effects).

Also I am not sure yet if mypy will detect all errors in " consistent" way with the "parallel support from pre-commit" - it might be that when you pass all files in one run, it will detect more errors. I will check it soon to see - but it is not needed to start fixing.

We might want to switch to this non-parallel mode, though it will be even more pain for local development (but then we might finally switch to mypyd support).

@potiuk
Copy link
Member Author

potiuk commented Nov 30, 2021

Just checked.

Parallel pre-commit is useless for mypy - seems that it will not speedup the tests anyway as mypy pretty much loads as many python code as it reaches out from the files, and with our code structure, any import to airflow will pretty much load everything always. So when we run MyPy in parallel, My 16 cores are fully busy and produce the same result in 2m6s as single busy core in 2m13s. Results are the same.

I will turn on serial mode :).

Serial:

[jarek:~/code/airflow] enable-mypy-as-non-failing-step(+3/-4)+ 8s ± pre-commit run mypy --all-files 2>&1 | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" | grep "error:" | sort | awk 'FS=" " { print $1 }' | uniq > /tmp/res-serial.out 

[jarek:~/code/airflow] enable-mypy-as-non-failing-step(+3/-4)+ 2m13s ± wc /tmp/res-serial.out 
  963   963 48752 /tmp/res-serial.out

Paralel:

[jarek:~/code/airflow] enable-mypy-as-non-failing-step(+3/-4)+ ± pre-commit run mypy --all-files 2>&1 | sed -r "s/\x1B\[([0-9]{1,2}(;[0-9]{1,2})?)?[mGK]//g" | grep "error:" | sort | awk 'FS=" " { print $1 }' | uniq > /tmp/res-paralle.out

[jarek:~/code/airflow] enable-mypy-as-non-failing-step(+1/-2)+ 2m6s ± wc /tmp/res-paralle.out 
  963   963 48752 /tmp/res-paralle.out
[jarek:~/code/airflow] enable-mypy-as-non-failing-step(+1/-2)+ ± 

@potiuk
Copy link
Member Author

potiuk commented Nov 30, 2021

Opened since 2015: python/mypy#933

@potiuk
Copy link
Member Author

potiuk commented Nov 30, 2021

BTW. @josh-fell and others - I already have quite a number of fixes - semi-automated - (but not a lot of them are verified in #19334 - so let me take a look and maybe some packages/parts will be cleanly applicable and we can get rid of many issues - but I want to merge this one first and the apply what I have there package by package (or maybe group of packages0

@uranusjr
Copy link
Member

uranusjr commented Dec 1, 2021

I just read a few PRs on this, and the general impression I’m getting is we are using type: ignore way too much. I feel we should try a bit harder to fix the actual typing errors instead of liberally slapping on ignores everywhere.

@potiuk
Copy link
Member Author

potiuk commented Dec 1, 2021

I just read a few PRs on this, and the general impression I’m getting is we are using type: ignore way too much. I feel we should try a bit harder to fix the actual typing errors instead of liberally slapping on ignores everywhere.

Agree.

@potiuk
Copy link
Member Author

potiuk commented Dec 1, 2021

Those PRs were just "Quick" extract of what I've done before really quickly - but I agreee we should rather refactore some of the "core" code because it is often done "badly" using shortcuts and very loose approach to types (reusing same variables for different types freely for example).

I hope more peopel will join and it will take just a short while to complete .

@potiuk
Copy link
Member Author

potiuk commented Dec 1, 2021

cc: @khalidmammadov

@khalidmammadov
Copy link
Contributor

I can take a look to tests/* folder if it's ok?

@potiuk
Copy link
Member Author

potiuk commented Dec 1, 2021

Perfect!

@Bowrna
Copy link
Contributor

Bowrna commented Dec 1, 2021

I will check this issue, understand it and start making contributions.

potiuk added a commit to potiuk/airflow that referenced this issue Jan 24, 2022
@potiuk
Copy link
Member Author

potiuk commented Jan 24, 2022

ALLRIGHT:
🤞 everyone: #21020

potiuk added a commit that referenced this issue Jan 24, 2022
@potiuk
Copy link
Member Author

potiuk commented Jan 24, 2022

Thanks everyone!

MyPY is re-enabled now. Let's keep our watchful eyes over the next few days to not merge changes that are not rebased to latest main (and quickly fix those that cause main failures it it happens).

The next one when we upgrade to new MyPY release I think :)

potiuk added a commit to potiuk/airflow that referenced this issue Apr 26, 2022
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Jul 10, 2022
Part of apache/airflow#19891

GitOrigin-RevId: 2d092021d7749e985062fb94f90c5a6415022d62
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Jul 10, 2022
Part of apache/airflow#19891

Before:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
airflow/providers/amazon/aws/sensors/s3.py:182: error: Incompatible types in assignment (expression has type "function", variable has type "Callable[..., bool]")  [assignment]
            check_fn: Callable[..., bool] = self.check_fn_user if self.check_fn_user is not None else self.check_fn
                                            ^
Found 1 error in 1 file (checked 33 source files)
```

After:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
Success: no issues found in 33 source files
```
GitOrigin-RevId: b15027410b4a985c15b1d7b2b2a0eedf2173f416
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Aug 27, 2022
Part of apache/airflow#19891

GitOrigin-RevId: 2d092021d7749e985062fb94f90c5a6415022d62
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Aug 27, 2022
Part of apache/airflow#19891

Before:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
airflow/providers/amazon/aws/sensors/s3.py:182: error: Incompatible types in assignment (expression has type "function", variable has type "Callable[..., bool]")  [assignment]
            check_fn: Callable[..., bool] = self.check_fn_user if self.check_fn_user is not None else self.check_fn
                                            ^
Found 1 error in 1 file (checked 33 source files)
```

After:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
Success: no issues found in 33 source files
```
GitOrigin-RevId: b15027410b4a985c15b1d7b2b2a0eedf2173f416
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Oct 4, 2022
Part of apache/airflow#19891

GitOrigin-RevId: 2d092021d7749e985062fb94f90c5a6415022d62
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Oct 4, 2022
Part of apache/airflow#19891

Before:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
airflow/providers/amazon/aws/sensors/s3.py:182: error: Incompatible types in assignment (expression has type "function", variable has type "Callable[..., bool]")  [assignment]
            check_fn: Callable[..., bool] = self.check_fn_user if self.check_fn_user is not None else self.check_fn
                                            ^
Found 1 error in 1 file (checked 33 source files)
```

After:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
Success: no issues found in 33 source files
```
GitOrigin-RevId: b15027410b4a985c15b1d7b2b2a0eedf2173f416
aglipska pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Oct 7, 2022
Part of apache/airflow#19891

GitOrigin-RevId: 2d092021d7749e985062fb94f90c5a6415022d62
aglipska pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Oct 7, 2022
Part of apache/airflow#19891

Before:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
airflow/providers/amazon/aws/sensors/s3.py:182: error: Incompatible types in assignment (expression has type "function", variable has type "Callable[..., bool]")  [assignment]
            check_fn: Callable[..., bool] = self.check_fn_user if self.check_fn_user is not None else self.check_fn
                                            ^
Found 1 error in 1 file (checked 33 source files)
```

After:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
Success: no issues found in 33 source files
```
GitOrigin-RevId: b15027410b4a985c15b1d7b2b2a0eedf2173f416
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Dec 7, 2022
Part of apache/airflow#19891

GitOrigin-RevId: 2d092021d7749e985062fb94f90c5a6415022d62
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Dec 7, 2022
Part of apache/airflow#19891

Before:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
airflow/providers/amazon/aws/sensors/s3.py:182: error: Incompatible types in assignment (expression has type "function", variable has type "Callable[..., bool]")  [assignment]
            check_fn: Callable[..., bool] = self.check_fn_user if self.check_fn_user is not None else self.check_fn
                                            ^
Found 1 error in 1 file (checked 33 source files)
```

After:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
Success: no issues found in 33 source files
```
GitOrigin-RevId: b15027410b4a985c15b1d7b2b2a0eedf2173f416
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Jan 27, 2023
Part of apache/airflow#19891

GitOrigin-RevId: 2d092021d7749e985062fb94f90c5a6415022d62
leahecole pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Jan 27, 2023
Part of apache/airflow#19891

Before:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
airflow/providers/amazon/aws/sensors/s3.py:182: error: Incompatible types in assignment (expression has type "function", variable has type "Callable[..., bool]")  [assignment]
            check_fn: Callable[..., bool] = self.check_fn_user if self.check_fn_user is not None else self.check_fn
                                            ^
Found 1 error in 1 file (checked 33 source files)
```

After:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
Success: no issues found in 33 source files
```
GitOrigin-RevId: b15027410b4a985c15b1d7b2b2a0eedf2173f416
kosteev pushed a commit to kosteev/composer-airflow-test-copybara that referenced this issue Sep 12, 2024
Part of apache/airflow#19891

GitOrigin-RevId: 2d092021d7749e985062fb94f90c5a6415022d62
kosteev pushed a commit to kosteev/composer-airflow-test-copybara that referenced this issue Sep 12, 2024
Part of apache/airflow#19891

Before:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
airflow/providers/amazon/aws/sensors/s3.py:182: error: Incompatible types in assignment (expression has type "function", variable has type "Callable[..., bool]")  [assignment]
            check_fn: Callable[..., bool] = self.check_fn_user if self.check_fn_user is not None else self.check_fn
                                            ^
Found 1 error in 1 file (checked 33 source files)
```

After:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
Success: no issues found in 33 source files
```
GitOrigin-RevId: b15027410b4a985c15b1d7b2b2a0eedf2173f416
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Sep 17, 2024
Part of apache/airflow#19891

GitOrigin-RevId: 2d092021d7749e985062fb94f90c5a6415022d62
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Sep 17, 2024
Part of apache/airflow#19891

Before:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
airflow/providers/amazon/aws/sensors/s3.py:182: error: Incompatible types in assignment (expression has type "function", variable has type "Callable[..., bool]")  [assignment]
            check_fn: Callable[..., bool] = self.check_fn_user if self.check_fn_user is not None else self.check_fn
                                            ^
Found 1 error in 1 file (checked 33 source files)
```

After:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
Success: no issues found in 33 source files
```
GitOrigin-RevId: b15027410b4a985c15b1d7b2b2a0eedf2173f416
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Nov 7, 2024
Part of apache/airflow#19891

GitOrigin-RevId: 2d092021d7749e985062fb94f90c5a6415022d62
kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Nov 7, 2024
Part of apache/airflow#19891

Before:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
airflow/providers/amazon/aws/sensors/s3.py:182: error: Incompatible types in assignment (expression has type "function", variable has type "Callable[..., bool]")  [assignment]
            check_fn: Callable[..., bool] = self.check_fn_user if self.check_fn_user is not None else self.check_fn
                                            ^
Found 1 error in 1 file (checked 33 source files)
```

After:

```
root@12e9fcfb4678:/opt/airflow# mypy --namespace-packages  airflow/providers/amazon/aws/sensors
Success: no issues found in 33 source files
```
GitOrigin-RevId: b15027410b4a985c15b1d7b2b2a0eedf2173f416
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:dev-env CI, pre-commit, pylint and other changes that do not change the behavior of the final code kind:meta High-level information important to the community mypy Fixing MyPy problems after bumpin MyPy to 0.990
Projects
None yet
Development

Successfully merging a pull request may close this issue.

10 participants