Make graphviz dependency optional #36647

Merged
merged 1 commit into from
Jan 7, 2024

Conversation

potiuk
Member

@potiuk potiuk commented Jan 7, 2024

The graphviz dependency has been problematic as a required Airflow dependency, especially for ARM-based installations. The graphviz packages require the binary graphviz libraries, which is already a limitation, but they also require the graphviz Python bindings to be built and installed. This does not work on older Linux installations but, more importantly, when you try to install the Graphviz libraries for Python 3.8 or 3.9 on ARM M1 MacBooks, the packages fail to install because compilation of the Python bindings for M1 only works with Python 3.10+.

There is no easy solution for that other than commenting out the graphviz dependency in setup.py when you want to install Airflow for Python 3.8 or 3.9 on an M1 MacBook.

However, Graphviz is really used in only two places:

  • when you want to render DAGs via the airflow CLI - either to an image or directly to the terminal (for terminals/systems supporting imgcat)

  • when you want to render the ER diagram after you have modified Airflow models

The latter is a development-only feature; the former is a production feature, but a very niche one.

This PR turns image rendering in Airflow into an optional feature (working only when the graphviz Python bindings are installed) and effectively turns graphviz into an optional extra (and removes it from the requirements).

Technically, this is not a breaking change: the CLI commands to render DAGs are still there, and if you already have graphviz installed, they will continue working as before. The only case where rendering does not work is a fresh installation without graphviz; there, the command raises an error and informs you that you need to install it.
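The "raise an error and inform" behaviour can be sketched roughly as a guarded, function-level import. This is a minimal illustration and not the code from this PR: the function name `render_dag_dot`, the exception `OptionalDependencyMissing`, and the message wording are hypothetical. Only the `graphviz` module and its `Digraph` class come from the real Python package.

```python
from __future__ import annotations


class OptionalDependencyMissing(RuntimeError):
    """Hypothetical error raised when an optional package is not installed."""


def render_dag_dot(edges: list[tuple[str, str]]) -> str:
    """Return DOT source for a list of (upstream, downstream) task pairs.

    The import happens inside the function, so importing this module never
    requires graphviz; only calling the renderer does.
    """
    try:
        import graphviz  # optional: only needed for rendering
    except ImportError as exc:
        # Turn the bare ImportError into an actionable message.
        raise OptionalDependencyMissing(
            "Rendering requires the 'graphviz' Python package. "
            "Install it separately or via the corresponding extra."
        ) from exc

    dot = graphviz.Digraph()
    for upstream, downstream in edges:
        dot.edge(upstream, downstream)
    # Building the DOT source does not need the graphviz binaries;
    # those are only required when rendering to an image.
    return dot.source
```

Keeping the import inside the function means everything else in the module keeps working without graphviz, which matches the idea of the feature being optional rather than removed.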

Graphviz will remain installed for most users:

  • the Airflow image will still contain the graphviz library, because it is added there as an extra
  • when a previous version of Airflow has already been installed, the graphviz library is already installed there and Airflow will continue working as it did

The only change affects a fresh installation of a new version of Airflow, where graphviz will need to be specified as an extra or installed separately in order to enable the DAG rendering option.
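On a fresh installation, one way to check whether the DAG rendering option is available is to probe for the package without importing it. The following is a minimal sketch using only the standard library; the helper names `rendering_available` and `install_hint` and the hint text are hypothetical and not part of Airflow.

```python
import importlib.util


def rendering_available(module_name: str = "graphviz") -> bool:
    """Return True if the optional module can be located, without importing it."""
    return importlib.util.find_spec(module_name) is not None


def install_hint(module_name: str = "graphviz") -> str:
    """Return an empty string if the module is present, else a short hint."""
    if rendering_available(module_name):
        return ""
    return (
        f"The optional package '{module_name}' is not installed; "
        f"install it (for example with 'pip install {module_name}') "
        "to enable rendering."
    )
```

A caller could print `install_hint()` at startup or before running a render command, so users of a fresh installation see what is missing before the command fails.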

Given this behaviour (fixing it only requires installing a graphviz package), this should not be considered a breaking change.

Extracted from: #36537



@potiuk potiuk force-pushed the make-graphviz-dependency-optional branch from 643e98a to 5d39632 Compare January 7, 2024 14:59
@potiuk potiuk merged commit 89f1737 into apache:main Jan 7, 2024
78 checks passed
@potiuk potiuk deleted the make-graphviz-dependency-optional branch January 7, 2024 16:02
@ephraimbuddy ephraimbuddy added the type:misc/internal Changelog: Misc changes that should appear in change log label Jan 10, 2024
@ephraimbuddy ephraimbuddy added this to the Airflow 2.8.1 milestone Jan 10, 2024
@potiuk potiuk added changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) and removed changelog:skip Changes that should be skipped from the changelog (CI, tests, etc..) labels Jan 13, 2024
potiuk added a commit that referenced this pull request Jan 13, 2024
(cherry picked from commit 89f1737)
ephraimbuddy pushed a commit that referenced this pull request Jan 15, 2024
(cherry picked from commit 89f1737)
Labels
area:dev-tools area:production-image Production image improvements and fixes kind:documentation type:misc/internal Changelog: Misc changes that should appear in change log