Create an airflow upgrade-check command in 1.10 to ease upgrade path to 2.0. #8765
@turbaszek Can you create a Google Doc about it? Without a broader context, the titles of the changes are of little help. I think we should now prepare a plan that describes how we can detect breaking changes and what to do when we detect them.
A lot of the changes are backward compatible; for those that are not, we need to detect them and notify users about them.
I think some of the changes can be applied easily and safely. For example, all the deprecation notices can be fixed rather easily - we already have information about them (we test all the deprecations - I already used this in the backport packages) - so we can apply those in an automated way, easily and safely. The same goes for some of the changes above (renamed parameter names etc.). I would really like to provide such a tool to the users where the fix is a no-brainer and can be automated. For most other changes I think we should simply detect potential problems and flag them, providing a helpful hint on how the problem could be fixed. And I agree a Gdoc about it might be better than issues/splitting them now. We can even split the work among ourselves and make recommendations individually for each of those - happy to share this with the rest. Then we can review them and turn them into issues.
We should definitely create rules similar to what Kamil did in his PR, and additionally have a flag that automatically applies fixes (where possible). A Google Doc or Confluence (https://cwiki.apache.org/confluence/display/AIRFLOW/) is fine if we want to review and add comments.
A Gdoc is fine for building this up, but anything we point end users at should be in our official docs. A mode to do things automatically is desirable, but probably not on by default: as much as I wish everyone used git, I know some people out there don't, so we shouldn't change their files without confirmation (perhaps show a diff, then ask y/n?).
Bowler has this option.
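The "show a diff, then ask y/n" idea from the comments above can be sketched with the standard library alone. This is only an illustration of the confirmation flow, not how Bowler or the eventual command works; the function name `propose_fix` and its `confirm` parameter are hypothetical.

```python
import difflib


def propose_fix(path, original, fixed, confirm=input):
    """Print a unified diff and return the text to keep after asking y/n.

    `confirm` is any callable that takes a prompt and returns the answer;
    it defaults to the built-in `input` (hypothetical design, for testability).
    """
    diff_text = "".join(
        difflib.unified_diff(
            original.splitlines(keepends=True),
            fixed.splitlines(keepends=True),
            fromfile=path,
            tofile=path + " (proposed)",
        )
    )
    if not diff_text:
        return original  # nothing would change, so do not bother the user
    print(diff_text)
    answer = confirm("Apply this change? [y/N] ").strip().lower()
    # Anything other than an explicit "y" keeps the file untouched.
    return fixed if answer == "y" else original
```

Calling `propose_fix("dag.py", old_source, new_source)` prints the diff and only returns the modified source when the user answers `y`, so nothing is rewritten without confirmation.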
I've drafted this doc:
The notes from UPDATING.md are split into a few general groups (some notes appear in more than one group). I suppose that not all entries are important to users or will impact their Airflow installations.
@turbaszek I looked at this document. Great job! I have looked at what changes we have, and I think one tool may not solve all our problems. I think that, apart from this command, we should take additional action.
I also added other ideas to a new section, "Other ideas", in your doc. PS. I created the "area:upgrade" label on GitHub to track the related issues.
Yes, I think the upgrade guide for the CLI is a much better option than checking scripts.
Interesting idea, but do you have any idea how to catch only "these exceptions" (related to Airflow 2.0), and not all of them?
I think we can detect all problems and display them to the user. Why do you want to hide some problems?
So your idea is to collect all exceptions to db, not only those related to migration / Airflow2?
We can catch all the warnings, but those related to migration fall into one of two categories: DeprecationWarning, FutureWarning.
I got it, I was misled by the "exceptions" but we are talking about warnings.
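As a rough illustration of the point about warning categories, the snippet below records every warning raised by a call and then keeps only DeprecationWarning and FutureWarning. The helper `collect_migration_warnings` and the `legacy_call` stand-in are hypothetical names; storing the results in the database, as discussed above, is not shown.

```python
import warnings

# The two categories mentioned in the discussion as migration related.
MIGRATION_CATEGORIES = (DeprecationWarning, FutureWarning)


def collect_migration_warnings(func, *args, **kwargs):
    """Run func and return (result, list of migration-related warnings)."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record everything, even repeats
        result = func(*args, **kwargs)
    relevant = [w for w in caught if issubclass(w.category, MIGRATION_CATEGORIES)]
    return result, relevant


def legacy_call():
    # Hypothetical stand-in for user code that emits warnings.
    warnings.warn("old_param is deprecated", DeprecationWarning)
    warnings.warn("unrelated noise", UserWarning)
    return 42
```

All warnings are captured first and filtered afterwards, so the same recording could also be used to display everything to the user if that turns out to be preferable.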
Based on this gdoc:
We may consider the following rules as a good start (those are major breaking changes):
and config related changes:
In case of operator related changes I would like to suggest a single rule
I think we should first focus on the "check and warn" approach so that we have any tool at all. Once we have it, we may consider "check and apply changes"; even in this case, we should probably first focus on major problems (config, new imports, etc.). The second step can use the "report" generated by
Summoning all elders: @mik-laj @vikramkoka @ashb @potiuk @kaxil @feluelle @dimberman @houqp @ryw @Fokko
Sounds great to start with. I am sure we will have more rules added by the users, but those seem great to create issues for. Elder Jarek.
Part of issue apache#8765: adding a rule to check for undefined Jinja variables when upgrading to Airflow 2.0.
Logic: use a DagBag to pull all DAGs and iterate over every DAG. For every DAG, the tasks are rendered using an updated Jinja environment with jinja2.DebugUndefined. This renders the template while leaving undefined variables as they were. Using a regex, we can extract those variables and present possible error cases when upgrading.
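The rendering trick described in that PR can be sketched on a plain template string. This is a simplified illustration, assuming simple `{{ name }}` variables only; it does not load a DagBag or touch task template fields, and `find_undefined_variables` is a hypothetical helper name.

```python
import re

import jinja2

# DebugUndefined renders a missing variable back as "{{ name }}",
# so anything still matching this pattern after rendering was undefined.
UNDEFINED_PATTERN = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def find_undefined_variables(template_source, context):
    """Return the sorted names of variables the context did not define."""
    env = jinja2.Environment(undefined=jinja2.DebugUndefined)
    rendered = env.from_string(template_source).render(**context)
    return sorted(set(UNDEFINED_PATTERN.findall(rendered)))
```

One caveat of this approach: a context value that itself contains literal `{{ ... }}` text would be reported as undefined, so the results are best treated as hints rather than definite errors.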
To make it easier for users to upgrade from 1.10 to 2.0 (when it eventually comes out) we should create a single upgrade-check command in 1.10 that checks the following things. We could also have a mode that makes some of these changes in place (with confirmation from user) to automate it.
Rules
Major breaking changes:
- ConnTypeIsNotNullableRule - Not-nullable conn_type column in connection table (done)
- UniqueConnIdRule - Unique conn_id in connection table (Add UniqueConnIdRule rule and unittest #11222)
- CustomOperatorUsesMetaclassRule - BaseOperator uses metaclass (Create CustomOperatorUsesMetaclassRule to ease upgrade to Airflow 2.0 #11038)
- UsingSQLFromBaseHookRule - Remove SQL support in base_hook (Create UsingSQLFromBaseHookRule to ease upgrade to Airflow 2.0 #11039)
- ChainBetweenDAGAndOperatorNotAllowedRule - Assigning task to a DAG using bitwise shift (bit-shift) operators is no longer supported
- AirflowMacroPluginRemovedRule - Removal of airflow.AirflowMacroPlugin class
- NoAdditionalArgsInOperatorsRule - Additional arguments passed to BaseOperator cause an exception (Create NoAdditionalArgsInOperatorsRule to ease upgrade to Airflow 2.0 #11042)
- MesosExecutorRemovedRule - Removal of Mesos Executor
Config related changes:
- HostnameCallableRule - Unify hostname_callable option in core section (Create HostnameCallableRule to ease upgrade to Airflow 2.0 #11044)
- StatNameHandlerNotSupportedRule - Drop plugin support for stat_name_handler
- LoggingConfigurationRule - Logging configuration has been moved to new section
- NoGCPServiceAccountKeyInConfigRule - Remove gcp_service_account_keys option in airflow.cfg file
- FernetEnabledRule - Fernet is enabled by default
- KubernetesWorkerAnnotationsRule - Changes to propagating Kubernetes worker annotations
- LegacyUIDeprecatedRule - Deprecate legacy UI in favor of FAB RBAC UI
- TaskHandlersMovedRule - GCSTaskHandler has been moved, WasbTaskHandler has been moved, StackdriverTaskHandler has been moved, S3TaskHandler has been moved, ElasticsearchTaskHandler has been moved, CloudwatchTaskHandler has been moved
- SendGridMovedRule - SendGrid emailer has been moved
- CustomExecutorsRequireFullPathRule - Custom executors are loaded using full import path
Import changes:
- ImportChangesRule - uses a map old_operator_name -> list of possible problems so we can create a single DagBag and scan all used operators and raise information about changes. It should also suggest what providers packages users should use.
How to guide
To implement a new rule, create a class that inherits from airflow.upgrade.rules.base_rule.BaseRule. It will be auto-registered and used by the airflow upgrade-check command. The custom rule class has to have title and description properties and should implement a check method, which returns a list of error messages in case of incompatibility.
For example:
airflow/airflow/upgrade/rules/conn_type_is_not_nullable.py
Lines 25 to 42 in ea36166
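The referenced snippet is not reproduced on this page. As a rough sketch of the shape described above (a class with title, description, and a check method returning error messages), something like the following could work. Note the assumptions: `BaseRule` here is a local stand-in so the example runs without Airflow, and the rule reads connections from an in-memory list instead of querying the metadata database, so this is not the code from conn_type_is_not_nullable.py.

```python
class BaseRule:
    """Hypothetical stand-in for airflow.upgrade.rules.base_rule.BaseRule."""

    title = None
    description = None

    def check(self):
        raise NotImplementedError


class ConnTypeNotEmptySketchRule(BaseRule):
    """Illustrative rule: flag connections that have no conn_type."""

    title = "Connection.conn_type is not nullable"
    description = "Every row in the connection table must have a conn_type."

    def __init__(self, connections):
        # Assumption for the sketch: connections are plain dicts passed in;
        # a real rule would read them from the Airflow metadata database.
        self.connections = connections

    def check(self):
        # One message per incompatible row; an empty list means "all good".
        return [
            "Connection {!r} has an empty conn_type field.".format(conn["conn_id"])
            for conn in self.connections
            if conn.get("conn_type") is None
        ]
```

Returning a list of messages rather than raising on the first problem lets the command report every incompatible row in a single run.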
Remember to open the PR against the v1-10-test branch.