Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deprecate smart sensors #20151

Merged
merged 6 commits into from
Dec 15, 2021
Merged

Conversation

jedcunningham
Copy link
Member

@jedcunningham jedcunningham commented Dec 8, 2021

Smart sensors are being replaced with Deferrable Operators. As they were
marked as an early-access feature, we can remove them before Airflow 3.

(Don't merge before December 15th: https://lists.apache.org/thread/3t6v6n8cz5hrmc1wjqp728v0dv5hl18q)

@boring-cyborg boring-cyborg bot added area:Scheduler including HA (high availability) scheduler kind:documentation labels Dec 8, 2021
Copy link
Member

@potiuk potiuk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice.

@github-actions github-actions bot added the full tests needed We need to run full set of tests for this PR to merge label Dec 8, 2021
@github-actions
Copy link

github-actions bot commented Dec 8, 2021

The PR most likely needs to run full matrix of tests because it modifies parts of the core of Airflow. However, committers might decide to merge it quickly and take the risk. If they don't merge it quickly - please rebase it to the latest main at your convenience, or amend the last commit of the PR, and push it with --force-with-lease.

@ephraimbuddy
Copy link
Contributor

What do you think about using deprecated on the Sensor Class? It gives a very nice message when a deprecated function/class is used. e.g:

DeprecationWarning: Call to deprecated function (or staticmethod) get_pools. (Use Pool.get_pools() instead) -- Deprecated since version 2.2.3.
    pools = sorted(pool_api.get_pools(), key=lambda p: p.pool)

UPDATING.md Show resolved Hide resolved
@potiuk
Copy link
Member

potiuk commented Dec 9, 2021

What do you think about using deprecated on the Sensor Class? It gives a very nice message when a deprecated function/class is used. e.g:

Fully agree. And also we should add a note in our docs, explaining at the very least how the users should approach the migration as correctly pointed out by @turbaszek in the devlist. I think this should be standard of any deprecation we do. We have done quite a bad job in the past assuming that our users read all the discussions and AIPs we write. They don't. They need some guidance.

@@ -145,6 +145,12 @@ def __init__(

self.dagbag = DagBag(dag_folder=self.subdir, read_dags_from_db=True, load_op_links=False)

if conf.getboolean('smart_sensor', 'use_smart_sensor'):
warnings.warn(
'Smart sensors are deprecated. Please use Deferrable Operators instead.',
Copy link
Member

@kaxil kaxil Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you'll think adding a link to the docsite for deferrable operators will be helpful?

@potiuk @turbaszek @eladkal and others ?

Copy link
Member

@potiuk potiuk Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep . As described in the devlist. We definitely need a place in the doc (and linked from here) where we explain (shortly):

  1. why we cannot provide precise migration instructions
  2. what should be general approach of the user who has to migrate. Something alongside "For every smart sensor you use, you have to either use or write your own defferable operator if want to achieve similar results. If you want to know how to write your own operator - find it [here]" (another link)

We should simply assume that the users have not read the docs, know nothing about deferrable operators and they should understand that they need to write they own deferrable operator instead BEFORE they read the whole documentation (which we shoudl helpfuly link so that they do not have to google it or search the documentation).

That's teh "approach I 'd use for any deprecation" . I think those two pieces are crucial:

  1. Explaining "why" we cannot give them instructions
  2. Explaining 'what" they need to do before actually reading the documentation is crucial (and link to detailed instruction on how they approach it).

The 1) prevents questions "Please give me exact instructions on how to remove the deprecation?"
The 2) prevents questions "What do I need to to do now?".

That's it. No more, no less.

It's simply giving the right answers to the anticipated questions straight there when user sees the warning.

Copy link
Contributor

@eladkal eladkal Dec 9, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we can be more explicit about what needs to be done - we should.
However I believe that if we know that a feature is deprecated we should say it as soon as possible - regardless if we have all the needed docs / explanations. At the end the explanations are for users who need to migrate from the deprecated features to the new one BUT there are also users who may just now playing with the feature and thinking to start use it - for them we just need to say that it's deprecated. They don't need migration docs.

So this is a tradeoff. i'm happy with what we have so far. We can always add more information.

Also want to add that we've been in this case before :)
TaskGroups <-> SubDags are also not 1-to-1

@jedcunningham jedcunningham force-pushed the deprecate_smart_sensors branch from 3957d3d to 0c44268 Compare December 9, 2021 19:57
@jedcunningham
Copy link
Member Author

Ready for another look. I've:

  • Added a section on migrating from "smart" -> deferrable
  • Added a link to that section in the deprecation warnings and UPDATING
  • Added a deprecation warning when a sensor task becomes "smart" - this lands in the task log
  • Added the list of sensors that are allowed to be "smart" in the SchedulerJob deprecation warning:
    Smart sensors are deprecated, yet can be used for {'NamedHivePartitionSensor'} sensors. Please use Deferrable Operators instead. See http://apache-airflow-docs.s3-website.eu-central-1.amazonaws.com/docs/apache-airflow/latest/concepts/smart-sensors.html#migrating-to-deferrable-operators for more info.
    

The SchedulerJob message is intended for the folks operating Airflow, and the message in the task log for end users.

@potiuk
Copy link
Member

potiuk commented Dec 9, 2021

Fantastic @jedcunningham! Thank you. This is perfect!

Copy link
Member

@turbaszek turbaszek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks @jedcunningham!

jedcunningham and others added 6 commits December 15, 2021 09:27
Smart sensors are being replaced with Deferrable Operators. As they were
marked as an early-access feature, we can remove them before Airflow 3.
@jedcunningham jedcunningham force-pushed the deprecate_smart_sensors branch from 2d5fffd to 019e165 Compare December 15, 2021 16:32
@jedcunningham jedcunningham added this to the Airflow 2.3.0 milestone Dec 15, 2021
@jedcunningham jedcunningham marked this pull request as ready for review December 15, 2021 18:56
@jedcunningham jedcunningham merged commit 77813b4 into apache:main Dec 15, 2021
@jedcunningham jedcunningham deleted the deprecate_smart_sensors branch December 15, 2021 18:57
potiuk pushed a commit that referenced this pull request Jan 23, 2022
Smart sensors are being replaced with Deferrable Operators. As they were
marked as an early-access feature, we can remove them before Airflow 3.

(cherry picked from commit 77813b4)
@potiuk potiuk added the type:doc-only Changelog: Doc Only label Jan 23, 2022
@potiuk potiuk modified the milestones: Airflow 2.3.0, Airflow 2.2.4 Jan 23, 2022
jedcunningham added a commit that referenced this pull request Jan 27, 2022
Smart sensors are being replaced with Deferrable Operators. As they were
marked as an early-access feature, we can remove them before Airflow 3.

(cherry picked from commit 77813b4)
@jedcunningham jedcunningham added type:misc/internal Changelog: Misc changes that should appear in change log and removed type:doc-only Changelog: Doc Only labels Feb 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:Scheduler including HA (high availability) scheduler full tests needed We need to run full set of tests for this PR to merge kind:documentation type:misc/internal Changelog: Misc changes that should appear in change log
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants