Can't add external executors -> Plugin manager #3

Closed
osallou opened this issue Jun 5, 2015 · 17 comments

Comments

@osallou

osallou commented Jun 5, 2015

It would be nice to be able to add other executors "outside of the Airflow codebase".

The list of executors is hard-coded in airflow/executors/__init__.py

Some kind of plugin mechanism would allow adding other executors.

@mistercrunch
Member

For operators, you can derive BaseOperator anywhere outside of the Airflow code and use them in your DAGs (we do that internally for operators that aren't relevant to the open source community).

For executors they are a little more deeply embedded in the code as you pointed out. If you need a hook in https://github.com/mistercrunch/airflow/blob/master/airflow/executors/__init__.py I'll be happy to accept it. Could be something like:

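try:
    # Optional user-supplied module; it must define DEFAULT_EXECUTOR
    from airflow_custom import DEFAULT_EXECUTOR
except ImportError:
    # No custom module installed: fall back to the built-in executors
    DEFAULT_EXECUTOR = None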

And then somehow have this take precedence over the if statements underneath.

Then all you need is an airflow_custom.py module in your environment that defines DEFAULT_EXECUTOR as a derivative of BaseExecutor.
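
A minimal, self-contained sketch of how the "take precedence" part might look. The names `BUILTIN_EXECUTORS`, `load_custom_executor` and `resolve_executor`, as well as the stand-in executor classes, are hypothetical and only illustrate the ordering; this is not the actual content of airflow/executors/__init__.py.

```python
import importlib


class BaseExecutor:
    """Stand-in for Airflow's BaseExecutor (assumption, for illustration only)."""


class SequentialExecutor(BaseExecutor):
    pass


class LocalExecutor(BaseExecutor):
    pass


# Hypothetical registry playing the role of the hard-coded if statements.
BUILTIN_EXECUTORS = {
    "SequentialExecutor": SequentialExecutor,
    "LocalExecutor": LocalExecutor,
}


def load_custom_executor(module_name="airflow_custom"):
    """Return DEFAULT_EXECUTOR from the optional module, or None if it is absent."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "DEFAULT_EXECUTOR", None)


def resolve_executor(configured_name, custom=None):
    """Prefer the custom executor; otherwise fall back to the built-in lookup."""
    if custom is not None:
        return custom
    return BUILTIN_EXECUTORS[configured_name]()
```

With this shape, `resolve_executor("LocalExecutor", custom=load_custom_executor())` returns the user's executor whenever an airflow_custom module is importable, and the configured built-in otherwise.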

@osallou
Author

osallou commented Jun 6, 2015

To make community development of operators easier, I think a plugin mechanism such as the one provided by Yapsy would be best. A plugin dir in the AIRFLOW dir could simply hold those airflow_XXX operators, and __init__.py would simply load the selected plugin (if it is not one of the base operators).

I can fork the project and propose such a solution with a pull request if you want.


@mistercrunch
Member

Sounds perfect. Would the folder structure go $REPO/plugins/<plugin_set>/operators/<operator>.py or straight $REPO/plugins/operators/<operator>.py ? How should we enable plugins / plugin sets? Load all we find in the folders when present?

@nerdvegas

+1 for plugin architecture via Yapsy.

@mistercrunch
Member

@osallou , I just read about Yapsy and this sounds like a great idea. We can squeeze a plugin_folder setting in airflow.cfg and operators / executors / macros in that folder would get discovered and integrated.

Internally we have discussed the possibility of having plugins with UI components. It seems like that would be doable too eventually.
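
A rough sketch of what "discovered and integrated" could mean for a plugin folder. It uses plain `importlib` from the standard library rather than Yapsy (which relies on its own plugin info files), and the function name `discover_subclasses` is an assumption made for illustration.

```python
import importlib.util
import inspect
from pathlib import Path


def discover_subclasses(plugin_folder, base_class):
    """Import every .py file in plugin_folder and collect subclasses of base_class."""
    found = {}
    for path in sorted(Path(plugin_folder).glob("*.py")):
        # Load the file as a module named after the file itself.
        spec = importlib.util.spec_from_file_location(path.stem, path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
        # Keep only classes that derive from (but are not) the base class.
        for name, obj in inspect.getmembers(module, inspect.isclass):
            if issubclass(obj, base_class) and obj is not base_class:
                found[name] = obj
    return found
```

Called once per base class (for example a `BaseOperator` and a `BaseExecutor`), this would yield one name-to-class mapping per plugin category for the folder named by the proposed plugin_folder setting.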

@osallou
Author

osallou commented Jun 7, 2015

I have used Yapsy in several of my projects; it is really easy to use, and you can group plugins by "type" (operators, executors, ...). Then you only need to match your config against the available plugins.

Do you want me to code it and send a pull request, or would you prefer to handle it yourself?

@osallou
Author

osallou commented Jun 7, 2015

Regarding structure and Yapsy usage, I think all plugins could go directly in a plugin dir (defined in the config, or $airhome/plugins by default); then you load the one named in the config file (for operator).
With base objects, you can group plugins by category and load the expected one with something like:

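    from yapsy.PluginManager import PluginManager

    # BaseOperator, BaseExecutor, plugins_dir_from_config and
    # executor_defined_in_config are expected to come from Airflow / its config.
    # Build the manager
    simplePluginManager = PluginManager()
    # Tell it the default place(s) where to find plugins
    simplePluginManager.setPluginPlaces([plugins_dir_from_config])
    # Group plugins into categories by their base class
    simplePluginManager.setCategoriesFilter({
        "Operator": BaseOperator,
        "Executor": BaseExecutor,
    })
    # Collect all plugins
    simplePluginManager.collectPlugins()
    # Get an instance of the executor defined in config
    # (get_name() is a method the plugin classes would have to provide)
    for pluginInfo in simplePluginManager.getPluginsOfCategory("Executor"):
        if pluginInfo.plugin_object.get_name() == executor_defined_in_config:
            self.executor = pluginInfo.plugin_object
            break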

@mistercrunch
Member

Sounds like we'd need a bit of code in operators/__init__.py to integrate the plugins that derive from BaseOperator if we want them to be namespaced there. They could also be namespaced under airflow.plugins.operators.PluggedInOperator

I'm not sure how it's usually done, but it seems like it'd be nice to have them integrated in airflow.operators (same goes for executors and macros).
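
One possible way to get that namespacing, sketched with a stand-in module so it runs without Airflow installed. The module name `demo_operators` and the helper `integrate_plugins` are hypothetical; in Airflow the target would presumably be the airflow.operators package itself.

```python
import sys
import types


def integrate_plugins(namespace_module, plugin_classes):
    """Attach each plugin class to namespace_module so it can be imported from there."""
    for cls in plugin_classes:
        setattr(namespace_module, cls.__name__, cls)


# Stand-in for the airflow.operators package (hypothetical module name).
operators = types.ModuleType("demo_operators")
sys.modules["demo_operators"] = operators


class PluggedInOperator:
    """Stand-in for an operator discovered in the plugin folder."""


integrate_plugins(operators, [PluggedInOperator])

# The plugin class is now importable from the namespace module.
from demo_operators import PluggedInOperator as Imported
```

The trade-off is that plugin classes become indistinguishable from built-in ones at import time, which is convenient for DAG authors but makes name clashes possible; a separate namespace such as airflow.plugins.operators avoids that.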

@codewithcheese
Contributor

A plugin system would be useful. I just wrote a hook and sensor operator for RabbitMQ so I could fire off a task when a queue became empty. master...codewithcheese:rabbitmq_hook_sensor

@mistercrunch
Member

I'm going to work on a plugin system over the next week; it seems pretty straightforward. @codewithcheese, we support defining pools of tasks in Airflow. Can your use case be handled by Airflow pools? http://pythonhosted.org/airflow/concepts.html#pools

@codewithcheese
Contributor

Actually, I am using RabbitMQ for data processing in a different project and need to run some bash commands when it is complete. I was sharing that to demonstrate that a plugin system for hooks, and not necessarily just operators, would be useful to me.

@mistercrunch
Member

I'm starting work on a plugin system using yapsy, I'll paste a link to the PR here when it's baked. I'm planning on integrating hooks, operators, macros, webviews, executors and I think that's it for now. We have use cases internally so that justifies the work.

@codewithcheese
Contributor

👍

mistercrunch changed the title from "Can't add external executors" to "Can't add external executors -> Plugin manager" on Jun 11, 2015
@mistercrunch
Member

Merged 22ac771
Documented here: http://pythonhosted.org/airflow/plugins.html

Let me know what you think

@mistercrunch
Member

It's out on pypi (v1.1.0)

@osallou
Author

osallou commented Jun 17, 2015

Sounds nice and fits my needs.
Thanks


@codewithcheese
Contributor

Sweet, thanks! I'll try it out. Now I can stop maintaining a fork for my deployment.

mistercrunch pushed a commit that referenced this issue Oct 1, 2015
aoen pushed a commit to aoen/incubator-airflow that referenced this issue Apr 10, 2020
… Default Retries and fix a small DAG refresh bug (apache#8)

* fb64f2e: [TWTR][AIRFLOW-XXX] Twitter Airflow Customizations + Fixup job scheduling without explicit_defaults_for_timestamp

* reformat

* 6607e48(airflow:master): [AIRFLOW-3160] Load latest_dagruns asynchronously, speed up front page load time apache#4005

* a93d550:

* a93d550: (HEAD, twitter/1.10+twtr) [TWTR][[AIRFLOW-4939]] Add Default Retries and fix a small DAG refresh bug (apache#3) (2 weeks ago)

* flake8 fix
y2k-shubham pushed a commit to y2k-shubham/airflow that referenced this issue Apr 11, 2020
mobuchowski pushed a commit to mobuchowski/airflow that referenced this issue Jan 4, 2022
* Added tests for the MarqueDag library
tatiana pushed a commit to tatiana/airflow that referenced this issue Oct 11, 2023