create public interface for user-defined "generic" components #2699

Open
thesuperzapper opened this issue May 3, 2022 · 4 comments

@thesuperzapper
Member

Generic components are powerful because the same component can run both locally and in a Kubeflow or Airflow pipeline. This makes iterative development easier: you can run the pipeline locally rather than spamming your Kubeflow/Airflow cluster with jobs.

We can provide a public interface for people to define their own "generic" components in addition to the built-in ones we already have (Jupyter Notebook, Python Script, R Script).


Users could define their own "generic" components by implementing a Python class with one method per runtime, like run_on_local(), run_on_kubeflow(), and run_on_airflow().

The available user-inputs (for generating the node-properties UI) could be defined by implementing "property" methods on this class.

We can then provide @xxxx decorators for each type of UI input we have ("dropdown", "list", "checkbox", etc).
For example, @elyra.dropdown(options=["option_1","option_2"], default="option_1") would display a dropdown and would pass parameters like selected_option to the annotated method.
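
To make the decorator idea more concrete, here is a minimal, self-contained sketch of how a dropdown decorator might behave. It is only an illustration: `elyra.dropdown` does not exist yet, so `dropdown` below is a stand-in, and `ExampleComponent`, `ui_values`, `ui_property`, and `ui_schema` are hypothetical names invented for this sketch. The decorator records the metadata a UI generator would need and injects the selected option when the method is called.

```python
import functools
from typing import Callable, Dict, List, Optional


def dropdown(display_name: str, options: List[str], default: str) -> Callable:
    """Hypothetical stand-in for the proposed `@elyra.dropdown` decorator."""
    if default not in options:
        raise ValueError(f"default {default!r} is not one of {options}")

    def decorator(method: Callable) -> Callable:
        @functools.wraps(method)
        def wrapper(self, selected_option: Optional[str] = None):
            # Prefer an explicit argument, then the value stored for this
            # property (assumed to come from the UI), then the default.
            if selected_option is None:
                selected_option = self.ui_values.get(method.__name__, default)
            if selected_option not in options:
                raise ValueError(f"{selected_option!r} is not one of {options}")
            return method(self, selected_option)

        # Metadata that a UI generator could read to render the input.
        wrapper.ui_property = {
            "type": "dropdown",
            "display_name": display_name,
            "options": list(options),
            "default": default,
        }
        return wrapper

    return decorator


class ExampleComponent:
    """Hypothetical component; `ui_values` maps property name -> UI selection."""

    def __init__(self, ui_values: Optional[Dict[str, str]] = None):
        self.ui_values = dict(ui_values or {})

    @dropdown(display_name="Greeting Text", options=["morning", "night"], default="morning")
    def greeting_text(self, selected_option: str) -> str:
        return f"Good {selected_option}, World!"


def ui_schema(component_cls: type) -> Dict[str, dict]:
    """Collect the metadata of every decorated property method on a class."""
    return {
        name: attr.ui_property
        for name, attr in vars(component_cls).items()
        if hasattr(attr, "ui_property")
    }
```

With this sketch, `ExampleComponent().greeting_text()` falls back to the default option, while `ExampleComponent({"greeting_text": "night"}).greeting_text()` uses the stored selection. `ui_schema(ExampleComponent)` returns the metadata that a node-properties UI could be generated from.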


Here is a very rough implementation of a generic-component class with one dropdown input called greeting_text that simply runs a print() function:

import kfp
from elyra.pipeline.local.processor_local import OperationProcessor
from elyra.pipeline.pipeline import GenericOperation

# NOTE: `ElyraGenericComponent`, `ElyraAirflowOperation`, and the `elyra.dropdown`
# decorator are part of this proposal and do not exist in Elyra yet.


def print_text(text_to_print: str):
    # a named function (not a lambda), since KFP builds the component from the function's source
    print(text_to_print)


class MyGenericComponent(ElyraGenericComponent):

    def run_on_local(self) -> OperationProcessor:

        # `CustomOperationProcessor` is a custom subclass of `OperationProcessor`
        class CustomOperationProcessor(OperationProcessor):
            def __init__(self, text_to_print: str):
                self.text_to_print = text_to_print
                super().__init__()

            def process(self, operation: GenericOperation, elyra_run_name: str):
                print(self.text_to_print)

        # `selected_option` is supplied by the `@elyra.dropdown` decorator
        operation_processor = CustomOperationProcessor(
            text_to_print=self.greeting_text()
        )
        return operation_processor

    def run_on_kubeflow(self) -> kfp.dsl.ContainerOp:
        container_op_factory = kfp.components.create_component_from_func(
            func=print_text,
            base_image='python:3.9'
        )
        container_op = container_op_factory(
            text_to_print=self.greeting_text()
        )
        return container_op

    def run_on_airflow(self) -> ElyraAirflowOperation:
        # `ElyraAirflowOperation` is a class that replaces the current dictionary we use to pass
        # the list of operations for the "airflow_template.jinja2" template
        elyra_airflow_operation = ElyraAirflowOperation(
            class_name="airflow.operators.python.PythonOperator",
            # `!r` quotes the text so the generated lambda is valid Python
            component_params={"python_callable": f"lambda: print({self.greeting_text()!r})"}
        )
        return elyra_airflow_operation

    @elyra.dropdown(display_name="Greeting Text", options=["morning", "night"], default="morning")
    def greeting_text(self, selected_option: str) -> str:
        if selected_option == "morning":
            return "Good morning, World!"
        elif selected_option == "night":
            return "Good night, World!"
        else:
            raise ValueError(f"unexpected option: {selected_option!r}")

@thesuperzapper
Member Author

@akchinSTC @ptitzler any thoughts on whether the above proposal is acceptable?

I think this is a very useful feature and will really set Elyra apart as a "generic" abstraction for pipelines.

@thesuperzapper added this to the 4.0.0 milestone May 31, 2022
@thesuperzapper
Member Author

@akchinSTC I have added this to the 4.0.0 milestone.

A public interface for "generic components" is a very valuable feature that no other pipeline tool has. Adding it would make Elyra a powerful high-level abstraction above Airflow, Kubeflow and Local-Python.

This is NOT to say that we must use the specific proposal above, just that we should consider how best to achieve user-provided "generic components" for the 4.0.0 release.

@lresende
Member

What would be a concrete example of a "bring your own generic component" that can't be exposed as either a script or a notebook?

The issue is that runtimes are an extension point, and we have already seen users implement a few runtimes of their own, so a "run_on_xxx" method per runtime won't scale well.

@lresende
Member

Also, generic components will have to reinvent the new KFP APIs, and we might go away from it.
