diff --git a/.gitignore b/.gitignore index 7dacadec87..337b3bdb64 100644 --- a/.gitignore +++ b/.gitignore @@ -30,4 +30,5 @@ cookbook/release-snacks download-artifact/ typescript .bash_history -.venv/ \ No newline at end of file +.venv/ +cookbook/docs/_tags/ \ No newline at end of file diff --git a/cookbook/case_studies/bioinformatics/blast/README.rst b/cookbook/case_studies/bioinformatics/blast/README.rst index 3a496382ea..4c4e6bde11 100644 --- a/cookbook/case_studies/bioinformatics/blast/README.rst +++ b/cookbook/case_studies/bioinformatics/blast/README.rst @@ -3,6 +3,8 @@ Nucleotide Sequence Querying with BLASTX ---------------------------------------- +.. tags:: Advanced + This tutorial shows how computational biology intermixes with Flyte. The problem statement we will be looking at is querying a nucleotide sequence against a local protein database, to identify potential homologues. This guide will show you how to: diff --git a/cookbook/case_studies/feature_engineering/eda/README.md b/cookbook/case_studies/feature_engineering/eda/README.md deleted file mode 100644 index ee244aa617..0000000000 --- a/cookbook/case_studies/feature_engineering/eda/README.md +++ /dev/null @@ -1,5 +0,0 @@ -Run workflows in this directory with the custom-built base image like so: - -```shell -pyflyte run --remote notebook.py:notebook_wf --image ghcr.io/flyteorg/flytecookbook:eda-latest -``` diff --git a/cookbook/case_studies/feature_engineering/eda/README.rst b/cookbook/case_studies/feature_engineering/eda/README.rst index 1eea836b8c..c0bc6271cc 100644 --- a/cookbook/case_studies/feature_engineering/eda/README.rst +++ b/cookbook/case_studies/feature_engineering/eda/README.rst @@ -1,6 +1,8 @@ EDA, Feature Engineering, and Modeling With Papermill ===================================================== +.. 
tags:: Data, Jupyter, Intermediate + Exploratory Data Analysis (EDA) refers to the critical process of performing initial investigations on data to discover patterns, spot anomalies, test hypotheses and check assumptions with the help of summary statistics and graphical representations. diff --git a/cookbook/case_studies/feature_engineering/eda/notebook.py b/cookbook/case_studies/feature_engineering/eda/notebook.py index b5b5e8ce03..9136315c51 100644 --- a/cookbook/case_studies/feature_engineering/eda/notebook.py +++ b/cookbook/case_studies/feature_engineering/eda/notebook.py @@ -1,6 +1,6 @@ """ Flyte Pipeline in One Jupyter Notebook -======================================= +====================================== In this example, we will implement a simple pipeline that takes hyperparameters, does EDA, feature engineering, and measures the Gradient Boosting model's performance using mean absolute error (MAE), all in one notebook. diff --git a/cookbook/case_studies/feature_engineering/feast_integration/README.rst b/cookbook/case_studies/feature_engineering/feast_integration/README.rst index 7935718a58..6f9faeb47f 100644 --- a/cookbook/case_studies/feature_engineering/feast_integration/README.rst +++ b/cookbook/case_studies/feature_engineering/feast_integration/README.rst @@ -1,6 +1,8 @@ Feast Integration ----------------- +.. tags:: Data, MachineLearning, Advanced + `Feast `__ is an operational data system for managing and serving machine learning features to models in production. Flyte provides a way to train models and perform feature engineering as a single pipeline. 
diff --git a/cookbook/case_studies/ml_training/house_price_prediction/README.md b/cookbook/case_studies/ml_training/house_price_prediction/README.md deleted file mode 100644 index f694a48520..0000000000 --- a/cookbook/case_studies/ml_training/house_price_prediction/README.md +++ /dev/null @@ -1,5 +0,0 @@ -Run workflows in this directory with the custom-built base image like so: - -```shell -pyflyte run --remote house_price_predictor.py:house_price_predictor_trainer --image ghcr.io/flyteorg/flytecookbook:house_price_prediction-latest -``` diff --git a/cookbook/case_studies/ml_training/house_price_prediction/README.rst b/cookbook/case_studies/ml_training/house_price_prediction/README.rst index 92079404e2..e05e17652f 100644 --- a/cookbook/case_studies/ml_training/house_price_prediction/README.rst +++ b/cookbook/case_studies/ml_training/house_price_prediction/README.rst @@ -1,5 +1,7 @@ House Price Regression ------------------------ +---------------------- + +.. tags:: Data, MachineLearning, DataFrame, Intermediate House Price Regression refers to the prediction of house prices based on various factors, using the XGBoost Regression model (in our case). In this example, we will train our data on the XGBoost model to predict house prices in multiple regions. 
diff --git a/cookbook/case_studies/ml_training/house_price_prediction/multiregion_house_price_predictor.py b/cookbook/case_studies/ml_training/house_price_prediction/multiregion_house_price_predictor.py index 1c2eb9e71a..b7867fe9e1 100644 --- a/cookbook/case_studies/ml_training/house_price_prediction/multiregion_house_price_predictor.py +++ b/cookbook/case_studies/ml_training/house_price_prediction/multiregion_house_price_predictor.py @@ -1,7 +1,7 @@ """ Predicting House Price in Multiple Regions Using XGBoost and Dynamic Workflows -------------------------------------------------------------------------------- +------------------------------------------------------------------------------ In this tutorial, we will understand how to predict house prices in multiple regions using XGBoost, and :ref:`dynamic workflows ` in Flyte. diff --git a/cookbook/case_studies/ml_training/mnist_classifier/README.rst b/cookbook/case_studies/ml_training/mnist_classifier/README.rst index 38ed4009aa..f8a0674438 100644 --- a/cookbook/case_studies/ml_training/mnist_classifier/README.rst +++ b/cookbook/case_studies/ml_training/mnist_classifier/README.rst @@ -3,6 +3,8 @@ MNIST Classification With PyTorch and W&B ----------------------------------------- +.. 
tags:: MachineLearning, GPU, Advanced + PyTorch ======= diff --git a/cookbook/case_studies/ml_training/mnist_classifier/pytorch_single_node_multi_gpu.py b/cookbook/case_studies/ml_training/mnist_classifier/pytorch_single_node_multi_gpu.py index 4308292daa..0db02f5cac 100644 --- a/cookbook/case_studies/ml_training/mnist_classifier/pytorch_single_node_multi_gpu.py +++ b/cookbook/case_studies/ml_training/mnist_classifier/pytorch_single_node_multi_gpu.py @@ -1,6 +1,6 @@ """ Single Node, Multi GPU Training --------------------------------- +------------------------------- When you need to scale up model training in pytorch, you can use the :py:class:`~torch:torch.nn.DataParallel` for single node, multi-gpu/cpu training or :py:class:`~torch:torch.nn.parallel.DistributedDataParallel` for multi-node, diff --git a/cookbook/case_studies/ml_training/nlp_processing/README.rst b/cookbook/case_studies/ml_training/nlp_processing/README.rst index 686a9f1ee7..c12d7b9712 100644 --- a/cookbook/case_studies/ml_training/nlp_processing/README.rst +++ b/cookbook/case_studies/ml_training/nlp_processing/README.rst @@ -1,6 +1,8 @@ NLP Processing -------------- +.. tags:: MachineLearning, UI, Intermediate + This tutorial will demonstrate how to process text data and generate word embeddings and visualizations as part of a Flyte workflow. It's an adaptation of the official Gensim `Word2Vec tutorial `__. diff --git a/cookbook/case_studies/ml_training/pima_diabetes/README.rst b/cookbook/case_studies/ml_training/pima_diabetes/README.rst index 50d319c2fd..a4f1762e69 100644 --- a/cookbook/case_studies/ml_training/pima_diabetes/README.rst +++ b/cookbook/case_studies/ml_training/pima_diabetes/README.rst @@ -1,5 +1,7 @@ Diabetes Classification ------------------------- +----------------------- + +.. tags:: MachineLearning, Intermediate The workflow demonstrates how to train an XGBoost model. The workflow is designed for the `Pima Indian Diabetes dataset `__. 
diff --git a/cookbook/case_studies/ml_training/pima_diabetes/diabetes.py b/cookbook/case_studies/ml_training/pima_diabetes/diabetes.py index b0cae1b1bb..4e733d3e73 100644 --- a/cookbook/case_studies/ml_training/pima_diabetes/diabetes.py +++ b/cookbook/case_studies/ml_training/pima_diabetes/diabetes.py @@ -1,6 +1,6 @@ """ Train and Validate a Diabetes Classification XGBoost Model ------------------------------------------------------------ +---------------------------------------------------------- Watch a demo of sandbox creation and a sample execution of the pima diabetes pipeline below. diff --git a/cookbook/case_studies/ml_training/spark_horovod/README.rst b/cookbook/case_studies/ml_training/spark_horovod/README.rst index db3434d007..4c2d67f7bb 100644 --- a/cookbook/case_studies/ml_training/spark_horovod/README.rst +++ b/cookbook/case_studies/ml_training/spark_horovod/README.rst @@ -1,7 +1,9 @@ .. _spark_horovod: Forecasting Rossman Store Sales with Horovod and Spark ----------------------------------------------------------------------- +------------------------------------------------------ + +.. tags:: MachineLearning, Integration, Advanced The problem statement we will be looking at is forecasting sales using `rossmann store sales `__ data. Our example is an adaptation of the `Horovod-Spark example `__. diff --git a/cookbook/core/containerization/multi_images.py b/cookbook/core/containerization/multi_images.py index 0a66ce2ac7..bc08e59831 100644 --- a/cookbook/core/containerization/multi_images.py +++ b/cookbook/core/containerization/multi_images.py @@ -4,6 +4,8 @@ Multiple Container Images in a Single Workflow ---------------------------------------------- +.. tags:: Containerization, Intermediate + When working locally, it is recommended to install all requirements of your project locally (maybe in a single virtual environment). 
It gets complicated when you want to deploy your code to a remote environment since most tasks in Flyte (function tasks) are deployed using a Docker Container. diff --git a/cookbook/core/containerization/raw_container.py b/cookbook/core/containerization/raw_container.py index 263f938610..73bc9c3210 100644 --- a/cookbook/core/containerization/raw_container.py +++ b/cookbook/core/containerization/raw_container.py @@ -2,7 +2,9 @@ .. _raw_container: Using Raw Containers ---------------------- +-------------------- + +.. tags:: Containerization, Advanced This example demonstrates how to use arbitrary containers in 5 different languages, all orchestrated in flytekit seamlessly. Flyte mounts an input data volume where all the data needed by the container is available and an output data volume diff --git a/cookbook/core/containerization/spot_instances.py b/cookbook/core/containerization/spot_instances.py index 00575922ae..4dda5384d3 100644 --- a/cookbook/core/containerization/spot_instances.py +++ b/cookbook/core/containerization/spot_instances.py @@ -2,6 +2,8 @@ Using Spot/Preemptible Instances -------------------------------- +.. tags:: AWS, GCP, Intermediate + """ # %% # What Are Spot/Preemptible Instances? diff --git a/cookbook/core/containerization/use_secrets.py b/cookbook/core/containerization/use_secrets.py index 30cb284eda..e6bc2fc297 100644 --- a/cookbook/core/containerization/use_secrets.py +++ b/cookbook/core/containerization/use_secrets.py @@ -6,6 +6,8 @@ Using Secrets in a Task ======================= +.. tags:: Kubernetes, Intermediate + Flyte supports running a variety of tasks, from containers to SQL queries and service calls. For Flyte-run containers to request and access secrets, Flyte provides a native Secret construct. 
diff --git a/cookbook/core/containerization/workflow_labels_annotations.py b/cookbook/core/containerization/workflow_labels_annotations.py index 7bc9cc7dbd..5afea395bd 100644 --- a/cookbook/core/containerization/workflow_labels_annotations.py +++ b/cookbook/core/containerization/workflow_labels_annotations.py @@ -2,6 +2,8 @@ Adding Workflow Labels and Annotations -------------------------------------- +.. tags:: Kubernetes, Intermediate + In Flyte, workflow executions are created as Kubernetes resources. These can be extended with `labels `__ and `annotations `__. diff --git a/cookbook/core/control_flow/chain_entities.py b/cookbook/core/control_flow/chain_entities.py index 4a7f026ae7..14d543fa9b 100644 --- a/cookbook/core/control_flow/chain_entities.py +++ b/cookbook/core/control_flow/chain_entities.py @@ -2,6 +2,8 @@ Chain Flyte Entities -------------------- +.. tags:: Basic + Data passing between tasks or workflows need not necessarily happen through parameters. In such a case, if you want to explicitly construct the dependency, flytekit provides a mechanism to chain Flyte entities using the ``>>`` operator. diff --git a/cookbook/core/control_flow/checkpoint.py b/cookbook/core/control_flow/checkpoint.py index f36f4a1190..25ad19bbff 100644 --- a/cookbook/core/control_flow/checkpoint.py +++ b/cookbook/core/control_flow/checkpoint.py @@ -1,6 +1,8 @@ """ Intratask Checkpoints ----------------------- +--------------------- + +.. tags:: MachineLearning, Intermediate .. note:: diff --git a/cookbook/core/control_flow/conditions.py b/cookbook/core/control_flow/conditions.py index 05d0507136..048fb575f3 100644 --- a/cookbook/core/control_flow/conditions.py +++ b/cookbook/core/control_flow/conditions.py @@ -1,6 +1,9 @@ """ Conditions --------------- +---------- + +.. tags:: Intermediate + Flytekit supports conditions as a first class construct in the language. 
Conditions offer a way to selectively execute branches of a workflow based on static or dynamic data produced by other tasks or come in as workflow inputs. Conditions are very performant to be evaluated. However, they are limited to certain binary and logical operators and can diff --git a/cookbook/core/control_flow/dynamics.py b/cookbook/core/control_flow/dynamics.py index 33f3512ce8..ac266f0569 100644 --- a/cookbook/core/control_flow/dynamics.py +++ b/cookbook/core/control_flow/dynamics.py @@ -3,6 +3,8 @@ Dynamic Workflows ================= +.. tags:: Intermediate + A workflow is typically static when the directed acyclic graph's (DAG) structure is known at compile-time. However, in cases where a run-time parameter (for example, the output of an earlier task) determines the full DAG structure, you can use dynamic workflows by decorating a function with ``@dynamic``. diff --git a/cookbook/core/control_flow/map_task.py b/cookbook/core/control_flow/map_task.py index 70ff7cc6e9..c46ce192ab 100644 --- a/cookbook/core/control_flow/map_task.py +++ b/cookbook/core/control_flow/map_task.py @@ -2,6 +2,8 @@ Map Tasks --------- +.. tags:: Intermediate + A map task lets you run a pod task or a regular task over a list of inputs within a single workflow node. This means you can run thousands of instances of the task without creating a node for every instance, providing valuable performance gains! diff --git a/cookbook/core/control_flow/merge_sort.py b/cookbook/core/control_flow/merge_sort.py index ebda6a9ffd..3407e448ec 100644 --- a/cookbook/core/control_flow/merge_sort.py +++ b/cookbook/core/control_flow/merge_sort.py @@ -2,7 +2,9 @@ .. _advanced_merge_sort: Implementing Merge Sort ------------------------- +----------------------- + +.. tags:: Intermediate FlyteIdl (the fundamental building block of the Flyte Language) allows various programming language features: conditionals, recursion, custom typing, and more. 
diff --git a/cookbook/core/control_flow/subworkflows.py b/cookbook/core/control_flow/subworkflows.py index 967e87c691..d1cc396c16 100644 --- a/cookbook/core/control_flow/subworkflows.py +++ b/cookbook/core/control_flow/subworkflows.py @@ -4,6 +4,8 @@ SubWorkflows ------------ +.. tags:: Intermediate + Subworkflows are similar to :ref:`launch plans ` since they allow users to kick off one workflow from within another. What's the Difference? diff --git a/cookbook/core/extend_flyte/backend_plugins.py b/cookbook/core/extend_flyte/backend_plugins.py index fd5db48e56..4813419489 100644 --- a/cookbook/core/extend_flyte/backend_plugins.py +++ b/cookbook/core/extend_flyte/backend_plugins.py @@ -1,9 +1,11 @@ """ .. _extend-plugin-flyte-backend: -############################### +########################## Writing Backend Extensions -############################### +########################## + +.. tags:: Extensibility, Contribute, Intermediate Now that you have landed here, we can assume that you have exhausted your options of extending and want to extend Flyte in a way that adds new capabilities to the platform. diff --git a/cookbook/core/extend_flyte/container_interface.py b/cookbook/core/extend_flyte/container_interface.py index 7144956660..d54df20b49 100644 --- a/cookbook/core/extend_flyte/container_interface.py +++ b/cookbook/core/extend_flyte/container_interface.py @@ -4,6 +4,8 @@ Container Interface ------------------- +.. tags:: Extensibility, Contribute, Intermediate + Flyte typically interacts with containers in the course of its task execution (since most tasks are container tasks). This is what that process looks like: diff --git a/cookbook/core/extend_flyte/custom_types.py b/cookbook/core/extend_flyte/custom_types.py index 3add806249..b44885e0bb 100644 --- a/cookbook/core/extend_flyte/custom_types.py +++ b/cookbook/core/extend_flyte/custom_types.py @@ -4,6 +4,8 @@ Writing Custom Flyte Types -------------------------- +.. 
tags:: Extensibility, Contribute, Intermediate + Flyte is a strongly-typed framework for authoring tasks and workflows. But there are situations when the existing types do not directly work. This is true with any programming language! diff --git a/cookbook/core/extend_flyte/prebuilt_container.py b/cookbook/core/extend_flyte/prebuilt_container.py index 8cabc00666..297255c1ba 100644 --- a/cookbook/core/extend_flyte/prebuilt_container.py +++ b/cookbook/core/extend_flyte/prebuilt_container.py @@ -4,6 +4,8 @@ Pre-built Container Task Plugin ------------------------------- +.. tags:: Extensibility, Contribute, Intermediate + A pre-built container task plugin runs a pre-built container. The following are the advantages of using a pre-built container in comparison to a user-defined container: - Shifts the burden of writing Dockerfile from the user who uses the task in workflows to the author of the task type. diff --git a/cookbook/core/extend_flyte/user_container.py b/cookbook/core/extend_flyte/user_container.py index ec76f0ef0e..6ebd1b3321 100644 --- a/cookbook/core/extend_flyte/user_container.py +++ b/cookbook/core/extend_flyte/user_container.py @@ -4,6 +4,8 @@ User Container Task Plugin -------------------------- +.. tags:: Extensibility, Contribute, Intermediate + A user container task plugin runs a user-defined container that has the user code. This tutorial will walk you through writing your own sensor-style plugin that allows users to wait for a file to land diff --git a/cookbook/core/flyte_basics/basic_workflow.py b/cookbook/core/flyte_basics/basic_workflow.py index 9fdc8de514..33fbd059b4 100644 --- a/cookbook/core/flyte_basics/basic_workflow.py +++ b/cookbook/core/flyte_basics/basic_workflow.py @@ -1,6 +1,8 @@ """ Workflows ----------- +--------- + +.. tags:: Basic Once you have a handle on :ref:`tasks `, you can dive into Flyte workflows. Together, Flyte tasks and workflows make up the fundamental building blocks of Flyte. 
diff --git a/cookbook/core/flyte_basics/deck.py b/cookbook/core/flyte_basics/deck.py index a3d506d02b..9f57d9643f 100644 --- a/cookbook/core/flyte_basics/deck.py +++ b/cookbook/core/flyte_basics/deck.py @@ -1,6 +1,8 @@ """ Flyte Decks -------------- +----------- + +.. tags:: UI, Basic Deck enables users to get customizable and default visibility into their tasks. diff --git a/cookbook/core/flyte_basics/decorating_tasks.py b/cookbook/core/flyte_basics/decorating_tasks.py index f5563a3592..5ca2e9b305 100644 --- a/cookbook/core/flyte_basics/decorating_tasks.py +++ b/cookbook/core/flyte_basics/decorating_tasks.py @@ -2,6 +2,8 @@ Decorating Tasks ---------------- +.. tags:: Intermediate + A simple way of modifying the behavior of tasks is by using decorators to wrap your task functions. In order to make sure that your decorated function contains all the type annotation and docstring diff --git a/cookbook/core/flyte_basics/decorating_workflows.py b/cookbook/core/flyte_basics/decorating_workflows.py index 5979604a4b..88f2a0d2d7 100644 --- a/cookbook/core/flyte_basics/decorating_workflows.py +++ b/cookbook/core/flyte_basics/decorating_workflows.py @@ -2,6 +2,8 @@ Decorating Workflows -------------------- +.. tags:: Intermediate + The behavior of workflows can be modified in a light-weight fashion by using the built-in :py:func:`~functools.wraps` decorator pattern, similar to using decorators to :ref:`customize task behavior `. However, unlike in the case of diff --git a/cookbook/core/flyte_basics/documented_workflow.py b/cookbook/core/flyte_basics/documented_workflow.py index e31d8fb125..596e35f236 100644 --- a/cookbook/core/flyte_basics/documented_workflow.py +++ b/cookbook/core/flyte_basics/documented_workflow.py @@ -2,6 +2,8 @@ Add Docstrings to Workflows --------------------------- +.. tags:: Basic + Documented code helps enhance the readability of the code. Flyte supports docstrings to document your code. 
Docstrings are stored in FlyteAdmin and shown on the UI in the launch form. diff --git a/cookbook/core/flyte_basics/files.py b/cookbook/core/flyte_basics/files.py index 9de5865f8e..51a466a57e 100644 --- a/cookbook/core/flyte_basics/files.py +++ b/cookbook/core/flyte_basics/files.py @@ -1,6 +1,8 @@ """ Working With Files -------------------- +------------------ + +.. tags:: Data, Basic Files are one of the most fundamental entities that users of Python work with, and they are fully supported by Flyte. In the IDL, they are known as diff --git a/cookbook/core/flyte_basics/folders.py b/cookbook/core/flyte_basics/folders.py index d4f565f2e4..87d49dabff 100644 --- a/cookbook/core/flyte_basics/folders.py +++ b/cookbook/core/flyte_basics/folders.py @@ -1,6 +1,8 @@ """ Working With Folders ---------------------- +-------------------- + +.. tags:: Data, Basic In addition to files, folders are another fundamental operating system primitive users often work with. Flyte supports folders in the form of `multi-part blobs `__. diff --git a/cookbook/core/flyte_basics/hello_world.py b/cookbook/core/flyte_basics/hello_world.py index f50cd4b497..3a766d0594 100644 --- a/cookbook/core/flyte_basics/hello_world.py +++ b/cookbook/core/flyte_basics/hello_world.py @@ -1,7 +1,10 @@ """ Hello World ------------- +----------- + +.. tags:: Basic + This simple workflow calls a task that returns "Hello World" and then just sets that as the final output of the workflow. """ import typing diff --git a/cookbook/core/flyte_basics/imperative_wf_style.py b/cookbook/core/flyte_basics/imperative_wf_style.py index 52fe90c3a2..03d8f6d7d6 100644 --- a/cookbook/core/flyte_basics/imperative_wf_style.py +++ b/cookbook/core/flyte_basics/imperative_wf_style.py @@ -4,6 +4,8 @@ Imperative Workflows -------------------- +.. tags:: Basic + Workflows are typically created and specified by decorating a function with the ``@workflow`` decorator. 
This will run through the body of the function at compile time, using the subsequent calls of the underlying tasks to determine and record the workflow structure. This is the declarative style and makes sense when a human is writing it up by hand. diff --git a/cookbook/core/flyte_basics/lp.py b/cookbook/core/flyte_basics/lp.py index 1920768d93..15f787c547 100644 --- a/cookbook/core/flyte_basics/lp.py +++ b/cookbook/core/flyte_basics/lp.py @@ -2,6 +2,8 @@ Launch Plans ------------ +.. tags:: Basic + Launch plans bind a partial or complete list of inputs necessary to launch a workflow, along with optional run-time overrides such as notifications, schedules and more. Launch plan inputs must only assign inputs already defined in the reference workflow definition. diff --git a/cookbook/core/flyte_basics/named_outputs.py b/cookbook/core/flyte_basics/named_outputs.py index aa4975a212..799797963a 100644 --- a/cookbook/core/flyte_basics/named_outputs.py +++ b/cookbook/core/flyte_basics/named_outputs.py @@ -2,6 +2,8 @@ Named Outputs ------------- +.. tags:: Basic + By default, Flyte names the outputs of a task or workflow using a standardized convention. All the outputs are named as ``o1, o2, o3, ... o.`` where ``o`` is the standard prefix and ``1, 2, .. `` is the index position within the return values. diff --git a/cookbook/core/flyte_basics/reference_task.py b/cookbook/core/flyte_basics/reference_task.py index 9f6cb06642..0a3bc71b9e 100644 --- a/cookbook/core/flyte_basics/reference_task.py +++ b/cookbook/core/flyte_basics/reference_task.py @@ -2,6 +2,8 @@ Reference Task -------------- +.. tags:: Intermediate + A :py:func:`flytekit.reference_task` references the Flyte tasks that have already been defined, serialized, and registered. You can reference tasks from other projects and create a workflow that uses tasks declared by others. These tasks can be in their own containers, python runtimes, flytekit versions, and even different languages. 
diff --git a/cookbook/core/flyte_basics/shell_task.py b/cookbook/core/flyte_basics/shell_task.py index 01e6494c3b..530bd8a513 100644 --- a/cookbook/core/flyte_basics/shell_task.py +++ b/cookbook/core/flyte_basics/shell_task.py @@ -2,6 +2,8 @@ Run Bash Scripts Using ShellTask -------------------------------- +.. tags:: Intermediate + To run bash scripts from within Flyte, ShellTask can be used. In this example, let's define three ShellTasks to run simple bash commands. .. note:: diff --git a/cookbook/core/flyte_basics/task.py b/cookbook/core/flyte_basics/task.py index a0447c0ad4..0daefd5e4e 100644 --- a/cookbook/core/flyte_basics/task.py +++ b/cookbook/core/flyte_basics/task.py @@ -4,6 +4,8 @@ Tasks ----- +.. tags:: Basic + Task is a fundamental building block and an extension point of Flyte, which encapsulates the users' code. They possess the following properties: #. Versioned (usually tied to the ``git sha``) diff --git a/cookbook/core/flyte_basics/task_cache.py b/cookbook/core/flyte_basics/task_cache.py index 6819dd3b6a..e40dec14c6 100644 --- a/cookbook/core/flyte_basics/task_cache.py +++ b/cookbook/core/flyte_basics/task_cache.py @@ -1,6 +1,8 @@ """ Caching --------- +------- + +.. tags:: Basic Flyte provides the ability to cache the output of task executions to make the subsequent executions faster. A well-behaved Flyte task should generate deterministic output given the same inputs and task functionality. diff --git a/cookbook/core/flyte_basics/task_cache_serialize.py b/cookbook/core/flyte_basics/task_cache_serialize.py index 11451c19fc..e2289466c0 100644 --- a/cookbook/core/flyte_basics/task_cache_serialize.py +++ b/cookbook/core/flyte_basics/task_cache_serialize.py @@ -2,6 +2,8 @@ Cache Serializing ----------------- +.. tags:: Intermediate + Serializing means only executing a single instance of a unique cacheable task (determined by the cache_version parameter and task signature) at a time. 
Using this mechanism, Flyte ensures that during multiple concurrent executions of a task only a single instance is evaluated and all others wait until completion and reuse the resulting cached outputs. Ensuring serialized evaluation requires a small degree of overhead to coordinate executions using a lightweight artifact reservation system. Therefore, this should be viewed as an extension to rather than a replacement for non-serialized cacheable tasks. It is particularly well fit for long running or otherwise computationally expensive tasks executed in scenarios similar to the following examples: diff --git a/cookbook/core/scheduled_workflows/lp_schedules.py b/cookbook/core/scheduled_workflows/lp_schedules.py index de49566655..b232ae30e3 100644 --- a/cookbook/core/scheduled_workflows/lp_schedules.py +++ b/cookbook/core/scheduled_workflows/lp_schedules.py @@ -2,7 +2,9 @@ .. _launchplan_schedules: Scheduling Workflows Example ------------------------------ +---------------------------- + +.. tags:: Basic :ref:`flyte:divedeep-launchplans` can be set to run automatically on a schedule using the Flyte Native Scheduler. For workflows that depend on knowing the kick-off time, Flyte supports passing in the scheduled time (not the actual time, which may be a few seconds off) as an argument to the workflow. diff --git a/cookbook/core/type_system/custom_objects.py b/cookbook/core/type_system/custom_objects.py index 9aa766136d..33ff2e90e2 100644 --- a/cookbook/core/type_system/custom_objects.py +++ b/cookbook/core/type_system/custom_objects.py @@ -4,6 +4,8 @@ Using Custom Python Objects --------------------------- +.. tags:: Basic + Flyte supports passing JSON between tasks. But to simplify the usage for the users and introduce type-safety, Flytekit supports passing custom data objects between tasks. 
diff --git a/cookbook/core/type_system/enums.py b/cookbook/core/type_system/enums.py index eb72c524e7..3f98282658 100644 --- a/cookbook/core/type_system/enums.py +++ b/cookbook/core/type_system/enums.py @@ -1,6 +1,8 @@ """ Using Enum types -------------------------- +---------------- + +.. tags:: Basic Sometimes you may want to restrict the set of inputs / outputs to a finite set of acceptable values. This is commonly achieved using Enum types in programming languages. diff --git a/cookbook/core/type_system/flyte_pickle.py b/cookbook/core/type_system/flyte_pickle.py index ed5f3d93ac..5458b4ebe2 100644 --- a/cookbook/core/type_system/flyte_pickle.py +++ b/cookbook/core/type_system/flyte_pickle.py @@ -2,7 +2,9 @@ .. _flyte_pickle: Using Flyte Pickle ----------------------------- +------------------ + +.. tags:: Basic Flyte enforces type safety by leveraging type information to be able to compile tasks/workflows, which enables all sorts of nice features (like static analysis of tasks/workflows, conditional branching, etc.) diff --git a/cookbook/core/type_system/flyte_python_types.py b/cookbook/core/type_system/flyte_python_types.py index 9346a7dc39..e5ab1e84ab 100644 --- a/cookbook/core/type_system/flyte_python_types.py +++ b/cookbook/core/type_system/flyte_python_types.py @@ -2,7 +2,9 @@ .. _flytekit_to_flyte_type_mapping: Flyte and Python Types ------------------------ +---------------------- + +.. tags:: Basic FlyteKit automatically maps Python types to Flyte types. This section provides details of the mappings, but for the most part you can skip this section, as almost all of Python types are mapped automatically. diff --git a/cookbook/core/type_system/pytorch_types.py b/cookbook/core/type_system/pytorch_types.py index e180cd328f..f08f7044cc 100644 --- a/cookbook/core/type_system/pytorch_types.py +++ b/cookbook/core/type_system/pytorch_types.py @@ -4,6 +4,8 @@ PyTorch Types ============= +.. 
tags:: MachineLearning, Basic + Flyte promotes the use of strongly-typed data to make it easier to write pipelines that are more robust and easier to test. Flyte is primarily used for machine learning besides data engineering. To simplify the communication between Flyte tasks, especially when passing around tensors and models, we added support for the PyTorch types. diff --git a/cookbook/core/type_system/schema.py b/cookbook/core/type_system/schema.py index d5edb4a463..a3ce2f7a9a 100644 --- a/cookbook/core/type_system/schema.py +++ b/cookbook/core/type_system/schema.py @@ -1,6 +1,8 @@ """ Using Schemas ------------------- +------------- + +.. tags:: DataFrame, Basic This example explains how an untyped schema is passed between tasks using :py:class:`pandas.DataFrame`. Flytekit makes it possible for users to directly return or accept a :py:class:`pandas.DataFrame`, which are automatically diff --git a/cookbook/core/type_system/structured_dataset.py b/cookbook/core/type_system/structured_dataset.py index 7b70bcca2b..f93a062869 100644 --- a/cookbook/core/type_system/structured_dataset.py +++ b/cookbook/core/type_system/structured_dataset.py @@ -4,6 +4,8 @@ Structured Dataset ------------------ +.. tags:: DataFrame, Basic, Data + Structured dataset is a superset of Flyte Schema. The ``StructuredDataset`` Transformer can write a dataframe to BigQuery, s3, or any storage by registering new structured dataset encoder and decoder. diff --git a/cookbook/core/type_system/typed_schema.py b/cookbook/core/type_system/typed_schema.py index a00cc01c4e..1a941f3661 100644 --- a/cookbook/core/type_system/typed_schema.py +++ b/cookbook/core/type_system/typed_schema.py @@ -2,7 +2,9 @@ .. _typed_schema: Typed Columns in a Schema --------------------------- +------------------------- + +.. tags:: DataFrame, Basic, Data This example explains how a typed schema can be used in Flyte and declared in flytekit. 
diff --git a/cookbook/deployment/configure_logging_links.py b/cookbook/deployment/configure_logging_links.py index 8b857d9646..de9fd0d7b9 100644 --- a/cookbook/deployment/configure_logging_links.py +++ b/cookbook/deployment/configure_logging_links.py @@ -4,6 +4,8 @@ Configuring Logging Links in UI ------------------------------- +.. tags:: Deployment, Intermediate, UI + To debug your workflows in production, you want to access logs from your tasks as they run. These logs are different from the core Flyte platform logs, are specific to execution, and may vary from plugin to plugin; for example, Spark may have driver and executor logs. diff --git a/cookbook/deployment/configure_use_gpus.py b/cookbook/deployment/configure_use_gpus.py index bdb7804d53..6c26552f7e 100644 --- a/cookbook/deployment/configure_use_gpus.py +++ b/cookbook/deployment/configure_use_gpus.py @@ -4,6 +4,8 @@ Configuring Flyte to Access GPUs -------------------------------- +.. tags:: Deployment, Infrastructure, GPU, Intermediate + Along with the simpler resources like CPU/Memory, you may want to configure and access GPU resources. Flyte allows you to configure the GPU access poilcy for your cluster. GPUs are expensive and it would not be ideal to treat machines with GPUs and machines with CPUs equally. You may want to reserve machines with GPUs for tasks diff --git a/cookbook/deployment/customizing_resources.py b/cookbook/deployment/customizing_resources.py index 1eae0f5dbe..66c4f3b937 100644 --- a/cookbook/deployment/customizing_resources.py +++ b/cookbook/deployment/customizing_resources.py @@ -1,8 +1,12 @@ """ Customizing Task Resources ---------------------------- +-------------------------- + +.. tags:: Deployment, Infrastructure, Basic + One of the reasons to use a hosted Flyte environment is the potential of leveraging CPU, memory and storage resources, far greater than what's available locally. 
Flytekit makes it possible to specify these requirements declaratively and close to where the task itself is declared. + """ # %% diff --git a/cookbook/deployment/deploying_workflows.py b/cookbook/deployment/deploying_workflows.py index eae81c6e14..2cd12bb863 100644 --- a/cookbook/deployment/deploying_workflows.py +++ b/cookbook/deployment/deploying_workflows.py @@ -2,6 +2,8 @@ Deploying Workflows - Registration ---------------------------------- +.. tags:: Deployment, Intermediate + Locally, Flytekit relies on the Python interpreter to execute tasks and workflows. To leverage the full power of Flyte, we recommend using a deployed backend of Flyte. Flyte can be run on any Kubernetes cluster (for example, a local cluster like `kind `__), in a cloud environment, diff --git a/cookbook/deployment/lp_notifications.py b/cookbook/deployment/lp_notifications.py index 8dcb18d169..cfd1b191ca 100644 --- a/cookbook/deployment/lp_notifications.py +++ b/cookbook/deployment/lp_notifications.py @@ -3,6 +3,8 @@ Notifications ############# +.. 
tags:: Intermediate + """ # %% diff --git a/cookbook/docs-requirements.in b/cookbook/docs-requirements.in index 17dfae02fa..20950f659b 100644 --- a/cookbook/docs-requirements.in +++ b/cookbook/docs-requirements.in @@ -16,4 +16,5 @@ sphinxcontrib-yt sphinx-tabs<3.4.1 astroid grpcio < 1.49.0 -grpcio-status < 1.49.0 \ No newline at end of file +grpcio-status < 1.49.0 +sphinx-tags diff --git a/cookbook/docs-requirements.txt b/cookbook/docs-requirements.txt index 1088c0b5a4..4921b9853e 100644 --- a/cookbook/docs-requirements.txt +++ b/cookbook/docs-requirements.txt @@ -26,6 +26,8 @@ certifi==2022.9.24 # via requests cffi==1.15.1 # via cryptography +cfgv==3.3.1 + # via pre-commit chardet==5.0.0 # via binaryornot charset-normalizer==2.1.1 @@ -52,6 +54,8 @@ deprecated==1.2.13 # via flytekit diskcache==5.4.0 # via flytekit +distlib==0.3.6 + # via virtualenv docker==6.0.1 # via flytekit docker-image-py==0.1.12 @@ -64,6 +68,8 @@ docutils==0.17.1 # sphinx-panels # sphinx-rtd-theme # sphinx-tabs +filelock==3.8.0 + # via virtualenv flyteidl==1.1.22 # via flytekit flytekit==1.2.3 @@ -76,7 +82,7 @@ fonttools==4.38.0 # via matplotlib furo @ git+https://github.com/flyteorg/furo@main # via -r docs-requirements.in -googleapis-common-protos==1.56.4 +googleapis-common-protos==1.57.0 # via # flyteidl # grpcio-status @@ -91,6 +97,8 @@ grpcio-status==1.48.2 # flytekit htmlmin==0.1.12 # via pandas-profiling +identify==2.5.8 + # via pre-commit idna==3.4 # via requests imagehash==4.3.1 @@ -129,7 +137,7 @@ markdown==3.4.1 # via flytekitplugins-deck-standard markupsafe==2.1.1 # via jinja2 -marshmallow==3.18.0 +marshmallow==3.19.0 # via # dataclasses-json # marshmallow-enum @@ -159,6 +167,8 @@ natsort==8.2.0 # via flytekit networkx==2.8.8 # via visions +nodeenv==1.7.0 + # via pre-commit numpy==1.23.4 # via # imagehash @@ -201,8 +211,12 @@ pillow==9.3.0 # imagehash # matplotlib # visions +platformdirs==2.5.4 + # via virtualenv plotly==5.11.0 # via flytekitplugins-deck-standard +pre-commit==2.20.0 
+ # via sphinx-tags protobuf==3.20.3 # via # flyteidl @@ -257,6 +271,7 @@ pyyaml==6.0 # cookiecutter # flytekit # pandas-profiling + # pre-commit # sphinx-autoapi regex==2022.10.31 # via docker-image-py @@ -310,6 +325,7 @@ sphinx==4.5.0 # sphinx-prompt # sphinx-rtd-theme # sphinx-tabs + # sphinx-tags # sphinxcontrib-yt sphinx-autoapi==2.0.0 # via -r docs-requirements.in @@ -317,7 +333,7 @@ sphinx-basic-ng==1.0.0b1 # via furo sphinx-code-include==1.1.1 # via -r docs-requirements.in -sphinx-copybutton==0.5.0 +sphinx-copybutton==0.5.1 # via -r docs-requirements.in sphinx-fontawesome==0.0.6 # via -r docs-requirements.in @@ -331,6 +347,8 @@ sphinx-rtd-theme==1.1.1 # via -r docs-requirements.in sphinx-tabs==3.4.0 # via -r docs-requirements.in +sphinx-tags==0.1.6 + # via -r docs-requirements.in sphinxcontrib-applehelp==1.0.2 # via sphinx sphinxcontrib-devhelp==1.0.2 @@ -360,10 +378,12 @@ tenacity==8.1.0 text-unidecode==1.3 # via python-slugify toml==0.10.2 - # via responses + # via + # pre-commit + # responses tqdm==4.64.1 # via pandas-profiling -types-toml==0.10.8 +types-toml==0.10.8.1 # via responses typing-extensions==4.4.0 # via @@ -381,11 +401,13 @@ urllib3==1.26.12 # flytekit # requests # responses +virtualenv==20.16.7 + # via pre-commit visions[type_image_path]==0.7.5 # via pandas-profiling websocket-client==1.4.2 # via docker -wheel==0.38.2 +wheel==0.38.4 # via # -r ./common/requirements-common.in # flytekit @@ -396,3 +418,6 @@ wrapt==1.14.1 # flytekit zipp==3.10.0 # via importlib-metadata + +# The following packages are considered to be unsafe in a requirements file: +# setuptools diff --git a/cookbook/docs/Makefile b/cookbook/docs/Makefile index d57719612a..f151b643cb 100644 --- a/cookbook/docs/Makefile +++ b/cookbook/docs/Makefile @@ -19,3 +19,4 @@ clean: rm -rf _build rm -rf auto_* rm -rf auto/* + rm -rf _tags/* diff --git a/cookbook/docs/conf.py b/cookbook/docs/conf.py index 577259e47c..21dfae6715 100644 --- a/cookbook/docs/conf.py +++ b/cookbook/docs/conf.py 
@@ -199,6 +199,7 @@ def __call__(self, filename): "sphinxcontrib.mermaid", "sphinxcontrib.yt", "sphinx_tabs.tabs", + "sphinx_tags", ] # Add any paths that contain templates here, relative to this directory. @@ -223,6 +224,11 @@ def __call__(self, filename): # The master toctree document. master_doc = "index" +# Tags config +tags_create_tags = True +tags_page_title = "Tag" +tags_overview_title = "All Tags" + pygments_style = "tango" pygments_dark_style = "monokai" diff --git a/cookbook/docs/contribute.rst b/cookbook/docs/contribute.rst index 21696cb940..0abefb5ace 100644 --- a/cookbook/docs/contribute.rst +++ b/cookbook/docs/contribute.rst @@ -1,6 +1,8 @@ Example Contribution Guide ########################### +.. tags:: Contribute, Basic + The examples documentation provides an easy way for the community to learn about the rich set of features that Flyte offers, and we are constantly improving them with your help! diff --git a/cookbook/docs/index.rst b/cookbook/docs/index.rst index 478a31cbcd..8fa375fe46 100644 --- a/cookbook/docs/index.rst +++ b/cookbook/docs/index.rst @@ -4,6 +4,8 @@ User Guide ############## +`Flytesnacks Tags <_tags/tagsindex.html>`__ + If this is your first time using Flyte, check out the `Getting Started `_ guide. This *User Guide*, the :doc:`Tutorials `, and the :doc:`Integrations ` examples cover all of @@ -190,6 +192,7 @@ Table of Contents auto/integrations/aws/sagemaker_training/index auto/integrations/aws/sagemaker_pytorch/index auto/integrations/aws/athena/index + auto/integrations/aws/batch/index auto/integrations/external_services/hive/index auto/integrations/external_services/snowflake/index auto/integrations/gcp/bigquery/index diff --git a/cookbook/integrations/aws/athena/README.rst b/cookbook/integrations/aws/athena/README.rst index 7b6f6827ca..672ecd3950 100644 --- a/cookbook/integrations/aws/athena/README.rst +++ b/cookbook/integrations/aws/athena/README.rst @@ -2,6 +2,8 @@ AWS Athena ########## +.. 
tags:: Data, Integration, AWS, Advanced + Executing Athena Queries ======================== Flyte backend can be connected with Athena. Once enabled, it allows you to query AWS Athena service (Presto + ANSI SQL Support) and retrieve typed schema (optionally). diff --git a/cookbook/integrations/aws/batch/README.rst b/cookbook/integrations/aws/batch/README.rst index 6c57579dd3..64b89f48b3 100644 --- a/cookbook/integrations/aws/batch/README.rst +++ b/cookbook/integrations/aws/batch/README.rst @@ -2,6 +2,8 @@ AWS Batch ########## +.. tags:: Data, Integration, AWS, Advanced + Executing Batch Job ======================= Flyte backend can be connected with batch. Once enabled, it allows you to run regular task on AWS batch. diff --git a/cookbook/integrations/aws/batch/batch.py b/cookbook/integrations/aws/batch/batch.py index b993451d84..f9f18eb07d 100644 --- a/cookbook/integrations/aws/batch/batch.py +++ b/cookbook/integrations/aws/batch/batch.py @@ -1,6 +1,7 @@ """ AWS Batch -############ +######### + This example shows how to use a Flyte AWS batch plugin to execute a tasks on batch service. With AWS Batch, there is no need to install and manage batch computing software or server clusters that you use to run your jobs, allowing you to focus on analyzing results and solving problems. diff --git a/cookbook/integrations/aws/sagemaker_pytorch/README.rst b/cookbook/integrations/aws/sagemaker_pytorch/README.rst index bcce3ac349..0c5690052c 100644 --- a/cookbook/integrations/aws/sagemaker_pytorch/README.rst +++ b/cookbook/integrations/aws/sagemaker_pytorch/README.rst @@ -1,6 +1,8 @@ AWS Sagemaker Pytorch ===================== +.. tags:: Integration, MachineLearning, AWS, Advanced + This plugin shows an example of using Sagemaker custom training, with Pytorch distributed training. 
diff --git a/cookbook/integrations/aws/sagemaker_training/README.rst b/cookbook/integrations/aws/sagemaker_training/README.rst index d3516f823f..1215aa0be4 100644 --- a/cookbook/integrations/aws/sagemaker_training/README.rst +++ b/cookbook/integrations/aws/sagemaker_training/README.rst @@ -3,6 +3,8 @@ AWS Sagemaker Training ====================== +.. tags:: Integration, MachineLearning, AWS, Advanced + This section provides examples of Flyte Plugins that are designed to work with AWS Hosted services like Sagemaker, EMR, Athena, Redshift etc diff --git a/cookbook/integrations/aws/sagemaker_training/sagemaker_builtin_algo_training.py b/cookbook/integrations/aws/sagemaker_training/sagemaker_builtin_algo_training.py index 2998733146..42d2eeb32f 100644 --- a/cookbook/integrations/aws/sagemaker_training/sagemaker_builtin_algo_training.py +++ b/cookbook/integrations/aws/sagemaker_training/sagemaker_builtin_algo_training.py @@ -1,6 +1,7 @@ """ Built-in Sagemaker Algorithms ############################# + This example will show how it is possible to work with built-in algorithms with Amazon Sagemaker and perform hyper-parameter optimization using Sagemaker HPO. diff --git a/cookbook/integrations/aws/sagemaker_training/sagemaker_custom_training.py b/cookbook/integrations/aws/sagemaker_training/sagemaker_custom_training.py index 74114a825f..bac48e8375 100644 --- a/cookbook/integrations/aws/sagemaker_training/sagemaker_custom_training.py +++ b/cookbook/integrations/aws/sagemaker_training/sagemaker_custom_training.py @@ -1,6 +1,7 @@ """ Custom Sagemaker Algorithms ########################### + This script shows an example of how to simply convert your tensorflow training scripts to run on Amazon Sagemaker with very few modifications. 
""" diff --git a/cookbook/integrations/external_services/airflow/README.rst b/cookbook/integrations/external_services/airflow/README.rst index 7fba60e12b..e605bebb74 100644 --- a/cookbook/integrations/external_services/airflow/README.rst +++ b/cookbook/integrations/external_services/airflow/README.rst @@ -1,6 +1,8 @@ Airflow Provider ================ +.. tags:: Integration, Intermediate + The ``airflow-provider-flyte`` package provides an operator, a sensor, and a hook that integrates Flyte into Apache Airflow. ``FlyteOperator`` is helpful to trigger a task/workflow in Flyte and ``FlyteSensor`` enables monitoring a Flyte execution status for completion. diff --git a/cookbook/integrations/external_services/hive/README.rst b/cookbook/integrations/external_services/hive/README.rst index c9a7d49f17..bde7e18152 100644 --- a/cookbook/integrations/external_services/hive/README.rst +++ b/cookbook/integrations/external_services/hive/README.rst @@ -1,6 +1,8 @@ Hive ==== +.. tags:: Integration, Data, Advanced + Flyte backend can be connected with various hive services. Once enabled it can allow you to query a hive service (e.g. Qubole) and retrieve typed schema (optionally). This section will provide how to use the Hive Query Plugin using flytekit python diff --git a/cookbook/integrations/external_services/hive/hive.py b/cookbook/integrations/external_services/hive/hive.py index 47a8f14ad6..2b07b1f47f 100644 --- a/cookbook/integrations/external_services/hive/hive.py +++ b/cookbook/integrations/external_services/hive/hive.py @@ -1,6 +1,6 @@ """ Hive Tasks ------------ +---------- Tasks often start with a data gathering step, and often that data is gathered through Hive. Flytekit allows users to run any kind of Hive query (including queries with multiple statements and staging query commands). 
diff --git a/cookbook/integrations/external_services/snowflake/README.rst b/cookbook/integrations/external_services/snowflake/README.rst index 503a388cbf..cdf3cc5294 100644 --- a/cookbook/integrations/external_services/snowflake/README.rst +++ b/cookbook/integrations/external_services/snowflake/README.rst @@ -1,6 +1,8 @@ Snowflake ========= +.. tags:: Integration, Data, Advanced, SQL + Flyte backend can be connected with snowflake service. Once enabled it can allow you to query a snowflake service. This section will provide how to use the Snowflake Query Plugin using flytekit python. diff --git a/cookbook/integrations/flytekit_plugins/dolt/README.rst b/cookbook/integrations/flytekit_plugins/dolt/README.rst index 22172f3b39..231c63d3af 100644 --- a/cookbook/integrations/flytekit_plugins/dolt/README.rst +++ b/cookbook/integrations/flytekit_plugins/dolt/README.rst @@ -1,6 +1,8 @@ Dolt ==== +.. tags:: Integration, Data, SQL, Intermediate + The ``DoltTable`` plugin is a wrapper that uses `Dolt `__ to move data between ``pandas.DataFrame``'s at execution time and database tables at rest. diff --git a/cookbook/integrations/flytekit_plugins/greatexpectations/README.rst b/cookbook/integrations/flytekit_plugins/greatexpectations/README.rst index 51e95ff0bc..1540bc713a 100644 --- a/cookbook/integrations/flytekit_plugins/greatexpectations/README.rst +++ b/cookbook/integrations/flytekit_plugins/greatexpectations/README.rst @@ -3,6 +3,8 @@ Great Expectations ================== +.. tags:: Integration, Data, DataFrame, Intermediate + **Great Expectations** is a Python-based open-source library for validating, documenting, and profiling your data. It helps maintain data quality and improve communication about data between teams. 
diff --git a/cookbook/integrations/flytekit_plugins/modin_examples/README.rst b/cookbook/integrations/flytekit_plugins/modin_examples/README.rst index 3d2ffd7c82..85a25ddbcb 100644 --- a/cookbook/integrations/flytekit_plugins/modin_examples/README.rst +++ b/cookbook/integrations/flytekit_plugins/modin_examples/README.rst @@ -1,6 +1,8 @@ Modin ====== +.. tags:: Integration, DataFrame, MachineLearning, Intermediate + Modin is a pandas-accelerator that helps handle large datasets. Pandas works gracefully with small datasets since it is inherently single-threaded, and designed to work on a single CPU core. With large datasets, the performance of pandas drops (becomes slow or runs out of memory) due to single core usage. diff --git a/cookbook/integrations/flytekit_plugins/onnx_examples/README.rst b/cookbook/integrations/flytekit_plugins/onnx_examples/README.rst index 748ecd190d..a7a1bccbe8 100644 --- a/cookbook/integrations/flytekit_plugins/onnx_examples/README.rst +++ b/cookbook/integrations/flytekit_plugins/onnx_examples/README.rst @@ -3,6 +3,8 @@ ONNX ==== +.. tags:: Integration, MachineLearning, Intermediate + Open Neural Network Exchange (`ONNX `__) is an open standard format for representing machine learning and deep learning models. It enables interoperability between different frameworks and streamlines the path from research to production. diff --git a/cookbook/integrations/flytekit_plugins/onnx_examples/tensorflow_onnx.py b/cookbook/integrations/flytekit_plugins/onnx_examples/tensorflow_onnx.py index c7089c396c..4362a991f5 100644 --- a/cookbook/integrations/flytekit_plugins/onnx_examples/tensorflow_onnx.py +++ b/cookbook/integrations/flytekit_plugins/onnx_examples/tensorflow_onnx.py @@ -1,6 +1,6 @@ """ TensorFlow Example -------------------- +------------------ In this example, we will see how to convert a tensorflow model to an ONNX model. 
diff --git a/cookbook/integrations/flytekit_plugins/pandera_examples/README.rst b/cookbook/integrations/flytekit_plugins/pandera_examples/README.rst index 836713dc43..ae14ad1802 100644 --- a/cookbook/integrations/flytekit_plugins/pandera_examples/README.rst +++ b/cookbook/integrations/flytekit_plugins/pandera_examples/README.rst @@ -1,6 +1,8 @@ Pandera ======= +.. tags:: Integration, DataFrame, Data, Intermediate + Flytekit python natively supports :ref:`many data types `, including a :ref:`FlyteSchema ` type for type-annotating pandas dataframes. The flytekit pandera plugin provides an alternative for diff --git a/cookbook/integrations/flytekit_plugins/pandera_examples/validating_and_testing_ml_pipelines.py b/cookbook/integrations/flytekit_plugins/pandera_examples/validating_and_testing_ml_pipelines.py index 5b0a772b6f..7913c4d365 100644 --- a/cookbook/integrations/flytekit_plugins/pandera_examples/validating_and_testing_ml_pipelines.py +++ b/cookbook/integrations/flytekit_plugins/pandera_examples/validating_and_testing_ml_pipelines.py @@ -2,6 +2,8 @@ Validating and Testing Machine Learning Pipelines ------------------------------------------------- +.. tags:: Integration, DataFrame, MachineLearning, Intermediate + In this example we'll show you how to use :ref:`pandera.SchemaModel ` to annotate dataframe inputs and outputs in an `sklearn `__ model-training pipeline. diff --git a/cookbook/integrations/flytekit_plugins/papermilltasks/README.rst b/cookbook/integrations/flytekit_plugins/papermilltasks/README.rst index 85a9b5195d..e36b57e990 100644 --- a/cookbook/integrations/flytekit_plugins/papermilltasks/README.rst +++ b/cookbook/integrations/flytekit_plugins/papermilltasks/README.rst @@ -1,6 +1,8 @@ Papermill ========= +.. tags:: Integration, Jupyter, Intermediate + It is possible to run a Jupyter notebook as a Flyte task using `papermill `_. 
Papermill executes the notebook as a whole, so before using this plugin, it is essential to construct your notebook as recommended by papermill. When using this plugin, there are a few important things to keep in mind: diff --git a/cookbook/integrations/flytekit_plugins/papermilltasks/simple.py b/cookbook/integrations/flytekit_plugins/papermilltasks/simple.py index d69aeb9e6e..be222c6dad 100644 --- a/cookbook/integrations/flytekit_plugins/papermilltasks/simple.py +++ b/cookbook/integrations/flytekit_plugins/papermilltasks/simple.py @@ -1,6 +1,7 @@ """ Jupyter Notebook Tasks ------------------------ +---------------------- + In this example, we will show how to create a flyte task that runs a simple notebook, accepts one input variable, transforms it, and produces one output. This can be generalized to multiple inputs and outputs. diff --git a/cookbook/integrations/flytekit_plugins/sql/README.rst b/cookbook/integrations/flytekit_plugins/sql/README.rst index 8ec6412131..0e848540f5 100644 --- a/cookbook/integrations/flytekit_plugins/sql/README.rst +++ b/cookbook/integrations/flytekit_plugins/sql/README.rst @@ -1,6 +1,8 @@ SQL === +.. tags:: Integration, Data, SQL, Intermediate + Flyte tasks are not always restricted to running user-supplied containers, nor even containers at all. Indeed, this is one of the most important design decisions in Flyte. Non-container tasks can have arbitrary targets for execution -- an API that executes SQL queries like SnowFlake, BigQuery, a synchronous WebAPI, etc. diff --git a/cookbook/integrations/flytekit_plugins/whylogs_examples/README.rst b/cookbook/integrations/flytekit_plugins/whylogs_examples/README.rst index e11c92e531..369ef923ee 100644 --- a/cookbook/integrations/flytekit_plugins/whylogs_examples/README.rst +++ b/cookbook/integrations/flytekit_plugins/whylogs_examples/README.rst @@ -1,6 +1,8 @@ whylogs ======= +.. 
tags:: Integration, Data, DataFrame, Intermediate + whylogs is an open source software that allows you to log and inspect differents aspects of your data and ML models. It creates efficient and mergeable statistical summaries of your datasets, called profiles, that have similar properties to logs produced by regular software applications. diff --git a/cookbook/integrations/gcp/bigquery/README.rst b/cookbook/integrations/gcp/bigquery/README.rst index 670d8e181a..c1496cdf6e 100644 --- a/cookbook/integrations/gcp/bigquery/README.rst +++ b/cookbook/integrations/gcp/bigquery/README.rst @@ -1,6 +1,8 @@ BigQuery ======== +.. tags:: GCP, Data, Integration, Advanced + Flyte backend can be connected with BigQuery service. Once enabled, it can allow you to query a BigQuery table. This section will provide how to use the BigQuery Plugin using flytekit python. diff --git a/cookbook/integrations/kubernetes/k8s_spark/README.rst b/cookbook/integrations/kubernetes/k8s_spark/README.rst index f8c9ee9b46..706790fce3 100644 --- a/cookbook/integrations/kubernetes/k8s_spark/README.rst +++ b/cookbook/integrations/kubernetes/k8s_spark/README.rst @@ -3,6 +3,8 @@ Kubernetes Spark Jobs ===================== +.. tags:: Spark, Integration, DistributedComputing, Data, Advanced + Flyte can execute Spark jobs natively on a Kubernetes Cluster, which manages a virtual cluster's lifecycle, spin-up, and tear down. It leverages the open-sourced `Spark On K8s Operator `__ and can be enabled without signing up for any service. This is like running a ``transient spark cluster``—a type of cluster spun up for a specific Spark job and torn down after completion. diff --git a/cookbook/integrations/kubernetes/k8s_spark/pyspark_pi.py b/cookbook/integrations/kubernetes/k8s_spark/pyspark_pi.py index 77b9279dfe..51e6c06d7e 100644 --- a/cookbook/integrations/kubernetes/k8s_spark/pyspark_pi.py +++ b/cookbook/integrations/kubernetes/k8s_spark/pyspark_pi.py @@ -2,7 +2,8 @@ .. 
_intermediate_using_spark_tasks: Writing a PySpark Task ------------------------- +---------------------- + Flyte has an optional plugin that makes it possible to run `Apache Spark `_ jobs natively on your kubernetes cluster. This plugin has been used extensively at Lyft and is battle tested. It makes it extremely easy to run your pyspark (coming soon to scala/java) code as a task. The plugin creates a new virtual cluster for the spark execution dynamically and Flyte will manage the execution, auto-scaling for the spark job. diff --git a/cookbook/integrations/kubernetes/kfmpi/README.rst b/cookbook/integrations/kubernetes/kfmpi/README.rst index 804a18a700..0e6058e916 100644 --- a/cookbook/integrations/kubernetes/kfmpi/README.rst +++ b/cookbook/integrations/kubernetes/kfmpi/README.rst @@ -3,6 +3,8 @@ MPI Operator ============ +.. tags:: Integration, DistributedComputing, MachineLearning, KubernetesOperator, Advanced + The upcoming example shows how to use MPI in Horovod. Horovod diff --git a/cookbook/integrations/kubernetes/kfpytorch/README.rst b/cookbook/integrations/kubernetes/kfpytorch/README.rst index e64d19433b..1fdafdc7b8 100644 --- a/cookbook/integrations/kubernetes/kfpytorch/README.rst +++ b/cookbook/integrations/kubernetes/kfpytorch/README.rst @@ -3,6 +3,8 @@ Kubeflow Pytorch ================ +.. tags:: Integration, DistributedComputing, MachineLearning, KubernetesOperator, Advanced + This plugin uses the Kubeflow Pytorch Operator and provides an extremely simplified interface for executing distributed training using various pytorch backends. 
Installation diff --git a/cookbook/integrations/kubernetes/kfpytorch/pytorch_mnist.py b/cookbook/integrations/kubernetes/kfpytorch/pytorch_mnist.py index 0d8b9c92ed..f6d2f92d56 100644 --- a/cookbook/integrations/kubernetes/kfpytorch/pytorch_mnist.py +++ b/cookbook/integrations/kubernetes/kfpytorch/pytorch_mnist.py @@ -1,6 +1,7 @@ """ Distributed Pytorch --------------------- +------------------- + This example is adapted from the default example available on Kubeflow's pytorch site. `here `_ It has been modified to show how to integrate it with Flyte and can be probably simplified and cleaned up. diff --git a/cookbook/integrations/kubernetes/kftensorflow/README.rst b/cookbook/integrations/kubernetes/kftensorflow/README.rst index 7fd1d82039..b98cf12268 100644 --- a/cookbook/integrations/kubernetes/kftensorflow/README.rst +++ b/cookbook/integrations/kubernetes/kftensorflow/README.rst @@ -1,6 +1,8 @@ Kubeflow TensorFlow =================== +.. tags:: Integration, DistributedComputing, MachineLearning, KubernetesOperator, Advanced + TensorFlow operator is useful to natively run distributed TensorFlow training jobs on Flyte. It is a wrapper built around `Kubeflow's TensorFlow operator `__. diff --git a/cookbook/integrations/kubernetes/pod/README.rst b/cookbook/integrations/kubernetes/pod/README.rst index 4cb324281a..8a3c34735e 100644 --- a/cookbook/integrations/kubernetes/pod/README.rst +++ b/cookbook/integrations/kubernetes/pod/README.rst @@ -1,6 +1,8 @@ Kubernetes Pods =============== +.. tags:: Integration, Kubernetes, Advanced + Flyte tasks (Python functions decorated with :py:func:`@task `) are essentially single functions loaded in one container. 
But often, there is a need to run a job with more than one container, in cases such as: diff --git a/cookbook/integrations/kubernetes/ray_example/README.rst b/cookbook/integrations/kubernetes/ray_example/README.rst index 59db220941..e7ff0fd1c5 100644 --- a/cookbook/integrations/kubernetes/ray_example/README.rst +++ b/cookbook/integrations/kubernetes/ray_example/README.rst @@ -1,6 +1,8 @@ KubeRay ======== +.. tags:: Integration, DistributedComputing, KubernetesOperator, Advanced + `KubeRay `__ is an open source toolkit to run Ray applications on Kubernetes. It provides tools to improve running and managing Ray on Kubernetes. - Ray Operator diff --git a/cookbook/integrations/kubernetes/ray_example/ray_example.py b/cookbook/integrations/kubernetes/ray_example/ray_example.py index 880485f721..7a290afa15 100644 --- a/cookbook/integrations/kubernetes/ray_example/ray_example.py +++ b/cookbook/integrations/kubernetes/ray_example/ray_example.py @@ -1,6 +1,7 @@ """ Ray Tasks ----------- +--------- + Ray task allows you to run a Ray job on an existing Ray cluster or create a Ray cluster by using the Ray operator. diff --git a/cookbook/larger_apps/README.rst b/cookbook/larger_apps/README.rst index 339ae63b8c..827326b9ea 100644 --- a/cookbook/larger_apps/README.rst +++ b/cookbook/larger_apps/README.rst @@ -1,7 +1,9 @@ .. _larger_apps: Building Large Apps --------------------- +------------------- + +.. tags:: Deployment, Intermediate So far in the *User Guide* you've been running Flyte workflows as one-off scripts, which is useful for quick prototyping and iteration of small ideas diff --git a/cookbook/larger_apps/larger_apps_deploy.py b/cookbook/larger_apps/larger_apps_deploy.py index 406d97a481..ec7e48ebac 100644 --- a/cookbook/larger_apps/larger_apps_deploy.py +++ b/cookbook/larger_apps/larger_apps_deploy.py @@ -2,7 +2,7 @@ .. 
_larger_apps_deploy: Deploy to the Cloud --------------------------------- +------------------- Prerequisites ^^^^^^^^^^^^^^^^ diff --git a/cookbook/larger_apps/larger_apps_iterate.py b/cookbook/larger_apps/larger_apps_iterate.py index a8be1d1d8c..d942bc1acf 100644 --- a/cookbook/larger_apps/larger_apps_iterate.py +++ b/cookbook/larger_apps/larger_apps_iterate.py @@ -2,7 +2,7 @@ .. _larger_apps_iterate: Iterate and Re-deploy ----------------------- +--------------------- In this guide, you'll learn how to iterate on and re-deploy your tasks and workflows. diff --git a/cookbook/larger_apps/larger_apps_setup.py b/cookbook/larger_apps/larger_apps_setup.py index 863a8a735b..11c13b439b 100644 --- a/cookbook/larger_apps/larger_apps_setup.py +++ b/cookbook/larger_apps/larger_apps_setup.py @@ -2,7 +2,7 @@ .. _larger_apps_build: Setup a Project ----------------- +--------------- Prerequisites ^^^^^^^^^^^^^^^^ diff --git a/cookbook/remote_access/README.rst b/cookbook/remote_access/README.rst index 00dc81ebe6..54169b81b0 100644 --- a/cookbook/remote_access/README.rst +++ b/cookbook/remote_access/README.rst @@ -3,6 +3,8 @@ Remote Access ------------- +.. tags:: Deployment, Remote, CLI, Intermediate + Flyte provides multiple ways of creating, registering, and inspecting Flyte backend entities. The main entities include Flyte tasks, workflows, launchplans, as well as their associated execution entities (for more details, see :ref:`divedeep`). This section diff --git a/cookbook/remote_access/register_project.py b/cookbook/remote_access/register_project.py index a9864614af..60246c6a6e 100644 --- a/cookbook/remote_access/register_project.py +++ b/cookbook/remote_access/register_project.py @@ -1,6 +1,6 @@ """ Creating a New Project -------------------------- +---------------------- Creates project to be used as a home for the flyte resources of tasks and workflows. 
Refer to the `flytectl API reference `__ diff --git a/cookbook/remote_access/remote_task.py b/cookbook/remote_access/remote_task.py index 76ffe9b8df..5a28a5e865 100644 --- a/cookbook/remote_access/remote_task.py +++ b/cookbook/remote_access/remote_task.py @@ -1,6 +1,6 @@ """ Running a Task --------------------- +-------------- Flytectl ========