From 0464f5ebf19b4e801c631a0f38ad01e8ae02000a Mon Sep 17 00:00:00 2001
From: Ketan Umare
Date: Tue, 11 May 2021 14:46:01 -0700
Subject: [PATCH 1/3] Update README.md

Signed-off-by: Ketan Umare
---
 README.md | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/README.md b/README.md
index d6333c0761..886ad51bc2 100644
--- a/README.md
+++ b/README.md
@@ -139,13 +139,14 @@ To dig deeper into Flyte, refer to the [Documentation](https://docs.flyte.org/en
 - Snappy Console
 - Python CLI and Golang CLI (flytectl)
 - Written in **Golang** and optimized for large running jobs' performance
+- [Grafana templates](https://grafana.com/orgs/flyte) (user/system observability)
 
 ### In Progress
 
-- Grafana templates (user/system observability)
-- Helm chart for Flyte
-- Performance optimization
-- Flink-K8s
+- Helm chart for Flyte (coming soon - June)
+- Flink-K8s (coming soon - June)
+- One click deploy to AWS
+- Reactive pipelines & Events
 
 ## 🔌 Available Plugins
 
@@ -162,11 +163,10 @@ To dig deeper into Flyte, refer to the [Documentation](https://docs.flyte.org/en
 - Distributed Tensorflow (K8s Native) - [TFOperator](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/kubernetes/kftensorflow/index.html)
 - Papermill notebook execution ([Python](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/flytekit_plugins/papermilltasks/index.html) and Spark)
 - Type safe and data checking for Pandas dataframe using Pandera
+- Versioned datastores using DoltHub and Dolt
+- Use SQLAlchemy to query any relational database
+- Build your own plugins that use library containers
 
-### In Queue
-
-- Reactive pipelines
-- A lot more integrations!

From fd0ed784ec888bb3405c1ebb75a64cbefede81dd Mon Sep 17 00:00:00 2001
From: Samhita Alla
Date: Wed, 12 May 2021 18:52:55 +0530
Subject: [PATCH 2/3] Update README (#1024)

* updated readme

Signed-off-by: Samhita Alla

* updated readme

Signed-off-by: Samhita Alla

* removed css

Signed-off-by: Samhita Alla

* moved from html to md

Signed-off-by: Samhita Alla
---
 README.md | 24 ++++++++++++++++++------
 1 file changed, 18 insertions(+), 6 deletions(-)

diff --git a/README.md b/README.md
index 886ad51bc2..c6fed2ec8e 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,7 @@

-Flyte is a production-grade, container-native, type-safe workflow and pipelines platform optimized for large scale processing and machine learning written in Golang
+ Flyte is a workflow automation platform for complex, mission-critical data and ML processes at scale

@@ -56,6 +56,13 @@ Flyte is a production-grade, container-native, type-safe workfl
+## ⏳ Five Reasons to Use Flyte
+- Kubernetes-Native Workflow Automation Platform
+- Ergonomic SDK's in Python, Java & Scala
+- Versioned & Auditable
+- Reproducible Pipelines
+- Strong Data Typing
+
 ## 💥 Introduction
 
 Flyte is a structured programming and distributed processing platform that enables highly concurrent, scalable and maintainable workflows for `Machine Learning` and `Data Processing`. It is a fabric that connects disparate computation backends using a type safe data dependency graph. It records all changes to a pipeline, making it possible to rewind time. It also stores
@@ -104,18 +111,22 @@ To dig deeper into Flyte, refer to the [Documentation](https://docs.flyte.org/en
 - Used at _Scale_ in production by **500+** users at Lyft with more than **1 million** executions and **40+ million** container executions per month
+- A data aware platform
 - Enables **collaboration across your organization**, as in:
   - Execute distributed data pipelines/workflows
   - Reuse tasks across projects, users, and workflows
+  - Makes it easy to stitch together workflows from different teams and domain experts
   - Backtrace to a specified workflow
   - Compare results of training workflows over time and across pipelines
   - Share workflows and tasks across your teams
+  - Simplifies the complexity of multi-step, multi-owner workflows
-- **[Quick registration](https://docs.flyte.org/projects/cookbook/en/latest/tutorial.html)** -- start locally and scale to the cloud instantly
+- **[Quick registration](https://docs.flyte.org/en/latest/getting_started.html)** -- start locally and scale to the cloud instantly
 - **Centralized Inventory** constituting Tasks, Workflows and Executions
 - **gRPC / REST** interface to define and execute tasks and workflows
 - **Type safe** construction of pipelines -- each task has an interface which is characterized by its input and output; thus, illegal construction of pipelines fails during declaration rather than at runtime
-- Supports multiple **[data types](https://docs.flyte.org/projects/cookbook/en/latest/core.html)** for machine learning and data processing pipelines, such as Blobs (images, arbitrary files), Directories, Schema (columnar structured data), collections, maps etc.
+- Supports multiple **[data types](https://docs.flyte.org/projects/cookbook/en/latest/auto/type_system/index.html)** for machine learning and data processing pipelines, such as Blobs (images, arbitrary files), Directories, Schema (columnar structured data), collections, maps etc.
 - Memoization and Lineage tracking
+- Provides logging and observability
 - Workflow features:
   - Start with one task, convert to a pipeline, attach **[multiple schedules](https://docs.flyte.org/projects/cookbook/en/latest/auto/deployment/workflow/lp_schedules.html)**, trigger using a programmatic API, or on-demand
   - Parallel step execution
@@ -133,6 +144,7 @@ To dig deeper into Flyte, refer to the [Documentation](https://docs.flyte.org/en
 - Declarative pipelines
 - **Multi cloud support** (AWS, GCP and others)
 - Extensible core, modularized, and deep observability
+- No single point of failure and is resilient by design
 - Automated notifications to Slack, Email, and Pagerduty
 - [Multi K8s cluster support](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/kubernetes/pod/index.html)
 - Out of the box support to run **[Spark jobs on K8s](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/kubernetes/k8s_spark/index.html)**, **[Hive queries](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/external_services/hive/index.html)**, etc.
@@ -159,10 +171,10 @@ To dig deeper into Flyte, refer to the [Documentation](https://docs.flyte.org/en
 - [Qubole Hive](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/external_services/hive/index.html)
 - Presto Queries
 - Distributed Pytorch (K8s Native) -- [Pytorch Operator](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/kubernetes/kfpytorch/index.html)
-- Sagemaker([builtin algorithms](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/aws/sagemaker_training/sagemaker_builtin_algo_training.html) & [custom models](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/aws/sagemaker_training/sagemaker_custom_training.html))
-- Distributed Tensorflow (K8s Native) - [TFOperator](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/kubernetes/kftensorflow/index.html)
+- Sagemaker ([builtin algorithms](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/aws/sagemaker_training/sagemaker_builtin_algo_training.html) & [custom models](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/aws/sagemaker_training/sagemaker_custom_training.html))
+- Distributed Tensorflow (K8s Native) -- [TFOperator](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/kubernetes/kftensorflow/index.html)
 - Papermill notebook execution ([Python](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/flytekit_plugins/papermilltasks/index.html) and Spark)
-- Type safe and data checking for Pandas dataframe using Pandera
+- [Type safe and data checking for Pandas dataframe](https://docs.flyte.org/projects/cookbook/en/latest/auto/integrations/flytekit_plugins/pandera/index.html) using Pandera
 - Versioned datastores using DoltHub and Dolt
 - Use SQLAlchemy to query any relational database
 - Build your own plugins that use library containers

From 27d003cd1ecf3315efcaddcc156c2ec7e29168ee Mon Sep 17 00:00:00 2001
From: SandraGH5 <80421934+SandraGH5@users.noreply.github.com>
Date: Wed, 12 May 2021 11:34:58 -0700
Subject: [PATCH 3/3] Update README.md

---
 README.md | 49 ++++++++++++++++++++++++++-----------------------
 1 file changed, 26 insertions(+), 23 deletions(-)

diff --git a/README.md b/README.md
index c6fed2ec8e..46ba72be7b 100644
--- a/README.md
+++ b/README.md
@@ -56,23 +56,24 @@
-## ⏳ Five Reasons to Use Flyte
-- Kubernetes-Native Workflow Automation Platform
-- Ergonomic SDK's in Python, Java & Scala
-- Versioned & Auditable
-- Reproducible Pipelines
-- Strong Data Typing
-
 ## 💥 Introduction
 
 Flyte is a structured programming and distributed processing platform that enables highly concurrent, scalable and maintainable workflows for `Machine Learning` and `Data Processing`. It is a fabric that connects disparate computation backends using a type safe data dependency graph. It records all changes to a pipeline, making it possible to rewind time. It also stores
 a history of all executions and provides an intuitive UI, CLI and REST/gRPC API to interact with the computation.
 
-Flyte is more than a workflow engine -- it provides `workflow` as a core concept and a single unit of execution called `task` as a top level concept. Multiple tasks arranged in a data
+Flyte is more than a workflow engine -- it uses a `workflow` as a core concept and a `task` (a single unit of execution) as a top level concept. Multiple tasks arranged in a data
 producer-consumer order create a workflow.
 
 `Workflows` and `Tasks` can be written in any language, with out of the box support for [Python](https://github.com/flyteorg/flytekit), [Java and Scala](https://github.com/spotify/flytekit-java).
+
+## ⏳ Five Reasons to Use Flyte
+- Kubernetes-Native Workflow Automation Platform
+- Ergonomic SDK's in Python, Java & Scala
+- Versioned & Auditable
+- Reproducible Pipelines
+- Strong Data Typing
+

 🚀 Quick Start
@@ -87,7 +88,7 @@ With [docker installed](https://docs.docker.com/get-docker/), run the following
 This creates a local Flyte sandbox. Once the sandbox is ready, you should see the following message: `Flyte is ready! Flyte UI is available at http://localhost:30081/console`.
 
-Go ahead and visit http://localhost:30081/console to view the Flyte dashboard.
+Visit http://localhost:30081/console to view the Flyte dashboard.
 
 Here's a quick visual tour of the console.
@@ -112,19 +113,19 @@ To dig deeper into Flyte, refer to the [Documentation](https://docs.flyte.org/en
 - Used at _Scale_ in production by **500+** users at Lyft with more than **1 million** executions and **40+ million** container executions per month
 - A data aware platform
-- Enables **collaboration across your organization**, as in:
-  - Execute distributed data pipelines/workflows
-  - Reuse tasks across projects, users, and workflows
-  - Makes it easy to stitch together workflows from different teams and domain experts
-  - Backtrace to a specified workflow
-  - Compare results of training workflows over time and across pipelines
-  - Share workflows and tasks across your teams
-  - Simplifies the complexity of multi-step, multi-owner workflows
+- Enables **collaboration across your organization** by:
+  - Executing distributed data pipelines/workflows
+  - Reusing tasks across projects, users, and workflows
+  - Making it easy to stitch together workflows from different teams and domain experts
+  - Backtracing to a specified workflow
+  - Comparing results of training workflows over time and across pipelines
+  - Sharing workflows and tasks across your teams
+  - Simplifying the complexity of multi-step, multi-owner workflows
 - **[Quick registration](https://docs.flyte.org/en/latest/getting_started.html)** -- start locally and scale to the cloud instantly
 - **Centralized Inventory** constituting Tasks, Workflows and Executions
 - **gRPC / REST** interface to define and execute tasks and workflows
-- **Type safe** construction of pipelines -- each task has an interface which is characterized by its input and output; thus, illegal construction of pipelines fails during declaration rather than at runtime
-- Supports multiple **[data types](https://docs.flyte.org/projects/cookbook/en/latest/auto/type_system/index.html)** for machine learning and data processing pipelines, such as Blobs (images, arbitrary files), Directories, Schema (columnar structured data), collections, maps etc.
+- **Type safe** construction of pipelines -- each task has an interface which is characterized by its input and output, so illegal construction of pipelines fails during declaration rather than at runtime
+- Supports multiple **[data types](https://docs.flyte.org/projects/cookbook/en/latest/auto/type_system/index.html)** for machine learning and data processing pipelines, such as Blobs (images, arbitrary files), Directories, Schema (columnar structured data), collections, maps, etc.
 - Memoization and Lineage tracking
 - Provides logging and observability
 - Workflow features:
@@ -214,16 +215,18 @@ To dig deeper into Flyte, refer to the [Documentation](https://docs.flyte.org/en

-Here are the resources that would help you get a better understanding of Flyte.
+Here are some resources to help you learn more about Flyte.
 
 ### Communication Channels
 
 - [Slack Org](https://forms.gle/UVuek9WfBoweiqcJA)
-- [Email list](https://groups.google.com/a/flyte.org/g/users)
+- [Email list](https://groups.google.com/u/0/a/flyte.org/g/users)
+- [Twitter](https://twitter.com/flyteorg)
+- [LinkedIn Discussion Group](https://www.linkedin.com/groups/13962256/)
 
 ### Biweekly Community Sync
 
-- 📣 **Flyte OSS Community Sync** happens every alternate Tuesday, 9am-10am PDT ([Checkout the events calendar & subscribe](https://calendar.google.com/calendar/embed?src=admin%40flyte.org&ctz=America%2FLos_Angeles)). Here's the [zoom link](https://us04web.zoom.us/j/71298741279?pwd=TDR1RUppQmxGaDRFdzBOa2lHN1dsZz09).
+- 📣 **Flyte OSS Community Sync** happens every other Tuesday, 9am-10am PDT ([Checkout the events calendar](https://calendar.google.com/calendar/embed?src=admin%40flyte.org&ctz=America%2FLos_Angeles)). Here's the [zoom link](https://us04web.zoom.us/j/71298741279?pwd=TDR1RUppQmxGaDRFdzBOa2lHN1dsZz09).
 - Meeting notes and backlog of topics are captured in [doc](https://docs.google.com/document/d/1Jb6eOPOzvTaHjtPEVy7OR2O5qK1MhEs3vv56DX2dacM/edit#heading=h.c5ha25xc546e).
 - If you'd like to revisit any community sync meeting that has happened, you can access the [video recordings](https://www.youtube.com/channel/UCNduEoLOToNo3nFVly-vUTQ).
 
@@ -240,7 +243,7 @@ Here are the resources that would help you get a better understanding of Flyte.
 
 ### Blog Posts
 
-[Blog site](https://blog.flyte.org/)
+[Flyte blog site](https://blog.flyte.org/)
 
 ### Podcasts