Awesome Kedro

An opinionated Python framework for creating reproducible, maintainable and modular data science code.

Projects

Churn Prediction with Kedro by Laíza Parizotto, a project that tackles a data science challenge of predicting customer churn for a fictional financial institution, using Kedro to build an effective pipeline for a production-ready machine learning model.
Response Recommendation System for BarefootLaw by Kasun Amarasinghe, Carlos Caro, Nupoor Gandhi and Raphaelle Roffo, an extensive Data Science for Social Good (DSSG) project at Imperial College London that recommends responses to law-related queries.
Augury by Craig Franklin, machine-learning functionality for predicting AFL match results in the Tipresias app.
CausalLift by Yusuke Minami, a Python package for Uplift Modeling in real-world business.
PipelineX by Yusuke Minami, a Python package to develop pipelines for rapid Machine/Deep Learning experimentation using Kedro and MLflow. Example projects using PyTorch, Pandas, and OpenCV are available.
kedro-mlflow-example by Tom Goldenberg, a project that demonstrates how to integrate MLflow with a Kedro codebase.
kedro-wdbc-tf by Abhinav Prakash, a project that uses a Kedro template to create a deep learning workflow. The model was trained with TensorFlow on the WDBC (Breast Cancer) dataset.
twitter-sentiment-analysis by Avi Agarwal, a project that demonstrates how to use Kedro to train and evaluate an NLP-based machine learning model.
Anomaly Detection Pipeline with Kedro by Kenneth Leung, a project that demonstrates how to use Kedro for fraud detection on credit card transaction data using an Isolation Forest machine learning model.

Plugins

find-kedro - Automatically construct pipelines using pytest style pattern matching.
kedro-accelerator - Speeds up pipelines by parallelizing I/O in the background.
kedro-airflow - Makes it easy to deploy Kedro projects to Airflow.
kedro-airflow-k8s - Enables running a Kedro pipeline with Airflow on a Kubernetes cluster.
kedro-argo - Converts Kedro pipelines to Argo pipelines.
kedro-auto-catalog - A configurable replacement for kedro catalog create that allows you to create default dataset types other than MemoryDataset.
kedro-azureml - Enables running a Kedro pipeline with Azure ML Pipelines service.
kedro-dataframe-dropin - Lets you swap out pandas datasets for Modin or RAPIDS equivalents to speed up specialised workflows (e.g. on GPUs).
kedro-datasets - A collection of Kedro data connectors.
kedro-docker - Makes it easy to package Kedro projects with Docker.
kedro-dolt - Expands the data versioning abilities of data scientists and engineers.
kedro-great - The easiest way to integrate Kedro and Great Expectations.
kedro-grpc-server - Creates a gRPC server for your kedro pipelines.
kedro-kubeflow - Lets you run and schedule pipelines on Kubernetes clusters using Kubeflow Pipelines.
kedro-mlflow - Allows usage of MLflow in Kedro projects.
kedro-neptune - Integration of Kedro with Neptune.ai.
kedro-pandas-profiling - "Profiles" data in the catalog.
kedro-partitioned - Extends Kedro's functionality for processing partitioned data.
kedro-sagemaker - Enables running a Kedro pipeline with Amazon SageMaker service.
kedro-static-viz - Generates a static Kedro-Viz site (HTML, CSS, JS).
kedro-viz - Helps visualise Kedro data and analytics pipelines.
kedro-vertexai - Enables running a Kedro pipeline with Vertex AI Pipelines service.
kedro-wings - Automatically creates catalog entries to simplify Kedro pipeline writing.
more-kedro - (Hook) library for on-the-fly typing and validation of parameter dictionaries and default-value-backed data loading.
steel-toes - Prevents changes to downstream catalog data from affecting your teammates while you develop in parallel.
vineyard-kedro - A custom DataSet and Runner that enable sharing intermediate data between tasks in Kedro pipelines using Vineyard, a cloud-native in-memory object manager.

For more:

kedro-plugin topic on GitHub

Blog posts

For more:

#kedro tag on dev.to

Videos

Intros

What is Kedro? Why is it useful? A Non-Technical Intro to Kedro - An intro for management people.
PyConUS 2020 - Reproducible and maintainable data science code with Kedro
Principled Data Science Workflows
Production-level data pipelines that make everyone happy using Kedro
Kedro - Nubank ML Meetup (Portuguese)
Data Science Best Practices con Kedro (Spanish)

Howtos

Refactor your Jupyter notebooks using Kedro
Introduction to Kedro training with Joel Schwarzmann
Creating Shared Catalogs for your Kedro Projects on GitHub
Deployable REST Enabled Data Pipelines with Flask, Docker, Kedro
How to begin writing tests for your Pipelines
How To Customize Your Kedro CLI Options
How to Get/Write Data from/to a SQL Database - Use pandas.SQLTableDataSet or pandas.SQLQueryDataSet.
How to Lazily Evaluate Chunks of a Big Pandas DataFrame
How to Setup PySpark for your Kedro Pipeline
Kedro Great: Use Great Expectations with Ease! - Shows how to use kedro-great to validate, for example, data container metadata (columns, etc.).
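
The SQL how-to in the list above is about moving pandas DataFrames into and out of a database. As a rough illustration of that round trip, here is a minimal sketch in plain pandas rather than Kedro's own dataset classes. The in-memory SQLite database, the `customers` table and the helper names `save_table` and `load_query` are all hypothetical, chosen only for this example.

```python
import sqlite3

import pandas as pd


def save_table(df: pd.DataFrame, table_name: str, con: sqlite3.Connection) -> None:
    """Write a DataFrame to a SQL table, replacing the table if it exists."""
    df.to_sql(table_name, con, if_exists="replace", index=False)


def load_query(sql: str, con: sqlite3.Connection) -> pd.DataFrame:
    """Run a SQL query and return the result as a DataFrame."""
    return pd.read_sql_query(sql, con)


# An in-memory SQLite database stands in for a real database server.
con = sqlite3.connect(":memory:")

# Hypothetical data: which customers have churned.
customers = pd.DataFrame({"customer_id": [1, 2, 3], "churned": [0, 1, 0]})
save_table(customers, "customers", con)

# Read back only the rows of interest through a query.
churned = load_query("SELECT customer_id FROM customers WHERE churned = 1", con)
print(churned)
```

In a Kedro project the connection details and table names would normally be declared in the data catalog instead of being hard-coded in a script; the linked video covers that setup.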

For more:

@kedro-python on YouTube

News

Kedro Community Update - April 2023 - Kedro 0.18.7, the new OmegaConfigLoader, experiment tracking in Kedro-Viz, improvements to the Databricks workflow, and more.
Let's look at Kedro 0.17.0!
Kedro 0.16.0 was just Released! - Release notes (features) of Kedro 0.16.0 explained.
