An opinionated Python framework for creating reproducible, maintainable and modular data science code.
- Churn Prediction with Kedro by Laíza Parizotto, a project that tackles a data science challenge of predicting customer churn for a fictional financial institution, using Kedro to build an effective pipeline for a production-ready machine learning model.
- Response Recommendation System for BarefootLaw by Kasun Amarasinghe, Carlos Caro, Nupoor Gandhi and Raphaelle Roffo, an extensive Data Science for Social Good (DSSG) project at Imperial College London that recommends responses to law-related queries
- Augury by Craig Franklin, machine-learning functionality for predicting AFL match results in the Tipresias app
- CausalLift by Yusuke Minami, a Python package for Uplift Modeling in real-world business
- PipelineX by Yusuke Minami, a Python package to develop pipelines for rapid Machine/Deep Learning experimentation using Kedro and MLflow. Example projects using PyTorch, Pandas, and OpenCV are available.
- kedro-mlflow-example by Tom Goldenberg, a project that demonstrates how to integrate MLflow with a Kedro codebase
- kedro-wdbc-tf by Abhinav Prakash, a project that uses a Kedro template to create a deep learning workflow. The model was trained with TensorFlow on the wdbc (Breast Cancer) dataset.
- twitter-sentiment-analysis by Avi Agarwal, a project that demonstrates how to use Kedro to train and evaluate an NLP-based machine learning model.
- Anomaly Detection Pipeline with Kedro by Kenneth Leung, a project that demonstrates how to use Kedro for fraud detection on credit card transaction data using an Isolation Forest machine learning model.
- find-kedro - Automatically constructs pipelines using pytest-style pattern matching.
- kedro-accelerator - Speeds up pipelines by parallelizing I/O in the background.
- kedro-airflow - Makes it easy to deploy Kedro projects to Airflow.
- kedro-airflow-k8s - Enables running a Kedro pipeline with Airflow on a Kubernetes cluster.
- kedro-argo - Converts Kedro pipelines to Argo pipelines.
- kedro-auto-catalog - A configurable replacement for `kedro catalog create` that allows you to create default dataset types other than `MemoryDataset`.
- kedro-azureml - Enables running a Kedro pipeline with Azure ML Pipelines service.
- kedro-dataframe-dropin - Lets you swap out pandas datasets for modin or RAPIDS equivalents to speed up specialised workflows (e.g. on GPUs).
- kedro-datasets - A collection of Kedro data connectors.
- kedro-docker - Makes it easy to package Kedro projects with Docker.
- kedro-dolt - Allows you to expand the data versioning abilities of data scientists and engineers.
- kedro-great - The easiest way to integrate Kedro and Great Expectations.
- kedro-grpc-server - Creates a gRPC server for your kedro pipelines.
- kedro-kubeflow - Lets you run and schedule pipelines on Kubernetes clusters using Kubeflow Pipelines.
- kedro-mlflow - Allows usage of MLflow in Kedro projects.
- kedro-neptune - Integration of Kedro with Neptune.ai.
- kedro-pandas-profiling - "Profiles" data in the catalog.
- kedro-partitioned - Extends the functionality for processing partitioned data.
- kedro-sagemaker - Enables running a Kedro pipeline with Amazon SageMaker service.
- kedro-static-viz - Generates a static Kedro-Viz site (HTML, CSS, JS).
- kedro-viz - Helps visualise Kedro data and analytics pipelines.
- kedro-vertexai - Enables running a Kedro pipeline with Vertex AI Pipelines service.
- kedro-wings - Automatically creates catalog entries to simplify Kedro pipeline writing.
- more-kedro - (Hook) library for on-the-fly typing and validation of parameter dictionaries and default-value-backed data loading.
- steel-toes - Prevents you from changing downstream catalog data on your teammates while developing in parallel.
- vineyard-kedro - Custom `DataSet` and `Runner` which enable sharing intermediate data between tasks in Kedro pipelines using Vineyard, a cloud-native in-memory object manager.
For more:
- kedro-plugin topic on GitHub
- Building and Managing Data Science Pipelines with Kedro
- Deploying Kedro Pipelines to Apache Airflow
- Writing your first kedro Nodes
- Setting Parameters in kedro
- Add New Dependencies to Your Kedro Project
- Running your Kedro Pipeline from the command line
- kedro Virtual Environment
- Kedro Pipeline Create
- Kedro Install
- Kedro Git Init
- Kedro New
- What is Kedro
- How I Kedro
- Incremental Versioned Datasets in Kedro
- Productionizing ML Pipelines with Airflow, Kedro, and Great Expectations
- Change Data Capture With Kedro and Dolt
- Applying data engineering to applications with Kedro
- Running Machine Learning Pipelines with Kedro, Kubeflow and Airflow
- Introducing Kedro: Yetunde Dada, Principal Product Manager at QuantumBlack
- Standardization of End-to-End Data Pipeline for AI Project Using Kedro
- Using Kedro pipelines to train Amazon SageMaker models
- Kedro 6 Months In
- Jungle Scout case study: Kedro, Airflow, and MLFlow use on production code
- Building a Production-Level Data Pipeline Using Kedro
- Designing a "Router" for kedro
- Power is nothing without control
- Start small and grow big MLOps2020
- Get Started with Machine Learning Pipelines at Kedro
- Mid Meet Py - Ep.14 - Interview with Waylon Walker
- How to find datasets in your kedro catalog
- How Kedro handles your inputs
- Post mortem debugging sessions with Kedro hooks
- Create Configurable Kedro Hooks
- What's an example use case of Kedro?
- Make Notebook Pipeline with Kedro+Papermill
- 25 Hot New Data Tools and What They DON’T do
- Kedro Hooks Intro - creating the kedro-preflight hook
- Next Generation Data Science and Data Engineering Frameworks
- Understanding best-practice Python tooling by comparing popular project templates
- A story using the Kedro pipeline library
- Transparent data flow with Kedro
- Comparison of Python pipeline packages: Airflow, Luigi, Metaflow, Kedro & PipelineX
- Kedro in Jupyter Notebooks On Google GCP Dataproc
- Building a Pipeline with Kedro for an ML Competition
- Using Kedro and MLflow Deploying and versioning data pipelines at scale
- Ship Faster With An Opinionated Data Pipeline Framework, Episode 100
- Some cool open-source Python packages for Machine Learning
- Kedro: A New Tool For Data Science
- The latest and greatest in Kedro — We’re growing our community
- Kedro-Airflow 0.4.0 — Orchestrating Kedro Pipelines with Airflow
- Beyond the Notebook and into the Data Science Framework Revolution
- Element AI uses Kedro to apply research and develop enterprise AI models
- Introducing Kedro Hooks
- Getting Started with Kedro
- Introducing Kedro
- Deploying and Versioning Data Pipelines at Scale
- Kedro hands-on Build your own demographics atlas. Pt. 2: building footprints classification
- Kedro (Python template for production-quality ML data pipelines)
- Enhance your kedro experiences with these tips
- Kedro: The Best Python Framework for Data Science!!
- kedro-in-6-months
- Deploying a Recommendation System the Kedro Way
- Efficient Data Sharing in Data Science Pipelines on Kubernetes
For more:
- #kedro tag on dev.to
- What is Kedro? Why is it useful? A Non-Technical Intro to Kedro - An intro for management people.
- PyCon US 2021 - Reproducible and maintainable data science code with Kedro
- Principled Data Science Workflows
- Production-level data pipelines that make everyone happy using Kedro
- Kedro - Nubank ML Meetup (Portuguese)
- Data Science Best Practices con Kedro (Spanish)
- Refactor your Jupyter notebooks using Kedro
- Introduction to Kedro training with Joel Schwarzmann
- Creating Shared Catalogs for your Kedro Projects on GitHub
- Deployable REST Enabled Data Pipelines with Flask, Docker, Kedro
- How to begin writing tests for your Pipelines
- How To Customize Your Kedro CLI Options
- How to Get/Write Data from/to a SQL Database - Use `pandas.SQLTableDataSet` or `pandas.SQLQueryDataSet`.
- How to Lazily Evaluate Chunks of a Big Pandas DataFrame
- How to Setup PySpark for your Kedro Pipeline
- Kedro Great: Use Great Expectations with Ease! - Shows how to use kedro-great to validate, for example, data container metadata (columns, etc.).
For more:
### News
- Kedro Community Update - April 2023 - Kedro 0.18.7, new `OmegaConfigLoader`, experiment tracking in Kedro Viz, improvements in Databricks workflow, and more.
- Let's look at Kedro 0.17.0!
- Kedro 0.16.0 was just Released! - Release notes (features) of Kedro 0.16.0 explained.