start: add meta title, description and key term (data pipelines) [SEO] #1857

Merged
merged 8 commits into from
Oct 29, 2020
14 changes: 11 additions & 3 deletions content/docs/start/data-pipelines.md
@@ -1,4 +1,10 @@
# Data Pipelines
---
title: 'Get Started: Data Pipelines'
description: 'Get started with pipelines in DVC. Learn how to capture, organize,
version, and reproduce your data science and machine learning workflows.'
---

# Get Started: Data Pipelines

Versioning large data files and directories for data science is great, but not
enough. How is data filtered, transformed, or used to train ML models? DVC
@@ -7,7 +13,9 @@ that produce a final result.

DVC pipelines and their data can also be easily versioned (using Git). This
allows you to better organize your project, and reproduce your workflow and
results later exactly as they were built originally!
results later exactly as they were built originally! This allows you to capture
simple ETL workflows, better organize data science projects, or build detailed
machine learning pipelines (to name a few uses).

## Pipeline stages

@@ -300,7 +308,7 @@ important problems:
and which commands will generate the pipeline results (such as an ML model).
Storing these files in Git makes it easy to version and share.
- _Continuous Delivery and Continuous Integration (CI/CD) for ML_ - describing
projects in way that it can be reproduced (built) is the fist necessary step
projects in way that it can be reproduced (built) is the first necessary step
before introducing CI/CD systems. See our sister project,
[CML](https://cml.dev/) for some examples.
