merged step1 and intro and other fixes in #29

iterative · Mar 10, 2021 · f142413
1 parent 1c0cd33

Showing 12 changed files with 28 additions and 40 deletions.
diff --git a/...rted/stages/02-manual-data-preparation.md b/...rted/stages/01-manual-data-preparation.md
diff --git a/get-started/stages/01-whats-a-stage.md b/get-started/stages/01-whats-a-stage.md
diff --git a/get-started/stages/03-adding-a-stage.md b/get-started/stages/02-adding-a-stage.md
diff --git a/get-started/stages/04-running-a-stage.md b/get-started/stages/03-running-a-stage.md
diff --git a/...tarted/stages/05-how-dvc-tracks-stages.md b/...tarted/stages/04-how-dvc-tracks-stages.md
diff --git a/...d/stages/06-how-directories-are-cached.md b/...d/stages/05-how-directories-are-cached.md
diff --git a/...rted/stages/07-add-featurization-stage.md b/...rted/stages/06-add-featurization-stage.md
diff --git a/...started/stages/08-reproduce-a-pipeline.md b/...started/stages/07-reproduce-a-pipeline.md
diff --git a/...arted/stages/09-visualize-the-pipeline.md b/...arted/stages/08-visualize-the-pipeline.md
diff --git a/get-started/stages/10-ending.md b/get-started/stages/09-ending.md
diff --git a/get-started/stages/index.json b/get-started/stages/index.json
@@ -7,43 +7,39 @@
         "steps": [
             {
                 "title": "Step 1",
-                "text": "01-whats-a-stage.md"
+                "text": "01-manual-data-preparation.md"
             },
             {
                 "title": "Step 2",
-                "text": "02-manual-data-preparation.md"
+                "text": "02-adding-a-stage.md"
             },
             {
                 "title": "Step 3",
-                "text": "03-adding-a-stage.md"
+                "text": "03-running-a-stage.md"
             },
             {
                 "title": "Step 4",
-                "text": "04-running-a-stage.md"
+                "text": "04-how-dvc-tracks-stages.md"
             },
             {
                 "title": "Step 5",
-                "text": "05-how-dvc-tracks-stages.md"
+                "text": "05-how-directories-are-cached.md"
             },
             {
                 "title": "Step 6",
-                "text": "06-how-directories-are-cached.md"
+                "text": "06-add-featurization-stage.md"
             },
             {
                 "title": "Step 7",
-                "text": "07-add-featurization-stage.md"
+                "text": "07-reproduce-a-pipeline.md"
             },
             {
                 "title": "Step 8",
-                "text": "08-reproduce-a-pipeline.md"
-            },
-            {
-                "title": "Step 9",
-                "text": "09-visualize-the-pipeline.md"
+                "text": "08-visualize-the-pipeline.md"
             },
             {
                 "title": "Congratulations!",
-                "text": "10-ending.md"
+                "text": "09-ending.md"
             }
         ],
         "intro": {

diff --git a/get-started/stages/intro.md b/get-started/stages/intro.md
@@ -1,17 +1,27 @@
-The commands that we have seen so far (`add`, `push`, `pull`, etc.) provide a
-useful framework to track, save, and share models and large data files. In some
-cases and projects, this could be all you need.
-
-Usually, in ML projects, you need to process data and generate outputs in a
+In ML projects, we usually need to process data and generate outputs in a
 reproducible way. This requires establishing a connection between the data
-processed, the program that processes them, its parameters and the outputs.
-
-In a typical machine learning project we have the following stages: 
+processed, the program that processes them, its parameters, and the outputs.
 
 ![](/dvc/courses/get-started/stages/assets/example-flow.png)
 
 This process is reflected in DVC with a [data pipeline][bcpipeline]. In this
-scenario we begin to build pipelines using stage definitions and connect them
+scenario, we begin to build pipelines using stage definitions and connect them
 together.
 
 [bcpipeline]: https://dvc.org/doc/user-guide/basic-concepts/pipeline
+
+[Stages][bcstage] are the basic building blocks of pipelines in DVC. They define
+and execute an action, like data import or feature extraction, and usually
+produce some output.
+
+[bcstage]: https://dvc.org/doc/user-guide/basic-concepts/stage
+
+We have already provided a machine learning project in `~/project`. We added
+source files to `~/project/src/`, downloaded data to `data/data.xml`, and
+reduced its size. You can review these steps in more detail in the [Data and
+Model Versioning][v] and [Accessing Data and Models][a] scenarios.
+
+[v]: https://katacoda.com/dvc/courses/get-started/versioning
+[a]: https://katacoda.com/dvc/courses/get-started/accessing
+
+You can use the editor to browse the project.