From c31d9713cbc2ffd2d4903db8a23c902fd18393c7 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel <jorge@orpinel.com>
Date: Tue, 19 Nov 2019 18:59:06 -0600
Subject: [PATCH 1/8] use-cases: address smaller points from review (#795)

---
 static/docs/use-cases/data-registry.md | 17 ++++++++---------
 1 file changed, 8 insertions(+), 9 deletions(-)
diff --git a/static/docs/use-cases/data-registry.md b/static/docs/use-cases/data-registry.md
index a5eead5b21..937c6e9d72 100644
--- a/static/docs/use-cases/data-registry.md
+++ b/static/docs/use-cases/data-registry.md
@@ -13,13 +13,13 @@ example, project A may use a data file to begin its data
 same file; Instead of
 [adding it](/doc/command-reference/add#example-single-file) it to both projects,
 B can simply import it from A. Furthermore, the version of the data file
-imported to B can be an older iteration than what's currently used in A.
+imported to B can be different than what's currently used in A.
 
 Keeping this in mind, we could build a <abbr>DVC project</abbr> dedicated to
 tracking and versioning datasets (or any kind of large files). This way we would
-have a repository that has all the metadata and change history for the project's
-data. We can see who updated what, and when; use pull requests to update data
-the same way you do with code; and we don't need ad-hoc conventions to store
+have a repository with all the metadata and history of changes in the project's
+data. We can see who updated what, and when, use pull requests to update data
+the same way you do with code, and we don't need ad-hoc conventions to store
 different data versions. Other projects can share the data in the registry by
 downloading (`dvc get`) or importing (`dvc import`) them for use in different
 data processes.
@@ -28,9 +28,8 @@ The advantages of using a DVC **data registry** project are:
 
 - Data as code: Improve _lifecycle management_ with versioning of simple
   directory structures (like Git for your cloud storage), without ad-hoc
-  conventions. Leverage Git and Git hosting features such as change history,
-  branching, pull requests, reviews, and even continuous deployment of ML
-  models.
+  conventions. Leverage Git and Git hosting features such as commits, branching,
+  pull requests, reviews, and even continuous deployment of ML models.
 - Reusability: Reproduce and organize _feature stores_ with a simple CLI
   (`dvc get` and `dvc import` commands, similar to software package management
   systems like `pip`).
@@ -49,8 +48,8 @@ The advantages of using a DVC **data registry** project are:
 
 ## Example
 
-A dataset we use for several of our examples and tutorials is one containing
-2800 images of cats and dogs. We partitioned the dataset in two for our
+A dataset we use for several of our examples and tutorials contains 2800 images
+of cats and dogs. We partitioned the dataset in two for our
 [Versioning Tutorial](/doc/tutorials/versioning), and backed up the parts on a
 storage server, downloading them with `wget` in our examples. This setup was
 then revised to download the dataset with `dvc get` instead, so we created the

From 6002cba2d1e166cd1b628212382531340db6a396 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel <jorge@orpinel.com>
Date: Wed, 20 Nov 2019 18:25:05 -0600
Subject: [PATCH 2/8] use-cases: reinforce hypothetical phrasing in data
 registry intro paragraph

per https://github.com/iterative/dvc.org/issues/795#issuecomment-556114361
---
 static/docs/use-cases/data-registry.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/static/docs/use-cases/data-registry.md b/static/docs/use-cases/data-registry.md
index 937c6e9d72..eccaeedb15 100644
--- a/static/docs/use-cases/data-registry.md
+++ b/static/docs/use-cases/data-registry.md
@@ -10,7 +10,7 @@ different projects (similar to package management systems, but for data), DVC
 also includes the `dvc get`, `dvc import`, and `dvc update` commands. For
 example, project A may use a data file to begin its data
 [pipeline](/doc/command-reference/pipeline), but project B also requires this
-same file; Instead of
+same file. Instead of
 [adding it](/doc/command-reference/add#example-single-file) it to both projects,
 B can simply import it from A. Furthermore, the version of the data file
 imported to B can be different than what's currently used in A.
@@ -18,13 +18,13 @@ imported to B can be different than what's currently used in A.
 Keeping this in mind, we could build a <abbr>DVC project</abbr> dedicated to
 tracking and versioning datasets (or any kind of large files). This way we would
 have a repository with all the metadata and history of changes in the project's
-data. We can see who updated what, and when, use pull requests to update data
-the same way you do with code, and we don't need ad-hoc conventions to store
-different data versions. Other projects can share the data in the registry by
-downloading (`dvc get`) or importing (`dvc import`) them for use in different
-data processes.
+data. We could see who updated what, and when, use pull requests to update data
+(the same way we do with code), and avoid ad-hoc conventions to store different
+data versions. This is what we call a data registry. Other projects can share
+datasets in a registry by downloading (`dvc get`) or importing (`dvc import`)
+them for use in different data processes.
 
-The advantages of using a DVC **data registry** project are:
+Advantages of using a DVC **data registry** project:
 
 - Data as code: Improve _lifecycle management_ with versioning of simple
   directory structures (like Git for your cloud storage), without ad-hoc

From 47ebae5868f88b11b6fda55b70a7b6df48b6c9d9 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel <jorge@orpinel.com>
Date: Wed, 20 Nov 2019 18:50:45 -0600
Subject: [PATCH 3/8] use-cases: partitioned->split in data registry case

per #795
and https://github.com/iterative/dvc.org/issues/795#issuecomment-556114361
---
 static/docs/use-cases/data-registry.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/static/docs/use-cases/data-registry.md b/static/docs/use-cases/data-registry.md
index eccaeedb15..adcc0a7990 100644
--- a/static/docs/use-cases/data-registry.md
+++ b/static/docs/use-cases/data-registry.md
@@ -49,7 +49,7 @@ Advantages of using a DVC **data registry** project:
 ## Example
 
 A dataset we use for several of our examples and tutorials contains 2800 images
-of cats and dogs. We partitioned the dataset in two for our
+of cats and dogs. We split the dataset in two for our
 [Versioning Tutorial](/doc/tutorials/versioning), and backed up the parts on a
 storage server, downloading them with `wget` in our examples. This setup was
 then revised to download the dataset with `dvc get` instead, so we created the

From a578c15d58384a25ac85fb9e1fa6c5b6f163e521 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel <jorge@orpinel.com>
Date: Wed, 20 Nov 2019 18:56:50 -0600
Subject: [PATCH 4/8] use-cases: greatly simplify mention about project
 inter-dependency in data reg

per https://github.com/iterative/dvc.org/issues/795#issuecomment-556114361
and https://github.com/iterative/dvc.org/issues/795#issuecomment-556651871
---
 static/docs/use-cases/data-registry.md | 10 +++-------
 1 file changed, 3 insertions(+), 7 deletions(-)

diff --git a/static/docs/use-cases/data-registry.md b/static/docs/use-cases/data-registry.md
index adcc0a7990..45fc308360 100644
--- a/static/docs/use-cases/data-registry.md
+++ b/static/docs/use-cases/data-registry.md
@@ -7,13 +7,9 @@ tracking of datasets and any other <abbr>data artifacts</abbr>.
 
 With the aim to enable reusability of these versioned artifacts between
 different projects (similar to package management systems, but for data), DVC
-also includes the `dvc get`, `dvc import`, and `dvc update` commands. For
-example, project A may use a data file to begin its data
-[pipeline](/doc/command-reference/pipeline), but project B also requires this
-same file. Instead of
-[adding it](/doc/command-reference/add#example-single-file) it to both projects,
-B can simply import it from A. Furthermore, the version of the data file
-imported to B can be different than what's currently used in A.
+also includes the `dvc get`, `dvc import`, and `dvc update` commands. This means
+that a project can depend on data from an external <abbr>DVC project</abbr>, but
+chaining several projects this way can easily become messy...
 
 Keeping this in mind, we could build a <abbr>DVC project</abbr> dedicated to
 tracking and versioning datasets (or any kind of large files). This way we would

From d9ad1ab2fb60e26fb2fdf6f51f5a6040b335cc2f Mon Sep 17 00:00:00 2001
From: Jorge Orpinel <jorge@orpinel.com>
Date: Thu, 21 Nov 2019 19:00:11 -0600
Subject: [PATCH 5/8] use-cases: improve intro to example in data registry case

---
 static/docs/use-cases/data-registry.md | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/static/docs/use-cases/data-registry.md b/static/docs/use-cases/data-registry.md
index 45fc308360..cb8a07f0f3 100644
--- a/static/docs/use-cases/data-registry.md
+++ b/static/docs/use-cases/data-registry.md
@@ -44,17 +44,17 @@ Advantages of using a DVC **data registry** project:
 
 ## Example
 
-A dataset we use for several of our examples and tutorials contains 2800 images
-of cats and dogs. We split the dataset in two for our
-[Versioning Tutorial](/doc/tutorials/versioning), and backed up the parts on a
-storage server, downloading them with `wget` in our examples. This setup was
-then revised to download the dataset with `dvc get` instead, so we created the
+A dataset we commonly use for several of our examples and tutorials contains
+2800 images of cats and dogs. We split it in two for our
+[Versioning Tutorial](/doc/tutorials/versioning). Originally, the parts were
+backed up on a storage server, and downloaded with `wget`. This setup was then
+revised to download the dataset sing `dvc get` instead, so we created the
 [dataset-registry](https://github.com/iterative/dataset-registry)) repository, a
 <abbr>DVC project</abbr> hosted on GitHub, to version the dataset (see its
 [`tutorial/ver`](https://github.com/iterative/dataset-registry/tree/master/tutorial/ver)
 directory).
 
-However, there are a few problems with the way this dataset is structured. Most
+However, there are a few problems with the way that dataset is structured. Most
 importantly, this single dataset is tracked by 2 different
 [DVC-files](/doc/user-guide/dvc-file-format), instead of 2 versions of the same
 one, which would better reflect the intentions of this dataset... Fortunately,

From 50b772ea806d078e974b7144bc87419db0a498e1 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel <jorge@orpinel.com>
Date: Sat, 23 Nov 2019 00:09:24 -0600
Subject: [PATCH 6/8] use-cases: rephrase much of the data registry example to
 improve its logic and readability

per https://github.com/iterative/dvc.org/issues/795#issuecomment-557228299
---
 static/docs/use-cases/data-registry.md | 101 +++++++++++++------------
 1 file changed, 52 insertions(+), 49 deletions(-)

diff --git a/static/docs/use-cases/data-registry.md b/static/docs/use-cases/data-registry.md
index cb8a07f0f3..def518eb38 100644
--- a/static/docs/use-cases/data-registry.md
+++ b/static/docs/use-cases/data-registry.md
@@ -45,28 +45,29 @@ Advantages of using a DVC **data registry** project:
 ## Example
 
 A dataset we commonly use for several of our examples and tutorials contains
-2800 images of cats and dogs. We split it in two for our
+2800 images of cats and dogs, which was split in two for our
 [Versioning Tutorial](/doc/tutorials/versioning). Originally, the parts were
-backed up on a storage server, and downloaded with `wget`. This setup was then
-revised to download the dataset sing `dvc get` instead, so we created the
-[dataset-registry](https://github.com/iterative/dataset-registry)) repository, a
-<abbr>DVC project</abbr> hosted on GitHub, to version the dataset (see its
+backed up on a storage server, and downloaded with
+[`wget`](https://www.gnu.org/software/wget/). This was then revised in order to
+download the parts with `dvc get` instead, so we created the
+[dataset-registry](https://github.com/iterative/dataset-registry)
+<abbr>project</abbr> to version the dataset (in the
 [`tutorial/ver`](https://github.com/iterative/dataset-registry/tree/master/tutorial/ver)
 directory).
 
-However, there are a few problems with the way that dataset is structured. Most
-importantly, this single dataset is tracked by 2 different
-[DVC-files](/doc/user-guide/dvc-file-format), instead of 2 versions of the same
-one, which would better reflect the intentions of this dataset... Fortunately,
-we have also prepared an improved alternative in the
+However, there are a few problems with the way that dataset is versioned. Most
+importantly, this split dataset is tracked by 2 different
+[DVC-files](/doc/user-guide/dvc-file-format) (one for each part), instead of 2
+versions of a single DVC-file. An initial version could have the first part
+only, while an update would have the entire, unified dataset. Fortunately, we
+have also prepared this improved alternative in the
 [`use-cases/`](https://github.com/iterative/dataset-registry/tree/master/use-cases)
 directory of the same <abbr>DVC repository</abbr>.
 
-To create a
-[first version](https://github.com/iterative/dataset-registry/tree/cats-dogs-v1/use-cases)
+To create the
+[initial version](https://github.com/iterative/dataset-registry/tree/cats-dogs-v1/use-cases)
 of our dataset, we extracted the first part into the `use-cases/cats-dogs`
-directory (illustrated below), and ran `dvc add use-cases/cats-dogs` to
-[track the entire directory](https://dvc.org/doc/command-reference/add#example-directory).
+directory, illustrated below:
 
 ```dvc
 $ tree use-cases/cats-dogs --filelimit 3
@@ -80,7 +81,10 @@ use-cases/cats-dogs
         └── dogs [400 image files]
 ```
 
-In a local DVC project, we could have obtained this dataset at this point with
+Then we ran `dvc add use-cases/cats-dogs` to
+[track the entire directory](https://dvc.org/doc/command-reference/add#example-directory).
+
+At this point, we could have obtained this dataset in another DVC project with
 the following command:
 
 ```dvc
@@ -90,15 +94,16 @@ $ dvc import git@github.com:iterative/dataset-registry.git \
 
 > Note that unlike `dvc get`, which can be used from any directory, `dvc import`
 > always needs to run from an [initialized](/doc/command-reference/init) DVC
-> project.
+> project. Remember also that with both commands, the data comes from the source
+> project's remote storage, not from the Git repository itself.
 
 <details>
 
 ### Expand for actionable command (optional)
 
 The command above is meant for informational purposes only. If you actually run
-it in a DVC project, although it should work, it will import the latest version
-of `use-cases/cats-dogs` from `dataset-registry`. The following command would
+it, although it will work, it will import the latest version of
+`use-cases/cats-dogs` from `dataset-registry`. The following command would
 actually bring in the version in question:
 
 ```dvc
@@ -112,54 +117,52 @@ See the `dvc import` command reference for more details on the `--rev`
 
 </details>
 
-Importing keeps the connection between the local project and the source data
-registry where we are downloading the dataset from. This is achieved by creating
-a particular kind of [DVC-file](/doc/user-guide/dvc-file-format) that uses the
-`repo` field (a.k.a. _import stage_). (This file can be used for versioning the
-import with Git.)
+Importing keeps the connection between the local <abbr>project</abbr> and the
+data source (registry <abbr>repository</abbr>). This is achieved by creating a
+particular kind of [DVC-file](/doc/user-guide/dvc-file-format) (a.k.a. _import
+stage_) that includes a `repo` field. (This file can be staged and
+committed with Git.)
 
 > For a sample DVC-file resulting from `dvc import`, refer to
 > [this example](/doc/command-reference/import#example-data-registry).
 
-Back in our **dataset-registry** project, a
+Back in our **dataset-registry** project, the
 [second version](https://github.com/iterative/dataset-registry/tree/cats-dogs-v2/use-cases)
 of our dataset was created by extracting the second part, with 1000 additional
-images (500 cats, 500 dogs), into the same directory structure. Then, we simply
-ran `dvc add use-cases/cats-dogs` again.
+images (500 cats, 500 dogs) on top of the existing directory structure. Then, we
+simply ran `dvc add use-cases/cats-dogs` again.
 
-In our local project, all we have to do in order to obtain this latest version
-of the dataset is to run:
+All we would have to do in order to obtain this latest version in another
+project where the first version was previously imported, is to run:
 
 ```dvc
 $ dvc update cats-dogs.dvc
 ```
 
-This is possible because of the connection that the import stage saved among
-local and source projects, as explained earlier.
-
 <details>
 
 ### Expand for actionable command (optional)
 
-As with the previous hidden note, actually trying the commands above should
-produced the expected results, but not for obvious reasons. Specifically, the
-initial `dvc import` command would have already obtained the latest version of
-the dataset (as noted before), so this `dvc update` is unnecessary and won't
-have an effect.
+As with the previous hidden note, actually trying the command above will produce
+the desired results, but not for obvious reasons. The initial `dvc import`
+command would have already obtained the latest version of the dataset (as noted
+before), so this `dvc update` is unnecessary and won't have any effect.
 
-If you ran the `dvc import --rev cats-dogs-v1 ...` command instead, its import
-stage (DVC-file) would be fixed to that Git tag (`cats-dogs-v1`). In order to
-update it, do not use `dvc update`. Instead, re-import the data by using the
-original import command (without `--rev`). Refer to
-[this example](http://localhost:3000/doc/command-reference/import#example-fixed-revisions-re-importing)
-for more information.
+And if you ran the `dvc import --rev cats-dogs-v1 ...` command instead, its
+import stage (DVC-file) would be
+[fixed to that revision](/doc/command-reference/import#example-fixed-revisions-re-importing)
+(`cats-dogs-v1` tag), so `dvc update` would also be ineffective. In order to
+actually "update" it, re-import the data instead, by now running the initial
+import command (the one without `--rev`):
 
-</details>
+```dvc
+$ dvc import git@github.com:iterative/dataset-registry.git \
+             use-cases/cats-dogs
+```
 
-This downloads new and changed files in `cats-dogs/` from the source project,
-and updates the metadata in the import stage DVC-file.
+</details>
 
-As an extra detail, notice that so far our local project is working only with a
-local <abbr>cache</abbr>. It has no need to setup a
-[remotes](/doc/command-reference/remote) to [pull](/doc/command-reference/pull)
-or [push](/doc/command-reference/push) this dataset.
+This is possible because of the connection that the import stage saved between
+local and source projects, as explained earlier. The update downloads new and
+changed files in `cats-dogs/` based on the source project, and updates the
+metadata in the import stage DVC-file.

From 55ab757106eb8a19fe25317488fb3bbfcc97b4b9 Mon Sep 17 00:00:00 2001
From: Jorge Orpinel <jorge@orpinel.com>
Date: Sun, 24 Nov 2019 17:45:41 -0600
Subject: [PATCH 7/8] review usage of ellipses throughout docs per
 https://github.com/iterative/dvc.org/pull/805#discussion_r349956273

---
 static/docs/command-reference/get.md          | 2 +-
 static/docs/command-reference/install.md      | 7 +++----
 static/docs/tutorials/deep/reproducibility.md | 2 +-
 static/docs/use-cases/data-registry.md        | 2 +-
 4 files changed, 6 insertions(+), 7 deletions(-)

diff --git a/static/docs/command-reference/get.md b/static/docs/command-reference/get.md
index 120b3c98a3..f1cbc6c6e2 100644
--- a/static/docs/command-reference/get.md
+++ b/static/docs/command-reference/get.md
@@ -163,7 +163,7 @@ different names, and not currently tracked by Git:
 $ git status
 ...
 Untracked files:
-  (use "git add <file>..." to include in what will be committed)
+  (use "git add <file> ..." to include in what will be committed)
 
 	model.bigrams.pkl
 	model.monograms.pkl
diff --git a/static/docs/command-reference/install.md b/static/docs/command-reference/install.md
index cda7101d8b..ff2c9710a2 100644
--- a/static/docs/command-reference/install.md
+++ b/static/docs/command-reference/install.md
@@ -155,7 +155,7 @@ checkout the `6-featurization` tag:
 $ git checkout 6-featurization
 Note: checking out '6-featurization'.
 
-You are in 'detached HEAD' state.  ...
+You are in 'detached HEAD' state...
 
 $ dvc status
 
@@ -216,7 +216,7 @@ We can now repeat the command run earlier, to see the difference.
 $ git checkout 6-featurization
 Note: checking out '6-featurization'.
 
-You are in 'detached HEAD' state. ...
+You are in 'detached HEAD' state...
 
 HEAD is now at d13ba9a add featurization stage
 
@@ -257,8 +257,7 @@ helpfully informs us the workspace is out of sync. We should therefore run the
 
 ```dvc
 $ dvc repro evaluate.dvc
-
-... much output
+...
 To track the changes with git run:
 
     git add featurize.dvc train.dvc evaluate.dvc
diff --git a/static/docs/tutorials/deep/reproducibility.md b/static/docs/tutorials/deep/reproducibility.md
index 1e3ad9fcb3..25d1e7024f 100644
--- a/static/docs/tutorials/deep/reproducibility.md
+++ b/static/docs/tutorials/deep/reproducibility.md
@@ -34,7 +34,7 @@ $ dvc repro model.p.dvc
 $ dvc repro
 ```
 
-Tries to reproduce the same pipeline... But there is still nothing to reproduce.
+Tries to reproduce the same pipeline, but there is still nothing to reproduce.
 
 ## Adding bigrams
 
diff --git a/static/docs/use-cases/data-registry.md b/static/docs/use-cases/data-registry.md
index def518eb38..52269b8745 100644
--- a/static/docs/use-cases/data-registry.md
+++ b/static/docs/use-cases/data-registry.md
@@ -9,7 +9,7 @@ With the aim to enable reusability of these versioned artifacts between
 different projects (similar to package management systems, but for data), DVC
 also includes the `dvc get`, `dvc import`, and `dvc update` commands. This means
 that a project can depend on data from an external <abbr>DVC project</abbr>, but
-chaining several projects this way can easily become messy...
+chaining several projects this way can easily become messy.
 
 Keeping this in mind, we could build a <abbr>DVC project</abbr> dedicated to
 tracking and versioning datasets (or any kind of large files). This way we would

From d125437dcfe5e7ac9a6b7665a6f5423d418bba7d Mon Sep 17 00:00:00 2001
From: Jorge Orpinel <jorge@orpinel.com>
Date: Sun, 24 Nov 2019 20:02:44 -0600
Subject: [PATCH 8/8] use-cases: remove remark about imports getting messy per
 https://github.com/iterative/dvc.org/issues/795#issuecomment-557943717 (and
 https://github.com/iterative/dvc.org/pull/805#pullrequestreview-321998559)

---
 static/docs/use-cases/data-registry.md | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/static/docs/use-cases/data-registry.md b/static/docs/use-cases/data-registry.md
index 52269b8745..b03433b9dc 100644
--- a/static/docs/use-cases/data-registry.md
+++ b/static/docs/use-cases/data-registry.md
@@ -8,8 +8,7 @@ tracking of datasets and any other <abbr>data artifacts</abbr>.
 With the aim to enable reusability of these versioned artifacts between
 different projects (similar to package management systems, but for data), DVC
 also includes the `dvc get`, `dvc import`, and `dvc update` commands. This means
-that a project can depend on data from an external <abbr>DVC project</abbr>, but
-chaining several projects this way can easily become messy.
+that a project can depend on data from an external <abbr>DVC project</abbr>.
 
 Keeping this in mind, we could build a <abbr>DVC project</abbr> dedicated to
 tracking and versioning datasets (or any kind of large files). This way we would