diff --git a/content/blog/2019-03-05-march-19-dvc-heartbeat.md b/content/blog/2019-03-05-march-19-dvc-heartbeat.md
index 1b839f2225..863069e2e9 100644
--- a/content/blog/2019-03-05-march-19-dvc-heartbeat.md
+++ b/content/blog/2019-03-05-march-19-dvc-heartbeat.md
@@ -144,9 +144,9 @@ liking and see your data files listed there.
### Q: [Managing data and pipelines with DVC on HDFS](https://discordapp.com/channels/485586884165107732/485596304961962003/545562334983356426)
With DVC, you could connect your data sources from HDFS with your pipeline in
-your local project, by simply specifying it as an external dependency. For
-example let’s say your script `process.cmd` works on an input file on HDFS and
-then downloads a result to your local workspace, then with DVC it could look
+your local project, by specifying it as an external dependency. For example,
+let’s say your script `process.cmd` works on an input file on HDFS and then
+downloads a result to your local workspace, then with DVC it could look
something like:
```dvc
diff --git a/content/blog/2019-05-21-may-19-dvc-heartbeat.md b/content/blog/2019-05-21-may-19-dvc-heartbeat.md
index da6653ce14..090340a9e0 100644
--- a/content/blog/2019-05-21-may-19-dvc-heartbeat.md
+++ b/content/blog/2019-05-21-may-19-dvc-heartbeat.md
@@ -256,9 +256,9 @@ $ dvc metrics show metrics.json \
There are a few options to add a new dependency:
-- simply opening a file with your favorite editor and adding a dependency there
- without md5. DVC will understand that that stage is changed and will re-run
- and re-calculate md5 checksums during the next DVC repro;
+- opening a file with your favorite editor and adding a dependency there without
+ md5. DVC will understand that the stage has changed and will re-run and
+ re-calculate md5 checksums during the next DVC repro;
- use `dvc run --no-exec` is another option. It will rewrite the existing file
for you with new parameters.
diff --git a/content/blog/2020-02-17-a-public-reddit-dataset.md b/content/blog/2020-02-17-a-public-reddit-dataset.md
index b7bb152bd4..ac5acd5cb3 100644
--- a/content/blog/2020-02-17-a-public-reddit-dataset.md
+++ b/content/blog/2020-02-17-a-public-reddit-dataset.md
@@ -110,7 +110,7 @@ you'll need to [install DVC](https://dvc.org/doc/install); one of the simplest
ways is `pip install dvc`.
Say you have a directory on your local machine where you plan to build some
-analysis scripts. Simply run
+analysis scripts. Run:
```dvc
$ dvc get https://github.com/iterative/aita_dataset \
@@ -225,7 +225,7 @@ $ dvc import https://github.com/iterative/aita_dataset \
```
Then, because the dataset in your workspace is linked to our dataset repository,
-you can update it by simply running:
+you can update it by running:
```dvc
$ dvc update aita_clean.csv
@@ -317,10 +317,10 @@ refine these existing methods. And there’s almost certainly room to push the
state of the art in asshole detection!
If you're interested in learning more about using Reddit data, check out
-[pushshift.io](https://pushshift.io/), a database that contains basically all of
-Reddit's content (so why make this dataset? I wanted to remove some of the
-barriers to analyzing text from r/AmItheAsshole by providing an
-already-processed and cleaned version of the data that can be downloaded with a
-line of code; pushshift takes some work). You might use pushshift's API and/or
-praw to augment this dataset in some way- perhaps to compare activity in this
-subreddit with another, or broader patterns on Reddit.
+[pushshift.io](https://pushshift.io/), a database that contains all of Reddit's
+content (so why make this dataset? I wanted to remove some of the barriers to
+analyzing text from r/AmItheAsshole by providing an already-processed and
+cleaned version of the data that can be downloaded with a line of code;
+pushshift takes some work). You might use pushshift's API and/or praw to augment
+this dataset in some way- perhaps to compare activity in this subreddit with
+another, or broader patterns on Reddit.
diff --git a/content/blog/2020-04-16-april-20-community-gems.md b/content/blog/2020-04-16-april-20-community-gems.md
index 64376744a2..c791cda4fb 100644
--- a/content/blog/2020-04-16-april-20-community-gems.md
+++ b/content/blog/2020-04-16-april-20-community-gems.md
@@ -106,7 +106,7 @@ $ dvc pull process_data_stage.dvc
You can also use `dvc pull` at the level of individual files. This might be
needed if your DVC pipeline file creates 10 outputs, for example, and you only
want to pull one (say, `model.pkl`, your trained model) from remote DVC storage.
-You'd simply run
+You'd run:
```dvc
$ dvc pull model.pkl
diff --git a/content/docs/api-reference/open.md b/content/docs/api-reference/open.md
index cddae26836..c5c0a8065e 100644
--- a/content/docs/api-reference/open.md
+++ b/content/docs/api-reference/open.md
@@ -113,7 +113,7 @@ should handle the event-driven parsing of the document in this case.) This
increases the performance of the code (minimizing memory usage), and is
typically faster than loading the whole data into memory.
-> If you just needed to load the complete file contents into memory, you can use
+> If you want to load the complete file contents into memory, you can use
> `dvc.api.read()` instead:
>
> ```py
@@ -127,7 +127,7 @@ typically faster than loading the whole data into memory.
## Example: Accessing private repos
-This is just a matter of using the right `repo` argument, for example an SSH URL
+The key here is to use the right `repo` argument, for example an SSH URL
(requires that the
[credentials are configured](https://help.github.com/en/github/authenticating-to-github/connecting-to-github-with-ssh)
locally):
diff --git a/content/docs/command-reference/checkout.md b/content/docs/command-reference/checkout.md
index e0be68bc46..c92df101fd 100644
--- a/content/docs/command-reference/checkout.md
+++ b/content/docs/command-reference/checkout.md
@@ -102,8 +102,8 @@ be pulled from remote storage using `dvc pull`.
## Examples
-Let's employ a simple workspace with some data, code, ML models,
-pipeline stages, such as the DVC project created for the
+Let's create a workspace with some data, code, ML models, and pipeline
+stages, such as the DVC project created for the
[Get Started](/doc/tutorials/get-started). Then we can see what happens with
`git checkout` and `dvc checkout` as we switch from tag to tag.
@@ -151,8 +151,8 @@ baseline-experiment <- First simple version of the model
bigrams-experiment <- Uses bigrams to improve the model
```
-We can now just run `dvc checkout` that will update the most recent `model.pkl`,
-`data.xml`, and other files that are tracked by DVC. The model file hash
+We can now run `dvc checkout` to update the most recent `model.pkl`, `data.xml`,
+and other files that are tracked by DVC. The model file hash
`662eb7f64216d9c2c1088d0a5e2c6951` will be used in the `train.dvc`
[stage file](/doc/command-reference/run):
diff --git a/content/docs/command-reference/commit.md b/content/docs/command-reference/commit.md
index 213e931c9f..8abaea9c67 100644
--- a/content/docs/command-reference/commit.md
+++ b/content/docs/command-reference/commit.md
@@ -44,7 +44,7 @@ further detailed below.
other change that doesn't cause changed stage outputs. However, DVC will
notice that some dependencies and have changed, and expect you to
reproduce the whole pipeline. If you're sure no pipeline results would change,
- just use `dvc commit` to force update the related DVC-files and cache.
+ use `dvc commit` to force update the related DVC-files and cache.
Let's take a look at what is happening in the first scenario closely. Normally
DVC commands like `dvc add`, `dvc repro` or `dvc run` commit the data to the
@@ -95,8 +95,8 @@ reproducibility in those cases.
## Examples
-Let's employ a simple workspace with some data, code, ML models,
-pipeline stages, such as the DVC project created for the
+Let's create a workspace with some data, code, ML models, and pipeline
+stages, such as the DVC project created for the
[Get Started](/doc/tutorials/get-started). Then we can see what happens with
`git commit` and `dvc commit` in different situations.
diff --git a/content/docs/command-reference/import-url.md b/content/docs/command-reference/import-url.md
index ab5894f4d0..ed5c0e0b6d 100644
--- a/content/docs/command-reference/import-url.md
+++ b/content/docs/command-reference/import-url.md
@@ -29,7 +29,7 @@ external data source changes. Example scenarios:
- A shared dataset on a remote storage that is managed and updated outside DVC.
> Note that `dvc get-url` corresponds to the first step this command performs
-> (just download the file or directory).
+> (just downloads the file or directory).
The `dvc import-url` command helps the user create such an external data
dependency without having to manually copying files from the supported remote
@@ -78,7 +78,7 @@ Specific explanations:
is necessary to track if the specified remote file (URL) changed to download
it again.
-- `remote://myremote/path/to/file` notation just means that a DVC
+- `remote://myremote/path/to/file` notation means that a DVC
[remote](/doc/command-reference/remote) `myremote` is defined and when DVC is
running. DVC automatically expands this URL into a regular S3, SSH, GS, etc
URL by appending `/path/to/file` to the `myremote`'s configured base path.
diff --git a/content/docs/command-reference/install.md b/content/docs/command-reference/install.md
index d554b12662..b550ba4206 100644
--- a/content/docs/command-reference/install.md
+++ b/content/docs/command-reference/install.md
@@ -262,7 +262,7 @@ matching what is referenced by the DVC-files.
To follow this example, start with the same workspace as before, making sure it
is not in a _detached HEAD_ state by running `git checkout master`.
-If we simply edit one of the code files:
+Let's imagine we have modified the file `src/featurization.py`:
```dvc
$ vi src/featurization.py
diff --git a/content/docs/command-reference/list.md b/content/docs/command-reference/list.md
index 7464ed557b..3b7eef0941 100644
--- a/content/docs/command-reference/list.md
+++ b/content/docs/command-reference/list.md
@@ -19,8 +19,8 @@ positional arguments:
DVC, by effectively replacing data files, models, directories with DVC-files
(`.dvc`), hides actual locations and names. This means that you don't see data
files when you browse a DVC repository on Git hosting (e.g.
-Github), you just see the DVC-files. This makes it hard to navigate the project
-to find data artifacts for use with `dvc get`, `dvc import`, or
+GitHub), you only see the DVC-files. This makes it hard to navigate the project to
+find data artifacts for use with `dvc get`, `dvc import`, or
`dvc.api`.
`dvc list` prints a virtual view of a DVC repository, as if files and
diff --git a/content/docs/command-reference/metrics/show.md b/content/docs/command-reference/metrics/show.md
index 398a5d4ee4..8bece891e5 100644
--- a/content/docs/command-reference/metrics/show.md
+++ b/content/docs/command-reference/metrics/show.md
@@ -32,7 +32,7 @@ compares them with a previous version.
## Options
- `-a`, `--all-branches` - print metric file contents in all Git branches
- instead of just those present in the current workspace. It can be used to
+ instead of only those present in the current workspace. It can be used to
compare different experiments. Note that this can be combined with `-T` below,
for example using the `-aT` flag.
diff --git a/content/docs/command-reference/pull.md b/content/docs/command-reference/pull.md
index 4905677898..31023dd114 100644
--- a/content/docs/command-reference/pull.md
+++ b/content/docs/command-reference/pull.md
@@ -35,7 +35,7 @@ The default remote is used (see `dvc config core.remote`) unless the `--remote`
option is used. See `dvc remote` for more information on how to configure a
remote.
-With no arguments, just `dvc pull` or `dvc pull --remote `, it downloads
+With no arguments (`dvc pull` or `dvc pull --remote `), it downloads
only the files (or directories) missing from the workspace by searching all
[DVC-files](/doc/user-guide/dvc-file-format) currently in the
project. It will not download files associated with earlier commits
@@ -59,7 +59,7 @@ reflinks or hardlinks to put it in the workspace without copying. See
## Options
- `-a`, `--all-branches` - determines the files to download by examining
- DVC-files in all Git branches instead of just those present in the current
+ DVC-files in all Git branches instead of those present in the current
workspace. It's useful if branches are used to track experiments or project
checkpoints. Note that this can be combined with `-T` below, for example using
the `-aT` flag.
@@ -94,7 +94,7 @@ reflinks or hardlinks to put it in the workspace without copying. See
- `-j `, `--jobs ` - number of threads to run simultaneously to
handle the downloading of files from the remote. The default value is
- `4 * cpu_count()`. For SSH remotes, the default is just `4`. Using more jobs
+ `4 * cpu_count()`. For SSH remotes, the default value is `4`. Using more jobs
may improve the total download speed if a combination of small and large files
are being fetched.
@@ -136,7 +136,7 @@ The workspace looks almost like in this
└── train.dvc
```
-We can now just run `dvc pull` to download the most recent `data/data.xml`,
+We can now run `dvc pull` to download the most recent `data/data.xml`,
`model.pkl`, and other DVC-tracked files into the workspace:
```dvc
diff --git a/content/docs/command-reference/push.md b/content/docs/command-reference/push.md
index 65f41494e4..137e42d1f9 100644
--- a/content/docs/command-reference/push.md
+++ b/content/docs/command-reference/push.md
@@ -54,9 +54,9 @@ none are specified on the command line nor in the configuration. The default
remote is used (see `dvc config core.remote`) unless the `--remote` option is
used. See `dvc remote` for more information on how to configure a remote.
-With no arguments, just `dvc push` or `dvc push --remote REMOTE`, it uploads
-only the files (or directories) that are new in the local repository to remote
-storage. It will not upload files associated with earlier commits in the
+With no arguments (`dvc push` or `dvc push --remote REMOTE`), it uploads only the
+files (or directories) that are new in the local repository to remote storage.
+It will not upload files associated with earlier commits in the
repository (if using Git), nor will it upload files that have not
changed.
@@ -73,7 +73,7 @@ to push.
## Options
- `-a`, `--all-branches` - determines the files to upload by examining DVC-files
- in all Git branches instead of just those present in the current workspace.
+ in all Git branches instead of only those present in the current workspace.
It's useful if branches are used to track experiments or project checkpoints.
Note that this can be combined with `-T` below, for example using the `-aT`
flag.
@@ -103,7 +103,7 @@ to push.
- `-j `, `--jobs ` - number of threads to run simultaneously to
handle the uploading of files from the remote. The default value is
- `4 * cpu_count()`. For SSH remotes, the default is just `4`. Using more jobs
+ `4 * cpu_count()`. For SSH remotes, the default value is `4`. Using more jobs
may improve the total download speed if a combination of small and large files
are being fetched.
diff --git a/content/docs/command-reference/remote/add.md b/content/docs/command-reference/remote/add.md
index 04d9cce860..1d1e676290 100644
--- a/content/docs/command-reference/remote/add.md
+++ b/content/docs/command-reference/remote/add.md
@@ -197,9 +197,8 @@ $ dvc remote add -d myremote "azure://"
To start using a GDrive remote, fist add it with a
[valid URL format](/doc/user-guide/setup-google-drive-remote#url-format). Then
-simply use any DVC command that needs it (e.g. `dvc pull`, `dvc fetch`,
-`dvc push`), and follow the instructions to connect your Google Drive with DVC.
-For example:
+use any DVC command that needs it (e.g. `dvc pull`, `dvc fetch`, `dvc push`),
+and follow the instructions to connect your Google Drive with DVC. For example:
```dvc
$ dvc remote add -d myremote gdrive://0AIac4JZqHhKmUk9PDA/dvcstore
diff --git a/content/docs/command-reference/status.md b/content/docs/command-reference/status.md
index 8f8a3d6d98..b6589956b3 100644
--- a/content/docs/command-reference/status.md
+++ b/content/docs/command-reference/status.md
@@ -107,11 +107,10 @@ workspace) is different from remote storage. Bringing the two into sync requires
(specified in the `core.remote` config option).
- `-a`, `--all-branches` - compares cache content against all Git branches
- instead of just the current workspace. This basically runs the same status
- command in every branch of this repo. The corresponding branches are shown in
- the status output. Applies only if `--cloud` or a `-r` remote is specified.
- Note that this can be combined with `-T` below, for example using the `-aT`
- flag.
+ instead of the current workspace. This runs the same status command
+ in every branch of this repo. The corresponding branches are shown in the
+ status output. Applies only if `--cloud` or a `-r` remote is specified. Note
+ that this can be combined with `-T` below, for example using the `-aT` flag.
- `-T`, `--all-tags` - same as `-a` above, but applies to Git tags as well as
the workspace. Note that both options can be combined, for example using the
diff --git a/content/docs/command-reference/update.md b/content/docs/command-reference/update.md
index 4c6d359d99..ba885f3fe5 100644
--- a/content/docs/command-reference/update.md
+++ b/content/docs/command-reference/update.md
@@ -70,8 +70,8 @@ Importing 'model.pkl (git@github.com:iterative/example-get-started)'
As DVC mentions, the import stage (DVC-file) `model.pkl.dvc` is created. This
[stage file](/doc/command-reference/run) is frozen by default though, so to
[reproduce](/doc/command-reference/repro) it, we would need to run
-`dvc unfreeze` on it first, then `dvc repro` (and `dvc freeze` again). Let's
-just run `dvc update` on it instead:
+`dvc unfreeze` on it first, then `dvc repro` (and `dvc freeze` again). Let's run
+`dvc update` on it instead:
```dvc
$ dvc update model.pkl.dvc
diff --git a/content/docs/tutorials/get-started/data-access.md b/content/docs/tutorials/get-started/data-access.md
index 9416e69a1c..1ad3b3767d 100644
--- a/content/docs/tutorials/get-started/data-access.md
+++ b/content/docs/tutorials/get-started/data-access.md
@@ -25,11 +25,11 @@ cats-dogs.dvc
The benefit of this command over browsing a Git hosting website is that the list
includes files and directories tracked by **both Git and DVC**.
-## Just download it
+## Download it
-One way is to simply download the data with `dvc get`. This is useful when
-working outside of a DVC project environment, for example in an
-automated ML model deployment task:
+One way is to download the data with `dvc get`. This is useful when working
+outside of a DVC project environment, for example in an automated
+ML model deployment task:
```dvc
$ dvc get https://github.com/iterative/dataset-registry \
diff --git a/content/docs/tutorials/get-started/data-pipelines.md b/content/docs/tutorials/get-started/data-pipelines.md
index 977b7978fd..8f138cb169 100644
--- a/content/docs/tutorials/get-started/data-pipelines.md
+++ b/content/docs/tutorials/get-started/data-pipelines.md
@@ -163,9 +163,9 @@ This would be a good point to commit the changes with Git. This includes any
## Reproduce
-Imagine you're just cloning the repository created so far, in
-another computer. It's extremely easy for anyone to reproduce the result
-end-to-end, by using `dvc repro`.
+Imagine you're cloning the repository created so far, on another
+computer. It's extremely easy for anyone to reproduce the result end-to-end, by
+using `dvc repro`.
@@ -198,7 +198,7 @@ executes the necessary commands to rebuild all the pipeline
## Visualize
Having built our pipeline, we need a good way to understand its structure.
-Seeing a graph of connected stage files would help. DVC lets you do just that,
+Seeing a graph of connected stage files would help. DVC lets you do that,
without leaving the terminal!
```dvc
diff --git a/content/docs/tutorials/get-started/data-versioning.md b/content/docs/tutorials/get-started/data-versioning.md
index 650beb3787..d7e817c78f 100644
--- a/content/docs/tutorials/get-started/data-versioning.md
+++ b/content/docs/tutorials/get-started/data-versioning.md
@@ -228,8 +228,8 @@ after `git clone` and `git pull`.
### 👉 Expand to simulate a fresh clone of this repo
-Let's just remove the directory added so far, both from workspace
-and cache:
+Let's remove the directory added so far, both from workspace and
+cache:
```dvc
$ rm -f datadir .dvc/cache/a3/04afb96060aad90176268345e10355
diff --git a/content/docs/tutorials/get-started/experiments.md b/content/docs/tutorials/get-started/experiments.md
index 6035d19d2d..1f1fdcfae5 100644
--- a/content/docs/tutorials/get-started/experiments.md
+++ b/content/docs/tutorials/get-started/experiments.md
@@ -139,7 +139,7 @@ back and forth. To find the best-performing experiment or track the progress,
described in one of the previous sections).
Let's run evaluate for the latest `bigrams` experiment we created earlier. It
-mostly takes just running the `dvc repro`:
+mostly comes down to running `dvc repro`:
```dvc
$ git checkout master
diff --git a/content/docs/tutorials/pipelines.md b/content/docs/tutorials/pipelines.md
index 5283c3ea2c..1565a41528 100644
--- a/content/docs/tutorials/pipelines.md
+++ b/content/docs/tutorials/pipelines.md
@@ -183,9 +183,9 @@ outs:
persist: false
```
-Just like the DVC-file we created earlier with `dvc add`, this stage file uses
-`md5` hashes (that point to the cache) to describe and version
-control dependencies and outputs. Output `data/Posts.xml` file is saved as
+Like the DVC-file we created earlier with `dvc add`, this stage file uses `md5`
+hashes (that point to the cache) to describe and version control
+dependencies and outputs. The output file `data/Posts.xml` is saved as
`.dvc/cache/a3/04afb96060aad90176268345e10355` and linked (or copied) to the
workspace, as well as added to `.gitignore`.
@@ -331,8 +331,8 @@ $ dvc metrics show
It's time to save our [pipeline](/doc/command-reference/pipeline). You can
confirm that we do not tack files or raw datasets with Git, by using the
-`git status` command. We are just saving a snapshot of the DVC-files that
-describe data, transformations (stages), and relationships between them.
+`git status` command. We are saving a snapshot of the DVC-files that describe
+data, transformations (stages), and relationships between them.
```dvc
$ git add *.dvc auc.metric data/.gitignore
diff --git a/content/docs/understanding-dvc/how-it-works.md b/content/docs/understanding-dvc/how-it-works.md
index 732433fca5..4f4a6b28e5 100644
--- a/content/docs/understanding-dvc/how-it-works.md
+++ b/content/docs/understanding-dvc/how-it-works.md
@@ -84,7 +84,7 @@
$ cd myrepo
$ git pull # download tracked data from remote storage
$ dvc checkout # checkout data files
- $ ls -l data/ # You just got gigabytes of data through Git and DVC:
+ $ ls -l data/ # You downloaded gigabytes of data through Git and DVC:
total 1017488
-r-------- 2 501 staff 273M Jan 27 03:48 Posts-test.tsv
diff --git a/content/docs/use-cases/shared-development-server.md b/content/docs/use-cases/shared-development-server.md
index 4131d5af6b..b808958eed 100644
--- a/content/docs/use-cases/shared-development-server.md
+++ b/content/docs/use-cases/shared-development-server.md
@@ -103,7 +103,7 @@ $ git commit -m "process clean data"
$ git push
```
-And now you can just as easily make their work appear in your workspace with:
+And now you can make their previous work appear in your workspace with:
```dvc
$ git pull
diff --git a/content/docs/user-guide/external-dependencies.md b/content/docs/user-guide/external-dependencies.md
index fbd42f231f..274121349c 100644
--- a/content/docs/user-guide/external-dependencies.md
+++ b/content/docs/user-guide/external-dependencies.md
@@ -35,8 +35,8 @@ directory.
## Examples
As examples, let's take a look at a [stage](/doc/command-reference/run) that
-simply moves a local file from an external location, producing a `data.txt.dvc`
-stage file (DVC-file).
+moves a local file from an external location, producing a `data.txt.dvc` stage
+file (DVC-file).
> Note that some of these commands use the `/home/shared` directory, typical in
> Linux distributions.
diff --git a/content/docs/user-guide/managing-external-data.md b/content/docs/user-guide/managing-external-data.md
index 994640eadc..e45ad57b17 100644
--- a/content/docs/user-guide/managing-external-data.md
+++ b/content/docs/user-guide/managing-external-data.md
@@ -43,7 +43,7 @@ in the same external/remote file system first.
## Examples
For the examples, let's take a look at a [stage](/doc/command-reference/run)
-that simply moves local file to an external location, producing a `data.txt.dvc`
+that moves a local file to an external location, producing a `data.txt.dvc`
DVC-file.
### Local file system path
diff --git a/src/utils/shared/expiration.js b/src/utils/shared/expiration.js
index 7e426ea7b3..5b882008f7 100644
--- a/src/utils/shared/expiration.js
+++ b/src/utils/shared/expiration.js
@@ -17,7 +17,7 @@ function getExpirationDate({ date, expires }) {
/*
This is the primary logic to check if a date is expired,
- It simply uses Moment to parse a date input and comparse that to the current
+ It uses Moment to parse a date input and compares that to the current
time.
Use this on the result of getExpirationDate to get both pieces of
information.