cmd ref: clarify ... (argparse remainder) in dvc run command arg, and
- complete `dvc remote add` `name` arg help output same as in iterative/dvc@ac5a37c
- change "aka" for "a.k.a" throughout docs
iterative · Jun 27, 2019 · 81b4400
1 parent 99bb9ff
commit 81b4400

Showing 3 changed files with 22 additions and 13 deletions.
diff --git a/static/docs/commands-reference/remote_add.md b/static/docs/commands-reference/remote_add.md
@@ -17,7 +17,7 @@ usage: dvc remote add [-h] [--global] [--system] [--local] [-q | -v]
                       [-d] [-f] name url
 
 positional arguments:
-  name           Name.
+  name           Name of the remote.
   url            URL. (See supported URLs below.)
 ```
 

diff --git a/static/docs/commands-reference/run.md b/static/docs/commands-reference/run.md
@@ -12,24 +12,27 @@ usage: dvc run [-h] [-q | -v] [-d DEPS] [-o OUTS] [-O OUTS_NO_CACHE]
                [--ignore-build-cache] [--remove-outs] [--no-commit]
                [--outs-persist OUTS_PERSIST]
                [--outs-persist-no-cache OUTS_PERSIST_NO_CACHE]
-               command
+               ...
 
 positional arguments:
   command               Command to execute.
 ```
 
 ## Description
 
-`dvc run` provides an interface to build a computational graph (aka pipeline).
-It's a way to describe commands, data inputs and intermediate results that went
-into a model (or other data results). By explicitly specifying a list of
-dependencies (with `-d` option) and outputs (with `-o`, `-O`, `-m`, or `-M`
-options) DVC can connect individual stages (commands) into a directed acyclic
-graph (DAG). `dvc repro` provides an interface to check state and reproduce this
-graph later. This concept is similar to the one of the `Makefile` but DVC
-captures data and caches data artifacts along the way. Check this
-[example](/doc/get-started/example-pipeline) to learn more and try to build a
-pipeline.
+`dvc run` provides an interface to build a computational graph (a.k.a.
+pipeline). It's a way to describe commands, data inputs and intermediate results
+that go into creating an ML model (or other data results). By explicitly
+specifying a list of dependencies (with `-d` option) and outputs (with `-o`,
+`-O`, `-m`, or `-M` options) DVC can connect each individual stage (command)
+into a directed acyclic graph (DAG). All the command-line input provided to
+`dvc run` after the optional arguments (`-` or `--` dashed options) will become
+the required `command` argument.
+
+> Remember to wrap the `command` with `"` quotes if there are special characters
+> in it like `|` (pipe) or `<`, `>` (redirection) that would otherwise apply to
+> the entire `dvc run` command. E.g.
+> `dvc run -d script.sh "script.sh > /dev/null 2>&1"`
 
 Unless the `-f` option is used, by default the DVC-file name generated is
 `<file>.dvc`, where `<file>` is file name of the first output (`-o`, `-O`, `-m`,
@@ -42,6 +45,12 @@ graph integrity properties before creating a new stage. For example, for every
 output there should be only one stage that explicitly specifies it. There should
 be no cycles, etc.
 
+Note that `dvc repro` provides an interface to check state and reproduce this
+graph later. This concept is similar to that of a `Makefile`, but DVC
+captures data and caches data artifacts along the way. Check this
+[example](/doc/get-started/example-pipeline) to learn more and try to build a
+pipeline.
+
 ## Options
 
 - `-d`, `--deps` - specify a file or a directory the stage depends on. Multiple
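
Note on the `run.md` change above: the `...` in the new usage line is how `argparse` renders a positional declared with `nargs=argparse.REMAINDER`. The sketch below illustrates that behavior and the quoting advice from the added blockquote. It is a simplified, hypothetical stand-in and not DVC's actual parser: only `-d` and `-o` are modeled, and the `parse` helper and its use of `shlex.split` to imitate shell word-splitting are assumptions made for illustration.

```python
import argparse
import shlex

# Hypothetical stand-in for the `dvc run` parser (not DVC's real code);
# only two of the documented options are modeled here.
parser = argparse.ArgumentParser(prog="dvc run")
parser.add_argument("-d", "--deps", action="append", default=[])
parser.add_argument("-o", "--outs", action="append", default=[])
# REMAINDER collects every token after the optional arguments into one list,
# which is why the generated usage line shows `...` instead of `command`.
parser.add_argument("command", nargs=argparse.REMAINDER, help="Command to execute.")


def parse(line):
    """Split a command line roughly the way a POSIX shell would, then parse it."""
    tokens = shlex.split(line)[2:]  # drop the leading "dvc run"
    return parser.parse_args(tokens)


# Everything after the options becomes `command`, including `--epochs`.
args = parse("dvc run -d script.sh -o out.txt python train.py --epochs 5")
print(args.deps, args.outs, args.command)

# Quoting keeps the redirection inside a single `command` token.
quoted = parse('dvc run -d script.sh "script.sh > /dev/null 2>&1"')
print(quoted.command)
```

In the first call, `command` is `['python', 'train.py', '--epochs', '5']`; in the second it is the single string `'script.sh > /dev/null 2>&1'`. Without the quotes, a real shell would apply `> /dev/null 2>&1` to the whole `dvc run` invocation before the parser ever saw those tokens, which is the situation the blockquote warns about.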

diff --git a/static/docs/user-guide/large-dataset-optimization.md b/static/docs/user-guide/large-dataset-optimization.md
@@ -80,7 +80,7 @@ efficiency:
    > instead deleted and then replaced with a new file, otherwise it might cause
    > cache corruption – and automatic deletion of cached files by DVC.
 
-3. **`symlink`** - symbolic (aka "soft") links are the most efficient way to
+3. **`symlink`** - symbolic (a.k.a. "soft") links are the most efficient way to
    link your data to cache if your repo and your cache directory are located on
    different file systems/drives (i.e. repo is located on SSD for performance,
    but cache dir is located on HDD for bigger storage).