run: review -d usage and consistency t/o docs
per #1700
jorgeorpinel committed Jan 4, 2021
1 parent 049204c commit 667550a
Showing 5 changed files with 26 additions and 24 deletions.
10 changes: 5 additions & 5 deletions content/docs/command-reference/params/index.md
@@ -87,7 +87,7 @@ Define a [stage](/doc/command-reference/run) that depends on params `lr`,
specify `layers` and `epochs` from the `train` group:

```dvc
-$ dvc run -n train -d users.csv -o model.pkl \
+$ dvc run -n train -d train.py -d users.csv -o model.pkl \
     -p lr,train.epochs,train.layers \
     python train.py
```
@@ -130,7 +130,7 @@ Alternatively, the entire group of parameters `train` can be referenced, instead
of specifying each of the group parameters separately:

```dvc
-$ dvc run -n train -d users.csv -o model.pkl \
+$ dvc run -n train -d train.py -d users.csv -o model.pkl \
     -p lr,train \
     python train.py
```
@@ -139,7 +139,7 @@ In the examples above, the default parameters file name `params.yaml` was used.
This file name can be redefined with a prefix in the `-p` argument:

```dvc
-$ dvc run -n train -d logs/ -o users.csv \
+$ dvc run -n train -d train.py -d logs/ -o users.csv -f \
     -p parse_params.yaml:threshold,classes_num \
     python train.py
```
@@ -182,7 +182,7 @@ The following [stage](/doc/command-reference/run) depends on params `BOOL`,
`INT`, as well as `TrainConfig`'s `EPOCHS` and `layers`:

```dvc
-$ dvc run -n train -d users.csv -o model.pkl \
+$ dvc run -n train -d train.py -d users.csv -o model.pkl \
     -p params.py:BOOL,INT,TrainConfig.EPOCHS,TrainConfig.layers \
     python train.py
```
@@ -227,7 +227,7 @@ can be referenced
supported), instead of the parameters in it:

```dvc
-$ dvc run -n train -d users.csv -o model.pkl \
+$ dvc run -n train -d train.py -d users.csv -o model.pkl \
     -p params.py:BOOL,INT,TestConfig \
     python train.py
```
22 changes: 11 additions & 11 deletions content/docs/command-reference/run.md
@@ -73,7 +73,7 @@ so on (see `dvc dag`). This graph can be restored by DVC later to modify or

```dvc
$ dvc run -n printer -d write.sh -o pages ./write.sh
-$ dvc run -n scanner -d read.sh -d pages -o signed.pdf ./read.sh
+$ dvc run -n scanner -d read.sh -d pages -o signed.pdf ./read.sh pages
```

Stage dependencies can be any file or directory, either untracked, or more
@@ -151,7 +151,7 @@ variables in it that should be evaluated dynamically. Examples:

```dvc
$ dvc run -n my_stage "./my_script.sh > /dev/null 2>&1"
-$ dvc run -n my_stage './my_script.sh $MYENVVAR'
+$ dvc run -n my_stage -f './my_script.sh $MYENVVAR'
```
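
The difference between the two quoting styles can be checked in a plain shell without DVC. The sketch below is only an illustration: `MYENVVAR` is the variable name from the example above, the value `hello` is made up, and `expanded` and `literal` are hypothetical names. It shows what text the shell hands to a command in each case, not how DVC stores it.

```shell
# Made-up value for the variable used in the example above.
MYENVVAR="hello"

# Double quotes: the calling shell expands the variable right away,
# so the command string already contains the literal value.
expanded="./my_script.sh $MYENVVAR"

# Single quotes: the text is passed through unchanged, so the variable
# reference survives and can be evaluated later, when the command runs.
literal='./my_script.sh $MYENVVAR'

printf '%s\n' "$expanded"   # prints: ./my_script.sh hello
printf '%s\n' "$literal"    # prints: ./my_script.sh $MYENVVAR
```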

## Options
@@ -317,17 +317,17 @@ dataset (`20180226` is a seed value):

```dvc
$ dvc run -n train \
-    -d matrix-train.p -d train_model.py \
-    -o model.p \
-    python train_model.py matrix-train.p 20180226 model.p
+    -d train_model.py -d matrix-train.p -o model.p \
+    python train_model.py 20180226 model.p
```
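
Multi-line commands like the one above rely on a trailing backslash at the end of every line except the last; without it, the shell would try to run the remaining lines as separate commands. A minimal sketch of this, in which `count_args` is a hypothetical stand-in for `dvc run` that only reports how many arguments it received:

```shell
# Hypothetical stand-in for `dvc run`: prints the number of arguments.
count_args() { printf '%s\n' "$#"; }

# Each line but the last ends with a backslash, so the shell joins the
# three lines into one command with five arguments.
joined=$(count_args -d train_model.py \
    -p seed \
    python)

printf '%s\n' "$joined"   # prints: 5
```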

To update a stage that is already defined, the `-f` (`--force`) option is
needed. Let's update the seed for the `train` stage:

```dvc
-$ dvc run -n train -f -d matrix-train.p -d train_model.py -o model.p \
-    python train_model.py matrix-train.p 18494003 model.p
+$ dvc run -n train --force \
+    -d train_model.py -d matrix-train.p -o model.p \
+    python train_model.py 18494003 model.p
```

## Example: Separate stages in a subdirectory
@@ -341,7 +341,7 @@ $ cd more_stages/
$ dvc run -n process_data \
     -d data.in \
     -o result.out \
-    ./my_script.sh data.in result.out
+    ./my_script.sh --in data.in --out result.out
$ tree ..
.
├── dvc.yaml
@@ -379,7 +379,7 @@ Execute an R script that parses the XML file:
$ dvc run -n parse \
     -d parsingxml.R -d data/Posts.xml \
     -o data/Posts.csv \
-    Rscript parsingxml.R data/Posts.xml data/Posts.csv
+    Rscript parsingxml.R --in data/Posts.xml --out data/Posts.csv
```

To visualize how these stages are connected into a pipeline (given their outputs
@@ -421,9 +421,9 @@ Define a stage with both regular dependencies as well as parameter dependencies:

```dvc
$ dvc run -n train \
-    -d matrix-train.p -d train_model.py -o model.p \
+    -d train_model.py -d matrix-train.p -o model.p \
     -p seed,train.lr,train.epochs \
-    python train_model.py matrix-train.p model.p
+    python train_model.py 20200105 model.p
```

`train_model.py` will include some code to open and parse the parameters:
2 changes: 1 addition & 1 deletion content/docs/start/data-pipelines.md
@@ -72,7 +72,7 @@ $ dvc run -n prepare \
```

A `dvc.yaml` file is generated. It includes information about the command we ran
-(`python src/prepare.py`), its <abbr>dependencies</abbr>, and
+(`python src/prepare.py data/data.xml`), its <abbr>dependencies</abbr>, and
<abbr>outputs</abbr>.

<details>
8 changes: 5 additions & 3 deletions content/docs/use-cases/shared-development-server.md
@@ -80,8 +80,9 @@ Let's say you are cleaning up raw data for later stages:

```dvc
$ dvc add raw
-$ dvc run -n clean_data -d raw -o clean ./cleanup.py raw clean
-# The data is cached in the shared location.
+$ dvc run -n clean_data -d cleanup.py -d raw -o clean \
+    ./cleanup.py raw clean
+# The data is cached in the shared location.
$ git add raw.dvc dvc.yaml dvc.lock .gitignore
$ git commit -m "cleanup raw data"
$ git push
@@ -97,7 +98,8 @@ manually. After this, they could decide to continue building this
$ git pull
$ dvc checkout
A raw # Data is linked from cache to workspace.
-$ dvc run -n process_clean_data -d clean -o processed ./process.py clean process
+$ dvc run -n process_clean_data -d process.py -d clean -o processed \
+    ./process.py clean processed
$ git add dvc.yaml dvc.lock
$ git commit -m "process clean data"
$ git push
8 changes: 4 additions & 4 deletions content/docs/user-guide/how-to/add-deps-or-outs-to-a-stage.md
@@ -39,13 +39,13 @@ output. To add a missing dependency (`data/raw.csv`) as well as a missing output
> dependency/output to the stage:
>
> ```dvc
-> $ dvc run -f --no-exec \
->     -n prepare \
->     -d data/raw.csv \
+> $ dvc run -n prepare \
+>     -f --no-exec \
 >     -d src/prepare.py \
+>     -d data/raw.csv \
 >     -o data/train \
 >     -o data/validate \
->     python src/prepare.py
+>     python src/prepare.py data/raw.csv
> ```
>
> `-f` overwrites the stage in `dvc.yaml`, while `--no-exec` updates the stage