diff --git a/content/docs/command-reference/repro.md b/content/docs/command-reference/repro.md
index ea924c7a82..8245f80434 100644
--- a/content/docs/command-reference/repro.md
+++ b/content/docs/command-reference/repro.md
@@ -13,7 +13,7 @@ usage: dvc repro [-h] [-q | -v] [-f] [-s] [-c ] [-m] [--dry] [-i]
                  [--no-commit] [--downstream] [targets [targets ...]]
 
 positional arguments:
-  targets        Stage or .dvc file to reproduce. 'Dvcfile' by default.
+  targets        Stage or .dvc file to reproduce
 ```
 
 ## Description
@@ -40,9 +40,6 @@ There's a few ways to restrict the stages that will be regenerated by this
 command: by specifying stage file `targets`, or by using the `--single-item`,
 `--cwd`, or other options.
 
-If specific [DVC-files](/doc/user-guide/dvc-files-and-directories) (`targets`)
-are omitted, `Dvcfile` will be assumed.
-
 `dvc repro` does not run `dvc fetch`, `dvc pull` or `dvc checkout` to get data
 files, intermediate or final results.
 
@@ -101,8 +98,7 @@ only execute the final stage.
   (non-recursively) if multiple stage files are given as `targets`.
 
 - `-c `, `--cwd ` - directory within the project to reproduce from.
-  If no `targets` are given, it attempts to use `Dvcfile` in the specified
-  directory. Instead of using `--cwd`, one can alternately specify a target in a
+  Instead of using `--cwd`, one can alternately specify a target in a
   subdirectory as `path/to/target.dvc`. This option can be useful for example
   with subdirectories containing a separate pipeline that can either be
   reproduced as part of the pipeline in the parent directory, or as an
@@ -169,7 +165,7 @@ only execute the final stage.
 ## Examples
 
 For simplicity, let's build a pipeline defined below. (If you want get your
-hands-on something more real, see this shot
+hands-on something more real, see this short
 [pipeline tutorial](/doc/tutorials/pipelines)). It takes this `text.txt` file:
 
 ```
@@ -184,18 +180,13 @@ best
 And runs a few simple transformations to filter and count numbers:
 
 ```dvc
-$ dvc run -f filter.dvc -d text.txt -o numbers.txt \
+$ dvc run -n filter -d text.txt -o numbers.txt \
            "cat text.txt | egrep '[0-9]+' > numbers.txt"
 
-$ dvc run -f Dvcfile -d numbers.txt -d process.py -M count.txt \
+$ dvc run -n count -d numbers.txt -d process.py -M count.txt \
            "python process.py numbers.txt > count.txt"
 ```
 
-> Note that using `-f Dvcfile` with `dvc run` above is optional, the stage file
-> name would otherwise default to `count.txt.dvc`. We use `Dvcfile` in this
-> example because that's the default stage file name `dvc repro` will read
-> without having to provide any `targets`.
-
 Where `process.py` is a script that, for simplicity, just prints the number of
 lines:
 
@@ -213,23 +204,23 @@ The result of executing these `dvc run` commands should look like this:
 ```dvc
 $ tree
 .
-├── Dvcfile       <---- second stage with a default DVC name
 ├── count.txt     <---- result: "2"
-├── filter.dvc    <---- first stage
+├── dvc.lock      <---- file to record pipeline state
+├── dvc.yaml      <---- file containing list of stages.
 ├── numbers.txt   <---- intermediate result of the first stage
 ├── process.py    <---- code that implements data transformation
 └── text.txt      <---- text file to process
 ```
 
-You may want to check the contents of `Dvcfile` and `count.txt` for later
+You may want to check the contents of `dvc.lock` and `count.txt` for later
 reference.
 
-Ok, now, let's run the `dvc repro` command (remember, by default it reproduces
-outputs tracked in `Dvcfile`, in this case `count.txt`):
+Ok, now, let's run the `dvc repro` command:
 
 ```dvc
 $ dvc repro
-WARNING: assuming default target 'Dvcfile'.
+Stage 'filter' didn't change, skipping
+Stage 'count' didn't change, skipping
 Data and pipelines are up to date.
 ```
 
@@ -247,17 +238,14 @@ If we now run `dvc repro`, we should see this:
 
 ```dvc
 $ dvc repro
-WARNING: assuming default target 'Dvcfile'.
-Stage 'Dvcfile' changed.
-Reproducing 'Dvcfile'
-Running command:
-  python process.py numbers.txt > count.txt
-Output 'count.txt' doesn't use cache. Skipping saving.
-Saving information to 'Dvcfile'.
+Stage 'filter' didn't change, skipping
+Running stage 'count' with command:
+  python3 process.py numbers.txt > count.txt
+Updating lock file 'dvc.lock'
 ```
 
-You can now check that `Dvcfile` and `count.txt` have been updated with the new
-information and updated dependency/output file hash values, and a new result,
+You can now check that `dvc.lock` and `count.txt` have been updated with the new
+information: updated dependency/output file hash values, and a new result,
 respectively.
 
 ## Example: Downstream
@@ -277,14 +265,13 @@ Now, using the `--downstream` option results in the following output:
 
 ```dvc
 $ dvc repro --downstream
-WARNING: assuming default target 'Dvcfile'.
 Data and pipelines are up to date.
 ```
 
-The reason being that the `text.txt` file is a dependency in the target
-[DVC-file](/doc/user-guide/dvc-files-and-directories) (`Dvcfile` by default).
-This `Dvcfile` stage is dependent on `filter.dvc`, which happens first in this
-pipeline (shown in the following figure):
+The reason being that the `text.txt` file is a dependency in the last stage of
+the pipeline (used by default by `dvc repro`), This last `count` stage is
+dependent on `filter` stage, which happens first in this pipeline (shown in the
+following figure):
 
 ```dvc
 $ dvc dag
@@ -296,6 +283,6 @@ $ dvc dag
        *
        *
   .---------.
-  | Dvcfile |
+  |  count  |
   `---------'
 ```