document locking in dvc run/repro
efiop committed Jan 5, 2020
1 parent 0b607e9 commit c737e48
Showing 2 changed files with 46 additions and 0 deletions.
38 changes: 38 additions & 0 deletions public/static/docs/command-reference/repro.md
@@ -45,6 +45,44 @@ files, intermediate or final results. It saves all the data files, intermediate
or final results into the <abbr>DVC cache</abbr> (unless `--no-commit` option is
specified), and updates stage files with the new checksum information.

### Running other dvc commands in parallel

See
[Running other dvc commands in parallel](/doc/command-reference/run#running-other-dvc-commands-in-parallel).

### Parallel stage execution

Currently, `dvc repro` cannot parallelize stage execution by itself (see
[iterative/dvc#755](https://github.com/iterative/dvc/issues/755)). If you need
parallelism, you can launch multiple `dvc repro` processes yourself. For
example, say your DAG looks something like:

```
$ dvc pipeline show --ascii result.dvc
+--------+        +--------+
| A1.dvc |        | B1.dvc |
+--------+        +--------+
     *                 *
     *                 *
     *                 *
+--------+        +--------+
| A2.dvc |        | B2.dvc |
+--------+        +--------+
        *          *
          **    **
            *  *
       +------------+
       | result.dvc |
       +------------+
```

It consists of two pipeline branches (`A` and `B`) and the final `result`
stage. To reproduce both branches at the same time, you could run
`dvc repro A2.dvc` and `dvc repro B2.dvc` concurrently (e.g. in separate
terminals). After both have finished, you could run `dvc repro result.dvc`,
which will see that both branches are already up to date and will only run the
final stage.
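
Instead of separate terminals, the same steps could be scripted. The following
is a minimal sketch, assuming a POSIX shell and the stage files from the DAG
above. The function name `repro_branches_then_result` and the overridable `DVC`
variable are hypothetical helpers introduced for illustration and are not part
of DVC itself.

```shell
#!/bin/sh
# Hypothetical helper: the DVC variable only lets you point at another
# executable; it defaults to the regular `dvc` command.
DVC="${DVC:-dvc}"

repro_branches_then_result() {
    # Start both branches in the background so they run concurrently.
    "$DVC" repro A2.dvc &
    pid_a=$!
    "$DVC" repro B2.dvc &
    pid_b=$!

    # Wait for each branch and remember its exit status.
    status_a=0
    status_b=0
    wait "$pid_a" || status_a=$?
    wait "$pid_b" || status_b=$?

    # Only run the final stage if both branches succeeded.
    if [ "$status_a" -ne 0 ] || [ "$status_b" -ne 0 ]; then
        echo "a branch failed; skipping result.dvc" >&2
        return 1
    fi
    "$DVC" repro result.dvc
}
```

Calling `repro_branches_then_result` from the project root would then reproduce
both branches in parallel and finish with the `result` stage. Waiting on each
process separately keeps a failure in one branch from triggering the final
stage.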

## Options

- `-f`, `--force` - reproduce a pipeline, regenerating its results, even if no
8 changes: 8 additions & 0 deletions public/static/docs/command-reference/run.md
@@ -52,6 +52,14 @@ captures data and <abbr>caches</abbr> relevant <abbr>data artifacts</abbr> along
the way. See [this example](/doc/get-started/example-pipeline) to learn more and
try creating a pipeline.

### Running other dvc commands in parallel

While your command is running, DVC removes the project lock (the `.dvc/lock`
file), so you can run other DVC commands (e.g. `dvc run`, `dvc import`,
`dvc repro`, etc.) in parallel. Instead, it uses per-path read-write locking to
guarantee that no two DVC instances write to the same path, and that no
instance writes to a path that another instance is reading from.

### Avoiding unexpected behavior

We don't want to tell you how to write your code! However, please be aware that